Audio Solutions Architect
Innodata (Nasdaq: INOD) is a global data engineering company. We believe that data and Artificial Intelligence (AI) are inextricably linked. Our mission is to enable the responsible advancement of artificial intelligence by providing the data, evaluation frameworks, and human expertise required to build AI systems that can be trusted at scale. We provide a range of transferable solutions, platforms, and services for Generative AI / AI builders and adopters. In every relationship, we honor our 36+ year legacy of delivering the highest quality data and outstanding outcomes for our customers.
Scope of the Role:
Innodata builds the high-quality voice and audio datasets that power the world's leading speech AI — text-to-speech, speech recognition, and the new generation of speech-to-speech and conversational voice models. We're hiring an Audio Solutions Architect to be both the technical partner to our customers in presales and the external technical voice of our audio practice.
This is a hybrid role with two equally weighted halves. In presales, you sit with a frontier lab or enterprise team, understand what they're trying to train, and shape the data collection program that gets them there. In thought leadership, you keep us at the frontier of speech AI — producing go-to-market research and content, speaking at conferences, and establishing Innodata as the most technically credible audio data partner in the market. The two reinforce each other.
What You’ll Own:
Presales & solutioning
- Partner with customers in presales to understand their model objectives, current data gaps, and technical constraints.
- Shape requirements: define acoustic specs, language/accent coverage, speaker demographics, emotional/paralinguistic range, transcript and metadata schema, and QA targets (WER/DER, LUFS, etc.).
- Translate requirements into scoped execution plans — volumes, timelines, methodology, pricing inputs — in partnership with delivery.
- Serve as the credible technical voice in the room: explain tradeoffs (studio vs. real-world vs. telephonic, scripted vs. spontaneous, single vs. multi-speaker) and defend methodology choices.
- Build reusable solutioning assets: scoping frameworks, spec templates, reference architectures for common audio data use cases.
Thought leadership & GTM
- Stay at the tip of the spear on speech-AI developments (TTS, ASR, speech-to-speech) and what data the next generation of models will need.
- Produce go-to-market material: technical blog posts, white papers, benchmark reports, and reference content that demonstrates Innodata's depth.
- Represent Innodata externally: speak at and work conferences (Interspeech, ICASSP, industry events), engage the speech-AI community, and build our public technical profile.
- Feed market intelligence back into strategy — advise on emerging data categories and where to invest ahead of demand.
You’ll Thrive in This Role If You Have:
- Deep working knowledge of speech/audio AI: how TTS, ASR, and speech-to-speech systems are trained and evaluated, and what data they require.
- Experience in a solutions engineering, solutions architect, technical presales, or applied/forward-deployed role — or a technical audio/speech background plus strong commercial instincts.
- Demonstrated ability (and appetite) to produce public-facing technical content and represent a company externally — writing, speaking, or community engagement.
- Ability to shape ambiguous requirements into precise specs and communicate them to both researchers and business stakeholders.
- Strong presence and persuasion; comfortable being the technical authority in a sales conversation and on a conference stage.
- Familiarity with audio technical specifications (sample rates, LUFS, formats), transcript/metadata schemas, and quality metrics (WER, DER).
- A public body of work in speech/audio: talks, papers, blog posts, benchmarks.
- Hands-on experience with speech datasets, annotation, or audio production.
- Background working with or at a frontier AI lab or voice-AI product company.
- Multilingual / localization exposure.
The expected salary range for this position is $150,000 – $230,000 USD per year, based on experience, skills, and qualifications.
Please be aware of recruitment scams involving individuals or organizations falsely claiming to represent employers. Innodata will never ask for payment, banking details, or sensitive personal information during the application process. To learn more about how to recognize job scams, please visit the Federal Trade Commission’s guide at https://consumer.ftc.gov/articles/job-scams.
If you believe you’ve been targeted by a recruitment scam, please report it to Innodata at verifyjoboffer@innodata.com and consider reporting it to the FTC at ReportFraud.ftc.gov.
