
Audio Quality and Data Engineer
About Hark
Hark is an artificial intelligence company building advanced, personalized intelligence: one that is proactive, multimodal, and capable of interacting with the world through speech, text, vision, and persistent memory.
We're pairing that intelligence with next-generation hardware to create a universal interface between humans and machines. While today's AI largely operates through chat boxes and decade-old devices, Hark is focused on what comes next: agentic systems that interact naturally with people and the real world.
To get there, we're developing multimodal models and next-generation AI hardware together, designed from the ground up as a single, unified interface for a new era of intelligent systems.
About the Role
Voice is the primary interface between Hark's AI and the people who use it, so audio quality is a product requirement, not an afterthought. We are hiring an Audio Quality and Data Engineer to own the end-to-end measurement of how our voice AI sounds, listens, and behaves — from the microphone array on the device, through the speech enhancement front end, across telephony and network paths, and into the conversational model itself.
In this role, you will design test methodologies and metrics, build the automated test infrastructure that runs them, and lead the audio data collection efforts that feed both evaluation and model training. Your work will directly shape how customers experience our consumer AI devices and voice AI agents, and the quality bar you define will become the bar the rest of the company ships against.
Responsibilities
- Define and execute audio quality evaluation for Hark's voice AI conversation system, including end-to-end conversational metrics (latency, barge-in behavior, turn-taking, intelligibility, naturalness) and component-level metrics for STT, TTS, and dialogue.
- Evaluate audio quality of consumer AI devices across the full signal chain: acoustic echo cancellation (AEC), noise suppression (NS), beamforming, dereverberation, automatic gain control, and speech enhancement.
- Characterize telephony audio quality — loudness, latency, jitter, packet loss resilience, codec behavior, double-talk performance, and compliance with ITU-T recommendations.
- Design, build, and maintain an automated audio test system: instrumented test fixtures, playback/capture orchestration, signal generation, scoring pipelines, and CI integration for regression coverage.
- Lead audio data collection programs: define recording protocols, acoustic conditions, device configurations, and speaker/language diversity targets; manage capture sessions and partner vendors; deliver labeled, structured datasets for both evaluation and model training.
- Investigate audio quality regressions and field issues — reproduce in the lab, isolate root cause across hardware, DSP, network, and model layers, and partner with hardware, firmware, ML, and platform teams to drive fixes.
- Establish the audio quality bar for product launches: write test plans, define pass/fail criteria, run pre-launch verification, and communicate results and risks to engineering and product leadership.
Requirements
- 5+ years of experience in audio quality, audio test, or audio DSP engineering, ideally on consumer audio devices, voice communication systems, or voice AI products.
- Strong background in digital signal processing and statistics, with hands-on understanding of AEC, noise reduction, beamforming, AGC, and speech enhancement algorithms.
- Working knowledge of telecommunication and audio quality standards (ITU-T P-series and G-series, 3GPP, TIA), and the objective metrics built on them (POLQA, PESQ, STOI, DNSMOS, etc.).
- Experience building automated audio test infrastructure in Python (or similar), including instrument control, signal generation/analysis, and scoring pipelines.
- Hands-on experience with professional audio test equipment: Audio Precision, ACQUA, HATS, B&K / GRAS measurement microphones, artificial ears, R&S or equivalent RF/network test gear.
- Demonstrated ownership of audio data collection: protocol design, session execution, dataset curation, and quality control.
- Skilled at troubleshooting complex audio issues across hardware, software, and acoustic environments, and at communicating findings to cross-functional engineering teams.
- B.S. in Electrical Engineering, Computer Engineering, or a related field; M.S. or Ph.D. preferred.
Bonus Qualifications
- Experience evaluating LLM-based voice agents or full-duplex conversational systems (turn-taking, interruption handling, end-pointing, latency budgets).
- Familiarity with ASR evaluation methodology — WER, robustness testing, regression criteria, and dataset construction for noisy/far-field conditions.
- Contributions to industry standards bodies (ITU-T, 3GPP, IEEE) or published audio quality research.
- Experience setting up an audio test lab from scratch — equipment selection, acoustic treatment, calibration procedures.
- Familiarity with embedded audio platforms, Bluetooth audio stacks, or wearable/hearable form factors.
- C/C++ proficiency for working close to the DSP or firmware layer.
- Patents or publications in audio signal processing, echo cancellation, or speech enhancement.
Compensation
The US base salary range for this full-time position is $250,000 to $300,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.