
Data Collection
About Hark
Hark is an artificial intelligence company building advanced, personalized intelligence: one that is proactive, multimodal, and capable of interacting with the world through speech, text, vision, and persistent memory.
We're pairing that intelligence with next-generation hardware to create a universal interface between humans and machines. While today's AI largely operates through chat boxes and decade-old devices, Hark is focused on what comes next: agentic systems that interact naturally with people and the real world.
To get there, we're developing multimodal models and next-generation AI hardware together, designed from the ground up as a single, unified interface for a new era of intelligent systems.
About the Role
You'll own data collection at Hark — the programs, the vendors, and the pipelines that turn raw signal into training data our models can actually learn from.
That means running end-to-end campaigns across human feedback, synthetic data, and product-embedded signals. The quality of what we collect shapes the quality of what we ship, and this role owns that loop.
This is a high-ownership role on a small team. You'll work directly with researchers, engineers, and external partners, and the data you deliver will directly influence how our models behave in the real world.
Responsibilities
- Design and run data collection programs end-to-end — scoping requirements, writing instructions, defining success criteria, and driving execution with vendors and annotators.
- Manage external vendor relationships. Be the primary interface between Hark and data partners, keeping quality high and timelines on track.
- Assess collected data using internal tooling, identify quality issues, and give clear, actionable feedback to vendors and annotators.
- Collaborate closely with model researchers and engineers to understand what data is needed, translate that into operational plans, and deliver.
- Track program metrics, surface insights, and drive continuous improvements to quality, throughput, and process.
- Identify gaps in tooling and workflows and propose concrete improvements.
Requirements
- Operational excellence. You can manage multiple programs simultaneously, keep track of details under pressure, and bring structure to fast-moving situations.
- Experience working with external vendors or contractors. You know how to set expectations, manage relationships, and hold partners accountable to quality.
- A knack for data. You go beyond surface-level metrics: you dig in, find patterns, and use what you find to make things better.
- Strong communication. You can translate between research requirements and operational reality, and you keep everyone aligned without letting things slip.
- Comfort with ambiguity and fast iteration. You take a rough problem, build a process around it, get feedback, and tighten it quickly.
- Genuine curiosity about AI. You don't need to be an ML researcher, but you care about how models learn and why data quality matters.
- 2+ years of relevant experience in data operations, program management, or a related field.
Bonus Qualifications
- Experience managing human feedback or preference data programs.
- Familiarity with data annotation platforms or labeling pipelines.
- Experience with synthetic data generation or evaluation dataset design.
- Background working at a fast-moving AI or research-driven company.
Compensation
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.