
Data Operations Engineer
Mountain View, CA
About Abaka AI
Abaka AI is built on one mission: to be the world’s most trusted data partner for AI companies. More than 1,000 industry leaders across Generative AI, Embodied AI, and Automotive AI rely on us to power their data pipelines. With our headquarters in Silicon Valley and teams in Paris, Singapore, and Tokyo, we support global partners with fast, reliable, and scalable data solutions.
Our offerings include a diverse catalog of off-the-shelf datasets (image, video, multimodal, reasoning, 3D, and beyond) as well as comprehensive data collection and annotation services. Whether teams need raw data, curated datasets, or full-cycle data engineering, Abaka AI provides the foundation for building high-performance AI systems.
About the Role
We are hiring a Data Operations Engineer to own and operate Abaka AI’s internal dataset library. This role will serve as the central point of knowledge for all datasets across the company, working closely with engineering, product, and business teams to ensure fast, accurate, and scalable access to data.
You will develop a deep understanding of our dataset inventory, including structure, quality, and use cases, and act as the primary point of contact for internal data-related questions. You will translate ambiguous requests into clear solutions, validate dataset quality, and coordinate across global teams to resolve issues efficiently.
This role is highly cross-functional and requires strong problem-solving ability, technical fluency, and a high level of ownership. You will play a critical role in improving how datasets are organized, accessed, and utilized across the company.
Responsibilities
- Develop and maintain a comprehensive understanding of Abaka AI’s dataset library, including data structure, quality, and applicable use cases across modalities (text, image, video, audio, 3D).
- Serve as the internal point of contact for dataset-related inquiries, providing clear and timely responses to questions from engineering, product, and business teams.
- Translate ambiguous or high-level requests into concrete dataset solutions, identifying appropriate data sources or gaps.
- Inspect and validate datasets for quality, completeness, and consistency using SQL, Python, or other tools as needed.
- Coordinate with global data teams, including teams in China, to resolve data issues, clarify requirements, and ensure timely delivery without unnecessary escalation.
- Maintain and improve internal documentation, organization, and accessibility of datasets.
- Identify inefficiencies in current workflows and propose improvements to systems, tooling, and processes that support dataset management and usage.
- Support cross-functional initiatives by providing dataset insights, technical context, and operational guidance.
Qualifications
- Bachelor’s degree in Computer Science, Data Engineering, or a related field, or equivalent practical experience.
- 1–4 years of experience in data operations, data engineering, or a related role involving direct interaction with datasets.
- Professional proficiency in Mandarin Chinese and English is required, as this role involves frequent collaboration with China-based vendors and external partners.
- Strong problem-solving skills and ability to operate effectively in ambiguous, fast-paced environments.
- Proficiency in SQL and/or Python for data inspection, validation, and basic analysis.
- Experience working with real-world datasets, including handling data quality issues, inconsistencies, and edge cases.
- Strong communication skills, with the ability to work across technical and non-technical teams.
- High level of ownership and accountability, with the ability to manage multiple requests and priorities simultaneously.
Preferred Qualifications
- Experience with multimodal datasets (text, image, video, audio, or 3D).
- Familiarity with data annotation, labeling workflows, or dataset preparation for machine learning.
- Experience working with international teams, particularly in cross-border environments.
- Exposure to AI/ML workflows, including training, fine-tuning, or evaluation datasets.
Compensation & Benefits
The base salary range for this position is $110,000 - $160,000 USD annually.
Compensation may fall outside this range depending on a number of factors, including a candidate’s qualifications, skills, competencies, and experience. Base pay is one part of the total package provided to compensate and recognize employees for their work at Abaka AI. This role is eligible for equity, as well as a comprehensive benefits package (health, dental, vision, PTO, flexible work schedule).
