Software Research Engineer (ML)
About the Team
Avride is at the forefront of autonomous mobility, developing and deploying state-of-the-art self-driving cars and delivery robots. We’re shaping the future of transportation and logistics—and our Labeling Team plays a vital role in bringing that vision to life.
The Labeling Backend Team builds the data infrastructure that powers groundbreaking research and development across our labeling pipelines, data preparation workflows, and model training processes. The high-quality labeled data we deliver is critical to advancing our core technologies and supports the diverse range of models that drive our entire business.
About the Role
We are looking for a Research Engineer to improve the quality and representativeness of the datasets powering our self-driving systems. You will design algorithms and tools for auto-labeling, data mining, and dataset monitoring, combining strong Python engineering with applied ML concepts. Your work will directly enhance data efficiency, reduce labeling costs, and improve model performance.
What You’ll Do
- Design and implement algorithms that optimize annotation, including auto-labeling systems that reduce manual effort and increase throughput
- Build data-mining and active-learning pipelines to surface the highest-value samples for training
- Create dataset-quality monitoring systems that identify noise, redundancy, and low-value data
- Develop analytics platforms (databases, dashboards, reporting) to track dataset quality and coverage over time
- Collaborate with ML and Perception teams to integrate research results into production workflows
- Explore emerging approaches (vision-language models, weak supervision, uncertainty estimation) to improve dataset quality and expand automation
What You’ll Need
- Bachelor’s or Master’s degree in Computer Science or a related field
- Strong Python skills for algorithm development and prototyping
- Solid understanding of ML concepts (metrics, evaluation, dataset sampling, etc.)
- Experience with data processing and analysis at scale
- Ability to move between research prototyping and production engineering
- Strong analytical mindset and curiosity to dig deep into data quality problems
Nice to Have
- Experience with auto-labeling, weak supervision, or human-in-the-loop ML
- Exposure to 3D data (point clouds, sensor fusion, 3D annotation pipelines)
- Background in AV, robotics, or large-scale ML dataset development
- Experience with foundation models / VLMs
- Experience with workflow orchestration systems (Argo, Airflow, etc.)
- Backend engineering experience (APIs, ORMs, databases)
- Experience building dashboards or analytics systems (Grafana, Superset, etc.)
Candidates are required to be authorized to work in the U.S. The employer is not offering relocation sponsorship, and remote work options are not available.