Labeling Backend Software Engineer
About the Team
Avride is at the forefront of autonomous mobility, developing and deploying state-of-the-art self-driving cars and delivery robots. We’re shaping the future of transportation and logistics—and our Labeling Team plays a vital role in bringing that vision to life.
The Labeling Backend Team builds the data infrastructure that powers groundbreaking research and development across our labeling pipelines, data preparation workflows, and model training processes. The high-quality labeled data we deliver is critical to advancing our core technologies and supports the diverse range of models that drive our entire business.
About the Role
As a Software Engineer on the Labeling Backend team, you will design, build, and scale backend systems that produce the large-scale datasets powering our perception and machine-learning pipelines. You’ll work across data platforms, orchestration, and backend services to enable rapid iteration on cutting-edge autonomous systems.
What You'll Do
- Design, develop, and own backend systems that manage the full lifecycle of labeled data—from ingestion to delivery—including core Python services, APIs, and databases.
- Build and scale data preparation and labeling pipelines that transform massive sensor streams (lidar, camera, radar) into label-ready formats, and manage annotation workflows that combine automated and human labeling using modern orchestration frameworks.
- Design and implement robust integrations with external vendors and internal tools to ensure seamless data and feedback flow across labeling workflows.
- Develop automated validation and QA systems to safeguard dataset integrity, monitor quality, and surface key performance insights.
- Collaborate with ML and Perception teams to translate evolving requirements into scalable data solutions that directly drive model improvements.
- Boost human labeling efficiency by integrating ML automation into annotation workflows.
What You’ll Need
- Bachelor's degree in Computer Science, Engineering, or a related field, and 3+ years of equivalent experience
- Solid understanding of algorithms, data structures, and system design
- Strong proficiency in Python, including backend frameworks and asynchronous stacks (asyncio, aiohttp, FastAPI)
- Expertise with relational databases (preferably PostgreSQL): schema design, query optimization, ORMs (e.g., SQLAlchemy, Django ORM), and migrations (e.g., Alembic)
- Familiarity with cloud platforms and containerization (Docker, Kubernetes)
- Hands-on experience with cloud object storage (AWS S3, GCS, etc.)
Nice to Have
- Experience with workflow orchestration (Argo, Airflow, etc.)
- Experience in large-scale distributed systems or products
- Experience with big-data or distributed data-processing tooling (Spark, PyArrow, etc.)
- Experience working with 3D sensor data (point clouds, sensor fusion, etc.)
- Previous work in AV, robotics, or ML-heavy environments
- C++ development experience
Candidates are required to be authorized to work in the U.S. The employer is not offering relocation sponsorship, and remote work options are not available.