
DevOps/MLOps Engineering Consultant - Office of the CTO
Join a high-performing team at Sonatus that’s redefining what cars can do in the era of Software-Defined Vehicles (SDV).
At Sonatus, we’re driving the transformation to AI-enabled software-defined vehicles. Traditional automotive software methods can’t keep pace with consumer expectations shaped by the mobile industry—where features evolve rapidly, update seamlessly, and improve continuously. That’s why leading OEMs trust Sonatus to accelerate this shift. Our technology is already in production across more than 5 million vehicles on the road today and rapidly expanding.
Headquartered in Sunnyvale, CA, with 250+ employees worldwide, Sonatus combines the agility of a fast-growing company with the scale and impact of an established partner. Backed by strong funding and proven by global deployment, we’re solving some of the most interesting and complex challenges in the industry. Join us and help redefine what’s possible as we shape the future of mobility.
Role Summary:
We are seeking a highly experienced, strategic DevOps & MLOps Engineering Consultant to architect, build, and scale our end-to-end DevOps and MLOps platform. You will own the full cloud CI/CD pipeline, cloud infrastructure management, and the machine learning model lifecycle: implementing the MLOps framework that moves models from experimentation to production with velocity and reliability, and managing the serving infrastructure behind them. You will draw on deep expertise in DevOps, MLOps, and Site Reliability Engineering (SRE) to make critical decisions spanning model training, serving, and monitoring. This is a key leadership position for a hands-on engineer who will define our best practices for model versioning, production observability, and infrastructure as code.
Roles and Responsibilities:
- Design and build the foundational, end-to-end DevOps and MLOps platform for our Generative AI systems, making critical decisions that span evaluation, monitoring, and deployment of large language model (LLM)-based systems.
- Implement the full DevOps and MLOps framework, building the CI/CD/CT (Continuous Integration/Delivery/Training) automation that takes models from experiment to production with velocity and reliability (see the continuous-training sketch after this list).
- Deploy, scale, and optimize our model serving infrastructure. You will manage GPU/NPU resources, minimize inference latency, and build robust monitoring to ensure our AI is always fast, accurate, and cost-effective.
- Create a single, cohesive set of best practices for the entire AI lifecycle. Your work will define how we handle model versioning, infrastructure as code, and production observability in one seamless system.
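To make the CI/CD/CT responsibility above concrete, here is a minimal sketch of one common continuous-training pattern: a candidate model is promoted only if it beats the current production model on a shared held-out evaluation set. The dataset, models, metric, and promotion threshold below are illustrative assumptions, not details from this posting.

```python
# Minimal continuous-training (CT) promotion gate -- an illustrative sketch,
# not Sonatus's actual pipeline. Dataset, metric, and threshold are assumptions.
from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_eval, y_train, y_eval = train_test_split(X, y, test_size=0.2, random_state=0)

def evaluate(model) -> float:
    """Score a model on the shared held-out evaluation set."""
    return accuracy_score(y_eval, model.predict(X_eval))

# "Production" model: whatever is currently serving traffic.
production = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
# "Candidate" model: output of the latest automated training run.
candidate = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

MIN_IMPROVEMENT = 0.005  # assumed promotion threshold; tune per use case

prod_score, cand_score = evaluate(production), evaluate(candidate)
if cand_score >= prod_score + MIN_IMPROVEMENT:
    print(f"PROMOTE candidate ({cand_score:.3f} vs {prod_score:.3f})")
    # A real pipeline would register the new version and trigger rollout here.
else:
    print(f"KEEP production ({prod_score:.3f} vs {cand_score:.3f})")
```

In a real pipeline, a gate like this would typically run as a CI job after each automated retrain, with the promotion step writing to a model registry rather than printing.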
Requirements:
- A seasoned engineer with 8+ years of experience building and scaling production-grade cloud services and systems, with a strong focus on DevOps, MLOps, and/or SRE.
- A "systems thinker" with a demonstrated ability to architect end-to-end solutions and a deep understanding of the full CI/CD pipeline and machine learning lifecycle.
- Deep proficiency in Python and Infrastructure as Code (e.g., Terraform, Pulumi); see the IaC sketch after this list.
- Experience with MLOps tools (e.g., MLflow, Kubeflow, Vertex AI) and production monitoring frameworks.
- Experience enforcing reproducibility, approvals, audit trails, PII handling, model cards, and policy/compliance controls (e.g., privacy, evaluations, guardrails).
- Experience with robust ML deployment systems (e.g., Kubeflow, MLflow, and model servers such as BentoML or TensorFlow Serving).
- Hands-on experience with public cloud platforms (GCP, AWS, and/or Azure) and containerization/orchestration (Docker, Kubernetes).
- Experience packaging, versioning, and deploying software modules and AI models (batch and online) with blue/green or canary rollouts, building feature and model registries, and automating retraining (see the canary sketch after this list).
- Experience with PyTorch, vLLM, and GPUs is a plus.
- Experience tracking model drift and agentic drift is a plus.
- Experience tuning serving stacks (GPU/CPU utilization, batching, quantization); see the micro-batching sketch after this list.
- Direct experience building and operationalizing systems for LLMs, especially retrieval-augmented generation (RAG) pipelines, is a plus.
- Experience with vector databases (e.g., Pinecone, Weaviate) and embedding management from a deployment and scaling perspective is a plus; see the vector-search sketch after this list.
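For the Infrastructure as Code bullet, a minimal sketch using Pulumi's Python SDK (one of the tools the bullet names). The bucket name, tags, and choice of AWS provider are assumptions for illustration, not details of Sonatus's stack.

```python
# Illustrative Pulumi (Python) infrastructure-as-code sketch -- resource
# names and provider choice are assumptions. Creates a versioned S3 bucket
# for packaged model artifacts.
import pulumi
import pulumi_aws as aws

artifacts = aws.s3.Bucket(
    "model-artifacts",
    versioning=aws.s3.BucketVersioningArgs(enabled=True),
    tags={"team": "mlops", "managed-by": "pulumi"},
)

pulumi.export("artifact_bucket", artifacts.id)
```

This runs with `pulumi up` inside a configured Pulumi project; Terraform expresses the same resource declaratively in HCL.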
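For the blue/green and canary rollout bullet, a stripped-down sketch of the core canary idea: route a small, configurable fraction of traffic to the candidate model version while the rest stays on stable. The placeholder models and the 5% fraction are assumptions; production systems usually do this at the load-balancer or service-mesh layer.

```python
# Minimal canary-rollout router -- an illustrative sketch of weighted
# traffic splitting between model versions, not a production system.
import random

def stable_model(x: float) -> float:
    return 2 * x            # placeholder for the current production model

def candidate_model(x: float) -> float:
    return 2 * x + 0.01     # placeholder for the new candidate version

CANARY_FRACTION = 0.05  # assumed: send 5% of traffic to the candidate

def route(x: float) -> tuple[str, float]:
    """Route one request; return which version served it and its output."""
    if random.random() < CANARY_FRACTION:
        return "candidate", candidate_model(x)
    return "stable", stable_model(x)

if __name__ == "__main__":
    served = [route(1.0)[0] for _ in range(10_000)]
    print("candidate share:", served.count("candidate") / len(served))
```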
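For the serving-stack tuning bullet, a toy sketch of dynamic micro-batching, one of the levers it names: requests are grouped up to a maximum batch size or deadline so that each accelerator call serves several requests at once. The batch size and timeout here are assumed values, not tuned recommendations.

```python
# Illustrative dynamic micro-batching sketch: collect requests into batches
# bounded by a max size or a deadline to improve accelerator utilization.
import queue
import threading
import time

requests: "queue.Queue[float]" = queue.Queue()
MAX_BATCH, MAX_WAIT_S = 8, 0.01  # assumed limits

def batcher() -> None:
    while True:
        batch = [requests.get()]             # block for the first item
        deadline = time.monotonic() + MAX_WAIT_S
        while len(batch) < MAX_BATCH:
            timeout = deadline - time.monotonic()
            if timeout <= 0:
                break
            try:
                batch.append(requests.get(timeout=timeout))
            except queue.Empty:
                break
        print(f"serving batch of {len(batch)}")  # stand-in for one model call

threading.Thread(target=batcher, daemon=True).start()
for i in range(20):
    requests.put(float(i))
    time.sleep(0.002)
time.sleep(0.1)
```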
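Finally, for the vector-database bullet, a self-contained sketch of the similarity search such systems perform: unit-normalized embeddings queried by cosine similarity. The random vectors stand in for a real embedding model; managed services like Pinecone or Weaviate add indexing, persistence, and scaling on top of this core operation.

```python
# Minimal in-memory vector search -- illustrates the embedding lookup that a
# managed vector database performs at scale. Toy random embeddings only.
import numpy as np

rng = np.random.default_rng(0)
ids = [f"doc-{i}" for i in range(1000)]
vectors = rng.normal(size=(1000, 64)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit-normalize

def query(vec: np.ndarray, k: int = 3) -> list[tuple[str, float]]:
    """Return the k nearest documents by cosine similarity."""
    vec = vec / np.linalg.norm(vec)
    scores = vectors @ vec
    top = np.argsort(scores)[::-1][:k]
    return [(ids[i], float(scores[i])) for i in top]

print(query(rng.normal(size=64).astype(np.float32)))
```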