
Machine Learning Ops Engineer
For more than 20 years, our global network of passionate technologists and pioneering craftspeople has delivered cutting-edge technology and game-changing consulting to companies on the brink of AI-driven digital transformation. Since 2001, we have grown into a full-service digital consulting company with 5,500+ professionals pursuing a worldwide ambition. Driven by the desire to make a difference, we keep innovating, fueling the growth of our company with our knowledge-worker culture. When teaming up with Xebia, expect in-depth expertise based on an authentic, value-led, and high-quality way of working that inspires all we do.
About the Role
We’re seeking a skilled and driven MLOps Engineer to join our Data & AI practice at Xebia. In this role, you’ll play a critical part in automating, scaling, and operationalizing machine learning workflows in the AWS ecosystem. You will work closely with data scientists, engineers, and DevOps teams to enable reliable deployment, monitoring, and lifecycle management of ML models across production environments. This role is ideal for an engineer who thrives at the intersection of machine learning and infrastructure automation.
What You’ll Do
MLOps Platform Engineering
- Design and implement robust MLOps pipelines using AWS SageMaker, Lambda, Docker, and CDK.
- Automate model training, validation, registration, and deployment workflows across environments.
- Implement batch inference and real-time endpoint deployments with proper scalability and governance.
- Configure and monitor ML endpoints, tracking performance, data drift, and model health metrics.
Infrastructure & CI/CD Automation
- Build and manage cloud-native infrastructure with AWS CDK to support ML pipelines, storage, and APIs.
- Set up GitLab CI pipelines for ML workflows, including automated testing, artifact scanning, and vulnerability detection.
- Implement deployment strategies (blue-green, canary) for model rollouts and updates.
- Integrate API Gateway, Lambda, and Docker-based services into ML lifecycle workflows.
Model Lifecycle Management
- Collaborate with data scientists to convert notebooks into reproducible training and inference pipelines.
- Manage the full lifecycle of ML models: training, feature engineering, versioning, and deployment.
- Set up robust monitoring for model drift, feature quality, and prediction accuracy.
- Use YAML, metadata tracking tools, and Git-based workflows to standardize and document ML processes.
What You Bring
- 5+ years of experience in engineering roles, including at least 2 years in MLOps or ML-focused infrastructure
- Strong proficiency with AWS services such as SageMaker, Lambda, API Gateway, CDK, and S3
- Practical experience implementing CI/CD workflows with GitLab CI, including artifact scanning and security practices
- Experience containerizing and deploying ML workloads using Docker and managing configurations via YAML
- Deep understanding of MLOps concepts including model registration, versioning, endpoint deployment, and monitoring
- Familiarity with data science workflows including feature engineering, training pipelines, and batch inference
- Solid understanding of SDLC, agile delivery, and best practices in software reliability and security
Nice to Have
- Exposure to model monitoring frameworks or tools like SageMaker Model Monitor, Evidently, or Prometheus
- Experience with feature stores, ML metadata tracking (MLflow, SageMaker Experiments), or data quality tooling
- Hands-on experience with IaC for secure API development and model service orchestration