
Machine Learning Deployment Engineer

Montreal, Canada

About the Company

At Torc, we have always believed that autonomous vehicle technology will transform how we travel, move freight, and do business.

A leader in autonomous driving since 2007, Torc has spent over a decade commercializing our solutions with experienced partners. Now a part of the Daimler family, we are focused solely on developing software for automated trucks to transform how the world moves freight.

Join us and catapult your career with the company that helped pioneer autonomous technology and was the first AV software company with the vision to partner directly with a truck manufacturer.

Job Description Summary

The model development department is seeking an ML deployment engineer who will deploy our next-generation machine learning models for our autonomous driving stack.

As a senior engineer on the team, you apply machine learning science in a production-focused environment. You work with machine learning models in both unimodal and multimodal contexts to solve tasks across the functional autonomous driving stack. Training, validation, data science, and architectural design are your daily work. You are interested in understanding how your models perform in deployment, and you collaborate closely with deployment-focused teams to find out. You mentor and guide more junior members of the team, stay on top of the newest trends in research, and are eager to translate scientific improvements into our production-grade machine learning pipelines.

Meet the team

Torc's Autonomy Applications software utilizes cutting-edge deep learning techniques to perceive the vehicle's environment, predict the movements of other vehicles, and execute accurate driving decisions. We are actively seeking an experienced ML deployment engineer to join our model development department. This is an exceptional opportunity for you to have a significant impact on the future of the autonomous vehicle industry by leveraging AI.

 

What You’ll Do:

Model Deployment & Optimization

Deploy and optimize machine learning models for production environments, ensuring real-time performance and resource efficiency on edge devices and automotive-grade hardware.

Implement model quantization, pruning, and compression techniques to enhance inference speed while maintaining accuracy.

Collaborate with ML engineers to transition research-grade code (e.g., PyTorch) into production-ready, scalable systems.

Inference Pipeline Development

Design and optimize end-to-end inference pipelines for embedded systems, leveraging frameworks like ONNX, TensorFlow Serving, or TorchServe.

Integrate model outputs with upstream and downstream systems (e.g., perception, control modules) via APIs or middleware.

Cross-Functional Collaboration

Partner with DevOps teams to build CI/CD pipelines for automated model deployment, testing, and rollback.

Work with hardware engineers to profile and optimize model performance on target devices (e.g., NVIDIA Jetson).

Monitoring & Maintenance

Develop tools and dashboards to monitor model performance, data drift, and system health in production.

Implement A/B testing and canary deployment strategies to validate model updates.

Infrastructure & Tools

Optimize data pipelines for low-latency inference, including preprocessing and postprocessing workflows.

Advocate for MLOps best practices (versioning, reproducibility, logging) across the ML lifecycle.

 

What You’ll Need to Succeed:

Education & Experience

Bachelor’s degree in computer science, engineering, or a related field with 2+ years of experience in deploying ML models (or a master’s degree with 1+ years).

Proven expertise in deploying models to edge devices or cloud platforms (AWS, Azure, GCP).

Technical Skills

Mastery of Python and C++; familiarity with CUDA, TensorRT, or OpenVINO for acceleration.

Experience with deployment frameworks (e.g., ONNX, TensorFlow Lite, PyTorch Mobile) and containerization (Docker, Kubernetes).

Knowledge of performance profiling tools (e.g., NVIDIA Nsight, VTune) and optimization techniques (e.g., layer fusion, memory management).

Domain Knowledge

Understanding of ML model lifecycle challenges (e.g., drift, scalability) and MLOps principles.

Familiarity with computer vision, LiDAR/radar data, or sensor fusion workflows is a plus.

 

Bonus Points!

Experience with NVIDIA libraries (CUDA, cuDNN, TensorRT) or embedded SDKs (JetPack, DeepStream).

Proficiency in distributed inference using Ray or Horovod.

Cloud certifications (AWS ML Specialty, Azure AI Engineer) or MLOps tools (MLflow, Kubeflow).

Knowledge of security practices for ML systems (e.g., adversarial defense, encrypted inference).

At Torc, we’re committed to building a diverse and inclusive workplace. We celebrate the uniqueness of our Torc’rs and do not discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, veteran status, or disabilities.

Even if you don’t meet 100% of the qualifications listed for this opportunity, we encourage you to apply. 
