Senior DevOps Engineer
Factored was conceived in Palo Alto, California by Andrew Ng and a team of highly experienced AI researchers, educators, and engineers to help address the significant global shortage of qualified AI and machine learning engineers. We know that exceptional technical aptitude, intelligence, communication skills, and passion are equally distributed around the world, and we are committed to testing, vetting, and nurturing the most talented engineers for our program and on behalf of our clients.
We're actively seeking a Senior DevOps Engineer with 6+ years of hands-on expertise in designing, implementing, and maintaining robust infrastructure and deployment pipelines. In this crucial role, you'll be instrumental in tackling complex infrastructure challenges, ensuring the seamless and scalable operation of our applications.
You'll work alongside a talented team of engineers, focusing on the smooth operation of AI technologies within scalable systems. Your experience in deploying machine learning models and AI-driven applications into production environments will be a significant asset as you support our evolving AI initiatives.
Functional Responsibilities:
- Design, deploy, and manage scalable and secure cloud infrastructure across platforms like AWS, Azure, or GCP.
- Lead the implementation, optimization, and management of core DevOps tools including Terraform, GitHub Actions, and comprehensive CI/CD pipelines.
- Manage and optimize cloud-based storage systems (e.g., AWS S3, AWS RDS, or similar).
- Deploy and manage applications, with a focus on serverless environments (e.g., AWS Fargate, AWS Lambda, or similar).
- Create and maintain efficient automation workflows using shell scripting and Python to streamline operations.
- Collaborate closely with both development and operations teams to integrate new technologies and optimize production pipelines.
- Ensure our overall infrastructure remains secure, stable, and highly scalable.
- Stay up to date with the latest DevOps best practices, with an eye on relevant MLOps concepts to continuously improve our deployment processes.
- Help optimize the deployment and operation of machine learning models in production, ensuring their smooth integration into scalable systems.
Qualifications:
- 6+ years of DevOps experience, with a strong understanding of deploying and maintaining machine learning models in cloud environments.
- Hands-on experience with cloud platforms such as AWS, Azure, or GCP, with a focus on deploying and managing AI/ML infrastructure.
- Proficient in DevOps tools such as Terraform, GitHub Actions, and CI/CD practices.
- Experience managing cloud-based storage systems like AWS S3, AWS RDS, or similar.
- Previous exposure to deploying applications in serverless environments like AWS Fargate, AWS Lambda, or similar.
- Strong shell scripting skills and Python experience for automation and workflow management.
- Understanding of machine learning workflows, including the deployment and maintenance of models in production (MLOps exposure is a plus, but not required).
- Excellent English communication skills, with the ability to work cross-functionally between development and operations teams.