DevOps Engineer
Fully remote | Complete engagement job
Founded in Palo Alto by Dr. Andrew Ng and Israel Niezen, Factored helps U.S. companies build and scale world-class AI, ML, and Data teams, powered by the top 1% of LATAM talent. Our defining purpose is to empower brilliant humans, unleash their potential, and amplify their impact in the world.
At Factored, you’ll be part of a community that values learning, ownership, and authenticity, where your growth is personal and your ideas matter. We’re transparent, curious, and collaborative. We strive for excellence, celebrate diversity, encourage curiosity, and build an environment where you can truly thrive.
We’re seeking a DevOps Engineer with 5+ years of hands-on experience designing, building, and operating reliable infrastructure and deployment pipelines in production environments. In this role, you’ll help tackle complex infrastructure challenges and ensure systems are secure, stable, and scalable.
You’ll play an important role in supporting AI and GenAI workloads, building and operating deployment pipelines, and contributing to the evolution of a shared GenAI application platform used across multiple teams. This is a hands-on, high-ownership role with meaningful impact, working closely with engineers to bring AI-enabled applications into production responsibly and reliably.
Functional Responsibilities
- Design, deploy, and manage cloud infrastructure on AWS, optimized for AI and ML workloads with high computational demands.
- Build, maintain, and optimize CI/CD pipelines tailored for AI/ML and GenAI applications.
- Automate model training, testing, deployment, and monitoring workflows.
- Ensure scalability, reliability, and high availability of AI-powered applications in production.
- Implement monitoring and observability systems to track model performance, data drift, logs, and system uptime.
- Deploy and operate applications using containerized and orchestrated environments (Docker, Kubernetes).
- Apply Infrastructure as Code practices using tools such as Terraform or Ansible.
- Contribute hands-on to complex application and web development projects, delivering high-quality, production-grade code when needed.
- Help build and scale a central GenAI application platform, enabling teams to share data, code, and best practices.
- Collaborate closely with data scientists, ML engineers, and backend developers to ensure secure and smooth deployment of AI services.
- Apply an SRE mindset to improve system resilience, operational excellence, and long-term maintainability.
- Build trust and collaboration through strong communication and active engagement with cross-functional partners.
Qualifications
- 5+ years of DevOps experience, with at least 1 year of MLOps and software engineering exposure.
- Strong experience designing and operating systems on AWS (experience with GCP or Azure is a plus).
- Hands-on experience with Docker and Kubernetes in production environments.
- Proven use of Infrastructure as Code tools such as Terraform (Ansible a plus).
- Strong scripting and automation skills using Python and/or Go, plus shell scripting.
- Experience building and operating CI/CD pipelines (e.g., GitHub Actions).
- Familiarity with vector databases and data-intensive systems is a plus.
- Exposure to MLOps tools such as MLflow, Kubeflow, or DVC is a nice-to-have.
- Solid understanding of site reliability engineering principles.
- Strong communication skills and the ability to work across disciplines, wear multiple hats, and thrive in a fast-paced environment.
- Bachelor’s degree in computer science, engineering, mathematics, statistics, or a related technical field (or equivalent experience).
Our Benefits
- Ownership through equity participation.
- Annual company retreat.
- Education bonus for continuous learning.
- Company-wide winter break.
- Paid time off.
- Optional in-person events and meetups.
- Tailored career roadmaps.
- High-performance culture.
