
Site Reliability Engineering Manager
Description
The Cloud team at Lucid Motors is seeking a Senior Site Reliability Engineering (SRE) Manager to lead the reliability, scalability, and
operational excellence of Lucid Motors’ cloud infrastructure and production services. This role combines hands-on technical leadership with people
management, ensuring systems are highly available while developing and empowering a team of site reliability engineers.
Responsibilities
• SRE Leadership & Reliability Ownership
o Own the availability, performance, and reliability of cloud services deployed and operated in the Kingdom of Saudi Arabia (KSA).
o Define, implement, and track SRE best practices, including SLIs, SLOs, SLAs, and error budgets.
o Lead the architecture and governance of highly available and disaster-resilient systems, ensuring DR strategies are tested and maintained.
o Drive capacity planning, auto-scaling, and performance tuning across Kubernetes-based platforms.
o Own monitoring, observability, and alerting using Prometheus, Grafana, and logging platforms.
o Lead incident response, impact assessment, and root-cause analysis for complex production issues.
• Team Management, Mentorship & Growth
o Manage a team of site reliability engineers, providing technical direction, career coaching, and performance feedback.
o Review and approve infrastructure code, deployment configurations, automation scripts, and SRE tooling.
o Foster a culture of ownership, learning, blameless postmortems, and continuous improvement.
o Lead hiring, onboarding, and skill development initiatives for the SRE function.
o Ensure fair, sustainable, and well-documented on-call rotations.
• Cloud Platforms & Automation
o Oversee production environments on Oracle Cloud Infrastructure (OCI) and AWS.
o Govern Infrastructure-as-Code practices using Terraform and configuration management tools.
o Lead CI/CD strategy and implementation using ArgoCD, Jenkins, Maven, Docker, and GitLab.
o Ensure secure and reliable deployment of microservices and data pipelines on Kubernetes using Helm.
• Platform Services & Data Systems
o Collaborate closely with Product Owners, Engineering Managers, Security, and Architecture teams.
o Oversee the reliability and scaling of platform services such as Kafka, Spark, Trino, Airflow, MQTT, and microservices ecosystems.
o Ensure stable operation of NoSQL and relational database systems, including Elasticsearch, MongoDB, PostgreSQL, and MySQL.
o Support distributed data processing and messaging systems, addressing performance and scalability challenges.
Requirements and Skills
- B.S. or M.S. degree in Computer Science, Engineering, or a related field.
- 8+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering.
- 2–4 years of experience managing or leading SRE/DevOps engineers.
- Strong hands-on experience with OCI and AWS cloud platforms.
- Solid expertise in Kubernetes, Terraform, CI/CD pipelines, and cloud-native architectures.
- Proficiency in Python, Go, Bash/Shell, or similar languages.
- Strong experience with incident management, observability, and performance optimization.
- Fluent in English, with experience collaborating across regions and time zones.
- Experience scaling SRE practices across multiple teams or services.
- Familiarity with compliance, security, and regulated cloud environments.
Additional Compensation and Benefits: Lucid offers a wide range of competitive benefits, including medical, dental, vision, life insurance, disability insurance, vacation, and 401k. The successful candidate may also be eligible to participate in Lucid’s equity program and/or a discretionary annual incentive program, subject to the rules governing such programs. (Cash or equity incentive awards, if any, will depend on various factors, including, without limitation, individual and company performance.)
By submitting your application, you understand and agree that your personal data will be processed in accordance with our Candidate Privacy Notice. If you are a California resident, please refer to our California Candidate Privacy Notice.