Senior Manager - Engineering
Job Description: Sr. Engineering Manager SRE
Company Overview: Myntra’s Engineering team builds the technology platform that
empowers our customers’ shopping experience and enables the smooth flow of
products from suppliers to our customers’ doorsteps. We work on areas such as
building massive-scale web applications, engaging user interfaces, big-data analytics,
mobile apps, workflow systems, inventory management, etc. We are a small technology
team where each individual has a huge impact. You will have the opportunity to be part
of a rapidly growing organization and gain exposure to all the parts of a comprehensive
e-commerce platform.
About The Team: The Cloud Platform Engineering (CPE) group is responsible for
developing and managing platforms that allow Myntra’s tech products to be deployed
and run at scale. The CPE team builds and maintains centralized and high-scale
platforms for sophisticated application security frameworks, log collection, monitoring
systems, access management, secret management, database access, change
management systems, build, release and deployment. You will be part of the SRE team
under the CPE division.
Position: Sr. Engineering Manager - Site Reliability Engineering (SRE)
Location: Bengaluru
Employment Type: Full-time
Role Overview:
We are looking for a Senior Engineering Manager - SRE to lead the reliability,
scalability, and automation efforts of our e-commerce platform. This role will focus on
architecting observability solutions, automation frameworks, and cloud-native
infrastructure while driving a culture of Site Reliability Engineering. The ideal candidate
will have strong leadership experience, deep technical expertise, and a passion for
building highly available, scalable, and secure distributed systems.
Responsibilities: Hosting infrastructure and setting up the core platform form the
backbone of any system. As part of this team, you will:
● Strategize and lead the implementation of scalable monitoring, logging,
and observability solutions.
● Drive automation initiatives using Python or Golang, reducing manual
intervention and improving operational and functional efficiency.
● Own the reliability roadmap, ensuring high availability of services deployed on
Kubernetes and cloud platforms (AWS, GCP, Azure).
● Lead and mentor a team of SREs and DevOps engineers, fostering a
culture of automation, reliability, and continuous improvement.
● Collaborate with engineering teams to define SRE best practices and implement
them across the organization.
● Design and enforce SLIs, SLOs, and SLAs, ensuring reliability targets are met.
● Develop and oversee incident response processes, ensuring effective
post-mortems and root cause analysis.
● Optimize cloud infrastructure, ensuring cost efficiency, scalability, and security.
● Define and implement CI/CD pipelines for efficient software deployment and
rollback mechanisms.
● Advocate for SRE and DevOps principles, influencing architectural decisions and
organizational strategies.
Requirements:
● 15+ years of experience in software engineering, SRE, or DevOps, with at least
5 years in a leadership role.
● Proven experience in architecting observability platforms (Prometheus, Grafana,
ELK, OpenTelemetry, etc.).
● Strong expertise in automation and scripting using Python or Golang.
● Deep understanding of Kubernetes and container orchestration in production
environments.
● Hands-on experience with cloud platforms (AWS, GCP, or Azure) and cloud-native
architectures.
● Experience with Infrastructure as Code (IaC) tools like Terraform, Helm, or
Ansible.
● Expertise in CI/CD tooling (Jenkins, GitHub Actions, ArgoCD, etc.).
● Strong background in scalability, performance tuning, and high-availability
architectures.
● Ability to mentor and grow high-performing engineering teams.
● Exceptional problem-solving skills and