
Site Reliability Engineer - Database (7 to 10 Years)
About PhonePe Limited:
PhonePe Limited is headquartered in India. Its flagship product, the PhonePe digital payments app, was launched in Aug 2016. As of April 2025, PhonePe has over 60 Crore (600 Million) registered users and a digital payments acceptance network of over 4 Crore (40+ Million) merchants. PhonePe also processes over 33 Crore (330+ Million) transactions daily, with an Annualized Total Payment Value (TPV) of over INR 150 lakh crore.
PhonePe’s portfolio of businesses includes the distribution of financial products (Insurance, Lending, and Wealth) as well as new consumer tech businesses in India (Pincode - hyperlocal e-commerce, and Indus AppStore - Localized App Store for the Android ecosystem). These are aligned with the company’s vision to offer every Indian an equal opportunity to accelerate their progress by unlocking the flow of money and access to services.
Culture:
At PhonePe, we go the extra mile to make sure you can bring your best self to work, every day! And that starts with creating the right environment for you. We empower people and trust them to do the right thing. Here, you own your work from start to finish, right from day one. PhonePe-rs solve complex problems and execute quickly, often building frameworks from scratch. If you’re excited by the idea of building platforms that touch millions, ideating with some of the best minds in the country, and executing on your dreams with purpose and speed, join us!
Site Reliability Engineer - Database
Experience: 7 to 10 Years
We are seeking a highly skilled and experienced Site Reliability Engineer (7 to 10 years of experience) with deep expertise in MySQL database administration and a solid foundation in Linux systems engineering. You will play a critical role in ensuring the resilience, scalability, and performance of our distributed, high-volume database infrastructure, which spans tens of terabytes of data across multiple data centers. In this role, you will design, build, and lead initiatives to improve reliability and efficiency across the database stack, mentor SRE/DBA team members, and drive strategic improvements to infrastructure.
Responsibilities
- Database Architecture & Management: Lead the design, provisioning, and lifecycle management of large-scale MySQL/Galera multi-master clusters across multiple geographic locations.
- Reliability Engineering: Develop and implement database reliability strategies, including automated failure recovery and disaster recovery solutions.
- Troubleshooting & Support: Investigate and resolve database-related issues, including performance problems, connectivity issues, and data corruption.
- Performance, Optimization & Security: Own and continuously improve performance tuning (query optimization, indexing, and resource management), security hardening, and high availability of database systems.
- Operational Excellence:
  - Standardize and automate database operational tasks such as upgrades, backups, schema changes, and replication management.
  - Drive capacity planning, monitoring, and incident response across infrastructure.
- Incident Management: Proactively identify, diagnose, and resolve complex production issues in collaboration with the engineering team.
- On-Call & Tooling:
  - Participate in and enhance on-call rotations, implementing tools to reduce alert fatigue and human error.
  - Develop and maintain observability tooling for database systems.
- Leadership & Mentorship: Mentor and guide junior and mid-level SREs and DBAs, fostering knowledge sharing and skill development within the team.
Skills and Qualifications
Core Expertise:
- Expertise in Linux systems administration, scripting (Bash/Python), file systems, disk management, and debugging system-level performance issues.
- 7–8+ years of hands-on experience in MySQL database administration in large-scale, high-availability environments.
- Deep understanding of MySQL internals, InnoDB storage engine, replication mechanisms (async, semi-sync, Galera), and tuning parameters.
- Proven experience managing 100+ production clusters and databases larger than 1TB in size.
Preferred Experience:
- Hands-on experience with Galera clusters is a strong plus.
- Familiarity with Infrastructure-as-Code tools like Ansible, Terraform, or similar.
- Experience with observability tools such as Prometheus, Grafana, or Percona Monitoring & Management.
- Exposure to NoSQL databases (e.g., Aerospike) is a plus.
- Experience working in on-premise environments is highly desirable.
Leadership & Communication:
- Proven ability to lead cross-functional initiatives, including database migrations, major version upgrades, and scaling efforts.
- Excellent communication skills with a demonstrated track record of mentoring and technical leadership.
PhonePe Full Time Employee Benefits (Not applicable for Intern or Contract Roles)
- Insurance Benefits - Medical Insurance, Critical Illness Insurance, Accidental Insurance, Life Insurance
- Wellness Program - Employee Assistance Program, Onsite Medical Center, Emergency Support System
- Parental Support - Maternity Benefit, Paternity Benefit Program, Adoption Assistance Program, Day-care Support Program
- Mobility Benefits - Relocation benefits, Transfer Support Policy, Travel Policy
- Retirement Benefits - Employee PF Contribution, Flexible PF Contribution, Gratuity, NPS, Leave Encashment
- Other Benefits - Higher Education Assistance, Car Lease, Salary Advance Policy
Our inclusive culture promotes individual expression, creativity, innovation, and achievement, and in turn helps us better understand and serve our customers. We see ourselves as a place for intellectual curiosity, ideas, and debates, where diverse perspectives lead to deeper understanding and better-quality results. PhonePe is an equal opportunity employer and is committed to treating all its employees and job applicants equally, regardless of gender, sexual preference, religion, race, color, or disability. If you have a disability or special need that requires assistance or reasonable accommodation during the application and hiring process, including support for the interview or onboarding process, please fill out this form.
Read more about PhonePe on our blog.