Principal Database Reliability Engineer

Austin, TX

Join Udemy. Help define the future of learning.

Udemy is an AI-powered skills acceleration platform built to help people and teams grow. It’s personalized, practical, and focused on real-world impact.

Our mission is simple: to transform lives through learning. Your work helps people around the world build skills they can use, whether they’re picking up something new or leveling up to stay ahead.

Over 80 million learners and 17,000 businesses already learn with Udemy. If you’re excited by change, energized by learning, and ready to have a real impact, you’ll feel right at home. 

Learn more about us on our company page.

About this role:

As part of Udemy's Platform team, the Datastore Infrastructure (DSI) team is responsible for all aspects of databases (MySQL, Aurora, DynamoDB), message queues (RabbitMQ), streaming (Kafka), and caching (Redis, Memcache) in our infrastructure. This includes ensuring uptime, security and compliance, observability, and performance, improving developer productivity, and developing future growth strategies. The team is split between EU and US regions. You will play a vital role in overseeing the day-to-day activities and engineering strategies of DSI, ensuring that millions of students worldwide achieve greater learning and career outcomes on Udemy. We value teamwork, a good sense of humor, strong ownership, technological curiosity, and a desire to learn.

To be successful in this role, you will collaborate closely with engineering, product, and a diverse set of stakeholders around the world. You are interested not just in maintaining systems but also in writing the software that maintains them. You strongly believe in a no-blame culture and advocate for humane on-call practices. You constantly seek opportunities for improvement and thrive in an environment where you can drive positive change.

 

What you'll be doing: 

  • Lead improvement projects for our datastores and platform teams to align with the company’s long-term objectives.
  • Maintain infrastructure uptime, monitor performance, and ensure infrastructure continues scaling as we grow.
  • Develop immutable infrastructure patterns and automate infrastructure provisioning via code (Terraform, Python, Ansible, etc.).
  • Ensure adherence to PCI and ISO 27001 compliance as well as SOC 2 security requirements, modifying CI/CD processes when necessary and upholding policies and standards.
  • Advocate for and implement positive changes in tools and processes through healthy discussions.
  • Participate in the on-call rotation, demonstrating a systematic approach to incident management.
  • Participate in day-to-day activities, support requests, and project-related tasks for the team.
  • Contribute to documentation, maintain ticketing queues, provide project support, troubleshoot, and offer after-hours assistance as required.
  • Provide coaching and mentorship to new hires, fostering their technical growth and integration into the team. Maintain close communication with team members throughout their tenure.

What you’ll have:

We do not expect you to have everything below, but the broader your mix of these skills, the easier your onboarding will be.

  • 8-10 years of professional experience on a Cloud Engineering, SRE, or DBRE team with infrastructure responsibilities managing large production workloads.
  • Proficiency with managing MySQL at scale (horizontal scaling, sharding, InnoDB optimization, query optimization, HA/DR, monitoring, backup strategy, security, automation).
  • Strong understanding of running production workloads on Kubernetes.
  • Proficiency with tools like Terraform, Ansible, and Git, and with Infrastructure as Code and automated provisioning.
  • Strong experience in Kafka cluster management, topic configuration, performance tuning, and ensuring high availability and fault tolerance. Experience with MSK is a plus.
  • Experience with message queues (MQ/SQS) and caching (Redis, Memcache) or similar products.
  • Experience in Python.
  • Knowledge of configuration management tools, monitoring systems (Datadog or similar) for database infrastructure, and scaling strategies for handling increased data volumes.
  • Strong troubleshooting skills to diagnose complex database issues.
  • Hands-on experience with AWS cloud infrastructure and a grasp of security best practices.
  • Adaptability and comfort working in a fast-paced, hands-on environment.

 

Nice to have:

  • Experience with additional programming languages (Golang, Kotlin, Java).
  • Experience implementing CDC pipelines for reliable data replication and synchronization.
  • Experience with the Vitess Operator running MySQL on Kubernetes.
  • Experience writing Kubernetes Helm charts.
  • Experience with tools like ArgoCD/Argo Workflows or similar alternatives.
  • Knowledge of security standards, vulnerability patching, TLS/SSL, and related topics.
  • Any additional experience or familiarity with related technologies would be advantageous.

 

We understand that not everyone will match each of the above qualifications. However, we also realize that everyone has unique experiences that can add value to our company. Even if you think your background might not perfectly align, we'd love to hear from you!

Posting Date: November 05, 2025
Application window: November 05, 2025 - December 05, 2025

At Udemy, we strive to be transparent around compensation. Actual compensation for this role is based on several factors, including but not limited to job-related skills, qualifications, experience, and specific work location due to differences in the cost of labor. In addition to a base salary, this role is also eligible for equity.

Hiring Compensation Range

$184,000 - $230,000 USD

Why work here?

You’ll grow here.
Learning is part of the job. You’ll get full access to Udemy courses, a monthly UDay to invest in yourself, and a budget to spend on whatever helps you improve. Many people are diving into AI lately, but what you focus on is up to you.

AI is real here.
We use it in the way we learn and the way we work. You’ll have the space and tools to experiment, apply, and get better at using AI in practical ways.

You’ll own your work.
We trust people to lead, make decisions, and follow through. You don’t need to wait for permission or layers of approval to have an impact.

You’ll build with others.
We collaborate openly and shape ideas together. Everyone has a voice, and good thinking is welcomed from any direction.

You’ll see your impact.
What you build helps people grow their skills, change their careers, or find a path forward. You’ve got the experience, why not use it to help others gain theirs?

Bring your curiosity. We’ll bring the platform and the support. Let’s LEARN together. 

Our Benefits Start with U

Our benefits start with you and were built to provide you and your family with the protection and care you need, making it easy to access the right coverage when you need it most. Benefits vary by region, and we encourage applicants to review our Australia Benefits, India Benefits, Ireland Benefits, Mexico Benefits, Turkiye Benefits, and US Benefits pages to get an understanding of some of the benefits we offer. For details on region-specific benefits, please refer to the information provided during the hiring process.

Benefits outlined are provided as a general overview and may vary depending on the location, role, and employment classification. All benefits are subject to change at the discretion of the organization and in accordance with applicable laws and policies.

At Udemy, we value diversity and inclusion and consider qualified applicants without regard to race, color, religion, sex, national origin, ancestry, age, genetic information, sexual orientation, gender identity, marital or family status, veteran status, medical condition, or disability.

Information regarding data privacy is available within the Udemy Careers Privacy Notice.

Demographic Questions

Voluntary Self-Identification

To support our inclusive recruiting process and for reporting purposes, we welcome you to participate in the self-identification survey. This survey is confidential, voluntary and anonymous. 

We believe everyone has something special to give – their authenticity, empathy, unique backgrounds. At Udemy, we make a promise to each other to respect that and be kind. And because we believe the best ideas are born as a result of people from all walks of life coming together, we work hard to create an inclusive space for all.

As part of Udemy’s Equal Employment Opportunity policy, we don’t discriminate based on any protected group status under any applicable law. So rest assured, whatever your decision, the survey will not be considered in the hiring process or thereafter.

Voluntary Self-Identification

For government reporting purposes, we ask candidates to respond to the below self-identification survey. Completion of the form is entirely voluntary. Whatever your decision, it will not be considered in the hiring process or thereafter. Any information that you do provide will be recorded and maintained in a confidential file.

As set forth in BEDI Partnerships’s Equal Employment Opportunity policy, we do not discriminate on the basis of any protected group status under any applicable law.

If you believe you belong to any of the categories of protected veterans listed below, please indicate by making the appropriate selection. As a government contractor subject to the Vietnam Era Veterans Readjustment Assistance Act (VEVRAA), we request this information in order to measure the effectiveness of the outreach and positive recruitment efforts we undertake pursuant to VEVRAA. Classification of protected categories is as follows:

A "disabled veteran" is one of the following: a veteran of the U.S. military, ground, naval or air service who is entitled to compensation (or who but for the receipt of military retired pay would be entitled to compensation) under laws administered by the Secretary of Veterans Affairs; or a person who was discharged or released from active duty because of a service-connected disability.

A "recently separated veteran" means any veteran during the three-year period beginning on the date of such veteran's discharge or release from active duty in the U.S. military, ground, naval, or air service.

An "active duty wartime or campaign badge veteran" means a veteran who served on active duty in the U.S. military, ground, naval or air service during a war, or in a campaign or expedition for which a campaign badge has been authorized under the laws administered by the Department of Defense.

An "Armed forces service medal veteran" means a veteran who, while serving on active duty in the U.S. military, ground, naval or air service, participated in a United States military operation for which an Armed Forces service medal was awarded pursuant to Executive Order 12985.

Voluntary Self-Identification of Disability

Form CC-305
Page 1 of 1
OMB Control Number 1250-0005
Expires 04/30/2026

Why are you being asked to complete this form?

We are a federal contractor or subcontractor. The law requires us to provide equal employment opportunity to qualified people with disabilities. We have a goal of having at least 7% of our workers as people with disabilities. The law says we must measure our progress towards this goal. To do this, we must ask applicants and employees if they have a disability or have ever had one. People can become disabled, so we need to ask this question at least every five years.

Completing this form is voluntary, and we hope that you will choose to do so. Your answer is confidential. No one who makes hiring decisions will see it. Your decision to complete the form and your answer will not harm you in any way. If you want to learn more about the law or this form, visit the U.S. Department of Labor’s Office of Federal Contract Compliance Programs (OFCCP) website at www.dol.gov/ofccp.

How do you know if you have a disability?

A disability is a condition that substantially limits one or more of your “major life activities.” If you have or have ever had such a condition, you are a person with a disability. Disabilities include, but are not limited to:

  • Alcohol or other substance use disorder (not currently using drugs illegally)
  • Autoimmune disorder, for example, lupus, fibromyalgia, rheumatoid arthritis, HIV/AIDS
  • Blind or low vision
  • Cancer (past or present)
  • Cardiovascular or heart disease
  • Celiac disease
  • Cerebral palsy
  • Deaf or serious difficulty hearing
  • Diabetes
  • Disfigurement, for example, disfigurement caused by burns, wounds, accidents, or congenital disorders
  • Epilepsy or other seizure disorder
  • Gastrointestinal disorders, for example, Crohn's Disease, irritable bowel syndrome
  • Intellectual or developmental disability
  • Mental health conditions, for example, depression, bipolar disorder, anxiety disorder, schizophrenia, PTSD
  • Missing limbs or partially missing limbs
  • Mobility impairment, benefiting from the use of a wheelchair, scooter, walker, leg brace(s) and/or other supports
  • Nervous system condition, for example, migraine headaches, Parkinson’s disease, multiple sclerosis (MS)
  • Neurodivergence, for example, attention-deficit/hyperactivity disorder (ADHD), autism spectrum disorder, dyslexia, dyspraxia, other learning disabilities
  • Partial or complete paralysis (any cause)
  • Pulmonary or respiratory conditions, for example, tuberculosis, asthma, emphysema
  • Short stature (dwarfism)
  • Traumatic brain injury

PUBLIC BURDEN STATEMENT: According to the Paperwork Reduction Act of 1995 no persons are required to respond to a collection of information unless such collection displays a valid OMB control number. This survey should take about 5 minutes to complete.