
Senior Cloud Ops Engineer
Want to help us help others? We’re hiring!
GoFundMe is the world’s most powerful community for good, dedicated to helping people help each other. By uniting individuals and nonprofits in one place, GoFundMe makes it easy and safe for people to ask for help and support causes—for themselves and each other. Together, our community has raised more than $40 billion since 2010.
Join us! The GoFundMe team is searching for our next Senior Cloud Ops Engineer to join our Platform Infrastructure and Operations team. This crucial role focuses on building and maintaining an advanced cloud infrastructure vital for our online fundraising platform, which supports nonprofits worldwide. You will be instrumental in ensuring our infrastructure achieves 99.999% availability, meeting the high demands of our global payments platform.
Candidates considered for this role will be located in Buenos Aires, Argentina. There is an in-office requirement of 2-3 days a week.
The Job
- Design and implement robust, fault-tolerant cloud solutions to process billions of dollars annually, ensuring scalability, resilience, and compliance.
- Share expertise and foster a culture of continuous improvement, innovation, and learning within the team, contributing to technical mentorship and knowledge sharing.
- Participate in strategic decisions regarding cloud architecture, influencing the adoption of best practices and cutting-edge technologies.
- Work collaboratively to enhance system performance, observability, and reliability across the infrastructure, focusing on improving real-time monitoring and logging for operational excellence.
- Lead initiatives to improve infrastructure resiliency, leveraging tools like AWS Resilience Hub and Fault Injection Simulator to test and enhance system robustness.
- Drive application resilience by designing and executing load tests, simulating infrastructure faults, and analyzing results to improve fault tolerance.
- Incorporate scalability and performance testing as integral parts of service design, ensuring services meet reliability and performance goals under high transaction volumes.
- Embed testing phases within CI/CD pipelines to promote shift-left performance testing practices, improve efficiency, and reduce development cycle times.
- Contribute to implementing and analyzing DORA (DevOps Research and Assessment) metrics to enhance the efficiency and effectiveness of the development lifecycle.
- Participate in an on-call rotation to promptly address and resolve critical incidents, ensuring continuous operational excellence and rapid recovery during outages.
You
- Bachelor’s degree in Computer Science or a related field, or 8+ years of equivalent practical experience.
- Minimum of 6 years of experience designing and managing scalable, cloud-based infrastructure, preferably in SaaS environments.
- Deep technical expertise with a strong foundation in computer science, sharp engineering skills, and a commitment to delivering high-quality solutions.
- Expert-level knowledge of AWS cloud services, container technologies like Docker and Kubernetes, and Infrastructure as Code (IaC) tools like Terraform and CloudFormation.
- Proficiency in software architecture, including asynchronous event-driven architecture and microservices.
- Experience in performance and reliability testing using tools like Artillery, k6, or similar frameworks.
- Experience in defining, monitoring, and managing Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to ensure the cloud infrastructure consistently meets performance and availability targets.
- Proven expertise in disaster recovery planning and execution, including developing and implementing robust strategies to maintain business continuity and achieve rapid recovery in the event of an outage.
- Hands-on experience with application performance management (APM) tools like New Relic, Datadog, or Splunk.
- Advanced scripting and development skills in Bash, PHP, and Node.js.
- Skilled in managing distributed data systems, troubleshooting complex issues under high load, and designing for high transaction volumes.
- Knowledgeable about compliance standards and regulations, including PCI, SOC 2, and GDPR.
Preferred
- AWS cloud certifications.
- Experience with fault-tolerant system design, large-scale distributed systems, and high-transaction environments.
- Familiarity with tools and processes for infrastructure resiliency and fault injection testing.
Traits
- Strong collaborative skills with a track record of leading initiatives and working with cross-functional teams.
- Adaptability and the ability to thrive in a fast-paced, agile environment.
- Excellent communication skills, capable of effectively collaborating across diverse teams and cultural backgrounds.
Why you’ll love it here
- Make an Impact: Be part of a mission-driven organization making a positive difference in millions of lives every year.
- Innovative Environment: Work with a diverse, passionate, and talented team in a fast-paced, forward-thinking atmosphere.
- Collaborative Team: Join a fun and collaborative team that works hard and celebrates success together.
- Competitive Benefits: Enjoy competitive pay and comprehensive healthcare benefits.
- Holistic Support: Enjoy financial assistance for things like hybrid work and family planning, along with generous parental leave, flexible time-off policies, and mental health and wellness resources to support your overall well-being.
- Growth Opportunities: Participate in learning, development, and recognition programs to help you thrive and grow.
- Commitment to DEI: Contribute to diversity, equity, and inclusion through ongoing initiatives and employee resource groups.
- Community Engagement: Make a difference through our volunteering and Gives Back programs.
We live by our core values: impatient to be great, find a way, earn trust every day, fueled by purpose. Be a part of something bigger with us!
GoFundMe is proud to be an equal opportunity employer that actively pursues candidates of diverse backgrounds and experiences. We do not discriminate on the basis of race, color, religion, ethnicity, nationality or national origin, sex, sexual orientation, gender, gender identity or expression, pregnancy status, marital status, age, medical condition, mental or physical disability, or military or veteran status.
Individual pay is determined by work location and additional factors including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range based on your location during the hiring process.
If you require a reasonable accommodation to complete a job application or a job interview or to otherwise participate in the hiring process, please contact us at accommodationrequests@gofundme.com.
Global Data Privacy Notice for Job Candidates and Applicants:
Depending on your location, the General Data Protection Regulation (GDPR) or certain US privacy laws may regulate the way we manage the data of job applicants. Our full notice outlining how data will be processed as part of the application procedure for applicable locations is available here. By submitting your application, you are agreeing to our use and processing of your data as required.
Learn more about GoFundMe:
We’re proud to partner with GoFundMe.org, an independent public charity, to extend the reach and impact of our generous community, while helping drive critical social change. You can learn more about GoFundMe.org’s activities and impact in their FY ’24 annual report.
Our annual “Year in Help” report reflects our community’s impact in advancing our mission of helping people help each other.
For recent company news and announcements, visit our Newsroom.