Job Application for Senior Site Reliability Engineer, PaaS at Algolia

Algolia is set to enable every company to create world-class Search and Discovery experiences with an API-first approach. Performance and scalability are at the heart of our mission: we power 1.5 trillion searches a year, for 10K+ customers all over the world.

If you're a problem solver, able to think outside the box and eager to nurture others and learn from them, then this is your challenge!

The Team

The Platform as a Service (PaaS) team is dedicated to empowering development teams by creating toolchains, guidelines, and standards. Our focus is on enabling seamless automation and CI/CD, comprehensive observability, and unwavering reliability in a secured cloud-native environment.

The Opportunity

The Senior Site Reliability Engineer (IC4) position within the Platform as a Service team presents an exciting opportunity for a seasoned professional to enhance scalable infrastructure with a focus on CI/CD, Observability, and application hosting. In this role, you will bridge the gap between our junior and senior staff, playing a critical role in ensuring the reliability, scalability, and performance of Algolia’s Search Products.

As a senior contributor, you will be responsible for building and optimizing systems that ensure the platform’s efficiency and reliability, while also mentoring junior engineers and collaborating across teams. Your work will be pivotal in improving infrastructure, enhancing observability standards, and streamlining CI/CD processes. You will play a significant role in transitioning legacy systems to a modern Kubernetes-based architecture, contributing to long-term infrastructure strategies, and ensuring alignment with business needs.

Your role will consist of:

CI/CD Development and Maintenance: Contribute to the design, optimization, and maintenance of the CI/CD pipelines to improve the speed, reliability, and efficiency of the development lifecycle. Assist in driving standardization across various services hosted on the platform.
Observability Enhancement: Lead efforts to improve the observability of critical systems, working closely with cross-functional teams to ensure actionable monitoring and alerting frameworks are in place. Help troubleshoot complex issues and optimize system reliability.
Kubernetes and Cloud Management: Contribute to the development and operation of our Kubernetes-based architecture. Ensure systems are resilient, scalable, and optimized for performance. Actively participate in enhancing cloud-based solutions for API management and microservices.
System Optimization and Scaling: Collaborate with team members to ensure system scalability, operability, and performance. Lead initiatives to optimize resource utilization, focusing on cost efficiency while maintaining high system availability.
Mentorship and Knowledge Sharing: Mentor mid-level engineers (IC3) by providing guidance on technical challenges and SRE best practices. Support team growth by fostering knowledge-sharing sessions and helping establish processes that drive operational excellence.
Cross-Team Collaboration: Work closely with product, software, and other SRE teams to ensure that platform goals align with broader business objectives. Drive initiatives aimed at enhancing platform stability, security, and scalability.

You might be a fit if you have:

Strong Programming Skills: Proficient in Golang and Python with a solid understanding of software craftsmanship. Knowledge of Ruby is a plus.
Experience in CI/CD Pipelines: Hands-on experience in building and maintaining CI/CD pipelines using tools like GitHub Actions, CircleCI, or alternatives. Familiarity with best practices for ensuring build and deployment reliability.
Observability: Experience designing and implementing monitoring, alerting, and observability frameworks that provide actionable insights. Strong troubleshooting skills in production environments.
Kubernetes and Cloud Infrastructure: Proven experience in managing and optimizing Kubernetes-based architectures and working with public cloud providers such as GCP, AWS, or Microsoft Azure.
Distributed Systems Expertise: Experience in designing, building, and operating distributed systems at scale, with a focus on reliability, availability, and performance.
Mentorship and Leadership: Experience mentoring junior engineers and helping them grow. Ability to collaborate with cross-functional teams and contribute to strategic initiatives.
Problem-Solving Skills: Ability to independently solve complex technical problems with minimal supervision while collaborating effectively with other team members.
Excellent Communication and Organizational Skills: Strong ability to communicate complex technical issues to both technical and non-technical audiences. Ability to organize and prioritize multiple projects.

We’re looking for someone who can live our values:

GRIT - Problem-solving and perseverance capability in an ever-changing and growing environment.
TRUST - Willingness to trust our co-workers and to take ownership.
CANDOR - Ability to receive and give constructive feedback.
CARE - Genuine care about other team members, our clients and the decisions we make in the company.
HUMILITY - Aptitude for learning from others, putting ego aside.

REMOTE STRATEGY:

Algolia’s flexible workplace model is designed to empower all Algolians to fulfill our mission to power search and discovery with ease. We place an emphasis on an individual’s impact, contribution, and output, over their physical location. Algolia is a high-trust environment and our team members have the autonomy to choose where they want to work and when. We know community comes in many forms and strive to create opportunities for intentional in-person connection in our offices and virtually for our remote colleagues around the world.

We have a global presence with physical offices in Paris, NYC, London, Sydney and Bucharest.

ABOUT US:

Algolia prides itself on being a pioneer and market leader offering an AI Search solution that empowers 17,000+ businesses to compose customer experiences at internet scale that predict what their users want, with blazing-fast search and web browse experiences. Algolia powers more than 30 billion search requests a week, four times more than Microsoft Bing, Yahoo, Baidu, Yandex and DuckDuckGo combined.

Algolia is part of a cadre of innovative new companies that are driving the next generation of software development, creating APIs that make developers’ lives easier; solutions that are better than building from scratch and better than having to tweak monolithic SaaS solutions.

In 2021, the company closed $150 million in Series D funding and quadrupled its post-money valuation to $2.25 billion. Being well capitalized enables Algolia to continue to invest in its market-leading platform, to better serve its thousands of customers, including Under Armour, PetSmart, Stripe, Gymshark, and Walgreens, to name just a few.

WHO WE'RE LOOKING FOR:

We’re looking for talented, passionate people to build the world’s best search & discovery technology. As an ownership-driven company, we seek team members who thrive within an environment based on autonomy and diversity. We're committed to building an inclusive and diverse workplace. We care about each other and the world around us, and embrace talented people regardless of their race, age, ancestry, religion, sex, gender identity, sexual orientation, marital status, color, veteran status, disability and socioeconomic background.

READY TO APPLY?

If you share our values and our enthusiasm for building the world’s best search & discovery technology, we’d love to review your application!
