Senior SRE Engineer - future opening
Who We Are
While Xebia is a global tech company, our journey in CEE started with two Polish companies – PGS Software, known for world-class cloud and software solutions, and GetInData, a pioneer in Big Data. Today, we’re a team of 1,000+ experts delivering top-notch work across cloud, data, and software. And we’re just getting started.
What We Do
We work on projects that matter – and that make a difference. From fintech and e-commerce to aviation, logistics, media, and fashion, we help our clients build scalable platforms, data and AI solutions, and cutting-edge applications to shape the future of tech. Our clients include McLaren, Aviva, Deloitte, Spotify, Disney, ING, UPS, Tesco, Truecaller, AllSaints, Volotea, Schmitz Cargobull, Allegro, InPost, and many, many more.
We value smart tech, real ownership, and continuous growth. We use modern, open-source stacks, and we’re proud to be trusted partners of Databricks, dbt, Snowflake, Azure, GCP, and AWS. Fun fact: we were the first AWS Premier Partner in Poland!
Beyond Projects
What makes Xebia special? Our community. We support tech communities, organize meetups (Software Talks, Data Tech Talks), and have a culture that actively supports your growth via Guilds, Labs, and personal development budgets for both tech and soft skills. It’s not just a job. It’s a place to grow.
What sets us apart?
Our mindset. Our vibe. Our people. And while that’s hard to capture in text – come visit us and see for yourself.
You will be:
- designing and implementing SRE practices, including SLI/SLO frameworks, error budgets, toil budgets, and reliability reviews,
- leading the maturity progression from Level 1 (Reactive) through Level 5 (Autonomous),
- driving toil elimination by identifying, measuring, and automating repetitive operational work,
- designing and executing chaos engineering experiments to proactively identify reliability weaknesses,
- establishing production readiness review processes for new application onboarding,
- collaborating with engineering teams on joint RCA backlogs and incident reduction initiatives,
- defining and tracking SRE KPIs, including MTTD, MTTR, error budget consumption, toil ratio, and automation coverage,
- mentoring L2 engineers in SRE practices and engineering-led problem solving,
- contributing to capacity planning, performance engineering, and reliability architecture reviews,
- championing a blameless post-incident culture and continuous improvement.
Your profile:
- 5–8 years of experience in SRE, DevOps, or platform engineering,
- practical experience using AI-powered assistants (e.g. Claude Code, GitHub Copilot, Cursor) to improve productivity, quality, or decision-making in software delivery,
- deep understanding of SRE principles (Google SRE book concepts), including SLIs, SLOs, error budgets, and toil elimination,
- strong programming skills in Python, Go, or similar languages,
- extensive experience with cloud platforms such as AWS, Azure, or GCP, as well as Kubernetes,
- proficiency with observability tools, including Datadog, Splunk, Prometheus, and Grafana,
- experience with Infrastructure as Code (Terraform, Ansible) and CI/CD pipelines,
- proven track record of driving reliability improvements in production environments,
- experience with chaos engineering tools such as Gremlin, Chaos Monkey, or Litmus,
- strong analytical, problem-solving, and English communication skills (at least B2 level).
You must work from within the European Union and hold a valid work permit.
Nice to have:
- experience applying GenAI in a more structured way within the SDLC, including defined workflows, prompt patterns, or tool integrations embedded into daily work,
- experience in managed services or consulting environments,
- familiarity with AIOps and ML-driven operations,
- contributions to the SRE community through talks, articles, or open-source projects,
- experience working with large-scale distributed systems (1,000+ services),
- SRE or cloud architect certifications,
- interest in and familiarity with emerging AI-driven practices (e.g. agent-based workflows, automation patterns, AI-augmented development), with a willingness to explore and experiment beyond standard approaches.
Recruitment Process:
CV review – HR call – Technical Interview – Client Interview – Decision