Senior Site Reliability Engineer
Snyk, the leader in secure AI software development, empowers organizations to build fast and stay secure by unleashing developer productivity and reducing business risk. The company’s AI Trust Platform seamlessly integrates into developer and security workflows to accelerate secure software delivery in the AI Era. Snyk delivers trusted, actionable insights and automated remediation, enabling modern organizations to innovate without limits. Snyk is redefining secure AI-driven software delivery for over 4,500 customers worldwide today.
Joining Snyk means embracing our core values: One Team, Care Deeply, Customer Centric, and Forward Thinking. As a member of our team, you’ll have the opportunity to thrive in a dynamic environment where fostering collaboration, leading with empathy, driving business impact, and inspiring trust are at the heart of everything we do.
Our Opportunity
Snyk, a leader in developer security, has acquired Probely, a modern Dynamic Application Security Testing (DAST) provider based in Portugal that covers web application and API security testing.
We are seeking a skilled and proactive Site Reliability Engineer (SRE) to join our team and support our hypergrowth by building scalable, reliable, and secure cloud infrastructure. You will be responsible for ensuring the performance and uptime of our systems while adopting DevOps best practices and leveraging modern tools.
You’ll Spend Your Time:
- Designing, deploying, and maintaining infrastructure on AWS, including VPC, EC2, RDS, IAM and EKS clusters.
- Managing Kubernetes clusters across multiple environments with a focus on performance, security, and availability.
- Using ArgoCD, Kustomize and Helm for continuous deployment and GitOps workflows.
- Implementing and managing monitoring and alerting systems using Prometheus, Grafana, and custom exporters.
- Maintaining centralized logging and observability using Graylog and OpenSearch.
- Automating infrastructure provisioning with Terraform and custom scripting in Python or Bash.
- Implementing best practices around networking, including VPN, load balancing, routing, and firewalls.
- Troubleshooting complex system issues across network, infrastructure, and application layers.
- Ensuring high availability, scalability, and disaster recovery across all systems.
- Collaborating with development and operations teams to improve deployment processes and infrastructure resiliency.
What You’ll Need:
- Strong hands-on experience with AWS services (VPC, EC2, EKS, RDS, IAM).
- Deep understanding of Kubernetes architecture and day-to-day cluster management.
- Experience with Cloudflare products (DNS, Zero Trust, WAF, CDN).
- Proficiency in the Prometheus + Grafana monitoring stack.
- Strong experience with Calico for managing Kubernetes network policies.
- Solid experience with Graylog and OpenSearch for logging and search analytics.
- Proficiency with Infrastructure as Code tools, especially Terraform, Kustomize and Helm.
- Experience with CI/CD pipelines and GitOps practices using ArgoCD.
- Strong scripting and automation skills in Bash and/or Python.
- Solid knowledge of networking principles (TCP/IP, DNS, HTTP/HTTPS, VPNs, security groups, etc.).
We’d be Lucky if You Have:
- Familiarity with incident management practices (on-call, runbooks, postmortems, disaster recovery).
- Understanding of Zero Trust security models and security best practices in cloud environments.
- Exposure to Service Mesh (Istio, Linkerd) and container networking.
- Experience with cost optimization and cloud spend monitoring.
- Familiarity with Linux system administration and shell scripting.
- Knowledge of RBAC and IAM in AWS and Kubernetes.
We care deeply about the warm, inclusive environment we’ve created and we value diversity – we welcome applications from those typically underrepresented in tech. If you like the sound of this role but are not totally sure whether you’re the right person, do apply anyway!
About Snyk
Snyk is committed to creating an inclusive and engaging environment where our employees can thrive as we rally behind our common mission to make the digital world a safer place. From Snyk employee resource groups, to global benefits that help our employees prioritize their health, wellness, financial security, and a work/life blend, we aim to support our employees along their entire journeys here at Snyk.
Benefits & Programs
Prioritize health, wellness, financial security, and life balance with programs tailored to your location and role.
- Flexible working hours, work-from-home allowances, in-office perks, and time off for learning and self-development
- Generous vacation and wellness time off, country-specific holidays, and 100% paid parental leave for all caregivers
- Health benefits, employee assistance plans, and annual wellness allowance
- Country-specific life insurance, disability benefits, and retirement/pension programs, plus mobile phone and education allowances