
Senior Platform Engineer
Colombia
For more than 20 years, our global network of passionate technologists and pioneering craftspeople has delivered cutting-edge technology and game-changing consulting to companies on the brink of AI-driven digital transformation. Since 2001, we have grown into a full-service digital consulting company with 5500+ professionals working on a worldwide ambition. Driven by the desire to make a difference, we keep innovating, fueling the growth of our company with our knowledge worker culture. When teaming up with Xebia, expect in-depth expertise based on an authentic, value-led, and high-quality way of working that inspires all we do.
About the Role
We are looking for a Site Reliability Engineer (SRE) with strong DevOps skills to join our Platform Engineering team in LATAM. In this role, you will work at the intersection of infrastructure, automation, and operations, delivering cloud-native solutions that empower developers, streamline systems, and elevate reliability. You’ll be responsible for managing cloud infrastructure on Google Cloud Platform (GCP), supporting self-service tooling, and ensuring platform resilience through monitoring and governance. This position is ideal for professionals who thrive in fast-paced, collaborative environments and are passionate about scaling and securing cloud-native platforms.
What You’ll Do
Infrastructure and Automation
- Provision and manage GCP infrastructure using Terraform, implementing reusable and scalable modules.
- Deliver reliable, self-service infrastructure that supports developer velocity and operational consistency.
- Perform upgrades and patching of shared platform services, ensuring availability and compliance.
- Standardize infrastructure configurations using Infrastructure as Code (IaC) best practices.
Operational Support
- Act as the primary technical resource for platform-related incidents, alerts, and system escalations.
- Collaborate with development teams to troubleshoot deployment and cloud configuration issues.
- Monitor platform health, track system performance, and optimize cost efficiency using GCP-native tools.
- Support platform reliability through on-call rotations, incident response practices, and runbook maintenance.
Security & Governance
- Contribute to the design and rollout of a secure and scalable IAM policy framework across GCP projects.
- Help implement a consistent enterprise-wide resource labeling strategy to support auditability, cost tracking, and lifecycle governance.
- Work with InfoSec and Compliance teams to ensure cloud environments meet internal and regulatory requirements.
Collaboration & Knowledge Sharing
- Partner with developers, data scientists, and security teams to support infrastructure needs across projects.
- Maintain internal documentation and platform wikis to improve onboarding, adoption, and issue resolution.
- Share knowledge through code reviews, design sessions, and internal demos to promote a culture of platform reliability.
- Support technical evaluations of other consultants when required, contributing to the assessment of skills and alignment with project needs.
What You Bring
- 5+ years of experience in a Site Reliability, DevOps, or Cloud Infrastructure Engineering role.
- Proficiency in provisioning infrastructure using Terraform on Google Cloud Platform (GCP).
- Strong understanding of infrastructure provisioning, networking, IAM, and cloud-native best practices.
- Solid knowledge of CI/CD processes, system monitoring, and distributed system troubleshooting.
- Experience collaborating with multiple stakeholders in a dynamic, fast-paced environment.
- Fluent in English, with excellent communication and documentation skills.
Nice to have:
- Experience writing automation or infrastructure tooling scripts in Python.
- Familiarity with GCP labeling/tagging best practices and cost tracking strategies.
- Exposure to incident response workflows and participation in on-call rotations.
- Understanding of cloud cost management principles and budget optimization strategies.
What We Offer
- 100% remote work to provide flexibility and work-life balance.
- Company laptop and necessary equipment to perform your role effectively.
- Competitive salary package aligned with local market benchmarks.