Site Reliability Engineer
Key Responsibilities:
Infrastructure Design & Management: Design, build, and maintain scalable infrastructure on
AWS, Google Cloud, or Azure.
Automation: Implement Infrastructure as Code (IaC) using tools like Terraform and Ansible
for automation.
Cost Management: Monitor and manage cloud infrastructure costs efficiently.
Security: Build and maintain security frameworks across infrastructure and operations.
Monitoring & Maintenance: Establish robust monitoring systems using tools such as New
Relic, Datadog, and Prometheus.
Ops & Deployment: Develop and maintain CI/CD pipelines, and manage deployment
processes.
Required Skills & Experience:
Experience with cloud computing services (AWS, Google Cloud, Azure).
Experience with GitHub, GitHub Actions, ECR, ECS, Docker, and EC2.
Proficiency in scripting languages (e.g., JavaScript, Python, PHP, Ruby, Go).
Hands-on experience with IaC tools (Terraform, CloudFormation, Ansible).
Knowledge of system monitoring tools (New Relic, Datadog, Prometheus).
Preferred Skills & Experience:
Experience in incident management and performance tuning.
Familiarity with cloud-native technologies (Kubernetes, Docker, Istio).
Experience in web application development.
Ideal Candidate Attributes:
Strategic thinker with the ability to view projects from a high-level perspective.
Strong sense of responsibility and integrity.
Excellent communication and collaboration skills.