Senior Site Reliability Engineer

Chennai, Tamil Nadu, India

Who we are

Arcadia is the technology company empowering energy innovators and consumers to fight the climate crisis. Our software and APIs are revolutionizing an industry held back by outdated systems and institutions by creating unprecedented access to the data and clean energy needed to make a decarbonized energy grid possible.

In 2014, Arcadia set out on its mission to break the fossil fuel monopoly, and since then we have been knocking down the institutional barriers to unlock decarbonization. To date, we have connected hundreds of thousands of consumers and small businesses with high-quality clean energy options. Today, we’re thinking even bigger. We have launched Arcadia Platform, an industry-defining SaaS platform that empowers developers and energy innovators to deliver their own custom, personalized energy experiences, accelerating the transformation of the industry from an analog energy system into a digitized information network.

Tackling one of the world’s biggest challenges requires out-of-the-box thinking & diverse perspectives. We’re building a team of individuals from different backgrounds, industries, & educational experiences. If you share our passion for ushering in the era of the clean electron, we look forward to learning what you would uniquely bring to Arcadia! Visit www.arcadia.com.

HQ: Greenwood Village, Colorado

What we're looking for:

We are seeking an experienced Senior Site Reliability Engineer (L3) to join our SRE/Platform Engineering team in India. This role will focus on building, scaling, and maintaining our AWS- and Kubernetes-based platform, ensuring high reliability, cost efficiency, and secure operations across multiple environments. The successful candidate will work closely with Engineering, Security, DevOps, and Product teams to drive automation, improve infrastructure resilience, and elevate observability across mission-critical systems.

The ideal candidate is a self-starter and hands-on engineer who can dive deep into complex distributed systems, automate away manual processes, and proactively identify reliability gaps. They should have a proven track record of managing production-grade AWS infrastructure, Kubernetes clusters, CI/CD pipelines, and cloud security. They will collaborate daily with US-based engineering teams and cross-functional partners to ensure our platform remains scalable, secure, and cost-optimized as we continue to grow.

What you'll do:

  • Design, build, and maintain AWS infrastructure (EKS, VPC, RDS, IAM, CloudWatch, CloudTrail, GuardDuty, Load Balancers, S3, CloudFront) using Terraform and CloudFormation

  • Lead all aspects of Kubernetes operations including cluster upgrades, performance tuning, CNI troubleshooting, workload scaling, Helm chart packaging, and GitOps deployments

  • Own and evolve our CI/CD ecosystem across Jenkins (Groovy scripting), GitHub Actions, AWS CodePipeline, ArgoCD, and FluxCD

  • Improve platform reliability by reducing operational toil through automation, scripting (Python/Bash), and proactive system hardening

  • Implement and enhance observability across Prometheus, Grafana, Loki, Tempo, Datadog, and CloudWatch, ensuring actionable alerting, dashboards, and metrics aligned with SLOs/SLIs

  • Drive FinOps initiatives, identifying cost inefficiencies and working with engineering teams to implement best practices, tagging standards, budgeting, and resource right-sizing

  • Manage database operations across MySQL and PostgreSQL including backups, performance tuning, replication, and operational runbooks

  • Maintain and improve secret management using Vault, AWS Secrets Manager, and Parameter Store

  • Strengthen cloud security posture with IAM least privilege, CSPM reviews, audit readiness, GuardDuty/CloudTrail monitoring, and environment hardening

  • Troubleshoot complex production issues across networking, Kubernetes, compute, databases, and CI/CD systems

  • Collaborate daily with US-based teams for incident reviews, migrations, roadmap work, and platform enhancements

  • Contribute to the development and adoption of AI-enabled tooling (e.g., automation, debugging assistants, MCP, RAG pipelines); good to have, not mandatory

  • Document runbooks, architecture diagrams, SOPs, troubleshooting guides, and operational best practices

  • Participate in on-call rotations (if required) and drive post-incident analysis and long-term fixes

What will help you succeed:

Must-haves:

  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience

  • 8–10+ years of experience in SRE/DevOps/Cloud Engineering, with deep hands-on exposure to AWS and Kubernetes

  • Strong hands-on experience with:

    • Terraform & Infrastructure as Code

    • AWS core services (EKS, IAM, RDS, EC2, VPC, CloudWatch, CloudTrail, GuardDuty)

    • Jenkins + Groovy, GitHub Actions, ArgoCD, FluxCD

    • Kubernetes troubleshooting and operations

    • Prometheus/Grafana/Datadog observability stacks

  • Proven ability to operate in high-scale, high-uptime, multi-environment production systems

  • Experience building automation via Python/Bash and reducing operational toil

  • Strong understanding of incident management, root cause analysis, and reliability engineering principles

  • Experience working with globally distributed teams across multiple time zones

  • Excellent communication skills (must interact with US teams daily)

  • Ability to work independently with minimal supervision, take ownership, and drive initiatives end-to-end

  • A growth mindset, strong troubleshooting ability, and comfort with complex cloud-native environments

Nice-to-haves:

  • Experience with self-hosted n8n or other workflow automation platforms

  • Exposure to LLMs, RAG, vector DBs, MCP concepts

  • Experience with cloud security/DevSecOps tools (Trivy, Inspector, OPA, Kyverno)

  • Hands-on experience with FinOps platforms and cloud cost governance

  • Certifications in related fields (AWS, Kubernetes, Terraform, etc.)

 

Benefits

  • Competitive compensation and employee stock options
  • Hybrid/remote-first working model (India-based role, with global collaboration)
  • Flexible leave policy
  • Comprehensive medical insurance (self + family members)
  • Annual performance cycle + quarterly recognition awards
  • A supportive, diverse engineering culture grounded in empathy, teamwork, and innovation

Eliminating carbon footprints, eliminating carbon copies.

Here at Arcadia, we cultivate diversity, celebrate individuality, and believe unique perspectives are key to our collective success in creating a clean energy future. Arcadia is committed to equal employment opportunities regardless of race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, protected veteran status, or any status protected by applicable federal, state, or local law. While we are currently unable to consider candidates who will require visa sponsorship, we welcome applications from all qualified candidates eligible to work in India.

Arcadia Self-Identification Questions

For government reporting purposes, we ask candidates to respond to the below self-identification survey. Whatever your decision, it will not be considered in the hiring process or thereafter. Any information that you do provide will be recorded and maintained in a confidential file.

As set forth in Arcadia’s Equal Employment Opportunity policy, we do not discriminate on the basis of any protected group status under any applicable law.

---

If you believe you belong to any of the categories of protected veterans listed below, please indicate by making the appropriate selection. As a government contractor subject to the Vietnam Era Veterans Readjustment Assistance Act (VEVRAA), we request this information in order to measure the effectiveness of the outreach and positive recruitment efforts we undertake pursuant to VEVRAA. Classification of protected categories is as follows:

A "disabled veteran" is one of the following: a veteran of the U.S. military, ground, naval or air service who is entitled to compensation (or who but for the receipt of military retired pay would be entitled to compensation) under laws administered by the Secretary of Veterans Affairs; or a person who was discharged or released from active duty because of a service-connected disability.

A "recently separated veteran" means any veteran during the three-year period beginning on the date of such veteran's discharge or release from active duty in the U.S. military, ground, naval, or air service.

An "active duty wartime or campaign badge veteran" means a veteran who served on active duty in the U.S. military, ground, naval or air service during a war, or in a campaign or expedition for which a campaign badge has been authorized under the laws administered by the Department of Defense.

An "Armed forces service medal veteran" means a veteran who, while serving on active duty in the U.S. military, ground, naval or air service, participated in a United States military operation for which an Armed Forces service medal was awarded pursuant to Executive Order 12985.

---

Voluntary Self-Identification of Disability

Why are you being asked to complete this form?

We are required to measure our progress toward having at least 7% of our workforce be individuals with disabilities. To do this, we must ask applicants and employees if they have a disability or have ever had a disability. Because a person may become disabled at any time, we ask all of our employees to update their information at least every five years.

Identifying yourself as an individual with a disability is voluntary, and we hope that you will choose to do so. Your answer will be maintained confidentially and not be seen by selecting officials or anyone else involved in making personnel decisions. Completing the form will not negatively impact you in any way, regardless of whether you have self-identified in the past. For more information about this form or the equal employment obligations of federal contractors under Section 503 of the Rehabilitation Act, visit the U.S. Department of Labor’s Office of Federal Contract Compliance Programs (OFCCP) website at www.dol.gov/ofccp.

How do you know if you have a disability?

You are considered to have a disability if you have a physical or mental impairment or medical condition that substantially limits a major life activity, or if you have a history or record of such an impairment or medical condition.

Disabilities include, but are not limited to:

  • Autism
  • Autoimmune disorder, for example, lupus, fibromyalgia, rheumatoid arthritis, or HIV/AIDS
  • Blind or low vision
  • Cancer
  • Cardiovascular or heart disease
  • Celiac disease
  • Cerebral palsy
  • Deaf or hard of hearing
  • Depression or anxiety
  • Diabetes
  • Epilepsy
  • Gastrointestinal disorders, for example, Crohn's disease or irritable bowel syndrome
  • Intellectual disability
  • Missing limbs or partially missing limbs
  • Nervous system condition, for example, migraine headaches, Parkinson’s disease, or multiple sclerosis (MS)
  • Psychiatric condition, for example, bipolar disorder, schizophrenia, PTSD, or major depression