Site Reliability Engineer - Platform

Remote - India

Team/Role Paragraph:

Team - The Coinbase Embedded SRE Program provides teams with dedicated Site Reliability Engineering (SRE) support to level up the reliability, availability, efficiency, and scalability of their services. The team’s role is critical to ensuring that Coinbase systems can serve customers during peak traffic conditions and to maximizing availability during unexpected events.

Role - We would like to add a Site Reliability Engineer who will embed directly with product engineers to uplift Coinbase services and systems and promote a reliability culture across the company. You would support company-wide goals to scale the system by 10-50x and help teams improve their services’ availability, reliability, scalability, and operational excellence through various initiatives.

What you’ll be doing (i.e., job duties):

  • Improve observability, reliability and availability by defining and measuring key metrics
  • Build automation and improve systems to eliminate toil and operations work.
  • Collaborate with our core infrastructure team to performance-tune and optimize our cloud deployments (think Docker, Terraform, Kubernetes, EC2, etc.).
  • Collaborate with Coinbase product teams to reduce service disruptions and automate incident response
  • Proactively find and analyze reliability problems across our business units and stack, then design and implement software to create step-function improvements.
  • Educate, mentor and hold accountable the engineering team to improve the reliability of our systems and make reliability a core value of the Coinbase engineering culture.
  • Write high-quality, well-tested code to meet the needs of your customers.
  • Debug extremely difficult technical problems, and make systems and products both work better and easier to deploy, own, operate, and diagnose.
  • Review all feature designs within your product area and across the company for cross-cutting projects.
  • Be an owner of the security, safety, scale, operational integrity, and architectural clarity of these designs.
  • Build pipelines to integrate with third-party vendors.

What we look for in you (i.e., job requirements):

  • You have 2 to 5 years of experience in software engineering.
  • You’ve designed, built, scaled, and maintained production services, and know how to compose a service-oriented architecture.
  • You write high-quality, well-tested code to meet the needs of your customers.
  • You’re passionate about building an open financial system that brings the world together.
  • You possess strong technical skills for system design and coding.
  • Excellent written and verbal communication skills, and a bias toward open, transparent cultural practices.
  • Strong skills around observability, debugging and performance tuning
  • Strong communication skills and ability to explain technical concepts clearly and simply
  • Strong interpersonal skills working with engineers from junior to principal levels
  • Demonstrated critical thinking under pressure
  • A willingness to dive into understanding, debugging, and improving any layer of the stack

Nice to haves:

  • Experience building reliable systems capable of handling high throughput and low latency
  • Experience with observability and monitoring systems such as Kibana, Datadog, etc.
  • Familiarity with working in rapid growth environments
  • Experience in Ruby, Go, and Terraform
  • Experience with AWS, GCP, Azure, or another cloud environment
  • Experience designing and building reliable systems
  • Experience working in a highly regulated environment
  • Experience writing company-facing blog posts and training materials

Job #: GPSRE04IN
