
Senior Site Reliability Engineer - Platform

Remote - USA

The Reliability Engineering team helps realize our vision by helping Coinbase engineering teams build software that is world-class in its reliability. As a core service team, Coinbase Reliability Engineers work closely with the rest of engineering. We proactively seek out state-of-the-art best practices from the industry at large. Through education and advocacy, we seek to ensure that reliability is a core value of our engineering culture. We level up other engineers by sharing deep knowledge, performing proactive analysis, and improving processes, tools, and automation. Ultimately, Reliability Engineering succeeds when all engineering teams are able to build reliable software on their own.

Our Reliability Engineering team highly values people with intellectual curiosity and openness. We collaborate across the organization, helping our engineers think big and take risks while building a culture of diversity, positive energy and blameless truth-seeking. We encourage self-starting on high-impact projects within the context of strong support and mentorship.

The mission of the Platform Product Group engineers is to build a trusted, scalable and compliant platform to operate with speed, efficiency and quality. Our teams build and maintain the platforms critical to the existence of Coinbase. The group comprises many teams, including Product Foundations (i.e. Identity, Payment, Risk, Proofing & Regulatory, Finhub), Machine Learning, Customer Experience, and Infrastructure.

What you’ll be doing (i.e., job duties):

  • Build automation and improve systems to eliminate toil and operations work.
  • Improve observability, reliability and availability by defining and measuring key metrics.
  • Collaborate with our core infrastructure team to performance tune and optimize our cloud deployments. (Think Docker, Terraform, Kubernetes, EC2, etc.)
  • Collaborate with Coinbase product teams to reduce service disruptions and automate incident response.
  • Proactively find and analyze reliability problems across our business units and stack, then design and implement software to create step-function improvements.
  • Facilitate incident response, conduct root cause analysis and blameless retrospectives.
  • Educate, mentor and hold accountable the engineering team to improve the reliability of our systems and make reliability a core value of the Coinbase engineering culture.

What we look for in you (i.e., job requirements):

  • You have at least 5 years of software engineering experience
  • You have a strong understanding of data structures & algorithms, especially as they pertain to performance and reliability
  • You are fluent in at least one programming language such as Golang, Ruby, Python or JavaScript
  • You possess strong skills around observability, debugging and performance tuning
  • You have the ability to debug complex systems and the willingness to dive into understanding, debugging, and improving any layer of the stack
  • You have experience working with containers / container orchestration systems (Docker, ECS, EKS, etc.) and monitoring tools (DataDog, Graphite, Grafana, and Prometheus)
  • You have deep knowledge of UNIX/Linux system internals such as system calls, TCP/IP and debugging tools
  • You have strong communication skills and the ability to explain technical concepts clearly and simply
  • You have demonstrated critical thinking under pressure

Nice to haves:

  • Crypto-forward experience, including familiarity with onchain activity such as interacting with Ethereum addresses, using ENS, and engaging with dApps or blockchain-based services
  • Experience with AWS, GCP, Azure, or another cloud environment
  • Experience designing and building reliable systems capable of handling high throughput and low latency
  • Experience with observability and monitoring systems such as Kibana, Datadog, etc.
  • Experience working in a highly regulated environment
  • Exposure to both NoSQL and SQL database technologies such as DynamoDB, MongoDB, PostgreSQL, AWS Aurora
  • Familiarity with working in rapid growth environments

Job #: GPSRE05US

Pay Transparency Notice:  Depending on your work location, the target annual salary for this position can range from $[Zone 3 Pay] to $[Zone 1 Pay] + target bonus + target equity + benefits (including medical, dental, vision and 401(k)).

