Staff Site Reliability Engineer

Sunnyvale, CA

Figure is an AI robotics company developing autonomous general-purpose humanoid robots. The company's goal is to ship humanoid robots with human-level intelligence. Its robots are engineered to perform a variety of tasks in the home and commercial markets. Figure is headquartered in San Jose, CA.

We are looking for a Site Reliability Engineer to own our internal systems infrastructure. This role is responsible for setting up and managing cloud and on-prem infrastructure to deliver highly available, reliable, and automated systems.

Responsibilities:

  • Be the go-to person for mission-critical infrastructure that enables key operations such as Source Configuration Management, CI/CD systems, software distribution, supplier portals, manufacturing, and more.
  • Migrate SaaS to self-hosted solutions to enhance security and reliability.
  • Implement monitoring and alerting systems, and define incident response plans and runbooks.
  • Reduce manual workload by automating deployment and scaling.
  • Build strong relationships with stakeholders to identify infrastructure needs and establish Service Level Objectives.
  • Use a data-driven approach to demonstrate service robustness and track optimization work.
  • Partner with the security team to ensure that security remediations and updates are applied in a timely manner.

Requirements:

  • Strong experience with Linux/Unix systems administration
  • Proficiency in programming/scripting
  • Extensive experience with cloud platforms (Azure, AWS, GCP) and on-prem hardware architectures
  • Experience designing, deploying, and operating high-availability, fault-tolerant, and distributed systems
  • Mastery of infrastructure as code (Terraform, CloudFormation, Ansible…)
  • Familiarity with monitoring, logging, and alerting tools (Prometheus, Grafana, Datadog…)
  • Solid understanding of networking fundamentals (TCP/IP, DNS, HTTP, load balancers, firewalls)
  • Experience defining Service Level Objectives (SLOs), developing runbooks and incident response plans, facilitating post-mortems, and managing system assets
  • Ability to work in cross-functional teams with developers, infra, and product teams
  • Excellent verbal and written communication skills

The US base salary range for this full-time position is $175,000 to $250,000 annually.

The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended. 


