
Senior Site Reliability Engineer (SRE)

United States or Canada

Finite State partners with product security teams, the guardians of our connected world, to create transparency for their connected devices and supply chains. Our platform handles connected devices and embedded systems across all industries, including those found in enterprises, healthcare, utilities, connected vehicles, manufacturing facilities, critical infrastructure, and government entities. 

We are a fast-growing Series B company with a fully distributed workforce. Led by seasoned experts, we are a mission-driven team passionate about arming our customers with the actionable insights, critical vulnerability data, and remediation guidance necessary to mitigate product risk and protect the connected attack surface. We are committed to a remote-first culture.

About the Role

We are seeking a Senior Site Reliability Engineer (SRE) / Infrastructure Engineering leader to define, architect, and drive a modern observability and reliability strategy for an AI-first development organization. This is a highly impactful technical leadership role responsible for establishing best-in-class operational practices, reliability standards, and AI-enabled infrastructure automation across our product ecosystem.

This individual will bring deep experience in reliability engineering, distributed systems, and production operations—along with a forward-thinking mindset around AI-assisted development and infrastructure-as-code.

If you are passionate about building resilient systems, defining SLOs that actually matter, and leveraging AI tooling to accelerate operational excellence, this role is for you.

What You’ll Do

Observability & Reliability Leadership

Leverage AI tools and agentic processes to drive observability, quality, responsiveness, and operational clarity.

  • Design modern telemetry pipelines (metrics, logs, traces, events) for distributed systems and AI-driven workloads.
  • Define and implement a comprehensive observability framework across applications and infrastructure.
  • Establish and operationalize meaningful SLIs, SLOs, and SLAs aligned with business objectives.
  • Lead the adoption and optimization of observability tooling including Honeycomb, Grafana, and related telemetry platforms.
  • Drive best practices in error budgeting, alert design, and production health monitoring.

Operational Excellence

  • Define and evolve incident management processes, including:
    • On-call structures and escalation models
    • Postmortems and blameless retrospectives
    • Runbooks and operational playbooks
  • Improve system reliability, performance, scalability, and cost efficiency.
  • Establish operational KPIs and reliability dashboards for engineering and leadership visibility.
  • Lead reliability reviews for new architecture and product initiatives.

Infrastructure Engineering

  • Architect and implement scalable cloud infrastructure primarily within AWS.
  • Work closely with modern application platforms such as Vercel and Supabase.
  • Implement and improve Infrastructure-as-Code practices.
  • Leverage AI-assisted tooling to accelerate infrastructure design, validation, and automation.
  • Ensure production-grade security, compliance, and resilience standards.

AI-First Enablement

  • Champion the use of AI tools to:
    • Accelerate infrastructure provisioning
    • Improve operational workflows
    • Enhance observability signal quality
    • Automate incident response and remediation
  • Partner with AI-focused product teams to ensure observability supports model performance, experimentation, and reliability.

Technical Leadership

  • Serve as a senior technical authority for reliability and infrastructure decisions.
  • Mentor engineers on production best practices.
  • Influence architectural decisions to improve system resilience and maintainability.
  • Drive a culture of reliability, accountability, and continuous improvement.

What You Bring

Experience

  • 10+ years of experience in Site Reliability Engineering, Infrastructure Engineering, or Production Engineering.
  • Proven experience defining and implementing SLOs, SLAs, SLIs, and error budget frameworks at scale.
  • Deep experience building and managing on-call rotations and incident management processes.
  • Strong background in distributed systems and cloud-native architectures.

Technical Expertise

  • Hands-on experience with:
    • Honeycomb
    • Grafana
    • AWS
    • Vercel
    • Supabase
  • Strong experience with observability instrumentation and telemetry design.
  • Infrastructure-as-Code experience (e.g., Terraform, Pulumi, or similar).
  • Experience designing resilient CI/CD pipelines.
  • Deep understanding of high-availability, scalability, and performance engineering principles.

AI & Automation

  • Demonstrated experience leveraging AI tools (Cursor, Claude, Codex, etc.) in development or infrastructure workflows.
  • Experience using AI-assisted tooling to generate, validate, or optimize infrastructure configurations.
  • Strong interest in building AI-native operational practices.

Leadership & Communication

  • Ability to operate as both strategic architect and hands-on implementer.
  • Strong written and verbal communication skills.
  • Experience influencing cross-functional teams.
  • Comfort working in fast-paced, high-growth environments.

Nice to Have

  • Experience supporting AI/ML workloads in production.
  • Experience building internal developer platforms (IDP).
  • Experience with cost observability and FinOps practices.
  • Experience scaling observability in high-growth SaaS environments.

What Success Looks Like in the First 6 Months

  • Clear SLO framework implemented across core services.
  • Observability tooling standardized and adopted organization-wide.
  • On-call and incident management processes running smoothly with measurable improvements.
  • AI-driven infrastructure workflows reducing operational toil.
  • Increased system reliability and reduced mean time to detection (MTTD) and recovery (MTTR).

Compensation

Our salary ranges are categorized into two tiers based on geographic location:
  • Tier 1 (San Francisco, New York, Seattle): $230,000 - $250,000
  • Tier 2 (All Other Locations): $215,000 - $240,000
The final base salary will be determined by experience, skill set, and specific location. In addition to base pay, this role is eligible for equity and benefits.

About Finite State

At Finite State, we're on a mission to secure the connected world. Our platform empowers product security teams to detect vulnerabilities, manage software supply chain risks, and ensure compliance across complex device ecosystems. From IoT to critical infrastructure, we provide unparalleled visibility into firmware and software components, helping organizations protect their products and customers.

We move with urgency and intent — we’re transparent, own outcomes, put customers first, speak up, and learn fast — turning evidence into action. CLARITY is how we move fast without breaking trust.

  • C - Customer first - Learn from customers. Ship with urgency.
  • L - Leverage - Outsource the routine. Own the result.
  • A - Agency - We take responsibility—end to end.
  • R - Results - Ship value. Improve fast.
  • I - Integrity - Speak up. Experiment boldly. Be kind.
  • T - Transparency - Clear context. Faster decisions.
  • Y - "Why" - Our mission, securing the connected products humanity depends on, is the reason Finite State exists. CLARITY is how we make that mission real, every day, at speed.

Bold Innovation – We push boundaries, explore new ideas, and take initiative to solve complex problems.

The Finite State platform brings visibility and control to the supply chains that create connected devices and embedded systems, all in a simple-to-use platform and at the scale manufacturers need to keep device production on time and on budget. After unpacking and analyzing every file, configuration, and setting in a firmware build, the platform generates a complete bill of materials for software components, identifies known and 0-day vulnerabilities, shows a contextual risk score, and provides actionable insights that product teams can use to secure their software.

We are proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. Finite State is committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities.
