Site Reliability Engineering (SRE) Tech Lead
Role Overview
As the SRE Tech Lead at Obsidian, you will define and build the reliability foundation for a complex, multi-tenant SaaS platform serving enterprise and financial customers. You will operate as a peer to the DevOps and Platform Engineering leads, driving a unified reliability strategy across the organization.
Your core mandate: ensure Obsidian detects every system failure before customers do—and communicates proactively when issues arise.
This is a hands-on technical leadership role with high ownership and visibility, reporting directly to the CTO. You will architect and implement systems that handle real-world complexity: upstream SaaS dependencies, sparse and noisy data, and mission-critical enterprise workloads.
Key Responsibilities
- Map and instrument critical system paths for top-tier enterprise customers
- Build connector health models to classify issues:
  - Internal defects (“our bug”)
  - Upstream SaaS outages
  - Expected sparse/low-signal scenarios
- Establish tiered incident communication:
  - Public status page for all customers
  - Direct outreach for high-priority accounts
- Define and begin rollout of SLI/SLO standards across microservices
- Develop self-service instrumentation tooling enabling engineering teams to own observability
- Implement baseline-aware anomaly detection across all connectors (beyond static thresholds)
- Mature incident response processes, including:
  - Structured post-mortems
  - Continuous reliability improvements
Required Qualifications
- 7+ years in SRE, production engineering, or similar roles
- 2+ years operating as a technical lead
- Deep expertise with:
  - AWS and/or GCP
  - Kubernetes, Helm
  - Observability stack (Prometheus, Grafana)
  - CI/CD systems (GitLab CI/CD, ArgoCD)
- Proven experience building monitoring for multi-tenant SaaS systems with complex data pipelines
- Strong debugging skills across distributed microservices and legacy systems
- Hands-on engineering mindset — able to instrument services directly, not just configure tooling
- Track record of building or significantly improving incident detection and response systems
Preferred Qualifications
- Experience in B2B SaaS serving enterprise or financial customers
- Familiarity with third-party SaaS connector ingestion patterns
- Experience building anomaly detection systems or baseline-aware alerting
- Experience implementing customer-facing status pages and incident communication frameworks
Why This Role
- Direct impact: Work closely with the CTO and shape company-wide reliability strategy
- Greenfield opportunity: Build a detection and reliability platform from the ground up
- Technically challenging: Solve for multi-tenant systems with upstream dependencies and sparse data
- High stakes: Protect systems relied upon by major financial institutions
What Success Looks Like
- Obsidian consistently detects and diagnoses issues before customers are impacted
- Clear, proactive communication builds customer trust during incidents
- Engineering teams independently own observability through scalable tooling
- Reliability becomes a measurable, continuously improving capability across the platform
If you’re excited about building systems that make failure predictable—and invisible to customers—this role offers both the challenge and the ownership to do it right.
Employee Benefits
Our competitive benefits packages are designed to support our employees' well-being, both at work and at home. Our US-based employees enjoy:
- Competitive compensation with equity and 401k
- Comprehensive healthcare with dental and vision coverage
- Flexible paid time off and paid holiday time off
- 12 weeks of new parent or family leave
- Personal and professional development resources
For more details on our US benefits, or for information on our international benefits, please see here.
Pay Transparency
Please note that the base pay range is a guideline. For candidates who receive an offer, base pay will vary based on factors such as work location and the candidate's knowledge, skills, and experience. In addition to a competitive base salary, this position is eligible for equity awards and may be eligible for sales commission or incentive compensation based on the role or function within the company.
At Obsidian, we are proud to be an equal-opportunity employer. We value diversity and hire for talent, passion, and compassion. In compliance with federal law, all persons hired will be required to submit satisfactory proof of identity and legal authorization. If you have a need that requires accommodation, please contact accommodations@obsidiansecurity.com.
Information collected and processed as part of any job applications you choose to submit is subject to Obsidian’s Applicant Privacy Policy.
Base Salary Range
$250,000 - $280,000 USD