Data Operations Tech Lead
Machinify is a leading healthcare intelligence company with expertise across the payment continuum, delivering unmatched value, transparency, and efficiency to health plan clients across the country. Deployed by over 85 health plans, including many of the top 20, and representing more than 270 million lives, Machinify brings together a fully configurable and content-rich, AI-powered platform along with best-in-class expertise. We’re constantly reimagining what’s possible in our industry, creating disruptively simple, powerfully clear ways to maximize financial outcomes and drive down healthcare costs.
We’re seeking a skilled Data Operations Tech Lead to join our team!
We are a data-driven organization building and operating a growing analytics and data platform that supports critical business decisions. Our Data Operations team plays a key role in ensuring data pipelines are reliable, observable, and scalable as we evolve from legacy, SQL-centric systems to a modern cloud-based data stack. We value operational excellence, continuous improvement, and collaboration across engineering, platform, and product teams.
We are seeking a Data Operations Tech Lead to help lead the evolution of our Data Operations team as it expands from supporting legacy SQL-based pipelines to also operating and supporting modern data pipelines built on Airflow and Spark.
This role is ideal for a technically strong, operationally minded leader who has experience running data pipelines at scale and enjoys building reliability, observability, and automation. You will serve as both a hands-on technical leader and a mentor, helping a team with strong SQL experience successfully transition to a modern data stack.
You will play a key role in defining how we monitor, triage, and resolve Tier 1 / Tier 2 data pipeline issues, while continuously improving the stability and operability of our platform.
Does this sound like the right opportunity to explore? Let's connect!
What You’ll Do
Technical Leadership & Operations
- Act as the technical lead for Data Operations, owning the operational readiness of modern data pipelines built on Airflow, Spark, and cloud data infrastructure
- Lead incident triage and resolution for Tier 1 and Tier 2 pipeline issues (data delays, job failures, data quality alerts, SLA breaches)
- Establish clear runbooks, escalation paths, and operational best practices for pipeline support
- Partner with Data Engineering to influence pipeline design with operability, observability, and supportability in mind
Monitoring, Reliability & Automation
- Design and improve monitoring, alerting, and dashboards for data pipelines and workflows
- Implement automation to reduce manual intervention (auto-retries, self-healing workflows, standardized alerts)
- Drive root cause analysis (RCA) and post-incident reviews to prevent recurring issues
- Continuously improve pipeline reliability, performance, and cost efficiency
Team Enablement & Mentorship
- Mentor and upskill team members transitioning from SQL Server–centric workflows to Airflow, Spark, ELK stack, and distributed data systems
- Create learning paths, documentation, and hands-on guidance for modern data tooling
- Lead by example with hands-on troubleshooting, debugging, and operational support
- Help establish a culture of ownership, quality, and continuous improvement within Data Operations
Cross-Functional Collaboration
- Work closely with Data Engineering, Platform, and Product teams to align on priorities and operational expectations
- Serve as a bridge between legacy data systems and the modern data platform during the transition period
- Provide feedback on operational gaps, tooling needs, and process improvements
What You Bring
- 8+ years of experience in data engineering, data platform operations, or data reliability roles
- Hands-on experience operating data pipelines built on Airflow (or similar orchestrators) and Spark
- Strong understanding of distributed data systems, batch processing, and failure modes at scale
- Solid SQL skills and experience working with relational databases (e.g., SQL Server, Postgres)
- Proven experience supporting production data pipelines with SLAs and on-call responsibilities
- Experience with one of the major cloud platforms (AWS, GCP, or Azure)
Operational & Automation Mindset
- Experience building monitoring, alerting, and incident response processes for data systems
- Strong troubleshooting skills across orchestration, compute, and data layers
- Passion for automation and reducing toil through tooling and process improvements
- Ability to prioritize operational stability while enabling team velocity
Leadership & Communication
- Experience leading or mentoring engineers in an operational or support-focused environment
- Ability to explain complex distributed system issues in clear, practical terms
- Comfortable working with teams of varying technical backgrounds
- Strong documentation and knowledge-sharing habits
- Familiarity with data quality frameworks, lineage, or observability tools
- Experience in healthcare, fintech, or other regulated environments
- Exposure to data reliability engineering (DRE) or SRE practices applied to data platforms
What We Offer
- Work from anywhere in the US! Machinify is digital-first.
- Full Medical/Dental/Vision for employees & their families
- Flexible and trusting environment where you’ll feel empowered to do your best work
- Unlimited FTO
- Competitive salary, equity, 401(k) including employer match
The salary for this position is based on an array of factors unique to each candidate, such as years and depth of experience, skill set, and certifications. The base salary range for this role is $200k-$250k. We are hiring for different levels, and our Recruiting team will let you know if you qualify for a different role/range. Salary is one component of the total compensation package, which includes meaningful equity, excellent healthcare, flexible time off, and other benefits and perks.