
Senior Data Engineer, Operations

Role

We are seeking a Senior Data Engineer to join our team, with a focus on resolving operational data pipeline incidents and increasing the pipeline's stability and resilience. In this role, you will play a crucial part in resolving data-related incidents and maintaining the operational stability of the data pipelines that serve critical product and business functions. You will also work with the rest of the data engineering team to implement and integrate client data into SmarterDx products, ensuring seamless transformation and validation of data according to our standards. This position offers the opportunity to work on cutting-edge ETL pipelines, enhance system resilience, and improve data processes.

You will be responsible for:

  • Working in cross-functional teams to resolve data-related incidents that occur in the data pipeline
  • Designing and executing initiatives that improve system resilience against unexpected changes
  • Testing and validating client data against SmarterDx data specifications
  • Participating in a rotation of engineers who diagnose, triage, and resolve production data issues as they occur
  • Applying industry standards and best practices around data testing, observability, and platform stability
  • Transforming client data into SmarterDx standard data models

In this role, you will spend time on:

  • 60% Understanding and resolving the root causes of data pipeline incidents
  • 30% Updating our infrastructure to make it more resilient to incidents
  • 10% Implementing additional client data pipelines

Your Qualifications

As an engineer:

  • You have 5+ years of data engineering experience, preferably in the healthcare industry involving clinical and/or billing/claims data
  • You are well versed in SQL and ETL processes; experience with DBT and AWS is a significant plus
  • You are comfortable with the essentials of data orchestration; prior experience with Airflow and Python is a plus

As a person, you:

  • Own and take full responsibility for tasks from initiation to completion
  • Demonstrate versatility and proficiency in navigating complex processes and balancing multiple priorities
  • Approach ambiguity with an open mind, experiment with potential solutions, but know when to seek assistance
  • Employ pragmatism in selecting the most straightforward and effective solution for tasks
  • Welcome and provide constructive advice, prioritizing positive outcomes for the entire team
  • Thrive in a collaborative team environment

Our Tech Stack

We have a diverse tech stack and open minds about new tools and approaches. What we have now consists of the following (plus many other parts not directly relevant to this position):

  • Databases (from most to least use): Snowflake, Postgres, Elasticsearch, DynamoDB; plus SQLite for niche use cases. CDC is predictably a big component.
  • Cloud Infrastructure: AWS CDK (Cloud Development Kit) extensively used, integrated with AWS services like ECS, Lambda, Step Functions, and State Machines.
  • Languages: Data Science and Data Engineering are done almost exclusively in Python; other apps are in TypeScript, JavaScript, and Rust, so some familiarity would be helpful.
  • Libraries: Pandas and Polars throughout our existing stack; DBT and Snowflake for our newer stack. Airflow for orchestration.
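
For a flavor of how these pieces can fit together, here is a minimal, hypothetical sketch of an Airflow DAG that validates an incoming client extract and then runs DBT models. The task names, S3 path, and dbt project location are illustrative assumptions, not a description of SmarterDx's actual pipeline.

```python
# Hypothetical sketch only: task names, the S3 path, and the dbt project
# location are assumptions for illustration, not the actual SmarterDx setup.
from datetime import datetime

from airflow.decorators import dag, task
from airflow.operators.bash import BashOperator


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def client_data_pipeline():
    @task
    def validate_client_extract() -> str:
        # Placeholder validation: check the incoming client extract against
        # data specifications (row counts, required columns, value ranges).
        return "s3://example-bucket/client-extract/"  # hypothetical location

    run_dbt_models = BashOperator(
        task_id="run_dbt_models",
        # Hypothetical dbt invocation; project/profiles dirs are assumptions.
        bash_command="dbt run --project-dir /opt/dbt --profiles-dir /opt/dbt",
    )

    validate_client_extract() >> run_dbt_models


client_data_pipeline()
```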

Compensation

  • $170K to $200K base + equity

Benefits

  • Medical/dental/vision benefits
  • 401k
  • Free One Medical membership
  • Parental leave
  • Remote first
  • Minimal bureaucracy
  • Incredible teammates! 
