
Senior Data Engineer
About Baselayer
Trusted by 2,200+ financial institutions, Baselayer is the intelligent business identity platform that helps verify any business, automate KYB, and monitor real-time risk. Baselayer’s B2B risk solutions and identity graph network leverage state and federal government filings and proprietary data sources to prevent fraud, accelerate onboarding, and lower credit losses.
About the Role
We are looking for a Senior Data Engineer to design, build, and operate the data infrastructure that powers Baselayer’s analytics and machine learning capabilities. You will own robust, scalable pipelines that ingest, transform, and validate structured and unstructured data from internal systems and external sources, with a strong focus on reliability, observability, and data quality.
This is a hands-on role for someone who thrives in complexity, cares deeply about correctness, and wants to work close to AI and product workflows in a regulated domain.
What You’ll Do
- Design, build, and maintain robust ETL and ELT pipelines that power analytics and machine learning use cases
- Own and improve the architecture and tooling for storing, processing, and querying large-scale datasets in cloud data platforms
- Implement orchestration and automation for data workflows using tools such as Airflow, dbt, or similar
- Build and maintain reusable data models to enable faster experimentation and reliable reporting
- Implement data quality checks, observability, and alerting to ensure integrity and reliability across environments
- Partner with Data Science, ML Engineering, Product, and Engineering to ensure reliable data delivery and feature readiness for modeling
- Optimize warehouse and query performance, scalability, and cost as data volumes grow
- Maintain clear documentation, runbooks, and operational processes for pipelines and datasets
- Partner with security and compliance stakeholders to ensure pipelines and access controls meet regulatory and internal standards
About You
You want to learn fast, take ownership, and build systems that other teams can rely on. You are not just doing this for the win. You are doing it because you have something to prove and want to be great.
You care about data integrity and reliability, you enjoy turning messy inputs into clean systems, and you are comfortable operating without a playbook. You are curious about AI and ML infrastructure and want to build the foundation that powers it.
Required Experience and Skills
- 4 to 12 years of experience in data engineering or analytics engineering
- Strong Python and SQL skills, with experience building production-grade data workflows
- Experience building and maintaining ETL or ELT pipelines and working with cloud data warehouses or analytics databases
- Familiarity with orchestration, workflow scheduling, and transformation tooling (for example Airflow, dbt, Dagster, Prefect, or similar)
- Comfort working with both structured and unstructured data and designing scalable data architectures
- Strong understanding of data quality, testing, observability, and operational best practices
- Ability to communicate clearly across technical and non-technical audiences
What Sets You Apart
- Experience working in regulated environments or with sensitive identity, risk, fraud, compliance, or financial services data
- Experience integrating external data sources and APIs, including government or registry data
- Familiarity with near-real-time or streaming data patterns
- Highly feedback-oriented with a desire for continuous improvement
- Strong bias toward ownership and building systems that scale
Work Location
- Hybrid in San Francisco, in office 3 days per week
Compensation and Benefits
- Salary range of $135,000 to $220,000
- Equity package
- Unlimited vacation
- Fully paid health insurance, dental, and vision
- 401(k) with company match
