Sr. Data Engineer
Mavericks Wanted
When was the last time you achieved the impossible? If that thought feels overwhelming, you might want to pause here, but if it sparks excitement, read on.
In 2015, we pioneered a “moneyball for biotech” approach, pooling projects and promising early-stage research from academia together under one financial umbrella to reduce risk and unleash innovation. This model allows science and small teams of experts to lead the way. We build bridges to groundbreaking advancements in rare disease, and develop life-changing medicines for patients with unmet needs as fast as humanly possible.
Together we define white space, push boundaries and empower people to solve problems. If you're someone who defies convention, join us and work alongside some of the most respected minds in the industry. Together, we'll ask "why not?" and help reengineer the future of biopharma.
What You'll Do
The Data Science and Operations team is seeking a full-time Senior Data Engineer to build and maintain critical data and compute infrastructure to support our drug discovery and development efforts. This role will play a key part in building scalable data pipelines, optimizing cloud-based storage and processing, and ensuring data accessibility for data science and machine learning applications.
As part of the Computational Genomics group at BridgeBio, the Data Science and Operations team is dedicated to three key objectives:
- Discovering new opportunities for drug development through the analysis of human genetic data.
- Providing data science and bioinformatics support to core program affiliates.
- Building a data platform to facilitate data-driven decision-making in internal drug development.
To achieve these goals, the team designs, develops, maintains, and operates software tools and data processing systems, enabling the analysis of scientific and business data for insightful discoveries.
Responsibilities
- Architect & Develop Scalable Data Infrastructure: Design and implement robust, secure, and scalable data pipelines and infrastructure on AWS using EC2, S3, Athena, EKS, and other cloud-native services
- Optimize Data Processing: Leverage Apache Spark and Databricks to process large-scale datasets efficiently for analytics, reporting, and machine learning applications
- Automate Data Workflows: Build and maintain orchestration workflows using Apache Airflow to automate data pipelines
- Monitor & Optimize Performance: Continuously improve system reliability, performance, and cost efficiency through monitoring, logging, and infrastructure optimization
- Collaborate with Cross-Functional Teams: Work closely with computational biologists, experimental scientists, and colleagues in business development to provide accessible and high-quality data solutions
Where You'll Work
This is a U.S.-based remote role that will require quarterly, or as-needed, visits to our San Francisco office.
Who You Are
Minimum Education Requirement
- Master’s degree or higher in Computer Science, Data Engineering, Information Systems, or a related technical field.
Relevant Experience
- 5+ years of experience as a Data Engineer, DevOps Engineer, or in a similar role.
Skills
- Expert knowledge of AWS cloud computing, including hands-on experience with the following services:
  - EC2, S3
  - Athena
  - Elastic Kubernetes Service (EKS)
  - Elastic Container Registry
- Strong proficiency in Python, including:
  - Developing stand-alone libraries
  - Developing and deploying automated ETL pipelines
  - Knowledge of at least one testing suite
  - Performance optimization
- Hands-on experience with at least one modern data platform:
  - Databricks
  - Snowflake
- Expertise in Apache Spark:
  - Spark SQL, DataFrames, and PySpark
- Knowledge of relational databases
- Version control with Git, including:
  - Setting up and managing remote repositories, implementing proper branch management, and resolving merge conflicts locally
  - Collaborating on remote repositories: working with protected branches, submitting and resolving pull requests, and adding automated tests with GitHub Actions
Any experience with the following is a plus:
- Human genetics data.
- The drug development process and the pharmaceutical industry.
- Data visualization & dashboarding solutions such as Metabase or Plotly Dash.
- The UK Biobank or the All of Us Research Program.
Rewarding Those Who Make the Mission Possible
We have high expectations for our team members. We make sure those working hard for patients are rewarded and cared for in return.
Financial Benefits:
- Market-leading compensation
- 401(k) with 100% employer match on the first 3% & 50% on the next 2%
- Employee stock purchase program
- Pre-tax commuter benefits
- Referral program with $2,500 award for hired referrals
Health & Wellbeing:
- Comprehensive health care with 100% of premiums covered - no cost to you and your dependents
- Mental health support via Spring Health (6 therapy sessions & 6 coaching sessions)
- Hybrid work model - employees have autonomy in where and how they do their work
- Unlimited flexible paid time off - take the time that you need
- Paid parental leave - 4 months for birthing parents & 2 months for non-birthing parents
- Flex spending accounts & company-provided group term life & disability
- Subsidized lunch via Forkable on days worked from our office
Skill Development & Career Paths:
- People are part of our growth and success story - from discovery to active drug trials and FDA pipelines, there are endless opportunities for skill development and internal mobility
- We provide career pathing through regular feedback, continuous education and professional development programs via LinkedIn Learning, LifeLabs, Spring Health & BetterUp Coaching
- We celebrate strong performance with financial rewards, peer-to-peer recognition, and growth opportunities
Salary
$170,000 - $215,000 USD