Job Application for Data Engineer at Scribe Therapeutics

Scribe Therapeutics is a molecular engineering company focused on creating best-in-class in vivo therapies that permanently treat the underlying cause of disease. Founded by CRISPR inventors and leading molecular engineers Benjamin Oakes, Brett Staahl, David Savage, and Jennifer Doudna, Scribe is overcoming the limitations of current genome editing technologies by developing custom engineered enzymes and delivery modalities as part of a proprietary, evergreen CRISPR by Design™ platform for CRISPR-based genetic medicine.

We are seeking a highly creative, passionate, and motivated individual to join us in our quest to develop the next generation of CRISPR-based therapeutics. The current role is for a Data Engineer to join our team and advance our platform. The candidate should have a passion for working collaboratively with biologists and bioengineers to build the tools needed at the new frontier of CRISPR-based therapeutics. Additionally, the candidate will bring expertise in data engineering to Scribe’s fast-growing teams, contributing to the development of best practices for data solutions and playing a key role in data-driven gene editing innovation.

The candidate will have numerous opportunities for professional growth in a rapidly growing biotechnology start-up, including growing into a leadership role with increasing responsibilities and publishing highly impactful work in peer-reviewed journals.

Location: This role is in Alameda, CA and will be onsite 3-4 days/week.

Key Responsibilities:

Design, construct, and maintain data systems and pipelines for efficient and reliable data ingestion, processing, and storage.
Develop and implement data integration and ETL (Extract, Transform, Load) processes to ensure data quality, accuracy, and consistency.
Build scalable and optimized databases, data warehouses, and data lakes to support the organization's data needs.
Implement and manage data governance practices, including data security, privacy, and compliance.
Automate data workflows, data validation, and data quality checks to ensure data accuracy and reliability.
Collaborate with data scientists and bioinformatics scientists to maintain code libraries and pipelines that implement and support Machine Learning (ML) workflows while integrating them with internal storage and metadata-management systems.
Build and maintain an MLOps system supporting data featurization and ML model training, deployment, inference, and monitoring.
Collaborate effectively with members of a fully integrated team to execute projects within established timelines.
Foster a driven, fast-paced, dynamic, and fun environment in which to do rigorous science.

Required Skills and Background:

Experience with big data and data science tools and development, as well as database and data management best practices.
Strong proficiency in programming languages such as Python, Java, or Scala, as well as SQL for data manipulation and query optimization.
Familiarity with cloud platforms and services, such as AWS, Azure, or GCP, and their data-related offerings (e.g., Amazon Redshift, Google BigQuery).
Experience with data integration and ETL tools such as Apache Spark, Apache Airflow, or AWS Glue.
Solid understanding of database systems, both relational (e.g., PostgreSQL, MySQL) and NoSQL (e.g., MongoDB, Cassandra).
Knowledge of data modeling and database design principles to ensure efficient storage and retrieval of data.
Knowledge of data governance, data security, and compliance frameworks to ensure data integrity and privacy.
Familiarity with machine learning concepts and frameworks to support data science initiatives.
Ability to communicate and work effectively with internal software and data science teams, external IT contractors, and members of other functional teams.
At least 2 years of experience working in an academic or industry lab, with a proven track record of hands-on experience developing data engineering projects, databases, or data science tools.
Demonstrated quantitative and scientific thinking, as evidenced by a strong publication record.
Ability to work both independently and collaboratively in a fast-paced, interdisciplinary research team.

Preferred Skills and Background:

B.S. or M.S. in Computer Science, Data Science, or a related engineering field
Knowledge of agile engineering practices and development methodologies (Scrum, Kanban)
Familiarity with CRISPR technologies and therapeutic approaches
Familiarity with protein or RNA structure and engineering approaches

Salary will be commensurate with experience. We will provide an intellectually stimulating, collegial and fast-paced environment. If you are ready to engineer the future of therapeutics, then we are excited to hear from you! Visit us at www.scribetx.com.

We are committed to creating a diverse environment and are proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, disability, age, or veteran status.

At the time of posting, the base pay range for this role is $80,000-$120,000 per year. The offered pay will depend on internal equity and the candidate’s relevant skills, experience, qualifications, training, and market data. Additional incentives are provided as part of the complete package, along with comprehensive medical and other benefits.
