
Senior Scientific Data Engineer, Data Platform

London, England; Oxford, England

Your work will change lives. Including your own. 

Recursion is decoding biology to industrialize drug discovery. We are looking for a Senior Scientific Data Engineer. As part of a team, you will own a suite of business-critical data products, including our Structure-Activity Relationship data mart.

This is a high-impact role requiring a strong synthesis of robust software engineering capabilities and deep drug discovery domain expertise. You will take ownership of the data architecture responsible for ingesting, standardizing, and serving both public and proprietary datasets. These systems directly power our competitor intelligence, chemical tractability assessments, and compound design models.

Please note: This is a specialized Data Engineering position focused strictly on data infrastructure and product ownership. While your work will directly enable our machine learning and predictive modeling efforts, the responsibilities do not encompass building or training models. This opportunity is ideally suited for engineers dedicated to architecting complex scientific data systems, rather than data scientists seeking modeling-focused roles.

The Systems You Will Own

You will join the Data Platform team and maintain an ecosystem of ~100 ingested datasets, while taking specific ownership of high-value products including:

  • Flagship SAR Data Mart: A unified bioactivity warehouse merging commercial and public (e.g., ChEMBL) databases with internal assay data.
  • Commercial Vendor Data Mart: A massive catalog of purchasable compounds used to guide our internal compound design tools and tractability assessments.
  • Biomedical Knowledge Graph: The critical data feeds and infrastructure that power our semantic graph and associated AI agents, linking targets, diseases, and compounds.
  • Chemical Synthesis Data: The foundational dataset of chemical reactions used for training retrosynthesis models and tractability prediction.
  • Patent Intelligence System: A pipeline transforming patent feeds and competitor data into actionable intelligence.
  • Compound Standardization Registry: A large-scale chemical structure warehouse ensuring consistency across billions of compounds (similar to UniChem).

What You’ll Do

  • Pipeline Ownership at Scale: Act as a key owner for our core bioactivity pipeline, processing 75M+ records and managing ~100 distinct data feeds. You will navigate intricate logic and orchestration, including maintaining 4,000+ lines of SQL with 20+ transformation steps.
  • Scientific Data Standardization: Resolve ambiguity by reconciling heterogeneous data formats from diverse commercial and public sources. You will design and implement logic to standardize chemical structures (SMILES, InChI, tautomers), biological targets (UniProt mapping, gene families, species homology), and assay data (IC50/Ki normalization, unit conversion).
  • Engineer for Distributed Compute: Optimize tasks using Python and Snowpark for heavy-lifting operations, such as large-scale text mining (extracting dose/concentration from unstructured text) and molecular property calculation.
  • Drive Data Quality: Implement rigorous data quality frameworks (DQF) to handle the nuance of biological data, ensuring our downstream models are trained on clean, semantic-aware data.
  • Cross-Functional Consulting: Interface directly with discovery scientists to understand their diverse data needs and translate complex scientific requirements into robust engineering solutions.
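
To give a flavour of the assay-data standardization mentioned above (IC50/Ki normalization, unit conversion), here is a minimal Python sketch. It is illustrative only and is not a description of Recursion's pipeline: the names `UNIT_TO_NM`, `to_nanomolar`, and `pic50` are hypothetical, and the sketch assumes potencies arrive as a numeric value plus a molar unit string.

```python
import math

# Multipliers that convert a value in the given unit to nanomolar (nM).
# Unit symbols are case-sensitive: "mM" (millimolar) differs from "M" (molar).
# This table is an assumption for illustration; real feeds carry many more units.
UNIT_TO_NM = {
    "M": 1e9,
    "mM": 1e6,
    "uM": 1e3,
    "nM": 1.0,
    "pM": 1e-3,
}


def to_nanomolar(value: float, unit: str) -> float:
    """Convert a concentration to nM, treating the micro sign and Greek mu as 'u'."""
    key = unit.strip().replace("\u00b5", "u").replace("\u03bc", "u")
    if key not in UNIT_TO_NM:
        raise ValueError(f"Unsupported concentration unit: {unit!r}")
    return value * UNIT_TO_NM[key]


def pic50(value: float, unit: str) -> float:
    """Return pIC50 = -log10(IC50 in mol/L), computed as 9 - log10(IC50 in nM)."""
    nm = to_nanomolar(value, unit)
    if nm <= 0:
        raise ValueError("IC50 must be positive to take a logarithm")
    return 9.0 - math.log10(nm)


if __name__ == "__main__":
    # Hypothetical records as they might appear across heterogeneous sources.
    for value, unit in [(1.5, "µM"), (250, "nM"), (0.02, "mM")]:
        print(f"{value} {unit} -> {to_nanomolar(value, unit):g} nM, "
              f"pIC50 = {pic50(value, unit):.2f}")
```

Rejecting unknown units instead of guessing keeps ambiguous records visible to a data quality check rather than silently corrupting downstream values.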

The Experience You’ll Need

  1. Core Engineering:
  • Advanced SQL & Warehousing: Deep expertise in modern cloud data warehousing (e.g. Snowflake, BigQuery). You should be comfortable with complex window functions, CTEs, and schema design for multi-layer environments.
  • Python & Distributed Compute: Strong proficiency in Python for data processing. Experience with data warehouses is a huge plus, but general distributed processing experience is also valuable.
  • Orchestration: Experience managing complex DAGs and asynchronous task coordination (e.g. Prefect, Argo Workflows).
  2. Domain Expertise:
  • Medicinal Chemistry Context: You understand how chemistry is represented in data (SMILES, scaffolds) and the nuance of bioactivity measurements (potency vs. efficacy, IC50 vs. pXC50).
  • Biological Context: Familiarity with gene/protein families, species homology, and target nomenclature (e.g., how similar genes appear in different species).
  • Assay Knowledge: Ability to distinguish between assay types (e.g., binding, functional), formats, and the units/measurements associated with them. Ideally familiar with ontologies (e.g., BioAssay Ontology, cell line taxonomies).
  • Data Landscape: Knowledge about public drug discovery datasets and how they can be used to support the drug discovery pipeline.
  3. Nice-to-Haves:
  • Experience with chemical toolkits (e.g. OpenEye or RDKit).
  • Experience using text mining or LLMs for structured data extraction from scientific text.
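
As a rough illustration of the rule-based end of structured data extraction from scientific text, the sketch below pulls dose and concentration mentions out of a sentence with a regular expression. It is an assumption-laden example, not Recursion's method: the pattern, the short unit list, and the function name `extract_concentrations` are all hypothetical, and real text would need far broader coverage.

```python
import re
from typing import List, Tuple

# A number followed by a molar unit (pM, nM, uM, mM, M) or a small set of
# mass-based dose units. The lookbehind stops matches from starting inside
# tokens such as "IC50". The unit list is illustrative, not exhaustive.
CONC_PATTERN = re.compile(
    r"(?<![\w.])(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[pnuµμm]?M|mg/kg|mg/mL)\b"
)


def extract_concentrations(text: str) -> List[Tuple[float, str]]:
    """Return (value, unit) pairs found in text, writing micro as 'u'."""
    results = []
    for match in CONC_PATTERN.finditer(text):
        unit = match.group("unit").replace("µ", "u").replace("μ", "u")
        results.append((float(match.group("value")), unit))
    return results


if __name__ == "__main__":
    sentence = (
        "Cells were treated with 10 µM compound for 24 h; "
        "IC50 was 25.5 nM. Mice received 5 mg/kg daily."
    )
    print(extract_concentrations(sentence))
```

A pattern like this is cheap to run at scale but brittle, which is one reason such pipelines often pair rules with LLM-based extraction and validation.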

Working Location & Compensation:

This position can be based at either our London or Milton Park office. Please note that we operate a hybrid working model and ask that employees spend 50% of their time in the office.

At Recursion, we believe that every employee should be compensated fairly. Based on the skill and level of experience required for this role, the estimated current annual base range is £75,900 - £101,900. You will also be eligible for an annual bonus and equity compensation, as well as a comprehensive benefits package.

#LI-EP1

The Values We Hope You Share:

  • We act boldly with integrity. We are unconstrained in our thinking, take calculated risks, and push boundaries, but never at the expense of ethics, science, or trust. 
  • We care deeply and engage directly. Caring means holding a deep sense of responsibility and respect - showing up, speaking honestly, and taking action.
  • We learn actively and adapt rapidly. Progress comes from doing. We experiment, test, and refine, embracing iteration over perfection.
  • We move with urgency because patients are waiting. Speed isn’t about rushing but about moving the needle every day.
  • We take ownership and accountability. Through ownership and accountability, we enable trust and autonomy—leaders take accountability for decisive action, and teams own outcomes together. 
  • We are One Recursion. True cross-functional collaboration is about trust, clarity, humility, and impact. Through sharing, we can be greater than the sum of our individual capabilities.

Our values underpin the employee experience at Recursion. They are the character and personality of the company demonstrated through how we communicate, support one another, spend our time, make decisions, and celebrate collectively.

More About Recursion

Recursion (NASDAQ: RXRX) is a clinical-stage TechBio company decoding biology to radically improve lives. Recursion is advancing a portfolio of differentiated investigational medicines across its wholly owned and partnered pipeline in oncology, rare disease, neuroscience, immunology, and other therapeutic areas with significant unmet need. Enabling its mission is the Recursion OS, an AI-native, end-to-end drug discovery and development platform integrating biology, chemistry, and clinical development into a unified intelligence system. Powered by proprietary multimodal data, purpose-built AI models, and bilingual teams fluent in both science and AI, the Recursion OS is designed to translate complex science into medicines that matter — faster, better, and at scale — for patients who are waiting.

Recursion’s platform infrastructure is anchored in Salt Lake City, Utah and Milton Park, Oxfordshire, where its automated biology and chemistry laboratories generate proprietary data at industrial scale. Recursion also maintains offices in New York, Montréal, and London, three global hubs for talent and leadership at the intersection of AI and scientific innovation. Learn more at www.recursion.com, or connect on X and LinkedIn.

Recursion is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other characteristic protected under applicable federal, state, local, or provincial human rights legislation.

Accommodations are available on request for candidates taking part in all aspects of the selection process.


Recruitment & Staffing Agencies: Recursion Pharmaceuticals and its affiliate companies do not accept resumes from any source other than candidates. The submission of resumes by recruitment or staffing agencies to Recursion or its employees is strictly prohibited unless contacted directly by Recursion’s internal Talent Acquisition team. Any resume submitted by an agency in the absence of a signed agreement will automatically become the property of Recursion, and Recursion will not owe any referral or other fees. Our team will communicate directly with candidates who are not represented by an agent or intermediary unless otherwise agreed to prior to interviewing for the job.


When you apply to a job on this site, the personal data contained in your application will be collected by Recursion Pharmaceuticals, Inc., or its affiliates (“Controller”), which is located at 41 S. Rio Grande Street, Salt Lake City, UT 84101 and can be contacted by emailing infor@recursion.com. Controller’s representative for purposes of data protection is VeraSafe LLC, 100 M Street S.E., Suite 600, Washington, D.C. 20003 USA, who can be contacted at experts@verasafe.com. Your personal data will be processed for the purposes of managing Controller’s recruitment related activities, which include setting up and conducting interviews and tests for applicants, evaluating and assessing the results thereof, and as is otherwise needed in the recruitment and hiring processes. Such processing is legally permissible under Art. 6(1)(f) of Regulation (EU) 2016/679 (General Data Protection Regulation) and Art. 6(1)(f) of the UK GDPR as necessary for the purposes of the legitimate interests pursued by the Controller, which are the solicitation, evaluation, and selection of applicants for employment.

Your personal data will be shared with Greenhouse Software, Inc., a cloud services provider located in the United States of America and engaged by Controller to help manage its recruitment and hiring process on Controller’s behalf. Accordingly, if you are located outside of the United States, your personal data will be transferred to the United States once you submit it through this site. Because the European Union Commission has determined that United States data privacy laws do not ensure an adequate level of protection for personal data collected from EU data subjects, the transfer will be subject to appropriate additional safeguards under the standard contractual clauses. You can obtain a copy of the standard contractual clauses by contacting us.

Your personal data will be retained by Controller as long as Controller determines it is necessary to evaluate your application for employment. Under the GDPR, you have the right to request access to your personal data, to request that your personal data be rectified or erased, and to request that processing of your personal data be restricted. You also have the right to data portability. In addition, you may lodge a complaint with the relevant supervisory authority.


Equal Opportunity Employment Information (Recursion)

We are committed to a high-performing workplace where everyone feels like they belong and can do the best work of their careers. We reward merit and contribution as we strive for a workplace that reflects the communities in which we operate and the patients we intend to serve. To this end, we invite you to self-identify your race/ethnicity and gender. This information will be kept confidential and will not be used to favor or discriminate against any candidate. It will not be shared with the hiring managers or otherwise considered as part of your application. Submission of this information is voluntary and refusal to provide any or all of the information requested will not subject you to any adverse treatment. 
