Machine Learning Engineer, AI
Biohub is the first large-scale initiative bringing frontier AI models, massive compute, and frontier experimental capabilities under one roof. We're building a general-purpose system to accelerate scientific discovery, integrating frontier AI models, biological foundation models, and lab capabilities, with the ultimate goal of curing disease. Our technology powers scientists around the world, translating AI capabilities into tools that accelerate research everywhere.
Biohub operates one of the largest AI compute clusters dedicated to biology, spanning three frontier research institutes with some of the world's leading biologists. We're not a startup trying to find product-market fit, and we're not a pharma company optimizing a pipeline. We're building frontier AI for fundamental science, as open science, at a scale no one else is doing. This is a unique moment for scientific acceleration. The problems are among the hardest and most impactful problems you can choose to work on, and we move at a pace that meets this moment.
Our research spans:
- Frontier molecular modeling, from protein language models (e.g., ESM) to structure prediction (e.g., ESMFold) and beyond
- Scaled biological foundation models trained on some of the largest GPU clusters dedicated to science
- Imaging foundation models trained across the world's largest microscopy datasets
- Reasoning and agentic systems that connect frontier LLMs with biological foundation models
- Mechanistic interpretability of biological foundation models: extracting new biological knowledge directly from model weights
- Scientific data at unprecedented scale: AI systems to collect, curate, and learn from some of the richest biological datasets ever assembled
Join our Team!
As an ML Engineer, you'll join some of the strongest infrastructure engineers in AI, building the systems that connect everything together. The infrastructure problems you solve directly determine what science becomes possible.
What You'll Do
- Work with high-dimensional scientific data formats and contribute to backend compatibility, format evaluation, and I/O performance benchmarking at petabyte scale.
- Define and shape the engineering patterns your team and collaborating researchers will build on for years; the abstractions you write today become the foundation others depend on at scale.
- Work at the intersection of AI systems and biological discovery.
- Deploy models to production and manage artifact tracking across models and datasets.
- Design and optimize GPU-native data loading pipelines for large-scale multi-dimensional tensor workloads, including profiling and resolving hardware utilization bottlenecks across multi-backend systems.
- Work on simplification and improvement of codebase abstractions to accelerate research momentum.
- Build and maintain primitives for pre-training infrastructure that ensure the reliability and continuity of large-scale training runs.
- Help cultivate best practices in MLOps, and think about the full ML lifecycle, including data, fine-tuning, deployment, reliability and monitoring.
- Execute complex modifications to the research pipeline, such as fast data loading and distributed training.
- Handle DevOps responsibilities, focused on making all engineers and researchers more productive. This includes tasks like cluster monitoring, unit testing and integration testing of research codebase, and continuous integration.
- Collaborate with partner researchers and engineers to deploy our technology within external infrastructure.
What You'll Bring
- Hands-on experience with PyTorch, including custom training loops, distributed training, or low-level performance work.
- Familiarity with GPU-native data I/O tools and large-scale tensor formats (e.g. Zarr, HDF5, TensorStore, or similar).
- Experience with distributed computing frameworks such as Apache Spark, Dask, or Ray.
- Familiarity with containerization and orchestration tools such as Docker and Kubernetes.
- Experience building or working with AI agent frameworks is a plus.
- A track record of building systems that other engineers and researchers depend on. Not just running experiments, but shipping infrastructure that scales.
Compensation
The anticipated base pay range for this role in Redwood City, CA, and New York City, NY is $150,000 to $350,000+ annually. Compensation ranges will vary based on job-related skills, level of experience, and knowledge. Actual placement in range is based on job-related skills and experience, as evaluated throughout the interview process.
Benefits for the Whole You
We’re thankful to have an incredible team behind our work. To honor their commitment, we offer a wide range of benefits to support the people who make all we do possible.
- A generous employer match on employee 401(k) contributions to support planning for the future.
- Paid time off to volunteer at an organization of your choice.
- Funding for select family-forming benefits.
- Relocation support for employees who need assistance moving.
Please note that applying to this opportunity does not guarantee that we will be in touch with you regarding our opportunities. Our recruiting team will contact you if your experience aligns with the skills we seek for future open positions. We will keep your interest on file, contact you as opportunities arise, and send you information about the exciting work we are doing at Biohub. You can opt out at any time!
#LI-Hybrid
Apply for this job