Research Scientist - Multimodal Large Language Models (LLMs)

The Chan Zuckerberg Biohub San Francisco (CZ Biohub SF) (https://www.czbiohub.org/sf/) is an independent nonprofit research institute that brings together three powerhouse universities - Stanford, UC Berkeley, and UC San Francisco - into a single collaborative technology and discovery engine. CZ Biohub SF itself supports some of the brightest, boldest engineers, data scientists, and biomedical researchers to investigate the fundamental mechanisms underlying disease and develop new technologies that will lead to actionable diagnostics and effective therapies. We are guided by our values of scholarly excellence; disruptive innovation; hands-on engineering/hacking/building; partnership and collaboration; open communication and respect; inclusiveness; and opportunity for all.

Our Vision

  • We pursue large scientific challenges that cannot be pursued in conventional environments
  • We enable individual investigators to pursue their riskiest and most innovative ideas
  • The technologies developed at CZ Biohub San Francisco facilitate research by scientists and clinicians at our home institutions and beyond

Diversity of thought, ideas, and perspectives is at the heart of the CZ Biohub Network and enables disruptive innovation and scholarly excellence. We are committed to cultivating an inclusive organization where all colleagues feel inspired and know their work makes an important contribution.

The Opportunity

The Chan Zuckerberg Biohub San Francisco (CZ Biohub SF) is seeking a highly skilled and motivated Research Scientist to lead the development of state-of-the-art multimodal large language model (LLM) agents that will enable breakthrough research and discoveries in biology. We are interested in applying these ideas to zebrafish, a powerful model organism, to understand mechanisms of infection and immunity, organ regeneration, and organismal development. The ideal candidate will have established expertise in machine learning, generative AI, development of multimodal LLMs, and reinforcement learning for model tuning. The successful candidate will report directly to both Yasin Şenbabaoğlu (Director of Computational Biology) and Loïc A. Royer (Director of Imaging AI) at CZ Biohub SF.

You will

  • Design, develop, and implement multimodal LLMs that integrate textual, multi-omic, and image data
  • Lead the research and development of novel algorithms to process and align scientific literature with biological datasets for downstream analysis
  • Collaborate closely with computational biologists and experimental scientists to understand domain-specific challenges and optimize model performance
  • Manage large-scale datasets (scientific texts, omics data, and imaging) and build efficient data pipelines for training and evaluation
  • Drive innovative research to publish in top-tier AI and computational biology journals and present at conferences
  • Mentor junior scientists and engineers, fostering a culture of collaboration and continuous learning

You have

Required -

  • PhD in Computer Science, Machine Learning, Computational Biology, Bioinformatics, or a related field; or a Master's degree with equivalent experience
  • 3+ years of experience with Python and relevant deep learning libraries (e.g., PyTorch, TensorFlow)
  • 3+ years of experience in designing and deploying large-scale language models or multimodal AI systems
  • Expertise in natural language processing (NLP), deep learning, and model training techniques
  • Proven track record of impactful publications and conference presentations in relevant areas
  • Excellent problem-solving skills and ability to work in an interdisciplinary environment
  • Strong professional judgment and problem-solving abilities that adapt to a variety of situations
  • Strong interpersonal skills with excellent written and verbal communication skills

Nice to have -

  • Experience in integrating and aligning heterogeneous data sources (text, omics, images) for AI-driven applications
  • Familiarity with scientific literature databases (e.g., PubMed, arXiv) and bioinformatics tools
  • Experience with high-performance computing (HPC) environments and cloud-based AI platforms
  • Strong leadership and project management skills

The Chan Zuckerberg Biohub Network requires all employees, regardless of work location or type of role, to provide proof of their initial COVID-19 vaccination by their start date. Those who are unable to get vaccinated because of a disability, or who choose not to be vaccinated due to a sincerely held religious belief, practice, or observance must have an approved exception prior to their start date.

Compensation

  • $150,000 - $205,700

What We Provide

  • Resources to disrupt and innovate at the frontiers of our knowledge of biology and disease
  • A collegial and collaborative environment consisting of diverse expertise
  • Existing collaborations within CZ Biohub: Technology Platforms (Bioengineering, Computational Microscopy, Data Science, Genomic Sequencing, Mass Spectrometry/Proteomics), and Research Group Leaders
  • Access to collaborators, resources and facilities at our three partner universities (Stanford, UC Berkeley, and UC San Francisco) and at partner organizations in the Bay Area and beyond
  • Competitive compensation and benefits commensurate with experience

Benefits

We offer a robust benefits program that enables the important work Biohubbers do every day. Our benefits include healthcare coverage, life and disability insurance, commuter subsidies, family planning services with fertility care, a childcare stipend, a 401(k) match, flexible time off, and a generous parental leave policy. In addition, we honor our commitment to career development and our value of scholarly excellence through regular onsite opportunities to learn from the world's leading scientists.

The CZ Biohub Network is an equal opportunity employer committed to diversity of thought, ideas and perspectives. We are committed to cultivating an inclusive organization where all Biohubbers feel inspired and know their work makes an important contribution. Therefore, we provide employment opportunities without regard to age, race, color, ancestry, national origin, religion, disability, sex, gender identity or expression, sexual orientation, or any other protected status in accordance with applicable law.

Pursuant to the California Fair Chance Act, we will consider for employment qualified applicants with arrest and conviction records.

Headhunters and recruitment agencies may not submit resumes/CVs through this website or directly to managers. The CZ Biohub Network does not accept unsolicited headhunter and agency resumes. The CZ Biohub Network will not pay fees to any third-party agency or company that does not have a signed agreement with the CZ Biohub Network.

 

Voluntary Self-Identification

For government reporting purposes, we ask candidates to respond to the below self-identification survey. Completion of the form is entirely voluntary. Whatever your decision, it will not be considered in the hiring process or thereafter. Any information that you do provide will be recorded and maintained in a confidential file.

As set forth in Chan Zuckerberg Biohub - San Francisco’s Equal Employment Opportunity policy, we do not discriminate on the basis of any protected group status under any applicable law.

If you believe you belong to any of the categories of protected veterans listed below, please indicate by making the appropriate selection. As a government contractor subject to the Vietnam Era Veterans Readjustment Assistance Act (VEVRAA), we request this information in order to measure the effectiveness of the outreach and positive recruitment efforts we undertake pursuant to VEVRAA. Classification of protected categories is as follows:

A "disabled veteran" is one of the following: a veteran of the U.S. military, ground, naval or air service who is entitled to compensation (or who but for the receipt of military retired pay would be entitled to compensation) under laws administered by the Secretary of Veterans Affairs; or a person who was discharged or released from active duty because of a service-connected disability.

A "recently separated veteran" means any veteran during the three-year period beginning on the date of such veteran's discharge or release from active duty in the U.S. military, ground, naval, or air service.

An "active duty wartime or campaign badge veteran" means a veteran who served on active duty in the U.S. military, ground, naval or air service during a war, or in a campaign or expedition for which a campaign badge has been authorized under the laws administered by the Department of Defense.

An "Armed forces service medal veteran" means a veteran who, while serving on active duty in the U.S. military, ground, naval or air service, participated in a United States military operation for which an Armed Forces service medal was awarded pursuant to Executive Order 12985.

Voluntary Self-Identification of Disability

Form CC-305
Page 1 of 1
OMB Control Number 1250-0005
Expires 04/30/2026

Why are you being asked to complete this form?

We are a federal contractor or subcontractor. The law requires us to provide equal employment opportunity to qualified people with disabilities. We have a goal of having at least 7% of our workers as people with disabilities. The law says we must measure our progress towards this goal. To do this, we must ask applicants and employees if they have a disability or have ever had one. People can become disabled, so we need to ask this question at least every five years.

Completing this form is voluntary, and we hope that you will choose to do so. Your answer is confidential. No one who makes hiring decisions will see it. Your decision to complete the form and your answer will not harm you in any way. If you want to learn more about the law or this form, visit the U.S. Department of Labor’s Office of Federal Contract Compliance Programs (OFCCP) website at www.dol.gov/ofccp.

How do you know if you have a disability?

A disability is a condition that substantially limits one or more of your “major life activities.” If you have or have ever had such a condition, you are a person with a disability. Disabilities include, but are not limited to:

  • Alcohol or other substance use disorder (not currently using drugs illegally)
  • Autoimmune disorder, for example, lupus, fibromyalgia, rheumatoid arthritis, HIV/AIDS
  • Blind or low vision
  • Cancer (past or present)
  • Cardiovascular or heart disease
  • Celiac disease
  • Cerebral palsy
  • Deaf or serious difficulty hearing
  • Diabetes
  • Disfigurement, for example, disfigurement caused by burns, wounds, accidents, or congenital disorders
  • Epilepsy or other seizure disorder
  • Gastrointestinal disorders, for example, Crohn's Disease, irritable bowel syndrome
  • Intellectual or developmental disability
  • Mental health conditions, for example, depression, bipolar disorder, anxiety disorder, schizophrenia, PTSD
  • Missing limbs or partially missing limbs
  • Mobility impairment, benefiting from the use of a wheelchair, scooter, walker, leg brace(s) and/or other supports
  • Nervous system condition, for example, migraine headaches, Parkinson’s disease, multiple sclerosis (MS)
  • Neurodivergence, for example, attention-deficit/hyperactivity disorder (ADHD), autism spectrum disorder, dyslexia, dyspraxia, other learning disabilities
  • Partial or complete paralysis (any cause)
  • Pulmonary or respiratory conditions, for example, tuberculosis, asthma, emphysema
  • Short stature (dwarfism)
  • Traumatic brain injury

PUBLIC BURDEN STATEMENT: According to the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless such collection displays a valid OMB control number. This survey should take about 5 minutes to complete.