About Violet Research Institute

Violet Research Institute (VRI) is building the future of personalized medicine for patients with genetic diseases. We're at the frontier of a new era in medicine — one where treatments can be designed for individual patients based on their unique biology. Recent breakthroughs in science, engineering, and regulatory pathways have made this possible, but much of this work remains nascent and distributed across disparate efforts globally. We're unifying, refining, and scaling these efforts into a cohesive platform. For each patient we serve, we deeply understand their biology, then design and manufacture a targeted treatment that can be delivered in months instead of years.

We combine the urgency and execution mindset of a startup with the mission-driven openness of a nonprofit, allowing us to collaborate broadly and move quickly on behalf of the patients we serve. We've brought together leading researchers, engineers, and organizations across omics, therapeutic design, manufacturing, clinical care, and AI to move from insight to action as quickly as possible.

VRI was founded by the family of our first patient, Violet, and is led by Michael Buckley, Siranush Babakhanova, and Steve Turner. Our team is deeply cross-disciplinary and first-principles driven. We value builders, experts, and generalists who are excited to work across domains, challenge conventional approaches, and increase access to personalized medicine.

Location: San Francisco Bay Area (or Remote, US-based preferred)

Type: Full-Time

Compensation: $175k - $225k

The Role

As our founding Bioinformatician, you will be the architect and owner of VRI’s entire genomics data foundation. This is not an analyst role embedded in someone else’s pipeline. You will design, build, and steward the systems that collect, unify, quality-control, and surface all genetic, sequencing, and assay data across the organization, replacing fragmented, ad-hoc processes with a rigorous, reproducible, and scalable data layer.

Your work begins with PacBio long-read whole-genome sequencing and multiomics integration, and grows into a platform that can onboard new patients and indications with speed and consistency. You will partner closely with computational biologists and clinical scientists to make data trustworthy and analysis-ready so they can deliver fast, accurate clinical interpretations. Your job is to make their work faster, more reliable, and fully reproducible.

You will also own the data infrastructure end to end: the integrations between systems, the data ontology and structure, and everything else needed to ensure clean data and reproducible processes and analyses. The data infrastructure you create will have a direct, near-term impact on real patients’ lives.

What You’ll Own

Genomics Data Management & Stewardship

Own the full lifecycle of VRI’s genomics data, from raw sequencer output (FASTQ, BAM/CRAM, VCF) through QC, storage, versioning, and retrieval, as the single accountable person for data integrity across all datasets
Define and enforce data standards, naming conventions, metadata schemas, and ontologies for all data types: sequences, variant calls, splicing data, and experimental assay results
Build and maintain a centralized, queryable genomics data lake that unifies heterogeneous inputs from internal labs and CRO partners (US, Israel, China) into a single, analysis-ready data model
Establish sample tracking, data lineage documentation, and versioning protocols so every result is traceable back to its source
Manage cloud storage strategy (AWS S3 or GCP) across hot, warm, and cold tiers, balancing cost, accessibility, and HIPAA-compliant security
Create and maintain an internal data catalog documenting all datasets, pipeline versions, and transformation logic so any scientist can understand what data exists and how it was produced

Pipeline Development & Data Engineering

Design and build production-grade, reusable pipelines for ingesting and processing PacBio long-read WGS data, including phased genome assembly, structural variant calling, and SNP/indel detection
Build ETL workflows that clean, normalize, and integrate diverse data modalities (sequencing reads, RNA/splicing data, and assay metadata) into unified, analysis-ready formats
Automate QC steps to surface data anomalies early; monitor data quality continuously across sequencing batches and CRO handoffs
Establish code quality standards, testing protocols, and deployment practices (version control, containerization) that will scale as the team grows
Maintain and develop internal database systems, including our proprietary VRI OS platform for experiment tracking; this covers safeguarding data integrity, keeping the systems running, and building custom tools and interfaces to support research workflows
Integrate physics-based thermodynamic models and predictive algorithms to forecast therapeutic performance and guide design decisions
Develop and apply design criteria and ranking systems to evaluate therapeutic candidates computationally before advancing to wet lab testing
Build and maintain algorithms that bridge computational predictions with experimental validation, optimizing the design-to-testing pipeline

Multiomics Integration

Integrate multi-layered genomics data (DNA, RNA-seq, long-read RNA, splicing) with proteomics, metabolomics, and mass spectrometry data (LC-MS, MS/MS) into coherent, patient-centric multiomics datasets
Query and harmonize large-scale population cohorts (UK Biobank, Mount Sinai Million, and similar) to contextualize patient findings
Partner with computational biologists and clinical scientists to surface analysis-ready datasets, enabling and supporting their interpretation work

Insight Delivery & Reporting

Build automated reporting pipelines that push structured summaries of data quality, pipeline status, and batch results to scientific stakeholders, thereby replacing manual handoffs
Develop QC dashboards to surface data quality metrics, pipeline status, and anomaly alerts in real time
Directly support IND filings by preparing relevant datasets and written reports and descriptions

Continuous Improvement

Actively monitor the bioinformatics landscape — using AI-assisted tools where applicable — to identify emerging algorithms and platforms that can sharpen VRI’s data infrastructure
Lay the foundation for future bioinformatics hires by embedding well-documented, reproducible data practices from day one

Requirements

Must Have

4+ years of hands-on bioinformatics experience in a research or biotech environment, with a strong focus on genomics data management and pipeline engineering
Proven experience owning genomics data end-to-end — not just running analyses, but building the systems and standards that make data trustworthy and reusable
Strong fluency in genomics file formats and toolchains: FASTQ, BAM/CRAM, VCF, BED; variant callers (GATK, DeepVariant, PBSV); assembly tools (hifiasm or equivalent)
Demonstrated experience with PacBio long-read WGS data and associated long-read tooling
Proficiency in Python; experience building and maintaining production-grade pipelines with workflow managers (Nextflow, Snakemake, or WDL)
Hands-on experience with cloud data infrastructure: AWS S3 or GCP, data lake design, pipeline orchestration, and HIPAA-compliant storage
Experience querying and integrating biobank-scale datasets (UK Biobank or similar)
Strong organizational skills — you naturally document your work, build systems others can use, and take ownership of data quality without being asked

Preferred

Experience with RNA-seq and long-read RNA analysis, including pre-mRNA processing and splicing characterization
Familiarity with LIMS systems (Benchling, LabVantage, or similar) and data governance / FAIR data frameworks
Familiarity with containerization tools (Docker, Singularity) and CI/CD practices
Exposure to siRNA, ASO, or other therapeutic modality-specific bioinformatics
Experience in a seed or early-stage biotech; comfort building infrastructure from scratch

Behavioral Essentials

Execute independently on loosely specified tasks; you are self-directed
Ask for help only when truly blocked, communicating clearly what is needed and what you have already tried
Thrive in early-stage, ambiguous, high-pace environments where the path is built as you walk it
Mission-driven with genuine, active care for patient impact (a daily operating principle at VRI)

AI, Tools & Operating Environment

At VRI we genuinely embrace AI at every step of the process. Claude and other AI tools are used throughout the day, across every function. Computational fluency and comfort with AI-assisted analysis and literature synthesis are expected. If you treat AI as a novelty or an occasional aid, this is not the right environment.

How We Hire

We are looking to hire immediately and are moving quickly. Our anticipated process can take as little as 5 days: Apply → Initial Recruiter Call → Hiring Manager Interview → Technical Stakeholder Interview → Executive Director Interview → Offer.

Compensation & Benefits

VRI provides competitive compensation based upon experience, qualifications, and role scope, starting at $175k. We also offer a full suite of benefits.

Senior Bioinformatician – Genomics Data Infrastructure
