
Staff Scientist
Overview
Black Canyon Consulting is seeking a Staff Scientist to work with a Principal Investigator at the National Library of Medicine, National Institutes of Health, to support the development of high-fidelity artificial intelligence models designed to decode the functional landscape of the human and mouse genomes. This effort will leverage Telomere-to-Telomere (T2T) reference assemblies to advance understanding of gene regulation, particularly within complex and repetitive genomic regions.
This position requires a unique combination of computational genomics expertise, machine learning proficiency, and scalable software engineering capabilities to support large-scale data integration and model development.
Responsibilities
- Lead the design, development, and implementation of AI-driven models for gene regulation analysis
- Architect and scale a TREDNet-based framework for cloud-native execution
- Optimize models for distributed, multi-GPU training environments
- Integrate and analyze large-scale genomic and epigenomic datasets, including:
  - ENCODE / modENCODE
  - NIH Roadmap Epigenomics
  - UCSC Genome Database
- Apply AI methodologies to functionally annotate repetitive genomic regions, including centromeres and telomeres
- Develop and maintain scalable, containerized pipelines using Docker and/or Singularity
- Implement MLOps best practices, including experiment tracking, model versioning, and reproducibility
- Deploy and manage workflows in cloud environments (AWS, GCP, or Azure)
- Collaborate with interdisciplinary teams across computational and life sciences domains
Required Qualifications
- PhD in Computer Science, Computational Biology, Bioinformatics, or a related field
- Minimum of 5 years of experience developing and deploying machine learning or deep learning models
- Strong experience with cloud platforms (AWS, GCP, or Azure)
- Proficiency in deep learning frameworks (PyTorch preferred; TensorFlow or Hugging Face acceptable)
- Deep understanding of neural network architectures (CNNs, transformers, sequence models)
- Strong programming skills in Python and experience working in Linux-based environments
- Experience with MLOps practices, including experiment tracking and model versioning
- Experience building and deploying containerized workflows (Docker and/or Singularity)
- Experience with distributed training across GPUs or multi-node environments
- Strong knowledge of genomics, gene regulation, and epigenomics
- Experience working with large-scale biological datasets (e.g., ENCODE, Roadmap Epigenomics, UCSC Genome Browser)
- Familiarity with genomics data formats (FASTA, VCF, BAM/CRAM, BED)
Preferred Qualifications
- Experience with Telomere-to-Telomere (T2T) genome assemblies
- Experience analyzing repetitive genomic regions (e.g., centromeres, telomeres)
- Background in regulatory, functional, or comparative genomics (e.g., human vs. mouse)
- Experience with hyperparameter tuning and large-scale model optimization
- Familiarity with genomic foundation models or sequence-based deep learning approaches
- Experience running ML workloads on GPU-enabled cloud or HPC environments
- Familiarity with workflow orchestration tools (e.g., Nextflow, Snakemake, Airflow)
- Experience transitioning research models into production-grade pipelines
- Familiarity with CI/CD and infrastructure-as-code tools (e.g., Terraform)
- Experience working in interdisciplinary teams
Deliverables
- Develop a containerized (Docker/Singularity) TREDNet pipeline capable of scaling across multiple GPU nodes in a cloud environment
- Produce a comprehensive functional map of the T2T reference genome, identifying regulatory motifs in previously unresolved regions
- Develop comparative models between human and mouse cell lines to identify conserved regulatory mechanisms
Benefits and Salary
We attract the best people in the business with our competitive benefits package, including medical, dental, and vision coverage; a 401(k) plan with employer contribution; paid holidays and vacation; and tuition reimbursement.
We offer a competitive salary commensurate with experience and location. The targeted range for this position is $110,000 - $140,000.
If you enjoy being part of a high-performing, professional, technology-focused organization, please apply today!