
Bioinformatics Machine Learning Intern
RefinedScience | United States (hybrid or remote)
At RefinedScience, our mission is to advance care by bringing together the best science, data and minds – disease by disease, patient by patient, cell by cell to discover pathways to life beyond disease.
What We Are Looking For
We are seeking a highly motivated Bioinformatics Machine Learning Intern to join our team. This internship is designed for Ph.D. candidates with experience applying machine learning, deep learning, or generative AI methods to single-cell omics data. You will contribute to active projects spanning single-cell biology, multiomics integration, and computational approaches to precision medicine and drug development.
Our Bioinformatics team plays a crucial role in integrating computational biology, large-scale data analysis, and machine learning to drive discoveries in precision medicine and drug development.
Key Activities
- Analyze single-cell and multiomics datasets to extract biological insights supporting precision medicine and drug development programs
- Apply and evaluate machine learning and deep learning approaches to single-cell data for tasks such as cell type classification, biomarker discovery, and patient stratification
- Explore and prototype generative AI and LLM-based approaches to accelerate biological data interpretation and scientific workflows
- Collaborate with scientists, clinicians, and data scientists to design and execute data-driven research projects
- Document and optimize computational workflows following reproducible research best practices
- Present findings through technical reports, visualizations, and presentations to cross-functional teams
Must Haves
- Current Ph.D. candidate in Bioinformatics, Computational Biology, Computer Science, Biostatistics, or a related quantitative field
- Single-cell omics experience: Demonstrated ability to process, analyze, and interpret single-cell data (scRNA-seq, scATAC-seq, CITE-seq, or spatial transcriptomics) using frameworks such as Scanpy/scverse, Seurat, or Bioconductor
- Machine learning expertise: Applied experience developing and evaluating ML/deep learning models on biological data, including neural network architectures (GNNs, transformers, autoencoders), model selection and benchmarking, and integration of ML approaches into analytical workflows
- Programming proficiency: Python and/or R for data analysis, statistical modeling, and visualization
- Statistical foundation: Understanding of statistical methods for biological data (hypothesis testing, differential expression, multiple testing correction, clustering)
- Strong problem-solving skills and ability to communicate complex insights effectively
Desired Qualifications
Machine Learning & AI
- Experience with deep learning frameworks (PyTorch, TensorFlow, JAX)
- Familiarity with graph neural networks, attention mechanisms, or transformer architectures applied to biological data
- Experience with ML experiment tracking and reproducibility (MLflow, Weights & Biases)
- Exposure to representation learning, variational autoencoders, or contrastive learning methods
- Familiarity with scikit-learn, XGBoost, or similar ML libraries
- Interest in or experience with LLMs, RAG systems, or agentic AI tooling
Bioinformatics
- Experience with multimodal single-cell integration (Seurat WNN, scvi-tools/MultiVI/totalVI, Muon)
- Familiarity with spatial transcriptomics analysis (Squidpy, cell2location, nf-core/spatialvi)
- Experience with cell-cell communication inference (CellChat, NicheNet, LIANA)
- Knowledge of drug-gene interaction resources (CMap/LINCS, OpenTargets, ChEMBL)
Engineering & Infrastructure
- Familiarity with Linux/Unix CLI and version control (Git/GitHub)
- Experience with containerization (Docker, Singularity) and environment management (conda, venv)
- Exposure to cloud computing platforms (GCP preferred)
- Familiarity with workflow managers (Nextflow, Snakemake)
- Adherence to best practices for conducting reproducible computational research
Duration
8–10 weeks
Why You'll Love RefinedScience
Team + Values
At RefinedScience, we seamlessly integrate top-tier clinical and biological data with expert knowledge to provide unparalleled insights. We maximize patient impact with these unique insights by optimizing clinical trial probability of success and time to actionable results. We work across biopharma and we are a trusted partner in achieving better results, faster – working together to unlock strategic advantage.
Our Values
- Act with Purpose – We believe in rigor through deliberate and thoughtful actions
- Be Curious – Curiosity is the spark that ignites innovation and growth
- Take Ownership – True ownership leads to pride and commitment in the work we do
- Invest in Relationships – Building strong connections is the foundation for effective collaboration and trust for long-term success
- Embrace Agility – We celebrate agile thinking, resilience, and adaptability
Compensation
- $34–$38 per hour