
Member of Technical Staff, Pretraining Science
Location: SF Bay Area or Tokyo, Japan
Type: Full-time
About Radical Numerics
Radical Numerics is an AI lab bringing the rigor of distributed systems, model architecture, and numerics research to the challenges of biology. We are building the infrastructure needed to unlock scaling on vast biological sequence, structure, and image datasets so that biological world models become a reality. Our team introduced hybrid architectures that unlocked million-token context windows, enabling work toward AI-designed whole genomes and real gene-editing tools.
We believe that biological foundation models will require advances not only in systems and scale, but also in the science of pretraining itself: how models learn from diverse biological data, what objectives produce useful representations, and how training recipes evolve as models and datasets grow. This role is focused on that core scientific agenda.
About the Role
As a Member of Technical Staff, Pretraining Science at Radical Numerics, you will work on the science of how biological world models learn during large-scale training. You will develop new pretraining methods, study scaling behavior, and design training recipes that improve efficiency, generalization, and downstream scientific usefulness.
This role blends research and engineering. You should be excited to move fluidly between theory and implementation: reading technical literature, proposing new hypotheses, running large-scale experiments, and writing high-performance code that turns ideas into measurable progress.
What You’ll Do
- Research and develop new pretraining methodologies. Explore how biological world models learn from multi-modal data (e.g., sequence, structure, and image data), and develop new objectives, training strategies, or architectural ideas that improve representation quality and downstream performance.
- Study scaling behavior. Investigate how training dynamics change with model size, data composition, context length, and compute budget. Use empirical results to inform scaling protocols and future research priorities.
- Design data curricula and sampling strategies. Build and refine mixtures, curricula, and sampling policies that improve learning efficiency, generalization, and robustness across biological modalities and tasks.
- Work on architecture, algorithms, and optimization. Evaluate ideas in model design, optimization, long-context learning, and training stability that make large-scale biological pretraining more effective.
- Run large-scale experiments rigorously. Design, execute, and analyze experiments with strong empirical discipline. Distinguish real effects from bugs, noise, or benchmark artifacts, and convert findings into better training recipes.
- Collaborate closely with infrastructure and data teams. Work across the stack to ensure large-scale experiments are reproducible, efficient, and instrumented well enough to support fast scientific iteration.
- Define evaluations for pretraining progress. Build and improve evaluation suites that measure representation quality, long-context behavior, transfer to downstream biological tasks, and scientific utility.
What We’re Looking For
- Strong track record in ML research or engineering, especially in large-scale model training, pretraining, representation learning, optimization, scaling laws, or related areas.
- Ability to design, run, and analyze experiments thoughtfully, with strong research judgment and empirical rigor.
- Proficiency in Python and modern deep learning tooling such as PyTorch, plus comfort debugging distributed or high-performance training systems at scale.
- Experience working in distributed or high-performance computing environments.
- Excellent written and verbal communication skills, especially the ability to explain complex technical findings clearly to engineering, research, and scientific collaborators.
- Intellectual curiosity and a bias toward experimentation, iteration, and continuous improvement.
Nice to Have
- Experience training or analyzing frontier or foundation models.
- Strong grasp of probability, statistics, optimization, and ML fundamentals.
- Familiarity with curriculum learning, data selection, active learning, or data-quality methods for large-scale training.
- Experience designing or maintaining evaluation frameworks for large models.
- Contributions to open-source ML systems, datasets, or research tooling.
- Background in applied mathematics, systems, computational biology, physics, or another strongly quantitative field.
Why Radical Numerics
- Help build the multimodal biological world models needed for rapid detection, response, and countermeasures across global health.
- Work on fundamental questions in pretraining science while staying close to real scientific applications in biology.
- Join a collaborative culture that values rigor, creativity, and cross-disciplinary partnership across AI labs, biotechs, hospital systems, and research institutes.
- Competitive compensation, comprehensive benefits, and support for continual learning.