
Member of Technical Staff, Post-Training
Location: SF Bay Area or Tokyo, Japan
Type: Full-time
About Radical Numerics
Radical Numerics is an AI lab bringing the rigor of distributed systems, model architecture, and numerics research to the challenges of biology. We are building the infrastructure needed to unlock scaling on vast biological sequence, structure, and image datasets so that biological world models become a reality. Our team introduced hybrid architectures for million-token context windows, enabling work toward AI-designed whole genomes and gene-editing tools.
We believe the next generation of biological foundation models will require not only new and improved pretraining recipes, but also innovation on post-training: the work that turns a powerful base model into a system that is useful, steerable, robust, and scientifically productive. This role sits at that interface between fundamental research and practical engineering.
About the Role
As a Member of Technical Staff, Post-Training at Radical Numerics, you will develop the training and evaluation loops that shape biological world models after pretraining. You will work on the methods, data, and infrastructure required to improve model behavior on real scientific tasks: reasoning over long biological context, following complex objectives, making useful predictions, and interacting reliably with downstream tools and workflows.
This is a hands-on role for someone who wants to both build systems and deepen understanding. You should be excited to run careful experiments, question whether the metrics reflect reality, and translate empirical findings into better recipes, better datasets, and models that are genuinely useful in practice.
What You’ll Do
- Develop and tune post-training recipes. Design and iterate on post-training stages, datasets, reward signals, and hyperparameters for biological world models. Study how choices in data mixtures, objective design, curriculum, and training schedules affect model behavior.
- Build evaluations that actually matter. Collaborate with the science team to develop and refine evaluation suites for biological reasoning, scientific usefulness, long-context behavior, robustness, and model reliability, and to identify when existing benchmarks stop being informative and should be replaced with better ones.
- Debug model behavior end-to-end. Investigate failure modes in training runs and model outputs, distinguish between signal and noise, and trace problems back to data, optimization, evaluation design, or systems issues.
- Work on preference- and feedback-driven learning. Explore methods such as preference modeling, reward modeling, synthetic feedback, or related post-training approaches that improve how models respond to scientific tasks and constraints.
- Improve data for post-training. Help define, curate, or generate high-quality post-training datasets, including expert-informed data, synthetic data, and task-specific examples grounded in biological workflows.
- Study scaling in post-training. Measure how performance changes with dataset size, recipe complexity, compute budget, and model family. Use those results to guide what we scale next and what new directions are worth exploring.
- Collaborate across research and engineering. Work closely with colleagues in training systems, architecture, and biology-facing research to ensure post-training methods are grounded in the realities of large-scale experimentation and downstream scientific use.
What We’re Looking For
- Strong track record in ML research or engineering, especially in frontier-model training, post-training, alignment, evaluation, data quality, or related areas.
- Proficiency in building production-quality software and research infrastructure, ideally in Python and PyTorch, with comfort debugging large-scale training workflows.
- Ability to design careful experiments, interpret ambiguous results, and separate real effects from artifacts, bugs, or benchmark overfitting.
- Excellent written and verbal communication skills, especially the ability to explain technical findings clearly across research, engineering, and scientific collaborators.
- Curiosity, rigor, and a bias toward iteration: you like improving systems by repeatedly tightening the loop between hypotheses, experiments, and insight.
Nice to Have
- Experience with RLHF, RLAIF, preference optimization, reward modeling, rejection sampling, or other post-training methods for large models.
- Experience designing or operating evaluation frameworks for model quality, reliability, safety, or scientific task performance.
- Familiarity with synthetic data generation, annotation workflows, or expert-in-the-loop data collection.
- Background in applied math, systems, computational biology, or another quantitative scientific field.
- Contributions to open-source ML systems, model tooling, or research infrastructure.
Why Radical Numerics
- Help build the post-training stack for multimodal biological world models that could materially improve how we detect, understand, and respond to problems in health and biology.
- Work in an environment that combines distributed systems, model architecture, and numerics research with real biological applications.
- Join a collaborative culture that values rigor, creativity, and cross-disciplinary partnership across AI labs, biotechs, hospital systems, and research institutes.
- Competitive compensation, comprehensive benefits, and support for continual learning.