
Member of Technical Staff - RL Infrastructure
San Francisco
About Vmax
Vmax is an applied research lab developing AI capable of open-ended learning. We are building systems to exceed humans in all capacities by optimising beyond the local maxima of learning from human expertise.
About the role
This role is for strong infrastructure engineers who can build the systems layer for RL at scale: distributed rollouts, training orchestration, inference, evals, data pipelines, observability, and reliability. You will create the durable platform that enables researchers and applied ML engineers to run, debug, and reproduce large-scale RL experiments.
Responsibilities
- Build infrastructure for distributed RL training and inference across thousands of GPUs.
- Improve the reliability, debuggability, and throughput of RL experiments.
- Build interfaces that allow researchers and applied ML engineers to launch, inspect, compare, and reproduce experiments easily.
- Own infrastructure projects end to end, from architecture and implementation through deployment, documentation, and long-term maintenance.
- Identify and eliminate bottlenecks in training, rollout generation, eval execution, data movement, and cluster utilization.
- Maintain engineering standards for RL infrastructure, including testing, observability, versioning, and reproducibility.
Minimum Requirements
- Strong software engineering experience.
- Experience building infrastructure for LLM inference and/or RL training.
- Experience with GPU clusters, distributed training, model serving, or high-throughput inference systems.
- Familiarity with vLLM, SGLang, and modern LLM-RL training frameworks.
- Strong understanding of system reliability, observability, testing, debugging, and performance optimization.
- Ability to work closely with ML researchers and translate messy experimental workflows into durable infrastructure.
- Experience building tools, platforms, or services used by other technical users.
- Strong judgment around technical tradeoffs: when to prototype, when to harden, when to simplify, and when to redesign.
- Clear written and verbal communication, especially around system design, operational risks, and engineering tradeoffs.
Nice to have
- Experience supporting research teams or fast-moving ML teams.
- Experience at an organization with a high engineering bar, where reliability, ownership, and code quality were central.
- Evidence of strong independent technical work, such as open-source projects, infrastructure projects, competitions, or substantial systems built from scratch.
- Experience reducing operational complexity in systems that had become brittle, slow, or hard to debug.
Role-specific location policy
- This role is based in our San Francisco office; for exceptional candidates, we are willing to consider a hybrid arrangement.
Compensation
The expected salary range for this position is $300,000 to $500,000 USD.