
Member of Technical Staff - Applied RL
About Vmax
Vmax is an applied research lab working at the frontier of reinforcement learning (RL). We are building new techniques for leveraging RL with Large Language Models (LLMs). Our research contributes directly to our RL platform, which automates the engineering involved in converting data and evals into RL environments.
About the role
Your objective will be to rapidly deliver bespoke environments and agents for our customers. You will translate customer needs into RL environments and then post-train agents within them. You will also shape our product and research directions, helping us productize our research and make RL more widely accessible.
Responsibilities
- Build RL environments for our customers
- Post-train LLM-based agents on domain-specific tasks
- Productize Vmax research: apply environment generation and automated RL research to improve our customers' agents
Role requirements
- Experience post-training LLMs
- Software engineering experience beyond research projects
- Ability to independently build post-training data and training pipelines
Nice to have
- Research experience in RL
- Open-source contributions to RL frameworks
Role-specific location policy
- This role is based in our San Francisco office. For exceptional candidates, we are willing to consider a hybrid arrangement.
Compensation
The expected salary range for this position is $250,000 to $450,000 USD.