Helix AI Engineer, Modeling
Figure is an AI robotics company developing autonomous general-purpose humanoid robots. Our goal is to build embodied AI systems that can perceive, reason, and act in the real world. Figure is headquartered in San Jose, CA, and this role requires 5 days/week in-office collaboration.
Our Helix team is responsible for developing the core AI systems that power humanoid autonomy. We are looking for a Helix AI Engineer, Modeling to design and advance the core model architectures and learning approaches that enable perception, reasoning, and action in embodied systems.
This role focuses on developing new modeling approaches across vision, language, and action, spanning representation learning, multimodal fusion, and model capabilities that directly impact robot intelligence.
Responsibilities
- Design and develop model architectures for perception, reasoning, and action across multimodal inputs (e.g., vision, language, proprioception)
- Build models that learn structured representations of the world, including objects, dynamics, and interactions
- Advance multimodal learning approaches, including fusion, alignment, and cross-modal reasoning
- Improve model capabilities in areas such as generalization, robustness, and long-horizon reasoning
- Work across the model lifecycle, from initial research and prototyping to training and deployment
- Collaborate closely with pretraining, video, generative, RL, and robot learning teams to integrate modeling advances into the full autonomy stack
- Design experiments and evaluation frameworks to understand model behavior and guide iteration
- Contribute to the development of new modeling paradigms for embodied AI systems
Requirements
- Experience designing and training deep learning models for vision, language, or multimodal systems
- Strong understanding of modern model architectures (e.g., transformers and related approaches)
- Experience improving model performance through architectural innovation and experimentation
- Proficiency in Python and deep learning frameworks such as PyTorch
- Strong experimental rigor and ability to iterate on model design and performance
- Solid software engineering skills and ability to build reliable, maintainable systems
- Ability to operate independently and drive ambiguous, high-impact technical problems to resolution
Bonus Qualifications
- Experience with multimodal models (vision-language or vision-language-action systems)
- Background in representation learning, world models, or structured prediction
- Experience working on frontier models at companies such as OpenAI, Google DeepMind, Anthropic, Meta, or xAI
- Familiarity with embodied AI, robotics, or real-world ML systems
- Experience with large-scale training or distributed systems
- Publication record in machine learning, computer vision, NLP, or multimodal AI
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.