
Member of Technical Staff, Software Engineering
At River, our mission is to create personal AI owned and shaped by each individual. To achieve this, we are rewriting the entire stack from scratch: personal hardware for local inference, bespoke training infrastructure, next-generation UIs, and frontier deep learning research.
Who we are
We are scientists, engineers, and builders from the industry's top tech companies and AI labs. We bring a proven track record of scaling consumer systems for hundreds of millions of users and architecting the pre-training infrastructure behind today's frontier models.
About the Role
We are looking for exceptional systems engineers to build the high-performance engines that train our models. Your goal is to make training at River fast, reliable, and massively scalable.
You will take ownership of our core infrastructure stack, from writing custom GPU kernels to managing clusters of thousands of nodes, so that our researchers can focus on science rather than system bottlenecks.
What You’ll Do
- Architect and deploy fault-tolerant distributed systems for training and inference workloads across clusters with thousands of nodes.
- Design high-performance kernels to maximize tensor operation efficiency, memory throughput, and network performance over InfiniBand/RDMA.
- Profile systems end-to-end to resolve blockers across hardware, software, data loading pipelines, and collective communication primitives.
- Partner directly with research scientists to rapidly implement, optimize, and scale experimental model architectures.
Skills & Qualifications
Minimum Qualifications:
- Bachelor’s degree in Computer Science, Computer Engineering, or equivalent practical industry experience.
- Deep expertise in systems-level languages (C, C++, or Rust) with a track record of writing performant, maintainable code.
- Strong foundation in computer architecture, memory management, and concurrent programming.
- Exceptional debugging skills, especially when tackling complex, non-deterministic issues in distributed environments.
- A highly collaborative mindset and a bias for action to push boundaries across the stack.
Preferred Qualifications: (We encourage you to apply even if you don't meet all of these)
- Hands-on experience with modern AI frameworks (e.g., PyTorch, JAX) and tooling for large-scale model training.
- Deep familiarity with modern GPU architectures (NVIDIA/AMD) and hardware constraints (HBM bandwidth, PCIe limits).
- A proven track record of shipping and maintaining high-performance distributed systems or low-level software libraries.
Logistics & Benefits
- Location: Palo Alto, California.
- Compensation: Depending on experience and skills, the expected base pay is $200,000 to $420,000 USD per year.
- Benefits: Comprehensive health, dental, and vision insurance; unlimited PTO; and relocation assistance as needed.
- Visa Sponsorship: We sponsor visas and are committed to supporting the process for the right candidate.