
Member of Technical Staff — TPU Systems (JAX / XLA / Pallas)

Palo Alto, CA

About the Role

RadixArk is looking for a TPU Systems Engineer to build high-performance inference and training systems using JAX, XLA, and Pallas. You'll push model workloads to their limits on TPU hardware, working on SGLang-JAX and other critical infrastructure that enables efficient deployment of frontier models on Google's tensor processing units.

Requirements

  • 3+ years of experience building production ML systems using JAX/PyTorch, XLA, or TPU-focused frameworks
  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or equivalent industry experience
  • Deep understanding of XLA internals (HLO, MLIR, operator fusion, SPMD partitioning, and sharding strategies) preferred
  • Strong performance tuning instincts across compiler and runtime layers
  • Experience with distributed inference systems (e.g. SGLang, vLLM) or training frameworks (e.g. Miles, Alpa, Pathways)
  • Proficiency in Python with demonstrated ability to write high-performance, production-quality code
  • Experience writing custom GPU/TPU/AI accelerator kernels; familiarity with Pallas for kernel development is strongly preferred

Responsibilities

  • Build high-performance inference and training systems using JAX/XLA/Pallas, including SGLang-JAX
  • Push large-model workloads to their limits on the newest TPU hardware
  • Optimize end-to-end latency and throughput for LLM serving on TPU infrastructure
  • Design and implement SPMD strategies for efficient distributed inference and training
  • Design and implement Pallas kernels for operations that require customized low-level control to achieve the best performance
  • Profile and optimize XLA compilation pipelines and HLO graph transformations
  • Collaborate with kernel engineers and compiler teams to achieve performance wins across the stack
  • Contribute to open-source projects with TPU optimization guides, benchmarks, and architectural insights
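To illustrate the flavor of the SPMD work described above (this sketch is not part of the role description itself — the mesh axis name, shapes, and single-layer model are hypothetical), a minimal JAX example shards a batch across devices and lets XLA's SPMD partitioner propagate the sharding through a matmul:

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D mesh over whatever devices are available (TPU cores in
# production, CPU devices locally); the "data" axis name is arbitrary.
mesh = Mesh(np.array(jax.devices()), axis_names=("data",))

# Shard a batch of activations along the leading (batch) dimension.
x = jax.device_put(jnp.ones((8, 128)), NamedSharding(mesh, P("data", None)))

@jax.jit
def layer(x):
    # Under jit, XLA's SPMD partitioner propagates the input sharding
    # through this matmul without any manual communication code.
    w = jnp.ones((128, 128))
    return x @ w

y = layer(x)
print(y.shape)  # (8, 128)
```

In production the same pattern extends to multi-axis meshes (e.g. separate data and model axes) and per-parameter sharding rules, which is where most of the tuning work in this role would live.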

About RadixArk

RadixArk is an infrastructure-first company built by engineers who've shipped production AI systems, created SGLang (20K+ GitHub stars, the fastest open LLM serving engine), and developed Miles (our large-scale RL framework).
We're on a mission to democratize frontier-level AI infrastructure by building world-class open systems for inference and training.
Our team has optimized kernels serving billions of tokens daily, designed distributed training systems coordinating 10,000+ GPUs, and contributed to infrastructure that powers leading AI companies and research labs.
We're backed by well-known infrastructure investors and partner with NVIDIA, Google, AWS, and frontier AI labs.
Join us in building infrastructure that gives real leverage back to the AI community.

Compensation

We offer competitive compensation for this 1-year residency program, with health benefits and potential for conversion to a full-time role. Compensation is determined by location and prior experience. Strong residents may receive offers to join RadixArk full-time with equity after program completion.

Equal Opportunity

RadixArk is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
