
Staff Software Engineer - AI Infrastructure
Santa Clara, CA
XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its products, including electric vehicles (EVs), electric vertical take-off and landing (eVTOL) aircraft, and robotics. With a strong focus on intelligent mobility, XPENG is dedicated to reshaping the future of transportation through cutting-edge R&D in AI, machine learning, and smart connectivity.
About the Role
We are looking for a versatile Machine Learning Infrastructure Engineer to join XPENG’s Fuyao AI Platform team, which provides core AI infrastructure for the autonomous driving, robotics, and intelligent cockpit teams: large-scale data processing, model training, and inference acceleration. You will begin by building and optimizing our next-generation DataLoader and Dataset Management System, then expand into distributed training, large-scale inference, model pruning/quantization, and operator-level acceleration to improve AI model efficiency and scalability.
Job Responsibilities
- Design, develop, and maintain high-performance DataLoader SDKs and Dataset Management Systems for multi-source, heterogeneous data (images, videos, point clouds, sensor streams, etc.).
- Optimize multi-threaded/multi-process data pipelines for minimal I/O latency and preprocessing overhead, supporting large-scale model training and inference workloads.
- Contribute to AI infrastructure projects beyond data loading, including:
  - Distributed training and inference optimization.
  - Custom operator development (CUDA kernels, TensorRT, ROCm) and hardware-specific acceleration for GPU/TPU.
  - Model optimization techniques such as pruning, quantization, distillation, sparsification, and mixed-precision training.
- Collaborate with algorithm and platform teams to translate business needs into scalable, production-grade solutions.
- Continuously identify and address performance bottlenecks across the AI training and inference stack.
Minimum Requirements
- Master’s degree in Computer Science, Software Engineering, or equivalent experience.
- 5+ years of experience in large-scale data processing or ML infrastructure.
- Proficient in Python with solid software engineering fundamentals, clean coding practices, and strong debugging skills.
- Hands-on experience with relational databases and NoSQL systems, including metadata and cache management; prior experience with large-scale VectorDB is highly desirable.
- Experience in at least one of the following areas:
  - Large-scale deep learning training or inference optimization focused on scalability and model acceleration (distributed training strategies, quantization, CUDA kernel development, and related optimizations).
  - Columnar storage formats (Parquet/ORC) and related ecosystems, including partitioning, compression, and vectorized I/O optimization.
  - Linux file system and network I/O optimization for NFS, high-performance distributed file systems, and object storage.
  - Large-scale data loading frameworks (PyTorch DataLoader, Hugging Face Datasets).
- Strong communication skills and ability to work cross-functionally in fast-paced environments.
- Strong ability to learn quickly, adapt to new challenges, and proactively explore and adopt new technologies.
Preferred Qualifications
- Familiarity with the autonomous driving industry and enthusiasm for its challenges.
- Experience with distributed computing frameworks such as Ray.
- Experience in building and scaling ML infrastructure in cloud-native environments.
The base salary range for this full-time position is $179,400-$303,600, in addition to bonus, equity and benefits. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.
We are an Equal Opportunity Employer. It is our policy to provide equal employment opportunities to all qualified persons without regard to race, age, color, sex, sexual orientation, religion, national origin, disability, veteran status or marital status or any other prescribed category set forth in federal or state regulations.