
AI Systems Engineer
Perplexity is an AI-powered answer engine founded in December 2022 and growing rapidly as one of the world’s leading AI platforms. Perplexity has raised over $1B in venture investment from some of the world’s most visionary and successful leaders, including Elad Gil, Daniel Gross, Jeff Bezos, Accel, IVP, NEA, Nvidia, Samsung, and many more. Our objective is to build accurate, trustworthy AI that powers decision-making for people and assistive AI wherever decisions are being made. Throughout human history, change and innovation have always been driven by curious people. Today, curious people use Perplexity to answer more than 780 million queries every month, a number that’s growing rapidly for one simple reason: everyone can be curious.
We are looking for an AI Systems Engineer to join our growing team. Our current stack is Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes. You will have the opportunity to work on large-scale deployment of machine learning models for real-time inference.
Responsibilities
- Develop robust APIs for AI inference used by both internal and external customers
- Design, deploy, and maintain scalable, reliable infrastructure for deploying machine learning models
- Benchmark system performance, diagnose bottlenecks, and implement improvements across the inference stack
- Enhance system reliability and observability by integrating modern monitoring and alerting tools
- Respond swiftly to system outages and collaborate across teams to maintain high uptime and performance
Qualifications
- Experience in developing APIs and managing distributed systems
- Strong understanding of Kubernetes and container orchestration
- Experience with deploying reliable, distributed, real-time systems at scale
- High-level familiarity with LLM architecture and its key pieces (multi-head and multi-query/grouped-query attention, as well as common layers)