AI Infrastructure Engineer
About Us:
Here at Fireworks, we’re building the future of generative AI infrastructure. Fireworks offers the generative AI platform with the highest-quality models and the fastest, most scalable inference. We’ve been independently benchmarked as having the fastest LLM inference, and we have been gaining strong traction with innovative research projects, like our own function-calling and multimodal models. Fireworks is funded by top investors, like Benchmark and Sequoia, and we’re an ambitious, fun team composed primarily of veterans from PyTorch and Google Vertex AI.
Job Duties:
- Design core backend software components. Work with other teams to incorporate their innovations and share ours with them.
- Conduct design and code reviews. Analyze and improve efficiency, scalability, and stability of various system resources.
- Design and implement the hardware and software infrastructure required for AI projects.
- Procure, configure, and manage servers, GPUs, TPUs, and other hardware resources.
- Set up cloud-based environments (e.g., AWS, Azure, GCP) for AI workloads.
- Deploy and manage distributed computing clusters (e.g., Kubernetes) for AI model training and inference.
- Optimize cluster performance and resource allocation for AI workloads.
- Monitor cluster health and troubleshoot issues as they arise.
- Architect and maintain data storage solutions (e.g., data lakes, databases) for AI datasets.
- Ensure data security, access controls, and data versioning. Implement data pipelines for efficient data ingestion and preprocessing.
- Develop and maintain automation scripts and tools for infrastructure provisioning and scaling.
- Implement continuous integration and continuous deployment (CI/CD) pipelines for AI models.
- Orchestrate workflows for training, evaluation, and deployment of AI models.
- Optimize infrastructure to handle large-scale AI workloads efficiently.
- Monitor and analyze system performance, making adjustments as needed.
- Implement load balancing and scaling strategies to meet demand.
- Implement security best practices to protect AI infrastructure and data.
- Stay up to date on security vulnerabilities and apply patches and updates.
- Ensure compliance with relevant data privacy and regulatory requirements.
- Collaborate with data scientists and AI engineers to understand their infrastructure needs.
- Provide technical support and troubleshooting assistance for AI infrastructure issues.
- Train and educate team members on best practices for using AI infrastructure.
Minimum Education & Experience Required:
- Bachelor’s degree or the equivalent in Computer Science, Computer Engineering or a related field
- Three (3) years of experience with ML infrastructure (PyTorch, Vertex AI, and SageMaker) or related experience
Minimum Skills Required:
- Experience in one or more applied ML domains, such as search engines, recommendations, natural language processing, personalization, or similar.
- Experience with building, scaling, and optimizing distributed enterprise-grade Machine Learning systems.
- Experience with architectural patterns of large-scale software applications.
- Experience with publishing papers in machine learning and/or computer vision conferences and journals.
- Experience with large-scale machine learning techniques like semi-supervised learning, weakly-supervised learning, and online adaptation of ML models.
- Experience with publishing in machine learning domains such as computer vision and natural language processing.
Compensation is determined by various factors including individual qualifications, experience, skills, interview performance, market data, and work location. The listed salary range for this role is a guideline and may be modified.
Redwood City Pay Range
$170,000 - $180,000 USD
Why Fireworks AI?
- Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
- Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
- Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
- Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.
Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.