Senior Software Engineer, Developer Productivity
Who We Are
Lightning AI is the company reimagining the way AI is built. After creating and releasing PyTorch Lightning in 2019, Lightning AI was launched to reshape the development of artificial intelligence products for commercial and academic use.
We are on a mission to simplify AI development, making it accessible to everyone—from solo researchers to large enterprises. By removing the complexity of building and deploying AI tools, we empower innovators to focus on solving real-world problems. Our platform is built to scale with the latest AI advancements while staying intuitive and adaptable, so you can bring your ideas to life.
We have offices in New York City, San Francisco, and London and are backed by investors such as Coatue, Index Ventures, Bain Capital Ventures, and Firstminute.
- Move Fast: We act with speed and precision, breaking down big challenges into achievable steps.
- Focus: We complete one goal at a time with care, collaborating as a team to deliver features with precision.
- Balance: Sustained performance comes from rest and recovery. We ensure a healthy work-life balance to keep you at your best.
- Craftsmanship: Innovation through excellence. Every detail matters, and we take pride in mastering our craft.
- Minimal: Simplicity drives our innovation. We eliminate complexity through discipline and focus on what truly matters.
What We're Looking For
We're looking for a senior engineer who is passionate about making other developers' lives better. This role is focused on developer productivity: you’ll treat the engineering team as your users, working to streamline local dev workflows, CI/CD, observability, and test infrastructure. You’ll identify friction, design solutions, and ship improvements that make the whole team faster, more stable, and more cost-effective.
You’ll have the same level of technical depth and autonomy as any other engineer—your difference is focus. You’ll chart your own direction, pace, and priorities, while staying fully embedded in the engineering team. Your work will touch every part of the platform and have a compounding impact on our product velocity. You’ll play a key role in optimizing our cloud costs. You’ll partner with product teams to build usage-aware tooling and guardrails, helping us ship efficiently without compromising speed or quality.
You will join the Engineering Squad and report to one of our Engineering Leads. This is a hybrid role based in either our San Francisco or NYC office, with an in-office requirement of 2 days per week. The salary range for this role is $180,000 - $215,000.
What You’ll Do
- Drive the design, development, and implementation of tools, systems, and processes that accelerate engineering velocity, reduce manual effort, and increase the quality of output.
- Be a thought leader on engineering productivity - suggesting better practices and utilizing current technology for improved velocity.
- Monitor and analyze cloud expenditures for storage, compute, and other resources. Set up alerts and dashboards to track real-time costs and usage across environments.
- Identify areas of opportunity for savings on cloud cost and work to realize those savings.
- Architect CI/CD pipelines to improve deployment frequency and reduce manual intervention.
- Work with your team to design and deliver new and existing features in a cost-effective way.
- Guide and advise product engineering teams on best practices for ensuring cost-effective, scalable systems.
- Collaborate deeply, bringing curiosity, empathy, and clarity to your partnerships with engineers and stakeholders.
- Identify inefficiencies and proactively build solutions that make engineering teams faster and more reliable.
What You’ll Need
- 5+ years of engineering experience, including 2+ years building infrastructure and tooling for developers.
- Strong experience with cloud platforms (AWS, GCP, Azure) and tools for cost monitoring and optimization (e.g., AWS Cost Explorer, GCP Cost Management).
- Expertise in CI/CD tools such as Jenkins, CircleCI, GitHub, or GitLab.
- Solid troubleshooting skills with a track record of providing effective L2 support.
- Ability to code in Go (Golang).
- Familiarity with containerization technologies like Docker and Kubernetes.
- Knowledge of monitoring and logging tools such as Prometheus, Grafana, or ELK stack.
- Excellent problem-solving skills, attention to detail, and ability to work in a fast-paced, collaborative environment.
Benefits and Perks
We offer competitive base salaries and stock options with a 25% one-year cliff and monthly vesting thereafter. For our international employees, we work with Velocity Global to pay you in your local currency and provide equitable benefits across the globe.
In the US, we offer:
- Medical, dental and vision
- Life and AD&D insurance
- Flexible paid time off plus 1 week of winter closure
- Generous paid family leave benefits
- $500 monthly meal reimbursement, including groceries & food delivery services
- $500 one-time home office stipend
- $1,000 annual learning & development stipend
- 100% Citibike membership (NYC only)
- $45/month gym membership
- Additional various medical and mental health services
At Lightning AI, we are committed to fostering an inclusive and diverse workplace. We believe that diverse teams drive innovation and create better products. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other protected characteristic. We are dedicated to building a culture where everyone can thrive and contribute to their fullest potential.