
Research Fellow
About Goodfire
Behind our name: Like fire, AI holds the potential for both devastating harm and immense benefit. Just as our ancestors' mastery of fire enabled them to cook food, smelt metals, and launch rockets into space, AI stands as humanity's most profound innovation since that first controlled flame. Our goal is to tame this new fire, enabling a safe transition into a post-AGI world.
Goodfire is an AI interpretability research lab focused on understanding and intentionally designing advanced AI systems. We believe that advances in interpretability will unlock the next frontier of safe and powerful foundation models.
At Goodfire, we're building Neural Programming Interfaces (NPIs) to transform how AI models are developed, just as Application Programming Interfaces (APIs) transformed software development. NPIs allow developers to reach into the mind of an AI model, extract the knowledge the model is working with, and design model behavior in precise and targeted ways.
We’re backed by Lightspeed Venture Partners, Menlo Ventures, NFDG’s AI Grant, South Park Commons, Work-Bench, and other leading investors.
Working at Goodfire
Our team brings together AI interpretability experts and experienced startup operators from organizations like OpenAI and DeepMind, united by the belief that interpretability is essential to advancing AI development.
We're a public benefit corporation based in San Francisco. All roles are in-person, five days a week, at our Telegraph Hill office.
The role:
We are seeking talented, early-career researchers or engineers to execute a research project in AI interpretability. Fellows will work alongside leading interpretability researchers at Goodfire, receiving structured mentorship while contributing to important work in AI alignment. The project is expected to span approximately 1-2 months, with some flexibility based on project progress and mutual agreement.
Core responsibilities:
- Execute an assigned interpretability research project according to established methodology
- Produce a co-authored research blog post by program completion
- Incorporate feedback from mentors while continuing to execute independently
- Commit approximately 20 hours per week
Who you are:
Goodfire is looking for people who share our deep commitment to making interpretability accessible. We care deeply about building a team that shares our values:
Put mission and team first
All we do is in service of our mission. We trust each other, deeply care about the success of the organization, and choose to put our team above ourselves.
Improve constantly
We are constantly looking to improve every piece of the business. We proactively critique ourselves and others in a kind and thoughtful way that translates to practical improvements in the organization. We are pragmatic and consistently implement the obvious fixes that work.
Take ownership and initiative
There are no bystanders here. We proactively identify problems and take full responsibility for getting a strong result. We are self-driven, own our mistakes, and feel deep responsibility for what we're building.
Action today
We have a small amount of time to do something incredibly hard and meaningful. The pace and intensity of the organization is high. If we can take action today or tomorrow, we will choose to do it today.
If you share our values and have at least two years of relevant experience, we encourage you to apply and join us in shaping the future of how we design AI systems.
What we are looking for:
- Strong technical background in computer science, machine learning, or related fields
- Previous machine learning research experience
- Proficiency in Python programming and ML frameworks
- Excellent written and verbal communication skills
Preferred qualifications:
- Experience with LLMs and/or AI interpretability research
- Track record of completing structured research projects end-to-end
Success profile:
- Excel at executing well-defined research plans
- Thrive in collaborative environments
- Can work independently while incorporating structured feedback
- Demonstrate strong interest in AI interpretability
Program benefits:
- Weekly stipend
- Full coverage of necessary compute and API costs
- Direct mentorship from Goodfire researchers
- Opportunity to co-author published research
This is a fixed-term fellowship position. While full-time positions are not guaranteed following the fellowship, exceptional performance during the program may lead to future opportunities at Goodfire.