Job Application for Inference Server – Product Software Intern at Tenstorrent University Jobs

Tenstorrent is leading the industry in cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

At Tenstorrent, we believe the future of computing must be open, which is why our interns don’t just watch from the sidelines - they help build the core of it. We provide a "code-to-career" pipeline where students collaborate with industry experts to solve high-stakes problems in RISC-V and AI hardware-software co-design. By joining us, you are taking an internship that helps democratize high-performance computing and make it accessible to everyone.

Join our Inference Server Technologies team, where we build the software layer that powers state-of-the-art AI inference on Tenstorrent hardware. This team develops APIs, deploys workloads, and benchmarks end-to-end model performance so developers can efficiently scale inference on our stack. You will work on a project under the guidance of experienced engineers and a dedicated mentor. We are looking for a commitment of at least 3 months for this role, with the potential for extension to 6 months.

This role is hybrid, based in Belgrade, Serbia.

Who You Are

Final-year BSc or MSc student in Computer Science, Software Engineering, Electrical Engineering, or a related technical field
Strong programming fundamentals in Python; familiarity with C++ is a plus
Interested in backend systems, API design, and how ML models are deployed in production environments
Curious about performance optimization techniques such as batching, caching, and model parallelism
Motivated to learn and contribute in a collaborative engineering environment

What We Need

Contribute to backend features and APIs that support AI inference workloads
Assist in deploying, testing, and benchmarking models running on Tenstorrent hardware
Analyze inference performance and help identify optimization opportunities
Write clean, maintainable code with guidance from senior engineers
Collaborate with the team to improve reliability, usability, and performance of the inference server stack

What You Will Learn

How end-to-end ML inference is optimized on custom AI hardware
How scalable backend systems are designed to serve real-world AI applications
How APIs and infrastructure shape the developer experience for AI workloads
Practical performance analysis techniques in production-like environments
How modern AI software stacks integrate models, runtimes, and hardware

Hiring Timelines

This internship opportunity is available in each of our 3 terms, with the following corresponding recruitment cycles:

Winter Term: Mar–May work term, Nov–Jan recruitment.
Summer Term: Aug–Oct work term, Jan–May recruitment.
Fall Term: Oct–Dec work term, Apr–May recruitment.

Please note these timelines are for reference only. Actual timelines may vary.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E1, and E2). These requirements apply to persons located in the U.S. and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or ability to obtain prior license approval from the U.S. Commerce Department or applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.
