Senior Software Engineer

Gdańsk, Pomeranian Voivodeship, Poland

About Graphcore 

How often do you get the chance to build a technology that transforms the future of humanity? Graphcore products have set the standard in made-for-AI compute hardware and software, gaining global attention and industry acclaim. Now we are developing the next generation of artificial intelligence compute with systems that will allow AI researchers to develop more sophisticated models, help scientists unlock exciting new discoveries, and power companies around the world as they put AI at the heart of their business. We recently joined SoftBank Group, bringing large and ongoing investment from one of the world’s leading backers of innovative AI companies.

Job Summary

As a Senior Software Engineer in the ML Software Performance Validation team, you will play a critical role in ensuring the end-to-end performance of our proprietary AI hardware and software stack. You will report directly to the Performance Validation Team Lead and collaborate closely with component teams, including ML Framework developers, Compiler and Runtime teams, Infrastructure engineers, and Product Management. Your work will directly influence the efficiency and scalability of our ML software, enabling reliable and performant AI solutions for our customers.

The Team

The ML Software Performance Validation team is part of the broader ML Software Engineering organisation, responsible for validating and optimizing the performance of our proprietary ML solutions. Our team comprises experienced engineers and specialists dedicated to rigorous performance benchmarking, analysis, and optimization across large-scale distributed systems. We collaborate closely with internal stakeholders to ensure our products meet the highest standards of efficiency and scalability.

Responsibilities and Duties

  • Develop and maintain automated benchmarking and performance validation frameworks and models for ML software stacks.
  • Analyse performance bottlenecks at scale (single-node, multi-node, and multi-rack) and recommend actionable improvements.
  • Collaborate with ML framework, compiler, and distributed computing teams to validate and enhance software optimizations.
  • Implement performance monitoring, profiling, and tracing tools tailored for ML workloads.
  • Perform systematic scalability testing (scale-up and scale-out) and document findings clearly.
  • Design, automate, and execute comprehensive test plans to validate software performance against defined goals.
  • Lead deep-dive debugging sessions and root-cause analysis of performance issues, and coordinate resolution activities.
  • Document performance validation processes and best practices.

Candidate Profile 

Essential:

  • A passion for your work and the ability to thrive in uncertain and complex environments.
  • Hands-on experience with ML software stacks, particularly PyTorch or similar frameworks.
  • Solid programming skills in Python and proficiency with performance debugging and profiling tools (perf, VTune, TensorBoard, or similar).
  • Good knowledge of distributed computing concepts, collective communication algorithms, and their impact on ML workload performance.
  • Demonstrated ability to analyse complex performance data and communicate findings effectively to technical and non-technical stakeholders.
  • Strong problem-solving skills, with the ability to systematically debug complex software and infrastructure issues.

Desirable:

  • Expertise in software performance analysis, profiling, and benchmarking, particularly with large-scale distributed systems.
  • Familiarity with container technologies (Docker, Kubernetes) and their performance implications.
  • Experience with high-performance computing (HPC) clusters and networking technologies (InfiniBand, RDMA).
  • Prior experience with precision timing protocols (NTP, PTP) and time synchronization for performance benchmarking.
  • Knowledge of compiler internals, intermediate representations (IR), and hardware accelerators.
  • Experience working with custom AI/ML hardware accelerators or GPUs.

Benefits

In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar!

We welcome people of different backgrounds and experiences; we’re committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.

Sponsorship

Applicants for this position must hold the right to work in the UK. Unfortunately, we are currently unable to provide visa sponsorship or support for visa applications.
