Principal AI Evaluation Engineer
About Backbase
As a Principal AI Evaluation Engineer, you will lead the evaluation efforts in our AI-powered SDLC team. You will own the evaluation strategy for AI assistants and agentic workflows, ensuring they are reliable, observable, and safeguarded with strong guardrails. Beyond hands-on work, you will mentor engineers, lead triage and reporting, and make evaluation a cornerstone of release decisions.
Meet the job
- Define and lead the evaluation strategy and roadmap for the AI-powered SDLC core product.
- Build and oversee evaluation pipelines and guardrails.
- Build and maintain evaluation datasets (synthetic and real project data) to benchmark AI behavior.
- Analyze evaluation results, identify gaps, and produce clear, actionable reports for engineering and product stakeholders.
- Build a culture of innovation and excellence, encouraging continuous improvement and adoption of best practices in AI evaluation and deployment.
- Collaborate with cross-functional teams to integrate evaluation insights into development.
How about you?
- Strong understanding of software engineering principles and the software development lifecycle (SDLC).
- Hands-on experience with test design, test management, observability, and data analysis.
- Proficiency in Python (or another scripting language) for automating evaluations.
- Familiarity with AI agent evaluation methods (faithfulness, answer relevancy, contextual accuracy, tool correctness).
- Excellent analytical and problem-solving skills.
- Strong communication and collaboration abilities, able to work with cross-functional teams and stakeholders.
- Demonstrated ability to mentor engineering talent, fostering collaboration and technical excellence.
- (Nice to have) Experience with evaluation frameworks, RAG systems, or agentic workflows.