AI Engineer – LLM Evaluation & Prompt Engineering
Who is Blueprint?
We are a technology solutions firm headquartered in Bellevue, Washington, with a strong presence across the United States. Unified by a shared passion for solving complicated problems, our people are our greatest asset. We use technology as a tool to bridge the gap between strategy and execution, powered by the knowledge, skills, and expertise of our teams, whose members bring unique perspectives and years of experience across multiple industries. We’re bold, smart, agile, and fun.
What does Blueprint do?
Blueprint helps organizations unlock value from existing assets by leveraging cutting-edge technology to create additional revenue streams and new lines of business. We connect strategy, business solutions, products, and services to transform and grow companies.
Why Blueprint?
At Blueprint, we believe in the power of possibility and are passionate about bringing it to life. Whether you join our bustling product division or our multifaceted services team, or want to grow your career in human resources, your ability to make an impact is amplified when you join one of our teams. You’ll focus on solving unique business problems while gaining hands-on experience with the world’s best technology. We believe in unique perspectives and build teams of people with diverse skill sets and backgrounds. At Blueprint, you’ll have the opportunity to work with multiple clients and teams, such as data science and product development, all while learning, growing, and developing new solutions. We guarantee you won’t find a better place to work and thrive than at Blueprint.
In This Role
In this role, you will contribute to building and operating AI-powered middle-tier services that support conversational experiences within widely used productivity applications. You will focus on prompt evaluation, testing, and automation, ensuring that AI responses are accurate, reliable, and aligned with business and user expectations.
You will work closely with engineering, product, and data partners to evaluate LLM behavior, design test strategies, implement supporting code, and continuously improve prompt quality and system performance.
Key Responsibilities
- Design, evaluate, and refine conversational prompts used in AI-driven applications
- Perform manual and automated testing of LLM outputs to validate accuracy, relevance, and consistency
- Develop and maintain prompt evaluation frameworks and supporting tooling
- Set up and manage test environments for AI and prompt validation workflows
- Write and maintain code (Python or C#) to support evaluations, automation, and analysis
- Analyze evaluation results and provide data-driven recommendations for prompt improvements
- Review enhancement requests and translate requirements into technical solutions
- Prepare detailed software specifications, test plans, and test data
- Modify and enhance existing systems to meet new standards or requirements
- Conduct unit testing, quality assurance reviews, and post-implementation validation
- Support deployment, migration, and implementation activities
- Troubleshoot issues in both new and legacy systems and resolve defects identified during testing
Required Qualifications
- Bachelor’s degree in Computer Science, Computer Engineering, or a related technical field
- 2–4 years of professional software engineering or AI-related experience
- Strong foundation in computer science fundamentals, including data structures, algorithms, and software design
- Experience developing or supporting large-scale software systems
- Hands-on programming experience in Python or C#
- Experience with unit testing, debugging, and troubleshooting in production or pre-production systems
- Ability to analyze requirements and translate them into effective technical solutions
- Strong problem-solving skills and attention to detail
Preferred Qualifications
- Prior experience with LLM evaluation, prompt engineering, or AI experimentation
- Background in data science, experimentation, or model evaluation
- Experience building automated test frameworks or evaluation pipelines
- Familiarity with conversational AI, chatbots, or virtual assistant systems
- Experience working with AI/ML-powered applications in production environments
- Strong analytical mindset with the ability to interpret evaluation results and metrics
Salary Range
Pay ranges vary based on multiple factors including, without limitation, skill sets, education, responsibilities, experience, and geographical market. The pay range for this position reflects geography-based ranges for Washington state: $120,000 to $135,000 USD annually. The salary/wage and job title for this opening will be based on the selected candidate’s qualifications and experience and may be outside this range.
Equal Opportunity Employer
Blueprint Technologies, LLC is an equal employment opportunity employer. Qualified applicants are considered without regard to race, color, age, disability, sex, gender identity or expression, orientation, veteran/military status, religion, national origin, ancestry, marital or familial status, genetic information, citizenship, or any other status protected by law.
If you need assistance or a reasonable accommodation to complete the application process, please reach out to: recruiting@bpcs.com
Blueprint believes in the importance of a healthy and happy team, which is why our comprehensive benefits package includes:
- Medical, dental, and vision coverage
- Flexible Spending Account
- 401k program
- Competitive PTO offerings
- Parental Leave
- Opportunities for professional growth and development
Location: Redmond, WA