QA Analyst (AI Systems)

Latin America

 

About CodeRoad

CodeRoad provides end-to-end software development services, helping businesses scale with ideal infrastructure solutions. From staff augmentation to dedicated IT teams and general software engineering, our nearshore technology services empower businesses to thrive in an ever-evolving digital landscape.

About the Role

We are looking for a Junior AI Agent Quality Engineer to help us build and validate the next generation of autonomous systems. In this role, you won't just be checking for bugs; you'll be evaluating the "brain" of our AI.

You will focus on the reliability of Agentic AI and RAG (Retrieval-Augmented Generation). You’ll be responsible for ensuring our agents can plan tasks, use tools accurately, and recover from errors without "hallucinating." This is a perfect role for someone with a QA mindset who is passionate about Python and the future of LLM-based orchestration.

Key Responsibilities

  • Agent Evaluation: Create question templates and Python scripts to test how well AI Agents and RAG instances solve complex tasks.

  • KPI Tracking: Evaluate Agent performance using specific metrics: Success Rate, Tool Use Accuracy, Planning Quality, and Autonomy.

  • Dataset Creation: Collaborate with AI Engineers to build "Ground Truth" datasets and Agentic Task Datasets to benchmark model improvements.

  • API & Logic Testing: Write unit and integration tests for Python-based RESTful APIs and agent endpoints.

  • Performance Testing: Run load and bulk testing (using Locust or JMeter) to see how our AI handles high-volume requests.

  • Input/Output Validation: Perform rigorous testing of agent payloads to catch prompt injection risks and logic failures.


Requirements

  • Agentic AI Focus: Basic understanding of Agentic workflows, prompt engineering (ReAct prompts), and LLM orchestration.

  • AI Observability: Familiarity with (or a strong desire to learn) tools like LangSmith, Langfuse, or OpenTelemetry.

  • Python Skills: Proficiency in Python 3.10+ for automation and data manipulation.

  • API Fundamentals: Strong experience testing and validating RESTful APIs and JSON structures.

  • Analytical Thinking: A "breaker" mindset—the ability to find edge cases where an AI might fail to follow instructions.

  • Language: Advanced English (B2/C1) for global team collaboration and technical documentation.


Nice to Have

  • QA Experience: 1–3 years of experience in Software QA (Manual or Automation).

  • Automation Tooling: Exposure to Pytest, Playwright, or Selenium.

  • AI Frameworks: Exposure to LangChain, LangGraph, or Pydantic AI.

  • Infrastructure: Basic knowledge of Docker and Vector Databases.

  • Performance: Experience with Locust or JMeter for bulk testing.

What You’ll Love

  • 100% Remote
  • Holidays off
  • Paid Time Off
  • Health insurance assistance program
  • Competitive USD compensation
  • Strong team culture and collaborative environment
  • Ongoing training and growth opportunities

This role is 100% remote and LATAM-only.
