Performance Tester – GenAI

Chennai, Tamil Nadu

Role: Performance Test Engineer – Generative AI

Experience: 5+ years (with hands-on performance testing in GenAI / LLM-based applications)

Role Overview:

We are seeking a skilled and detail-oriented Performance Tester with strong experience in Generative AI (GenAI) projects. The ideal candidate will be responsible for ensuring scalability, reliability, and optimal performance of AI-powered applications, including Large Language Model (LLM) integrations, conversational AI systems, and Retrieval-Augmented Generation (RAG) pipelines. This role requires expertise in performance engineering, cloud platforms, and testing of AI/ML workloads in production environments.

Key Responsibilities

• Performance Strategy & Planning:

  • Define and implement performance testing strategies for GenAI and LLM-based applications.
  • Identify performance bottlenecks across APIs, model inference layers, vector databases, and cloud infrastructure.
  • Establish performance benchmarks, SLAs, and scalability targets for AI-driven systems.

• Performance Testing & Engineering:

  • Design, develop, and execute load, stress, spike, endurance, and scalability tests for GenAI applications.
  • Perform performance testing of LLM-powered APIs (e.g., ChatGPT-like applications) hosted on cloud platforms.
  • Validate latency, throughput, token usage, concurrency handling, and cost-performance trade-offs (a minimal measurement sketch follows this list).
  • Conduct performance validation for RAG pipelines, including embedding generation and vector search.
  • Analyze model inference time, GPU/CPU utilization, memory usage, and autoscaling behavior.
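By way of illustration only (not part of the role's responsibilities), the hedged sketch below shows the kind of measurement such a test produces: it fires concurrent requests at a hypothetical OpenAI-compatible chat-completions endpoint and summarizes latency, error count, and reported token usage. The endpoint URL, API key, model name, and payload fields are assumptions; a real engagement would typically implement this in JMeter, LoadRunner, k6, or Gatling as listed below.

```python
# Minimal illustration only: a concurrent latency/token-usage probe for a
# hypothetical OpenAI-compatible chat endpoint. URL, key, model name, and
# payload fields are assumptions, not any specific system's API.
import os
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

import requests

API_URL = os.getenv("LLM_API_URL", "https://example.com/v1/chat/completions")  # hypothetical
API_KEY = os.getenv("LLM_API_KEY", "")

def one_request(prompt: str) -> dict:
    """Send a single chat request; record latency and reported token usage."""
    payload = {
        "model": "gpt-4o-mini",  # assumed model name; substitute the system under test
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    start = time.perf_counter()
    resp = requests.post(
        API_URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=60,
    )
    latency = time.perf_counter() - start
    usage = resp.json().get("usage", {}) if resp.ok else {}
    return {"latency_s": latency, "status": resp.status_code,
            "total_tokens": usage.get("total_tokens", 0)}

def run_load(concurrency: int = 10, total: int = 100) -> None:
    """Run `total` requests with `concurrency` workers and print summary stats."""
    prompts = [f"Summarize ticket #{i} in one sentence." for i in range(total)]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(one_request, prompts))
    latencies = sorted(r["latency_s"] for r in results)
    errors = sum(1 for r in results if r["status"] != 200)
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    print(f"requests={total} errors={errors} "
          f"mean={statistics.mean(latencies):.2f}s p95={p95:.2f}s "
          f"total_tokens={sum(r['total_tokens'] for r in results)}")

if __name__ == "__main__":
    run_load()
```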

• Tools & Automation:

  • Develop automated performance test scripts using tools such as JMeter, LoadRunner, k6, or Gatling.
  • Monitor system performance using APM tools like Dynatrace, AppDynamics, Azure Monitor, or AWS CloudWatch.
  • Integrate performance testing into CI/CD pipelines using Azure DevOps or similar platforms (a sample gate script follows this list).
  • Create dashboards and reports for performance metrics and trend analysis.
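In practice this step typically runs a JMeter or k6 stage inside the pipeline; purely as a tool-agnostic illustration, the hedged sketch below shows the shape of an SLA gate a pipeline could execute after a load test: it reads a results file and fails the build when thresholds are breached. The results file name, its JSON schema, and the threshold values are assumptions.

```python
# Illustrative SLA gate for a CI stage: exits non-zero (failing the build)
# if measured metrics exceed agreed thresholds. File name and schema are assumptions.
import json
import sys

# Example thresholds; real values would come from the agreed benchmarks/SLAs.
THRESHOLDS = {"p95_latency_s": 3.0, "error_rate": 0.01, "tokens_per_request": 800}

def main(path: str = "perf_results.json") -> int:
    with open(path) as fh:
        results = json.load(fh)  # e.g. {"p95_latency_s": 2.4, "error_rate": 0.0, ...}
    failures = [
        f"{metric}: {results.get(metric)} > {limit}"
        for metric, limit in THRESHOLDS.items()
        if results.get(metric, 0) > limit
    ]
    for line in failures:
        print("SLA breach:", line)
    return 1 if failures else 0  # non-zero exit fails the pipeline step

if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "perf_results.json"))
```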

• Cloud & Infrastructure Testing:

  • Conduct performance testing on AI solutions deployed on Azure, AWS, or GCP.
  • Validate autoscaling configurations, containerized deployments (Docker, Kubernetes), and serverless architectures.
  • Assess performance of vector databases such as Chroma, Pinecone, Weaviate, or FAISS under load.
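Managed vector stores such as Pinecone or Weaviate are normally exercised through their own clients or an HTTP load tool; as a self-contained illustration, the hedged sketch below times queries against a local FAISS index under increasing concurrency. Dimensionality, corpus size, and concurrency levels are arbitrary assumptions.

```python
# Illustrative micro-benchmark of vector-search latency under concurrent load,
# using a local FAISS index as a stand-in for a managed vector database.
# Index size, dimensionality, and concurrency levels are assumptions.
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

import faiss           # pip install faiss-cpu
import numpy as np

DIM, N_VECTORS, N_QUERIES, TOP_K = 768, 50_000, 1_000, 5

rng = np.random.default_rng(0)
index = faiss.IndexFlatL2(DIM)                                   # exact L2 search
index.add(rng.random((N_VECTORS, DIM), dtype=np.float32))        # synthetic corpus
queries = rng.random((N_QUERIES, DIM), dtype=np.float32)

def search_one(q: np.ndarray) -> float:
    """Time a single top-K similarity search."""
    start = time.perf_counter()
    index.search(q.reshape(1, -1), TOP_K)
    return time.perf_counter() - start

for workers in (1, 8, 32):                                       # ramp concurrency
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = sorted(pool.map(search_one, queries))
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    print(f"workers={workers} mean={statistics.mean(latencies)*1000:.2f}ms "
          f"p95={p95*1000:.2f}ms")
```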

• Collaboration & Optimization:

  • Collaborate with AI engineers, data scientists, DevOps, and architects to optimize model serving and API performance.
  • Recommend improvements in prompt engineering, caching strategies, batching, and parallelization (a minimal caching sketch follows this list).
  • Support capacity planning and cost optimization for LLM-based applications.
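To make the caching point concrete, here is a minimal, hypothetical sketch of a response cache keyed on the exact prompt, so repeated queries skip the model call entirely. Real systems would also key on model and parameters and add eviction or a TTL; call_llm is a placeholder, not any specific SDK.

```python
# Minimal illustration of a response cache keyed on the exact prompt.
# Real systems would also key on model/parameters and add TTL/eviction.
import hashlib

_cache: dict[str, str] = {}

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for the real (expensive) model call."""
    return f"response to: {prompt}"

def cached_completion(prompt: str) -> str:
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(prompt)   # only hit the model on a cache miss
    return _cache[key]
```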

• Governance & Reporting:

  • Document performance test results, bottlenecks, and optimization recommendations.
  • Ensure compliance with security and data privacy standards in performance environments.
  • Present findings to stakeholders and provide actionable insights.

Key Requirements

• Technical Skills:

  • 5+ years of experience in Performance Testing and Engineering.
  • Hands-on experience in performance testing GenAI / LLM-based applications.
  • Experience working with LLM platforms such as OpenAI GPT models, Gemini, Llama 2, Claude, or Grok.
  • Understanding of concepts like tokenization, embeddings, vector search, and RAG architecture.
  • Experience testing AI services hosted on Azure AI Services, Azure ML, AWS Bedrock, or Google Vertex AI.
  • Proficiency in performance testing tools such as JMeter, LoadRunner, k6, or Gatling.
  • Knowledge of API testing tools like Postman or Rest Assured.
  • Familiarity with monitoring tools such as Azure Monitor, AWS CloudWatch, Grafana, or Prometheus.
  • Experience with containerization (Docker) and orchestration (Kubernetes).
  • Basic scripting knowledge in Python or Java for test automation.
  • Understanding of CI/CD pipelines and DevOps practices.

• GenAI-Specific Knowledge:

  • Experience testing conversational AI applications and chatbot performance.
  • Knowledge of inference latency optimization techniques for LLMs.
  • Understanding of GPU-based workloads and performance considerations.
  • Exposure to agentic frameworks like LangChain, Semantic Kernel, AutoGen, or CrewAI (preferred).
  • Experience validating performance of vector databases (Chroma, Pinecone, Weaviate, FAISS).

Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • 5+ years of experience in performance testing, with at least 2 years in AI/ML or GenAI projects.
  • Experience in testing cloud-native, microservices-based applications.
  • Strong analytical and troubleshooting skills.
  • Excellent communication and stakeholder management skills.
