Performance Tester – GenAI
Role: Performance Test Engineer – Generative AI
Experience: 5+ years (with hands-on performance testing in GenAI / LLM-based applications)
Role Overview:
We are seeking a skilled and detail-oriented Performance Tester with strong experience in Generative AI (GenAI) projects. The ideal candidate will be responsible for ensuring scalability, reliability, and optimal performance of AI-powered applications, including Large Language Model (LLM) integrations, conversational AI systems, and Retrieval-Augmented Generation (RAG) pipelines. This role requires expertise in performance engineering, cloud platforms, and testing of AI/ML workloads in production environments.
Key Responsibilities
• Performance Strategy & Planning:
- Define and implement performance testing strategies for GenAI and LLM-based applications.
- Identify performance bottlenecks across APIs, model inference layers, vector databases, and cloud infrastructure.
- Establish performance benchmarks, SLAs, and scalability targets for AI-driven systems.
• Performance Testing & Engineering:
- Design, develop, and execute load, stress, spike, endurance, and scalability tests for GenAI applications.
- Performance-test LLM-powered APIs (e.g., ChatGPT-like applications) hosted on cloud platforms.
- Validate latency, throughput, token usage, concurrency handling, and cost-performance trade-offs (see the sketch after this list).
- Conduct performance validation for RAG pipelines including embedding generation and vector search.
- Analyze model inference time, GPU/CPU utilization, memory usage, and autoscaling behavior.
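For illustration, a minimal sketch of the kind of concurrency-and-token check described above, in Python. It assumes an OpenAI-compatible chat-completions endpoint; the URL, model name, and LLM_API_KEY environment variable are placeholders, not specifics of this role.

```python
# Drive concurrent requests against an OpenAI-compatible chat endpoint and
# record per-request latency and token usage. Endpoint/model are placeholders.
import os
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

ENDPOINT = "https://api.example.com/v1/chat/completions"  # hypothetical URL
MODEL = "gpt-4o-mini"        # placeholder model name
CONCURRENCY = 10             # simulated concurrent users
TOTAL_REQUESTS = 50

def one_request(_):
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": "Summarize RAG in one sentence."}],
    }
    headers = {"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"}
    start = time.perf_counter()
    resp = requests.post(ENDPOINT, json=payload, headers=headers, timeout=120)
    latency = time.perf_counter() - start
    resp.raise_for_status()
    # OpenAI-style responses report usage; other providers' schemas may differ.
    tokens = resp.json().get("usage", {}).get("total_tokens", 0)
    return latency, tokens

with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    results = list(pool.map(one_request, range(TOTAL_REQUESTS)))

latencies = sorted(r[0] for r in results)
print(f"p50 latency: {statistics.median(latencies):.2f}s")
print(f"p95 latency: {latencies[int(0.95 * (len(latencies) - 1))]:.2f}s")
print(f"total tokens: {sum(r[1] for r in results)}")
```

Token counts feed directly into the cost-performance trade-off analysis, since most hosted LLMs bill per token.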
• Tools & Automation:
- Develop automated performance test scripts using tools such as JMeter, LoadRunner, k6, or Gatling.
- Monitor system performance using APM tools like Dynatrace, AppDynamics, Azure Monitor, or AWS CloudWatch.
- Integrate performance testing into CI/CD pipelines using Azure DevOps or similar platforms.
- Create dashboards and reports for performance metrics and trend analysis.
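As a rough illustration of the reporting side, the sketch below rolls raw latency samples into percentile rows appended to a CSV that a dashboard (e.g., Grafana) can trend over time; the file name and column layout are assumptions, not a mandated format.

```python
# Aggregate raw latency samples into percentile metrics and append them to a
# CSV for trend analysis. File name and fields are illustrative only.
import csv
import datetime

def percentile(samples, p):
    s = sorted(samples)
    return s[min(len(s) - 1, int(p / 100 * len(s)))]

def record_run(samples, path="perf_trend.csv"):
    row = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "requests": len(samples),
        "p50_s": round(percentile(samples, 50), 3),
        "p95_s": round(percentile(samples, 95), 3),
        "p99_s": round(percentile(samples, 99), 3),
    }
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=row.keys())
        if f.tell() == 0:  # brand-new file: write the header once
            writer.writeheader()
        writer.writerow(row)

record_run([0.42, 0.55, 0.61, 0.73, 1.10])  # e.g., latencies from one test run
```

A CI/CD stage can call record_run after each load test so a regression shows up as a trend break rather than a one-off number.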
• Cloud & Infrastructure Testing:
- Conduct performance testing on AI solutions deployed on Azure, AWS, or GCP.
- Validate autoscaling configurations, containerized deployments (Docker, Kubernetes), and serverless architectures.
- Assess performance of vector databases such as Chroma, Pinecone, Weaviate, or FAISS under load.
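A minimal sketch of the vector-search load check mentioned above, using an in-process FAISS index (pip install faiss-cpu); dimensions, corpus size, and K are arbitrary stand-ins, and managed stores such as Pinecone or Weaviate would be exercised over their client APIs instead.

```python
# Measure top-K vector-search latency against an in-process FAISS index.
# Sizes are illustrative; real tests would use production-like embeddings.
import time
import numpy as np
import faiss

DIM, CORPUS, QUERIES, K = 384, 100_000, 1_000, 5

rng = np.random.default_rng(0)
index = faiss.IndexFlatL2(DIM)                      # exact L2 search
index.add(rng.random((CORPUS, DIM), dtype=np.float32))

latencies = []
for q in rng.random((QUERIES, DIM), dtype=np.float32):
    start = time.perf_counter()
    index.search(q.reshape(1, -1), K)               # K nearest neighbours
    latencies.append(time.perf_counter() - start)

latencies.sort()
print(f"serial QPS: {len(latencies) / sum(latencies):.0f}")
print(f"p95 latency: {latencies[int(0.95 * (len(latencies) - 1))] * 1000:.2f} ms")
```

Exact (flat) indexes give a latency baseline; approximate indexes (e.g., IVF or HNSW) trade recall for speed and deserve the same treatment under load.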
• Collaboration & Optimization:
- Collaborate with AI engineers, data scientists, DevOps, and architects to optimize model serving and API performance.
- Recommend improvements in prompt engineering, caching strategies, batching, and parallelization (a toy caching sketch follows this list).
- Support capacity planning and cost optimization for LLM-based applications.
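To make the caching recommendation concrete, a toy sketch: identical prompts are served from a local cache instead of the model, cutting both latency and token spend. call_llm is a hypothetical stand-in for a real client, and a production cache would also need an eviction policy and, often, semantic (similarity-based) matching.

```python
# Toy exact-match response cache: repeated prompts skip the model call.
import hashlib

_cache = {}

def call_llm(prompt):
    return f"model answer for: {prompt}"   # placeholder for a real API call

def cached_completion(prompt):
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(prompt)     # miss: pay the model cost once
    return _cache[key]

cached_completion("What is RAG?")   # miss: hits the model
cached_completion("What is RAG?")   # hit: served locally
```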
• Governance & Reporting:
- Document performance test results, bottlenecks, and optimization recommendations.
- Ensure compliance with security and data privacy standards in performance environments.
- Present findings to stakeholders and provide actionable insights.
Key Requirements
• Technical Skills:
- 5+ years of experience in performance testing and engineering.
- Hands-on experience in performance testing GenAI / LLM-based applications.
- Experience working with LLM platforms such as OpenAI GPT models, Gemini, Llama 2, Claude, or Grok.
- Understanding of concepts like tokenization, embeddings, vector search, and RAG architecture.
- Experience testing AI services hosted on Azure AI Services, Azure ML, Amazon Bedrock, or Google Vertex AI.
- Proficiency in performance testing tools such as JMeter, LoadRunner, k6, or Gatling.
- Knowledge of API testing tools like Postman or REST Assured.
- Familiarity with monitoring tools such as Azure Monitor, AWS CloudWatch, Grafana, or Prometheus.
- Experience with containerization (Docker) and orchestration (Kubernetes).
- Basic scripting knowledge in Python or Java for test automation.
- Understanding of CI/CD pipelines and DevOps practices.
• GenAI-Specific Knowledge:
- Experience testing conversational AI applications and chatbot performance.
- Knowledge of inference latency optimization techniques for LLMs.
- Understanding of GPU-based workloads and performance considerations.
- Exposure to agentic frameworks like LangChain, Semantic Kernel, AutoGen, or CrewAI (preferred).
- Experience validating performance of vector databases (Chroma, Pinecone, Weaviate, FAISS).
Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- 5+ years of experience in performance testing, with at least 2 years in AI/ML or GenAI projects.
- Experience in testing cloud-native, microservices-based applications.
- Strong analytical and troubleshooting skills.
- Excellent communication and stakeholder management skills.