
Observability Expert

Bangalore

Job Description

Job Overview

The Observability Expert plays a crucial role in optimizing the performance, reliability, and security of complex IT systems. The role specializes in the collection, processing, analysis, and visualization of telemetry data, and is responsible for continuously monitoring system health and for proactively identifying and resolving potential issues.

Key Responsibilities

  • Design and Implement Observability Frameworks: Build and maintain comprehensive frameworks for monitoring the performance, health, and reliability of IT systems.
  • Manage Monitoring Infrastructure: Set up new monitoring capabilities such as dashboards, alerts, and metrics for existing and new infrastructure and applications.
  • Anomaly Detection and Problem Resolution: Utilize advanced tools and techniques to identify deviations from normal behavior and address potential issues.
  • Resource Optimization: Analyze telemetry data to maximize resource efficiency and cost-effectiveness.
  • Enhance User Experience: Improve user experience by analyzing telemetry data to optimize performance and reduce bottlenecks.
  • Support Data-Driven Decision Making: Provide actionable insights that enable informed choices based on real-time data.
  • Strengthen Compliance and Security: Maintain robust security practices by monitoring and analyzing potential vulnerabilities and risks to ensure regulatory compliance.

Required Skills and Experience

  • Master's degree in Computer Science, Engineering, or a related field
  • 3-5 years of proven experience in Observability or related fields
  • Experience implementing and managing observability solutions such as New Relic, Datadog, Dynatrace, Prometheus, Grafana, the ELK stack, Splunk, and AWS CloudWatch
  • Experience with cloud platforms (AWS, GCP, Azure, etc.)
  • Proficiency in programming and query languages such as Python, Java, Go, TypeScript, and SQL
  • Experience building and operating CI/CD pipelines
  • Deep understanding of telemetry data (metrics, events, logs, traces) and the ability to analyze it
  • Strong problem-solving skills and analytical thinking
  • Effective communication skills for working with cross-functional teams and business stakeholders
  • Understanding of, and practical experience with, Agile development methodologies

Desired Qualities

  • Proactive problem-solver with a data-driven approach
  • Enthusiastic about continuous learning and adapting to industry trends
  • Ability to understand and optimize complex systems
  • Capable of facilitating collaboration between teams and sharing best practices
