
M1 - Infra Lead - Observability
🔧 We're Hiring: M1 - Infra Lead - Observability! 🚀
Are you ready to redefine observability and drive system reliability across a dynamic engineering organization? Join us as an Observability Lead and transform how we monitor, analyze, and optimize our systems for maximum performance and user impact!
🎯 What You’ll Do:
✅ Lead and mentor a team of observability engineers, fostering a culture of ownership and continuous improvement.
✅ Redesign and implement an observability architecture that connects system health metrics, logs, and traces to real business impact.
✅ Define and enforce observability best practices across engineering teams, ensuring proper instrumentation and meaningful telemetry data.
✅ Optimize the integration and usage of observability platforms (e.g., Datadog, Grafana, Prometheus, ELK Stack).
✅ Develop a structured alerting strategy to ensure actionable responses and reduce noise.
✅ Partner with SRE, engineering, and product teams to embed observability into the software development lifecycle.
✅ Lead post-incident analysis to drive permanent improvements and prevent recurring issues.
✅ Design and maintain clear, actionable dashboards for real-time system health and performance visibility.
✅ Champion a shift in mindset from reactive monitoring to proactive system reliability.
✅ Provide training and documentation to help engineering teams integrate observability practices.
✅ Collaborate with security and compliance teams to align observability practices with regulatory requirements.
✅ Stay ahead of industry trends and emerging technologies to continuously evolve our observability strategy.
🎓 What We’re Looking For:
🔹 8+ years of experience in observability, SRE, or infrastructure operations.
🔹 Proven leadership experience in driving accountability and engagement across engineering teams.
🔹 Deep understanding of observability principles (monitoring, logging, tracing, metrics).
🔹 Expertise with Datadog, Opsgenie, Grafana, OpenTelemetry, Prometheus, and similar tools.
🔹 Strong analytical skills to correlate observability data with user experience and business impact.
🔹 Experience designing alerting frameworks that prioritize actionable responses over noise.
🔹 Ability to drive cultural and process change within engineering organizations.
🔹 Strong troubleshooting skills for debugging performance issues and infrastructure failures.
🔹 Excellent communication and leadership skills to mentor and influence teams.
🔹 Experience in regulated environments with knowledge of security and compliance requirements.
🔹 Advanced English proficiency for technical discussions and collaboration.
💡 Why Join Us?
Be part of a transformative role where your leadership and expertise will shape the future of observability, driving operational excellence and system reliability across the organization.
📩 If you're ready to lead the way in observability, apply today!
#Observability #PlatformEngineering #Leadership #Hiring #TechJobs
Spin is committed to a diverse and inclusive workplace.
We are an equal opportunity employer and do not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, disability, age, or any other legally protected status.
If you would like to request an accommodation, please notify your Recruiter.