Job Application for M1 - Infra Engineer Lead

Objective of the Role

Lead a team of Infrastructure Operations Engineers by providing technical guidance, driving operational excellence, and ensuring the stability and reliability of core infrastructure services. This role is accountable for delivering efficient IT Change Management, Incident and Problem Management, Availability Management, and Service Level Management practices while scaling operational maturity and promoting a culture of continuous improvement and proactive ownership.

Main Responsibilities

Lead, mentor, and develop a team of Infra Engineers focused on IT operations, ensuring high performance, accountability, and continuous capability development.
Oversee execution of IT Change Management processes, ensuring operational risk is managed and all changes are properly documented, assessed, and implemented with minimal disruption.
Act as the escalation point for high-impact incidents, coordinating real-time response, guiding root cause analysis, and ensuring alignment across engineering and business stakeholders.
Lead the Availability Management process, ensuring alignment with business objectives, and overseeing data analysis to optimize the availability and performance of IT services.
Ensure compliance with service level agreements (SLAs), monitor KPIs, and lead initiatives to improve infrastructure reliability and operational performance.
Partner with SRE, DevOps, and Application Support teams to align on tooling, monitoring strategies, and recovery mechanisms that reduce time to detect and resolve issues.
Promote a proactive incident management culture by reinforcing postmortem practices, tracking remediation efforts, and driving operational improvements.
Guide the development and enforcement of operational runbooks, escalation paths, and disaster recovery plan (DRP) protocols to ensure service continuity and audit readiness.
Participate in roadmap planning and represent operational perspectives, ensuring infrastructure and product initiatives incorporate scalability and service resilience.
Promote automation, efficiency, and process optimization to reduce operational toil and improve system observability.
Foster cross-team collaboration and knowledge sharing, ensuring engineers are equipped to handle incidents independently and aligned with company standards.
Support capacity planning efforts by consolidating operational insights and usage trends to recommend scaling actions and optimize infrastructure usage.
Communicate effectively across technical and non-technical stakeholders, providing visibility into team performance, incident resolution status, and improvement plans.
Manage the team's participation in the on-call rotation and ensure handover practices and operational documentation are up to date.
Actively contribute to shaping a strong team culture based on transparency, autonomy, ownership, and alignment with Spin’s values.
Promote an autonomous work culture by encouraging self-management, accountability, and proactive problem-solving among team members.
Serve as a Spin Culture Ambassador to foster and maintain a positive, inclusive, and dynamic work environment that aligns with the company's values and culture.

Required Knowledge and Experience

Minimum 6 years of experience in IT Infrastructure Operations, with at least 1 year in a leadership or mentoring role.
Proven expertise in ITSM disciplines, including end-to-end ownership of Change Management, Incident Response, Problem Management, and Service Level Management practices.
Hands-on experience with infrastructure components such as Linux systems, networking, and monitoring tools across hybrid or cloud-native environments.
Experience managing cross-functional incident response, leading RCAs, and driving improvements based on post-incident learnings.
Solid understanding of service health metrics, performance monitoring, capacity planning, and reliability best practices.
Demonstrated ability to lead through influence, coach engineers, and grow team capabilities in alignment with business needs.
Experience working with observability and ITSM platforms (e.g., Datadog, Prometheus, ServiceNow, Jira Service Management).
Strong problem-solving, communication, and organizational skills, with the ability to operate effectively in high-pressure environments.
Familiarity with automation frameworks or scripting languages (e.g., Python, Bash) is a plus.
High ownership, resilience, and a bias for action to drive team success and operational excellence.

Spin is committed to a diverse and inclusive workplace.
We are an equal opportunity employer and do not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, disability, age, or any other legally protected status.
If you would like to request an accommodation, please notify your Recruiter.
