
IC3 - Infra Engineer - SRE
Job: IC3 IT Infrastructure Engineer - SRE
Job Family: Technology > Sub-Family: Platform Engineering
Reports to (role): Lead | Manager
Objective of the Role
The IC3 SRE is responsible for supporting and enhancing the reliability, availability, and performance of the company's IT infrastructure and applications. This semi-senior role focuses on improving system stability and efficiency through advanced monitoring, automation, and incident response, contributing to the overall success of IT operations and strategic initiatives.
Main Responsibilities
- Advanced System Monitoring: Implement and maintain advanced monitoring solutions to ensure the health and performance of infrastructure and applications.
- Incident Response: Lead incident response activities, diagnose and resolve system reliability issues, and conduct post-incident reviews.
- Automation and Scripting: Develop and implement automation scripts and tools to improve system reliability and operational efficiency.
- Performance Analysis: Collect, analyze, and interpret performance data to identify trends, anomalies, and potential issues, providing actionable insights.
- Documentation: Maintain accurate and up-to-date documentation of system configurations, processes, and procedures.
- Collaboration: Work closely with other IT team members and departments to support reliability engineering projects and initiatives.
- Mentorship: Provide guidance and support to junior engineers, helping to enhance their technical skills and knowledge.
- Security Compliance: Implement and enforce security measures to protect systems and ensure compliance with security policies.
- Continuous Improvement: Drive continuous improvement initiatives, exploring new technologies and methodologies to enhance system reliability.
- Autonomous Work Culture: Actively contribute to creating an autonomous work culture by taking initiative, being self-motivated, and collaborating effectively in an agile and lean environment.
- Spin Culture Ambassador: Embody and promote Spin's values in every action, fostering a positive and inclusive work environment.
- Disaster Recovery: Develop and maintain disaster recovery plans to ensure business continuity in case of system failures.
Required Knowledge and Experience
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
- Minimum of 5 years of experience in site reliability engineering or related fields.
- Strong understanding of system reliability concepts, including monitoring, automation, and incident response.
- Proficiency with scripting languages and automation tools.
- Strong problem-solving and troubleshooting skills.
- Excellent communication and teamwork skills.
- Willingness to learn and adapt to new technologies and processes.
- Data-driven mindset
- Strong communication skills
- English Level: Intermediate to advanced
Spin is committed to a diverse and inclusive workplace.
We are an equal opportunity employer and do not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, disability, age, or any other legally protected status.
If you would like to request an accommodation, please notify your Recruiter.