Job Overview
We are seeking a motivated and technically curious Site Reliability Engineer to help build and maintain the reliable, distributed systems that support our business operations.
In this role, you will play a vital part in supporting the following businesses:
- Vulcan: A cybersecurity solution for GenAI, providing red and blue team services to ensure compliance and security.
- Cymetrics: A cybersecurity platform designed specifically for small and medium enterprises in the APAC region.
- IXT: An insurance core system solution for APAC insurance markets.
Tech Blog: https://medium.com/onedegree-tech-blog
Responsibilities
- Improve system reliability, availability, scalability, performance, and efficiency by leveraging monitoring, alerting, and automation tools on public cloud platforms.
- Participate in capacity planning, analyze software performance, and fine-tune systems to ensure optimal operation.
- Develop and enhance GitLab CI/CD processes and tooling to streamline software delivery and deployment.
- Define and monitor key metrics to assess and enhance system reliability.
- Collaborate closely with the engineering team to improve reliability and operational efficiency at every stage of the software development life cycle (SDLC).
- Troubleshoot and optimize infrastructure, and automate repetitive tasks to increase efficiency and effectiveness.