Job Overview
We are seeking a skilled and passionate Site Reliability Engineer with a strong technical background and excellent communication skills. This individual will lead the development, construction, and management of reliable distributed systems that support our business operations.
In this role, you will play a vital part in supporting Vulcan, our cybersecurity business. Vulcan is a cybersecurity solution for GenAI that provides red and blue team services to ensure compliance and security.
Learn more about us 👉
- Vulcan product: https://vulcanlab.ai/
- Vulcan LinkedIn: https://www.linkedin.com/company/vulcanlab-ai/
- AIFT group: https://aift.io/
Responsibilities
- Implement and enhance system reliability, availability, scalability, performance, and efficiency by leveraging monitoring, alerting, and automation tools on public cloud platforms.
- Participate in capacity planning, analyze software performance, and fine-tune systems to ensure optimal operation.
- Develop and enhance GitLab CI/CD processes and tooling to streamline software delivery and deployment.
- Define and monitor key metrics to assess and enhance system reliability.
- Collaborate closely with the engineering team to improve reliability and operational efficiency at every stage of the software development life cycle (SDLC).
- Troubleshoot and optimize infrastructure, and automate repetitive tasks to increase efficiency and effectiveness.
Requirements
- Strong expertise and experience in cloud technologies.
- Advanced knowledge of monitoring solutions such as Prometheus, Grafana, and the ELK stack (Elasticsearch, Logstash, Kibana).
- Experience across the complete software development life cycle (SDLC).
- In-depth understanding of networking concepts, with a particular focus on security.
- Hands-on experience implementing GitLab CI/CD processes.
- Proficiency in automation platforms such as Ansible and Terraform.
- Knowledge of orchestration tools such as Kubernetes.
- Familiarity with container technologies such as Docker.
- Experience with Git for source code version control.
- Experience with AI pair programming tools, such as those from OpenAI.
- Proficiency in programming languages such as Bash, Python, or Go.
- Experience executing client-side/on-premises deployments is a strong plus.
Interview Process
- HR phone interview: 1 hour
- Onsite interview: 1.5 to 2 hours, meeting with the hiring team and HR
Why Join Us?
- Innovative Environment: Be part of a company at the forefront of GenAI security technology, with opportunities to work on groundbreaking projects.
- Growth Opportunities: Take your career to new heights with our career development programs and growth-focused culture.
- Dynamic Team: Join a multicultural, dynamic team of dedicated professionals who inspire and support each other.
- Compensation: Competitive salary and benefits package, commensurate with experience and performance.

