Senior Site Reliability Engineer (Heavy Kubernetes, AWS)
About Vonage
Vonage is a global cloud communications leader that helps businesses accelerate their digital transformation through our fully programmable Unified Communications, Contact Center Applications, and Communications APIs. We enable next-generation communications that are more flexible, intelligent and personal, empowering our customers to do what is next and stay ahead.
At Vonage, our team members bring their passion to work, solve our customers’ problems, and operate as one global team supporting our vision of accelerating the world’s ability to connect.
Working at Vonage, you are valued for your knowledge, skills, achievements, and diverse perspectives. With Accountability and Trust at our core, we offer our employees flexibility in where they work, and we reward you with enriching learning and development opportunities to grow further in your career.
So ask yourself, why not Vonage?
Vonage Mission
We embody the notion of be what’s next now! We envision, develop and manage technology to connect the world. Our team brings excellence, passion, creativity and curiosity to the job. We look at the business environment and technologies in new and challenging ways, striving to develop and deliver integrated whole-system solutions to meet our customers’ ever-changing needs.
Why this role matters
In our Business Support Systems Team, we believe that there shouldn’t be walls between operations and development, and we have embraced the DevOps movement.
As a site reliability engineer, you will work alongside development teams to build automation and tools that deploy, monitor, and maintain the platform's health against its target SLOs and SLAs.
You will do all this on a scrum team with a mixture of experienced developers and engineers. You’ll also often work with teams across the organization to help understand their challenges in delivering large-scale, distributed, and fault-tolerant systems.
IF THIS SOUNDS LIKE YOU, CONTINUE READING BELOW...
What you'll do
- Lead the effort in ensuring reliability of the platform.
- Create software and tooling that improve the performance, stability, and reliability of the platform.
- Work closely with development teams.
- Monitor application and infrastructure metrics to help improve software performance.
- Build solutions that are highly resilient, scalable, and secure.
- Bring a wide breadth of knowledge across software, infrastructure, and security.
- Adopt best practices and champion an engineering culture emphasizing Agile.
What's required for application
- Bachelor's degree (or higher) in Computer Science and 5+ years of hands-on experience in DevOps/SRE.
- 2+ years of experience with any cloud platform (ideally AWS); this is a must.
- Core Technologies: Kubernetes, Helm, Docker, Python, Unix scripting.
- Experience working with monitoring, logging, and alerting solutions.
- Good understanding of CI/CD tools such as Argo CD and Jenkins.
- Understanding of monitoring tools such as Prometheus and Grafana.
- Great communication skills.
What's in it for you
- Unlimited Discretionary Time Off
- Private Medical Insurance with optional dependent coverage
- Educational Assistance Reimbursement Program
- Opportunities for reimbursement for conferences, trainings, and other personal development events
- Maternity and Paternity Leave
- Ask your recruiter for country-specific information.
Note: The purpose of this profile is to provide a general summary of essential responsibilities for the position and is not meant as an exhaustive list. Assignments may differ for individuals within the same role based on business conditions, departmental need or geographic location.