DevOps Automation Engineer
Kunai is a fast-growing digital consultancy focused on banking, payments, and fintech. We are powered by a global network that attracts the best and brightest people from all backgrounds and cultures, driven by innovation and experimentation, and spread across almost every continent. Over the past decade, we've shipped over 150 products for clients that include Visa, American Express, Capital One, WEX, Wells Fargo, Ernst & Young, and TOMS Shoes. Our founders previously built an agency (Monsoon) that was acquired by Capital One in 2015.
Kunai is working with one of the world’s largest FinTech organizations to build a brand-new core banking platform that will manage tens of billions of dollars in annual transactions across 100 million+ credit card accounts.
Quality Assurance stability has been a challenge during this massive tech transformation. Our brand-new team of software engineers will give engineers greater support in their work, improving developer productivity and code quality by implementing systems and procedures that detect, prevent, and correct issues before they impact a customer or system. We’ll provide continuous QA and automation support for engineers and their deployment pipelines, as well as big-picture thought leadership and the development of processes and systems that underpin the goals of our Testing Center of Excellence.
We’re excited about this opportunity to build world-class automation tools and processes that will be used by thousands of other engineers. If you love the challenge of wearing multiple hats and supporting multiple disciplines like QA, coding, and infrastructure, we’d love to hear from you.
- Effectively manage troubleshooting and recovery of complex incidents, ranging from low to critical impacts
- Resolve incidents through a systematic problem-solving approach, coupled with a strong sense of ownership and drive
- Actively participate in the team’s Agile stories (project work) to streamline and enhance its day-to-day operations
- Create, manage, and utilize appropriate technical procedural documentation
- Proactively monitor the applications and infrastructure behind external and internal customer-facing services, including their availability, latency, performance, and capacity
- Influence resiliency and scalability in production environments in Amazon Web Services (AWS)
- Identify opportunities and develop proactive automated monitoring and alerting solutions by leveraging available tools (Splunk, New Relic, etc.)
- Assist with conducting Root Cause Analysis (RCA) on critical production outages to develop and implement future mitigation strategies
- Utilize QA support expertise to influence and support new designs, architectures, standards, and methods to maintain stability and availability for large-scale distributed systems
- Proactively identify and implement automations for routine maintenance tasks, data gathering, and resolution of common issues
- Continuously seek to develop new skills and technical expertise, and proactively share knowledge with others
- Bachelor's degree or equivalent certification
- Experience managing and troubleshooting incident bridge calls
- Expertise with Python scripting
- Experience using and supporting public cloud environments (AWS, Azure, or GCP)
- Experience with monitoring solutions like Splunk, New Relic, or DataDog
- Experience working with Production Support/Operations teams
At Kunai, we have built deep relationships with our clients. Our bar is high, and our mission is to always exceed our clients’ expectations. If you are fanatical about customer success and driven to solve tough technical challenges, we would love to chat with you!