Site Reliability Engineer
Pavilion Payments enables the world’s gaming entertainment leaders to create amazing consumer experiences and maximize spend across all of their physical and digital properties. Our complete suite of payment solutions enables safe, secure, and trusted cash access at the cage, on the casino floor, or online. Our compliance and security solutions offer additional layers of automation and risk protection. And our analytics solutions enable clients to view performance across all of their gaming properties.
About the Role
As Pavilion Pay’s inaugural Site Reliability Engineer (SRE), you will play a foundational role in building a resilient infrastructure and ensuring high availability across our systems. This position, part of IT Operations, will work closely with Network, Cloud Infrastructure, DevOps, and Cloud Architects to implement best practices in system reliability, observability, and automated response. This role emphasizes reliability, platform management, and network security.
Key Responsibilities:
Reliability and Incident Management
- Establish and track reliability metrics such as Latency, Traffic, Errors, and Capacity, focusing on uptime across applications and products, with plans to expand monitoring to kiosk and edge networks.
- Develop and refine monitoring systems using Grafana to ensure comprehensive visibility, focusing on continuous improvements in reliability.
- Establish robust processes for incident response and root cause analysis, leveraging OpsGenie to ensure timely and structured responses.
- Work with Tailscale, SUSE, and F5 to support secure, resilient network connectivity and load balancing.
Platform Management and Service Objectives
- Collaborate with IT leadership to define and maintain service level objectives (SLOs) and monitor performance against these standards.
- Structure and optimize platform management with a focus on supporting uptime in our production environment.
Automation, IaC, and CI/CD Pipelines
- Develop and maintain Terraform configurations for scalable, repeatable infrastructure deployment, focusing on minimizing manual tasks and ensuring resource consistency.
- Work with DevOps to optimize CI/CD workflows using Azure DevOps, focusing on pipeline automation and deployment efficiency.
- Automate repetitive tasks and enhance deployment processes within AKS and Azure environments, aiming to reduce potential deployment bottlenecks.
Network and Security Collaboration
- Partner with network engineers to optimize and maintain F5 load balancers and Palo Alto Networks/Panorama for secure, resilient network operations.
- Collaborate with security teams to ensure network traffic and access patterns align with security best practices, integrating observability into network operations.
Requirements:
- Technical Skills Desired: Proficiency with SUSE, AKS, Linux, Azure Cloud, Grafana, Rancher, Terraform, and Azure DevOps pipelines.
- Monitoring Tools: Strong experience with Grafana for observability and OpsGenie for incident response, with a focus on maintaining uptime and proactive alerts.
- Automation and Scripting: Proficiency in scripting (e.g., Bash, Python) and experience with Tailscale for secure networking solutions.
- Problem-Solving Mindset: Experience in identifying and remediating performance and security issues, focusing on proactive, long-term solutions.
First 90 Days:
- Understand the Product: Deepen familiarity with Pavilion Pay's products and their interdependencies.
- Develop Monitoring Structures: Work with current Grafana structures, defining future enhancements.
- Network Architecture and Monitoring: Learn our network architecture and support basic monitoring/alerting systems currently in place.
- Platform Familiarization: Gain familiarity with platform elements, especially Terraform, CI/CD, and SLO definitions, to support a highly reliable production environment.
Pavilion Payments provides equal employment opportunities to all employees and applicants for employment without regard to race, color, religion, sex (including pregnancy), national origin, ancestry, age, marital status, sexual orientation, gender identity or expression, disability, veteran status, genetic information or any other basis protected by law. Those applicants requiring reasonable accommodation to the application and/or interview process should notify a representative of the Human Resources Department.