Principal Site Reliability Engineer
SonicWall is a cybersecurity forerunner with more than 30 years of expertise and is recognized as a leading partner-first company, ensuring our partners and their customers are never alone in the fight against cybercrime. With the ability to build, scale and manage security across cloud, hybrid and traditional environments in real time, SonicWall provides relentless security against the most evasive cyberattacks across endless exposure points for increasingly remote, mobile and cloud-enabled users. With its own threat research center, SonicWall can quickly and economically provide purpose-built security solutions to enable any organization (enterprise, government agencies and SMBs) around the world. For more information, visit www.sonicwall.com or follow us on Twitter, LinkedIn, Facebook and Instagram.
Join our team as an SRE to lead the charge in ensuring our systems are highly reliable, scalable, and performant. You'll collaborate with seasoned engineers to optimize our infrastructure through data-driven decisions, automation, and innovative problem-solving. Your role will involve designing robust monitoring solutions, auditing Kubernetes-based workloads, and driving the implementation of CI/CD pipelines. With your expertise in AWS/GCP, you'll architect solutions that enhance the reliability of our large-scale cloud-based infrastructure. If you're passionate about technology, have a knack for data analysis, and thrive in a dynamic environment, we want you to shape the future of our operations.
Required Skills:
- Experience applying engineering principles to infrastructure management.
- Proficient in identifying and addressing bottlenecks and failure points in large-scale distributed systems.
- Hands-on experience with Kubernetes (GKE) clusters and workloads using tools like Lens, K9s, and FluxCD.
- Ability to translate business needs into actionable metrics, pulling data from multiple sources like AWS, GCP, and custom applications.
- Skilled in building and maintaining dashboards using tools like Datadog, Grafana, Prometheus, and StatsD to provide critical insights to business leaders.
- Expert use of performance analysis and debugging tools such as tcpdump, ss/netstat, top, sar, and ab.
- Proficiency in at least one scripting language (e.g., Python, Bash, Perl).
- MySQL database administration.
#LI-KB7
#LI-Remote
#Kubernetes
SonicWall is an equal opportunity employer.
We are committed to creating a diverse environment and are an equal opportunity employer. All qualified applicants receive consideration for employment without regard to race, color, ethnicity, religion, sex, gender, gender identity and expression, sexual orientation, national origin, disability, age, marital status, veteran status, pregnancy, or any other basis prohibited by applicable law.
At SonicWall, we pride ourselves on recruiting a diverse mix of talented people and providing active security solutions in 100+ countries.