Staff Site Reliability Engineer
Who We Are
Core Scientific is a leading provider of infrastructure for high-performance compute in North America. Our mission is to accelerate digital innovation by scaling high-value compute rapidly, efficiently, and responsibly. We transform energy into high-value compute with unmatched efficiency at scale. We are a $5 billion publicly traded company (NASDAQ: CORZ).
We power AI, HPC, and other next-generation data center workloads demanding exceptional computing power, in addition to our digital asset mining operations. We own and operate nine data centers in seven states, housing advanced infrastructure for our customers.
What sets us apart? We have an entrepreneurial culture, a "can-do" and collaborative attitude, and we own and control our infrastructure. These strategic advantages enable us to maintain operational excellence, increase efficiency, and rapidly deploy cutting-edge innovations developed by our team of experts.
Join us and accelerate your career alongside our groundbreaking journey. We seek smart, creative, and collaborative professionals who thrive in a fast-paced, results-driven environment. Ready to be part of something exceptional? Apply today and make an impact at Core Scientific.
Title
Staff Site Reliability Engineer
Reports To
Site Reliability Engineering Manager
The Job
We are seeking a capable, motivated generalist who thrives in a change-controlled, compliant environment and enjoys working across hybrid cloud and on-premises systems. This role partners closely with application architecture and peer engineering teams while contributing hands-on across platform engineering, DevOps, and SRE.
This position is expected to take ownership of complex technical initiatives and see them through to completion, balancing hands-on implementation with effective delegation and cross-team coordination.
Responsibilities
- Lead end-to-end delivery of complex technical initiatives, from problem definition and design through implementation, rollout, and operation
- Own the design, implementation, and reliability of systems across hybrid cloud and on-premises environments
- Take accountability for technical outcomes, including system reliability, scalability, and performance in regulated, change-controlled environments
- Drive execution by coordinating work across engineers and teams, delegating effectively while remaining hands-on where needed
- Partner with application architecture and peer teams to shape system design and influence technical decisions
- Build, deploy, and operate infrastructure and applications using automation and infrastructure as code
- Implement secure, immutable infrastructure using modern tooling (e.g., Terraform, Kubernetes, Helm, Ansible)
- Improve observability, monitoring, and incident response practices
- Establish and promote best practices for reliability, security, and operational excellence across teams
- Mentor engineers and contribute to raising the technical bar across the organization
- Foster open, respectful, and professional communication within the team and with co-workers and leaders across the organization
- Perform other duties as assigned
Qualifications
- Bachelor's degree in Computer Science or a related field and 7+ years of experience, or equivalent demonstrated impact in SRE, DevOps, or Infrastructure Engineering
- Broad technical experience across infrastructure and distributed systems, with the ability to design effective solutions, apply appropriate patterns, and anticipate scaling, reliability, and operational challenges
- Strong understanding of distributed systems behavior, including application runtime characteristics, service-to-service communication, networking, and failure modes in production environments
- Experience operating in regulated, compliant, or change-controlled environments
- Experience working in hybrid environments (AWS preferred; on-premises infrastructure required)
- Strong experience with Infrastructure as Code, configuration management, and orchestration tools (Terraform, Helm, Kustomize, Ansible)
- Experience with Kubernetes and virtualization technologies
- Experience with observability platforms (e.g., Datadog), including building monitoring and alerting integrations
- Experience with build and release systems (e.g., GitHub Actions, Makefiles, Python tooling)
Location
To be considered for the role, you must reside near Miami, FL or Austin, TX.
Travel
Occasional travel may be required.
Work Environment
This job typically operates in a professional office environment and routinely uses standard equipment, including laptop computers and smartphones. This role may also require travel to data center sites, where the work environment may include loud noise, construction, and other operational elements.
Physical Demands
While performing the duties of this job, the employee is frequently required to sit, stand, walk, use hands, and lift up to 25 pounds.
Position Type/Expected Hours of Work
This is a full-time position. General hours and days of work are Monday through Friday, 8:00 a.m. to 5:00 p.m. The employee is expected to be generally available during U.S. time zones and will be part of an on-call rotation. The current rotation is one week out of every five.
Supervisory Experience (Yes or No)
No
