UMBRA - JDE-High - Principal Site Reliability Engineer
Clarity Innovations is a trusted national security partner, dedicated to safeguarding our nation’s interests and delivering innovative solutions that empower the Intelligence Community (IC) and Department of Defense (DoD) to transform data into actionable intelligence, ensuring mission success in an evolving world.
Our mission-first software and data engineering platform modernizes data operations, utilizing advanced workflows, CI/CD, and secure DevSecOps practices. We focus on challenges in Information Warfare, Cyber Operations, Operational Security, and Data Structuring, enabling end-to-end solutions that drive operational impact.
We are committed to delivering cutting-edge tools and capabilities that address the most complex national security challenges, empowering our partners to stay ahead of emerging threats and ensuring the success of their critical missions. At Clarity, we are people-focused and set on being a destination employer for top talent, offering an environment where innovation thrives, careers grow, and individuals are valued. Join us as we continue to lead innovation and tackle the most pressing challenges in national security.
Position Overview
The Network Operations Center (NOC) Engineer assists the NOC Lead in managing and overseeing the daily 8am - 5pm EST operations of a classified cloud development environment, with a strong emphasis on maintaining Kubernetes-hosted services. The NOC Engineer is responsible for coordinating incident response, system monitoring, team leadership, and performance reporting, and for ensuring the development environment’s security and availability.
Key Responsibilities
- Carry out day-to-day operations of the classified NOC, ensuring adherence to service level agreements and system uptime requirements
- Perform monitoring and support of cloud-based systems, networks, and containerized applications in Kubernetes clusters
- Coordinate incident response, troubleshooting, and escalation procedures
- Ensure timely detection, resolution, and documentation of service-impacting events
- When the NOC Lead is absent, act as the primary point of contact for cloud system alerts, outages, and classified network incidents; communicate status to stakeholders and leadership
- Ensure 24/7 observability of network, platform, and container-level components using tools such as Prometheus, Grafana, Fluentd, and Elastic Stack
- Draft technical guidance for NOC staff and collaborate with engineering, cybersecurity, and cloud teams
- Maintain situational awareness of the system through dashboards, logs, and proactive monitoring tools
- Develop and maintain standard operating procedures, incident response plans, runbooks, and shift logs
- Assist the NOC Lead in conducting daily stand-ups, shift handovers, and weekly ops reviews
- Generate operational metrics and performance reports
- Ensure compliance with federal security policies and contribute to continuous accreditation of the cloud system under RMF
- Perform readiness drills and after-action reviews, and contribute to lessons-learned activities
Qualifications
- Must be able to obtain and maintain a TS/SCI security clearance (note: only US citizens are eligible for security clearances)
- Expertise in cloud infrastructure (AWS GovCloud, Azure Government, or C2S/C2E/JWCC), virtualization, and hybrid environments
- Understanding of secure networking, load balancers, DNS in cloud-native architectures, and inter-cluster communication
- Operational experience with Kubernetes, containerized workloads, and supporting technologies (Docker, Helm, Fluentd, Kustomize)
- Strong understanding of monitoring tools (e.g., Prometheus, Grafana, ELK Stack) and ticketing systems (e.g., osTicket, Jira)
- Familiarity with GitOps workflows and infrastructure as code using tools such as Flux or Terraform
- Familiarity with DoD/IC cybersecurity compliance standards, ATO processes, and classified system governance
- Excellent communication skills and the ability to clearly brief complex operational topics to leadership and mission partners
Preferred Qualifications
- Active US TS/SCI security clearance with CI polygraph or higher
- 5+ years of experience in IT operations or network/system administration