
DevOps Engineering Manager
Company Overview:
About Us:
Atlanta-based Incident IQ is the leading workflow management platform built exclusively for K-12 districts. Trusted by over 2,000 districts, Incident IQ powers mission-critical services for more than 12 million students and educators nationwide. By connecting technology and operational workflows, Incident IQ enables schools to streamline processes, reduce administrative burdens, and focus on what matters most: supporting students.
Purpose:
Incident IQ is committed to creating a future where every K-12 district operates with seamless efficiency. When operations are unified on a single platform, districts gain the clarity and control needed to build a stronger foundation for student success. We’re focused on delivering the tools, support, and partnerships that help make that vision a reality.
Mission:
Incident IQ is on a mission to eliminate the friction of disconnected systems and clunky workflows that slow schools down. We’re reimagining the critical work that happens behind the scenes, bringing visibility, efficiency, and impact to the processes that keep classrooms running. By streamlining the complex, automating the routine, and surfacing the insights that matter most, we can create the conditions for educators to teach, students to thrive, and districts to shape the future of education.
DevOps Engineering Manager Overview
We are seeking a highly technical, results-oriented DevOps Engineering Manager to lead and evolve our cloud platform operations at scale. This is not a maintenance role — it’s a leadership position for someone who thrives on solving complex problems, optimizing infrastructure for performance and cost, and leading teams through challenges to deliver measurable progress.
The ideal candidate has deep expertise in Azure cloud, microservice architectures, and DevOps automation at scale, paired with strong people leadership and the ability to make fast, high-quality decisions. This role requires someone who can get into the technical trenches when needed, hold the team accountable to high standards, and drive innovation in automation, reliability, and developer experience.
Technical Leadership & Direction
- Provide strong technical direction across all DevOps initiatives — from infrastructure design to pipeline automation.
- Lead with authority in architecture discussions; drive clarity, make decisive calls, and resolve blockers swiftly.
- Be hands-on when necessary: investigate incidents, review architectures, or design automation solutions yourself.
- Partner with the Architecture team to evolve our Azure-based microservices infrastructure to support billions of monthly requests.
- Maintain deep situational awareness of the platform — knowing what’s running, what’s breaking, and what’s improving.
Team Ownership & Development
- Lead three critical DevOps pods of 2-3 engineers each:
  - SRE / Platform Sustainability
  - CI/CD Innovation & Automation
  - Developer Experience & Internal AI Tooling
- Develop and mentor technical leads and engineers, ensuring each pod has a clear mission, measurable goals, and high accountability.
- Create a culture of urgency, transparency, and continuous improvement.
- Have the tough conversations early — ensuring accountability and alignment with engineering and business goals.
- Recruit and develop exceptional talent, growing the team’s technical strength and delivery velocity.
Fiscal & Operational Responsibility
- Own and manage the Azure budget, ensuring cloud spend aligns with business priorities and remains within target.
- Design and enforce cost-efficiency practices in architecture, scaling, and automation.
- Implement proactive cloud spend monitoring and reporting to maintain fiscal visibility and control.
Execution & Delivery
- Deliver automation, pipelines, and infrastructure improvements with velocity — measured in hours or days, not weeks.
- Drive progress through ambiguity and technical constraints; never accept “blocked” as the final answer.
- Push for measurable outcomes: faster deployments, lower downtime, improved cost-to-performance ratios.
- Ensure that platform reliability, observability, and performance continuously evolve with scale.
Innovation & AI-Driven Operations
- Champion the integration of AI and automation in CI/CD, monitoring, and developer workflows.
- Partner with internal AI initiatives to improve system intelligence, predictive monitoring, and developer productivity.
- Stay ahead of emerging DevOps trends to ensure Incident IQ remains at the forefront of modern operations practices.
Requirements
- Bachelor’s Degree in Computer Science, Software Engineering, or related field (or equivalent experience).
- 7+ years of engineering experience, including 3+ years managing high-performing DevOps or SRE teams.
- Deep technical expertise in Azure, Kubernetes, microservice architectures, and distributed systems.
- Proven success operating and scaling high-traffic SaaS systems (billions of monthly requests).
- Mastery of CI/CD systems (Azure DevOps, GitHub Actions, etc.) and Infrastructure as Code (Terraform, Bicep, or equivalent).
- Strong background in observability and performance monitoring (Datadog, Grafana, Azure Monitor).
- Demonstrated ability to lead teams through complex challenges and deliver under pressure.
- History of budget ownership and driving cloud cost optimization initiatives.
- Exceptional communication skills — able to align engineers, executives, and cross-functional stakeholders.
Preferred
- Experience incorporating AI/ML or intelligent automation into operational workflows.
- Familiarity with GitOps, service mesh architectures, and distributed tracing.
- Background in developer experience platforms or internal tooling.
- Track record of transforming DevOps functions and demonstrating measurable efficiency or reliability gains.
Technology Stack
- Cloud: Azure, ARM/Bicep, Terraform
- Orchestration: Kubernetes, Docker, Helm
- CI/CD: Azure DevOps, GitHub Actions, Backstage, ArgoCD, Octopus Deploy
- Monitoring: Application Insights, Grafana, Azure Monitor, Prometheus, ELK
- Languages: C#, PowerShell, Python, Bash
- AI Tooling: Azure AI Services, OpenAI, Copilot Integrations, Claude, Gemini, Cursor
What makes Incident IQ different:
- We facilitate whole-person growth where employees can develop personally as well as professionally.
- We offer an energetic and collaborative environment; everyone’s opinion matters!
- We produce software that empowers K-12 schools to run efficiently, allowing for a better classroom experience for students to THRIVE!
- We provide excellent work/life balance and two amazing offices: one in Downtown Atlanta and one at Halcyon in Alpharetta!
Incident IQ offers a competitive salary based on experience, with a benefits package for full-time employees that includes medical, dental, vision, life insurance, 401(k) match, and paid time off (PTO).
Incident IQ is an Equal Opportunity Employer
