
Manager, Tenant Health & Monitoring Team
Armis, the cyber exposure management & security company, protects the entire attack surface and manages an organization’s cyber risk exposure in real time. In a rapidly evolving, perimeter-less world, Armis ensures that organizations continuously see, protect and manage all critical assets - from the ground to the cloud. Armis secures Fortune 100, 200 and 500 companies as well as national governments, state and local entities to help keep critical infrastructure, economies and society safe and secure 24/7.
Armis is a privately held company headquartered in California.
The Manager of Tenant Health & Monitoring will build, lead, and scale a new operational function responsible for proactively identifying and resolving tenant health issues across performance, data quality, collector stability, and integration health. This leader will serve as both a hands-on technical contributor and a people manager, especially during the early stages of the team, ensuring operational excellence while additional staff are hired.
This role is critical in transforming Armis from a reactive posture to a proactive one: reducing the burden on TCSMs and Support, improving customer experience, and ensuring data cleanliness and platform quality at scale. The Manager will work closely with Technical Customer Success Managers (TCSMs), ACEs, Deployment Engineering, Experts, MTS, Support, Product, and Data Engineering to develop workflows, build internal tools, define “what good looks like,” and drive continuous improvement across all monitored customer tenants.
Core Responsibilities
- Establish and scale the Tenant Health & Monitoring team, including workflows, playbooks, SLAs, KPIs, and operational dashboards.
- Define the blueprint for tenant health monitoring, including performance baselines, data quality standards, collector and integration health metrics, and escalation paths.
- Drive a proactive detection and prevention model, reducing customer impact and internal escalations.
- Act as the central liaison for other teams to ensure issues are handled efficiently and consistently.
- Ensure TCSMs are notified of issues requiring customer input, and maintain clear communication while proactively setting expectations and deadlines.
- Provide structured feedback to Product & Engineering for long-term quality improvements.
- Build dashboards and reporting mechanisms covering tenant health, issue detection rates, MTTR, and data quality trends.
- Partner with Product, DE, and Data teams to define what constitutes “healthy tenant data” across all environments.
- Manage ticket queues, prioritize tasks, and ensure SLA performance across the team.
- Hire, coach, and develop a high-performing team responsible for proactive tenant health monitoring.
- Create a structured onboarding path leveraging Armis360, DE tooling, and internal dashboards.
- Promote a culture of ownership, urgency, and data-driven decision making.
Until the team scales, the Manager will:
- Perform direct monitoring of tenants using Grafana, Deployment Health Assessment tools, MODE dashboards, beta monitoring tools, Support utilities, and internal logs.
- Troubleshoot tenant issues, including:
  - degraded performance (CPU, RAM, resets, throttling)
  - data inconsistency/deduplication
  - collector degradation (offline, high CPU/RAM, disk full, no integrations)
  - integration failures (password issues, connectivity, SPAN flow, network mapper gaps)
- Collaborate with Engineering on data quality issues.
Key Skills
- Strong understanding of system performance monitoring (CPU/RAM/DB utilization/logs).
- Experience with Grafana, backend logs, SSH, APIs, and general troubleshooting.
- Ability to interpret data trends and correlate telemetry signals with tenant behavior.
- Understanding of enterprise cybersecurity concepts and network fundamentals.
- Ability to build an operational program from scratch, including processes, SLAs, runbooks, and monitoring frameworks.
- Strong decision-making skills; able to balance hands-on work with strategic planning.
- Effective at influencing cross-functional teams without direct authority.
- Strong sense of urgency and accountability, especially when managing customer-impacting issues.
- Comfortable managing high-volume, high-stakes operational workloads.
- Excellent written and verbal communication skills with the ability to translate technical findings into business-relevant updates.
- Clear communicator who can manage expectations with leadership and cross-functional stakeholders.
- Experience driving alignment across Support, Engineering, Product, and Customer Success teams.
Requirements
- BS degree in Computer Science, Engineering, Information Systems, or equivalent practical experience
- 5+ years in a technical operations, NOC, SRE, Support Engineering, or similar monitoring-focused role
- 2+ years in a leadership, mentoring, or team-lead capacity (formal or informal)
- Strong hands-on troubleshooting skills across Linux systems, logs, network flows, integrations, and performance telemetry
- Experience with monitoring stacks such as Grafana, Prometheus, CloudWatch, Splunk, Elastic, or equivalent
- Experience working in high-volume, high-urgency operational environments (NOC, SOC, SRE, Cloud Ops, TAM escalation teams)
- Ability to interpret complex telemetry (CPU, RAM, DB performance, throughput, packet loss, retries, API failure patterns)
- Comfortable building new processes, defining SLAs, creating runbooks, and scaling a new operational practice from inception
- Strong analytical abilities; capable of building dashboards, defining KPIs, and identifying systemic issues across tenants
- Excellent communication skills with the ability to translate technical findings into clear updates for cross-functional partners
- Ability to excel in a fast-paced, evolving environment with competing priorities
- A continuous-improvement mindset: believes that what was good enough yesterday must be improved today, and consistently pushes for higher standards, optimization, and refinement
Preferred:
- Direct NOC experience in monitoring network, cloud, or infrastructure environments
- Familiarity with Armis components like collectors, related integrations, SPAN traffic, or similar architectures
- Understanding of cybersecurity fundamentals and enterprise network topologies
The choices you make in your career journey matter. You want to do interesting work in an important field while also having time to live your life, which is why we place so much value in your life-work balance. Armis sets you up for success with comprehensive health benefits, discretionary time off, paid holidays including monthly me days, and a highly inclusive and diverse workplace. Put your unique experiences and perspective to work in an environment where they will enable you to thrive, grow, and live your life with integrity.
Armis is proud to be an equal opportunity employer. We never discriminate based on race, ethnicity, color, ancestry, national origin, religion, sex, sexual orientation, gender identity, age, disability, veteran status, genetic information, marital status or any other legally protected (or not) status. In compliance with federal law, all persons hired will be required to submit satisfactory proof of identity and legal authorization.