DevOps Manager
At Mitratech, we are a team of technocrats focused on building world-class products that simplify operations in the Legal, Risk, Compliance, and HR functions. We are a close-knit, globally dispersed team that thrives in an ecosystem that supports individual excellence and takes pride in its diverse and inclusive work culture centered around great people practices, learning opportunities, and having fun! Our culture is the ideal blend of entrepreneurial spirit and enterprise investment, enabling the chance to move at a rapid pace with some of the most complex, leading-edge technologies available.
For over 35 years, the experts at Mitratech have focused on solving complex legal, risk, compliance, and HR needs. Today, we serve 20,000 client companies of all sizes globally, representing 30% of the Fortune 500 and over 500,000 users in over 160 countries.
As we continue to grow, we’re always looking for resourceful, enthusiastic people with fresh perspectives. Join our global team and see what makes Mitratech a truly exceptional place to work!
Job Overview:
We are looking for an experienced and passionate DevOps & SRE Manager to lead multiple DevOps and Site Reliability Engineering teams. The ideal candidate will be responsible for building and maintaining scalable, reliable, and high-performing infrastructure and operational processes. As a DevOps & SRE Manager, you will play a key role in ensuring our development, deployment, and operational practices align with industry standards while fostering a culture of automation and continuous improvement.
Key Responsibilities:
Leadership & Team Management:
- Lead, mentor, and develop a team of DevOps engineers and SREs to drive innovation and operational excellence.
- Build a collaborative and inclusive team culture focused on delivering high-quality services.
- Establish and track goals for your team to align with business objectives.
Infrastructure Automation & Scalability:
- Design, implement, and manage highly available and scalable cloud infrastructure in AWS and Azure.
- Oversee the implementation of Infrastructure as Code (IaC) tools (e.g., Terraform, Bicep, Ansible) to automate provisioning and configuration.
- Identify and address bottlenecks in deployment pipelines and infrastructure performance.
Site Reliability Engineering:
- Lead efforts to define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
- Drive incident management processes to quickly detect, mitigate, and resolve issues while ensuring post-mortem analyses for continuous improvement.
- Optimize and enhance monitoring, logging, and alerting systems (e.g., New Relic, Datadog, Splunk, Prometheus, Grafana, ELK Stack).
Continuous Integration and Continuous Deployment (CI/CD):
- Establish and refine CI/CD pipelines to ensure smooth software releases with minimal or zero downtime.
- Collaborate with development teams to implement DevOps best practices and ensure code quality, security, and performance.
Security & Compliance:
- Implement and oversee security best practices in DevOps and operational workflows, including secrets management, vulnerability scans, and automated patching.
- Ensure compliance with relevant regulations and standards (e.g., SOC2, ISO 27001).
Collaboration & Communication:
- Work cross-functionally with product, engineering, and operations teams to ensure alignment on goals and priorities.
- Provide regular updates to stakeholders on system health, incidents, and improvement initiatives.
Cost Optimization:
- Analyze cloud and infrastructure costs, identify opportunities for savings, and implement cost optimization strategies.
- Manage budgets and vendor relationships for tools and services used by the team.
Qualifications:
Education:
- Bachelor’s degree in Computer Science, Engineering, or a related field. A Master’s degree is a plus.
Experience:
- Proven experience managing DevOps or SRE teams in fast-paced environments.
- Hands-on expertise in cloud platforms (AWS, Azure) and containerization technologies (Docker, Kubernetes).
- Deep understanding of the software development lifecycle (SDLC) and Agile practices.
- Track record of driving operational efficiency, incident resolution, and automation.
Technical Skills:
- Expertise in CI/CD tools (e.g., Jenkins, CircleCI, GitHub Actions, Azure DevOps).
- Experience operating Kubernetes platforms such as AKS, EKS, or similar.
- Experience using managed languages such as Python, Go, C#, Java, or similar.
- Experience designing tooling to simplify the operational management of SaaS/PaaS systems.
- Experience with monitoring and observability tools (e.g., Prometheus, Splunk, New Relic, Datadog, ELK Stack).
- Strong knowledge of infrastructure-as-code tools (e.g., Terraform, Bicep, CloudFormation).
- Strong understanding of cloud best practices for networking, security, and identity management in AWS and Azure.
Soft Skills:
- Excellent leadership and people management abilities.
- Strong problem-solving skills and attention to detail.
- Exceptional communication skills to collaborate across teams and with stakeholders.
- Proven ability to manage and prioritize multiple product lines and initiatives simultaneously.
We are an equal-opportunity employer that values diversity at all levels. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, national origin, age, sexual orientation, gender identity, disability, or veteran status.
