Infra Team Manager (Platform)
[About the role]
We are looking for a Senior DevOps / Platform Engineer to own cloud infrastructure, reliability, security, and CI/CD for our platform products end to end. This is a high-ownership, hands-on role responsible for building and operating production-grade infrastructure that supports high availability, regulatory compliance, scalability, and cost efficiency. You will act as the platform backbone for fast-moving product and engineering teams.
[Key Responsibilities]
Infrastructure & Platform
- Design, deploy, and operate cloud-native infrastructure on AWS or Azure
- Own Kubernetes-based microservices infrastructure across environments (prod, staging, dev)
- Define infra standards, environments, networking, and release guardrails
- Manage databases, caches, queues, and supporting infra from a platform perspective
Reliability & Availability
- Define and enforce SLOs / SLAs for critical user and payment flows
- Build for high availability, auto-scaling, and fault tolerance
- Implement zero-downtime deployments and safe rollout strategies
- Own incident response, postmortems (RCA), and preventive action plans
- Plan capacity and scaling for traffic spikes and seasonal peaks
CI/CD & Release Engineering
- Build and maintain CI/CD pipelines for backend, frontend, and mobile applications
- Standardize deployment pipelines with rollback, approvals, and auditability
- Enable faster, safer releases without compromising reliability
Observability & Monitoring
- Own the observability stack across services and infrastructure
- Implement monitoring for:
  - Latency, throughput, error rates
  - Infrastructure health
  - Payment and checkout flows
- Define alerting strategies aligned with business impact, not noise
Security & Compliance
- Own platform-level security posture and compliance readiness
- Implement and manage:
  - Secrets management and key rotation
  - IAM, RBAC, and least-privilege access
  - Network security, TLS, WAFs, and audit logging
- Partner with security/compliance stakeholders during audits and reviews
- Ensure infrastructure is audit-ready (logs, access trails, change history)
Cost & Efficiency
- Monitor and optimize cloud costs across environments
- Design infra with cost-efficiency in mind without sacrificing reliability
- Provide visibility into infra usage and cost drivers
[Required Experience]
- 4+ years of experience in DevOps / SRE / Platform Engineering
- Hands-on experience running production-grade Kubernetes workloads
- Experience supporting high-availability, high-throughput systems
- Strong understanding of microservices infrastructure
- Experience working with fintech, payments, or transaction-heavy systems
- Proven ownership of uptime, reliability, and incident management
Expected Tech Exposure
- Cloud: AWS or Azure
- Containers: Docker, Kubernetes
- CI/CD: GitHub Actions, Jenkins, Fastlane
- Infrastructure as Code: Terraform / ARM / Pulumi
- Databases & Caches: PostgreSQL, Redis
- Messaging: Kafka / Event Hubs
- Security: Vaults, SSL/TLS, key rotation, network policies
- Observability: Prometheus, Grafana, Azure Monitor, Sentry
Good-to-Have
- Experience with UPI, wallets, payment gateways, or banking systems
- Exposure to multi-cloud or hybrid environments
- Familiarity with compliance standards (PCI-DSS, SOC-style controls, audit workflows)
- Experience defining SLOs, error budgets, and release safety mechanisms
You’ll Be Successful If
- You think in systems, reliability, and failure scenarios
- You proactively prevent outages, not just react to them
- You enjoy owning infrastructure end-to-end, not just tooling
- You build platforms that enable product teams to move fast, safely
- You’re comfortable being accountable for uptime, security, and scale