Lead Software Engineer

Thiruvananthapuram Office, AEDGE AICC India Pvt Ltd

 

About the Company

Armada is an edge computing startup that provides computing infrastructure to remote areas where connectivity and cloud infrastructure are limited, as well as areas where data needs to be processed locally for real-time analytics and AI at the edge. We’re looking to bring on the most brilliant minds to help further our mission of bridging the digital divide with advanced technology infrastructure that can be rapidly deployed anywhere.

 

About the Role

We are seeking a highly experienced Lead Software Engineer / Lead AI Platform Engineer to architect and lead the development of our GPU-as-a-Service (GPUaaS) platform. In this role, you will define the core abstractions that transform complex GPU fabrics, storage systems, and networking into a seamless, self-service experience for researchers and engineers.

You will operate at the intersection of distributed systems, Kubernetes internals, and GPU infrastructure, setting the technical direction of the platform while mentoring engineers and driving cross-functional collaboration. This role is ideal for leaders who enjoy hands-on architecture, deep technical ownership, and building infrastructure at massive scale.


What You’ll Do (Key Responsibilities)

Architectural Strategy & Platform Design

  • Lead the design of a globally scalable AI control plane for GPU, storage, and network orchestration.

  • Define architectural patterns for custom Kubernetes operators managing complex AI training and inference workloads.

  • Own the long-term scalability, extensibility, and evolution of the GPUaaS platform.

Systemic Multi-Tenancy & Security

  • Architect hard isolation strategies across kernel, hypervisor, and hardware layers (IOMMU, SR-IOV, device isolation).

  • Design secure multi-tenant execution models aligned with zero-trust networking principles.

  • Ensure strong isolation without compromising performance in a shared environment.

Storage & Networking Strategy

  • Drive integration strategies for VAST, Weka, and DDN storage platforms.

  • Collaborate with hardware and networking vendors to optimize RDMA, GPUDirect, and RoCE v2 traffic patterns.

  • Design and evolve VXLAN and BGP-EVPN–based networking architectures.

Feature Development

  • Design, develop, and maintain custom Kubernetes operators for GPU, storage, and infrastructure automation.

  • Implement CRDs, reconciliation logic, and lifecycle management for AI workloads.

  • Guide implementation patterns while remaining hands-on with critical platform components.

Reliability, Performance & Scale

  • Define platform SLOs, capacity planning models, and GPU availability targets.

  • Establish benchmarking standards including MLPerf and custom training/inference stress tests.

  • Lead post-incident reviews, root-cause analysis, and performance optimization initiatives.

Technical Leadership & Mentorship

  • Set engineering standards through design reviews, architecture documentation, and technical RFCs.

  • Mentor and grow L3/L4 engineers into strong platform owners.

  • Influence and collaborate across infrastructure, security, and product teams.


Required Qualifications

  • 10–15 years of experience in software, platform, or infrastructure engineering roles.

  • Demonstrated expertise designing and operating production-grade Kubernetes operators using Go (Kubebuilder / Operator SDK).

  • Deep understanding of Kubernetes internals, including etcd performance, API machinery, CRDs, controllers, and scheduling.

  • Proven experience building secure, multi-tenant platforms with strong isolation and zero-trust networking.

  • Strong hands-on knowledge of high-performance storage and networking, including POSIX semantics, CSI drivers, and InfiniBand / RoCE v2.

  • Experience designing infrastructure automation workflows using tools such as Ansible, Terraform, or equivalent.

  • Hands-on experience with observability and monitoring tools such as Prometheus, OpenTelemetry (OTEL), Grafana, Splunk, or similar.

  • Strong proficiency in Go and Python.

  • Excellent leadership, communication, and cross-functional collaboration skills.


Preferred / Nice-to-Have Qualifications

  • Experience with AI serving frameworks such as vLLM, Ray Serve, Triton Inference Server, or similar.

  • Familiarity with virtualization and lower-layer systems including VMware vSphere, OpenStack, KVM, or bare-metal provisioning.

  • Experience with GPU infrastructure, including NVIDIA DGX/HGX systems, GPU Operator, DCGM, Nsight, or performance profiling tools.

  • Exposure to distributed training systems such as PyTorch DDP, DeepSpeed, or large-scale training frameworks.


Compensation & Benefits

For India-based candidates, we offer a competitive base salary along with equity options, providing an opportunity to share in the success and growth of Armada.

 

 

You're a Great Fit if You're

  • A go-getter with a growth mindset. You're intellectually curious, have strong business acumen, and actively seek opportunities to build relevant skills and knowledge.
  • A detail-oriented problem-solver. You can independently gather information, solve problems efficiently, and deliver results with a "get-it-done" attitude.
  • Someone who thrives in a fast-paced environment. You're energized by an entrepreneurial spirit, capable of working quickly, and excited to contribute to a growing company.
  • A collaborative team player. You focus on business success and are motivated by team accomplishment rather than a personal agenda.
  • Highly organized and results-driven. Strong prioritization skills and a dedicated work ethic are essential for you.

 

Equal Opportunity Statement

At Armada, we are committed to fostering a work environment where everyone is given equal opportunities to thrive. As an equal opportunity employer, we strictly prohibit discrimination or harassment based on race, color, gender, religion, sexual orientation, national origin, disability, genetic information, pregnancy, or any other characteristic protected by law. This policy applies to all employment decisions, including hiring, promotions, and compensation. Our hiring is guided by qualifications, merit, and the business needs at the time.

 

Unsolicited Resumes and Candidates

Armada does not accept unsolicited resumes or candidate submissions from external agencies or recruiters. All candidates must apply directly through our careers page. Any resumes submitted by agencies without a prior signed agreement will be considered unsolicited, and Armada will not be obligated to pay any fees.

 
