Manager, Operations

Memphis, TN

ABOUT xAI

xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

ABOUT THE ROLE:

We are seeking an exceptional Manager, Operations to lead facilities operations and power generation for xAI’s hyperscale AI compute facilities. This role will own the day-to-day and long-term performance of mission-critical data center operations, including power generation, power distribution, cooling, mechanical, electrical, and environmental systems, while also directing the fiber teams responsible for high-capacity networking and connectivity that support our supercomputing clusters.

You will build and lead high-performing operations, power generation, and fiber teams, drive relentless reliability and efficiency, and ensure seamless 24/7 uptime for the infrastructure powering xAI’s AI training at unprecedented scale. This high-impact position requires deep expertise in data center or hyperscale operations (including power generation), strong leadership in fast-paced environments, and the ability to deliver world-class performance under aggressive growth timelines. This is a full-time, primarily on-site role with significant travel to sites and vendor locations.

RESPONSIBILITIES:

  • Lead and scale the facilities operations and power generation teams responsible for the reliable operation, maintenance, monitoring, and optimization of critical infrastructure including on-site power generation assets, electrical systems, mechanical/HVAC, liquid cooling, power distribution, UPS, generators, and building management systems.
  • Direct the fiber teams overseeing the design, deployment, maintenance, and expansion of high-speed fiber optic networks, dark fiber, and connectivity infrastructure supporting AI compute clusters and data center interconnects.
  • Own key performance metrics such as uptime (targeting 99.999%+), mean time to detect/repair (MTTD/MTTR), power usage effectiveness (PUE), water usage effectiveness (WUE), power generation efficiency, and overall infrastructure availability.
  • Develop and enforce standard operating procedures (SOPs), preventive maintenance programs, incident response protocols, and continuous improvement processes for both facilities and power generation assets to minimize downtime and maximize efficiency.
  • Build, mentor, and grow multidisciplinary teams of operations technicians, power generation engineers, and controls specialists while fostering a culture of ownership, safety, and excellence.
  • Partner closely with engineering, construction, procurement, and AI hardware teams to support new facility builds, expansions, commissioning, power integration, and smooth handovers from project to operations.
  • Manage operational budgets, vendor relationships (maintenance contractors, fiber providers, power generation OEMs, fuel suppliers), spare parts inventory, and risk mitigation strategies in a high-velocity environment.
  • Drive innovation in operational practices, automation, predictive maintenance, power generation optimization, and sustainability initiatives to support the extreme power and cooling demands of next-generation AI systems.
  • Provide regular performance reporting, root cause analyses, lessons learned, and strategic recommendations to senior leadership.

BASIC QUALIFICATIONS:

  • 5+ years of progressive experience in data center facilities operations, power generation operations, hyperscale infrastructure management, or mission-critical industrial operations, with at least 2 years in a management or supervisory role.
  • Proven track record leading large-scale operations teams supporting high-density compute environments with significant on-site or dedicated power generation (AI, HPC, or hyperscaler data centers strongly preferred).
  • Strong experience managing fiber optic networks, dark fiber deployments, or high-bandwidth connectivity infrastructure in large-scale technical environments.
  • Deep knowledge of power generation systems (gas turbines, reciprocating engines, cogeneration, etc.), MEP (mechanical, electrical, plumbing) systems, BMS/SCADA, liquid cooling, power redundancy topologies, and 24/7 operations best practices.
  • Demonstrated success delivering high reliability, rapid incident resolution, and operational excellence under aggressive scaling timelines.
  • Hands-on leadership style with the ability to roll up your sleeves while effectively managing teams, budgets, and cross-functional stakeholders.
  • Proficiency with operations tools, CMMS (computerized maintenance management systems), monitoring platforms, and data-driven decision making.

PREFERRED SKILLS AND EXPERIENCE:

  • Direct background in AI or hyperscale data center operations, including liquid cooling systems, high-power GPU/accelerator environments, and on-site power generation.
  • Experience building or scaling fiber infrastructure for low-latency, high-bandwidth interconnects between compute clusters or sites.
  • Familiarity with Uptime Institute Tier standards, ASHRAE guidelines, power generation standards (e.g., IEEE, NFPA), OSHA/EPA compliance, and sustainability practices in critical facilities.
  • Bachelor’s or Master’s degree in Electrical Engineering, Mechanical Engineering, Power Systems, Facilities Management, or a related field; relevant certifications (CDCP, CDCS, or equivalent) a plus.
  • Track record of implementing automation, predictive analytics, or process improvements that significantly enhanced operational performance and power reliability.

ADDITIONAL REQUIREMENTS:

  • Willingness to be primarily on-site at key facilities (e.g., Memphis region) with on-call responsibilities and travel to other sites as needed.
  • Ability to work in industrial/data center environments and lead teams during high-pressure phases.

xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.
