
Lead Data Architect, Quality & Reliability
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.
Cerebras' current customers include global corporations across multiple industries, national labs, and top-tier healthcare systems. In January, we announced a multi-year, multi-million-dollar partnership with Mayo Clinic, underscoring our commitment to transforming AI applications across various fields. In August, we launched Cerebras Inference, the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services.
About The Role
As our Lead Data Architect for Product Quality & Reliability, you will define and implement the data architecture that underpins Cerebras’s understanding of product health, reliability performance, and warranty economics across the entire lifecycle of our AI systems.
Sitting within the Quality & Reliability (Q&R) organization, this role blends deep technical execution with high-ownership cross-functional leadership. You will partner closely with Data Engineering, Software, Inference, AI Infrastructure, Manufacturing, Test, Field Service, Customer Success, and Finance to architect a single source of truth for quality, reliability, and warranty data. You will shape how the company captures, structures, integrates, and interprets data that spans factory, lab, test, deployment, and field environments.
Your mission is to design and operationalize the data models, ontologies, pipelines, governance standards, dashboards, and analytical frameworks that enable a unified, trustworthy narrative of quality and reliability across Cerebras’s products. This includes building and refining the data infrastructure behind warranty tracking and modeling in collaboration with Finance, ensuring accuracy, defensibility, and forward-looking insight.
This role is well-suited for individuals who have built data foundations in hyper-growth, high-complexity hardware environments, and who thrive in architecting cohesive systems from diverse data sources, collaborating across teams, and strengthening an organization’s ability to reason about quality and reliability.
Key Responsibilities
- Quality & Reliability Data Architecture
  - Define and build the data models, schemas, ontologies, and canonical representations needed to unify quality, reliability, manufacturing, test, telemetry, service, and warranty data.
  - Develop a structured, non-overlapping, fully descriptive data ontology for field symptoms, defect categories, test results, reliability events, and component-level attributes.
  - Establish and maintain the single source of truth for Q&R data by defining standardized identifiers, metadata, lineage, and data governance rules across all relevant systems.
  - Architect scalable data storage and access patterns (warehouse/lake/lakehouse, curated marts) that support both high-frequency operational metrics and deep engineering analysis.
- Data Integration, Pipelines & Automation
  - Build and maintain ETL/ELT pipelines (Python, SQL, Airflow/dbt or similar) that integrate data from factory systems, test infrastructure, cloud telemetry, RMAs, field service, supplier quality systems, and financial sources.
  - Implement robust data quality frameworks, including validation, anomaly detection, and monitoring for completeness, consistency, and freshness.
  - Work with internal data platform teams to leverage existing ingestion, orchestration, and catalog infrastructure while extending it to support Q&R needs.
  - Develop reusable Python-based data tooling and automation frameworks for analysts and engineers.
- Manufacturing, Supplier & CM Quality Data Architecture
  - Architect the end-to-end data flow for manufacturing quality, including data from Contract Manufacturers (CMs), suppliers, incoming inspection, yield systems, SPC, burn-in, repair, and rework loops.
  - Unify and normalize CM and supplier datasets into Cerebras’s internal quality models, including:
    - Lot, serial, revision, and configuration metadata
    - Process parameters, station-level results, time-series logs
    - Defect coding, symptom taxonomies, and repair actions
  - Drive alignment with CMs on data formats, quality schemas, required metadata, and data delivery cadences, ensuring traceability and completeness of external data streams.
  - Partner with Quality Engineering and Supplier Quality to design real-time and batch pipelines that surface:
    - Line performance indicators
    - Supplier DPPM trends
    - Yield escape risks
    - Top defects and component-level issues
  - Ensure manufacturing and supplier datasets integrate seamlessly with reliability, test, and field data to enable true end-to-end quality lifecycle analytics.
  - Build or support dashboards that help Q&R and operations teams monitor manufacturing health, detect anomalies early, and identify systemic issues.
- Reliability Modeling & Analytics Enablement
  - Build data structures that support Weibull analysis, Crow-AMSAA, Kaplan–Meier survival curves, FRACAS, accelerated life test models, and system health scoring.
  - Partner with reliability statisticians and Q&R engineers to design datasets enabling insights into early-life failures, wear-out mechanisms, and fleet-wide reliability trends.
  - Codify standardized quality and reliability metrics:
    - Failure rates, DPPM/DPMO
    - MTBF, survival probabilities
    - 0MIS/1MIS indicators
    - Subsystem/component reliability KPIs
  - Ensure the data platform supports both routine operational reporting and deep root-cause investigations.
- Warranty Data Architecture & Finance Partnership
  - Design and maintain the warranty data architecture, supporting accruals, reserves, warranty forecasting, and cohort-based risk analysis.
  - Integrate financial, service, reliability, and RMA data to create a comprehensive warranty insights platform.
  - Ensure all warranty calculations are auditable and defensible, with strong version control and clear lineage.
  - Enable feedback loops where warranty outcomes guide engineering improvements, manufacturing processes, and cost-reduction initiatives.
- Cross-Functional Collaboration, Visualization & Storytelling
  - Build strong partnerships across Data Engineering, Manufacturing, Supplier Quality, Test, SW/Inference, Field Service, Customer Success, and Finance to align definitions, schemas, and integration points.
  - Influence upstream teams on system instrumentation, logging, test data structures, and CM/supplier data requirements.
  - Create or oversee dashboards in Grafana or custom web tools that communicate actionable insights and leading indicators.
  - Develop executive-level narratives that tie together factory, test, field, and service data into a cohesive, trusted product health story.
  - Mentor analysts and engineers on data modeling, querying, and reliability analytics, elevating Q&R data literacy company-wide.
Qualifications
Required
- 8+ years in data architecture, data engineering, analytics engineering, or related technical data roles, with ownership in multi-source data environments.
- Strong background working with complex electromechanical products (automotive, robotics, semiconductor, aerospace, advanced hardware, or similar).
- Expertise in data modeling, ontology development, dimensional modeling, and unification of heterogeneous, large-scale datasets.
- Strong Python (pandas, PySpark, tooling automation) and SQL skills, and comfort building production-grade data pipelines.
- Demonstrated ability to structure data for reliability analytics, including survival analysis, reliability growth, failure pattern analysis, and RMA-based insights.
- Experience integrating data from test systems, MES, PLM/ERP, field telemetry, customer service tools, CM/ODM manufacturing partners, and financial systems.
- Strong proficiency with visualization and analytics tools (Tableau, Grafana, Looker, custom dashboards).
- Excellent communication and influence skills, with a track record of partnering across engineering, operations, and data teams.
- Experience in startup-to-scale environments and creating data systems from scratch in fast-moving organizations.
Preferred
- Familiarity with Q&R methodologies such as Weibull, Crow-AMSAA, Kaplan–Meier, fault tree analysis, FRACAS, or reliability growth modeling.
- Evidence of designing canonical data models or ontologies in ambiguous, high-complexity domains.
- Experience supporting warranty forecasting, warranty cost modeling, or similar financial analytics.
- Ability to design lightweight custom internal tools (Python-based, web-based, or notebook-driven).
- Passion for transforming raw data into coherent, simple, and compelling product health stories.
As our Lead Data Architect for Q&R, you will build the data foundation that shapes how Cerebras understands product reliability, operational health, customer experience, and long-term product evolution.
Your work will directly affect engineering decisions, customer trust, warranty economics, and product roadmap priorities, making this one of the highest-impact data roles in the company.
The base salary range for this position is $150,000 to $250,000 annually. Actual compensation may include bonus and equity, and will be determined based on factors such as experience, skills, and qualifications.
Why Join Cerebras
People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:
- Build a breakthrough AI platform beyond the constraints of the GPU.
- Publish and open source their cutting-edge AI research.
- Work on one of the fastest AI supercomputers in the world.
- Enjoy job stability with startup vitality.
- Enjoy a simple, non-corporate work culture that respects individual beliefs.
Read our blog: Five Reasons to Join Cerebras in 2025.
Apply today and join the forefront of groundbreaking advancements in AI!
Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.