
Senior Data Engineer

Münster, North Rhine-Westphalia, Germany

About Atari

Atari is an interactive entertainment company and an iconic gaming industry brand recognized worldwide for its multi-platform games and licensed products. Atari owns and/or manages a portfolio of more than 400 games and franchises, including globally recognized brands such as Asteroids®, Centipede®, Missile Command®, Pong®, and RollerCoaster Tycoon®.

The Atari family of brands includes Digital Eclipse, Nightdive Studios, Infogrames, AtariAge, MobyGames, as well as Coatsink, Early Morning Studios, and Stormteller Games, spanning game development, publishing, and community experiences worldwide.

Atari operates internationally, with offices in New York and Paris as well as in Germany and India.

Overview

Build and own the data pipelines and knowledge systems that power AI automation. You will own the schema, the sources, the quality, and the completeness, working closely with the engineering team to ensure the right knowledge is always in the system and in working order.

What You’ll Do

Data Schema Design and Ownership

  • Design and own canonical data schemas that structure knowledge for AI consumption — versioned, documented, and treated as engineering artefacts.
  • Evolve schemas as AI system requirements change: assess downstream impact and coordinate with the engineering team before any rollout.
  • Keep the knowledge model legible to domain experts, AI engineers, and future team members through clear documentation of schema decisions.

Source Integration and Pipeline Engineering

  • Identify and connect to source systems — databases, repositories, APIs, workflow outputs, domain expert inputs — and build reliable, monitored ingestion pipelines for each.
  • Build and maintain production-grade data pipelines using pipeline orchestration tooling (such as Apache Airflow, Prefect, or equivalent): extraction, transformation, validation, loading, error handling, and full observability.
  • Keep pipelines current as sources change, new knowledge is produced, or schemas evolve — downstream AI systems are never fed stale or broken data.

Data Cleaning, Transformation, and Quality

  • Transform raw source material into structured, consistent, agent-consumable knowledge: normalise formats, resolve conflicts, enforce schema conformance, and eliminate ambiguity.
  • Own data quality end-to-end: define standards, build validation logic at ingestion using data quality frameworks (such as Great Expectations or equivalent), and maintain quality metrics across the full knowledge estate.
  • Trace quality issues to their root — pipeline fault, schema gap, or source problem — and fix them there, not downstream.

Data Systems Maintenance and Lifecycle Management

  • Maintain all data pipelines and knowledge systems in production: monitor health, manage dependencies, handle source system changes, and ensure nothing degrades silently over time.
  • Run periodic data cleanup: identify and remove stale, duplicate, or degraded knowledge assets that would dilute AI system quality if left in place.
  • Manage knowledge deprecation cleanly: when data becomes outdated or superseded, retire it from the system without breaking downstream pipelines or AI system behaviour.

AI Knowledge System Partnership

  • Work in a closed loop with the engineering team: diagnose knowledge gaps when AI outputs fail, fix at the pipeline or schema level, and validate through the evaluation framework.
  • Understand how RAG retrieval and context assembly work well enough to make data design decisions with AI consumption as the primary constraint.
  • Proactively surface knowledge gaps by monitoring AI system failures and escalations, and address root causes before they become production issues.
  • Extract specialist knowledge from domain experts and convert it into structured, pipeline-managed data assets the AI system can rely on.

Data Infrastructure and Governance

  • Select and manage data storage infrastructure appropriate to each knowledge type — relational databases, document stores, and vector databases (such as Pinecone, Weaviate, or pgvector) — with AI consumption as the deciding factor.
  • Implement full data lineage and audit trails: origin, transformation history, and change log for every knowledge asset.
  • Enforce data versioning and change management across the knowledge estate, with rollback capability preserved at every step.
  • Build a data estate that is legible and accessible across the organisation — laying the foundation for intelligence and analytics as the product portfolio grows.

Stakeholder Communication and Collaboration

  • Translate between domain and engineering language in both directions — extracting knowledge requirements from non-technical stakeholders and communicating data decisions and tradeoffs back clearly.
  • Work with domain experts to understand how data should be structured and what outputs need to look like — then convert that directly into schema and pipeline requirements.
  • Keep engineering, product, and domain stakeholders appropriately informed on pipeline health, data quality, and knowledge gaps — right level of detail, right audience.

Requirements & Qualifications

  • Proven track record building and owning production data pipelines end-to-end: source integration, transformation, validation, loading, observability, and ongoing maintenance.
  • Strong data schema design skills: canonical data modelling, schema evolution management, downstream impact assessment, and decision documentation.
  • Experience integrating with diverse source systems — APIs, databases, document repositories, event streams — with reliable, monitored ingestion pipelines.
  • Hands-on experience with pipeline orchestration tooling (such as Apache Airflow, Prefect, or equivalent): scheduling, dependency management, failure handling, and pipeline observability.
  • Data cleaning and transformation expertise: normalisation, conflict resolution, format standardisation, and quality validation at scale using data transformation tools (such as dbt or equivalent).
  • Experience with vector databases (such as Pinecone, Weaviate, or pgvector) and relational or document stores: schema design, indexing, and production data management.
  • Working knowledge of how RAG systems and context assembly operate — sufficient to make data design decisions with AI consumption as the primary constraint.
  • Experience partnering with AI or ML engineers: translating knowledge requirements into pipeline tasks and closing feedback loops between AI output quality and data quality.
  • Ability to interrogate and profile data across sources — querying, inspecting, and validating content using SQL and Python to identify anomalies, gaps, and quality issues before they reach AI systems.
  • Cloud platform experience (AWS, Azure, or GCP): deploying and operating data pipelines and storage systems at production reliability standards.

Preferred / Nice-to-Have

  • Experience in the gaming industry: game development pipelines, content management, or platform-specific data and metadata requirements.
  • Familiarity with game engine data formats, asset pipelines, or platform SDK data structures (Xbox GDK, PlayStation SDK, or similar).

To Apply

Please submit your resume and a brief cover letter outlining your experience and interest in the role. If available, you are also welcome to include a link to your portfolio.

EEO Statement

Atari is an equal opportunity employer, and we are committed to providing a workplace free from harassment and discrimination. We provide equal employment opportunities regardless of race, religion or lack thereof, color, national origin, gender, sexual orientation, gender identity or expression, age, marital status, medical condition, veteran status, ancestry, disability status, pregnancy, parental status, genetic information, political affiliation, or any other status protected by the laws or regulations in the locations where we operate.


