AI/ML Engineer
Xebia is a global AI-first, digital transformation, and engineering partner. With over 25 years of experience and a team of 5,000 professionals across 16 countries, we help organizations design and build scalable products, platforms, and data-driven solutions.
We specialize in Artificial Intelligence, Data and Cloud, Intelligent Automation, and Digital Products, combining deep technical expertise with a strong focus on engineering excellence and a people-first culture.
In the CEE region, we’re a team of nearly 1,000 experts delivering modern applications, data platforms, and AI solutions for clients such as McLaren, Aviva, Deloitte, Spotify, Disney, ING, UPS, Tesco, Truecaller, AllSaints, Volotea, Schmitz Cargobull, Allegro, InPost, and many more. We work with leading technologies including AWS, Azure, GCP, Databricks, and Snowflake, and combine a strong engineering culture with a consulting mindset and a continuous focus on growth and knowledge sharing.
You will be:
- designing and developing AI agents for operations: Incident Triage, Dependency Analysis, Runbook Interpreter, Change Risk,
- building and deploying the AI KT Agent that ingests codebase snapshots, IaC, and 12 months of incident history,
- implementing the Agent Maturity Gate framework: OBSERVE, RECOMMEND, CONTROLLED EXECUTE, LIMITED AUTONOMY,
- developing ML models for alert correlation, anomaly detection, and root-cause prediction,
- training models on operational data within the client's own repositories (full IP portability),
- building confidence scoring systems so that agents advance through maturity gates only when validated by data,
- deploying agents using the Xebia ACE (Anthropic Claude Enterprise) platform across 4 regional instances,
- maintaining full audit trails for all agent decisions and recommendations,
- collaborating with SRE and Automation teams to integrate agents into operational workflows,
- measuring and reporting agent effectiveness: accuracy, false positive rates, human override frequency.
Your profile:
- 3-6 years of experience in ML engineering, AIOps, or applied AI,
- strong Python skills with ML frameworks (scikit-learn, TensorFlow, PyTorch),
- experience with NLP and/or LLM integration (RAG architectures, prompt engineering, fine-tuning),
- understanding of MLOps: model deployment, monitoring, versioning, CI/CD for ML,
- experience with cloud ML services (AWS SageMaker, Azure ML, GCP Vertex AI),
- knowledge of observability data structures: metrics, logs, traces, events,
- familiarity with time-series analysis and anomaly detection techniques,
- strong English communication skills for cross-functional collaboration.
Working from within the European Union and holding a valid work permit are required.
Nice to have:
- experience with LLM-based agent architectures (LangChain, AutoGen, or similar),
- familiarity with Anthropic Claude or OpenAI APIs,
- background in IT operations or SRE (understanding of incident workflows),
- experience with graph-based dependency mapping,
- knowledge of change risk prediction and automated testing frameworks.
Recruitment Process:
CV review – HR call – Interview – Client Interview – Decision