
Data Scientist Intern
Dataiku is The Universal AI Platform™, giving organizations control over their AI talent, processes, and technologies to unleash the creation of analytics, models, and agents. Providing no-, low-, and full-code capabilities, Dataiku meets teams where they are today, allowing them to begin building with AI using their existing skills and knowledge.
Internship goal
Identify and implement an industrial use case for converting an agentic system that uses a Large Language Model (LLM) into one that uses a Small Language Model (SLM), leveraging Dataiku's platform to create a real-world example for our customers.
Detailed description
Agents are increasingly being tested and integrated into critical business processes. As their use becomes more widespread, there is growing demand for improved efficiency, in terms of both performance and cost. Data security is also a primary concern: companies are looking to host their own Large Language Models (LLMs) rather than depend on third parties, so that their sensitive information remains secure.
While state-of-the-art LLMs simplify the development of agents with their strong reasoning and interpolation skills, creating reliable agents with Small Language Models (SLMs) is a more complex challenge. It often requires advanced techniques like fine-tuning or meticulous prompt optimization to achieve consistent results. However, this effort is worthwhile. Recent research has shown how to reliably convert agentic systems that use LLMs into systems that use SLMs, which is the exact application we want to develop at Dataiku.
Dataiku offers a comprehensive platform for building, evaluating, and fine-tuning agents. The main goal of this internship is to identify a practical, industrial use case where converting to an SLM-based agent makes sense. You will then implement this case, creating a tangible example that our customers can use for inspiration.
During this internship, you will:
- Get familiar with Dataiku and its Agent and LLM Mesh infrastructure.
- Research state-of-the-art techniques for converting LLM-based agentic systems into SLM-based ones.
- Experiment with these techniques on industrial use cases and evaluate their performance and efficiency.
- Collaborate with the Data Science team and the broader Solutions team to identify technical challenges and industrial context.
- Develop a solution or demo that leverages this technique on an example that resonates with the industry.
- Contribute to increasing Dataiku’s credibility as the platform of choice for customers' agentic AI use cases.
Stack
- Python