
Data Developer

Montreal

DRW is a diversified trading firm with more than three decades of experience bringing sophisticated technology and exceptional people together to operate in markets around the world. We value autonomy and the ability to pivot quickly to capture opportunities, so we trade our own capital at our own risk.

Headquartered in Chicago with offices throughout the U.S., Canada, Europe, and Asia, we trade a variety of asset classes including Fixed Income, ETFs, Equities, FX, Commodities and Energy across all major global markets. We have also leveraged our expertise and technology to expand into three non-traditional strategies: real estate, venture capital and cryptoassets.

We operate with respect, curiosity and open minds. The people who thrive here share our belief that it's not just what we do that matters; it's how we do it. DRW is a place of high expectations, integrity, innovation and a willingness to challenge consensus.

We are looking for a Data Developer to join our AI and Multi Asset Systematic Strategies team. The team builds AI- and ML-powered tools and solutions that enable teams across the firm and support AI researchers. You'll build data pipelines for RAG (retrieval-augmented generation) systems, optimize embedding workflows, and architect scalable solutions for managing analytical, relational, structured, and unstructured data.

Responsibilities:

  • Design and build data pipelines for RAG systems, including document ingestion, chunking, embedding generation, and vector storage.
  • Build ingestion pipelines for structured and unstructured data sources into a centralized data lake, ensuring data is clean, normalized, and accessible for analytics, research, and AI workloads. 
  • Develop data processing workflows to prepare and optimize datasets for fine-tuning and inference workloads. 
  • Build monitoring and evaluation frameworks to measure retrieval quality, latency, and system performance. 
  • Collaborate with ML engineers to optimize data formats and storage patterns for GPU-accelerated inference. 
  • Implement caching strategies and data versioning systems to support efficient model serving. 
  • Deploy and manage vector databases, embedding services, and data processing pipelines. 
  • Drive initiatives to improve data quality, reduce latency, and enhance the accuracy of retrieval systems. 
  • Continuously learn and stay up-to-date with emerging technologies and best practices in data engineering and AI. 
  • Proactively contribute ideas for new tools, process improvements, and technology adoption that move the team forward. 
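To make the first responsibility concrete, the ingestion-chunking-embedding-retrieval flow described above can be sketched in miniature. This is an illustrative toy, not DRW's stack: the hash-based `embed` function is a stand-in for a real embedding model, and the in-memory `VectorStore` class is a stand-in for a vector database such as Milvus or Qdrant.

```python
import hashlib
import math

def chunk(text, size=200, overlap=50):
    """Split text into overlapping fixed-size character chunks."""
    step = size - overlap
    return [text[start:start + size]
            for start in range(0, max(len(text) - overlap, 1), step)]

def embed(text, dims=64):
    """Toy deterministic embedding: hash character trigrams into a
    fixed-size, L2-normalized vector. A real pipeline would call an
    embedding model here instead."""
    vec = [0.0] * dims
    for i in range(len(text) - 2):
        h = int(hashlib.md5(text[i:i + 3].lower().encode()).hexdigest(), 16)
        vec[h % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are unit-normalized, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

class VectorStore:
    """In-memory stand-in for a vector database: stores (chunk, vector) pairs
    and answers nearest-neighbor queries by brute-force cosine ranking."""
    def __init__(self):
        self.items = []

    def add(self, text):
        # Ingest a document: chunk it, embed each chunk, store both.
        for c in chunk(text):
            self.items.append((c, embed(c)))

    def search(self, query, k=3):
        # Embed the query and return the k most similar chunks.
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [c for c, _ in ranked[:k]]
```

A usage example: `store = VectorStore(); store.add(document); store.search("some question")` returns the chunks most similar to the query, which would then be injected into an LLM prompt. Production systems differ mainly in scale: semantic chunking instead of fixed windows, learned embeddings, and approximate nearest-neighbor indexes instead of brute-force ranking.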

Requirements:

  • Bachelor's or Master's degree in Computer Science, Data Engineering, or related field. 
  • 2-5 years building data systems and pipelines in production environments. 
  • Strong experience with RAG architectures, including vector databases (Milvus, ChromaDB, Pinecone, Weaviate, or Qdrant). 
  • Proficiency in Python with experience using DAG-based orchestration platforms (Airflow, Dagster, Prefect, or similar). 
  • Hands-on experience with embedding models and semantic search systems. 
  • Experience with distributed data processing frameworks (Apache Spark, Ray, or Dask). 
  • Understanding of LLM inference optimization techniques and prompt engineering. 
  • Familiarity with Docker, containerization, and orchestration platforms. 
  • Strong grasp of data engineering best practices including data modeling, ETL/ELT patterns, and data quality.

For more information about DRW's processing activities and our use of job applicants' data, please view our Privacy Notice at https://drw.com/privacy-notice.

California residents, please review the California Privacy Notice for information about certain legal rights at https://drw.com/california-privacy-notice.

