
Senior Data Engineer (Databricks/Spark/PySpark)
We are looking for a Senior Data Engineer with strong experience in building and optimizing data pipelines using Databricks, Apache Spark, and PySpark. The ideal candidate is passionate about data architecture, performance optimization, and working with high-scale distributed data systems.
You will play a key role in designing and developing scalable data ingestion, transformation, and processing pipelines, enabling reliable and timely data for downstream analytics, reporting, and machine learning.
Why Join Exadel
We’re an AI-first global tech company with 25+ years of engineering leadership, 2,000+ team members, and 500+ active projects powering Fortune 500 clients, including HBO, Microsoft, Google, and Starbucks.
From AI platforms to digital transformation, we partner with enterprise leaders to build what’s next.
What powers it all? Our people: ambitious, collaborative, and constantly evolving.
About the Client
The client is the world's largest human resources consulting firm, headquartered in New York City with main offices in 40+ countries. Its 20,500+ employees operate internationally in more than 130 countries, and its services are used by 97% of Fortune 500 companies.
What You’ll Do
- Design, develop, and maintain scalable and efficient data pipelines using Databricks, Apache Spark, and PySpark
- Collaborate with data scientists, analysts, and product teams to understand data requirements and ensure reliable data delivery
- Implement ETL/ELT workflows to extract, cleanse, transform, and load data from various structured and unstructured sources
- Optimize Spark jobs and workflows for performance, scalability, and cost-efficiency
- Develop reusable components, frameworks, and libraries to accelerate pipeline development
- Monitor data quality and pipeline health; implement data validation and error-handling mechanisms
- Ensure compliance with security, privacy, and governance policies
- Contribute to best practices in data engineering and cloud-native data architecture
What You Bring
- 3–6+ years of experience in data engineering or software engineering with a focus on large-scale data processing
- Strong hands-on experience with Apache Spark and PySpark
- Proficiency with the Databricks platform (including notebooks, jobs, clusters, and workspace management)
- Solid knowledge of data formats (Parquet, Avro, JSON, etc.) and data modeling concepts
- Experience building and orchestrating ETL/ELT pipelines (e.g., Airflow, Databricks Workflows, or Azure Data Factory)
- Familiarity with cloud platforms (Azure, AWS, or GCP) and their data services
- Strong programming skills in Python; SQL expertise is a must
- Understanding of CI/CD practices and version control (Git)
- Ability to work in Agile development environments and collaborate with cross-functional teams
Nice to have
- Experience with Delta Lake or other transactional data lake technologies
- Familiarity with data lakehouse architecture
- Exposure to data warehousing tools and MPP databases (Snowflake, Redshift, BigQuery, etc.)
- Knowledge of data governance, lineage, and cataloging tools (e.g., Unity Catalog, DataHub, Collibra)
- Experience with streaming data (Kafka, Spark Structured Streaming)
English level
Upper-Intermediate
Legal & Hiring Information
- Exadel is proud to be an Equal Opportunity Employer committed to inclusion across minority status, gender identity, sexual orientation, disability, age, and more
- Reasonable accommodations are available to enable individuals with disabilities to perform essential functions
- Please note: this job description is not exhaustive. Duties and responsibilities may evolve based on business needs
Apply for this job