
Principal Data Engineer

The Role

Main responsibilities:

  1. As an IC and team lead, build and maintain data pipelines using Redshift, DynamoDB, Athena, Glue, Kinesis, SQS, Firehose, CDK, Step Functions, and related services (a minimal sketch follows this list). These pipelines will process terabytes of data in batch and real time, and will:
    1. Be reused by various upstream data producers to land their data in the company's analytical database or customer data center
    2. Power downstream business dashboards and machine learning models that drive batch and real-time customer decisions and improve customer experience
    3. Calculate real-time, business-critical metrics that executives and engineers use to make prioritization decisions
  2. Facilitate design sessions, drive design decisions, and lead code reviews; be comfortable challenging assumptions to improve existing solutions and ensure the team is building the best data product possible.
  3. Research, evaluate, and recommend tools and services required to support data capabilities and/or accelerate delivery.
  4. Act as a “tech lead” on both internal and cross-functional projects: work with the business team and engineering leadership to prioritize development roadmaps and plan future projects, prioritize and plan the team’s work, communicate progress to stakeholders, and unblock team members
  5. Attract, nurture, and mentor talent to develop a world-class engineering team.
  6. This is a hybrid role.
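
For illustration, here is a minimal AWS CDK (Python) sketch of the kind of pipeline described in responsibility 1: a Kinesis stream consumed by a Lambda function that lands transformed records in an S3 data-lake bucket. All names (AnalyticsPipelineStack, EventsStream, the "lambda" asset path) are hypothetical placeholders, not references to this role's actual codebase:

from aws_cdk import (
    Stack,
    aws_kinesis as kinesis,
    aws_lambda as _lambda,
    aws_lambda_event_sources as event_sources,
    aws_s3 as s3,
)
from constructs import Construct

class AnalyticsPipelineStack(Stack):
    """Hypothetical stack sketching a real-time ingestion pipeline."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Real-time stream that upstream data producers write events to
        stream = kinesis.Stream(self, "EventsStream", shard_count=2)

        # Landing bucket for the analytical data lake
        landing = s3.Bucket(self, "LandingBucket")

        # Lambda that transforms stream records and lands them in S3;
        # the "lambda" asset directory and handler name are placeholders
        transform = _lambda.Function(
            self, "TransformFn",
            runtime=_lambda.Runtime.PYTHON_3_11,
            handler="handler.main",
            code=_lambda.Code.from_asset("lambda"),
            environment={"LANDING_BUCKET": landing.bucket_name},
        )
        transform.add_event_source(
            event_sources.KinesisEventSource(
                stream,
                starting_position=_lambda.StartingPosition.TRIM_HORIZON,
                batch_size=100,
            )
        )
        landing.grant_write(transform)

In practice, pieces of this sketch would be swapped out depending on the workload: Firehose for buffered delivery to S3 or Redshift, Glue for batch ETL, or Step Functions to orchestrate multi-stage jobs.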
What we are looking for:
  • Experience:
    • 8+ years of experience in software engineering with a strong focus on data
    • 5+ years of experience building and managing high-performance engineering teams using Agile frameworks
    • 5+ years of experience building batch and real-time data pipelines that extract, transform, and load data into analytical data warehouses or data lakes
    • Creative, resourceful, and enthusiastic about seeking new solutions to problems and opportunities

  • Skills and attitudes:
    • Expert in SQL, Git, and a programming language (e.g., Python, Java)
    • Strong proficiency with Python; Node.js is a plus
    • Familiarity with AWS serverless technologies (S3, Lambda, Redshift, DynamoDB, Athena, Glue, EMR, Kinesis, SQS, Firehose, Step Functions) and infrastructure-as-code tools (CDK, CloudFormation, Terraform, or Serverless Framework)
    • Experience with CI/CD (continuous integration/continuous delivery), automated testing, and automated delivery

  • Bonus Points:
    • Experience building a customer data center or large-scale real-time/batch customer-feature data pipelines and microservices
    • Experience developing real-time data pipelines for fintech companies
    • Experience with dbt and Airflow
