Senior Data Engineer

Los Angeles, CA

About Arena Club

If you’re fascinated by sports cards and memorabilia, your search ends here. Arena Club is pioneering the collectibles domain by introducing the first-ever digital card show. Spearheaded by 5x World Series Champion Derek Jeter and serial entrepreneur Brian Lee, Arena Club has developed a fully digital marketplace. This innovative platform is built on trust, transparency, and fun, featuring grading & authentication, vaulting, and digital pack openings for collectors to build and showcase their collections in a personalized online showroom from anywhere in the world. 

About The Role

We are seeking a Senior Data Engineer to help strengthen Arena Club’s strategic decision-making, enhance operational performance, and integrate data across the company to unlock deeper insights into customer behavior and market performance.

As a key individual contributor, you’ll focus on building and maintaining reliable data pipelines, improving data quality, and ensuring data is well-instrumented, accessible, and structured to support meaningful analysis. You will work with existing tools and systems, including Mixpanel and AWS-based data infrastructure, partnering closely with engineering and business teams across product, operations, and the marketplace.

This is a hands-on role for someone who enjoys owning execution, operating in evolving data environments, and incrementally improving how data is collected, processed, and used to drive better business decisions.

What You Will Do

Current State — Production Data Warehouse

Maintain and optimize inbound and outbound ETL pipelines built on AWS Glue (Python Shell & Spark ETL)
Manage Redshift cluster performance across various schemas 
Own integrations with SaaS data sources via AppFlow and direct connectors 
Operate outbound distribution pipelines to external vendors
Manage infrastructure, alerting, and migration state tracking


Future State — Medallion Architecture & Real-Time

Lead the migration from ad-hoc SQL scripts to a Bronze/Silver/Gold medallion architecture with dbt as the transformation layer
Design and implement dimensional models (i.e., fact tables and dimensions)
Build the Silver staging layer 
Architect the real-time CDC pipeline
Implement data contracts and governance at the Silver layer to insulate downstream consumers from source changes

Data Archival & Cost Optimization
 
Implement a hot/cold storage strategy via Redshift Spectrum
Build the Unified Access Layer 
Design and automate Glue jobs
Configure S3 lifecycle policies for progressive cost reduction 

Requirements

Must-Have
5+ years in data engineering with production pipeline ownership (not just analytics or BI)
Deep AWS experience: Glue (both Python Shell and Spark ETL), Redshift, S3, IAM, EventBridge, Lambda, AppFlow
Strong SQL: complex joins, window functions, MERGE/UPSERT patterns, Redshift-specific optimization (sort keys, dist keys, VACUUM/ANALYZE)
Python fluency: boto3, data processing libraries, writing production Glue scripts (not just notebooks)
Dimensional modeling: star schemas, fact/dimension design, SCD Type 1 and Type 2 implementation
dbt: hands-on experience building and maintaining staging, intermediate, and mart models with tests and documentation
Data warehouse operations: schema migration, incremental loads, backfill strategies, monitoring, and alerting
 
Nice-to-Have
Redshift Spectrum: experience with external schemas, Parquet/Hive partitioning, and unified hot/cold querying
CDC / streaming: Postgres WAL, Debezium, EventBridge, or similar change data capture pipelines
Data Mesh concepts: domain-oriented ownership, data-as-a-product thinking, federated governance
AppFlow & SaaS integrations: configuring and troubleshooting managed connectors for Stripe, Zendesk, Mixpanel, etc.
Cost optimization: right-sizing Glue jobs (Python Shell vs. Spark), Redshift concurrency scaling, S3 lifecycle policies
Vendor distribution: building outbound API sync jobs with rate limiting, SFTP transfers, webhook delivery
 
Bonus
Familiarity with marketplace or e-commerce data (orders, payments, attribution, promo codes)
Experience with Mixpanel, Customer.io, or Singular data exports and event schemas
Prior experience migrating from monolithic ETL to medallion or lakehouse architectures
Exposure to data governance tooling: data catalogs, lineage tracking, quality frameworks (e.g., Great Expectations, dbt tests)

The Arena Club Standard

Life at Arena Club isn’t for the faint of heart — and that’s by design. We’re building products and experiences the collectibles world has never seen. This is a proving ground. It demands your best every single day, because anything less means you’re falling behind.

From day one, you’re in the game. Trusted to deliver, expected to own outcomes, and driven to raise the bar higher than you thought possible. We don’t just execute — we innovate, compete, and win together. That’s how real breakthroughs happen.

If you want routine or predictability, you won’t find it here. But if you’re ambitious, relentless, and hungry to prove yourself on a team built to dominate — step into the arena. You’ll discover growth and reward here, unlike anywhere else.

The base salary range listed is a guideline. Actual compensation is determined based on skills, experience, and the impact you bring. Total compensation includes base salary, bonus, and equity.

Salary Range

$130,000 - $170,000 USD
