
Senior Data Engineer
What is Cobre, and what do we do?
Cobre is Latin America’s leading instant B2B payments platform. We solve the region’s most complex money movement challenges by building advanced financial infrastructure that enables companies to move money faster, safer, and more efficiently.
We enable instant business payments—local or international, direct or via API—all from a single platform.
Built for fintechs, PSPs, banks, and finance teams that demand speed, control, and efficiency. From real-time payments to automated treasury, we turn complex financial processes into simple experiences.
Cobre is the first platform in Colombia to enable companies to pay both banked and unbanked beneficiaries within the same payment cycle and through a single interface.
We are building the enterprise payments infrastructure of Latin America!
What we are looking for:
We are looking for a Senior Data Engineer to design, build, and evolve our modern, event-driven, near real-time Lakehouse platform on AWS + Snowflake.
This is a data platform role, focused on high-volume transactional data, near real-time processing, and strong architectural foundations.
Our data ecosystem is built around:
- Event-driven ingestion using Confluent Cloud (Kafka)
- CDC ingestion from databases via AWS DMS
- Custom connectors for specific internal and third-party sources
- Near real-time and batch processing
- AWS S3 + S3 Tables (Apache Iceberg) as the core Lakehouse storage
- Medallion Architecture (Bronze / Silver / Gold)
- Glue & dbt as the transformation layer
- Snowflake as the access, analytics, and data governance layer
You will play a key role in defining how data flows from source to Lakehouse to analytics, ensuring reliability, scalability, observability, and governance.
What you will be doing:
Event-Driven & Near Real-Time Data Ingestion
- Design and maintain event-driven ingestion pipelines using Confluent + AWS.
- Ingest CDC streams from transactional databases using AWS DMS.
- Build and maintain custom ingestion connectors for internal and external sources.
- Ensure data consistency, ordering, idempotency, and replayability in near real-time pipelines (see the sketch below).
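As a rough illustration of the ordering and idempotency concerns above, here is a minimal consumer sketch using the confluent-kafka Python client. The bootstrap endpoint, topic name, consumer group, and dedup store are illustrative assumptions, not our actual configuration.

```python
# A minimal sketch of an idempotent, replay-safe consumer on Confluent Cloud.
# The bootstrap endpoint, topic, consumer group, and dedup store are illustrative assumptions.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "<confluent-bootstrap-endpoint>",
    "group.id": "payments-bronze-ingestion",   # hypothetical consumer group
    "enable.auto.commit": False,               # commit only after a successful write
    "auto.offset.reset": "earliest",           # allows full replay of the topic
})
consumer.subscribe(["payments.events"])        # hypothetical topic

processed_keys = set()  # stand-in for a durable dedup store (e.g. a keyed table or cache)

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        key = msg.key()
        if key in processed_keys:
            continue                           # idempotency: skip already-processed events on replay
        # write_to_bronze(msg.value())         # hypothetical helper that lands the raw event in Bronze
        processed_keys.add(key)
        consumer.commit(message=msg)           # at-least-once delivery + dedup => effectively-once
finally:
    consumer.close()
```

Committing offsets only after the write succeeds is what keeps a pipeline like this replayable without duplicating records downstream.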
Lakehouse & Storage Layer
- Own the Lakehouse architecture based on Apache Iceberg on S3 Tables.
- Design Iceberg tables optimized for streaming and batch workloads (see the sketch after this list).
- Manage schema evolution, partitioning strategies, compaction, and time travel.
- Implement and evolve Bronze, Silver, and Gold layers aligned with event-driven ingestion patterns.
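To make the table-design responsibilities concrete, here is a sketch of a Bronze-layer Iceberg table defined with pyiceberg. The catalog name, table identifier, columns, and partition transform are assumptions for illustration only.

```python
# Sketch of a Bronze-layer Iceberg table definition using pyiceberg.
# Catalog, identifier, columns, and partitioning are illustrative assumptions.
from pyiceberg.catalog import load_catalog
from pyiceberg.schema import Schema
from pyiceberg.types import NestedField, StringType, TimestamptzType, DecimalType
from pyiceberg.partitioning import PartitionSpec, PartitionField
from pyiceberg.transforms import DayTransform

catalog = load_catalog("lakehouse")  # assumes a Glue / S3 Tables catalog configured elsewhere

schema = Schema(
    NestedField(1, "payment_id", StringType(), required=True),
    NestedField(2, "amount", DecimalType(18, 2), required=True),
    NestedField(3, "event_ts", TimestamptzType(), required=True),
)

# Hidden day-partitioning on event time: friendly to streaming appends,
# still selective enough for batch backfills and time-travel reads.
spec = PartitionSpec(
    PartitionField(source_id=3, field_id=1000, transform=DayTransform(), name="event_day"),
)

catalog.create_table(
    identifier="bronze.payments_events",
    schema=schema,
    partition_spec=spec,
)
```

Because Iceberg declares partitioning on the table rather than baking it into file paths, the partition strategy can evolve later without rewriting readers.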
Transformations & Processing
- Build near real-time and batch ELT pipelines using AWS Glue, Python, and dbt.
- Implement incremental models optimized for streaming-derived datasets (see the sketch below).
- Ensure transformations are modular, testable, and production-ready.
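One way to picture an incremental, streaming-friendly transformation is an Iceberg MERGE run as a Glue/PySpark job; a dbt incremental model with a merge strategy expresses the same idea. Table names, columns, and the one-day lookback window below are assumptions.

```python
# Sketch of an incremental Bronze -> Silver upsert on Iceberg tables via PySpark (e.g. on Glue).
# Table names, columns, and the lookback window are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("silver_payments_incremental").getOrCreate()

# Keep only the latest event per payment_id from the recent Bronze slice, then merge it
# into Silver so that replays and late-arriving events remain idempotent downstream.
spark.sql("""
    MERGE INTO silver.payments AS t
    USING (
        SELECT payment_id, amount, event_ts
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (PARTITION BY payment_id ORDER BY event_ts DESC) AS rn
            FROM bronze.payments_events
            WHERE event_ts >= current_timestamp() - INTERVAL 1 DAY
        ) ranked
        WHERE rn = 1
    ) AS s
    ON t.payment_id = s.payment_id
    WHEN MATCHED AND s.event_ts > t.event_ts THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```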
Observability, Quality & Reliability
- Implement data observability and monitoring across ingestion and transformation layers.
- Track freshness, volume, schema changes, and data quality metrics (see the sketch below).
- Define alerting and monitoring strategies to proactively detect pipeline and data issues.
- Implement data quality checks, contracts, and validation rules.
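As an example of the kind of check implied here, the sketch below probes freshness and hourly volume for a curated Snowflake table. Connection details, the table, and the thresholds are assumptions; in practice this logic typically lives in dbt tests or a dedicated observability tool.

```python
# Sketch of a freshness / volume check against a curated Snowflake table.
# Account, credentials, table name, and thresholds are illustrative assumptions.
import snowflake.connector

FRESHNESS_SLA_MINUTES = 15          # assumed SLA for a near real-time dataset
MIN_EXPECTED_ROWS_PER_HOUR = 1000   # assumed volume floor

conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",
    warehouse="ANALYTICS_WH", database="GOLD", schema="PAYMENTS",
)
cur = conn.cursor()
try:
    cur.execute("""
        SELECT
            DATEDIFF(minute, MAX(event_ts), CURRENT_TIMESTAMP()) AS lag_minutes,
            COUNT_IF(event_ts >= DATEADD(hour, -1, CURRENT_TIMESTAMP())) AS rows_last_hour
        FROM payments
    """)
    lag_minutes, rows_last_hour = cur.fetchone()
finally:
    cur.close()
    conn.close()

if lag_minutes > FRESHNESS_SLA_MINUTES:
    raise RuntimeError(f"Freshness breach: {lag_minutes} min behind an SLA of {FRESHNESS_SLA_MINUTES} min")
if rows_last_hour < MIN_EXPECTED_ROWS_PER_HOUR:
    raise RuntimeError(f"Volume anomaly: only {rows_last_hour} rows ingested in the last hour")
```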
Analytics Enablement & Governance
- Expose curated datasets through Snowflake as a governed access layer.
- Collaborate on access control, data retention, and lineage strategies.
- Ensure consistency between Lakehouse storage and analytics consumption.
Platform Ownership
- Contribute to architectural decisions across ingestion, storage, and consumption.
- Balance latency, cost, scalability, and reliability trade-offs.
- Define standards, templates, and best practices for the data platform.
What you need:
Must have
- 3–6+ years of experience as a Data Engineer (flexible depending on depth of experience).
- Strong experience with event-driven data architectures.
- Hands-on experience with Kafka / Confluent in production environments.
- Experience ingesting data via CDC (AWS DMS or similar tools).
- Solid experience designing near real-time data pipelines.
- Strong knowledge of Apache Iceberg.
- Experience with AWS (S3, Glue, Lambda, EventBridge, Firehose, etc.).
- Advanced SQL and Python.
- Production experience using dbt.
- Experience using Snowflake as an analytics and governance layer.
Nice to have
- Experience with S3 Tables specifically.
- Infrastructure as Code (Terraform / Terragrunt).
- Experience with high-volume transactional or fintech systems.
- Familiarity with data contracts, schema registries, and data observability tools.
Who will thrive in this role
- Engineers who enjoy event-driven systems and near real-time processing.
- People who think in platforms, not one-off pipelines.
- Engineers comfortable owning architectural decisions.
- People who enjoy working close to core business events and transactional data.
What this role makes explicit
- Kafka / Confluent is the primary ingestion layer.
- CDC + events are first-class citizens.
- Near real-time is core, not an afterthought.
- Iceberg-first Lakehouse; Snowflake is the access and governance layer, not the lake.
- Clear Mid / Senior expectations.
- Observability and reliability are part of the job, not “nice to have”.
Not a fit if
- You focus on BI/dashboards or SQL-only analytics.
- You’ve only worked with batch ETL.
- You’re looking for a junior or execution-only role.
- You’re not interested in event-driven, near real-time systems.
