
Senior Data Engineer

Remote

The Opportunity

Monstro is hiring a Senior Data Engineer to design and build the pipelines that transform real-world financial information into clean, intelligent, and actionable data. You’ll process unstructured, semi-structured, and structured datasets—from tax codes and financial statements to user data and investment records—and turn them into the backbone of Monstro’s AI-powered platform.

As Monstro’s first dedicated Data Engineer, you’ll shape the foundation of our data platform at a company where data is the product. You’ll build ingestion systems, ETL pipelines, and vector database infrastructure that power retrieval, knowledge graphs, analytics, and applied AI. This role demands technical depth, architectural clarity, and an unwavering focus on quality, security, and scalability.

If you’re driven by ownership, precision, and the opportunity to architect systems that define the future of financial intelligence, this role is for you.

About Monstro

Monstro, headquartered in New York City, is an AI-native fintech/banktech platform transforming how people and institutions manage money.

Our mission is to democratize access to high-quality financial insight—giving every individual and institution the intelligence, tools, and automation to make better decisions across wealth, tax, legal, and investment.

For financial institutions, Monstro serves as the intelligence layer—a unified platform that unlocks real-time insights, automation, and revenue opportunities across their client base. For consumers, we deliver personalized, always-on financial guidance and the ability to take action—all within a seamless, next-generation mobile experience.

This B2B2C model drives deeper engagement, higher retention, and scalable growth for institutions, while empowering everyday users to make smarter, more confident financial decisions.

Our team brings together leaders from fintech, wealth management, and AI-driven technology companies—combining decades of experience building and scaling platforms that have transformed industries.

Responsibilities

  • Data Ingestion & Processing

    • Build and own scalable pipelines that parse, normalize, and validate unstructured, semi-structured, and structured data.

    • Design ingestion systems for documents (PDFs, images, HTML, XML), APIs, and partner feeds with monitoring, retries, and alerting (see the sketch after this list).

    • Implement automated schema inference, content validation, and lineage tracking for high reliability.

  • Data Platform Engineering

    • Stand up and manage object, relational, document, and vector data stores with appropriate indexing and partitioning.

    • Operate and optimize vector databases (e.g., pgvector, Pinecone, Weaviate) with multiple collections, embeddings, and metadata.

    • Build reusable libraries and ETL components for parsing, enrichment, and embedding generation.

  • Infrastructure & Reliability

    • Own end-to-end ETL infrastructure—from ingestion to transformation to serving.

    • Ensure high uptime, observability, and fault tolerance across all pipelines.

    • Implement data quality checks, range validations, and access controls for sensitive financial data.

  • Security & Governance

    • Handle sensitive consumer and institutional data with the highest standards for privacy, compliance, and retention.

    • Implement robust access controls, auditing, and monitoring for all data systems.

  • Collaboration & Impact

    • Partner with AI, Product, and Engineering teams to power features that depend on clean, reliable data.

    • Establish best practices, document standards, and mentor teammates as the data function scales.
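
To ground the ingestion and reliability work above, here is a minimal Python sketch of a fetch-validate-normalize step with retries and lineage metadata. It is illustrative only: fetch_with_retries, Record, and the account_id check are assumptions made for this example, not Monstro's actual code.

import hashlib
import json
import time
from dataclasses import dataclass
from typing import Any, Callable

MAX_RETRIES = 3

@dataclass
class Record:
    source_id: str           # lineage: where the document came from
    checksum: str            # content hash for dedup and auditability
    payload: dict[str, Any]  # normalized body

def fetch_with_retries(fetch: Callable[[str], bytes], source_id: str) -> bytes:
    """Call a fetcher with exponential backoff, raising after MAX_RETRIES."""
    for attempt in range(MAX_RETRIES):
        try:
            return fetch(source_id)
        except IOError:
            time.sleep(2 ** attempt)  # back off 1s, 2s, 4s before retrying
    raise RuntimeError(f"giving up on {source_id} after {MAX_RETRIES} attempts")

def normalize(raw: bytes, source_id: str) -> Record:
    """Parse JSON, enforce a required field, and attach lineage metadata."""
    payload = json.loads(raw)
    if "account_id" not in payload:  # content validation before load
        raise ValueError(f"{source_id}: missing required field account_id")
    return Record(source_id=source_id,
                  checksum=hashlib.sha256(raw).hexdigest(),
                  payload=payload)

The same shape extends naturally to PDFs or XML: swap json.loads for a document parser and keep the checksum and validation steps unchanged.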

Qualifications

  • 5–8+ years of total engineering experience, including 2+ years in a dedicated data engineering role (AI-native or data-intensive startup preferred).

  • Proven ownership of end-to-end data pipelines—ingestion, transformation, and serving.

  • Strong proficiency in Python and SQL, with hands-on ETL and document parsing experience (PDF, HTML, JSON, XML).

  • Experience with vector databases (pgvector, Pinecone, Weaviate) and knowledge of embeddings, chunking, and metadata design (see the sketch after this list).

  • Familiarity with data orchestration tools, APIs, and scheduling frameworks.

  • Deep understanding of schema design, indexing, and performance optimization.

  • Track record of building secure, compliant systems for sensitive data.

  • Clear written communication, strong collaboration skills, and an ownership mindset.
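
As a rough illustration of embeddings, chunking, and metadata design on pgvector, here is a sketch of an overlapping-window chunker plus an assumed table layout. The table name doc_chunks, its columns, and the 1536-dimension vector are assumptions for the example; the HNSW index syntax assumes pgvector 0.5+.

DDL = """
CREATE TABLE IF NOT EXISTS doc_chunks (
    id        bigserial PRIMARY KEY,
    doc_id    text NOT NULL,   -- lineage back to the source document
    chunk_idx int  NOT NULL,   -- position of the chunk within the document
    body      text NOT NULL,
    embedding vector(1536),    -- pgvector column; dimension must match the model
    metadata  jsonb            -- e.g. document type, tax year, jurisdiction
);
CREATE INDEX ON doc_chunks USING hnsw (embedding vector_cosine_ops);
"""

def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping windows so context survives chunk edges."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

Overlap is the key design choice here: it costs extra storage but keeps sentences that straddle a chunk boundary retrievable from at least one chunk.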

Nice-to-Haves

  • Experience with AWS infrastructure and containerized deployment (Docker, App Runner).

  • Prior exposure to Model Context Protocol or similar controlled database access patterns (a generic sketch follows this list).

  • Background in fintech, financial data systems, or AI-driven data enrichment.
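
For the controlled database access pattern above, one generic illustration (not the Model Context Protocol itself; every name here is hypothetical) is an allowlist of named, parameterized queries, so callers never submit raw SQL:

import sqlite3  # stand-in engine; the same shape applies to Postgres drivers

ALLOWED_QUERIES = {
    "balances_by_account": "SELECT balance FROM accounts WHERE account_id = ?",
}

def run_named_query(conn: sqlite3.Connection, name: str, params: tuple):
    """Execute only pre-approved, parameterized statements."""
    if name not in ALLOWED_QUERIES:
        raise PermissionError(f"query {name!r} is not allowlisted")
    return conn.execute(ALLOWED_QUERIES[name], params).fetchall()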

Why Monstro?

  • Ownership & Impact: Own the foundation of Monstro’s data platform and define how financial data is ingested, transformed, and served.

  • Data-First Mission: Work where data is the product—powering AI-driven insights that redefine financial decision-making.

  • High Standards: Build for scale, reliability, and security across global, regulated environments.

  • Collaborative Team: Work with experienced leaders who’ve scaled successful startups and data-driven platforms.

  • Compensation: $175K–$225K base + equity and performance upside.

Apply Today

If you’re passionate about building high-performance data systems, thrive in a fast-moving startup environment, and want to shape the data core of an AI-native fintech—we’d love to meet you.
