Research Engineer (Scaling Multimodal Data)

San Francisco

About World Labs:

We build foundational world models that can perceive, generate, reason, and interact with the 3D world — unlocking AI's full potential through spatial intelligence by transforming seeing into doing, perceiving into reasoning, and imagining into creating. We believe spatial intelligence will unlock new forms of storytelling, creativity, design, simulation, and immersive experiences across both virtual and physical worlds. We bring together a world-class team, united by a shared curiosity, passion, and deep backgrounds in technology — from AI research to systems engineering to product design — creating a tight feedback loop between our cutting-edge research and products that empower our users.

About the Role:

We’re looking for a research engineer to help improve our in-house world models through better multimodal data. This role is about figuring out what data actually moves model quality, then building the datasets, pipelines, and experiments to prove it. The best generative models aren’t just a product of model architecture and compute; they are a product of the training data. The model output reflects someone’s obsession over what goes into the data, how it’s processed, and what gets thrown away. We’re looking for the person who does the obsessing and builds the tools to act on it at scale. This isn’t a role where someone hands you a dataset and asks you to clean it. You will decide what data we need, figure out where to get it, build the processing and curation systems, and close the loop with model training to make sure it actually works. You will need strong engineering skills to do this well, but engineering serves your judgement about data, not the other way around.

What You’ll Do:

  • Discover, evaluate, and acquire training data. You will find, evaluate, and integrate data from diverse sources. You will write scrapers, work with APIs, and make judgement calls about whether a source is worth pursuing before investing days of effort.
  • Build data processing and curation systems. Design and implement data processing pipelines for filtering, deduplication, quality scoring, and curation. You will create well-abstracted systems that your teammates can pick up and extend.
  • Look at the actual data constantly. You will sample outputs, spot distributional issues (e.g., too many screenshots, low-resolution crops, near-duplicates), and catch problems before they propagate to model training.
  • Close the data → model → evaluation loop. You will diagnose model failures and trace them back to data issues, then design principled fixes to nip the problem in the bud.
  • Deploy ML models for data enrichment: captioning, quality scoring, text embedding, segmentation, classification, etc. You will evaluate whether these models actually help.
  • Make systematic, documented decisions. Score thresholds, filtering criteria, mixture ratios — every processing choice should be reproducible, versioned, and auditable. You will set the standard for rigor on the team.

Questions We Think About:

  • How do you sample data for large-scale world models, where the best practices for dense-frame video models don’t apply?
  • How do you caption large-scale video datasets for world generation?
  • How do you measure the diversity of video datasets, where counting the raw number of hours or frames doesn’t account for variation in content?
  • How do we build data pipelines that are reproducible and robust?
  • How can we improve the observability of billion-scale datasets so we can catch issues early?
  • What does it mean for a dataset to have good “taste”? How do you operationalize aesthetic judgement at billion scale?
  • How do you decide whether to filter aggressively for quality versus preserve diversity and coverage? Where’s the line, and how do you find it empirically?
  • How do you strike the balance between pre-processing data and computing things on the fly? One locks you into design decisions, while the other can bottleneck training throughput.

Most of these questions don’t have clean answers. We want someone who thinks about them seriously.

What We Require:

  • Strong software engineering fundamentals. You write well-abstracted, readable code and build reusable tools with clear interfaces. You find messy, undocumented systems personally unacceptable, because you've been burned by the alternative.
  • Deep experience with image and video data at scale. You know the data formats, the processing libraries (OpenCV, PIL, FFmpeg, PyAV), and you have hard-won intuition for what goes wrong when you're processing billions of samples.
  • Experience with distributed computing. You've used frameworks like Apache Beam, Spark, Kubernetes, or Ray to process datasets that don't fit on a single machine.
  • Experience using ML models as components. You’ve built and run inference pipelines (e.g., filtering, scoring, captioning, and embedding) at billion scale, and evaluated whether they actually improved outcomes.
  • A research-oriented approach to data decisions. You design experiments to validate processing choices rather than guessing. You can articulate why a filtering step exists and show evidence that it helps.
  • Familiarity with the model training lifecycle. You understand how data composition affects model behavior, can reason about what changes to try, and can articulate why.

An overall obsession with the data-model-evaluation loop. You have a demonstrated track record of curating the best possible data to improve model performance and proving the improvement through rigorous evaluation, over and over again. You have a knack for turning this obsession into successful data and model work.

What We’d Love To See:

  • Familiarity with columnar and large-scale data storage formats and libraries (PyArrow, Lance, Vortex, DeepMind Bagz, or similar). You have strong (but loosely held) opinions about when to use what.
  • Track record of independently discovering and integrating new data sources into a training pipeline, not just processing what was handed to you.
  • Direct experience closing the data → model quality loop: you diagnosed a model issue, traced it to the data, fixed it at the source, and measured the improvement.
  • Strong visual intuition for data quality and diversity. You can scroll through samples and quickly spot systematic problems.
  • You build tools and libraries, not just scripts. When you solve a problem, you think about how to make sure no one else has to solve it again.

What This Isn’t:

To help you self-select:

  • Not infrastructure. We have a separate team for data storage, data loading, and pipeline throughput. You need to write code that works at scale, but your focus is on what data to use and how to process it, not on the bare-metal infrastructure.
  • Not pure research. You’ll read papers and run experiments, but you’ll also write production-quality pipelines that need to work reliably.
  • Not a role where you wait for instructions. We need someone who will independently identify data problems and opportunities, propose solutions, and execute.

Who You Are:

  • Fearless Innovator: We need people who thrive on challenges and aren't afraid to tackle the impossible.
  • Resilient Builder: Impacting Large World Models isn't a sprint; it's a marathon with hurdles. We're looking for builders who can weather the storms of groundbreaking research and come out stronger.
  • Mission-Driven Mindset: Everything we do is in service of creating the best spatially intelligent AI systems, and using them to empower people.
  • Collaborative Spirit: We're building something bigger than any one person. We need team players who can harness the power of collective intelligence.

 

We're hiring the brightest minds from around the globe to bring diverse perspectives to our cutting-edge work. If you're ready to work on technology that will reshape how machines perceive and interact with the world, World Labs is your launchpad.

Join us, and let's make history together.


Equal Opportunity & Pay Transparency

Equal Employment Opportunity

World Labs is an equal opportunity employer. We do not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, genetic information, veteran status, or any other characteristic protected under applicable law. We welcome all qualified applicants and are committed to providing reasonable accommodations throughout the hiring process upon request.

California Pay Transparency

In accordance with California law, we disclose the following:

Pay Range

$200k-$325k base salary (good-faith estimate for the San Francisco Bay Area upon hire; actual offer based on experience, skills, and qualifications)

Total Compensation

Base salary plus equity awards and annual performance bonus

Salary History

We do not request or consider prior compensation in making offers

 

Compliance: Cal. Lab. Code §432.3 (pay scale disclosure & salary history ban); Cal. Lab. Code §1197.5 (Equal Pay Act); Cal. Gov. Code §12940 (FEHA); 42 U.S.C. §2000e (Title VII); 29 U.S.C. §621 (ADEA); 42 U.S.C. §12101 (ADA)

Accommodations & inquiries: talent@worldlabs.ai
