Back to jobs

Member of Technical Staff (Research Engineer)

NYC

Haize Labs gets LLM apps out of POCs and into production. We eliminate the risk and improve the reliability of LLM apps by haizing them -- i.e. rigorously, proactively, and continuously fuzz-testing them.

We are looking for Research Engineers to help develop our reliability platform, with a focus on:

  1. Data-efficient alignment of evaluation models
  2. Dynamic testing of AI applications
  3. Observability and anomaly detection 
  4. Discrete optimization (with applications in architecture search and automated prompting)

Our work is both intellectually stimulating and practically in high demand. Your work will result in net-new primitives, frameworks, and algorithms for developing robust LLM applications. You work will directly influence how LLM apps are tested, verified, and deployed everywhere. You will directly influence how the world responsibly uses LLMs.

Responsibilities

  • Develop optimization, synthetic data generation, and fuzzing methods for breaking LLM systems.
  • Implement complex automated evaluation models and systems.
  • Go from research idea to code within hours; and iterate quickly on experiments and data.
  • Work directly with customers to adapt our tools for different domains.

Qualifications

  • First-author publications in top-tier ML venues (NeurIPS, ICML, ICLR, and others).
  • Bias for action & experimentation over philosophizing (though sometimes good).
  • Not interested in printing papers for papers' sake.
  • Some production engineering experience (e.g. ML in an applied setting). No spaghetti research code!
  • Some familiarity with ideas from active learning, weak supervision, synthetic data, functional verification, reinforcement learning, reward modeling, automated evaluation. A subset of this is fine.

Annual Salary

$150,000 – $600,000 USD

Logistics

  • Location policy: In NYC.
  • US visa sponsorship: If you are exceptional, we will sponsor.
  • Compensation and Benefits: We provide generous salary, equity, and benefits

We're Not Here to Play Games.

We're not here to write GPT wrappers or get rich quick off the AI bubble. We're here to solve the hardest problem in AI: making it safe, reliable, and production-ready. 

Since our company's inception in 2024, we've amassed amazing customers like OpenAI, Anthropic, AI21, and several others. We've developed best-in-class tooling for evaluation, dynamic testing, red-teaming, observability, and continuous robustification. And we’re backed + advised by the founders of Cognition, Hugging Face, Weights and Biases, Nous, Etched, Okta, Replit and C-suite execs from Google, Stripe, Databricks, Robinhood, and more.

Our core team is exceptionally fit for this mission. We turned down Stanford PhDs, got into & rejected Y Combinator, wrote ML-guided matchmaking for 50,000+ students, built an educational nonprofit supporting 60 countries, and did some other cool things along the way. Our early hires include an MIT PhD with 21,000+ Physics/ML/Stats citations, a Datadog engineering manager who led their GenAI observability team, a Citadel quant with a huge open-source presence, and more.

We can only serve our mission with an incredibly high talent-density team. Come here to push yourself, learn fast, experience excellence, grow with each other, and pursue your life's work.

Apply for this job

*

indicates a required field

Resume/CV*

Accepted file types: pdf, doc, docx, txt, rtf

Cover Letter

Accepted file types: pdf, doc, docx, txt, rtf


Education

Select...
Select...
Select...
Select...
Select...