
Senior AI Engineer (US)
Senior AI Engineer, Ares Platform
Team: Ares AI Engineering
Reports to: Ilir Osmanaj, VP of AI Engineering
Location: Boston, MA (hybrid) or remote with working hours that overlap ET
Position summary
The Senior AI Engineer is a core builder on the team responsible for the agents and models that power Ares — Assail's autonomous offensive security platform for APIs, web applications, and mobile applications. This role works directly on Ares' named-agent architecture (Polemos, Hermes, Enyo, Momos, Dolos, Themis, Aletheia, Argus, Kratos), the model powering Ares, and the Javelin co-evolutionary self-training loop. The engineer will ship capabilities that move the platform forward across exploit chaining, multimodal vision, mobile coverage, self-improvement, and customer-facing accuracy.
Core tasks
- Agent development. Design, implement, and continuously improve the behavior and prompting of Ares' named agents, including orchestration patterns, hand-offs, planning loops, tool use, and shared memory.
- Model training and fine-tuning. Contribute to the model powering Ares across data curation, SFT, preference optimization (DPO/GRPO-style), and evaluation. Own pieces of the training pipeline from dataset construction through eval.
- Javelin loop. Extend the co-evolutionary self-training system that lets Ares learn from its own engagements and improve over time.
- Self-improvement systems (ARES-420 and successors). Build false-positive detection, tiered skill learning (suppression rules, agent directives, code-patch proposals), and the infrastructure that routes proposed changes through human approval and back into the platform.
- Evals. Design rigorous, security-specific evaluations covering OWASP Top 10 coverage, exploit chaining, finding accuracy, and agent reliability. Track performance across every model and agent change.
- Multimodal and platform expansion. Contribute to vision capabilities, mobile (iOS/Android) coverage, and BYOK support shipping in Sidewinder and beyond.
- Production reliability. Own latency, cost, observability, and failure-mode analysis for agents running in customer engagements. Partner with the platform team on Kubernetes-based deployment.
- Customer-facing accuracy. Contribute to the live accuracy gauge and other surfaces where model and agent quality is exposed to customers.
Must-have skills
- 5+ years building production ML/AI systems, with at least 2 years working directly on LLMs or LLM-powered agents.
- Deep Python expertise and strong, production-grade engineering practices (testing, code review, observability).
- Hands-on fine-tuning experience: SFT, preference optimization (DPO, GRPO, RLHF/RLAIF), data curation, and synthetic data generation.
- Strong grasp of transformer architectures and the modern training stack (PyTorch, Hugging Face, DeepSpeed or FSDP, accelerate).
- Experience designing and shipping multi-agent or tool-using LLM systems in production — not just demos.
- Rigorous eval design: building harnesses, tracking experiments, and making model/agent decisions based on data rather than vibes.
- Inference optimization experience: vLLM or TensorRT-LLM, quantization, throughput/latency tradeoffs.
- Comfort with retrieval pipelines, vector stores, and structured memory for agents.
- Kubernetes and containerized deployment fluency.
- Genuine interest in offensive security and the ability to ramp quickly on OWASP Top 10, API security, web app pentesting, and mobile pentesting concepts. Direct offensive security background is a strong plus but not required.
Nice to have
- Offensive security background: OSCP/OSWE/OSWA, CTF, bug bounty, or prior red team work.
- Research publications at NeurIPS, ICML, ICLR, USENIX Security, IEEE S&P, Black Hat, or DEFCON.
- Open source contributions to agent frameworks or LLM tooling.
- Experience with adversarial ML or red-teaming AI systems.
- Familiarity with mobile app reverse engineering or binary analysis.
