Software Engineer - Observability & Debugging
What we’re doing isn’t easy, but nothing worth doing ever is.
We envision a future powered by robots that work seamlessly with human teams. At Diligent Robotics, we build artificial intelligence that enables service robots to collaborate with people and adapt to dynamic, human-filled environments.
Diligent is one of the few companies in the world operating a production fleet of mobile manipulation robots in real environments. Every day, our robots work alongside hospital staff, generating the real-world data needed to advance the next generation of Physical AI. Debugging autonomy in the real world is fundamentally different from debugging in the lab, and solving that challenge requires exceptional tooling and infrastructure.
As a Software Engineer – Observability & Debugging, you will strengthen our team’s ability to understand, diagnose, and improve the performance of our robotics applications in production. You will work closely with robotics engineers and operations teams to build the tools, systems, and standards that allow us to debug, triage, and root-cause robot performance issues quickly and reliably.
Our goal is for every bug to be reproducible. You will help us get there by building the observability, replay, and debugging systems that make real-world robotics development scalable.
Responsibilities
- Build and maintain observability tooling that supports debugging and root-cause analysis of robot performance in real-world deployments
- Define and standardize triage workflows and instrumentation practices across the robotics stack
- Develop reliable mechanisms for collecting, curating, and replaying robot logs, events, and telemetry (“debug + replay” systems)
- Own critical incident tooling foundations, such as our structured logging and application replay systems, and evolve them into scalable, easy-to-use systems
- Improve and expand on-robot metrics generation: what we measure, how we measure it, and how quickly we can interpret it
- Integrate and extend visualization and introspection tools (e.g., Foxglove) for fast iteration and effective triage
- Partner with robotics platform and applications teams to add instrumentation to key subsystems (behavior, planning, localization, controls, etc.)
- Drive improvements in data management pipelines: upload flows, retention policies, indexing/search, and developer ergonomics
- Mentor others on best practices for instrumentation, debugging, reproducibility, and operational excellence
Basic Qualifications
- Undergraduate or graduate degree in Robotics, Computer Science, Electrical Engineering, or related field (or equivalent experience)
- Strong proficiency in C++ and Python
- Some robotics experience (comfortable reading autonomy logs, reasoning about robot state, and debugging cross-system behaviors)
- Experience building observability/debugging systems (structured logging, metrics, tracing, event pipelines, replay tooling, dashboards)
- Familiarity with developer workflows for diagnosing distributed or real-time systems (profiling, postmortems, regression analysis)
Nice to Have
- Foxglove (or similar robotics visualization/telemetry tooling)
- Log replay / bag replay systems (ROS bags or equivalent)
- Data pipeline experience (capture → upload → storage → indexing → retrieval)
