Staff Backend Engineer - Databases Tempo | Canada | Remote

Canada (Remote)

Grafana Labs is a remote-first, open-source powerhouse. There are more than 20M users of Grafana, the open-source visualization tool, around the globe, monitoring everything from beehives to climate change in the Alps. The instantly recognizable dashboards have been spotted everywhere from a NASA launch and Minecraft HQ to Wimbledon and the Tour de France. Grafana Labs also helps more than 3,000 companies, including Bloomberg, JPMorgan Chase, and eBay, manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack, both featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).

We’re scaling fast and staying true to what makes us different: an open-source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything we do.

You may not meet every requirement, and that’s okay. If this role excites you, we’d love you to raise your hand for what could be a truly career-defining opportunity.

This is a remote position. We are seeking candidates in the US and Canada.

The Opportunity: 

We build Tempo, the open-source distributed tracing backend behind Grafana Cloud Traces and Grafana Enterprise Traces (GET). Tempo makes it easy to search traces, generate metrics from spans, and connect tracing data with logs, metrics, and profiles across the Grafana stack.

2026 is an inflection point for Tempo. After a major architectural upgrade and the launch of TraceQL metrics, we are shifting from foundational work to product and operational excellence, and evolving Tempo from a SaaS database into a platform that powers Grafana’s next generation of observability products (App Observability, Asserts, Traces Drilldown, and AI-driven assistants).

Over the next year, you will help us:

  • Make Grafana Cloud Traces “just work” for customers by eliminating rough edges, confusing limits, and hidden failure modes.
  • Achieve operational excellence at scale as we grow from close to 50 cells today into triple digits this year, with autoscaling, parameterized rollouts, and aggressive toil reduction.
  • Evolve Tempo into a platform enabler: higher-density APIs, trace aggregation, TraceQL metrics math, and machine/LLM-friendly interfaces that downstream products and agents can build on.
  • Push performance further: lower query latency at hundreds of MB/s ingestion and performant 30-day query ranges to match competitors.
  • Prepare Tempo for an agent-driven world: larger, burstier, higher-cardinality workloads, and new categories of AI-powered workflows, such as assistant-driven triage and “why is this slow?”-style investigations.

What You’ll Be Doing: 

As a Staff Engineer on Tempo, you will set technical direction on the hardest problems in our roadmap and raise the bar across the team.

  • Lead multi-quarter technical initiatives from problem framing through rollout, e.g., trace aggregation APIs, Limitless Tempo, autoscaling cells and customer limits, or query engine improvements.
  • Own the architecture of core Tempo components: ingestion, storage, query, and metrics generation. Drive design reviews, make sharp trade-offs on performance, cost, and complexity, and document the “why” for the team.
  • Design APIs for humans and agents. Shape the next generation of Tempo’s interfaces (structured, deterministic, discoverable) so that Act 3 products, LLM-driven assistants, and external integrators can build on Tempo reliably.
  • Drive operational excellence. Own outcomes against concrete SLOs (P99 write latency, incident recurrence, TCO per ingested GB) and push the team toward Zero Ops through automation, parameterized rollouts, and actionable alerts.
  • Partner with Product and sibling teams. Work closely with PMs and with App Observability, Asserts, Drilldown, and Grafana Assistant teams to understand how Tempo gets consumed and to ship what unblocks them.
  • Mentor engineers. Raise the engineering bar through code review, design feedback, pairing on hard problems, and writing that leaves the team smarter than you found it.
  • Participate in on-call for the services you help build, and be a force multiplier in incident response and post-incident learning.
  • Contribute to open source. Tempo is OSS. You will engage the community, review external contributions, and help steer the project in the open.

We invest heavily in developer productivity. You can use modern AI coding assistants as part of your daily workflow (your choice of tools, within security guidelines), backed by a company-funded usage budget so you can iterate quickly without unnecessary friction.

We encourage pragmatic AI-assisted development: faster prototyping, test generation, refactors, documentation, and incident follow-ups, always paired with strong code review and quality standards.

You’ll also have access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.6, Gemini 3 Pro).

Example problems you could work on

These are the kinds of projects landing in 2026. Any one of them is a Staff-sized problem:

  • Trace aggregation and higher-density APIs: extend TraceQL metrics, design LLM-friendly response types, and make Tempo a first-class data source for Grafana’s AI assistant.
  • Autoscaling end to end: customer limits and Tempo cells, with hysteresis, predictive scaling for spikes, and safe scale-down.
  • Agent-scale ingestion and query: guardrails for bursty, high-cardinality, agent-generated workloads.
  • Query performance: new data formats, smarter query pipelines, targeted optimizations for common Drilldown and Traces workflows, and 30-day query ranges.
  • Rollouts and multi-cell operations: parameterized rollouts, push-button deploys, and the tooling to grow safely into triple-digit cell counts without a proportional increase in alert noise.
  • Limits and self-service: drive customer-facing configuration and observability so escalations trend toward zero.

What Makes You a Great Fit: 

  • Technical leadership. A track record of leading complex, multi-quarter initiatives that spanned design, delivery, and operations, and made the teams around you better.
  • Deep systems experience. Substantial hands-on experience building and operating distributed data systems in production: ingestion pipelines, storage engines, query execution, or similar.
  • Strong software craftsmanship. You write clean, robust, performant software that others can maintain, and you know when to optimize vs. when to ship.
  • Strong Go, or a path to it. We write Tempo in Go. Deep experience in other systems languages (Rust, C, C++) translates well.
  • Operational mindset. You’ve owned production services, carried a pager, reduced toil, and treated SLOs as a product feature, not a chore.
  • Customer focus and pragmatism. You break complex problems into short feedback loops: analyze, design, deliver an MVP, learn, iterate.
  • Leadership through writing and collaboration. You lead through design docs, reviews, and shipped code, not hierarchy. You communicate clearly in a fully remote, asynchronous environment.

Bonus Points For: 

  • Experience with tracing, OpenTelemetry, or large-scale observability systems.
  • Experience designing query languages, SQL/TraceQL-like engines, or APIs intended to be consumed programmatically (by services or agents).
  • Experience with columnar storage formats (e.g., Parquet) or purpose-built on-disk formats for analytical workloads.
  • Experience operating multi-tenant, multi-cell SaaS infrastructure at scale on Kubernetes.
  • Experience building for AI/LLM consumers: structured APIs, metadata/discovery endpoints, deterministic outputs, evaluation harnesses.
  • Open-source contribution or maintainership, and comfort engaging a community in the open.
  • Experience as an on-call user of Grafana, Prometheus, Loki, or Tempo in a previous role (or on a homelab).
  • Experience in a fully remote, globally distributed team.

How we work

We are a remote-first team that meets regularly over video and does most of our work asynchronously, in writing. We value creativity, diverse perspectives, and clear communication. Tempo is relied upon by prominent global organizations to monitor critical applications and infrastructure, and we expect everyone on the team, including our Staff engineers, to contribute ideas that make it a more reliable, more useful, and more loved product.


In Canada, the compensation range for this role is $186,368 - $223,642 CAD. Actual compensation may vary based on level, experience, and skillset as assessed throughout the interview process. All of our roles include Restricted Stock Units (RSUs), giving every team member ownership in Grafana Labs' success. We believe in shared outcomes: RSUs help us stay aligned and invested as we scale globally.

 

*Compensation ranges are country-specific. If you are applying for this role from a different location than listed above, your recruiter will discuss your specific market’s defined pay range & benefits at the beginning of the process.

Why You’ll Thrive at Grafana Labs:

  • 100% Remote, Global Culture – As a remote-only company, we bring together talent from around the world, united by a culture of collaboration and shared purpose.
  • Scaling Organization – Tackle meaningful work in a high-growth, ever-evolving environment.
  • Transparent Communication – Expect open decision-making and regular company-wide updates.
  • Innovation-Driven – Autonomy and support to ship great work and try new things.
  • Open Source Roots – Built on community-driven values that shape how we work.
  • Empowered Teams – High trust, low ego culture that values outcomes over optics.
  • Career Growth Pathways – Defined opportunities to grow and develop your career.
  • Approachable Leadership – Transparent execs who are involved, visible, and human.
  • Passionate People – Join a team of smart, supportive folks who care deeply about what they do.
  • In-Person Onboarding – We want you to thrive from day 1, learning alongside your fellow new ‘Grafanistas’ all about what we do and how we do it.
  • Balance is Key – We operate a global annual leave policy of 30 days per annum. Three days of your annual leave entitlement are reserved for Grafana Shutdown Days to allow the team to really disconnect. *We will comply with local legislation where applicable.

Equal Opportunity Employer: We will recruit, train, compensate and promote regardless of race, religion, color, national origin, gender, disability, age, veteran status, and all the other fascinating characteristics that make us different and unique. We believe that equality and diversity build a strong organization and we’re working hard to make sure that’s the foundation of our organization as we grow.

Grafana Labs may utilize AI tools in its recruitment process to assist in matching information provided in CVs to job postings. The recruitment team will continue to review inbound CVs manually to identify alignment with current openings.

For information about how your personal data is used once you’ve applied to a job, check out our privacy policy.

