
Sr. DevOps Engineer I (Remote Eligible in Bulgaria)
For over 20 years, Smartsheet has helped people and teams achieve, well, anything. From seamless work management to smart, scalable solutions, we’ve always worked with flow. We’re building tools that empower teams to automate the manual, uncover insights, and scale smarter. But more than that, we’re creating space: space to think big, take action, and unlock the kind of work that truly matters. Because when challenge meets purpose, and passion turns into progress, that’s magic at work, and it’s what we show up for every day.
We are seeking a Senior DevOps Engineer to join the team that owns and operates Smartsheet's edge proxy platform and internal engineering tooling. The edge proxy is a custom-built, high-performance reverse proxy that serves as the entry point for all traffic across Smartsheet's commercial and FedRAMP-authorised US Government environments. The team owns the full lifecycle of this platform: from maintaining the proxy binary and its configuration tooling, to provisioning and operating the multi-region AWS EKS infrastructure it runs on, to managing the observability and on-call story for every service that routes through it. Beyond the edge layer, the team also drives the evaluation and adoption of internal developer tooling — from developer portals to productivity platforms — that improves the engineering experience across the company.
You will work remotely from Bulgaria and report to the Engineering Manager.
You Will:
- Own and evolve the edge proxy platform by maintaining, upgrading, and extending a high-performance reverse proxy — including maintaining the proxy binary and its configuration tooling, writing Go and Python automation, managing the full container image lifecycle on hardened Linux base images, and working across the broader edge layer, including CDN, WAF, and traffic management capabilities.
- Build and maintain cloud infrastructure as code by designing and implementing Terraform/Terragrunt modules and live environment configurations managing EKS clusters, load balancers, IAM roles, VPC networking, ECR registries, and supporting AWS services across multiple regions including GovCloud.
- Operate Kubernetes clusters at scale by managing multi-region, multi-cluster EKS deployments via FluxCD GitOps workflows and Helm charts, including node AMI rotation, add-on lifecycle management, and horizontal pod autoscaling.
- Build and own CI/CD pipelines by designing, maintaining, and improving shared GitLab CI/CD pipeline templates used across all team repositories, and by building and operating alternative pipeline workflows for isolated government cloud environments.
- Automate operational toil by building and maintaining tooling for tasks such as container image patching, EKS AMI rotation, air-gapped ECR image sync to GovCloud, and automated merge request creation for monthly version-bump patching cycles.
- Manage observability and on-call responsibilities by provisioning and maintaining Datadog SLOs, monitors, and dashboards via Terraform, and by participating in the team's on-call rotation responding to edge proxy incidents across production and GovCloud environments.
- Support FedRAMP/GovCloud operations by operating the GovCloud environment with its unique constraints — air-gapped image distribution, infrastructure automation in isolated networks, and alert management with compliance-aware data handling.
- Evaluate and adopt internal developer tooling by researching, prototyping, and driving the adoption of internal tools that improve engineering productivity across the company — including developer portals, platform self-service capabilities, and other tooling that raises the bar for the developer experience at Smartsheet.
- Mentor and collaborate with peers across the team through code reviews, architecture discussions, and runbook authorship, fostering a culture of engineering excellence and operational rigour.
- Strategically apply AI tools within the team's domain to improve project execution, infrastructure design, quality, and debugging, and lead adoption of AI best practices across the team.
- Apply sustained focus and independent judgement when troubleshooting complex, multi-system incidents, managing cognitive load across concurrent workstreams in a high-availability environment.
- Maintain resilience and composure when navigating ambiguous or high-pressure operational situations, including production incidents and compliance-sensitive environments, demonstrating the psychosocial steadiness required of an on-call engineer.
- Perform standard sedentary work involving extended screen use in a remote environment, including standard computer and keyboard use.
- Work within Smartsheet's fully remote, geographically distributed team structure, contributing to a psychologically safe and inclusive team environment where diverse perspectives and approaches are welcomed. This role operates within standard business hours with flexibility expected around on-call responsibilities. No regular travel is required.
You Have:
- 5+ years of experience in DevOps, platform engineering, or site reliability engineering, demonstrating a depth of expertise in operating production infrastructure at scale.
- A BS or MS in Computer Science, Engineering, or a related field, or equivalent industry experience providing the foundational knowledge required for this role.
- Deep proficiency with Terraform and Terragrunt for managing production cloud infrastructure at scale across multiple environments and regions.
- Strong Kubernetes expertise, including EKS cluster operations and Helm chart authoring.
- Hands-on experience with AWS networking and container workload services: EKS, ALB/NLB, VPC, IAM, ECR, Route53, CloudWatch, and EventBridge.
- Proficiency in at least one general-purpose programming language — Go or Python preferred — for building operational tooling and automation.
- Solid understanding of reverse proxies, API gateways, or load balancers (NGINX, HAProxy, or equivalent).
- Experience designing and maintaining CI/CD pipelines (GitLab CI preferred), including shared template libraries across multiple repositories.
- Experience with container image security practices: hardened base images, vulnerability scanning, and image promotion workflows.
- Strong operational instincts, including comfort with on-call responsibilities, incident response, runbook authorship, and postmortems in production environments, as well as the ability to plan, prioritise, and manage multiple concurrent workstreams independently.
- Demonstrated ability to communicate clearly and collaborate effectively with peers, stakeholders, and cross-functional engineering teams, including through written documentation, architecture discussions, and code review.
- The ability to handle and document sensitive operational and compliance-related information with discretion, including infrastructure configurations, security-relevant data, and compliance artefacts.
- Awareness of the operational context in which infrastructure decisions are made, including an understanding of how platform choices affect engineering productivity and organisational efficiency, with no direct financial responsibility required for this role.
- 1 year of professional experience leveraging AI-based workflows to author, maintain, review, and deploy infrastructure or code.
- Fluency in English is required.
- Legal eligibility to work in Bulgaria on an ongoing basis.
Nice to Have:
- Experience with enterprise CDN platforms, including WAF configuration, origin routing, and cache policy management.
- Familiarity with FedRAMP or AWS GovCloud operational constraints.
- Experience with advanced reverse proxy or API gateway configuration, including programmatic config generation, dynamic routing, and service mesh components.
- Experience managing Datadog resources via Terraform: SLO definitions, monitors, and dashboards.
- Familiarity with rate-limiting services, control plane components, or custom traffic filter development.
Smartsheet provides a competitive base salary range for roles that may be hired in the different geographic areas where we are licensed to operate. Actual compensation is determined by several factors including, but not limited to, level of professional and educational experience, skills, and specific candidate location. In addition, this role will be eligible for a market-competitive incentive opportunity.
Bulgaria Base Salary Pay Range
€58,250 - €69,750 EUR
Get to Know Us:
At Smartsheet, your ideas are heard, your potential is supported, and your contributions have real impact. You’ll have the freedom to explore, push boundaries, and grow beyond your role. We welcome diverse perspectives and nontraditional paths—because we know that impact comes from individuals who care deeply and challenge thoughtfully. When you’re doing work that stretches you, excites you, and connects you to something bigger, that’s magic at work. Let’s build what’s next, together.
Equal Opportunity Employer:
Smartsheet is an Equal Opportunity (EEO) employer committed to fostering an inclusive environment with the best employees. It is our policy to provide equal employment opportunities to all qualified applicants in accordance with applicable laws in the US, UK, Australia, Germany, Costa Rica, Japan, Bulgaria, and India. All qualified applicants will receive consideration without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, protected veteran or disabled status, or genetic information.
If there are preparations we can make to help ensure you have a comfortable and positive interview experience, please let us know.
#LI-Remote