
Senior Cloud Platform Engineer
About Datavations
Datavations is a leading New York-based data and AI software company specializing in the $2.3 trillion building materials industry. Our platform, powered by advanced Machine Learning and Artificial Intelligence, gives manufacturers a data-driven approach to better serve customers and grow their relationships with key accounts. By simplifying massive datasets into actionable business insights, we help businesses optimize pricing, inventory, and product assortment.
Our values
- We value execution: momentum is everything
- Belief in the power of positivity: we are encouraging and take risks
- Proactive ownership: we are passionate about driving financial value with creative data products
- Foster collaboration: we welcome help and always extend a helping hand, internally and externally
About the Role
We are seeking a Senior Cloud Platform Engineer to lead and take ownership of our cloud infrastructure and architecture. This role will encompass building, scaling, and managing robust cloud systems that support our data-driven products. The ideal candidate will be experienced in architecting cloud solutions with a focus on scalability, cost optimization, and security.
This role requires a mix of hands-on engineering and strategic thinking to ensure our systems are scalable, reliable, and secure. You will work closely with engineering, product, and business teams across the company to establish best practices and automate processes that improve our development and operational efficiency.
If you want to help define the next generation of data-powered products in a massive industry, this is the perfect opportunity for you. Expect autonomy in your work, support from a team that values continuous learning, and the chance to directly shape our solutions and strategy.
Responsibilities & skills
- Oversee the management and optimization of our existing AWS services and data infrastructure, including EC2, Fargate, and ClickHouse, ensuring performance and scalability.
- Work closely with engineering, product, and business teams to design and build infrastructure solutions aligned with business goals, ensuring seamless integration with existing and future product offerings.
- Implement and maintain infrastructure as code using Terraform to streamline cloud provisioning, and manage CI/CD pipelines to improve deployment cycles.
- Enhance system observability by setting up monitoring, alerting, and performance diagnostics across services to ensure seamless operations and detect issues proactively.
- Manage AWS cloud spending by identifying and implementing cost-saving strategies while ensuring scalability and high availability.
- Lead the design and deployment of cloud infrastructure to support Generative AI solutions, working alongside software and AI teams to implement and scale AI services securely.
- Design and implement solutions to support integrating Datavations’ platform with external customer data solutions (e.g., Power BI, Azure data platforms), ensuring secure, smooth data flow between systems.
- Define and uphold best practices for cloud operations, including on-call rotation management, disaster recovery, and incident resolution.
- Champion AI-powered coding tools (e.g., GitHub Copilot) to enhance engineering efficiency.
- Present insights and strategic recommendations to senior leadership and stakeholders.
- Participate in team meetings, project planning, and knowledge-sharing activities.
- Adopt a self-starter mentality and inspire other team members to do the same.
Experience
- 4+ years of experience in Cloud Infrastructure, Site Reliability Engineering (SRE), or Platform Engineering.
- Strong expertise in architecting solutions using AWS services, including EC2, Fargate, and Kinesis.
- Experience with ClickHouse, Amazon Redshift, or similar data warehouse platforms. Knowledge of designing and optimizing data pipelines.
- Strong expertise with Terraform and using infrastructure as code (IaC).
- Experience with CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI/CD, ArgoCD).
- Familiarity with deployment strategies (canary, blue/green).
- Proficiency in monitoring and observability tools (CloudWatch, OpenTelemetry, Prometheus, Datadog).
- Hands-on experience with containerization and orchestration (Docker, Kubernetes).
- Deep understanding of incident management and alerting best practices.
- Strong background in cloud cost optimization.
- Passionate about building massively scalable data platforms.
- Bonus: cross-cloud experience (especially between AWS and Azure) and ML and GenAI Ops.
- Soft Skills: Excellent problem-solving skills, customer obsession, and the ability to communicate complex technical concepts to both technical and non-technical stakeholders.
- Ability to work a hybrid schedule, onsite at our NYC office Tuesday through Thursday.
Why Join Datavations
- Impact at Scale: Influence a $2.3 trillion industry by shaping how data science accelerates ROI for major manufacturers.
- Autonomy & Growth: Enjoy the freedom to experiment with new technologies and see your ideas realized in production.
- Collaborative Culture: Work alongside a supportive team that values positivity, proactive ownership, and continuous learning.
- Professional Development: Work with Terraform, OpenTelemetry, GenAI coding tools, and modern observability stacks.