Senior Data Engineer - ML

Chennai, Tamil Nadu, India

TechGrove is the Centre of Excellence for Banyan Software, based in Chennai, India. It plays a key role in supporting Banyan’s global businesses through technology, security, and software development. TechGrove brings together India’s deep pool of technical talent with Banyan’s long-term approach to growth, creating a trusted, developer-focused environment where people can do their best work.

This is a Senior Data Engineer – ML Systems role focused on rebuilding Touchstream’s data platform from the ground up to enable AI-driven streaming observability.

Today, data is fragmented across systems and stored in PostgreSQL in a way that doesn’t support ML. Your role is to unify, scale, and redesign the data layer, then build intelligent anomaly detection on top of it.

What will you be responsible for?

  1. Rebuild the Data Foundation
  • Consolidate two existing PostgreSQL systems into a single unified data model
  • Resolve inconsistent identifiers and messy real-world data
  • Execute a zero-downtime migration (dual-write, backfill, validation, rollback)
  2. Build a Scalable Time-Series Platform
  • Move 800M+ rows of streaming metrics into a purpose-built time-series database
  • Design:
    • Real-time query layer (dashboards, alerting)
    • Historical data layer (ML training)
    • Retention and rollup strategy
  • Manage high-cardinality data at scale
  3. Enable ML-Driven Anomaly Detection
  • Replace manual monitoring with automated anomaly detection systems
  • Build models that:
    • Learn per-stream behavior over time
    • Handle noisy, seasonal data
    • Balance false positives vs missed incidents
  • Implement statistical fallbacks (EWMA, Z-score) alongside ML models
  4. Optimize and Own the Backend
  • Tune PostgreSQL for performance (queries, indexing, partitioning)
  • Build reliable data pipelines and monitoring systems
  • Ensure the platform supports AI-driven features and real-time operations

What are we looking for?

Core Requirements

  • Strong PostgreSQL expertise (performance tuning, large-scale systems)
  • Experience with production data migrations (zero/low downtime)
  • Hands-on experience with time-series data or time-series databases
  • Experience building anomaly detection or time-series ML systems in production
  • Strong Python engineering (production-quality code)
  • Comfortable working with AWS infrastructure

Nice to Have

  • Experience with multi-tenant data systems
  • Exposure to ML pipelines or synthetic data
  • Background in streaming, CDN, or video systems

What success looks like

Within the first year, you will:

  • Build a unified, scalable data platform
  • Launch ML-based anomaly detection replacing manual monitoring
  • Enable AI-powered insights across all customers and streaming data


Beware of Recruitment Scams

We have been made aware of individuals fraudulently posing as members of our Talent Acquisition team and extending fake job offers. These scams may involve requests for personal information or payment for equipment. 

Protect yourself by following these steps:

  • Verify that all communications from our recruiting team come from an @banyansoftware.com email address.
  • Remember, employers will never request payment or banking information during the hiring process.
  • If you receive a suspicious message, do not respond — instead, forward it to careers@banyansoftware.com and/or report it to the platform where you received it.

Your safety and security are important to us. Thank you for staying vigilant.
