Staff/Senior Software Engineer, Machine Learning Platform (Ad Cloud)
About Appier
Appier is a software-as-a-service (SaaS) company that uses artificial intelligence (AI) to power business decision-making. Founded in 2012 with a vision of democratizing AI, Appier’s mission is turning AI into ROI by making software intelligent. Appier now has 17 offices across APAC, Europe, and the U.S., and is listed on the Tokyo Stock Exchange (Ticker number: 4180). Visit www.appier.com for more information.
The Impact You’ll Make at Appier
We’re looking for a Staff/Senior Machine Learning Platform Engineer to join our Machine Learning Platform Team, which powers end-to-end infrastructure for model training, evaluation, deployment, and monitoring at scale. Our platform supports daily execution of hundreds of ML models and processes billions of data records across batch and streaming pipelines.
In this role, you’ll shape the architecture and core components of our ML platform, covering batch (Spark), streaming (Flink), job orchestration (Argo on Kubernetes), and infrastructure tools, while ensuring the platform remains robust, scalable, and developer-friendly. You’ll also champion best practices and modern development tools, including LLM-based programming assistants.
What You’ll Work On
- Architect, implement, and scale batch (Spark) and streaming (Flink) pipelines that process billions of records daily for ML training and evaluation.
- Design and operate robust ML job execution frameworks for training, inference, and post-processing.
- Build and maintain internal API servers and developer tools to orchestrate ML jobs on Kubernetes (via Argo Workflows, Helm, Terraform).
- Design and monitor data infrastructure using ClickHouse and PostgreSQL.
- Ensure high availability and observability through monitoring tools like Prometheus and Grafana.
- Collaborate with data scientists, product managers, and engineers to deliver reliable and efficient ML platform capabilities.
- Actively adopt and promote the use of LLM-based tools (e.g., GitHub Copilot, ChatGPT) to accelerate development, documentation, and debugging.
- Mentor junior engineers and help evolve team engineering culture and standards.
What We’re Looking For
- Bachelor’s degree in Computer Science, Engineering, or a related field; Master’s preferred.
- 4+ years of hands-on experience in data systems, machine learning infrastructure, or platform engineering.
- Strong coding proficiency in Python and/or Java, with experience building large-scale production systems.
- Practical experience with Spark, Flink, Kubernetes (GKE), and infrastructure-as-code tools such as Terraform and Helm.
- Experience managing high-throughput data infrastructure using ClickHouse, PostgreSQL, or similar systems.
- Deep understanding of ML pipelines and distributed job execution in production environments.
- Proven ability to apply LLM-based tools (e.g., Copilot, ChatGPT) to boost engineering productivity.
- Strong ownership, architectural thinking, and ability to lead cross-functional platform projects.