11. Cloud Infrastructure Engineer (AI/ML Ops)
About PayPay India
Why India?
To build our payment services, we received technical cooperation from Paytm, a large payment service company in India. Building on their customer-first technologies, we created and expanded our smartphone payment service in Japan. We have therefore decided to establish a development base in India, a major IT country with many talented engineers, as evidenced by the cutting-edge mobile payment services it continues to produce.
OUR VISION IS UNLIMITED
Job Description
PayPay is looking for an experienced Cloud-Based AI and ML Engineer.
This role involves leveraging cloud-based AI/ML services to build infrastructure; developing, deploying, and maintaining ML models; collaborating with cross-functional teams; and ensuring scalable and efficient AI solutions, particularly on Amazon Web Services (AWS).
Main Responsibilities
1. Cloud Infrastructure Management:
- Architect and maintain cloud infrastructure for AI/ML projects using AWS tools.
- Implement best practices for security, cost management, and high availability.
- Monitor and manage cloud resources to ensure seamless operation of ML services.
2. Model Development and Deployment:
- Design, develop, and deploy machine learning models using AWS services such as SageMaker.
- Collaborate with data scientists and data engineers to create scalable ML workflows.
- Optimize models for performance and scalability on AWS infrastructure.
- Implement CI/CD pipelines to streamline and accelerate the model development and deployment process.
- Set up a cloud-based development environment for data engineers and data scientists to facilitate model development and exploratory data analysis.
- Implement monitoring, logging, and observability to streamline operations and ensure efficient management of models deployed in production.
3. Data Management:
- Work with structured and unstructured data to train robust ML models.
- Use AWS data storage and processing services like S3, RDS, Redshift, or DynamoDB.
- Ensure data integrity and compliance with applicable security regulations and standards.
4. Collaboration and Communication:
- Collaborate with cross-functional teams including DevOps, Data Engineering, and Product Management teams.
- Communicate technical concepts effectively to non-technical stakeholders.
5. Continuous Improvement and Innovation:
- Stay updated with the latest advancements in AI/ML technologies and AWS services.
- Provide automation that lets developers easily develop and deploy their AI/ML models on AWS.
Tech Stack
- AWS:
- VPC, EC2, ECS, EKS, Lambda, MWAA, RDS, ElastiCache, DynamoDB, OpenSearch, S3, CloudWatch, Cognito, SQS, KMS, Secrets Manager, MSK, Amazon Kinesis, CodeCommit, CodeBuild, CodeDeploy, CodePipeline, AWS Lake Formation, AWS Glue, SageMaker, and other AI services.
- Terraform, GitHub Actions, Prometheus, Grafana, Atlantis
- OSS (Administration experience on these tools)
- Jupyter, MLflow, Argo Workflows, Airflow
Required Skills and Experiences
- 5+ years of technical experience in cloud-based infrastructure with a focus on AI and ML platforms
- Extensive technical hands-on experience with computing, storage, and analytical services on AWS.
- Demonstrated skill in programming and scripting languages, including Python, Shell Scripting, Go, and Rust.
- Experience with infrastructure as code (IaC) tools for AWS, such as Terraform, CloudFormation, and CDK.
- Proficient in Linux internals and system administration.
- Experience in production-level infrastructure change management and releases for business-critical systems.
- Experience in managing the availability, performance, and cost of cloud infrastructure and platform systems.
- Strong understanding of cloud security best practices and payment industry compliance standards.
- Experience with cloud services monitoring, detection, and response, as well as performance tuning and cost control.
- Familiarity with cloud infrastructure service patching and upgrades.
- Excellent oral, written, and interpersonal communication skills.
Preferred Qualifications
- Bachelor's degree or higher in a technology-related field
- Experience with other cloud service providers (e.g., GCP, Azure)
- Experience with Kubernetes
- Experience with Event-Driven Architecture (Kafka preferred)
- Experience using and contributing to Open Source tools
- Experience in managing IT compliance and security risk
- Published papers / blogs / articles
- Relevant and verifiable certifications
- Bilingual ability in English and Japanese is nice to have but not required; proficiency in either language is fine.
- JLPT Level 3 or above
Remarks
*Please note that you cannot apply for PayPay (Japan-based jobs) or other positions in parallel or in duplicate.
PayPay 5 senses
- Please refer to PayPay 5 senses to learn what we value at work.
Working Conditions
Employment Status
- Full Time
Office Location
- Gurugram (WeWork)
※The development center requires you to work in the Gurugram office to establish a strong core team.