
DevOps Engineer
New York
ROLE
We are looking for a Linux and DevOps Engineer to architect and deploy a large-scale solution tailored for a group of systematic trading teams.
RESPONSIBILITIES
- Design and ensure a consistent environment exists across trading team environments (e.g. OS version, installed packages, mounts, SSH keys, bashrc, etc.)
- Build and maintain external package management and tools (e.g. GCC, conda, etc.)
- Design and maintain a framework for production deployment (e.g. Git tagging and release, rollback, communication of changes, service deployment, tools to start, stop, and restart services)
- Production monitoring and alerting of applications and trading platform (e.g. Nagios scripts, resilient alerting, alerting APIs, web dashboards)
- QA deployment
- Design and implement continuous integration frameworks (e.g. Jenkins setup and customization, Git pre-commit hooks, etc.)
- Backups, data archiving, and data organization of applications, code and logs
- Database administration and configuration (PostgreSQL, MongoDB, Cassandra, MySQL, etc.)
- Maintain and update the platform, ensuring its stability, robustness, and security
- Design and implement a cluster/cloud computing infrastructure framework that can be leveraged by every trading team
- Troubleshoot and resolve any systems-related issues and handle the release of code fixes and enhancements
- Evaluate new compute and GPU hardware platforms and their management
REQUIREMENTS
- Strong working knowledge of Linux operating systems (RHEL 8, RHEL 9)
- Troubleshooting complex OS and kernel-related issues
- Linux OS builds and patch management
- Good knowledge of LDAP/AD authentication using Kerberos
- Package Management (e.g. DNF, Yum, Satellite, Foreman)
- System performance tuning (e.g. low latency, kernel parameters)
- Experience with storage and protocols (e.g. NFS, CIFS, SMB, S3, XFS, ZFS, LVM, RAID)
- Configuration management (e.g. Ansible, Puppet/Chef)
- Monitoring, logging, reporting tools (e.g. Grafana, Datadog, ELK, MongoDB, Redis)
- Setup and management of compute grids (e.g. HTCondor, Rafay, Slurm)
- 3+ years of experience with:
- Reading and writing code and scripts in Python and Bash
- Build automation (e.g. Jenkins, Bamboo, Pipelines)
- Release automation (e.g. Git)
- Monitoring, logging, reporting tools (e.g. Grafana, Datadog, ELK, MongoDB, Redis)
- Managing backups, disaster recovery
- Database admin, configuration and maintenance (e.g. MySQL, PostgreSQL)
- Cloud compute deployment (e.g. AWS, Google Cloud)
- Platform as a Service (PaaS) and containerization (e.g. Kubernetes, Docker, OCP)
- Automation and configuration management using Puppet, Chef, or Ansible
- Ability to resolve problems in a timely, effective, and professional manner
- Excellent written and verbal communication skills
- Experience with Big Data capture and analytics pipelines is a plus
- Experience with web server configuration and deployment is a plus
- Commitment to the highest ethical standards