Site Reliability Engineer (SRE)
Latam
Job Description:
Incident Response:
- Respond promptly to system alerts and incidents.
- Assist in diagnosing and resolving system outages or performance issues.
Automation:
- Develop and maintain automation scripts using languages such as JavaScript, Rust, Python, and Bash.
- Implement automated monitoring and alerting solutions to improve operational efficiency.
System Maintenance:
- Perform regular system maintenance tasks, including updates and patches.
- Assist in capacity planning and scaling efforts to ensure system reliability.
Collaboration:
- Work closely with senior engineers and other technical teams to improve system reliability and performance.
- Participate in team meetings and contribute to discussions on system enhancements.
Documentation:
- Document incident reports, troubleshooting steps, and resolution outcomes.
- Maintain accurate and up-to-date documentation for systems and processes.
Performance Tuning:
- Assist in optimizing system and application performance.
- Analyze performance data to identify areas for improvement.
Learning and Development:
- Continuously learn and adopt new tools, technologies, and best practices in site reliability engineering.
- Attend training sessions, workshops, and other learning opportunities as needed.
Security:
- Implement and maintain security best practices for systems and applications.
- Assist in security audits and vulnerability assessments.
Qualifications:
- Basic understanding of cloud services, primarily AWS.
- Proficiency in programming and scripting languages such as JavaScript, Rust, Python, and Bash.
- Familiarity with containerization and orchestration tools like Docker and Kubernetes.
- Experience with monitoring tools, including Prometheus, Grafana, and Datadog.
- Strong problem-solving skills and attention to detail.
- Effective communication skills for collaboration and documentation.
Additional Requirements:
- Full-time availability, aligned with the Arkansas time zone (US Central Time).
- Recommendation letters will be requested.
This role offers a dynamic environment to grow your skills in site reliability engineering while contributing to mission-critical systems. If you have a passion for maintaining high-performing systems and enjoy problem-solving in a collaborative team, we encourage you to apply.
Benefits:
- 100% remote work
- Workspace setup bonus: $400 USD gross
- Bonuses in September and December: $300 USD gross each
- Vacation bonus: $500 USD gross
- Floating days off (e.g., your birthday)
- 20 working days of vacation after the first year of employment
- Online English training
Additional Programs:
- 5% sales commission if you bring in a sales opportunity and it is won.
- $500 USD if you refer a new employee through our referral program.