System Reliability Engineer - Contract
About us:
Working at Tech Holding isn't just a job; it's an opportunity to be part of something bigger. We are a full-service consulting firm founded on the premise of delivering predictable outcomes and high-quality solutions to our clients. Our founders and team members have held senior positions in a wide variety of companies – from emerging startups to large Fortune 50 firms – and we have combined that experience into a unique approach built on the principles of deep expertise, integrity, transparency, and dependability.
The Role:
As a System Reliability Engineer, you will play a key role in managing Linux and Windows environments, automating processes, and implementing robust monitoring and security practices. Your expertise will help us maintain high availability and performance across our clients' systems. If you thrive on solving complex problems and optimizing systems, we want to hear from you!
Responsibilities:
- Manage, configure, and maintain Linux and Windows Server environments.
- Perform regular system updates, patches, and security configurations.
- Implement and maintain monitoring tools to track system performance, availability, and reliability.
- Analyze performance metrics and logs to identify and resolve issues proactively.
- Collaborate with stakeholders to create dashboards and alerts for proactive performance monitoring.
- Develop and maintain automation scripts for routine tasks, deployments, and incident responses.
- Use configuration management tools to ensure consistent and repeatable system setups.
- Implement and enforce security best practices for system configurations and network setups.
- Conduct regular vulnerability assessments and apply necessary patches to mitigate risks.
- Work closely with development, DevSecOps, and cloud engineering teams to support application deployments and infrastructure changes.
- Provide technical guidance and support for resolving complex system issues.
- Create and maintain detailed documentation for system configurations, procedures, and incident reports.
- Identify opportunities for process improvements and implement changes to enhance system reliability and performance.
Required Skills:
- Proficiency in managing and troubleshooting Linux (e.g., Amazon Linux, CentOS) and Windows Server systems.
- Experience with system configuration, management, and maintenance.
- Experience with automation tools such as Ansible, Puppet, or Chef.
- Familiarity with monitoring solutions such as AWS CloudWatch, Dynatrace, or Datadog.
- Ability to analyze system performance metrics and implement optimizations.
- Experience with patch management, vulnerability assessment, and remediation.
- Proficiency in scripting languages such as Bash, Python, and PowerShell for automating administrative tasks.
- Experience with version control systems like Git.
- Familiarity with AWS, specifically in managing EC2 instances, Lambda functions, and containers.
- Familiarity with AWS Systems Manager features, specifically Patch Manager and Run Command.
- Familiarity with incident response, troubleshooting, and root cause analysis.
- Familiarity with infrastructure as code (IaC) tools like Terraform or AWS CloudFormation.
Nice to have:
- Familiarity with the AWS Well-Architected principles.
- AWS Certifications:
  - DevOps Engineer: Professional
  - Solutions Architect: Associate
  - SysOps Administrator: Associate
  - Developer: Associate
What We Offer:
- Remote Work Opportunities
- Flexible Work Hours
Tech Holding is proud to be an Equal Opportunity Employer and is committed to fostering a diverse and inclusive workplace. We welcome applicants from all backgrounds and experiences, and we consider qualified applicants without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, disability, veteran status, or any other legally protected characteristic. If you require accommodation in the application process, please contact our HR