Filled Positions

Senior Site Reliability Engineer

Are you looking to hire?

Thankz offers a range of outstanding Senior Site Reliability Engineer candidates. If you're searching for top talent in this field or a similar position, our team can find the ideal person who meets your specific needs and requirements.

As a Senior Site Reliability Engineer, you will play a crucial role in ensuring the reliability, scalability, and performance of our systems. Your expertise in infrastructure automation, incident response, and performance monitoring will contribute to maintaining and improving our platform. 

What you'll be doing 

  • Designing, building, and maintaining scalable and reliable infrastructure systems 
  • Implementing automation tools and processes to streamline operations and improve efficiency 
  • Collaborating with cross-functional teams to identify and resolve system issues and performance bottlenecks 
  • Monitoring system performance and conducting proactive capacity planning 
  • Participating in incident response and performing root cause analysis to prevent future incidents 
  • Implementing and maintaining robust security measures to protect the system and data 
  • Continuously evaluating and optimizing system performance and resource utilization 
  • Contributing to the development of best practices and standards for site reliability engineering 
  • Mentoring and providing guidance to junior members of the team 

Requirements 

  • Bachelor's degree in Computer Science or a related field 
  • Proven experience as a Site Reliability Engineer or in a similar role
  • C1/C2 English proficiency (both written and spoken)
  • Strong knowledge of cloud infrastructure (AWS, Azure, or GCP) and containerization technologies 
  • Proficiency in infrastructure automation tools (e.g., Terraform, Ansible) and scripting languages (e.g., Python, Bash) 
  • Experience with monitoring and log aggregation tools (e.g., Prometheus, ELK stack) 
  • Knowledge of incident management and incident response processes 
  • Familiarity with networking protocols and security principles 
  • Strong problem-solving and troubleshooting skills 

Preferred candidates have experience working in remote or distributed teams. Certification in cloud technologies or site reliability engineering is a plus. Strong leadership and communication skills, along with the ability to mentor and guide junior team members, are also desirable.

We offer a full-time remote position on US hours (40 hours per week, Monday to Friday), with excellent prospects for long-term growth for an ambitious, experienced Senior Site Reliability Engineer. We can offer HMO and other benefits to candidates based in the Philippines.