
Reliability Lead (Healthcare | Programming)

Job Summary

  • Singapore
  • Permanent
  • BBBH811408
  • Mar 30, 2022
  • Competitive
Job Description

You will be part of a multi-award-winning HealthTech firm, a leading organization in the Healthcare Technology sector that is transforming healthcare through smart technology and the latest innovations. The role offers long-term stability and career growth within a thriving organization in a growing industry.

The Reliability Lead will support the Reliability Principal in strategy discussions with senior management on application and system improvement, and will also manage the reliability team.

He/She will ensure that the existing site reliability engineering (SRE) initiatives, such as monitoring availability, uplifting capability and automation, are on track. He/She will also assist the Reliability Principal and Engineering Teams in reviewing the reliability program to take stock of successes and challenges and to refine the program. He/She will be in charge of the management reports that describe the current situation and recommend the next steps.

As Lead of the Reliability team, which consists of experienced engineers and product specialists, he/she will coach the engineering and service management teams to help them improve application reliability through tools, monitoring and prevention activities. He/She will collaborate with the applications, incident management (IOC) and infrastructure support teams to identify and implement procedures, tools and scripts that improve reliability, reduce downtime and increase automation.

Your Role:

  • Strive for automation either by coding it or by leading and influencing engineers to build systems that are easy to run in production.
  • Identify significant projects that result in substantial cost savings.
  • Identify changes to the production architecture from a reliability, performance and availability perspective, using a data-driven approach.
  • Proactively work on efficiency and capacity planning to set clear requirements and reduce system resource usage, lowering operating costs for all our customers.
  • Identify parts of the system that do not scale, provide immediate palliative measures and drive long-term resolution of the resulting incidents.
  • Identify Service Level Indicators (SLIs) that will align the team to meet the availability and latency objectives.
  • Know a domain really well and radiate that knowledge through recorded demos, discussions in DNA (Design and Automation) meetings, or Incident Reviews.
  • Perform and run blameless RCAs on incidents and outages, aggressively looking for answers that will prevent the incident from ever happening again.
  • Set an example for the team of SREs with positive and inclusive leadership and discussion of work.
  • Show ownership of a major part of the infrastructure.
  • De-escalate any conflicts inside the team.

Requirements:

  • Bachelor's degree in computer science or other highly technical, scientific discipline.
  • Ability to program (structured and OO) in one or more high-level languages, such as Python, Java, C# and JavaScript.
  • Experience with infrastructure technologies such as operating systems (Windows and Linux), networking, storage and virtualisation.
  • Familiarity with test automation tools.
  • A sense of urgency to deliver and iterate fast.
  • A proactive approach to spotting problems, areas for improvement and performance bottlenecks.
  • A track record of delivering a large-scale software application through to production.

If you are keen on the role and would like to discuss the opportunity further, please click Apply Now or email Vincent with your updated CV.

Only shortlisted candidates will receive a response. If you do not hear back within 14 days, please accept this as notification that you have not been shortlisted.

Vincent Liew
M3S Solutions | Morgan McKinley Pte Ltd | EA Licence No: 11C5502 | EAP Licence No: R1872133


Consultant Details

Vincent Liew
  • Talent Partner | M3S
  • +65 6818 3182