Head of Department - Cloud-Site Reliability Management (SRM)

Job Summary

  • Singapore
  • Permanent
  • BBBH841054
  • Mar 01, 2023
  • Competitive
Job Description

Overview: This role leads the strategy for Cloud and Site Reliability Management from a full-stack observability mindset and ensures the execution of resiliency engineering actions to continually root out production cloud infrastructure issues.

The role holder will guide and develop the current team to drive high engagement and output, and will provide expert training to the technology team on advanced reliability techniques in support of national cloud projects that are intended to become a global benchmark for healthcare.

Years of experience: 15 to 20 years of relevant working experience in IT, including at least 10 years in a leadership position

What will you do?

  • Own the cloud infrastructure portfolio and lead a team of technical professionals.
  • Engage with senior stakeholders to drive cloud initiatives and adoption.
  • Articulate technical initiatives to senior stakeholders and translate them into business outcomes.
  • Manage technology risk and develop the technology roadmap, providing guidance for adoption.
  • Realize ROI (e.g., operational efficiency & excellence) through portfolio investment.
  • Lead the continuous improvement of systems availability and performance by applying SRE practices. This involves the introduction and utilisation of tooling and mechanisms to gather, analyse and optimise performance data with automation.
  • Lead system diagnostics and real-time debugging and troubleshooting of unique and highly complex problems; eliminate recurring incidents and minimize the impact of unavoidable incidents.
  • Develop a culture of using a software and cloud engineering approach, coupled with data, to manage systems and achieve IT operational excellence.

The successful candidate:

  • 15 to 20 years of relevant working experience in IT, including at least 10 years in a leadership position.
  • Experience operating in a highly complex IT operations environment with more than 2,000 virtual machines across multiple data centres and/or public cloud.
  • Advanced technical experience in designing and operating IT network and security solutions (e.g., ACI, NGFW, Service Chaining).
  • Experience in IT infrastructure portfolio management.
  • Ability to motivate and build a high-performing team.
  • Develop relationships with stakeholders to build confidence, alignment and communicate desired purpose, goals or objectives.
  • Identify opportunities for transdisciplinary collaboration and knowledge transfer to facilitate the integration of knowledge from different disciplines.
  • Establish team effectiveness and manage partnerships to create a cooperative working environment which enables the achievement of goals.
  • Anticipate potential problems to drive a culture of continuous improvement which seeks to turn problems into opportunities across the organisation.
  • Strong analytical and multi-tasking capabilities with good interpersonal, oral & written communication skills.
  • Flexible and adaptable in a dynamic and fast-paced environment.
  • Good project management skills.

 

Morgan McKinley Pte Ltd
EA Licence No: 11C5502
Registration No: R1331697
EAP Name: ANDALAN RVIN JAMES MURILLO