Site Reliability Engineer

Job Summary

  • Singapore
  • Permanent
  • BBBH799263
  • Nov 17, 2021
  • Competitive
Job Description

M3S is looking for talented Site Reliability Engineers / Specialists on behalf of a leading IT organisation in Singapore.

As the Reliability Specialist, you will be part of the team that optimises system design and enhances platform management. You will develop a methodology to evaluate system capacity for applications and determine system limits, and you will initiate and drive system improvement initiatives to achieve business reliability targets.

You will measure and optimise system performance while advancing our capabilities, getting ahead of customer needs, and innovating to continually improve the application and enhance reliability.

Beyond these foundational system capabilities, you will also specialise and act as a subject-matter expert (SME) for specific systems or applications, contributing and providing emergency response in domains that require in-depth knowledge.

The Reliability Specialist works with the application and infrastructure teams to define key performance indicators and benchmarks, giving management a clear description of the current situation along with recommendations for direction and improvement.

You are also required to share what you learn, both within and across teams.

Role and Responsibilities

  • Provide emergency response, either by being on call or by reacting to symptoms flagged by monitoring, escalating when needed
  • Propose ideas and solutions that reduce workload through automation
  • Plan, design and execute solutions to reach specific agreed goals
  • Plan and execute configuration change operations at both the application and infrastructure levels
  • Actively look for opportunities to improve the availability and performance of the system by applying what is learned from monitoring and observation
  • Improve documentation and processes throughout, whether in application documentation or in runbooks, explaining the why and not stopping at the what

Requirements

  • Bachelor's degree in computer science or another highly technical or scientific discipline
  • At least 10 years' experience with infrastructure and application support, IT operations, software engineering, system tuning and capacity planning, cloud and automation
  • Ability to program with one or more high level languages, such as Python, Java, C#, and JavaScript
  • Experience with infrastructure and application technologies like Operating Systems (Windows and Linux), networking, storage, virtualisation, Oracle & MS SQL database, .NET, Java, web and middleware software
  • Familiarity with test automation tools
  • A sense of urgency to deliver and iterate fast
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
  • Previous success in Software / System Operations Engineering leading small teams of 5 - 10 engineers
  • Coding experience beyond simple scripts
  • Strong software engineering skills, with the ability to write code that resolves system defects or vulnerabilities
  • Use infrastructure automation tools such as Chef or Ansible to efficiently manage infrastructure
  • Implement "Infrastructure as Code" using Terraform and CI/CD for automation
  • Load balancing and high-availability application architecture, including proxies and CDNs
  • Administer and manage high-availability, high-performance Microsoft SQL Server or Oracle clusters, web, and middleware software
  • Monitoring and metrics in Dynatrace, ELK or eG, and integrations with Dynatrace / ITSM
  • Key, certificate, and secret management

Teiw Hui Shi (Lorren)

Morgan McKinley Pte Ltd EA Licence No: 11C5502 | EAP Registration No: R1547291

Consultant Details

Lorren Teiw
  • Talent Sourcer | M3S
  • +65 6818 3137
  • lteiw@morganmckinley.com