Multi-Cloud & Hybrid Infrastructure: Maintain and manage enterprise-scale resources across Azure and Alibaba Cloud, alongside hybrid and on-premises environments, to ensure high availability, performance, and security.
Site Reliability Engineering (SRE): Support continuous business operations by maintaining system stability, conducting first-level problem diagnosis, and improving overall system reliability.
End-to-End Automation & DevOps: Implement and manage automation tools (Azure DevOps, Terraform, and Ansible Automation Platform) to deliver automated cloud provisioning (via Terraform/ServiceNow) and end-to-end automation of business-as-usual (BAU) tasks and incident remediation (via Ansible/Datadog).
Enterprise Platform Operations: Oversee foundational systems, including servers, databases, SAN storage, data protection, and disaster recovery (DR), and carry out routine health checks, housekeeping jobs, and DR drills.
Documentation & Governance: Author and maintain technical documentation, runbooks, architecture diagrams, and automation standards.
Operational Flexibility: Provide critical after-hours support when required and perform additional tasks assigned by the team leader.
Education & Experience: Degree or diploma in Computer Science, IT, or an equivalent field, paired with 5+ years of experience in cloud engineering, automation, or platform operations.
Core Technical Skills: Proficiency in multi-cloud platforms (Azure/Alibaba Cloud) and core infrastructure administration (Windows, Linux, Microsoft SQL Server, and networking).
Automation & Tooling Expertise: Hands-on mastery of Azure DevOps, Terraform, Ansible Automation Platform, and scripting languages such as PowerShell, Bash, and Python.
Preferred Knowledge: Familiarity with ServiceNow catalogs/workflows, Datadog monitoring, and event-driven Ansible automation.
Certifications (Advantageous): IT professional credentials such as Azure Administrator Associate, Terraform Associate, or Alibaba Cloud ACA/ACP.
Language: Professional proficiency in both spoken and written English.
