Senior Platform Support Engineer / Site Reliability Engineer
Sydney
Full time
$160-190k
Hybrid
About the Opportunity
We are working with a leading financial services client seeking a Senior Platform Support Engineer / Site Reliability Engineer to join a team that supports highly available, business-critical transaction processing platforms.
This role sits within a modern engineering environment supporting real-time distributed systems that process high volumes of transactions. The successful candidate will play a key role in ensuring platform stability, reliability, performance and operational excellence.
The position is ideal for engineers who thrive in fast-paced production environments, enjoy troubleshooting complex issues and are passionate about automation and continuous improvement.
Key Responsibilities
- Support and maintain mission-critical production platforms operating in a 24x7 environment
- Monitor platform health, performance and availability across distributed systems
- Respond to and manage production incidents, ensuring timely resolution and effective stakeholder communication
- Conduct root cause analysis and drive permanent fixes for recurring issues
- Partner with engineering teams to improve system reliability, scalability and resilience
- Develop automation and tooling to reduce operational overhead and improve efficiency
- Investigate transaction flows, application behaviour and data issues using SQL and system-level analysis
- Contribute to platform observability, monitoring and alerting improvements
Required Experience
- 5+ years' experience in Production Support, Site Reliability Engineering, Platform Engineering or Application Support roles
- Experience supporting high-availability, business-critical systems
- Strong troubleshooting and incident management capability
- Solid Linux administration and system-level debugging skills
- Experience supporting distributed or real-time processing environments
- Hands-on experience with monitoring and observability platforms such as Splunk, Grafana, ELK or Prometheus
- Scripting or automation experience using Python, Java, Shell or similar technologies
- Strong SQL skills and experience performing data investigations
- Excellent communication and stakeholder management skills
Preferred Experience
Experience within one or more of the following environments would be highly regarded:
- Financial Markets
- Electronic Trading
- Payments Platforms
- FinTech
- Transaction Processing Systems
- Large-scale Platform Engineering
- Cloud Infrastructure and SRE Teams
Desirable Skills
- Kubernetes or OpenShift
- Cloud-native technologies
- Security and key management concepts
- Experience supporting transaction-intensive systems
- Exposure to digital asset, blockchain or distributed ledger technologies
What We're Looking For
The ideal candidate will:
- Have supported complex production environments under demanding operational conditions
- Demonstrate strong analytical and problem-solving capability
- Be confident leading investigations into critical incidents
- Take ownership of platform reliability and service availability
- Have a passion for automation and operational excellence
- Communicate effectively with both technical and non-technical stakeholders
This is an excellent opportunity to work on modern, high-throughput platforms where reliability, performance and operational excellence are critical to business success.
If this is of interest, get in touch:
