Lead SRE
Serve as the production readiness steward for supported platforms
Partner closely with development teams to design, build, implement, and support reliable technology services
Drive SRE best practices across availability, capacity, performance, monitoring, self-healing, and deployment automation
Lead DevOps transformation through tooling, standards, and advocacy across development, quality, release, and product organizations
Tech Skills:
Automation: Ansible (Playbook), Jenkins, AI, XLR
Bitbucket, GitHub, OpenTelemetry (OTel)
Networking experience
Monitoring: Splunk, Dynatrace
Unix, Shell Scripting, SQL, Python, Apache NiFi
All About You
BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience.
Experience with algorithms, data structures, scripting, pipeline management, and software design.
Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
Ability to help debug and optimize code and automate routine tasks.
We support many different stakeholders. Experience in dealing with difficult situations and making decisions with a sense of urgency is needed.
Experience in one or more of the following is preferred: C, C++, Java, Python, Go, Perl or Ruby.
Interest in designing, analyzing and troubleshooting large-scale distributed systems.
We need team members with an appetite for change and pushing the boundaries of what can be done with automation.
Experience in working across development, operations, and product teams to prioritize needs and to build relationships is a must.
Experience with industry-standard CI/CD tools such as Git/Bitbucket, Jenkins, Maven, Artifactory, and Chef. Experience designing and implementing an effective, efficient CI/CD flow that moves code from dev to prod with high quality and minimal manual effort is desired.
The role of business operations is to be the production readiness steward for the platform. This is accomplished by closely partnering with developers to design, build, implement, and support technology services. A business operations engineer will ensure operational criteria like system availability, capacity, performance, monitoring, self-healing, and deployment automation are implemented throughout the delivery process.
Senior:
Automation: Ansible (Playbook), Jenkins, AI, XLR
Bitbucket, GitHub, OpenTelemetry (OTel)
Monitoring: Splunk, Dynatrace
Work on incidents, WOs, and PBIs, and provide weekend support.
Strong communication skills and the ability to handle TRT/incidents independently.
Unix, Shell Scripting, Python
