Production Support Lead
We are partnering with a leading global trading firm known for its cutting-edge real-time systems and high-stakes market operations. They are looking for a seasoned Production Operations Engineering Team Lead to oversee and elevate their production environment in APAC and worldwide. This is a rare chance to lead a top-tier engineering team responsible for the reliability, scalability, and performance of mission-critical, low-latency trading infrastructure.
Responsibilities:
- Leading, mentoring, and growing a high-performing global team of production operations engineers.
- Defining and executing strategies to maintain and improve system resilience, scalability, and performance during peak trading hours.
- Managing global support coverage with a follow-the-sun model to guarantee 24/7 system reliability.
- Implementing operational standards, SLAs, KPIs, and audit frameworks to maintain discipline and quality.
- Owning the full incident lifecycle, from rapid response and root cause analysis to remediation and process improvement.
- Driving automation and optimization initiatives that reduce manual toil and enhance system robustness.
- Ensuring production readiness for new market expansions, infrastructure upgrades, and trading strategies.
- Overseeing colocation environments and capacity planning to maintain redundancy and performance.
- Contributing technically to tooling and platform enhancements that improve observability and fault tolerance.
- Representing the team in leadership forums, aligning operational priorities with broader business goals.
Experience Required:
- An experienced and pragmatic engineering leader with a proven track record in managing production engineering or SRE teams in real-time, low-latency environments.
- Skilled at balancing technical ownership with people leadership, thriving in high-pressure, mission-critical settings.
- Highly organized, with experience coordinating global teams and multi-region support.
- A strong communicator who can bridge business and technical stakeholders, providing clarity during incidents and driving collaboration.
- Deeply knowledgeable in Linux-based distributed systems, automation (Python, Bash), CI/CD, and observability platforms.
- Comfortable managing SLAs/SLOs and leveraging data to continuously improve system reliability.
- Experienced in leading incident responses, with a mindset focused on prevention and long-term system stability.
- Exposure to financial markets, trading technology, or similarly fast-paced environments is highly desirable.
What's in it for you:
- Competitive performance-based bonus structure tied directly to team and company success.
- A chance to work with an elite, globally distributed team at the forefront of market technology.
- Strong emphasis on personal development, mentorship, and continuous learning.
- Comprehensive benefits including wellness perks, regular social events, and relocation support where needed.
Aboriginal and Torres Strait Islander Peoples are encouraged to apply.
To apply, please click Apply or call Chane Prasongdee on 02 8289 3118 for a confidential discussion.
About the job
Contract Type: Permanent
Specialism: Technology & Digital
Focus: Infrastructure, Cloud & DevOps
Industry: IT
Salary: Negotiable
Workplace Type: Hybrid
Experience Level: Mid Management
Location: Sydney CBD
Employment Type: Full-time
Job Reference: 0QGV38-C64A68D3
Date posted: 19 October 2025
Consultant: Chane Prasongdee