Operations Manager, Leo Network Operations Center
Amazon.com
Amazon Leo is establishing a 24/7 Network Operations Center (NOC) to provide proactive monitoring and rapid incident response for Leo's satellite network service. We are seeking an experienced Operations Manager to lead the U.S.-based NOC team in Redmond, Washington as part of our geographically distributed operations supporting the Leo program.
This role will manage a team of approximately 10 Support Engineers and Cloud Support Engineers providing 24/7 coverage, responsible for continuous monitoring of the Leo network service and rapid incident response. You will work closely with the Sr SDM and your counterpart in London to ensure seamless global operations.
Export Control Requirement: Due to applicable export control laws and regulations, candidates must be a U.S. citizen or national, U.S. permanent resident (i.e., current Green Card holder), or lawfully admitted into the U.S. as a refugee or granted asylum.
Key job responsibilities
Team Leadership & Development:
- Lead and develop a team of approximately 10 Support Engineers and Cloud Support Engineers in Redmond, Washington
- Manage 24/7 shift operations to provide continuous coverage
24/7 Network Operations:
- Oversee continuous monitoring of Leo network health at spot level (groups of customer terminals) and regional aggregations
- Ensure the team performs initial triage, documents incidents, and manages incident response workflows through resolution
- Coordinate with subject matter experts on complex issues requiring specialized technical expertise
- Maintain communication with stakeholders during active incidents and provide status updates
Operational Excellence:
- Implement and refine Standard Operating Procedures (SOPs) for incident response, escalation, monitoring, and shift operations
- Drive adherence to established runbooks and troubleshooting guides
- Ensure proper ticket lifecycle management and documentation standards
- Run shift handoffs and maintain knowledge transfer protocols
- Lead post-incident reviews to capture lessons learned and identify improvement opportunities
Monitoring & Detection:
- Oversee team use of observability tools including Grafana dashboards
- Monitor alarm systems for spot-level outages and ensure timely response
- Review dashboards for anomalies, trends, and performance degradation
Cross-Functional Collaboration:
- Partner with London Operations Manager to ensure seamless 24/7 global coverage
- Collaborate with Mission Operations, Customer Service Agents (CSAs), and Business Customer Experience (BCX) teams
- Work with engineering teams to identify automation opportunities and improve observability
Metrics & Continuous Improvement:
- Track and report on key performance indicators including time-to-detection and time-to-resolution
- Identify trends in incident types and work with engineering to prevent recurrence
Work Schedule & Travel:
- This role requires flexibility to support 24/7 operations, including occasional off-hours support during major incidents
- Primary work location: Redmond, Washington
- Occasional travel to London and other operational sites (estimated 10-15%)
- May require participation in on-call rotation for management escalations
A day in the life
The Operations Manager starts each day reviewing overnight incidents and ensuring smooth shift transitions. You'll participate in daily standups with your team, review monitoring dashboards for trends, and coordinate with the London team on handoffs. Throughout the day, you'll provide guidance on active incidents, coach team members on troubleshooting techniques, and work on process improvements. You'll attend business reviews, collaborate with engineering teams on automation opportunities, and ensure your team has the tools and training needed to succeed. When major incidents occur, you'll coordinate response efforts and ensure proper escalation to subject matter experts.
About the team
The Leo Network Operations Center is a new, strategic function within the Leo organization. As part of the U.S. team, you'll help build the operational foundation for Leo's satellite network service from the ground up. You'll work with observability tools, collaborate with expert engineering teams, and operate at unprecedented scale. This is an opportunity to establish best practices, develop a high-performing team, and play a critical role in delivering low-latency, high-speed broadband connectivity to unserved and underserved communities around the world. The NOC team works closely with Mission Operations and maintains the health and performance of the Leo network through proactive monitoring and rapid incident response.