Software Development Manager, EC2 Infrastructure Supply Chain Security Services
Amazon.com
The EC2 Infrastructure Services organization is responsible for making EC2 instances available to our customers at all times. We are a key part of what makes EC2 elastic. AI infrastructure has taken a key place in EC2, and we are building systems, services, and automation to operate it at scale.
We are seeking a software development manager to lead the AI infrastructure provisioning team. This team owns the delivery of AWS Trainium UltraServer-based capacity to our customers. The manager will set the team's technical strategy and lead it in building software at scale to provision UltraServers, improve workflows, reduce dwell times, and ensure reliable, scalable delivery of AI infrastructure for Amazon's rapidly growing compute fleet.
In this role, you will translate partnership commitments into executable technical and operational plans, ensure solutions meet AWS architectural and operational standards, and provide clear, accurate communication to other leaders.
The role requires deep, hands-on technical knowledge of AWS architectures and best practices, as well as exceptional written and verbal communication skills.
Key job responsibilities
Team Leadership & People Management
* Lead and mentor a team of 8-12 Software Development Engineers focused on EC2 server provisioning and infrastructure delivery
* Conduct regular 1:1s, provide continuous feedback, and drive career development for all team members
* Recruit, hire, and onboard top engineering talent to scale the team in alignment with organizational goals
* Foster an inclusive team culture that promotes innovation, collaboration, and operational excellence
Technical Strategy & Execution
* Own the technical roadmap and delivery of EC2 provisioning systems for Trainium UltraServer compute infrastructure
* Drive architectural decisions to scale and reliably provision and maintain Amazon's rapidly growing EC2 UltraServer fleet
* Collaborate with senior leadership and peer SDMs to align team priorities with broader EC2 organizational objectives
* Work closely with Principal Engineers and Senior SDEs to define technical standards and best practices
Operational Excellence & Innovation
* Drive continuous improvement in software development practices, including CI/CD pipelines, testing frameworks, and deployment automation
* Champion operational metrics and mechanisms to improve system reliability, reduce toil, and accelerate delivery
* Balance short-term delivery commitments with long-term technical debt reduction and system modernization
A day in the life
As a Software Development Manager, you will work with your customers, peers, and TPMs to execute the product roadmap. You will own the EC2 Supply Chain Security Services, a suite of critical services in Data Centers and Manufacturing that vet capacity and host health. In this position, you will manage your team and work with other SDMs and stakeholders to design, develop, and deliver solutions to our customers. You will also own support for production issues and tickets.
About the team
The EC2 UltraServer Provisioning team is a high-performing engineering organization responsible for delivering AWS Trainium-based UltraServer infrastructure at scale. We manage end-to-end provisioning workflows from host ingestion through testing, repair, and recovery.