Sr. Software Dev Engineer, Infrastructure Reliability Engineering
Amazon.com
We are seeking an experienced Sr. Software Development Engineer to join the Infrastructure Reliability Engineering team and help reduce MTTR and improve service and infrastructure availability across Fulfillment Technology and Robotics. The role focuses on innovation in global network device monitoring and telemetry collection, and on pioneering new relational monitoring tools. It will leverage AI to build automatic detection and remediation solutions, reducing the need for human intervention during high-severity events. The position has the scope to work across organizations and influence the technical direction of many teams.
Key job responsibilities
- Lead architectural design and development of new and existing applications
- Design and implement systems that detect, correlate, and visualize service health at scale across thousands of global sites
- Drive the development of AI-powered solutions for automatic incident detection and remediation
- Create and enhance automation systems for incident response and management
- Architect solutions that integrate telemetry from diverse sources including software applications, AWS services, network paths, network devices, and end-user device fleets
- Build self-service integration capabilities for service owners
- Develop APIs and interfaces that enable automated decision-making for deployments and changes
- Mentor junior engineers and foster a culture of engineering excellence
A day in the life
Amazon offers a full range of benefits that support you and eligible family members, including domestic partners and their children. Benefits can vary by location, the number of regularly scheduled hours you work, length of employment, and job status such as seasonal or temporary employment.
The benefits that generally apply to regular, full-time employees include:
- Medical, Dental, and Vision Coverage
- Maternity and Parental Leave Options
- Paid Time Off (PTO)
- 401(k) Plan
If you are not sure that every qualification on the list above describes you exactly, we'd still love to hear from you!
At Amazon, we value people with unique backgrounds, experiences, and skill sets. If you’re passionate about this role and want to make an impact on a global scale, please apply!
About the team
Infrastructure Reliability Engineering is a software development organization focused on building tools to improve the availability of network and service infrastructure across Amazon's global fulfillment network. The team is global and has employees in the USA and Europe.