Lead Site Reliability Engineer
The Walt Disney Company
“We Power the Magic!” That’s our motto at Disney Experiences (DX). Our team creates world-class immersive digital experiences for the Company’s premier vacation brands including Disney’s Parks & Resorts worldwide, Disney Cruise Line, Aulani, a Disney Resort & Spa, and Disney Vacation Club.
We are responsible for the end-to-end digital and physical Guest experience for all technology & digital-led initiatives across the Attractions & Entertainment, Food & Beverage, Resorts & Transportation and Merchandise lines of business as well as other initiatives including MyDisneyExperience and Hey, Disney!
This role sits in the Commerce Shared Services organization within Technology & Digital for Disney Experiences. It works closely with Technical Operations and Product Delivery from across the company.
The Lead Site Reliability Engineer will report to the Manager-Site Reliability Engineer.
**About The Role & Team:**
This is a team lead role that focuses on engineering and reliability with a team of site reliability engineers. You will be responsible for coordinating the team's efforts across the portfolio of applications it supports. This team needs a strong mentor who can help develop and execute specific reliability plans in line with the business strategy of DX Tech and Digital.
**What You'll Do:**
+ Lead the evolution of DevOps practices within the broader team framework, guiding others in leveraging this culture to enhance observability practices.
+ Consult, design, build, and support development pipelines, automate infrastructure and operations, create telemetry for monitoring, engineer high reliability, and reinforce best practices to secure company data.
+ Apply expert systems administration skills on AWS Cloud, Docker, and Kubernetes, along with extensive experience in web technologies and source control management, using Nimbus, ECS, Tomcat, Harness, GitHub, and GitLab.
+ Develop and advocate strategic directions for reliability, observability and recovery and bring practical knowledge on systems, network, operational excellence and application stability, security, performance, and capacity management.
+ Plan and coordinate larger efforts for the team of site reliability engineers.
+ Stay up to date with emerging technologies so you can make informed recommendations.
+ Drive teams to consult, design, build, and support development pipelines, automate infrastructure and operations, build telemetry for monitoring, engineer high reliability, and reinforce best practices to secure company data.
**Required Qualifications:**
+ Minimum 7 years of related work experience
+ Demonstrated leadership in implementing observability principles across complex systems and environments, fostering a culture of reliability and resilience
+ Extensive experience with modern software delivery tools, including GitHub, GitLab, Harness.io, LaunchDarkly, Nimbus, Kubernetes and with optimizing workflows and ensuring seamless deployment processes
+ Outstanding communication and leadership abilities to ensure the effective growth and development of the team
+ A visionary who motivates teams to excel and fosters creativity, consistently driving excellence in all endeavors
+ An advocate for a diverse and inclusive culture that encourages innovation and ensures every team member feels a sense of belonging
+ Proficient in implementing observability principles and advanced tools for system enhancement, applying expertise in major APM tools
+ Fluent in core scripting languages with advanced programming skills (Python, Node.js, Go), and experienced with Linux, CLIs, and code editors like VS Code
+ Skilled in source control management systems like GitHub and GitLab, including managing users and repos; proficient in networking protocols, distributed systems, and container platforms (e.g., Docker, ECS)
+ Experience in cloud hosting services (AWS, Google Cloud, Azure), databases, tools, and security, with experience in CI pipelines, build tools like Jenkins, RESTful web service calls, and JSON
+ Outstanding troubleshooting methodology, including teaching new methodologies to the team and evaluating new systems and infrastructure solutions for technical feasibility against standards
**Preferred Qualifications:**
+ Leveraging AI for predictive insights, driving measurable continuous improvement in system reliability
**Required Education:**
+ Bachelor’s degree in Computer Science, Information Systems, Software, Electrical or Electronics Engineering, or comparable field of study, and/or equivalent work experience
\#DISNEYTECH
**Job ID:** 10129163
**Location:** Orlando, Florida
**Job Posting Company:** Disney Experiences
The Walt Disney Company and its Affiliated Companies are Equal Employment Opportunity employers and welcome all job seekers including individuals with disabilities and veterans with disabilities. If you have a disability and believe you need a reasonable accommodation in order to search for a job opening or apply for a position, email Candidate.Accommodations@Disney.com with your request. This email address is not for general employment inquiries or correspondence. We will only respond to those requests that are related to the accessibility of the online application system due to a disability.