Modern IT Operations - SRE Support Engineer
PepsiCo
Overview

We are looking for a self-driven SRE support engineer with a software engineering mindset who enables SRE-driven orchestration of all components of the end-to-end ecosystem, preemptively diagnosing anomalies and remediating them through automation. The SRE support engineer is an integral part of the global team, whose main purpose is to provide a delightful customer experience for users of the global consumer, commercial, supply chain and enablement functions in the PepsiCo digital products application portfolio of 260+ applications, enabling a full SRE practice built on an incident prevention / proactive resolution model.

The scope of this role is focused on the modern-architected application portfolio, B2B pepsiconnect, Direct to Customer and other S&T roadmap applications. The role ensures that PepsiCo DPA applications deliver the service performance, reliability and availability expected by our customers and internal groups. It requires a blend of technical expertise in SRE tools and modern application architecture, IT operations experience, and analytics and influence skills.

Responsibilities

- Reporting directly to the SRE & Modern Operations Associate Director, enable and execute the preemptive diagnosis of PepsiCo applications to deliver the service performance, reliability and availability expected by our customers and internal groups.
- Act as a proactive support engineer, diagnosing anomalies before any user is affected and driving the necessary remediations across the teams involved.
- Develop or leverage aggregation and correlation solutions that integrate events across all ecosystem components of the modern architecture solution and produce insights to continuously improve the user journey and order flow experience, in collaboration with software engineering teams.
- Drive incident response, root cause analysis (RCA), and post-mortem processes to ensure continuous improvement.
- Develop and maintain robust monitoring, alerting, and observability frameworks using tools such as Grafana and ELK.
- Collaborate with product and engineering teams during the design and development phases to embed reliability and operability into new services.
- Participate in architecture reviews and provide SRE input on scalability, fault tolerance, and deployment strategies.
- Define and implement SLOs/SLIs for new services before they go live, ensuring alignment with business objectives.
- Work closely with customer-facing support teams to evolve and empower them with SRE insights.
- Participate in on-call support, orchestrate blameless post-mortems, and encourage the practice within the organization.
- Provide input to the definition, collection and analysis of data on relevant products, systems and their interactions to strengthen business process resiliency, especially where it impacts customer satisfaction.
- Actively engage in and drive AI Ops adoption across teams.

Qualifications

- 7-11 years of work experience evolving into an SRE engineer role, with 3-5 years of experience in continuously improving and transforming IT operations ways of working.
- Bachelor’s degree in Computer Science, Information Technology or a related field.
- The ideal engineer is highly quantitative, has great judgment, can connect dots across ecosystems, and works efficiently across functions and teams to ensure SRE orchestration solutions meet customer/end-user expectations.
- The candidate takes a pragmatic approach to resolving incidents, including the ability to systematically triangulate root causes and work effectively with external and internal teams to meet objectives.
- A firm understanding of SRE (Site Reliability Engineering) and IT Service Management (ITSM) processes, with a track record of improving service offerings: proactively resolving incidents, providing a seamless customer/end-user experience, and identifying and mitigating areas of risk.
- Proven experience as an SRE in designing event diagnostics, performance measures and alerting solutions to meet SLAs, SLOs and SLIs.
- Hands-on experience with Python, SQL, relational or non-relational databases, AppDynamics, Grafana, Splunk, Dynatrace, or other SRE Ops toolsets.
- Deep hands-on technical expertise and excellent verbal and written communication skills.

Differentiating Competencies

- Driving for Results: Demonstrates perseverance and resilience in the pursuit of goals. Confronts and works to resolve tough issues. Exhibits a “can-do” attitude and a willingness to take on significant challenges.
- Decision Making: Quickly analyzes complex problems to find actionable, pragmatic solutions. Sees connections in data, events, trends, etc. Consistently works on the right priorities.
- Collaborating: Collaborates well with others to deliver results. Keeps others informed so there are no unnecessary surprises. Effectively listens to and understands what other people are saying.
- Communicating and Influencing: Builds convincing, persuasive, and logical storyboards. Has a strong executive presence. Communicates effectively and succinctly, both verbally and in writing.
- Motivating and Inspiring Others: Demonstrates a sense of passion, enjoyment, and pride about their work. Demonstrates a positive attitude in the workplace. Embraces and adapts well to change. Creates a work environment that makes work rewarding and enjoyable.