Taguig City, National Capital Region (Manila), Philippines
1 day ago
Site Reliability Engineer (Supply Chain IT Operations)

Job Location

Taguig City

Job Description

Information Technology (IT) at Procter & Gamble is where business, innovation and technology integrate to build a competitive advantage for P&G. Our mission is clear -- we deliver IT to help P&G win with consumers.

Do you love implementing continuous improvement in IT solutions to drive efficiency and agility in meeting constantly evolving business needs? Then this job might be for you!

As a Site Reliability Engineer, you will be instrumental in ensuring the high availability and reliability of our digital IT products in the P&G supply chain. Your primary focus will be on enhancing system performance through faster detection, response, and resolution of issues, while also implementing strategies to prevent recurrence and reduce operational toil. You will use robust Observability and Monitoring tools, automate incident response systems, and optimize IT architecture to create a resilient and reliable infrastructure.

Responsibilities:

Implement and lead comprehensive monitoring solutions and tools to provide real-time insights into system performance, enabling proactive incident detection and ensuring accurate, actionable alerts for prompt responses.

Continuously refine monitoring strategies and develop automation scripts to address recurring issues, enhancing system visibility, resource optimization, and overall efficiency.

Establish and maintain Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to improve service quality and reliability.

Collect and share data and insights from observability tools to drive continuous improvement initiatives.

Work closely with Software Engineers, Product Teams, and Infrastructure Teams to develop and implement initiatives that enhance IT reliability.

Engage with customers to understand their needs and difficulties regarding Observability and Monitoring tools, providing exceptional support in all interactions, including communications, updates, and feedback.

Stay updated on industry trends and effective strategies in Site Reliability Engineering while continuously enhancing technical skills in system architecture, automation, cloud technologies, and operational processes.

Share knowledge and mentor team members to foster a culture of learning and professional development within the team.

Lead root cause analysis efforts and implement corrective action plans in a timely manner to achieve permanent resolutions for incidents.

Oversee documentation and knowledge management efforts.

Job Qualifications

Candidates must demonstrate strong leadership in the application of technical expertise to drive business results.

We are looking for candidates who possess the following core qualities:

A Bachelor's degree in a related field such as Engineering, Information Technology, or Computer Science, and up to 5 years of experience

Experience or familiarity with monitoring and observability tools (e.g., Prometheus; Grafana preferred)

Knowledge of and familiarity with system administration, including Linux/Unix environments and cloud platforms (Azure is preferred, but AWS or GCP is acceptable)

Experience with configuration management tools and infrastructure-as-code frameworks (e.g., Terraform)

Proficiency in at least one programming language (e.g., Python, C#) and a background in scripting for automation tasks

Understanding of networking protocols, network infrastructures, load balancing, and DNS management

Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes)

Familiarity with databases and proficiency in writing SQL queries

Understanding of best practices in security and experience with implementing secure systems

Knowledge of incident response methodologies, root cause analysis, and implementing preventive measures (ITIL and/or SRE)

Familiarity with ticketing systems and task management (preferably ServiceNow)

Problem-solving skills with the ability to analyze complex issues and devise effective solutions

Learning agility as there will be new topics to learn and new spaces to understand

Communication and collaboration skills to work effectively with multi-functional teams, partners, and customers

Teamwork and interpersonal skills, with an ability to build relationships and work effectively in a collaborative environment

Operational excellence / execution skills as the work requires discipline

Preferred Skills:

Understanding of or experience with Supply Chain applications, processes, documents, or general data flows, in order to understand the impact of unplanned IT downtime and of IT changes on business operations

About us

We produce globally recognized brands, and we grow the best business leaders in the industry. With a portfolio of trusted brands as diverse as ours, it is paramount that our leaders are able to lead with courage across the vast array of brands, categories and functions. We serve consumers around the world with one of the strongest portfolios of trusted, quality, leadership brands, including Always®, Ariel®, Gillette®, Head & Shoulders®, Herbal Essences®, Oral-B®, Pampers®, Pantene®, Tampax® and more. Our community includes operations in approximately 70 countries worldwide.

Visit http://www.pg.com to learn more.

We are an equal opportunity employer and value diversity at our company. We do not discriminate against individuals on the basis of race, color, gender, age, national origin, religion, sexual orientation, gender identity or expression, marital status, citizenship, disability, HIV/AIDS status, or any other legally protected factor.

Job Schedule

Full time

Job Number

R000132943

Job Segmentation

Recent Grads/Entry Level