At Roche you can show up as yourself, embraced for the unique qualities you bring. Our culture encourages personal expression, open dialogue, and genuine connections, where you are valued, accepted and respected for who you are, allowing you to thrive both personally and professionally. This is how we aim to prevent, stop and cure diseases and ensure everyone has access to healthcare today and for generations to come. Join Roche, where every voice matters.
The Position
Role Overview
We are seeking a highly experienced System Reliability Engineer with deep expertise in ERP Operations Control Center (OCC) and enterprise observability architecture. In this role, you will lead the design, execution, and continuous evolution of our Monitoring, Observability, Automation and Job Management strategy, ensuring end-to-end visibility across SAP ERP, middleware, and business-critical applications.
You will bring a systems thinking approach, applying SRE principles, automation, and AI-powered diagnostics to enable self-healing, root cause analysis, and measurable reliability outcomes. This is a strategic role requiring collaboration with both technical teams and business stakeholders.
Key Responsibilities
Architecture & Strategy Execution
Design, implement, and govern a comprehensive monitoring, observability, automation, and job management architecture across SAP (SAP S/4HANA, SAP BTP, SAP EWM, ATTP, GRC, etc.), middleware, and hyper-specialized business systems.
Execute the Automation & Observability roadmap in alignment with IT strategy, business SLAs, and customer needs.
Standardize and scale monitoring patterns using SAP Focused Run and/or strategic enterprise tools such as SAP Cloud ALM and Grafana.
SRE Practices & Maturity Improvement
Define and manage SLIs, SLOs, and error budgets, and establish a reliability engineering culture across SAP operations.
Lead the continuous improvement of observability maturity through tooling, telemetry coverage, documentation, and team enablement.
Conduct thorough Root Cause Analysis and introduce operational best practices for proactive incident prevention.
AI, Automation & Self-Healing
Integrate AI-driven monitoring, anomaly detection, and predictive analytics for faster incident detection and auto-resolution.
Build event-driven automation pipelines for common incident scenarios using OCC guided procedures or external orchestration tools.
Enhance root cause analysis using automated correlation of system metrics, exceptions, and transaction traces.
Business KPI & Process Monitoring
Lead the setup of Business Process Monitoring for critical flows (e.g., Intercompany Supply Chain, Order-to-Cash, Procure-to-Pay) to ensure performance and SLA visibility.
Define and operationalize business KPIs with dashboards and alerting tied to user experience and transaction health.
Customer Engagement & Stakeholder Collaboration
Actively engage business and technical stakeholders to gather feedback, identify pain points, and co-develop enhancements to observability capabilities.
Regularly present monitoring performance and roadmap updates to leadership and service teams.
Qualifications
8+ years in SAP system architecture or Basis, monitoring automation design, or SRE roles.
3+ years of experience with SAP OCC technologies (Focused Run, Cloud ALM, or Solution Manager).
Understanding of SAP S/4HANA, BTP, and middleware (e.g., MuleSoft).
Proven track record in designing and scaling observability platforms and automation frameworks.
Proficiency in integration with ITSM and one or more enterprise-wide observability tools (e.g., ServiceNow, Grafana, Splunk, Dynatrace, Prometheus).
Formal certification in SAP Solution Manager, SAP Focused Run, or SAP Cloud ALM Operations is an added advantage.
Strong stakeholder management, including customer-facing experience.
Excellent communication and cross-functional collaboration skills.
Passion for reliability, automation, and measurable improvement.
Who we are
A healthier future drives us to innovate. Together, more than 100’000 employees across the globe are dedicated to advancing science, ensuring everyone has access to healthcare today and for generations to come. Our efforts result in more than 26 million people treated with our medicines and over 30 billion tests conducted using our Diagnostics products. We empower each other to explore new possibilities, foster creativity, and keep our ambitions high, so we can deliver life-changing healthcare solutions that make a global impact.
Let’s build a healthier future, together.
Roche is an Equal Opportunity Employer.