Site Reliability Engineer (SRE)
Oracle
Job Description:
Are you passionate about solving complex distributed systems challenges at scale? Join Oracle as a Site Reliability Engineer and help shape the reliability, scalability, and performance of Oracle Cloud Infrastructure (OCI). As part of the Site Reliability Engineering (SRE) team, you’ll contribute to designing, automating, and evolving mission-critical systems. You'll combine deep systems expertise with modern software engineering practices to reduce operational toil and build resilient, self-healing services.
This is a high-impact role where your work directly affects the reliability of cloud services used by thousands of customers around the world.
What We’re Looking For:
Advanced Linux systems administration
Strong coding skills in Python (automation-focused)
Intermediate experience with Bash/Shell scripting
Familiarity with networking principles and distributed systems behavior
Basic to intermediate knowledge of databases (e.g., SQL, NoSQL)
Understanding of unit testing and modern software engineering practices
Experience with CI/CD pipelines and deployment automation
Comfortable working in Agile development environments
Nice to Have:
Exposure to monitoring/observability tools (e.g., Prometheus, Grafana, New Relic)
Experience building internal tools for operational efficiency
Participation in SRE culture: blameless postmortems, runbooks, and service design reviews