Job Summary
The Role
Are you passionate about improving and ensuring the resiliency of technology? Do you get your energy from delivering technology solutions as part of a team? We are seeking an experienced Production Support Engineer (SRE) who is curious and derives insights from massive-scale data in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to investigate and help resolve recurring and major issues and to improve the performance of our supported applications. This role requires participation in an on-call rotation.
Job Description
What is the Opportunity?
We are seeking a highly skilled and motivated Level 2 Platform Support and SRE Specialist to join our team in Toronto. This role combines platform support expertise with Site Reliability Engineering (SRE) principles to ensure the stability, scalability, and reliability of our systems. The ideal candidate will bring strong troubleshooting skills, a passion for automation, and the ability to collaborate across teams to improve platform performance and operational efficiency.
What Will You Do?
Platform Support (Level 2):
- Provide second-level support for critical platform-related incidents, ensuring timely resolution and minimal downtime.
- Monitor, investigate, and resolve platform issues using tools such as Grafana, Prometheus, Dynatrace, Splunk, Datadog, or similar monitoring solutions.
- Act as an escalation point for Level 1 support teams, providing mentorship and guidance as needed.
- Maintain and update runbooks, knowledge bases, and support documentation to streamline issue resolution.
- Manage and coordinate incident response activities to ensure efficient communication, resolution, and post-incident follow-ups.
Site Reliability Engineering (SRE):
- Develop and implement automation scripts and tools to reduce manual operational tasks and improve system reliability.
- Collaborate with development teams to identify and address system bottlenecks, inefficiencies, and vulnerabilities.
- Utilize Infrastructure-as-Code (IaC) tools (e.g., Terraform, Ansible) to manage and provision infrastructure.
- Design and maintain CI/CD pipelines to ensure smooth and reliable software deployments.
- Participate in post-mortem analyses to identify root causes and implement preventive measures.
Collaboration and Continuous Improvement:
- Work closely with cross-functional teams, including DevOps, development, and on-call support teams, to enhance system performance.
- Advocate for SRE best practices, including observability, monitoring, and incident management.
- Drive continuous improvement initiatives by identifying opportunities to optimize platform operations and reduce Mean Time to Recovery (MTTR).
Technical Skills:
- Proficiency in scripting languages (e.g., Python, Bash, PowerShell).
- Hands-on experience with monitoring tools (e.g., Splunk, Prometheus, Grafana).
- Familiarity with cloud and container platforms (e.g., OCP, Kubernetes, AWS, Azure).
- Strong understanding of CI/CD pipelines and related tools (e.g., Jenkins, GitLab, GitHub Actions).
Soft Skills:
- Strong communication skills, ability to work collaboratively in a team environment, and a proactive mindset.
What Do You Need To Succeed?
Must Have:
- 3-7 years of experience in platform/application support, with at least 2 years of exposure to SRE or DevOps practices.
Nice to Have:
- Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Understanding of the Five Pillars of Cloud.
- Experience with incident management tools (e.g., PagerDuty, Opsgenie).
- Knowledge of ITIL processes and best practices.
- Previous experience in a hybrid cloud environment or large-scale distributed systems.
What’s in it for you?
We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.
- A comprehensive Total Rewards Program including bonuses and flexible benefits
- Leaders who support your development through coaching and managing opportunities
- Ability to make a difference and lasting impact
- Work in a dynamic, collaborative, progressive, and high-performing team
- Flexible work/life balance options
Additional Job Details
Address: WATERFRONT CENTRE, 200 BURRARD ST:VANCOUVER
City: VANCOUVER
Country: Canada
Work hours/week: 37.5
Employment Type: Full time
Platform: WEALTH MANAGEMENT
Job Type: Regular
Pay Type: Salaried
Posted Date: 2025-05-13
Application Deadline: 2025-08-31
Note: Applications will be accepted until 11:59 PM on the day prior to the application deadline date above.
Inclusion and Equal Opportunity Employment
At RBC, we believe an inclusive workplace that has diverse perspectives is core to our continued growth as one of the largest and most successful banks in the world. Maintaining a workplace where our employees feel supported to perform at their best, effectively collaborate, drive innovation, and grow professionally helps to bring our Purpose to life and create value for our clients and communities. RBC strives to deliver this through policies and programs intended to foster a workplace based on respect, belonging and opportunity for all.
Join our Talent Community
Stay in the know about great career opportunities at RBC. Sign up and get customized info on our latest jobs, career tips and recruitment events that matter to you.
Expand your limits and create a new future together at RBC. Find out how we use our passion and drive to enhance the well-being of our clients and communities at jobs.rbc.com.