Site Reliability Professional (DB2 LUW)
IBM
**Introduction**
At IBM Software, we transform client challenges into solutions, building the world’s leading AI-powered, cloud-native products that shape the future of business and society. Our legacy of innovation creates endless opportunities for IBMers to learn, grow, and make an impact on a global scale. Working in Software means joining a team fueled by curiosity and collaboration. You’ll work with diverse technologies, partners, and industries to design, develop, and deliver solutions that power digital transformation. With a culture that values innovation, growth, and continuous learning, IBM Software places you at the heart of IBM’s product and technology landscape. Here, you’ll have the tools and opportunities to advance your career while creating software that changes the world.
**Your role and responsibilities**
We are looking for an IBM DB2 LUW Database Reliability Professional to design, build, and operate highly available and resilient database systems supporting business-critical applications. The ideal candidate combines deep expertise in IBM DB2 LUW with strong SRE principles, focusing on automation, observability, and performance optimization across hybrid and IBM Cloud environments.
* Manage and optimize DB2 LUW instances across multiple environments (dev/test/prod) with a focus on availability, scalability, and performance.
* Implement Site Reliability Engineering (SRE) practices and proactive monitoring for database platforms.
* Automate database provisioning, configuration, and maintenance tasks using Ansible and shell scripting.
* Design and maintain HA/DR configurations (HADR, Pacemaker, Q-Replication, etc.) ensuring zero data loss and minimal downtime.
* Build and operate DB2 on IBM Cloud and other hybrid cloud platforms, ensuring compliance with security and performance standards.
* Integrate DB2 operational metrics into observability stacks (e.g., Instana, Prometheus, Grafana).
* Conduct performance tuning, query optimization, and capacity planning to meet SLAs and prevent incidents.
* Support incident response, root cause analysis (RCA), and continuous improvement efforts to enhance system reliability.
* Maintain comprehensive runbooks, automated playbooks, and operational documentation for all database services.
**Required technical and professional expertise**
* 8+ years of experience as a DB2 LUW Database Administrator.
* Proven expertise in DB2 administration, backup/recovery, performance tuning, and troubleshooting.
* Solid understanding of Linux/Unix systems, networking, and security principles.
* Strong automation skills using Ansible and shell scripting, or equivalent tools.
* Experience implementing HADR, Pacemaker, Replication, and DB2 clustering solutions.
* Familiarity with infrastructure-as-code (IaC) concepts and configuration management.
* Hands-on experience with monitoring, alerting, and observability tools in SRE environments.
**Preferred technical and professional experience**
* Analytical mindset with strong troubleshooting and performance optimization skills.
* Collaborative and proactive in driving reliability initiatives across teams.
* Excellent communication, documentation, and mentoring abilities.
* Strong sense of ownership and accountability for system uptime and performance.
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.