Fort Worth, TX, 76196, USA
12 hours ago
Senior Monitoring Engineer
We’re seeking a **Senior Monitoring Engineer** to join a high‑performing Monitoring Engineering team in a fast‑paced finance technology organization. You’ll design, develop, and maintain monitoring and observability solutions that keep core applications and infrastructure healthy and visible. In close partnership with application, platform, and development teams, you will implement alerting systems, dashboards, correlations, and automation, driving reliability, reducing MTTR, and elevating operational awareness. Critical thinking, system analysis, and proactive troubleshooting are essential to success in this role.

**Key Responsibilities**

**Design, Build, and Maintain Monitoring & Observability Solutions**

+ Develop and maintain **instrumentation, telemetry, and alerting** for the Enterprise Monitoring Center using industry‑leading tools, such as:
  + Grafana
  + OpsRamp
  + AppDynamics
  + Elastic Stack
  + BigPanda
  + AWS CloudWatch
  + Azure Monitor
+ Implement **Observability best practices**, ensuring comprehensive coverage of **metrics, logs, and traces** across critical systems.
+ Integrate and manage **OpenTelemetry** for distributed tracing and telemetry data collection, enabling end‑to‑end visibility of business‑critical transactions.

**Collaboration & Project Participation**

+ Collaborate with application development teams to **define and document observability requirements** for each project or release.
+ Participate in complex initiatives, ensuring accurate and actionable monitoring and tracing are in place for every step of business‑critical workflows.

**Alerting & Escalation Process**

+ Define and maintain **standardized alert payloads** per engineering guidelines, ensuring alerts are **actionable**.
+ Partner with Level 2 and Level 3 support teams to reflect process changes in monitoring dashboards.
+ Maintain and optimize **thresholds**, ensuring seamless **escalations** via **BigPanda** as the central alert hub.

**Dashboard Creation & Maintenance**

+ Create and maintain **intuitive, actionable dashboards** for the Enterprise Monitoring Center and other finance teams.
+ Ensure dashboards are effectively **monitored by Level 1 teams**, presenting clear, actionable data that **reduces MTTR**.

**System Validation, Documentation & Automation**

+ Develop and maintain **automation scripts** to enhance monitoring efficiency and improve team quality of life.
+ Proactively identify **process improvements** and learning opportunities; drive **continuous improvement**.

**Automation & Quality‑of‑Life Improvements**

+ Contribute to the **automation of monitoring, alerting, and operational tasks** to streamline workflows and improve overall system reliability.

**Qualifications**

**Education**

Bachelor’s in Computer Science, IT, or related field.

**Experience**

+ **Minimum 4 years** in a technology organization, with **≥1 year** hands‑on **engineering experience** in **monitoring or production operations**.

**Required Skills**

+ Strong experience **developing instrumentation and alerting** for **large, complex** environments.
+ Expertise in **≥4** of the following: **OpsRamp, Grafana, AppDynamics, Elastic Stack, InfluxDB, BigPanda**, and other monitoring solutions.
+ **Hands-on experience with Observability concepts and frameworks**, including **metrics, logs, and traces**.
+ **Working knowledge of OpenTelemetry** for distributed tracing and telemetry data collection.
+ Experience with **dashboard creation**, **alert management**, and **tool configuration**.
+ Excellent **verbal and written communication**; able to present complex technical issues to both technical and non‑technical stakeholders.
+ Strong **problem‑solving and troubleshooting** in **high‑pressure** environments.
+ Ability to **prioritize and manage multiple tasks** in a **deadline‑driven** setting.
+ Proven collaboration with **cross‑functional teams** in **large, complex IT environments**.
+ Experience with **scripting** (e.g., **Bash**, **PowerShell**) and proficiency in **one programming language** (e.g., **Python**, **C family**, **JavaScript**).
+ Experience designing and implementing **scalable, reliable** monitoring solutions.
+ Experience with agile software development methodologies.
+ Familiarity with problem diagnosis, performance tuning, capacity planning, and configuration management across the stack, with a focus on continuous improvement.

**Preferred Qualifications**

+ Experience **querying, manipulating, and visualizing time‑series** data.
+ Familiarity with **Infrastructure as Code** tools (e.g., **Ansible**, **Terraform**).
+ Strong understanding of how to create **actionable, digestible visualizations** for **Level 1 monitoring** teams.
+ Working knowledge of **REST APIs**, **JSON**, and **ServiceNow**.
+ Experience with **cloud monitoring**, particularly **AWS** or **Azure**.

**Who We Are**

OneMain Financial (NYSE: OMF) is the leader in offering nonprime customers responsible access to credit and is dedicated to improving the financial well-being of hardworking Americans. Since 1912, we’ve looked beyond credit scores to help people get the money they need today and reach their goals for tomorrow. Our growing suite of personal loans, credit cards and other products helps people borrow better and work toward a brighter future.

Driven collaborators and innovators, our team thrives on transformative digital thinking, customer-first energy and flexible work arrangements that grow lives, careers and our company. At every level, we’re committed to an inclusive culture, career development and impacting the communities where we live and work. Getting people to a better place has made us a better company for over a century. There’s never been a better time to shine with OneMain.

Because team members at their best means OneMain at our best, we provide opportunities and benefits that make their health and careers a priority.
That’s why we’ve packed our comprehensive benefits package for full- and some part-timers with:

+ Health and wellbeing options including medical, prescription, dental, vision, hearing, accident, hospital indemnity, and life insurance
+ Up to 4% matching 401(k)
+ Employee Stock Purchase Plan (10% share discount)
+ Tuition reimbursement
+ Paid time off (15 days’ vacation per year, plus 2 personal days, prorated based on start date)
+ Paid sick leave as determined by state or local ordinance, prorated based on start date
+ Paid holidays (7 days per year, based on start date)
+ Paid volunteer time (3 days per year, prorated based on start date)

OneMain Holdings, Inc. is an Equal Employment Opportunity (EEO) and Affirmative Action (AA) employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status.