DevOps Engineer / Sr. DevOps Engineer

Technology
TechBlocks
Gurugram, India
2 months ago
Until 19/4/2026
Full time
On-site

Job description

  • Roles & Responsibilities:
  • Duties & Accountabilities:
  • Provide second line client-facing technical support for issues escalated by first line support teams.
  • Apply strong technical skills and sound business knowledge, together with investigative and problem-solving skills, to identify and resolve issues efficiently and in a timely manner.
  • Work collaboratively with the development team as required for third-line escalation.
  • Coordinate with product and delivery teams to ensure the Service Management team is ready for new releases and engaged in early design of new enhancements.
  • Work on initiatives and continuous improvement process around proactive application health monitoring, reporting, and technical support.
  • Apply AI/ML techniques to detect anomalies, enable predictive alerting, and enhance support operations.
  • Key Areas of the Team's Responsibilities:
  • Proactive monitoring and management of business-critical, 24x7 real-time applications, rectifying issues in a timely fashion where required to restore application functionality.
  • Ensure incidents are correctly processed, assessing business and technical impact and severity.
  • Taking ownership of application incidents and ensuring that they are resolved; this includes retaining ownership of incidents that require 3rd Line or IT Change activity to resolve.
  • Ensuring that communication with the business community remains active.
  • Application responsibilities will cover Application Infrastructure, Data Fixes, User Queries, User Education, and Incident Investigation.
  • Monitoring of application events, alerts, job schedules, capacity monitors, and performance KPIs. Creation and ownership of change requests raised to address any of the above issues.
  • Proactively share knowledge with the team and update the knowledge base with support documentation (Confluence).
  • Work to provide services to agreed Service Level Targets and Operating Level Agreements.
  • Leverage AIOps techniques to analyse logs, metrics, traces, and event data, enabling proactive trend identification and continuous optimization of system performance.
  • Education and Hands-on Experience Required:
  • Preferably 4+ years of direct experience in Site Reliability Engineering or DevOps roles, covering high availability and incident response in AWS, Azure, or GCP.
  • Proficiency with cloud computing environments (AWS / GCP / Azure).
  • Good understanding of Application Support processes.
  • Ideally familiar with monitoring tools such as Splunk, CloudWatch, Dotcom, and Monolith.
  • Expertise in Oracle SQL/PostgreSQL: proficiency in advanced SQL techniques, query optimization, and experience with complex database systems.
  • Experience with advanced observability tools (e.g., Prometheus, Grafana, Splunk) for monitoring, logging, and tracing.
  • Experience in leading post-mortem analyses and implementing preventative measures to avoid recurrence of incidents.
  • Excellent problem-solving skills and the capacity to lead effectively under pressure during incident response and outage management.
  • Must understand operating systems, especially Windows and Linux. Good scripting experience (preferably including Python) is an advantage.
Keywords
confluence, amazon-web-services, azure-devops, microsoft-azure, google-cloud-platform, splunk, amazon-cloudwatch, oracle, postgresql, prometheus, grafana, windows, python
