Durgesh Singh

DevOps/Cloud Engineer

Talent

Noida, Gautam Buddha Nagar

Member since March 10, 2025

About

Cloud & DevOps Engineer with expertise in AWS, OpenStack, OpenShift, Linux, Kubernetes, and CI/CD automation. Adept at designing and managing cloud infrastructure, monitoring solutions, and containerized deployments. Passionate about DevOps automation, cloud security, and site reliability engineering (SRE).

Experience

OpenStack Infrastructure Management

• Managed OpenStack infrastructure across compute, storage, and networking services, ensuring 24/7 availability.

• Integrated monitoring tools like Grafana, Kibana, and Prometheus for 70+ Pan-India sites.

• Expanded compute, KVM, and Ceph nodes on live sites, following proper CR and incident timelines.

• Configured and monitored virtual machine (VM) health and performance using Prometheus and custom exporters.

• Performed OpenStack upgrades and migrations with minimal downtime, ensuring seamless service continuity.

• Automated routine tasks and system upgrades to enhance operational efficiency.

• Collaborated with cross-functional teams to troubleshoot performance bottlenecks in Nova, Neutron, and Cinder services.

• Validated MBSS documents for the H-Cloud Project while implementing security hardening concepts.

• Managed and configured backups for all sites using the Commvault backup tool.

Monitoring and Observability

• Deployed and maintained monitoring tools (Prometheus, Thanos, Grafana) for real-time observability and long-term metrics storage.

• Integrated Thanos for scalable and centralized metric queries across distributed Prometheus instances.

• Used the Prometheus query language (PromQL) and Kibana to support AI/ML use cases.

• Performed all testing and alert configuration for site expansions and new node validations.

• Created custom dashboards and set up alerts for critical infrastructure metrics.

Site Expansion Projects

• Led site expansion projects by provisioning compute and storage resources using Heat templates and manual orchestration.

• Validated and tested newly expanded nodes to ensure optimal performance and reliability.

Incident and SLA Management

• Created and resolved TTs (Trouble Tickets) and CRs (Change Requests) for all issues and activities, ensuring SLA compliance.

• Monitored and ensured the timely delivery of health checkup reports using Python and Linux scripting.

OpenShift and Container Management

• Managed multiple projects within OpenShift, handling resource allocation and scaling for services and pods.

• Created, updated, and deleted ConfigMaps and Secrets to manage dynamic application configurations.

• Integrated Prometheus and Grafana for real-time monitoring and improved cluster observability.

• Troubleshot pod failures and network issues, achieving 99.9% application uptime.

• Monitored pod health and resolved performance issues using OpenShift Web Console and CLI tools.

• Contributed to automation scripts for container builds, deployment rollbacks, and log collection.

Data Analytics and Reporting

• Integrated multiple network components in the NLM project, applying data analytics and Excel skills to Logstash and Elasticsearch data.

• Utilized advanced data analysis techniques to ensure accurate insights and system performance.
