Site Reliability Engineering
Site Reliability Engineer (SRE) with 3+ years in infrastructure automation (Ansible), cloud (AWS), and observability (Dynatrace, ELK Stack). Proven track record of reducing downtime by 25% and automating 70% of manual tasks through self-healing systems. Passionate about building scalable, resilient cloud infrastructure for Qatar's dynamic tech landscape. Holding a valid Qatar work visa.
✓ Infrastructure as Code (IaC) & Cloud Automation (Ansible, Terraform, AWS)
✓ Observability & APM (Dynatrace, ELK Stack, CloudWatch Alerts)
✓ Self-Healing Systems & Auto-Remediation (Ansible Playbooks, Dynatrace APIs)
✓ Linux/Windows Administration & DevOps Practices (VMware, CI/CD Pipelines)
✓ Security & Compliance (SSH Hardening, IAM, Risk Mitigation)
February 2024 – July 2025 | Mangalore, India
Installed and configured Dynatrace OneAgents across Linux/Windows environments.
Implemented Real User Monitoring (RUM), Synthetic Monitoring, and Network Monitoring.
Used Davis AI and Session Replay for root cause analysis and performance optimization.
Managed log ingestion, processing rules, and visualization using Dynatrace Log Monitoring.
Built custom dashboards, alerts, and SLOs using DQL to monitor application performance and reliability.
Set up Management Zones, custom tags, and Business Transactions for role-based access and segmented monitoring.
Used Dynatrace Configuration API to automate dashboard creation, alert policies, and tagging rules.
Used Dynatrace for application performance monitoring and analysis, resulting in improved system reliability and reduced downtime.
Developed and maintained Ansible playbooks and roles for patching, service restarts, and infrastructure provisioning.
August 2022 – January 2024 | Mumbai, India
Created custom dashboards, alerts, and reports to monitor application performance, infrastructure health, and business transactions.
Set up Management Zones, custom tags, and Business Transactions for role-based monitoring and segmentation.
Configured synthetic monitoring, real user monitoring (RUM), and network monitoring.
Collaborated with developers to resolve root causes of critical issues, minimizing downtime.
Analyzed performance issues using PurePath, Davis AI, and Session Replay.
Ansible Automation Responsibilities (Auto-Remediation Playbooks):
Designed Kibana dashboards, reducing troubleshooting time by 30%.
Bachelor of Engineering (B.E) in Computer Science,
Visvesvaraya Technological University (VTU)
CGPA: 7.3/10 (70%) | (Degree attested by Qatar Embassy)