Soundarya Km

ASSISTANT SYSTEM ENGINEER - IOPEX TECHNOLOGIES - BANGALORE

(2024-05)

Incident Management & Operations - RRPS Project

Spearheaded P0–P3 incident management across global environments, maintaining 95%+ SLA compliance and minimizing service disruption.
Commanded critical outages as Incident Commander, driving cross-functional coordination and reducing MTTR by 20%.
Evaluated and prioritized large-scale disruptions impacting 20%+ of customers, enabling faster, better-informed recovery decisions.
Communicated real-time updates to stakeholders and leadership, ensuring 100% adherence to communication SLAs and issuing Cloud Availability Notifications within 30–60 minutes.
Managed 30+ incidents/month, ensuring 10-minute acknowledgment SLO compliance and timely SME engagement.
Automated incident workflows using Slack bots, improving response time by 25% and enhancing visibility across teams.
Led post-incident reviews (PIRs), ensuring 100% RCA completion and driving continuous improvement actions.
Reduced repeat incident volume by ~15% by identifying recurring issues and implementing preventive measures through RCA insights.
Transitioned 10+ critical incidents/month to full CINC, ensuring seamless handover and reducing resolution delays by ~15%.
Governed incident lifecycle compliance, achieving 100% closure within 72-hour SLO and reducing backlog by ~20%.

ASSISTANT SYSTEM ENGINEER - IOPEX TECHNOLOGIES - BANGALORE

(2024-05)

Splunk Applications & Support - Kaleidoscope Project

Owned end-to-end support for Splunk applications and add-ons, resolving 5+ tickets/week across data ingestion, parsing, and search performance.
Oversaw full application lifecycle (installation, configuration, upgrades, patching, decommissioning) across multiple Splunk environments.
Troubleshot and resolved complex technical issues, ensuring 95%+ SLA adherence with minimal service downtime.
Optimized SPL queries for log analysis, improving search performance by ~15%.
Designed and maintained 10+ dashboards and alerts, enhancing monitoring, visibility, and operational insights.
Configured data onboarding pipelines, ensuring accurate, consistent, and reliable data flow.
Reduced alert noise by ~20% through threshold tuning, alert optimization, and misfire resolution.
Partnered with Tier 2/3 teams to drive root cause analysis (RCA) and implement preventive fixes for recurring issues.
Refined knowledge objects (fields, lookups, dashboards), improving data usability and reporting accuracy by ~20%.
