Associate – AWS Cloud Infrastructure Engineer | Cognizant Technology Solutions | Oct 2021 – Present
Client: VodafoneZiggo, Netherlands (Fortune 500 Telecom) | Environment: Multi-account AWS (Dev / Acceptance / Production)
Cloud Infrastructure Management & High Availability
- Managed enterprise-scale, multi-account AWS infrastructure across Dev, Acceptance, and Production environments, maintaining 99.9% SLA uptime for a telecom platform serving millions of end users.
- Administered 15+ AWS services (including EC2, S3, RDS, EMR, EKS, Lambda, IAM, Route 53, ALB, and VPC Endpoints) using AWS CDK (TypeScript) and CloudFormation, keeping deployed resources free of manual drift from IaC and aligned with AWS Well-Architected Framework principles.
- Led the end-to-end VPC migration of multiple applications from a legacy VPC to a Transit Gateway-based multi-VPC architecture using CloudFormation stacks, migrating all associated resources (ALB, EC2, Security Groups, Route 53, and RDS) with zero downtime across all three environments.
- Designed and enforced high-availability and disaster recovery strategies across 3-tier environments, ensuring business continuity for critical telecom workloads.
- Created and managed VPC Endpoints for private, secure service connectivity; managed Security Groups and IAM policies to enforce least-privilege access across all environments.
Linux Administration & Patching
- Administered Linux-based EC2 instances across Dev, Acceptance, and Production environments, including OS configuration, performance tuning, and service management.
- Remediated OS-level vulnerabilities on Linux servers identified by Qualys agent scans, scheduling reboots within approved maintenance windows to maintain compliance SLAs.
CI/CD & Infrastructure as Code
- Structured modular IaC repositories (IAM, S3, Security Groups, VPC), enabling reusable, version-controlled infrastructure changes with a full audit trail via Merge Requests and code owner approvals.
- Designed and maintained GitLab CI/CD pipelines for all infrastructure deployments, reducing average deployment time by 30% and eliminating manual intervention errors.
Scripting & Automation
- Developed Python-based AWS Lambda automation scripts to eliminate repetitive manual operational tasks, improving team efficiency and minimising human error in routine workflows.
- Automated a recurring weekly DMS (Database Migration Service) activity using Python Lambda functions, converting an after-hours manual process into a fully automated, scheduled pipeline.
- Built EMR cluster auto-stop automation using a schedule-triggered Python Lambda that shuts down clusters idle for 1 hour, directly reducing monthly cloud spend in Dev environments.
FinOps & Cost Optimisation
- Reduced AWS infrastructure costs by 15% through FinOps analysis, rightsizing recommendations, and automated EMR idle-cluster shutdown scripts.
- Delivered recurring cost-saving strategies, including decommissioning long-stopped EC2 instances, cleaning up incomplete S3 multipart uploads, and rightsizing resources, contributing to ongoing cloud cost governance.
Kubernetes, EKS & Observability (Central Insight Monitoring)
- Managed the full CIM observability stack (Prometheus, Grafana, Alertmanager, and Thanos), deployed via Helm charts across EKS clusters, enabling end-to-end monitoring for production telecom workloads.
- Troubleshot pod failures, container restarts, and alert delivery pipeline issues across observability components, reducing mean time to detect (MTTD) on infrastructure incidents.
EMR & Big Data Platform Support
- Supported EMR clusters running Spark, Oozie, Hue, and YARN workloads on Linux nodes; monitored spot, core, and task node health to prevent job failures and data pipeline disruptions.
- Investigated and resolved EMR application incidents, including cluster failures and node health degradations, using CloudWatch logs and EMR console diagnostics.
Security, Compliance & Incident Management
- Maintained 100% audit compliance through proactive vulnerability remediation on Linux EC2 instances using Qualys Agent; investigated suspicious activities via CloudTrail and CloudWatch audit logs.
- Resolved critical infrastructure incidents including application downtime and EMR failures using JIRA-based workflows; performed root cause analysis (RCA) and presented findings to client stakeholders.
- Implemented IAM policy governance and access control reviews, reducing privilege escalation risk across production environments in alignment with least-privilege principles.
Database Operations
- Performed RDS version upgrades, storage scaling, and quarterly Oracle DB refreshes (Prod-to-Acceptance cloning), reducing environment provisioning effort by ~40%.
- Investigated suspicious database access events using CloudTrail logs.