Nityo Infotech
We are seeking a highly skilled Senior Cloud Applications Engineer to serve as a Subject Matter Expert (SME) for distributed applications running on hybrid cloud environments.
This role plays a key part in ensuring the reliability, performance, and scalability of enterprise cloud applications while driving automation, operational excellence, and continuous improvement.
Act as SME for distributed and cloud-native applications on hybrid cloud platforms, documenting best practices and guiding engineering teams.
Drive continuous operational improvements using metrics analysis, system performance data, and customer feedback.
Lead incident management efforts, including troubleshooting, response coordination, root cause analysis, and post-incident reviews.
Communicate complex technical issues clearly to development teams and stakeholders, ensuring timely resolution and long-term remediation.
Manage, deploy, and maintain enterprise applications and cloud-based systems using secure, scalable, and highly available architectures.
Proactively monitor, troubleshoot, and optimize application health, performance, availability, and reliability.
Perform in-depth log analysis and stack trace debugging to resolve issues raised by internal teams, partners, and end users.
Create and maintain detailed documentation for operational procedures, system configurations, and environment setups.
Identify and implement automation opportunities to reduce manual effort and operational overhead.
Mentor and train junior engineers across cloud, application, and operational domains.
Participate in a 24x7 shift rotation supporting mission-critical production environments.
Bachelor's degree in Information Technology, Engineering, or a related technical field.
Minimum 5 years of experience supporting high-availability, production-grade cloud or enterprise application environments.
Strong background in automation, reliability engineering, and operational excellence.
At least 5 years of hands-on experience with 1–2 tools in each domain below:
Linux Administration & Troubleshooting:
RHEL, CentOS, Ubuntu, or similar Unix-like systems
Microservices architecture and distributed systems support
Logging & Monitoring:
Splunk, Grafana, Prometheus
PagerDuty, ServiceNow
Git, GitHub, GitLab
Certifications such as CKA, CKAD, or cloud certifications (AWS, Azure, GCP).
Experience supporting PaaS platforms, CDNs, Messaging Queues, API Gateways, and Proxies in scalable and resilient architectures.
Proven success collaborating with cross-functional teams in modern DevOps environments.
Strong scripting and automation skills using Bash, Python, or similar languages to improve operational efficiency.