DevOps Engineer
I am a highly skilled System Development / Site Reliability Engineer with over 25 years of experience in the technology industry, ensuring the reliability, performance, and security of systems and applications.
Mountain View, CA
07/2014 - 01/2024
Ensured the reliability and performance of the cloud infrastructure serving the Nest platform in both production and development environments. Coordinated staggered deployments, including ad-hoc feature releases and hotfixes. Performed migrations and resiliency efforts. Delivered deployments with little to no impact on our users. Participated in on-call rotations to triage issues and maintain 99+% uptime for our production infrastructure.
Maintained 99+% uptime for Google Nest services, serving our 10M+ customers
Optimized monitoring to improve visibility into our production environment, reducing incident response time by 25%
Migrated existing infrastructure from AWS to GCP, cutting our operating costs while increasing reliability
24/7 Customer, Inc.
Campbell, CA
05/2012 - 07/2014
Maintained the existing legacy bare-metal production infrastructure while helping to design and implement its cloud replacement.
Maintained legacy production infrastructure serving our external clients, including AT&T, Verizon, and Capital One
Helped design and test new virtual infrastructure on hypervisors
Coordinated with Dev, Ops, and QA teams to resolve critical issues in a timely manner
Network Operations Analyst / Lead Service Engineer
Tellme Networks / Microsoft Corporation
Mountain View, CA
03/2007 - 05/2012
Worked in shifts with the NOC team to manage the Network Operations Center. Supported the production infrastructure that hosted our telephony services, keeping downtime to a minimum.
Monitored and mitigated issues on an IVR platform serving multiple enterprise clients
Documented procedures and developed tools to automate everyday tasks
Trained and mentored new group members
Deployed and supported new technologies in production with little to no user impact
Worked with Datacenter Ops to repair or replace production hardware when required
Led the weekend NOC team and was responsible for all weekend coverage of our production services. Handled administrative and personnel matters for the shift.
Managed a team of four, monitoring and mitigating production issues on an IVR and web services platform
Set up scheduling for the work shifts
Compiled metrics on the group's efficiency and worked with the group to improve it
B.A. in Psychology, U.C. Berkeley