Site Reliability Engineering (SRE) is an engineering discipline that combines software development and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for the availability and reliability of our firm's most critica