Sarah Murphy
sarah@gitupandgo.com · San Francisco, CA · github.com/sarahmaeve
An experienced site reliability engineer and generalist technologist focused on model-first development and production stability practices.
Experience
Experimental Technologist, LLM / AI · Feb 2026 – Present
Consulting / Self
- Built a proof-of-concept French language learning platform (rusty-french.com) for B1+ learners using generated translations and audio. Uses a Rust toolchain for HTML templating, unit tests, CSP validation, content feature flags, analytics, and mobile site performance.
- Designed a series of “unspooling” exercises, generated with the Claude LLM, that demonstrate Rust concepts.
Principal Site Reliability Engineer · Oct 2021 – Jul 2022
Treasure Financial
- Automated deployments for GCP-hosted services, moving from a sprint-based structure (2+ weeks) to a daily launch cadence for all modified services as well as hotfixes on demand.
- As the lead individual contributor in a seed-stage startup, took charge of on-call duties, stabilized infrastructure, and helped recruit and hire a developer team.
- Converted Kubernetes configurations into infrastructure as code, enabling scalability and streamlining regulatory compliance.
- Deployed Honeycomb for improved observability, restructured alert systems, and streamlined on-call rotations.
Kubernetes, CI/CD, Python, GCP, GKE, UNIX, PagerDuty, Helm, Flux, ArgoCD
Sr. Staff Site Reliability Engineer · Oct 2020 – Sep 2021
WePay
- Restructured the postmortem process to avoid repeated incidents.
- Held key roles in site and incident commander on-call rotations; expanded and formalized rotations, ensuring robust failovers and more flexible systems.
- Mentored junior and mid-level SRE candidates on process changes, code reviews, and on-call participation.
Kubernetes, Python, GCP
Principal Site Reliability Engineer · Jun 2018 – Aug 2019
Microsoft
- Advocated for Silicon Valley-style site reliability and production methods across all of Microsoft Azure.
- Presented tech talks on release engineering and site reliability, and consulted with Azure teams globally.
Azure, SRE, Python, Go
Infrastructure Engineer · Oct 2016 – Jul 2017
Lever
- Repaired Chef automation of servers and added Test Kitchen and CI/CD support, transitioning from manual configs to Terraform for enhanced reliability.
- Moved releases from a manual system to rapid ChatOps deployments.
Chef, EC2, AWS, CI/CD, ChatOps, Elasticsearch, Python, Ansible, SOC2
Senior Release Engineer · Sep 2012 – Sep 2015
Google
- Stabilized Google Play releases, automating builds and reducing deployment time from 20+ hours to 4 hours.
- Provided pre-launch release engineering consulting for Google Fi phone service, acting as SRE and RE for Android Metrics.
Python, Go, automation, Borg, UNIX, bash, awk, Kubernetes, TCP/IP
Release Engineer · May 2010 – Sep 2011
Facebook
- Performed releases at pre-IPO Facebook, deploying multiple times daily for a product with 450M to 800M daily active users.
- Held 24/7 on-call duties with high visibility and responsibility for the site’s operation.
Release Engineer · Aug 2006 – Mar 2009
Google
- Deployed the web front-end for Google Search (GWS / google.com), participating in the on-call rotation.
- Automated and deployed Google Ads, the product responsible for the bulk of Google’s revenue.
Other Interests
- Historical (14th–17th century) fencing with longsword and side sword
- Former unilateral freelance journalist in North Africa and SW Asia, 2002–2005
- French, Spanish, and other natural language interests