Summary
Senior infrastructure / site reliability engineer with deep experience modernizing legacy and hybrid environments through Infrastructure as Code, cloud-native identity, and Git-based workflows. Focused on reducing blast radius, removing failure-prone patterns, and enabling safer change.
Technical Skills
Cloud: GCP (GKE, Cloud Run, IAM, Workload Identity) | AWS (EKS, ECS, VPC)
Infrastructure: Terraform, Terragrunt, Ansible, GitHub Actions, ArgoCD
Operating Systems: Linux (RHEL, Ubuntu, Debian), Windows Server
Observability & Security: Datadog, CloudHealth, PagerDuty, Splunk
Datastores: Postgres, Redis, Elasticsearch
Languages & Tools: Bash, Python, Git, Docker
Experience
Site Reliability Engineer (Infrastructure / Platform)
MagMutual Insurance Company, Atlanta, GA | Apr. 2024 – Present
Joined a largely manual environment with no formal SLOs, incident metrics, or standardized infrastructure practices. Focused on building core infrastructure, identity, and change-management foundations to make the platform safer, more repeatable, and less failure-prone.
- Designed and implemented a fully isolated GCP layout for dev, QA, UAT, and production across 12 projects using Terraform and Terragrunt, removing shared project risk and long-standing configuration drift.
- Deployed and maintained 10+ GKE clusters supporting internal tooling and application workloads, including infrastructure automation with AWX and ArgoCD.
- Migrated DNS and perimeter security from manual processes to Terraform-managed GitOps pipelines using GitHub Actions, reducing change risk and improving rollback safety.
- Replaced long-lived cloud credentials with Workload Identity Federation and least-privilege IAM in a HIPAA-regulated environment, closing several high-risk access patterns.
- Built and maintained Terraform-managed site-to-site VPN and DNS peering connectivity between on-premises systems and GCP.
- Created reusable Ansible automation for Linux systems joined to Active Directory using Kerberos and GSSAPI, reducing access issues and manual server configuration across ~150 hosts.
- Identified and corrected a latent production access failure caused by unsafe filesystem permissions on a long-running production VM, preventing a permanent OSLogin lockout during a planned migration.
Senior DevOps Engineer
Red Boundary Research, Charleston, SC | Oct. 2022 – Jan. 2024
Led infrastructure and operations for a small security startup, owning AWS architecture decisions, CI/CD strategy, and observability for their endpoint agent product.
- Built and maintained infrastructure using Terraform, AWS VPCs, ECS, and related services to model diverse traffic routing and failure scenarios.
- Integrated Datadog for service monitoring and Elasticsearch for centralized log aggregation and analysis.
- Introduced CI/CD workflows to replace manual deployments, improving release reliability and reducing operational friction.
Senior DevOps Engineer
The Weather Company (IBM), Atlanta, GA | Oct. 2017 – Feb. 2021
Sole operations engineer supporting analytics and data science teams across production and QA AWS environments.
- Reduced annual AWS spend by $355K+ through EMR spot instance migration, a reserved instance strategy for the Qlik Sense cluster, and systematic right-sizing informed by CloudHealth and Trusted Advisor analysis.
- Deployed and operated Kubernetes workloads on EKS using Terraform and Helm, supporting large-scale analytics pipelines with centralized logging via Elasticsearch.
- Diagnosed and recovered a production Cassandra cluster failure using custom recovery scripts.
Education
- Master of Business Administration (M.B.A.), Coastal Carolina University, 2014
- Bachelor of Science in Chemistry, Coastal Carolina University, 2013
Projects
- Home Kubernetes cluster (Terraform/Helm)
- Multi-network Kubernetes via Headscale
- IoT automation (Home Assistant, pigeon loft with ESPHome)