Summary

Senior infrastructure / site reliability engineer with deep experience modernizing legacy and hybrid environments through Infrastructure as Code, cloud-native identity, and Git-based workflows. Focused on reducing blast radius, removing failure-prone patterns, and enabling safer change.


Technical Skills

Cloud: GCP (GKE, Cloud Run, IAM, Workload Identity) | AWS (EKS, ECS, VPC)
Infrastructure: Terraform, Terragrunt, Ansible, GitHub Actions, ArgoCD
Operating Systems: Linux (RHEL, Ubuntu, Debian), Windows Server
Observability & Security: Datadog, CloudHealth, PagerDuty, Splunk
Datastores: Postgres, Redis, Elasticsearch
Languages & Tools: Bash, Python, Git, Docker


Experience

Site Reliability Engineer (Infrastructure / Platform)

MagMutual Insurance Company, Atlanta, GA | Apr. 2024 – Present

Joined a largely manual environment with no formal SLOs, incident metrics, or standardized infrastructure practices. Focused on building core infrastructure, identity, and change-management foundations to make the platform safer, more repeatable, and less failure-prone.
  • Designed and implemented a fully isolated GCP layout for dev, QA, UAT, and production across 12 projects using Terraform and Terragrunt, removing shared project risk and long-standing configuration drift.
  • Deployed and maintained 10+ GKE clusters supporting internal tooling and application workloads, including infrastructure automation with AWX and ArgoCD.
  • Migrated DNS and perimeter security from manual processes to Terraform-managed GitOps pipelines using GitHub Actions, reducing change risk and improving rollback safety.
  • Replaced long-lived cloud credentials with Workload Identity Federation and least-privilege IAM in a HIPAA-regulated environment, closing several high-risk access patterns.
  • Built and maintained Terraform-managed site-to-site VPN and DNS peering connectivity between on-premises systems and GCP.
  • Created reusable Ansible automation for Linux systems joined to Active Directory using Kerberos and GSSAPI, reducing access issues and manual server configuration across ~150 hosts.
  • Identified and corrected a latent production access failure caused by unsafe filesystem permissions on a long-running production VM, preventing a permanent OS Login lockout during a planned migration.

Senior DevOps Engineer

Red Boundary Research, Charleston, SC | Oct. 2022 – Jan. 2024

Led infrastructure and operations for a small security startup, owning AWS architecture decisions, CI/CD strategy, and observability for its endpoint agent product.
  • Built and maintained infrastructure using Terraform, AWS VPCs, ECS, and related services to model diverse traffic routing and failure scenarios.
  • Integrated Datadog for service monitoring and Elasticsearch for centralized log aggregation and analysis.
  • Introduced CI/CD workflows to replace manual deployments, improving release reliability and reducing operational friction.

Senior DevOps Engineer

The Weather Company (IBM), Atlanta, GA | Oct. 2017 – Feb. 2021

Sole operations engineer supporting analytics and data science teams across production and QA AWS environments.
  • Reduced annual AWS spend by $355K+ through EMR spot instance migration, a reserved instance strategy for the Qlik Sense cluster, and systematic right-sizing informed by CloudHealth and Trusted Advisor analysis.
  • Deployed and operated Kubernetes workloads on EKS using Terraform and Helm, supporting large-scale analytics pipelines with centralized logging via Elasticsearch.
  • Diagnosed and recovered a production Cassandra cluster failure using custom recovery scripts.

Education

  • Master of Business Administration (M.B.A.), Coastal Carolina University, 2014
  • Bachelor of Science in Chemistry, Coastal Carolina University, 2013

Projects

  • Home Kubernetes cluster (Terraform/Helm)
  • Multi-network Kubernetes via Headscale
  • IoT automation (Home Assistant, pigeon loft with ESPHome)
Updated 2026-02-08