SUMMARY:

Staff-level engineer with deep expertise in cloud infrastructure, Kubernetes, and platform reliability, driving stability and scalability across high-growth environments
Proven track record in DevOps, monitoring, observability, and incident response, with direct ownership of P0/P1 operational management and secure systems architecture
Strong experience leading cross-functional platform and SRE teams, mentoring engineers into senior roles, and scaling organizations through hiring and process improvements
Skilled in consolidating disjointed systems into modern, best-practice architectures that reduce operational complexity and improve reliability
Adept at vendor management, compliance (SOC 2, PCI, HIPAA), and security best practices, ensuring infrastructure meets both regulatory and business needs
Effective communicator and collaborator across engineering, product, operations, and executive leadership, representing engineering in strategic initiatives and acquisition due diligence

EXPERIENCE:

Staff Software Engineer, Platform

Atelio (by FIS) (via acq. of Bond Financial Technologies), USA (Remote)

06/2021 - 02/2025

Responsibilities:

Responsible for the operation and maintenance of our cloud infrastructure, reliability, observability, and monitoring
Oversaw and managed (player/coach methodology) the ongoing maintenance of Kubernetes, Terraform, and cloud resources
Owned organization-wide incident response plans and procedures. Primary escalation point for all P1/P0 incidents
Provided mentorship and guidance to junior and mid-level engineers for effective management of projects with appropriate prioritization and communication
Conducted research and comparative analysis of potential software vendors, and made build-vs-buy decisions
Managed relationships and contracts with external software vendors (AWS, Fastly, Datadog, VGS, StrongDM, Vanta)
Represented the Engineering organization at organization-wide leadership meetings during the post-acquisition period
Collaborated cross-functionally with product teams to deprecate redundant systems and duplicated functionality, reducing operational complexity
Reconciled existing infrastructure and tooling into appropriate Terraform projects
Assisted and supplemented Product Engineering with new feature development based on priorities and required timelines

Key Accomplishments:

Architected and oversaw the consolidation of disjoint application microservices into a modern and best-practice application, reducing data workflow failures by ~95%
Grew Platform organization to 4 full-time engineers + 2 additional sub-teams (IT Management and Technical Escalation Engineering)
Spearheaded cross-functional initiatives with Product Engineering, Operations, and Leadership to bootstrap new organizations for IT Management and Technical Escalation Engineering responsibilities
Coordinated and managed the hiring process of 12 Senior+ Engineers onto various Product Engineering teams between February and April 2025
Coordinated with internal and external security contacts on the acquisition, management, and maintenance of SOC 2 Type 2 certification
Architected the project plan for, and provided ongoing implementation guidance on, updated secure data encryption and storage across all application and data storage systems
Introduced and educated the engineering team on the use of Datadog for observability and monitoring
Coordinated with stakeholders across multiple organizations in the effort to transition all employees and systems to FIS-managed hardware and requirements
Delivered technical overview presentations and Q&A sessions during due diligence for the acquisition

Senior Software Engineer - Site Reliability Engineering

Fullstory, Austin, TX, USA (Remote)

02/2019 - 06/2021

Responsibilities:

Responsible for the maintenance and functionality of the internally built deployment orchestration system
Managed production and pre-production Kubernetes environments
Managed day-to-day operational issues and scaling of our internal Prometheus-based monitoring systems

Key Accomplishments:

Migrated node scheduling from job-based to attribute-based model, improving the utilization of compute resources and reducing scheduling delays
Evangelized the introduction of a Service Mesh across the engineering organization

Senior Software Engineer - Site Reliability Engineering

Yonder (formerly New Knowledge), Austin, Texas, USA

02/2019 - 02/2020

Responsibilities:

Owned the prioritization and execution of all DevOps, Infrastructure, and Site Reliability requirements
Maintained multiple Kubernetes clusters for both production and staging workloads
Worked with individual Product Engineering leads to reduce operational complexity and streamline our engineering process
Actively worked to reduce existing overengineered solutions and improve engineering productivity

Key Accomplishments:

Executed a Cloud Migration strategy to migrate all live stateless and stateful workloads from Azure to AWS without downtime
Worked with product engineering to standardize and restructure web scraping infrastructure to improve engineering velocity and reduce operational overhead by ~95%

Senior Software Engineer - Infrastructure

Pixlee, Austin, Texas, USA

11/2018 - 02/2019

Responsibilities:

Updated the development workflow of core applications to include modern and professional software engineering practices
Designed and developed reproducible and automated developer environments based in a Kubernetes environment
Identified and communicated fundamental issues in the existing configuration management, and developed a safe migration plan to correct the issues
Identified and communicated issues in the production infrastructure that negatively impacted system cost, reliability, and operational insight
Delivered a safe, long-term plan to migrate to Kubernetes in order to reduce infrastructure bloat, consolidate services, improve reliability, and ease operational burden

Staff Software Engineer

Cratejoy, Austin, Texas, USA

01/2018 - 10/2018

Responsibilities:

Managed our production Kubernetes infrastructure, staging environments, and CI/CD pipelines
Interfaced with individual product teams in order to plan for upcoming deployment, monitoring, and tooling needs
Migrated our central application deployments to team-specific automated deployments
Developed internal services to ease the development of user-facing products

Key Accomplishments:

Architected and managed the development of the Cratejoy Custom Domain SSL feature (with Let's Encrypt)

Senior Software Engineer

Cratejoy, Austin, Texas, USA

02/2015 - 01/2018

Responsibilities:

Formed and led our Site Reliability Engineering team in order to prioritize stability, reliability, performance, and ease of development
Identified, investigated, and resolved platform-wide performance and reliability issues
Developed and released a reliable internal Traffic Analysis system (with full granularity), used throughout the company to make business-critical decisions
Developed and maintained features for the Merchant Tools section of the Cratejoy Platform

Key Accomplishments:

Developed internal support for, and implemented, an Engineering On-Call rotation and emergency response playbook
Led a migration of our internal infrastructure from Ansible managed machines to Kubernetes
Designed, implemented, and rolled out PayPal support for all storefronts, which is used by 1,000+ merchants and accounts for ~15% of platform purchase volume
Received multiple internal awards, including company-wide 'Impact of the Quarter' (Q4 2016) and 'Engineering Values' (Q1 2017)

EDUCATION:

Bachelor of Computer Science

Honours Computer Science Co-op, Psychology Minor

University of Waterloo, Waterloo, Ontario

KEYWORDS:

This section exists for ATSs. If you are a human, you can ignore this section.

Kubernetes · CNCF · Cloud Architecture · EKS · AWS · GCP · Terraform · Terraform Cloud · IaC · genAI · Generative AI · Cloudflare · Fastly · CDN · API Gateway · OpenAI · Anthropic · Mistral · Ollama · Git · Github · CI/CD · Github Actions · GHA · CircleCI · Buildkite · ArgoCD · Python · Flask · Django · FastAPI · Docker · containerd · Container Runtimes · gRPC · protobuf · SQL · PostgreSQL · pgSQL · RDS · Clickhouse · monitoring · observability · SLO · SLA · Datadog · Sentry · Prometheus · Grafana · Loki · OpenTelemetry · Hashicorp Vault · AWS KMS · AWS SSM · service mesh · istio · linkerd · tokenization · VGS · Skyflow · encryption · security · SOC2 · PCI · MongoDB · Apache Kafka · Redpanda · Event Driven Architecture · Node.js · Express.js · Golang · React · Vercel · Netlify · Nginx · Envoy · Keycloak · Auth0 · Clerk · IAM · OAuth · OIDC

Nicholas Mitchinson
