Open to Senior DevOps / SRE roles

Sankalp Nayak

$ whoami → DevOps / SRE Engineer

DevOps & SRE engineer with 3+ years building and operating Kubernetes platforms on AWS and Azure, at a fintech serving 22M+ requests/day. I cut cloud spend ~50%, ran a zero-downtime multi-cloud migration, and hold sub-0.1% error rates in production.

3+
YEARS IN DEVOPS
~50%
CLOUD COST CUT
22M+
REQUESTS / DAY
sankalp@prod ~
$ kubectl get pods -n production
17 microservices · Running · 22M+ req/day
$ aws ce get-cost --compare
  before  $22,500 / mo
  after   $11,300 / mo  ↓ ~50%
$ argocd app sync --all
  ✔ synced & healthy in 18s
$
~50%
AWS spend cut, zero regressions
40%
Lower MTTR with Datadog
18s
Pod startup, down from 45s
0
Audit findings · zero-trust
// MULTI-CLOUD PIPELINE
GitHub Actions CI → ArgoCD → AWS · EKS / Azure · AKS
Measurable Impact

Numbers from production, not the brochure.

Every figure below ships with the engineering behind it — Graviton migrations, KEDA autoscaling, unified observability and zero-trust access controls running across live clusters.

FinOps · headline result
~50%
AWS cost reduction
Graviton migration, Karpenter spot pools, RDS/S3 cleanup & NAT consolidation — zero performance regressions.
$22.5K
BEFORE
$11.3K
AFTER
≈ $134K saved annually · monthly run-rate, recurring
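
As a quick check of the arithmetic behind these figures, here is a minimal Python sketch. The function name is illustrative only; the inputs are the before/after monthly run-rates quoted above.

```python
def cost_savings(before_monthly: int, after_monthly: int) -> tuple[float, int]:
    """Return (percent reduction, annualized savings) for two monthly run-rates."""
    monthly_saved = before_monthly - after_monthly
    percent = monthly_saved / before_monthly * 100
    return percent, monthly_saved * 12


# Monthly AWS spend before and after, in USD, as quoted above.
percent, annual = cost_savings(22_500, 11_300)
print(f"{percent:.1f}% lower, ${annual:,} saved per year")
# prints "49.8% lower, $134,400 saved per year"
```

The annual figure assumes the lower run-rate holds for a full twelve months.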
Reliability at scale
22M+
Requests / day sustained
Across 17+ Java, Node.js & Python microservices on EKS.
< 0.1% error rate · KEDA autoscaled
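
To put the error-rate ceiling in concrete terms, a small Python sketch of what it implies at this volume. The helper names are hypothetical, and the rate is expressed in parts per million (0.1% = 1,000 ppm) to keep the math in integers.

```python
ERROR_CEILING_PPM = 1_000  # 0.1% expressed in parts per million


def daily_error_ceiling(requests_per_day: int, ceiling_ppm: int = ERROR_CEILING_PPM) -> int:
    """Number of failed requests per day at which the ceiling is reached."""
    return requests_per_day * ceiling_ppm // 1_000_000


def is_below_ceiling(failed: int, total: int, ceiling_ppm: int = ERROR_CEILING_PPM) -> bool:
    """True when the observed error rate is strictly below the ceiling."""
    return failed * 1_000_000 < total * ceiling_ppm


print(daily_error_ceiling(22_000_000))  # prints 22000
```

At 22M requests a day, staying under 0.1% means fewer than 22,000 failed requests per day.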
Velocity
50%
Faster release cycles
Jenkins & ArgoCD pipelines, with deployment errors minimized.
Observability
40%
Lower MTTR
Datadog consolidation killed 35% of false-positive alerts.
Performance
45→18s
Faster pod startup
EKS tuning; ungraceful evictions down 45% via PDBs & chaos testing.
Security
0
Audit findings
Zero-trust access cut high-risk vulnerabilities 60%.
About

I build platforms that stay up, stay secure, and cost less.

Multi-cloud by default. AWS and Azure in production, with Terraform keeping both reproducible.
Reliability obsessed. Chaos engineering, PDBs and SLOs so zero-downtime is the baseline.
FinOps minded. I treat the cloud bill as an engineering metric — and bring it down.
Sankalp Nayak
DevOps & SRE Engineer
Bangalore, India

"DevOps and SRE engineer with 3+ years building and operating Kubernetes platforms on AWS and Azure. Cut cloud spend ~50% (~$11K/month) at a fintech serving 22M+ requests/day, accelerated deployment cycles 50% via GitHub Actions and ArgoCD, and led a zero-downtime AWS-to-Azure multi-cloud migration."

CI/CD Automation · Infrastructure as Code · Zero-Trust Security · FinOps / Cost Optimization
BCA
Christ University · 2020–23
Bangalore
India · IST (UTC+5:30)
Experience

Five roles, one trajectory — from full-stack to SRE platform leadership.

Current
ZET
Bangalore, India
Apr 2026 — Present

DevOps Engineer 2

  • Drove a company-wide FinOps program cutting AWS spend ~50% ($22.5K → $11.3K/mo, ~$134K annualized) via Graviton migration, Karpenter spot pools, RDS snapshot cleanup, S3 lifecycle policies and NAT/networking consolidation — zero performance regressions.
  • Leading the full AWS-to-Azure exit of the ZET Partner OU — migrating 5 AWS accounts and 15+ Kubernetes workloads to AKS, Aurora MySQL → Azure Database for MySQL Flexible Server, S3 → Blob Storage, on Terraform-managed landing zones using Workload Identity for keyless access.
  • Architecting cross-region Disaster Recovery (ap-south-1 ↔ ap-south-2) for tier-1 services with Aurora Global Database and a Pilot Light active-passive topology — targeting RTO/RPO under 20 minutes with automated failover runbooks and quarterly game-day drills.
  • Built ephemeral preview environments (GitHub Actions + Helm namespace-per-PR on Karpenter spot NodePools) isolating per-PR tests for 17+ microservices serving 22M+ requests/day, while owning platform reliability at a sub-0.1% error rate.
  • Replaced managed ElastiCache Redis with self-hosted Valkey on EC2 Graviton across prod and staging, fronted by Route 53 private DNS — eliminating per-hour ElastiCache cost with zero application changes.
  • Deployed VictoriaLogs alongside OpenSearch via Fluent Bit rewrite_tag routing, cutting multi-GB log query latency and storage cost; migrated Nexus from x86 to ARM64 Graviton ($741/yr saved, no CI rebuilds).
ZET
Bangalore, India
May 2025 — Apr 2026

DevOps Engineer

  • Migrated EKS production workloads (Java, Node.js, Python, Ruby) to ARM64 Graviton, contributing to a ~42% AWS cost reduction with no latency regression.
  • Optimized EKS performance — pod startup 45s → 18s, KEDA autoscaling on Prometheus RPS metrics across 15+ microservices, and ungraceful spot evictions cut 45% via Pod Disruption Budgets and chaos testing.
  • Led the AWS-to-Azure multi-cloud migration — AKS in segmented VNets, Azure Database for MySQL with read replicas, Application Gateway + WAF, and ACR/Key Vault/Service Bus/Storage behind private endpoints with zero public exposure.
  • Consolidated Grafana and New Relic into Datadog (infra, APM, logs, SLOs) — MTTR down 40% and 35% of false-positive alerts eliminated.
  • Enforced zero-trust with Twingate, JumpCloud SSO, IRSA, IMDSv2 and AWS WAF — zero audit findings and a 60% reduction in high-risk vulnerabilities.
  • Standardized multi-account governance (Control Tower, SCPs, GuardDuty, IAM Identity Center), built GitHub Actions CI/CD across staging/QA/prod, and stood up a Metabase + Databricks BI platform with automated ETL.
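
For context on the KEDA autoscaling on Prometheus RPS metrics mentioned above: KEDA feeds the query result to a Horizontal Pod Autoscaler, which sizes the deployment so the per-pod average stays at or below a threshold. The sketch below is a simplified version of that calculation. It ignores HPA tolerance and stabilization windows, and the threshold and replica bounds are assumed values, not the production settings.

```python
import math


def desired_replicas(total_rps: float, target_rps_per_pod: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Replica count that keeps average RPS per pod at or below the target."""
    raw = math.ceil(total_rps / target_rps_per_pod)
    # Clamp to the configured bounds, as the autoscaler does.
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical numbers: 950 RPS in total, 100 RPS per pod, 2 to 20 replicas.
print(desired_replicas(950, 100, min_replicas=2, max_replicas=20))  # prints 10
```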
Hire3x
Bangalore, India
Sept 2024 — May 2025

DevOps Tech Lead

  • Led DevOps transformation across dev, QA, pre-prod & production Kubernetes environments for streamlined deployments.
  • Designed AWS Auto Scaling policies that add and remove instances with demand, reducing cost while holding performance.
  • Implemented Terraform IaC for consistent, scalable multi-cloud infrastructure.
  • Automated CI/CD with Jenkins & ArgoCD — release cycles down 50% and deployment errors minimized.
  • Migrated self-hosted MongoDB to MongoDB Atlas, improving scalability and security posture.
Hire3x
Bangalore, India
Oct 2023 — Sept 2024

DevOps Engineer & Full Stack Developer

  • Architected complete AWS EKS infrastructure with Terraform, reverse-engineering the existing setup via Terraformer while preserving legacy systems.
  • Orchestrated DigitalOcean→AWS migration with minimal downtime; added CloudWatch and Loki-Grafana monitoring.
  • Implemented HPA and Auto-Scaling Groups with zero-downtime rolling upgrades.
  • Built a VPN + firewall system that cut unauthorized access attempts by 95%, plus a Python cronjob-based DB backup service for disaster recovery.
  • Led migration from community GitLab to a self-hosted instance, streamlining repo management and collaboration.
Hire3x
Bangalore, India
Jan 2023 — Oct 2023

Software Developer

  • Spearheaded a video conferencing web app integrated into the core product platform.
  • Built a versatile file-converter API that cut processing time 50% through optimized backend architecture.
  • Designed scalable back-end services in Flask & Node.js with seamless front-end integration, and ran POCs to improve UX and operational efficiency.
Technical Stack

The toolbelt, by discipline.

Cloud

AWS · Azure · DigitalOcean · GCP

Containers & Orchestration

Docker · Kubernetes · Helm · Karpenter · KEDA

CI/CD & IaC

GitHub Actions · Jenkins · Terraform · CloudFormation · ArgoCD · GitLab

Observability

Datadog · Prometheus · Grafana · Loki · CloudWatch · VictoriaLogs · Azure Monitor

Security

Twingate · JumpCloud SSO · IRSA · AWS WAF · GuardDuty · Azure Sentinel

Scripting & Networking

Python · Bash · Node.js · VPC / VNet · Load Balancers · Private Endpoints · Aurora / Valkey
How I Architect

A multi-cloud, GitOps-driven, zero-trust platform.

Commit to production with no manual steps and no public attack surface. The same pattern runs on AWS EKS and Azure AKS, observed end-to-end by Datadog.

ZERO-TRUST BOUNDARY · private endpoints only
GitHub (source) → GitHub Actions (build · test) → ArgoCD (GitOps sync) → AWS · EKS (Karpenter · KEDA) / Azure · AKS (VNet · App GW)
Datadog: APM · logs · SLOs · traces
Terraform · Infrastructure as Code: provisions every cluster & network, reproducibly
▹ Twingate · JumpCloud SSO · IRSA · IMDSv2 · AWS WAF gate every internal surface
Selected Projects

Things I built end-to-end.

VCCL Video Conferencing

2023–24

Full P2P video conferencing app — calls, screen sharing, real-time chat — with a custom CoTURN STUN/TURN server for connectivity, no central server dependency.

Vue.js · WebRTC · Socket.io · CoTURN

WireGuard VPN Solution

2024

Secure VPN + firewall for org network security and controlled resource access — multi-user, with strict firewall rules guarding databases and internal services.

WireGuard · Kubernetes · DigitalOcean

Universal File Converter API

2023

API converting docx, pptx, txt & HTML to PDF — cutting processing time 50%, using Node.js + Puppeteer for pixel-accurate HTML-to-PDF rendering.

Flask · AWS · Node.js · Puppeteer
Training & Mentoring

AWS Training Instructor

Ran a week-long AWS certification program at St. Francis Xavier — taking 20+ final-year B.Tech students from cloud fundamentals through hands-on labs and assessments, with the cohort achieving AWS Academy Cloud Foundations certification.

20+
STUDENTS
CERTIFIED
Let's talk

Building something that needs to scale, stay up, and cost less?

I'm open to senior DevOps & SRE roles. The fastest way to reach me is email — or grab the résumé.

Sankalp Nayak · DevOps Engineer 2
Built with the Zet design system · 2026