Resume

Experience

  1. ZS Associates

    Site Reliability Engineer | Jan 2023 — Present (Full Time)

    Proactively monitor service performance for enterprise-level distributed systems serving 10,000+ users, improving response times by 45% and reducing resource consumption by 30%.

    Lead incident response as primary on-call engineer with average MTTR of 15 minutes, reducing recurring issues by 70% through detailed root cause analysis.

    Develop automation using Python, Bash, and PowerShell — reducing manual intervention by 65% and saving 25+ hours weekly.

    Manage production Kubernetes clusters with Docker, implementing auto-healing and HPA, and achieving zero-downtime deployments through blue-green and canary strategies.

    Deploy monitoring infrastructure using Prometheus, Grafana, Datadog, and Splunk, reducing alert noise by 50% through intelligent thresholding.

    Implement IaC using Terraform, CloudFormation, and Ansible, achieving 100% infrastructure reproducibility.

    Define and monitor SLIs/SLOs for 50+ microservices, establishing error budgets and driving data-driven reliability decisions.

  2. Samyak Softwares

    Industrial Trainee — DevOps & Cloud Infrastructure | Jun 2022 — Jul 2022

    Assisted with cloud infrastructure deployments, monitoring setup, automation scripting, and incident response procedures in AWS environments.
    Participated in CI/CD pipeline development, configuration management initiatives, and operational documentation creation.

Education

  1. MIT Academy of Engineering, Pune

    B.Tech in Information Technology — CGPA: 9.20/10 [2019 — 2023]

    Graduated in Information Technology with distinction, specializing in Cloud Computing and Distributed Systems.

Certifications

  1. AWS Certified Developer — Associate

    Amazon Web Services

  2. HashiCorp Certified: Terraform Associate

    HashiCorp

  3. Microsoft Azure Fundamentals (AZ-900)

    Microsoft

  4. AWS Cloud Practitioner

    Amazon Web Services

Projects

  1. Enterprise SRE Automation Platform

    Python, Ansible, Terraform, Prometheus, Grafana | Ongoing

    Architected a comprehensive SRE automation platform, reducing operational toil by 70%. Implemented self-healing mechanisms and automated incident response workflows, achieving an 80% reduction in manual intervention and 99.95% uptime. Developed automated runbooks with rollback capabilities, integrated with Slack and PagerDuty.

  2. Distributed Microservices Platform

    Kubernetes, Docker, Istio, AWS, Terraform | May 2023

    Led reliability engineering for 50+ services across multi-region AWS infrastructure with Istio service mesh. Designed a monitoring strategy using Prometheus, Grafana, and Datadog, achieving 99.9% availability and sub-200ms P95 latency. Applied chaos engineering using AWS Fault Injection Simulator.

  3. Cloud Migration & Optimization Initiative

    AWS, Azure, Python, Terraform, CloudFormation | Aug 2022

    Led migration of 100+ applications from on-premises to AWS with zero data loss and minimal downtime. Implemented cost optimization strategies reducing cloud spend by 35% while improving performance.

Skills

  • SRE & Incident Response
    90%
  • AWS / Azure / GCP
    85%
  • Kubernetes & Docker
    85%
  • Terraform & IaC
    85%
  • Python & Automation
    85%
  • Prometheus, Grafana & Observability
    80%
  • Linux & System Administration
    80%
  • CI/CD & DevOps
    80%