Hi, I'm Kuntal, Cloud Data Specialist

Designing reliable cloud data systems focused on performance, automation, and AI-ready data delivery across AWS, Databricks, Python, PySpark, and SQL.

10+ Years Data Engineering & Analytics
100+ Pipelines Delivered
$20K/mo AWS Savings Delivered

Core capabilities backed by measurable outcomes.

Capabilities

  • Data Solution Architecture and Development
  • Data Lake and ETL/ELT Solution Design
  • Process Automation and Workflow Orchestration
  • CI/CD-Driven Deployment and Feature Release
  • Performance and Cost Optimization
  • Data Governance and Solution Ownership
  • ML Model Integration and Deployment
  • GenAI-Based Solution Prototyping
  • Data Warehouse Modernization and Migration
  • BI Reporting and Analytics Enablement
  • Team Leadership and Stakeholder Management
  • Technical Mentoring and Team Enablement

Achievements

  • $20K/month Saved Through AWS Cost and Performance Optimization
  • 15-20 Hours Saved Per Run Through Data Pipeline Redesign and Optimization
  • 40 Hours/Month Saved Through Validation and Operational Workflow Automation
  • 100+ Data Pipelines Built Across AWS, Databricks, Python, PySpark, SQL, and Delta Lake
  • 30+ SQL Views and Stored Procedures Migrated from Legacy Platforms to AWS Redshift
  • 80+ Jobs Migrated from a Public Python Package Repository to an Internal One, Ensuring Security Compliance and Platform Standardization
  • Re-Architected Data Pipelines for Improved Scalability, Reliability, Reusability, and Operational Efficiency
  • Owned End-to-End Design, Development, Deployment, and Operations of a Containerized Cloud Map Server

Technical Skills

Languages & Frameworks

  • Python
  • SQL
  • PySpark

Databases & Warehouses

  • Redshift
  • Postgres
  • SQL Server

ETL/ELT

  • AWS Glue
  • Databricks
  • Pentaho

Cloud

  • S3
  • Lambda
  • SNS
  • SQS
  • EventBridge
  • CloudWatch
  • EC2

Streaming

  • Kinesis Firehose

Open Table & NoSQL

  • Delta Lake
  • DynamoDB

Orchestration

  • Step Functions
  • Airflow

AI/ML

  • Claude Code
  • GitHub Copilot
  • Genie

Tools

  • draw.io
  • LiveVox
  • Postman
  • Jira
  • Tableau

CI/CD & DevOps

  • GitHub Actions
  • AWS CDK
  • AWS CloudFormation
  • Azure DevOps
  • Docker
  • ECR
  • ECS
  • Databricks Asset Bundles

Route Optimization

Automated the monthly and daily sales rep scheduling process for HCP and HCO coverage using AWS-based data pipelines, Veeva API integration, and route optimization machine learning models.

[Figure: Route optimization architecture]

Sales Rep Training Analytics

A Databricks-based Sales Rep Training Analytics platform that integrates SAP SuccessFactors training data with Veeva call and email activity to measure training coverage, completion trends, post-training engagement, and field execution.

[Figure: Sales Rep Training Analytics architecture diagram]

Sales and Activity Reporting

A reporting solution that provides data-driven insights into sales rep performance, HCP/HCO effectiveness, payer impact, and drug performance through streamlined data collection, visualization, and analysis using AWS services.

Category: Analytics

Automated Testing Framework

Integrates AWS and Jira to automate the test lifecycle, reporting, issue creation, and operational visibility.

Category: Automation

MiSol D&A

The Bridgestone Mileage Solutions D&A Program focuses on using data and advanced analytics to optimize tire performance, improve fleet operations, enhance customer solutions, and generate actionable insights from tire mileage, fleet, and third-party system data.

Category: Analytics

ADW to CDP Migration

This project migrates Bridgestone's SAP-based ADW (Analytical Data Warehouse) to a modern, scalable AWS Redshift-based CDP (Central Data Platform), enabling improved performance, enhanced analytics, and better decision-making through cloud-native tools and technologies.

Category: Migration

TAC Data Lake Setup

A unified and scalable central Data Lake solution on the AWS Cloud to analyze Holcim's transportation and logistics data, including fleet, maintenance, delivery performance, and third-party logistics data.

Category: Analytics

Principal Engineer

Eli Lilly

22 May 2023 - Present | Bangalore, India
  • Key Roles - Senior Data Engineer, Lead Data Engineer, Solution Architect
  • Re-designed data pipelines for performance optimization and cost efficiency
  • Implemented config-driven and event-driven data pipelines for scalability
  • Built and maintained data pipelines and data lake solutions using Python, PySpark, SQL, Databricks, AWS Glue, and AWS services
  • Implemented Delta Lake Medallion architecture across multi-layered data zones
  • Implemented an end-to-end open-source map server using ECS, ECR, and Lambda
  • Deployed and optimized ML model execution on AWS ECS, ECR, and Lambda
  • Built local Docker-based Spark development & model run setups; conducted hands-on sessions for team enablement
  • Designed quick MVP solutions for GenAI, LLM, and MCP-based AI use cases
  • Created CI/CD pipelines using GitHub Actions, AWS CDK, and Databricks Asset Bundles
  • Implemented custom Glue logging, automated testing, table-level lineage, and automated ServiceNow ticketing
  • Enhanced data pipelines to support new and expanding drug launches
  • Led internal and vendor teams, presented release demos, technical walkthroughs, and cybersecurity reviews, and owned end-to-end solution delivery

IT Analyst

Tata Consultancy Services

18 Oct 2021 - 19 May 2023 | Kolkata, India
  • Key Roles - Senior Data Engineer
  • Designed and built Data Lake ingestion pipelines using AWS Glue, PySpark, and SQL
  • Implemented CDC-based ingestion into Data Lake using PySpark and AWS Glue
  • Migrated SAP HANA Warehouse workloads to AWS Redshift
  • Automated semi-structured Excel data extraction using AWS Lambda and Python
  • Integrated Dynamics 365 CRM data using Azure Data Factory and Databricks
  • Developed Redshift SQL views and ER diagrams to support BI dashboards
  • Automated Glue job monitoring and failure notifications using EventBridge and SNS
  • Created CI/CD pipelines using Azure DevOps, CDK, and CloudFormation
  • Prepared STTM documents defining field mappings and business logic
  • Participated in sprint demos and technical walkthroughs

Associate Projects

Cognizant Technology Solutions

2 Nov 2018 - 8 Oct 2021 | Bangalore, India
  • Key Roles - Data Engineer
  • Delivered PLM data management and analytics workflows using SQL, Oracle, and AWS
  • Developed ETL data flows to support PLM reporting using AWS Glue, Python, and SQL
  • Coordinated delivery through GitHub version control and Jira workflow tracking

Associate IT Consultant

ITC InfoTech

27 Feb 2017 - 29 Oct 2018 | Bangalore, India
  • Key Roles - ETL Developer, Application Developer
  • Supported PLM analytics using SQL Server, Pentaho Kettle, and REST APIs
  • Coordinated SVN-based delivery and improved code quality using SonarQube

Support Engineer

Hewlett Packard Enterprise

29 Feb 2016 - 21 Feb 2017 | Bangalore, India
  • Key Roles - Data Analyst
  • Supported vendor order data management using SQL Server and MySQL
  • Handled enterprise operational support and issue tracking using HPSM

Analyst

Cognizant Technology Solutions

21 Nov 2014 - 15 May 2015 | Kolkata, India
  • Key Roles - Data Analyst
  • Performed Vendor Master Data stewardship and analysis using Siebel CRM and SQL
  • Supported data accuracy, business process compliance, and reporting activities

Academic Background

  • Bachelor of Technology, 2014 | Electronics and Communication Engineering | West Bengal University of Technology
  • Higher Secondary, 2009 | Science | Nirmal Hriday Ashram, West Bengal
  • Secondary Education, 2007 | General Education | Ramkrishna Mission Vidyabhawan, West Bengal

Start a Conversation for Consulting and Collaborations