About Me

Saurabh Suman

Data Engineer

Data Engineer with 7+ years of experience building and managing large-scale data pipelines, ELT processes, and data warehouse solutions. Takes complete ownership of data pipelines, from building and testing to deployment and support, and handles day-to-day monitoring and issue resolution to keep them running smoothly.

Work Experience

Application Developer

Fujitsu India Pvt. Ltd. | 01/2025 - Present | Pune

  • Developed Azure Logic Apps to automatically detect and respond to pipeline failures, enhancing system resilience and reducing manual intervention.
  • Designed an optimized data processing workflow from Self-Hosted integration to the gold layer, reducing average latency by 10 hours and operational costs by 15%.
  • Developed a script to automatically fetch incremental API data without needing to know the number of pages in advance, reducing pipeline failures and incorrect data capture.

Technical Lead - I

CitiusTech Healthcare Technology | 09/2024 - 01/2025 | Pune, India

  • Led a team of 4 to migrate legacy ADF + Synapse pipelines into Microsoft Fabric.
  • Rewrote Databricks Spark workloads into Fabric notebooks ensuring functional parity.
  • Implemented HIPAA-compliant security: IAM roles, workspace governance, network rules.

Data Scientist

Tiger Analytics | 06/2022 - 09/2024 | Chennai, India

  • Collaborated closely with business teams to translate requirements into scalable ETL solutions, delivering high-impact data products.
  • Reduced region-specific solution development time by 30% through modular and reusable code design.
  • Enhanced pipeline reliability by 60% via robust validation checks across all stages of data flow, and implemented alerting mechanisms for data anomalies.
  • Led a team with a focus on delivery excellence, mentorship, and knowledge sharing to uplift overall team productivity.

Specialist Programmer

Infosys | 06/2018 - 06/2022 | Pune, India

  • Designed and maintained automated ETL pipelines across domains such as Subscription Services and Banking, using Azure services, Airflow, and dbt to ensure data quality and timely availability.
  • Developed a Python-based automation tool for background check document generation, reducing manual effort by 85%.
  • Recognized with multiple Infosys Insta Awards; fast-tracked through three promotions for consistent high performance.

Education

Bachelor of Engineering in Information Technology

Jabalpur Engineering College | 2014 - 2018 | Jabalpur, M.P.

  • Specialized in Information Technology

Skills

Big Data Technologies

PySpark, Delta Lake, Spark SQL

Cloud Computing

ADF, Databricks, ADLS, Fabric, Logic Apps, Synapse, AWS Glue

Data Engineering

ETL/ELT Pipeline

Familiar

FastAPI, GenAI, Power BI, Spark Streaming

Programming & Data Analysis

Python, SQL

Tools & Platforms

Azure DevOps, VSCode, Data Modeling, Linux, MS Excel, SSMS, Windows

Achievements & Certificates

Achievements

  • Received 3 Insta Awards

    Received 3 Insta Awards at Infosys.

  • Proficiency Test

    Ranked in the top 3% of the Maths Proficiency Test in 2011.

Certificates & Conferences

  • Microsoft Certified: Fabric Data Engineer Associate

    Certification demonstrating proficiency in designing and implementing Microsoft Fabric data engineering solutions

    Microsoft | 2026-03-31
  • Databricks Certified Data Engineer Associate

    Professional certification validating expertise in building and optimizing data engineering solutions with Databricks

    Databricks | 2025-02-20
  • Microsoft Certified: Azure Data Fundamentals

    Certification validating foundational knowledge of core data concepts and Azure data services

    Microsoft | 2025-02-01

Projects

API Data Ingestion

Developed a robust system for incremental API data ingestion with automatic pagination handling and fault tolerance capabilities.

Python, REST API, Azure Logic Apps

Portfolio Website

A responsive PHP portfolio/resume website with print functionality. Features a modern Bootstrap-based design and comprehensive resume sections.

CSS, PHP, Bootstrap

Oreo IO

Oreo IO is a data platform built to help developers collaborate on data ingestion, with features such as live editing, multi-file support, and an approval-based workflow.

Python, Go, React, Postgres