Hi, I'm Aniket Palaskar

Data Engineer & Software Developer

Designing, building, and maintaining scalable data pipelines and distributed systems. Passionate about reliable data systems, automation, and performance engineering.


About Me

I'm a Data Engineer with 3+ years of experience designing and maintaining high-performance data systems. My expertise lies in building streaming and batch pipelines using tools like Python, RabbitMQ, and AWS.

I specialize in creating reliable, scalable systems that handle large volumes of data efficiently. With a strong foundation in distributed systems and data engineering principles, I focus on delivering solutions that are both performant and maintainable.

I help companies turn messy, high-volume data into reliable systems that teams can trust.

Data Engineering

Building robust data pipelines

Cloud Infrastructure

AWS, Docker, and DevOps

Performance Optimization

Scalable and efficient systems

Professional Experience

Software Developer | Provakil, Pune, India

Feb 2023 - Present
  • Engineered scalable Python scraping pipelines for 20+ national/state courts to ingest 5000+ case records daily
  • Stored structured data, PDFs, and HTML efficiently in AWS S3, ensuring data accessibility and integrity
  • Automated data quality checks, reducing manual validation efforts by 30% and improving data accuracy
  • Designed and integrated data ingestion workflows with RabbitMQ, enabling multiple consumer services to process listings asynchronously; improved data freshness by 30% and increased system throughput by 25%
  • Proactively debugged and resolved complex pipeline failures by analyzing logs in Elasticsearch and monitoring queues, reducing mean-time-to-resolution (MTTR) for data issues by 40%
  • Developed automated notification systems using Pandas and Python, generating and delivering case summaries via email/WhatsApp, enhancing user engagement
  • Collaborated with DevOps and backend teams on cross-service debugging, server configuration, and deployment processes
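The automated data-quality checks mentioned above could look something like this minimal Pandas sketch. The column names and rules here are illustrative placeholders, not the production checks:

```python
import pandas as pd

# Illustrative required fields; the real schema and rules differ.
REQUIRED_COLUMNS = ["case_number", "court", "filing_date"]

def validate_records(df: pd.DataFrame) -> pd.DataFrame:
    """Flag rows that fail basic quality checks and return only the clean rows."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    checks = pd.DataFrame(index=df.index)
    checks["has_case_number"] = df["case_number"].notna()
    # Unparseable dates become NaT with errors="coerce" and fail the check.
    checks["valid_date"] = pd.to_datetime(df["filing_date"], errors="coerce").notna()
    checks["known_court"] = df["court"].notna() & (df["court"].str.len() > 0)

    # Keep rows that pass every check; failures can be routed to review instead.
    return df[checks.all(axis=1)].copy()
```

A check like this replaces eyeballing each scraped batch: rows that fail any rule are dropped (or routed to manual review) before they reach S3.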

Production Engineering Highlights
  • Reduced MTTR by 40% through proactive monitoring and log analysis
  • Debugged asynchronous pipelines by analyzing queue backlogs and consumer behavior
  • Collaborated across backend and DevOps teams during incident resolution

Technical Skills

Programming & Scripting

Python (Pandas, BeautifulSoup, Selenium, Django, Flask), SQL

Data Pipelines & Messaging

Apache Kafka, RabbitMQ, ETL/ELT, Cron, Distributed Systems

Cloud & DevOps

AWS (S3, EC2), Docker, Git, CI/CD

Databases & Storage

MongoDB, PostgreSQL, Elasticsearch

Personal Projects

BullBearSim - Real-Time Market Data Pipeline

Built an end-to-end streaming pipeline with Python, Apache Kafka, and PostgreSQL for real-time stock data simulation. Containerized the ecosystem with Docker and created live Grafana dashboards for data visualization.

Python · Apache Kafka · PostgreSQL · Docker · Grafana · AWS EC2
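A stripped-down sketch of the simulation side of such a pipeline. The symbol, topic name, and random-walk parameters are illustrative, not the project's actual values:

```python
import json
import random

def simulate_ticks(symbol: str, start_price: float, n: int, seed: int = 42):
    """Generate n pseudo-random price ticks as a simple random walk."""
    rng = random.Random(seed)
    price = start_price
    for i in range(n):
        # Bounded +/-1% step; floor keeps the simulated price positive.
        price = max(0.01, price * (1 + rng.uniform(-0.01, 0.01)))
        yield {"symbol": symbol, "seq": i, "price": round(price, 2)}

def serialize(tick: dict) -> bytes:
    """Kafka message values are bytes; JSON keeps the payload self-describing."""
    return json.dumps(tick).encode("utf-8")

# In the real pipeline each serialized tick is published to a Kafka topic,
# e.g. with kafka-python (assumed, not shown running here):
#   producer = KafkaProducer(bootstrap_servers="localhost:9092")
#   producer.send("stock-ticks", serialize(tick))
```

Downstream, a consumer writes ticks into PostgreSQL, which Grafana queries for the live dashboards.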

Data Visualizer - Full-Stack Analytics Platform

Developed a Django web application that lets users upload datasets and generate interactive visualizations. Implemented data processing with Pandas and visualization with Matplotlib/Seaborn. Designed a responsive UI with Bootstrap.

Python · Django · Pandas · Matplotlib · Bootstrap · PostgreSQL
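The upload-to-summary step of such a platform, reduced to its Pandas core. The function name and return shape are illustrative; the real app feeds these stats into Matplotlib/Seaborn charts:

```python
import io
import pandas as pd

def summarize_upload(csv_text: str) -> dict:
    """Parse an uploaded CSV and return per-column stats ready for charting."""
    df = pd.read_csv(io.StringIO(csv_text))
    numeric = df.select_dtypes("number")
    return {
        "rows": len(df),
        "columns": list(df.columns),
        # Means of numeric columns, e.g. the y-values for a bar chart.
        "means": numeric.mean().round(2).to_dict(),
    }
```

In the Django view, the uploaded file's bytes are decoded and passed to a function like this before rendering the chart page.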

Apache Airflow – Workflow Orchestration (Learning Project)

Explored Apache Airflow fundamentals by building sample DAGs while following official documentation and tutorial resources. Practiced task dependencies, scheduling, retries, and failure handling using Python operators to understand batch workflow orchestration.

Apache Airflow · Python · DAGs · ETL Basics
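The core ideas practiced there (task dependencies, retries, failure handling) can be sketched in plain Python without Airflow itself. This is a conceptual sketch, not Airflow code:

```python
from collections import deque

def topo_order(deps: dict) -> list:
    """Return tasks in dependency order (Kahn's algorithm); reject cycles."""
    indegree = {t: len(d) for t, d in deps.items()}
    ready = deque(t for t, d in deps.items() if not d)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for t, d in deps.items():
            if task in d:
                indegree[t] -= 1
                if indegree[t] == 0:
                    ready.append(t)
    if len(order) != len(deps):
        raise ValueError("cycle detected")
    return order

def run_with_retries(fn, retries: int = 2):
    """Run a task, retrying up to `retries` times on failure, like a task's retry policy."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise
```

Airflow's scheduler does the same two jobs at scale: order tasks by their declared dependencies, and re-run failures according to each task's retry settings.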

Apache Spark – Distributed Data Processing (Learning Project)

Explored Apache Spark concepts by running local PySpark jobs to process structured datasets. Practiced transformations, actions, and basic performance considerations to understand how distributed processing works at scale.

Apache Spark · PySpark · Distributed Computing
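The transformation-vs-action distinction at the heart of Spark can be illustrated with plain Python iterators, which are also lazy. This is an analogy, not PySpark code:

```python
def lazy_pipeline(rows):
    # Like Spark transformations, map/filter only build a plan; nothing runs yet.
    doubled = map(lambda x: x * 2, rows)
    kept = filter(lambda x: x % 10 != 0, doubled)
    return kept  # still unevaluated

plan = lazy_pipeline(range(1, 6))  # no work done yet
result = list(plan)                # list() is the "action" that executes the plan

# Roughly analogous PySpark (assuming a SparkSession `spark`):
#   df = spark.range(1, 6)
#   out = df.selectExpr("id * 2 as v").filter("v % 10 != 0")  # transformations
#   out.collect()                                             # action
```

Spark adds what iterators lack: the plan is optimized and the work is partitioned across executors, which is what the local PySpark jobs were exploring.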

System Design Highlights

  • Designed event-driven ingestion using queues and asynchronous consumers
  • Built batch workflows using Apache Airflow DAGs with retries and failure handling
  • Implemented streaming pipelines using Apache Kafka for real-time data processing
  • Stored structured and unstructured data in PostgreSQL and AWS S3
  • Monitored and debugged production issues using logs and queue metrics
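The first highlight (event-driven ingestion with queues and asynchronous consumers) in miniature, using Python's standard library in place of RabbitMQ. Message contents and the "processing" step are placeholders:

```python
import queue
import threading

def consumer(q, out, lock):
    """Drain messages until a None sentinel arrives."""
    while True:
        msg = q.get()
        if msg is None:
            q.task_done()
            break
        with lock:
            out.append(msg.upper())  # stand-in for real processing
        q.task_done()

def run_pipeline(messages, n_consumers: int = 3):
    """Fan messages out to several concurrent consumers via a shared queue."""
    q = queue.Queue()
    out = []
    lock = threading.Lock()
    workers = [threading.Thread(target=consumer, args=(q, out, lock))
               for _ in range(n_consumers)]
    for w in workers:
        w.start()
    for m in messages:
        q.put(m)
    for _ in workers:
        q.put(None)  # one shutdown sentinel per consumer
    for w in workers:
        w.join()
    return out
```

With a real broker, the queue survives process restarts and consumers can scale independently of producers, which is what made the asynchronous listing consumers at Provakil practical.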

Education & Certifications

Education

B.Tech in Electronics and Telecommunication Engineering

Deogiri Institute of Engineering and Management Studies

2018 – 2022

Certifications

  • Python Certification
    EdYoda Digital University
    2022
  • Data Science Certification
    EdYoda Digital University
    2023
  • 100 Days of Code: Python
    Udemy
    2023
  • Python (Basic) & Problem Solving (Basic)
    HackerRank
    2023

Get In Touch

I'm open to new data engineering roles, collaborations, or freelance projects. Reach out if you'd like to discuss opportunities or projects.