Hi, I'm Aniket Palaskar

Data Engineer & Software Developer

Designing, building, and maintaining scalable data pipelines and distributed systems. Passionate about reliable data systems, automation, and performance engineering.


About Me

I'm a Data Engineer with 3+ years of experience designing and maintaining high-performance data systems. My expertise lies in building streaming and batch pipelines using tools like Python, RabbitMQ, and AWS.

I specialize in creating reliable, scalable systems that handle large volumes of data efficiently. With a strong foundation in distributed systems and data engineering principles, I focus on delivering solutions that are both performant and maintainable.

I help companies turn messy, high-volume data into reliable systems that teams can trust.

Data Engineering

Building robust data pipelines

Cloud Infrastructure

AWS, Docker, and DevOps

Performance Optimization

Scalable and efficient systems

Professional Experience

Software Developer | Provakil, Pune, India

Feb 2023 - Present
  • Engineered scalable Python scraping pipelines for 20+ national/state courts to ingest 5000+ case records daily
  • Stored structured data, PDFs, and HTML efficiently in AWS S3, ensuring data accessibility and integrity
  • Automated data quality checks, reducing manual validation efforts by 30% and improving data accuracy
  • Designed and integrated data ingestion workflows with RabbitMQ, enabling multiple consumer services to process listings asynchronously; improved data freshness by 30% and increased system throughput by 25%
  • Proactively debugged and resolved complex pipeline failures by analyzing logs in Elasticsearch and monitoring queues, reducing mean-time-to-resolution (MTTR) for data issues by 40%
  • Developed automated notification systems using Pandas and Python, generating and delivering case summaries via email/WhatsApp, enhancing user engagement
  • Collaborated with DevOps and backend teams on cross-service debugging, server configuration, and deployment processes
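The automated data-quality checks mentioned above could look something like this minimal Pandas sketch. The column names and rules here are illustrative placeholders, not the production checks:

```python
import pandas as pd

# Illustrative required fields; the real schema and rules differ.
REQUIRED_COLUMNS = ["case_number", "court", "filing_date"]

def validate_records(df: pd.DataFrame) -> pd.DataFrame:
    """Flag rows that fail basic quality checks and return only the clean rows."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    checks = pd.DataFrame(index=df.index)
    checks["has_case_number"] = df["case_number"].notna()
    # Unparseable dates become NaT with errors="coerce" and fail the check.
    checks["valid_date"] = pd.to_datetime(df["filing_date"], errors="coerce").notna()
    checks["known_court"] = df["court"].notna() & (df["court"].str.len() > 0)

    # Keep rows that pass every check; failures can be routed to review instead.
    return df[checks.all(axis=1)].copy()
```

A check like this replaces eyeballing each scraped batch: rows that fail any rule are dropped (or routed to manual review) before they reach S3.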

Production Engineering Highlights
  • Reduced MTTR by 40% through proactive monitoring and log analysis
  • Debugged asynchronous pipelines by analyzing queue backlogs and consumer behavior
  • Collaborated across backend and DevOps teams during incident resolution

Technical Skills

Programming & Scripting

Python (Pandas, BeautifulSoup, Selenium, Django, Flask), SQL

Data Pipelines & Messaging

Apache Kafka, RabbitMQ, ETL/ELT, Cron, Distributed Systems

Cloud & DevOps

AWS (S3, EC2), Docker, Git, CI/CD

Databases & Storage

MongoDB, PostgreSQL, Elasticsearch

Personal Projects

BullBearSim - Real-Time Market Data Pipeline

Built an end-to-end streaming pipeline with Python, Apache Kafka, and PostgreSQL for real-time stock data simulation. Containerized the ecosystem with Docker and created live Grafana dashboards for data visualization.

Python · Apache Kafka · PostgreSQL · Docker · Grafana · AWS EC2
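A stripped-down sketch of the simulation side of such a pipeline. The symbol, topic name, and random-walk parameters are illustrative, not the project's actual values:

```python
import json
import random

def simulate_ticks(symbol: str, start_price: float, n: int, seed: int = 42):
    """Generate n pseudo-random price ticks as a simple random walk."""
    rng = random.Random(seed)
    price = start_price
    for i in range(n):
        # Bounded +/-1% step; floor keeps the simulated price positive.
        price = max(0.01, price * (1 + rng.uniform(-0.01, 0.01)))
        yield {"symbol": symbol, "seq": i, "price": round(price, 2)}

def serialize(tick: dict) -> bytes:
    """Kafka message values are bytes; JSON keeps the payload self-describing."""
    return json.dumps(tick).encode("utf-8")

# In the real pipeline each serialized tick is published to a Kafka topic,
# e.g. with kafka-python (assumed, not shown running here):
#   producer = KafkaProducer(bootstrap_servers="localhost:9092")
#   producer.send("stock-ticks", serialize(tick))
```

Downstream, a consumer writes ticks into PostgreSQL, which Grafana queries for the live dashboards.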

Data Visualizer - Full-Stack Analytics Platform

Developed a Django web application that lets users upload datasets and generate interactive visualizations. Implemented data processing with Pandas and visualization with Matplotlib/Seaborn. Designed a responsive UI with Bootstrap.

Python · Django · Pandas · Matplotlib · Bootstrap · PostgreSQL
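The upload-to-summary step of such a platform, reduced to its Pandas core. The function name and return shape are illustrative; the real app feeds these stats into Matplotlib/Seaborn charts:

```python
import io
import pandas as pd

def summarize_upload(csv_text: str) -> dict:
    """Parse an uploaded CSV and return per-column stats ready for charting."""
    df = pd.read_csv(io.StringIO(csv_text))
    numeric = df.select_dtypes("number")
    return {
        "rows": len(df),
        "columns": list(df.columns),
        # Means of numeric columns, e.g. the y-values for a bar chart.
        "means": numeric.mean().round(2).to_dict(),
    }
```

In the Django view, the uploaded file's bytes are decoded and passed to a function like this before rendering the chart page.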

Apache Airflow – Workflow Orchestration (Learning Project)

Explored Apache Airflow fundamentals by building sample DAGs while following official documentation and tutorial resources. Practiced task dependencies, scheduling, retries, and failure handling using Python operators to understand batch workflow orchestration.

Apache Airflow · Python · DAGs · ETL Basics
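The core ideas practiced there (task dependencies, retries, failure handling) can be sketched in plain Python without Airflow itself. This is a conceptual sketch, not Airflow code:

```python
from collections import deque

def topo_order(deps: dict) -> list:
    """Return tasks in dependency order (Kahn's algorithm); reject cycles."""
    indegree = {t: len(d) for t, d in deps.items()}
    ready = deque(t for t, d in deps.items() if not d)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for t, d in deps.items():
            if task in d:
                indegree[t] -= 1
                if indegree[t] == 0:
                    ready.append(t)
    if len(order) != len(deps):
        raise ValueError("cycle detected")
    return order

def run_with_retries(fn, retries: int = 2):
    """Run a task, retrying up to `retries` times on failure, like a task's retry policy."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise
```

Airflow's scheduler does the same two jobs at scale: order tasks by their declared dependencies, and re-run failures according to each task's retry settings.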

Apache Spark – Distributed Data Processing (Learning Project)

Explored Apache Spark concepts by running local PySpark jobs to process structured datasets. Practiced transformations, actions, and basic performance considerations to understand how distributed processing works at scale.

Apache Spark · PySpark · Distributed Computing
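The transformation-vs-action distinction at the heart of Spark can be illustrated with plain Python iterators, which are also lazy. This is an analogy, not PySpark code:

```python
def lazy_pipeline(rows):
    # Like Spark transformations, map/filter only build a plan; nothing runs yet.
    doubled = map(lambda x: x * 2, rows)
    kept = filter(lambda x: x % 10 != 0, doubled)
    return kept  # still unevaluated

plan = lazy_pipeline(range(1, 6))  # no work done yet
result = list(plan)                # list() is the "action" that executes the plan

# Roughly analogous PySpark (assuming a SparkSession `spark`):
#   df = spark.range(1, 6)
#   out = df.selectExpr("id * 2 as v").filter("v % 10 != 0")  # transformations
#   out.collect()                                             # action
```

Spark adds what iterators lack: the plan is optimized and the work is partitioned across executors, which is what the local PySpark jobs were exploring.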

System Design Highlights

  • Designed event-driven ingestion using queues and asynchronous consumers
  • Built batch workflows using Apache Airflow DAGs with retries and failure handling
  • Implemented streaming pipelines using Apache Kafka for real-time data processing
  • Stored structured and unstructured data in PostgreSQL and AWS S3
  • Monitored and debugged production issues using logs and queue metrics
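The first highlight (event-driven ingestion with queues and asynchronous consumers) in miniature, using Python's standard library in place of RabbitMQ. Message contents and the "processing" step are placeholders:

```python
import queue
import threading

def consumer(q, out, lock):
    """Drain messages until a None sentinel arrives."""
    while True:
        msg = q.get()
        if msg is None:
            q.task_done()
            break
        with lock:
            out.append(msg.upper())  # stand-in for real processing
        q.task_done()

def run_pipeline(messages, n_consumers: int = 3):
    """Fan messages out to several concurrent consumers via a shared queue."""
    q = queue.Queue()
    out = []
    lock = threading.Lock()
    workers = [threading.Thread(target=consumer, args=(q, out, lock))
               for _ in range(n_consumers)]
    for w in workers:
        w.start()
    for m in messages:
        q.put(m)
    for _ in workers:
        q.put(None)  # one shutdown sentinel per consumer
    for w in workers:
        w.join()
    return out
```

With a real broker, the queue survives process restarts and consumers can scale independently of producers, which is what made the asynchronous listing consumers at Provakil practical.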

Education & Certifications

Education

B.Tech in Electronics and Telecommunication Engineering

Deogiri Institute of Engineering and Management Studies

2018 – 2022

Certifications

  • Python Certification
    EdYoda Digital University
    2022
  • Data Science Certification
    EdYoda Digital University
    2023
  • 100 Days of Code: Python
    Udemy
    2023
  • Python (Basic) & Problem Solving (Basic)
    HackerRank
    2023

Get In Touch

I'm open to new data engineering roles, collaborations, or freelance projects. Reach out if you'd like to discuss opportunities or projects.