Data Engineer, Data Scientist & Web Developer.

Specializing in productionized machine learning, MLOps, data analytics, and data engineering. Reach out to me using the contact form or email me at


Projects that I have completed in various domains.

Machine Learning Operations (MLOps) on AWS: Design and Development


Designed and developed a machine learning operations (MLOps) solution on AWS.

Insider Threat Monitoring Solution on AWS

machine learning, data engineering

A large financial services company needed to monitor all privileged activity occurring in several hundred databases to track insider threats.

End-to-End Forecasting Pipelines for the Supply Chain of a Large Restaurant Franchise

machine learning, mlops

Inventory forecasting solution for a large restaurant franchise that estimates ingredient usage up to a week in advance so that appropriate stock can be ordered.

Document Classification Pipelines with Retraining and Deployment to an Inference Endpoint

machine learning, mlops

NLP model development, testing, and productionization through MLOps practices to programmatically retrain, deploy, and monitor the performance of a model that classifies documents uploaded through a digital mailroom for a national mortgage lender.

Replicating Data from the Data Warehouse into Domain-Specific Redshift

data engineering

Data replication from the central data warehouse Redshift cluster to department-specific Redshift instances, using Apache Airflow as the ELT pipeline orchestrator, for a large restaurant franchise.


Hands-on skills using popular cloud technologies to solve data and machine learning challenges in production at scale.

Python 100%
SQL 100%
Machine Learning 80%
Spark 80%
AWS/Azure/GCP 75%
NLP 85%
Computer Vision 80%
Web Development 50%


Interested parties may view my resume to see whether I am a good fit for what you're looking for.


Ahsin Shabbir

I like to build awesome things. Software engineering is my passion. I am a pragmatist rather than an idealist. A working solution is more important than a perfect one.


Master of Computer Science

2019 - 2021

Georgia Institute of Technology, Atlanta, Georgia

Specialization in machine learning

Professional Experience

Senior Data Engineer

2021 - Present

Cognizant Softvision

  • I am building DAGs that ingest data from Athena/Redshift, apply business logic to transform the data, and run data preprocessing (normalizing values, flagging outliers), model training, and model inference. Inference is done with batch processing rather than in real time, because the client has no need for real-time forecasting models.
  • For another client, I used the Azure Machine Learning studio to create DAGs for data ingestion, data preprocessing, model training, and model deployment. Deployment was done on a Kubernetes cluster (AKS) with the model exposed as a Dockerized REST API for real-time inference.
  • I built a data mart for the client to lay the foundation for the enterprise data needed to build all the datasets used by supply chain forecasting models. I constructed ELT pipelines with Airflow to replicate data from multiple data lake sources into a supply chain data mart. I then created materialized views on top of the raw data so that the datasets are ready to be consumed by data scientists and other machine learning engineers. I used Flyway to create all tables and views so that schema evolution is version-controlled. I tuned the Distribution Key and Sort Key configurations of the Redshift table definitions based on the most frequently run queries, making those queries orders of magnitude faster. I also created tables that model what is needed for a phase 2 approach in which the client can use Arize and MLflow to run automatic model monitoring, retraining, deployment, and A/B testing.
  • I am working on architectures for scaling the batch forecasts to be more performant. The simplest architecture uses AWS Glue (PySpark), with distributed worker nodes processing partitions of the data. Kafka or another streaming technology is under consideration for online, real-time forecasting.
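
As a rough illustration of the preprocessing step mentioned above (normalizing values, flagging outliers), here is a minimal Python sketch. It is not the client implementation: the function names, the min-max scaling, the z-score rule, the threshold of 2.0, and the sample data are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def flag_outliers(values, z_threshold=2.0):
    """Return one boolean per value: True when its z-score exceeds the threshold.

    Uses the population standard deviation; the threshold is a hypothetical default.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > z_threshold for v in values]


# Hypothetical daily ingredient usage with one suspicious spike.
daily_usage = [10.0, 12.0, 11.0, 13.0, 12.0, 95.0]
flags = flag_outliers(daily_usage)
scaled = normalize(daily_usage)
```

In a batch pipeline, a step like this would typically run as one task between ingestion and model training, with flagged rows either excluded or passed along with an indicator column.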

Machine Learning Engineer

2020 - 2021

United Wholesale Mortgage

Pontiac, MI

  • Developed, tested, and integrated into production an end-to-end multilabel machine learning classification model that processes on average 1,000,000 requests per day asynchronously with 91% (+/- 5%) average daily accuracy.
  • Models are deployed as a microservice on-prem and consumed by the document management tools. Concurrency, scalability, fault tolerance, and network security are built into the microservice.

Machine Learning Engineer

2017 - 2020

Ford Motor Company

Dearborn, MI

  • Built analytics on IoT sensor data to predict machine failure in global plants. Led the development of the machine learning time-series predictions, the data ingestion/ETL pipeline, the provisioning of the PCF/AE5 platform for deployment, and updates to a web dashboard displaying refreshed sensor data with summary statistics.
  • Developed machine learning pipelines in Python to classify text data using NLP (gensim, spaCy, NLTK).


Consulting on a broad range of topics.

NLP Models

I can build classification models for text data.

Data Warehousing

I can help you set up a data warehouse.

Data Pipelines

Using Airflow or similar orchestration tools, I can build pipelines for ETL and ELT.

Computer Vision

I can develop object detection, segmentation, and custom vision models.

Business Analytics

Using a combination of SQL, Python, and visualization packages, I can set up dynamic dashboards that provide insights into business operations.

Time-series Forecasting

I can build models from time-series data, constructing features that are tightly coupled to your business domain.


You can reach out to me using the email below.

Alternatively, fill in the form below to send me a message.
