Market Basket Analysis of Instacart
Predicting a customer's next purchase using Instacart data.
Objective
The goal of this project is to identify patterns in consumer purchase behavior on Instacart and suggest product combinations that might be included in various promotions.
This project also aims to predict which previously purchased products will be in a user's future order by using anonymized data on customer orders over time.
Data Preparation
We combined six datasets containing customer purchase information into a single dataset. Key datasets included orders, products, aisles, and departments.
These were merged based on matching identifiers. Data cleaning and preparation were conducted using SAS Studio to create a final dataset ready for analysis.
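The merge itself was done in SAS, so the following is only a Python illustration of the same kind of key-based join. The tiny tables and the column names (product_id, aisle_id, department_id, order_id, user_id) are assumptions made for the example, not values taken from the project.

```python
# Hypothetical miniature tables; the real files have many more rows and columns.
aisles = {1: "fresh fruits", 2: "yogurt"}            # aisle_id -> aisle name
departments = {10: "produce", 20: "dairy eggs"}      # department_id -> department name
products = [
    {"product_id": 100, "product_name": "Banana", "aisle_id": 1, "department_id": 10},
    {"product_id": 200, "product_name": "Greek Yogurt", "aisle_id": 2, "department_id": 20},
]
orders = {7: {"user_id": 1, "order_number": 3}}      # order_id -> order attributes
order_products = [
    {"order_id": 7, "product_id": 100},
    {"order_id": 7, "product_id": 200},
]


def merge_tables(order_products, orders, products, aisles, departments):
    """Join each order line to its order, product, aisle, and department."""
    product_by_id = {p["product_id"]: p for p in products}
    merged = []
    for line in order_products:
        product = product_by_id[line["product_id"]]
        # Combine the order line, the order attributes, and the product attributes.
        row = {**line, **orders[line["order_id"]], **product}
        # Resolve the aisle and department identifiers to readable names.
        row["aisle"] = aisles[product["aisle_id"]]
        row["department"] = departments[product["department_id"]]
        merged.append(row)
    return merged


merged = merge_tables(order_products, orders, products, aisles, departments)
```

Each row of merged then carries the order, product, aisle, and department information together, which is the shape the later analysis steps expect.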
Data Cleaning
The aisle, departments, Order_product_prior, order_product_train, and products datasets contained no null or empty values.
The orders dataset has null values in the days-since-prior-order variable. Only 5% of its values were missing, and this was rejected because the count is too low to be a significant issue.
All the datasets were merged using SAS Studio and SAS Enterprise Guide.
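A missing-value check of the kind described above could look like the following in Python. This is a minimal sketch on made-up rows; the column name days_since_prior_order and the helper missing_share are assumptions for illustration, and the project itself performed this step in SAS.

```python
def missing_share(rows, column):
    """Return the fraction of rows whose value in `column` is None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


# Hypothetical sample: a user's first order has no prior order to measure from.
sample_orders = [
    {"order_id": 1, "days_since_prior_order": None},
    {"order_id": 2, "days_since_prior_order": 7.0},
    {"order_id": 3, "days_since_prior_order": 30.0},
    {"order_id": 4, "days_since_prior_order": 3.0},
]

share = missing_share(sample_orders, "days_since_prior_order")
```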
Model Selection
We employed the Apriori algorithm to uncover frequent itemsets or product combinations within our dataset. This algorithm is efficient for identifying patterns in transactional data.
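To make the idea concrete, here is a small self-contained sketch of Apriori in plain Python, together with the confidence and lift of a rule. It is not the project's implementation: the baskets, the 0.4 support threshold, and the function names apriori and rule_metrics are all assumptions chosen for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Share of baskets that contain every item of the itemset.
        return sum(1 for b in baskets if itemset <= b) / n

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def rule_metrics(frequent, antecedent, consequent):
    """Return (confidence, lift) of the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    confidence = frequent[a | c] / frequent[a]
    lift = confidence / frequent[c]
    return confidence, lift


# Hypothetical baskets, one list of product names per order.
baskets = [
    ["banana", "yogurt", "granola"],
    ["banana", "yogurt"],
    ["banana", "granola"],
    ["yogurt", "granola"],
    ["banana", "yogurt", "milk"],
]

frequent = apriori(baskets, min_support=0.4)
confidence, lift = rule_metrics(frequent, ["banana"], ["yogurt"])
```

The prune step is what keeps Apriori tractable: a candidate is only counted if all of its smaller subsets are already known to be frequent. A lift above 1 would suggest that two products are bought together more often than their individual popularity explains, which is the kind of pairing a promotion might target.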
To predict future purchases based on these patterns, we utilized XGBoost, a powerful and versatile machine learning algorithm known for its high predictive accuracy.
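Before a classifier can be trained, the order history has to be turned into labeled rows. The sketch below shows one possible framing, with one row per pair of user and previously purchased product, labeled 1 if the product appears in the next order. The three features and the function name build_training_rows are assumptions for illustration and may differ from the features the project actually used.

```python
from collections import defaultdict


def build_training_rows(prior_orders, next_orders):
    """Build one feature row and label per (user, previously purchased product).

    prior_orders: {user_id: list of baskets (lists of product ids), oldest first}
    next_orders:  {user_id: set of product ids in the order to predict}
    """
    features, labels = [], []
    for user_id, history in prior_orders.items():
        counts = defaultdict(int)
        last_seen = {}
        for position, basket in enumerate(history):
            for product_id in set(basket):
                counts[product_id] += 1
                last_seen[product_id] = position
        n_orders = len(history)
        for product_id in sorted(counts):
            features.append([
                counts[product_id],                     # orders containing the product
                counts[product_id] / n_orders,          # share of the user's orders
                n_orders - 1 - last_seen[product_id],   # orders since last purchase
            ])
            labels.append(int(product_id in next_orders.get(user_id, set())))
    return features, labels


# Hypothetical history for one user: three prior orders and the order to predict.
prior_orders = {1: [[100, 200], [100], [100, 300]]}
next_orders = {1: {100, 300}}

features, labels = build_training_rows(prior_orders, next_orders)
```

Rows in this shape could then be passed to a gradient-boosted classifier such as XGBoost, which would learn how purchase frequency and recency relate to the chance of a reorder.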
By combining these two methods, we aimed to gain comprehensive insights into customer buying behavior and make reliable purchase predictions.
Conclusion and Future Work
We conducted a comprehensive analysis of Instacart data using market basket analysis.
By identifying product associations and employing the XGBoost classifier, we generated actionable insights for optimizing product recommendations and increasing revenue.
Our findings demonstrate the potential of data-driven strategies for enhancing customer experience and business performance.
Future work could include prediction based on neural networks and deep learning, as well as evaluating different metrics for predicting the next purchase.
Collaborative filtering could also be used to suggest products to customers.