Homework 6

Association Rules with Grocery Data

Author

Byeong-Hak Choe

Published

April 27, 2026

Modified

April 27, 2026

Directions

  • Please submit one Quarto document for Homework 6 to Brightspace, naming the file as in the following example:

    • danl-320-hw6-choe-byeonghak.qmd
  • Due: May 6, 2026, 11:59 P.M. (ET)

  • Please send Byeong-Hak an email (bchoe@geneseo.edu) if you have any questions.


Setup

library(tidyverse)
library(rmarkdown)

library(arules)
library(arulesViz)
library(plotly)


📥 Load the Grocery Transaction Data

The data are stored in a single transaction format, where each row represents one item purchased in one transaction.

grocery <- read.transactions(
  "https://bcdanl.github.io/data/market_basket.tsv",
  format = "single",
  header = TRUE,
  cols = c(1, 2),
  rm.duplicates = TRUE
)
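
To make the single format concrete, here is a minimal base R sketch on made-up data. The data frame `toy`, its column names, and its values are hypothetical and are not taken from the homework file. It shows how rows that share a transaction ID collapse into one basket, which is the grouping that `format = "single"` with `cols = c(1, 2)` asks `read.transactions()` to perform.

```r
# Hypothetical toy data in "single" (long) format:
# one row per item per transaction. Names and values are made up.
toy <- data.frame(
  transaction_id = c(1, 1, 2, 3, 3, 3),
  item = c("bread", "butter", "bread", "jam", "butter", "bread"),
  stringsAsFactors = FALSE
)

# Group the item column by transaction ID to recover one basket per transaction.
baskets <- split(toy$item, toy$transaction_id)

# Number of items in each basket (the transaction size).
basket_sizes <- lengths(baskets)

print(baskets)
print(basket_sizes)
```

A list of baskets like this can also be coerced to an arules object with `as(baskets, "transactions")`, which is one way to experiment with small examples before working on the full data.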

Question 1. 🏷️ Transaction and Item Labels

What do the row labels and the column labels of grocery represent?


Question 2. 📦 Transaction Size Distribution

What are the first quartile, median, third quartile, and maximum of transaction sizes in grocery? Visualize the distribution of transaction sizes.

Question 3. 🔝 Top 50 Most Frequently Purchased Items

Find the 50 most frequently occurring items in grocery. Also, visualize the distribution of occurrences for these top 50 items.


Question 4. ⛏️ Association Rules from Grocery Transactions

Using the subset of grocery that contains only transactions with more than one item, find association rules with:

  • minimum support = 0.01,
  • minimum confidence = 0.25, and
  • minimum rule length = 2.

Find the top 10 rules in terms of lift values.


Question 5. 🧠 Interpret the Rule with the Highest Lift

Interpret the following qualities of the rule with the highest lift:

  1. confidence,
  2. coverage, and
  3. lift.
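
As a reminder of how these measures are defined, the following base R sketch computes them by hand for one rule on made-up baskets. The baskets, the helper `support()`, and the rule {bread} => {butter} are all hypothetical and unrelated to the homework data. The definitions are the standard ones: coverage is the support of the left-hand side, confidence is the rule's support divided by its coverage, and lift is confidence divided by the support of the right-hand side.

```r
# Hypothetical toy baskets (not the homework data).
baskets <- list(
  c("bread", "butter"),
  c("bread", "butter", "jam"),
  c("bread"),
  c("jam"),
  c("butter", "jam")
)

# Hypothetical helper: share of baskets that contain every item in `items`.
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Rule under study: {bread} => {butter}
lhs <- "bread"
rhs <- "butter"

rule_support <- support(c(lhs, rhs), baskets)  # share of baskets with both sides
coverage     <- support(lhs, baskets)          # share of baskets with the LHS
confidence   <- rule_support / coverage        # share of LHS baskets that also hold the RHS
lift         <- confidence / support(rhs, baskets)

print(c(support = rule_support, coverage = coverage,
        confidence = confidence, lift = lift))
```

Because lift compares confidence with the overall support of the right-hand side, a value above 1 indicates that the two sides appear together more often than they would if they were independent, and a value below 1 indicates the opposite.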


Question 6. 🥛 What Did Customers Buy Before Buying Whole Milk?

What item(s) did customers buy before buying whole milk?

When mining the rules for whole milk, use:

  • minimum support = 0.01,
  • minimum confidence = 0.25, and
  • minimum rule length = 2.


Question 7. 🛍️ What Are Customers Who Bought Whole Milk Also Likely to Buy?

Find item(s) that customers who bought whole milk are also likely to buy.


Question 8. 📉 Items Customers Are Less Likely to Buy Before Buying Whole Milk

Using the association rules mined in Question 4, find item(s) that customers are less likely to buy before buying whole milk. Why do you think those items are less likely to be purchased when customers buy whole milk?
