Variations of Food Prices in Italian Supermarkets
Description
The dataset contains retail prices for meat, fruit, and vegetable products collected between December 2020 and March 2023. The data is tabular: each row records the price of one product at one store on one date, together with descriptive attributes for that observation.
The columns in the dataset are:
- date: Date of price collection, format DD/MM/YYYY (e.g., 03/12/2020).
- price: Retail price in euros (EUR), using a decimal point (.).
- product_id: A unique identifier assigned to each product.
- store_id: Anonymized unique identifier of the store where the price was recorded.
- region: Italian region where the store is located (e.g., Calabria, Lazio).
- product: Full commercial name of the product, including quantity or weight (e.g., "arance navelina italia calibro 1.5 kg").
- COICOP5: Product classification at the 5-digit level based on the COICOP nomenclature (e.g., "Oranges").
- COICOP4: Higher-level COICOP category (e.g., "Fruit", "Meat", "Vegetable").
Units and Notes:
- Currency: All prices are in euros (EUR).
- Quantities: The quantity or weight is included in the product field (e.g., "1.5 kg", "500 g").
- Date Format: Dates are in DD/MM/YYYY format.
- COICOP classification: Assigned via manual annotation and rule-based categorization using domain-specific keywords.
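Because the quantity is embedded in the product field, comparing prices across pack sizes first requires extracting it. The following is a minimal sketch, assuming weights appear as a number followed by "kg" or "g", as in the examples above. The helpers parse_weight_kg and price_per_kg are hypothetical names introduced for illustration and are not part of the dataset; product names that use other units or state no weight return None.

```python
import re
from typing import Optional

# Assumption: weights are written as "<number> kg" or "<number> g",
# with either a decimal point or a decimal comma.
_WEIGHT_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(kg|g)\b", re.IGNORECASE)


def parse_weight_kg(product: str) -> Optional[float]:
    """Return the first weight found in a product name, in kilograms."""
    match = _WEIGHT_RE.search(product)
    if match is None:
        return None
    value = float(match.group(1).replace(",", "."))
    unit = match.group(2).lower()
    # Convert grams to kilograms; kilograms pass through unchanged.
    return value / 1000.0 if unit == "g" else value


def price_per_kg(price: float, product: str) -> Optional[float]:
    """Return the price per kilogram, or None if no usable weight is found."""
    weight = parse_weight_kg(product)
    if not weight:
        return None
    return price / weight
```

On the full table this could be applied row by row, for example with df["product"].map(parse_weight_kg), after which rows without a parsed weight should be inspected rather than silently dropped.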
File Information:
- Format: CSV (.csv), UTF-8 encoded, comma-separated.
- Each row corresponds to one product observation at a specific store on a specific date.
- No missing values are present in the cleaned version.
This structure facilitates comprehensive analyses, enabling exploration of regional price variations, comparisons across product categories, and time-series investigations into price dynamics within the Italian retail food market.
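Before running any analysis, it can help to confirm that a loaded file matches the structure described above. The sketch below is one possible check, not an official loader: load_prices and EXPECTED_COLUMNS are hypothetical names, and SAMPLE_CSV holds made-up rows used only to show the layout. It parses dates explicitly as DD/MM/YYYY and raises an error if a documented column is absent or any value is missing.

```python
import io

import pandas as pd

# Column names as documented in the description above.
EXPECTED_COLUMNS = [
    "date", "price", "product_id", "store_id",
    "region", "product", "COICOP5", "COICOP4",
]

# Made-up rows for illustration only; not taken from the dataset.
SAMPLE_CSV = """date,price,product_id,store_id,region,product,COICOP5,COICOP4
03/12/2020,2.49,P1,S1,Lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit
04/12/2020,2.59,P1,S1,Lazio,arance navelina italia calibro 1.5 kg,Oranges,Fruit
"""


def load_prices(source) -> pd.DataFrame:
    """Read the CSV, check the documented columns, and parse dates."""
    df = pd.read_csv(source)
    missing = sorted(set(EXPECTED_COLUMNS) - set(df.columns))
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    # Dates are documented as DD/MM/YYYY, so the format is given explicitly.
    df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")
    if df[EXPECTED_COLUMNS].isna().any().any():
        raise ValueError("Unexpected missing values found")
    return df


sample = load_prices(io.StringIO(SAMPLE_CSV))
```

The same function accepts a file path, so load_prices("Variations_Food_Prices_Italian_Supermarkets.csv") would apply the checks to the full file.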
# ------------------------------------------------------------
# Sample Code for Dataset Analysis
# ------------------------------------------------------------
# Required libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load dataset
df = pd.read_csv("Variations_Food_Prices_Italian_Supermarkets.csv")
# Convert date column
df['date'] = pd.to_datetime(df['date'], format="%d/%m/%Y")  # Source format: DD/MM/YYYY
# Define category colors
category_colors = {"Fruit": "blue", "Vegetable": "green", "Meat": "red"}
# ------------------------------------------------------------
# Geographic distribution of unique products by region
# ------------------------------------------------------------
geo = df.groupby(["region", "COICOP4"])["product_id"].nunique().reset_index()
pivot_geo = geo.pivot(index="region", columns="COICOP4", values="product_id").fillna(0)
pivot_geo["Total"] = pivot_geo.sum(axis=1)
pivot_geo = pivot_geo.sort_values("Total", ascending=False).drop(columns="Total")
pivot_geo = pivot_geo[["Fruit", "Meat", "Vegetable"]]
pivot_geo.plot(kind="bar", stacked=True, figsize=(10,6), color=["blue", "red", "green"])
plt.ylabel("Number of Unique Products")
plt.title("Geographic Distribution by Region and Category (Sorted)")
plt.xticks(rotation=45, ha="right")
plt.legend(title="Category")
plt.tight_layout()
plt.show()
# ------------------------------------------------------------
# Basic analysis: average price trend over time (by COICOP4)
# ------------------------------------------------------------
price_trend = df.groupby(["date", "COICOP4"])["price"].mean().reset_index()
plt.figure(figsize=(10,5))
sns.lineplot(data=price_trend, x="date", y="price", hue="COICOP4", palette=category_colors)
plt.title("Average Price Over Time by COICOP4 Category")
plt.xlabel("Date")
plt.ylabel("Average Price (€)")
plt.legend(title="Category")
plt.tight_layout()
plt.show()
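The description also mentions regional price variations. One way to explore them is sketched below, assuming a DataFrame with the region, COICOP4, and price columns described above. The function regional_price_summary is a hypothetical helper, and the toy frame contains made-up values for illustration only. Prices are compared as recorded, without adjusting for pack size, so the result is only a rough indicator.

```python
import pandas as pd


def regional_price_summary(df: pd.DataFrame, category: str) -> pd.DataFrame:
    """Median price and observation count per region for one COICOP4 category."""
    subset = df[df["COICOP4"] == category]
    summary = subset.groupby("region")["price"].agg(
        median_price="median",
        n_obs="count",
    )
    # Most expensive regions first.
    return summary.sort_values("median_price", ascending=False)


# Made-up values for illustration only; not taken from the dataset.
toy = pd.DataFrame({
    "region": ["Lazio", "Lazio", "Calabria", "Calabria", "Calabria", "Lazio"],
    "COICOP4": ["Fruit", "Fruit", "Fruit", "Fruit", "Fruit", "Meat"],
    "price": [2.0, 4.0, 1.0, 2.0, 3.0, 10.0],
})
fruit_summary = regional_price_summary(toy, "Fruit")
```

The median is used rather than the mean so that a few unusually expensive products do not dominate a region's figure.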
Files
| Name | Size | Checksum |
|---|---|---|
| Variations_Food_Prices_Italian_Supermarkets.csv | 377.0 MB | md5:9ebdf8d41f1717f11e094e5690216e70 |
Additional details
Software
- Programming language: Python