
Published December 19, 2021 | Version v2.0
Dataset Open

BigMart Retail Sales

Creators

Description

Nothing ever becomes real till it is experienced.

-John Keats

 

While we don't know the context in which John Keats said this, we are sure about its implication for data science. You will have enjoyed and gained exposure to real-world problems in this challenge; here is another opportunity to get your hands dirty with this practice problem.

_______________________________________

Problem Statement:

The data scientists at BigMart have collected 2013 sales data for 1559 products across 10 stores in different cities. Certain attributes of each product and store have also been defined. The aim is to build a predictive model that estimates the sales of each product at a particular store.

Using this model, BigMart will try to understand the properties of products and stores which play a key role in increasing sales.

Please note that the data may have missing values, as some stores might not report all the data due to technical glitches. These missing values will need to be handled appropriately.
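One way to approach the missing values is sketched below: count the empty cells per column, then fill numeric gaps with the column median and categorical gaps with the most frequent label. This is only an illustration under stated assumptions. The column names (Item_Weight, Outlet_Size, Item_MRP) and the toy rows are hypothetical and may not match data.csv, and median/mode imputation is just one of several reasonable strategies.

```python
import csv
import io
from collections import Counter
from statistics import median

# Hypothetical column names and toy rows; the real data.csv may differ.
SAMPLE_CSV = """Item_Weight,Outlet_Size,Item_MRP
9.3,Medium,249.8
,Small,48.3
17.5,,141.6
12.0,Small,53.9
"""


def is_missing(value):
    """Treat absent or blank cells as missing."""
    return value is None or str(value).strip() == ""


def count_missing(rows):
    """Return {column: number of missing cells} for columns with gaps."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if is_missing(value):
                counts[column] += 1
    return dict(counts)


def impute(rows, numeric_columns, categorical_columns):
    """Fill numeric gaps with the median and categorical gaps with the mode."""
    filled = [dict(row) for row in rows]
    for column in numeric_columns:
        present = [float(r[column]) for r in rows if not is_missing(r[column])]
        fill = median(present)
        for r in filled:
            r[column] = fill if is_missing(r[column]) else float(r[column])
    for column in categorical_columns:
        present = [r[column] for r in rows if not is_missing(r[column])]
        fill = Counter(present).most_common(1)[0][0]
        for r in filled:
            if is_missing(r[column]):
                r[column] = fill
    return filled


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing = count_missing(rows)
clean = impute(rows, ["Item_Weight"], ["Outlet_Size"])
```

On the toy rows, `missing` reports one gap in each of the two incomplete columns, and `clean` holds the same rows with those gaps filled. To try it on the real file, the `io.StringIO(SAMPLE_CSV)` source could be swapped for an opened data.csv, with the column lists adjusted to the actual header.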

________________________________________

Data:

The data set contains 14204 samples.

 

Variable Description

  • Item Identifier: A code provided for the item of sale
  • Item Weight: Weight of item
  • Item Fat Content: A categorical column of how much fat is present in the item: ‘Low Fat’, ‘Regular’, ‘low fat’, ‘LF’, ‘reg’
  • Item Visibility: Numeric value for how visible the item is
  • Item Type: The category the item belongs to: ‘Dairy’, ‘Soft Drinks’, ‘Meat’, ‘Fruits and Vegetables’, ‘Household’, ‘Baking Goods’, ‘Snack Foods’, ‘Frozen Foods’, ‘Breakfast’, ‘Health and Hygiene’, ‘Hard Drinks’, ‘Canned’, ‘Breads’, ‘Starchy Foods’, ‘Others’, ‘Seafood’.
  • Item MRP: The maximum retail price (MRP) of the item
  • Outlet Identifier: The outlet at which the item was sold. This is a categorical column
  • Outlet Establishment Year: The year in which the outlet was established
  • Outlet Size: A categorical column to explain size of outlet: ‘Medium’, ‘High’, ‘Small’.
  • Outlet Location Type: A categorical column to describe the location of the outlet: ‘Tier 1’, ‘Tier 2’, ‘Tier 3’
  • Outlet Type: Categorical column for type of outlet: ‘Supermarket Type1’, ‘Supermarket Type2’, ‘Supermarket Type3’, ‘Grocery Store’
  • Item Outlet Sales: The sales of the item at the given outlet. This is the value the model should predict.
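The Item Fat Content labels listed above mix spellings (‘Low Fat’, ‘low fat’, ‘LF’ and ‘Regular’, ‘reg’). A minimal sketch of collapsing them into two categories follows. The mapping is an assumption: the description does not state that ‘LF’ and ‘low fat’ mean ‘Low Fat’ or that ‘reg’ means ‘Regular’, although that is the natural reading. The names FAT_CONTENT_ALIASES and normalize_fat_content are hypothetical.

```python
# Assumed mapping: abbreviations and lower-case spellings are treated as
# variants of the two main labels. The description does not confirm this.
FAT_CONTENT_ALIASES = {
    "low fat": "Low Fat",
    "lf": "Low Fat",
    "regular": "Regular",
    "reg": "Regular",
}


def normalize_fat_content(label):
    """Map a raw Item Fat Content label to 'Low Fat' or 'Regular'."""
    key = label.strip().lower()
    if key not in FAT_CONTENT_ALIASES:
        # Fail loudly instead of guessing at labels not listed above.
        raise ValueError(f"Unexpected Item Fat Content label: {label!r}")
    return FAT_CONTENT_ALIASES[key]


raw = ["Low Fat", "Regular", "low fat", "LF", "reg"]
normalized = [normalize_fat_content(value) for value in raw]
```

Raising on unknown labels, rather than passing them through, makes it obvious if the file contains a spelling that the list above does not mention.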

_________________________________________

Evaluation Metric:

We will use the Root Mean Square Error (RMSE) value to judge your response.
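For reference, RMSE is the square root of the mean of the squared differences between actual and predicted values, so lower is better and large errors are penalised more heavily than small ones. The sketch below computes it with the standard library only; the function name rmse and the sample numbers are illustrative and are not taken from the data set.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only: errors of 10, -10 and 30.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
score = rmse(actual, predicted)
```

Here the squared errors are 100, 100 and 900, so `score` is the square root of 1100/3, roughly 19.15. The single error of 30 dominates the result, which is the behaviour to keep in mind when a model misses badly on a few high-selling items.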

Files

data.csv

Files (1.5 MB)

md5:3f4107e450913dd7243177121b1dc0a2