Published December 20, 2018 | Version 0.5.0
Dataset Open

Brisbane Library Checkout Data

  • 1. Monash University

Description

This has been copied from the README.md file

bris-lib-checkout

This repository provides tidied data from the Brisbane library checkouts.

Retrieving and cleaning the data

The script for retrieving and cleaning the data is made available in scrape-library.R.

The data

  • The data/ folder contains the tidy data
  • The data-raw/ folder contains the raw data

data/

This contains four tidied up dataframes:

  • tidy-brisbane-library-checkout.csv
  • metadata_branch.csv
  • metadata_heading.csv
  • metadata_item_type.csv

tidy-brisbane-library-checkout.csv contains the following columns, which are described in the metadata file metadata_heading.csv.

knitr::kable(readr::read_csv("data/metadata_heading.csv"))
#> Parsed with column specification:
#> cols(
#>   heading = col_character(),
#>   heading_explanation = col_character()
#> )

| heading | heading_explanation |
|:--|:--|
| Title | Title of Item |
| Author | Author of Item |
| Call Number | Call Number of Item |
| Item id | Unique Item Identifier |
| Item Type | Type of Item (see next column) |
| Status | Current Status of Item |
| Language | Published language of item (if not English) |
| Age | Suggested audience |
| Checkout Library | Checkout branch |
| Date | Checkout date |

We also added year, month, and day columns.
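
As an illustration only, date parts like these could be derived from a date-time column as sketched below. This is not taken from scrape-library.R: the two-row `checkouts` tibble is a hypothetical stand-in, and treating `day` as a weekday name is an assumption based on it being read as a character column further down.

```r
library(dplyr)

# Hypothetical stand-in for the checkout data; only `datetime` matters here.
checkouts <- tibble::tibble(
  datetime = as.POSIXct(c("2018-01-15 10:30:00", "2018-12-20 16:05:00"), tz = "UTC")
)

# Derive the date parts from the date-time column.
with_parts <- checkouts %>%
  mutate(
    year  = as.numeric(format(datetime, "%Y")),
    month = as.numeric(format(datetime, "%m")),
    day   = format(datetime, "%A")  # weekday name (assumed); locale-dependent
  )

with_parts
```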

The remaining files are metadata that explain the codes used in the columns of the checkout data:

library(tidyverse)
#> ── Attaching packages ────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 3.1.0           ✔ purrr   0.2.5     
#> ✔ tibble  1.4.99.9006     ✔ dplyr   0.7.8     
#> ✔ tidyr   0.8.2           ✔ stringr 1.3.1     
#> ✔ readr   1.3.0           ✔ forcats 0.3.0
#> ── Conflicts ───────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
knitr::kable(readr::read_csv("data/metadata_branch.csv"))
#> Parsed with column specification:
#> cols(
#>   branch_code = col_character(),
#>   branch_heading = col_character()
#> )

| branch_code | branch_heading |
|:--|:--|
| ANN | Annerley |
| ASH | Ashgrove |
| BNO | Banyo |
| BRR | BrackenRidge |
| BSQ | Brisbane Square Library |
| BUL | Bulimba |
| CDA | Corinda |
| CDE | Chermside |
| CNL | Carindale |
| CPL | Coopers Plains |
| CRA | Carina |
| EPK | Everton Park |
| FAI | Fairfield |
| GCY | Garden City |
| GNG | Grange |
| HAM | Hamilton |
| HPK | Holland Park |
| INA | Inala |
| IPY | Indooroopilly |
| MBG | Mt. Coot-tha |
| MIT | Mitchelton |
| MTG | Mt. Gravatt |
| MTO | Mt. Ommaney |
| NDH | Nundah |
| NFM | New Farm |
| SBK | Sunnybank Hills |
| SCR | Stones Corner |
| SGT | Sandgate |
| VAN | Mobile Library |
| TWG | Toowong |
| WND | West End |
| WYN | Wynnum |
| ZIL | Zillmere |

knitr::kable(readr::read_csv("data/metadata_item_type.csv"))
#> Parsed with column specification:
#> cols(
#>   item_type_code = col_character(),
#>   item_type_explanation = col_character()
#> )

| item_type_code | item_type_explanation |
|:--|:--|
| AD-FICTION | Adult Fiction |
| AD-MAGS | Adult Magazines |
| AD-PBK | Adult Paperback |
| BIOGRAPHY | Biography |
| BSQCDMUSIC | Brisbane Square CD Music |
| BSQCD-ROM | Brisbane Square CD Rom |
| BSQ-DVD | Brisbane Square DVD |
| CD-BOOK | Compact Disc Book |
| CD-MUSIC | Compact Disc Music |
| CD-ROM | CD Rom |
| DVD | DVD |
| DVD_R18+ | DVD Restricted - 18+ |
| FASTBACK | Fastback |
| GAYLESBIAN | Gay and Lesbian Collection |
| GRAPHICNOV | Graphic Novel |
| ILL | InterLibrary Loan |
| JU-FICTION | Junior Fiction |
| JU-MAGS | Junior Magazines |
| JU-PBK | Junior Paperback |
| KITS | Kits |
| LARGEPRINT | Large Print |
| LGPRINTMAG | Large Print Magazine |
| LITERACY | Literacy |
| LITERACYAV | Literacy Audio Visual |
| LOCSTUDIES | Local Studies |
| LOTE-BIO | Languages Other than English Biography |
| LOTE-BOOK | Languages Other than English Book |
| LOTE-CDMUS | Languages Other than English CD Music |
| LOTE-DVD | Languages Other than English DVD |
| LOTE-MAG | Languages Other than English Magazine |
| LOTE-TB | Languages Other than English Taped Book |
| MBG-DVD | Mt Coot-tha Botanical Gardens DVD |
| MBG-MAG | Mt Coot-tha Botanical Gardens Magazine |
| MBG-NF | Mt Coot-tha Botanical Gardens Non Fiction |
| MP3-BOOK | MP3 Audio Book |
| NONFIC-SET | Non Fiction Set |
| NONFICTION | Non Fiction |
| PICTURE-BK | Picture Book |
| PICTURE-NF | Picture Book Non Fiction |
| PLD-BOOK | Public Libraries Division Book |
| YA-FICTION | Young Adult Fiction |
| YA-MAGS | Young Adult Magazine |
| YA-PBK | Young Adult Paperback |

Example usage

Let’s explore the data:

bris_libs <- readr::read_csv("data/bris-lib-checkout.csv")
#> Parsed with column specification:
#> cols(
#>   title = col_character(),
#>   author = col_character(),
#>   call_number = col_character(),
#>   item_id = col_double(),
#>   item_type = col_character(),
#>   status = col_character(),
#>   language = col_character(),
#>   age = col_character(),
#>   library = col_character(),
#>   date = col_double(),
#>   datetime = col_datetime(format = ""),
#>   year = col_double(),
#>   month = col_double(),
#>   day = col_character()
#> )
#> Warning: 20 parsing failures.
#>    row     col expected  actual                         file
#> 587795 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> 590579 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> 590597 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> 595774 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> 597567 item_id a double REFRESH 'data/bris-lib-checkout.csv'
#> ...... ....... ........ ....... ............................
#> See problems(...) for more details.
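
The parsing failures above arise because item_id was guessed to be numeric while some rows hold the text REFRESH; readr turns such values into NA. One possible workaround, sketched here under assumptions, is to read item_id as character. The temporary file and its three rows are hypothetical stand-ins for data/bris-lib-checkout.csv, and the item ids in it are made up for illustration.

```r
library(readr)

# Hypothetical three-row stand-in for data/bris-lib-checkout.csv
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "title,item_id,item_type",
  "Hello,1001,AD-MAGS",
  "Country life,REFRESH,AD-MAGS",
  "Woman's day,1002,AD-MAGS"
), tmp)

# Read item_id as text so non-numeric values such as "REFRESH" are kept
# instead of being converted to NA with a parsing warning.
checkouts <- read_csv(
  tmp,
  col_types = cols(item_id = col_character())
)

# Rows whose identifier is not made up purely of digits
odd_ids <- checkouts[!grepl("^[0-9]+$", checkouts$item_id), ]
odd_ids
```

Whether REFRESH rows should be kept, recoded, or dropped depends on the analysis; the sketch only makes them visible.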

We can count the number of checkouts by title, item type, suggested age, and library:

library(dplyr)
count(bris_libs, title, sort = TRUE)
#> # A tibble: 121,046 x 2
#>    title                                n
#>    <chr>                            <int>
#>  1 Australian house and garden       1469
#>  2 New scientist (Australasian ed.)  1380
#>  3 Australian home beautiful         1331
#>  4 Country style                     1229
#>  5 The New idea                      1186
#>  6 Hello                             1133
#>  7 Woman's day                       1096
#>  8 Country life                      1056
#>  9 Better homes and gardens. (AU)    1041
#> 10 Yi Zhou Kan                        884
#> # … with 121,036 more rows
count(bris_libs, item_type, sort = TRUE)
#> # A tibble: 69 x 2
#>    item_type       n
#>    <chr>       <int>
#>  1 PICTURE-BK 121126
#>  2 DVD         98283
#>  3 AD-PBK      91671
#>  4 JU-PBK      88402
#>  5 NONFICTION  76168
#>  6 AD-MAGS     60516
#>  7 AD-FICTION  53090
#>  8 LARGEPRINT  19113
#>  9 JU-FICTION  17261
#> 10 LOTE-BOOK   12303
#> # … with 59 more rows
count(bris_libs, age, sort = TRUE)
#> # A tibble: 5 x 2
#>   age           n
#>   <chr>     <int>
#> 1 ADULT    420287
#> 2 JUVENILE 283902
#> 3 YA        13715
#> 4 <NA>        147
#> 5 UNKNOWN      36
count(bris_libs, library, sort = TRUE)
#> # A tibble: 38 x 2
#>    library     n
#>    <chr>   <int>
#>  1 SBK     49154
#>  2 BSQ     45968
#>  3 CNL     45642
#>  4 IPY     44569
#>  5 GCY     43090
#>  6 CDE     42775
#>  7 ASH     42086
#>  8 WYN     35124
#>  9 KEN     33947
#> 10 MTO     31201
#> # … with 28 more rows
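
The library codes in the last count can be made readable by joining them to the branch metadata. The sketch below uses small hypothetical tibbles in place of the real checkout data and metadata_branch.csv. Codes with no match in the metadata get NA; for instance, KEN appears in the counts above but not in the branch table listed in this README.

```r
library(dplyr)

# Hypothetical miniature versions of the checkout data and metadata_branch.csv
checkouts <- tibble::tibble(library = c("SBK", "SBK", "BSQ", "KEN"))
branches <- tibble::tibble(
  branch_code    = c("SBK", "BSQ"),
  branch_heading = c("Sunnybank Hills", "Brisbane Square Library")
)

# Count checkouts per branch code, then attach the branch name.
# left_join() keeps every code from the counts, matched or not.
branch_counts <- checkouts %>%
  count(library, sort = TRUE) %>%
  left_join(branches, by = c("library" = "branch_code"))

branch_counts
```

The same pattern should apply to item_type and metadata_item_type.csv, joining on item_type_code.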

License

These data are provided under a CC BY 4.0 license.

Notes

The data were downloaded from [Brisbane library checkouts](https://www.data.brisbane.qld.gov.au/data/dataset/library-checkouts-branch-date#) and tidied using the code in `data-raw`.

Files

Files (63.9 MB)

| Size | md5 |
|:--|:--|
| 63.9 MB | 27ddc9cf0cbb764ab16e991f8ca4344c |
| 10.3 kB | 715eef06df62792c9bddd8cc07eee91f |
| 3.3 kB | 318ddd88323678f150dc7b92747caa4d |