The CARS Project. A tale spanning the entire history of data science
Description
A brief story that began at an EPSRC sandpit on travel behaviour in 2010, where Eddie discovered that the government had decided to publish the anonymised results of MOT tests (the UK's annual vehicle reliability test). Using Linux command-line tools, Eddie discovered a somewhat unintended "Easter egg": it was possible to link the records relating to a single vehicle and, because the odometer reading (mileage) is recorded at each test, infer how much each vehicle is driven annually. Many iterations of this work followed, along with several quite major grants, in which we also had privileged access to special versions of the data that included the location of each vehicle's registered keeper. This enabled much more data fun, with linkages to census statistics and so on. Quite apart from the data science, there were many lessons in privacy, ethics, commercial relationships, licences, data protection, and secure computing environments. Most recently this has all led to the CARS (Connecting Administrative vehicle data for Research on Sustainable Transport) ADR UK/ESRC project. We've got some recent results on vehicle use in the pandemic, and also data we fed into the recent government consultation on stretching the gap between MOT tests. We'll conclude with a wish list of all the transport data we would like to have but will probably never get our hands on.
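The linkage idea described above can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the record layout `(vehicle_id, test_date, odometer_miles)`, the sample values, and the function name `annual_mileage` are all hypothetical. The sketch groups test records by vehicle, sorts them by date, and scales the odometer difference between consecutive tests to a 365.25-day year. As a simplifying assumption, pairs with a decreasing odometer reading are skipped.

```python
from collections import defaultdict
from datetime import date

# Hypothetical test records: (vehicle_id, test_date, odometer_miles).
# Field names and values are illustrative, not taken from the real dataset.
tests = [
    ("A", date(2011, 3, 1), 40000),
    ("B", date(2011, 6, 15), 12000),
    ("A", date(2012, 3, 1), 47320),
    ("B", date(2012, 6, 15), 21150),
]

def annual_mileage(records):
    """Estimate miles per year between consecutive tests of each vehicle."""
    # Link all records belonging to the same vehicle.
    by_vehicle = defaultdict(list)
    for vehicle_id, test_date, odometer in records:
        by_vehicle[vehicle_id].append((test_date, odometer))

    estimates = {}
    for vehicle_id, rows in by_vehicle.items():
        rows.sort()  # chronological order (tuples sort by date first)
        rates = []
        for (d0, o0), (d1, o1) in zip(rows, rows[1:]):
            days = (d1 - d0).days
            # Assumption: skip same-day retests and decreasing readings.
            if days <= 0 or o1 < o0:
                continue
            rates.append((o1 - o0) * 365.25 / days)
        estimates[vehicle_id] = rates
    return estimates

estimates = annual_mileage(tests)
for vehicle_id, rates in sorted(estimates.items()):
    print(vehicle_id, [round(r, 1) for r in rates])
```

Scaling by the actual number of days between tests, rather than assuming tests are exactly a year apart, matters because tests can be taken early or late.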
Files
Name | Size |
---|---|
REWilsonJThomasDataScienceTalk.pdf (md5:4d4ef7c996850e9fcd4fffb222707157) | 2.1 MB |