D2.3 Report on Data Expansion and Population
Description
The document presents an overview of the EFRA Data Platform's data assets and sources. It details the sources, challenges, and recent integrations of data relevant to global food safety. The EFRA Platform already contains millions of data records from highly heterogeneous sources, including:
1. Public Food Safety Authorities: Data from over 50 international food safety authority websites, scraped daily and tailored for efficient information extraction. The challenges are the data's heterogeneity in language and format and the lack of a global schema.
2. EFSA Lab Tests: Annual lab test results from the European Food Safety Authority, aggregated from national authorities across Europe. The challenge is transforming non-machine-readable formats.
3. Food Safety News Sites: Up-to-date information from authoritative food safety websites, requiring sophisticated processing to structure the natural language content.
4. Weather Data: Weather information from the Visual Crossing API, with challenges in data consistency and quality due to variations in measurement units and discrepancies in timestamps.
5. Regulatory Bodies: Regulations and news from public authorities worldwide, featuring language diversity, various document formats, and complex tables that complicate data parsing.
6. Pests Data: Scientific data on pests, manually extracted and paired with pesticide information from governmental sources, presented in machine-readable formats.
7. Food Safety Videos: Information extracted via speech transcription from food safety videos published by CDC, EFSA, and USDA, with recent additions to the data sources.
8. Private Sources: Data from the Agrivi Platform on farm management, pest alarms, and weather parameters; the Moy Park MTech Platform on poultry production and Salmonella testing; and the Food Fortress Platform on mycotoxin analysis in animal feed.
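The unit and timestamp discrepancies noted for the weather data in the list above can be illustrated with a minimal sketch. It is not the EFRA Platform's implementation: the `WeatherReading` record, the raw field names (`station`, `time`, `temp`, `unit`), and the rule that offset-free timestamps are treated as UTC are all assumptions made for illustration, and they do not reflect the Visual Crossing API's actual response format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WeatherReading:
    """Hypothetical unified record: temperature in Celsius, timestamp in UTC."""
    station: str
    observed_at: datetime
    temperature_c: float


def to_celsius(value: float, unit: str) -> float:
    """Convert a temperature given in 'C', 'F', or 'K' to degrees Celsius."""
    unit = unit.upper()
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0
    if unit == "K":
        return value - 273.15
    raise ValueError(f"unsupported temperature unit: {unit!r}")


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp and express it in UTC.

    Assumption for this sketch: a timestamp without an offset is already UTC.
    """
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def normalize(raw: dict) -> WeatherReading:
    """Map one raw reading (hypothetical field names) onto the unified record."""
    return WeatherReading(
        station=raw["station"],
        observed_at=to_utc(raw["time"]),
        temperature_c=round(to_celsius(raw["temp"], raw["unit"]), 2),
    )


if __name__ == "__main__":
    reading = normalize(
        {"station": "A", "time": "2024-03-29T14:00:00+02:00", "temp": 68.0, "unit": "F"}
    )
    print(reading)
```

Converting every reading to one unit and one time zone at ingestion means that later comparisons across sources do not need to know where a value came from.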
The executive summary highlights the integration of various data sources into EFRA, which enhances the understanding of food safety trends. It also elaborates on the data's formats and languages, the inherent challenges in handling and standardizing such diverse information, and the solutions implemented to create a uniform dataset for stakeholders.
Files
| Name | Size | Checksum |
|---|---|---|
| EFRA D2.3.1_Data.Expansion-final_2024.03.29.pdf | 4.4 MB | md5:ac59638ded63911edd8d4b3b9260e748 |