Semantic Annotation for Tabular Data with DBpedia: Adapted SemTab 2019 with DBpedia 2016-10
- 1. National Institute of Informatics
- 2. National Institute of Advanced Industrial Science and Technology
Description
Github: https://github.com/phucty/mtab4dbpedia
---------------------------------------------------------------------------------------------------------------------------------------
CEA:
- Keep only valid entities in DBpedia 2016-10
- Resolve percent-encoding
- Add missing redirect entities
CTA:
- Keep only valid types
- Resolve transitive types (parents and equivalent types of the specific type) using the DBpedia ontology 2016-10
CPA:
- Add equivalent properties
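
The CEA steps listed above can be illustrated with a short Python sketch. It is not the authors' script: the helper name `adapt_cea_target`, the `valid_entities` set, and the `redirects` mapping are hypothetical stand-ins for data loaded from the DBpedia 2016-10 dump, and it assumes that "add missing redirect entities" means also accepting the target of a redirect.

```python
from urllib.parse import unquote

DBR = "http://dbpedia.org/resource/"


def adapt_cea_target(uris, valid_entities, redirects):
    """Normalise the ground-truth URIs of one cell (hypothetical helper).

    valid_entities: set of DBpedia IDs present in the 2016-10 dump (assumed input)
    redirects:      dict mapping a redirect ID to its target ID (assumed input)
    """
    adapted = set()
    for uri in uris:
        # Resolve percent-encoding, e.g. "Kant%C5%8D_region" -> "Kantō_region"
        name = unquote(uri)
        if name.startswith(DBR):
            name = name[len(DBR):]
        # Keep the entity only if it exists in the dump
        if name in valid_entities:
            adapted.add(name)
        # Assumption: a redirect contributes its target entity
        target = redirects.get(name)
        if target in valid_entities:
            adapted.add(target)
    return adapted


valid = {"Kantō_region", "Tokyo"}
redirects = {"Tokyo_Metropolis": "Tokyo"}
print(adapt_cea_target([DBR + "Kant%C5%8D_region"], valid, redirects))
```

A cell whose URIs all fail these checks ends up with an empty set and would be dropped from the targets, which is one plausible reason for the small decreases in the CEA counts.
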
Statistics of the adapted SemTab 2019 tabular data
| | CEA | | | CPA | | | CTA | | |
|---------|:--------:|:-------:|:------:|:--------:|:-------:|:------:|:--------:|:-------:|:------:|
| | Original | Adapted | Change | Original | Adapted | Change | Original | Adapted | Change |
| Round 1 | 8418 | 8406 | -0.14% | 116 | 116 | 0.00% | 120 | 120 | 0.00% |
| Round 2 | 463796 | 457567 | -1.34% | 6762 | 6762 | 0.00% | 14780 | 14333 | -3.02% |
| Round 3 | 406827 | 406820 | 0.00% | 7575 | 7575 | 0.00% | 5762 | 5673 | -1.54% |
| Round 4 | 107352 | 107351 | 0.00% | 2747 | 2747 | 0.00% | 1732 | 1717 | -0.87% |
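
The Change column is consistent with the relative difference between the adapted and original counts, expressed in percent. A minimal sketch of that arithmetic, where `percent_change` is an illustrative name and not part of the dataset tooling:

```python
def percent_change(original: int, adapted: int) -> float:
    """Relative change from original to adapted, in percent, rounded to 2 decimals."""
    return round((adapted - original) / original * 100, 2)


# Counts copied from the CTA columns of the table above.
cta_counts = {
    "Round 1": (120, 120),
    "Round 2": (14780, 14333),
    "Round 3": (5762, 5673),
    "Round 4": (1732, 1717),
}

for name, (original, adapted) in cta_counts.items():
    print(f"{name}: {percent_change(original, adapted):.2f}%")
```

Applied to the CTA counts, this reproduces the listed values of 0.00%, -3.02%, -1.54%, and -0.87%.
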
---------------------------------------------------------------------------------------------------------------------------------------
DBpedia 2016-10 extra resources: (Original dataset http://downloads.dbpedia.org/2016-10/)
---------------------------------------------------------------------------------------------------------------------------------------
File: _dbpedia_classes_2016-10.csv
Information: DBpedia classes and their parent classes (the abstract types Agent and Thing are removed)
Total: 759 classes
Structure: [class, parents (space-separated)] (without the prefix dbo: or http://dbpedia.org/ontology/)
Example: "City","Location Place PopulatedPlace Settlement"
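
As a rough illustration of how this file could be used to expand a specific type into its transitive types for CTA, the sketch below parses rows in the documented structure. It runs on the example row only; the function names are hypothetical, and it assumes the file has no header row.

```python
import csv
import io

# The example row from this description, used as stand-in data.
SAMPLE = '"City","Location Place PopulatedPlace Settlement"\n'


def load_class_parents(lines):
    """Map each class to the list of its parents (space-separated in column 2)."""
    parents = {}
    for row in csv.reader(lines):
        if not row:
            continue
        parent_field = row[1] if len(row) > 1 else ""
        parents[row[0]] = parent_field.split()
    return parents


def expand_types(specific_type, parents):
    """Return the specific type followed by its parents."""
    return [specific_type] + parents.get(specific_type, [])


parents = load_class_parents(io.StringIO(SAMPLE))
print(expand_types("City", parents))
# -> ['City', 'Location', 'Place', 'PopulatedPlace', 'Settlement']
```

To read the real file, pass an open file handle instead of the `io.StringIO` sample, for example `open("_dbpedia_classes_2016-10.csv", newline="", encoding="utf-8")`.
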
---------------------------------------------------------------------------------------------------------------------------------------
File: _dbpedia_properties_2016-10.csv
Information: DBpedia properties and their equivalents
Total: 2865 properties
Structure: [property, its equivalent properties] (without the prefix dbo: or http://dbpedia.org/ontology/)
Example: "restingDate","deathDate"
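
One way to use the equivalents in CPA is to treat a predicted property as matching when it equals the expected property or is listed as its equivalent. The sketch below is illustrative only: the function names are hypothetical, and it assumes that multiple equivalents are space-separated (by analogy with the classes file) and that equivalence is symmetric.

```python
import csv
import io
from collections import defaultdict

# The example row from this description, used as stand-in data.
SAMPLE = '"restingDate","deathDate"\n'


def load_equivalents(lines):
    """Build a symmetric property -> set of equivalent properties mapping."""
    equiv = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) < 2:
            continue
        prop = row[0]
        for other in row[1].split():
            equiv[prop].add(other)
            equiv[other].add(prop)  # assumption: equivalence holds both ways
    return equiv


def same_property(a, b, equiv):
    """True if a and b are identical or recorded as equivalent."""
    return a == b or b in equiv.get(a, set())


equiv = load_equivalents(io.StringIO(SAMPLE))
print(same_property("deathDate", "restingDate", equiv))  # -> True
```
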
---------------------------------------------------------------------------------------------------------------------------------------
File: _dbpedia_domains_2016-10.csv
Information: DBpedia properties and their domain types
Total: 2421 properties (those that have a type as their domain)
Structure: [property, type (domain)] (without the prefix dbo: or http://dbpedia.org/ontology/)
Example: "deathDate","Person"
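
The domain file can serve as a plausibility filter: a candidate property is kept only if its domain is among the types of the subject entity. This is a sketch of one possible use, not a documented step of the adaptation; `load_domains` and `domain_matches` are hypothetical names, and properties without a recorded domain are treated as unconstrained.

```python
import csv
import io

# The example row from this description, used as stand-in data.
SAMPLE = '"deathDate","Person"\n'


def load_domains(lines):
    """Map each property to its domain type."""
    return {row[0]: row[1] for row in csv.reader(lines) if len(row) >= 2}


def domain_matches(prop, subject_types, domains):
    """True if prop has no recorded domain or its domain is one of subject_types."""
    domain = domains.get(prop)
    return domain is None or domain in subject_types


domains = load_domains(io.StringIO(SAMPLE))
print(domain_matches("deathDate", {"Person", "Scientist"}, domains))  # -> True
print(domain_matches("deathDate", {"City", "Place"}, domains))        # -> False
```
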
---------------------------------------------------------------------------------------------------------------------------------------
File: _dbpedia_entities_2016-10.jsonl.bz2
Information: DBpedia entity dump
Format: bz2-compressed JSON Lines (one JSON object per line)
Source: DBpedia dump 2016-10 core
Total: 5,289,577 entities (disambiguation entities excluded)
Structure:
An entity, for example "Tokyo" (datatype: dictionary):
{
  'wd': 'Q1322032', (Wikidata ID, datatype: string)
  'wp': 'Tokyo', (Wikipedia ID; prepend https://en.wikipedia.org/wiki/ to get the Wikipedia URL, datatype: string)
  'dp': 'Tokyo', (DBpedia ID; prepend http://dbpedia.org/resource/ to get the DBpedia URL, datatype: string)
  'label': 'Tokyo', (Entity label, datatype: string)
  'aliases': ['To-kyo', 'Tôkyô Prefecture', ...], (Other entity names, datatype: list)
  'aliases_multilingual': ['东京小子', 'طوكيو', ...], (Other entity names in multiple languages, datatype: list)
  'types_specific': 'City', (Entity direct type, datatype: string)
  'types_transitive': ['Human settlement', 'City', 'PopulatedPlace', 'Location', 'Place', 'Settlement'], (Entity transitive types, datatype: list)
  'claims_entity': { (Entity statements, datatype: dictionary. Keys: properties, Values: list of tail entities)
    'governingBody': ['Tokyo Metropolitan Government'],
    'subdivision': ['Honshu', 'Kantō region'],
    ...
  },
  'claims_literal': { (Literal statements grouped by literal type, datatype: dictionary)
    'string': { (String literals, datatype: dictionary. Keys: properties, Values: list of values)
      'postalCode': ['JP-13'],
      'utcOffset': ['+09:00', '+9'],
      ...
    },
    'time': { (Time literals, datatype: dictionary. Keys: properties, Values: list of date-time values)
      'populationAsOf': ['2016-07-31'],
      ...
    },
    'quantity': { (Numerical literals, datatype: dictionary. Keys: properties, Values: list of values)
      'populationDensity': [6224.66, 6349.0],
      'maximumElevation': [2017],
      ...
    }
  },
  'pagerank': 2.2167366040153352e-06 (Entity PageRank score calculated on the DBpedia graph)
}
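
Because the dump is large, it is usually read line by line instead of being loaded whole. The sketch below streams entities from a bz2-compressed file and builds a simple name-to-entity lookup from `label` and `aliases`. It assumes each line is one valid JSON object with the keys described above; `iter_entities` and `build_label_index` are hypothetical names, and the demo runs on a tiny synthetic file, not the real dump.

```python
import bz2
import json
import os
import tempfile


def iter_entities(path):
    """Yield one entity dictionary per non-empty line of a .jsonl.bz2 file."""
    with bz2.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def build_label_index(entities):
    """Map each lowercased label or alias to the set of DBpedia IDs using it."""
    index = {}
    for ent in entities:
        names = [ent.get("label")] + list(ent.get("aliases") or [])
        for name in names:
            if name:
                index.setdefault(name.lower(), set()).add(ent["dp"])
    return index


# Demo on a one-line synthetic file (stand-in for the real dump).
sample = {"dp": "Tokyo", "label": "Tokyo", "aliases": ["To-kyo"], "types_specific": "City"}
demo_path = os.path.join(tempfile.mkdtemp(), "demo.jsonl.bz2")
with bz2.open(demo_path, "wt", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")

index = build_label_index(iter_entities(demo_path))
print(index["to-kyo"])  # -> {'Tokyo'}
```

For the real file, replace `demo_path` with the path to `_dbpedia_entities_2016-10.jsonl.bz2`. An index over all 5,289,577 entities may need several gigabytes of memory, so filtering while streaming is often preferable.
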
---------------------------------------------------------------------------------------------------------------------------------------
THIS DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Files (1.4 GB)
| MD5 checksum | Size |
|---|---|
| c9c2610533b87e366193566dcefadbbb | 33.7 kB |
| 9f78b253eba405e3587c90df1c02487e | 72.3 kB |
| 102dee1c3848b4206bb92e8c94f77b58 | 1.4 GB |
| 5ba557fd1e145964d4729935219feba2 | 57.1 kB |
| 8501aa7ea2c615359e511c1057771743 | 42.0 MB |
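
After downloading, a file can be checked against its MD5 checksum. The sketch below hashes a file in chunks so that the 1.4 GB dump does not need to fit in memory. The function names are illustrative, and the expected checksum must be taken from the record page entry for the file being checked.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A call such as `verify("downloaded_file", "<md5 from the record page>")` returns `False` for a truncated or corrupted download.
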
Additional details
Related works
- Is cited by
- Dataset: 10.5281/zenodo.3518539 (DOI)