There is a newer version of the record available.

Published March 4, 2019 | Version 0.2.0
Dataset Open

Extract from the Library's Main Catalog

  • Staatsbibliothek zu Berlin

Description

The data set is based on the main catalog of the library. Currently, the following fields are extracted:

  • title
  • author (+ optional GND ID)
  • publisher
  • place of publication
  • country of publication
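For readers who want to work with the extracted fields in code, the list above can be mirrored as a small record type. This is only a sketch: the class name `CatalogRecord` and all attribute names are hypothetical, and the on-disk layout of the data files is not described here, so no parsing is attempted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalogRecord:
    """One catalog entry with the five extracted fields (names are illustrative)."""

    title: str
    author: Optional[str] = None
    # The GND ID is optional in the extract, so it defaults to None.
    author_gnd_id: Optional[str] = None
    publisher: Optional[str] = None
    place_of_publication: Optional[str] = None
    country_of_publication: Optional[str] = None


# Placeholder values, not taken from the data set.
example = CatalogRecord(title="Example title", author="Example Author")
```

Every field except the title is optional in this sketch, on the assumption that catalog entries may be incomplete.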

The extract was created with the processPicaPlus script, which is linked from the original record. In version 0.1.0, some special characters might not have been extracted correctly.

Change Log:

0.2.0    fixes various encoding issues for non-ASCII characters

Dataset Characteristics

The following languages are available in separate data files:

  • eng
  • ger
  • lat
  • fre
  • ita
  • spa
  • por
  • dut
  • swe
  • dan
  • nor
  • ice
  • fry

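The remaining languages are present in the data set but have not been separated, i.e., they are combined in one data file. Note that the list below also contains the separately available languages and some irregular values such as 'dt.', 'ger,' or 't--':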

'fre', 'rus', 'pol', 'ger', 'eng', 'lit', 'dan', 'dut', 'spa', 'swe', 'ita', 'lat', 'nor', 'ind', 'bul', 'grc', 'fry', 'rum', 'cze', 'slo', 'bel', 'ice', 'fin', 'gre', 'hun', 'tur', 'enm', 'hrv', 'est', 'srp', 'roh', 'syr', 'wen', 'mal', 'afr', 'slv', 'mac', 'smi', 'nds', 'qmw', 'pra', 'oci', 'bre', 'san', 'alb', 'baq', 'non', 'ara', 'chm', 'per', 'cat', 'gmh', 'sla', 'arm', 'ukr', 'por', 'chu', 'heb', 'arc', 'gle', 'tib', 'lav', 'geo', 'crp', 'hin', 'mul', 'chi', 'epo', 'kor', 'kan', 'vot', 'csb', 'glg', 'kaz', 'frm', 'jpn', 'bur', 'srd', 'sal', 'ira', 'bos', 'mol', 'rom', 'tat', 'aze', 'yid', 'mar', 'mak', 'pli', 'rys', 'tgk', 'map', 'vie', 'tuk', 'oss', 'ota', 'tut', 'ben', 'sun', 'tir', 'bak', 'chv', 'ber', 'khm', 'may', 'pan', 'uzb', 'swa', 'kir', 'egy', 'dum', 'nep', 'cop', 'mon', 'tam', 'urd', 'zxx', 'wel', 'mis', 'ng', 'goh', 'dt', 'en', 'fao', 'fro', 'pus', 'kur', 'cus', 'hau', 'uig', 'sit', 'dt.', 'cpf', 'tgl', 'qoj', 'tag', 'raj', 'fiu', 'xal', 'kbd', 'udm', 'scr', 'gag', 'kas', 'scc', 'pro', 'tha', 'dar', 'dr', 'sna', 'ewe', 'de', 'dra', 'ang', 'ine', 'zza', 'und', 'ave', 'amh', 'crh', 'jav', 'cpe', 'akk', 'dsb', 'qce', 'guj', 'ltz', 'got', 'bua', 'peo', 'mdr', 'nob', 'ava', 'che', 'sux', 'kok', 'zap', 'nl', 'inc', 'sah', 'gem', 'law', 'bem', 'sin', 'qdo', 'hsb', 'som', 'lao', 'kam', 'kom', 'abk', 'roa', 'cau', 'ady', 'bat', 'mlt', 'sai', 'xho', 'paa', 'sot', 'bnt', 'lug', 'myn', 'kar', 'qhe', 'kin', 'zul', 'tsn', 'apa', 'nso', 'yao', 'yor', 'bih', 'nog', 'nap', 'loz', 'nbl', 'kon', 'nya', 'snh', 'chn', 'run', 'suk', 'fur', 'osa', 'bra', 'den', 'kpe', 'kal', 'tig', 'wol', 'gla', 'lad', 'mos', 'cre', 'krc', 'ge', 'fr', 'dak', 'fij', 'mad', 'srr', 'kum', 'her', 'nai', 'cel', 'inh', 'kro', 'hit', 'pal', 'tmh', 'tsw', 'bam', 'kab', 'kik', 'kua', 'lub', 'luo', 'nub', 'tem', 'znd', 'mai', 'tai', 'qkr', 'ful', 'man', 'lol', 'sag', 'tog', 'hai', 'arg', 'fat', 'nav', 'niu', 'ibo', 'ido', 'men', 'qju', 'gaa', 'vol', 'nah', 'mlg', 'nic', 'ijo', 'sus', 'orm', 
'smo', 'mag', 'tyv', 'mnc', 'cos', 'mdf', 'kaa', 'dua', 'gez', 'ton', 'ven', 'snd', 'syc', 'nym', 'nia', 'sem', 'chg', 'fan', 'twi', 'mas', 'ina', 'ile', 'art', 'ori', 'qai', 'arw', 'mao', 'bas', 'kmb', 'tiv', 'bal', 'tar', 'tpi', 'abs', 'asm', 'qqa', 'iku', 'min', 'rup', 'tel', 'or', 'tah', 'aka', 'day', 'qqg', 'lah', 'lus', 'sio', 'oto', 'alg', 'shn', 'ndo', 'haw', 'tso', 'mus', 'cai', 'qev', 'new', 'zha', 'grn', 'khi', 'ssw', 'nde', 'bla', 'grb', 'mun', 'din', 'sam', 'mwr', 'cor', 'sat', 'cho', 'ger,', 'que', 'btk', 'glv', 'rar', 'jk', 'nno', 'cmc', 'mga', 'jw', 'iro', 'sog', 'hat', 'dzo', 'mkh', 'bik', 'ban', 'ilo', 'pam', 'ts', 'sme', 'myv', 'qnn', 'jpr', 'qte', 'yap', 'bis', 'sga', 'qkj', 'pap', 'ath', 'ipk', 'phi', 'sco', 'del', 'moh', 'iri', 'gae', 'ryl', 'our', 't--', 'grk', 'ssa', 'awa', 'efi', 'jrb', 'enk', 'kru', 'oji', 'arn', 'car', 'gsw', 'lez', 'war', 'ace', 'qrn', 'wln', 'ceb', 'aar', 'bug', 'kaw', 'chr', 'cpp', 'tet', 'aym', 'ces', 'hmo'
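Most entries in the list above are three lowercase letters, the shape of an ISO 639-2 code, but a few are not ('de', 'dt.', 'ger,', 't--'). The following sketch separates the two groups so that the irregular values can be reviewed by hand. The function name `split_codes` is an assumption for illustration, and a three-letter shape does not by itself prove that a code is a registered language code.

```python
import re

# Shape check only: exactly three lowercase ASCII letters.
WELL_FORMED = re.compile(r"[a-z]{3}")


def split_codes(codes):
    """Split codes into (well_formed, irregular) lists, preserving input order."""
    well_formed, irregular = [], []
    for code in codes:
        if WELL_FORMED.fullmatch(code):
            well_formed.append(code)
        else:
            irregular.append(code)
    return well_formed, irregular


# A few values copied from the list above.
sample = ["ger", "ger,", "dt.", "de", "eng", "t--", "zxx"]
ok, odd = split_codes(sample)
print(ok)   # ['ger', 'eng', 'zxx']
print(odd)  # ['ger,', 'dt.', 'de', 't--']
```

How the irregular values should be mapped (for example, whether 'ger,' and 'dt.' both mean 'ger') is a decision for the user and is not specified by the data set description.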

Files

dan_out.txt

Files (1.7 GB)

Checksum                              Size
md5:3103eff4eef90aad253eba5229c577c2  3.2 MB
md5:bf76946eaa93a6b1c582788899a47eb6  12.7 MB
md5:bd17fe565e1fe4beef680603450ef013  299.4 MB
md5:e1b39cb53ea0a2108e757ed3e0912007  67.2 MB
md5:d36a9fd05d58a13d3995905bb5402dbc  59.6 kB
md5:96b56d119e58742a8e9204ca01baef68  439.4 MB
md5:4106480e44a727a9433a26e1a604aa86  163.5 kB
md5:e3d4555beebb82b92186e492c167f9cc  26.6 MB
md5:ef758fed9a3fcb17c4ce979a367fb2e3  60.8 MB
md5:741732a261553531d9a5dd67ea2f09ce  1.7 MB
md5:328e0b64b81c3d438ac746e373fdc540  766.1 MB
md5:be83199a755596125dd5b770d46c3f13  1.4 MB
md5:b872cd9e81cb700f649afc419b949461  7.0 MB
md5:d723ea58f8630c3c97fb9b7cd12c5d7c  3.1 kB
md5:07cdfac0464345894eaf7392b50ed894  5.1 MB
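Because every file is published with an MD5 checksum, a download can be checked for corruption before it is used. The sketch below relies only on Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are assumptions for illustration, and since the listing does not show which checksum belongs to which file, the usage comment uses placeholders.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage with placeholders (path and checksum pairing are hypothetical):
# matches_checksum("path/to/downloaded_file.txt", "md5:<checksum from the listing>")
```

Reading in chunks matters here because the largest files are several hundred megabytes.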