Software Open Access

The D-NET software toolkit: dnet-basic-aggregator

Michele Artini; Claudio Atzori; Alessia Bardi; Sandro La Bruzzo; Paolo Manghi; Andrea Mannocci

D-NET Software Toolkit

The D-NET Software Toolkit is a system for the collection (“harvesting”), transformation, aggregation, and indexing of metadata records from an arbitrary number of data sources that comply with different protocols and data exchange formats. D-NET defines a workflow language that developers can use to combine a variety of D-NET data management services, configure them to handle data according to given data models, and pipeline them into autonomic data processing workflows.

This software package is a simplified version of the D-NET toolkit and consists of a web application with a minimal set of services for:

  • Collection of metadata records in oai_dc format via OAI-PMH, FTP, local file system, HTTP.

  • Transformation of the collected metadata records into an internal format named DMF (Driver Metadata Format)

  • Indexing of DMF records in a Solr full-text index

  • OAI-PMH export of aggregated metadata records in DMF and oai_dc formats. More formats can be added at runtime by providing a dedicated XSLT from DMF to the desired target format.
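As an illustration of the OAI-PMH collection step listed above, the following Python sketch builds a ListRecords request for oai_dc records and extracts record identifiers and the resumption token from a response. It is not D-NET code: the function names, the base URL and the sample response are hypothetical, and only the OAI-PMH request arguments (verb, metadataPrefix, resumptionToken) and the OAI-PMH 2.0 namespace come from the protocol itself.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # resumptionToken is an exclusive argument in OAI-PMH:
        # it is sent without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_list_records(xml_text):
    """Return (list of record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(
            "oai:ListRecords/oai:record/oai:header/oai:identifier", OAI_NS
        )
    ]
    token_element = root.find("oai:ListRecords/oai:resumptionToken", OAI_NS)
    has_token = token_element is not None and token_element.text
    return identifiers, (token_element.text if has_token else None)


# Hand-written sample response for illustration, not real repository output.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A harvester would typically repeat the request with the returned resumption token until the response no longer carries one.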

Major changes in version 1.3.0

  • OAI Publisher:
    • fixed cache management
    • fixed oai consistency (post feed) workflow branch
    • fixed deletion of content when workflows of data sources are deleted
  • D-Net enabling services:
    • using cache for subscription access
    • support only one subscription registry
  • Mongo based services (mdstore, oaistore, wf logging):
    • using API of mongo-java-driver 3.2.2, removed usage of deprecated methods
    • tracking the number of stored records, to help detect the collection of records with the same identifier
  • GUI:
    • enabling deletion of APIs via GUI
    • enabling editing of metadata_identifier_path
    • more info available in the datasource section
    • removed map of data sources (TODO: adapt to the new Google Maps API)
  • Metadata collection:
    • handling HTML illegal entities in collected XMLs
  • Indexing:
    • default query operator for "bag of words" queries set to AND instead of OR
  • Workflow manager:
    • do not launch workflows that were scheduled for execution during a pause of the aggregation system ("prepare for shutdown")
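To illustrate the kind of problem behind the "Metadata collection" item above, the Python sketch below shows one possible way to make XML that contains HTML named entities parseable. It assumes that "HTML illegal entities" refers to entities such as &nbsp; that HTML defines but XML does not; the release notes do not describe how D-NET handles them, so this is not the toolkit's implementation, and the function name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities every XML parser accepts without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named entity references such as &nbsp; or &eacute;.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_html_entities(xml_text):
    """Rewrite HTML named entities as numeric character references."""

    def substitute(match):
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)  # already legal in XML
        codepoint = name2codepoint.get(name)
        if codepoint is None:
            return match.group(0)  # unknown name: leave it for the parser to report
        return "&#{};".format(codepoint)

    return ENTITY_RE.sub(substitute, xml_text)


if __name__ == "__main__":
    raw = "<title>Caf&eacute;&nbsp;&amp; more</title>"
    cleaned = replace_html_entities(raw)
    print(cleaned)
    print(repr(ET.fromstring(cleaned).text))
```

This sketch rewrites every matching reference in the text, including any that appear inside CDATA sections, so a production version would need to take those into account.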

Official Web Site: http://www.d-net.research-infrastructures.eu

Need support? Contact us by email at dnet-team[at]isti.cnr.it

Files (392.9 kB)
Name: dnet-team/dnet-basic-aggregator-1.3.0.zip
Size: 392.9 kB
md5: 119ee0ff5f82d26be1987fc607d3df38
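The md5 checksum published for the archive can be used to check a download. The Python sketch below is one minimal way to do that with the standard library; the function names and the local file path are assumptions, and only the expected digest comes from this page.

```python
import hashlib

# Checksum published for dnet-basic-aggregator-1.3.0.zip on this page.
EXPECTED_MD5 = "119ee0ff5f82d26be1987fc607d3df38"


def md5_of_file(path, chunk_size=65536):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected md5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("dnet-basic-aggregator-1.3.0.zip")` would return True for an intact copy saved under that (assumed) local name.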
                   All versions   This version
Views              804            145
Downloads          78             19
Data volume        473.3 MB       7.5 MB
Unique views       681            128
Unique downloads   64             16
