There is a newer version of the record available.

Published January 30, 2018 | Version 0.1.0
Software Open

stain/arcp-py: arcp 0.1.0

  • 1. eScience lab, University of Manchester

Description

Create/parse arcp (Archive and Package) URIs.

This module provides functions for creating arcp URIs (draft-soilandreyes-arcp), which can be used for identifying or parsing hypermedia files packaged in an archive or package, like a ZIP file:

>>> from arcp import *
>>> arcp_random()
'arcp://uuid,dcd6b1e8-b3a2-43c9-930b-0119cf0dc538/'
>>> arcp_random("/foaf.ttl", fragment="me")
'arcp://uuid,dcd6b1e8-b3a2-43c9-930b-0119cf0dc538/foaf.ttl#me'
>>> arcp_hash(b"Hello World!", "/folder/")
'arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/'
>>> arcp_location("http://example.com/data.zip", "/file.txt")
'arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt'
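
The deterministic identifiers in the examples above follow patterns that can be reproduced with the standard library alone. The sketch below is an illustration under stated assumptions, not the module's implementation: it assumes the location-based form is a version 5 (name-based) UUID computed in the URL namespace, and that the hash-based form carries the unpadded base64url SHA-256 digest. The helper names sketch_arcp_location and sketch_arcp_hash are hypothetical.

```python
import base64
import hashlib
import uuid


def sketch_arcp_location(location, path="/"):
    """Hypothetical helper: name-based (version 5) UUID derived from a URL."""
    # Assumption: the UUID is computed in the standard URL namespace.
    archive_id = uuid.uuid5(uuid.NAMESPACE_URL, location)
    return "arcp://uuid,%s%s" % (archive_id, path)


def sketch_arcp_hash(data, path="/"):
    """Hypothetical helper: SHA-256 digest encoded as unpadded base64url."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "arcp://ni,sha-256;%s%s" % (encoded, path)


print(sketch_arcp_location("http://example.com/data.zip", "/file.txt"))
print(sketch_arcp_hash(b"Hello World!", "/folder/"))
```

Both forms are deterministic, so the same location or the same bytes always yield the same URI, unlike arcp_random.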

arcp URIs can be used with urllib.parse, for instance using urljoin to resolve relative references:

>>> import arcp
>>> import urllib.parse
>>> css = arcp.arcp_name("app.example.com", "css/style.css")
>>> urllib.parse.urljoin(css, "../fonts/foo.woff")
'arcp://name,app.example.com/fonts/foo.woff'
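
In the standard library, urljoin only resolves relative references for schemes listed in urllib.parse.uses_relative, and only carries over the authority for schemes listed in uses_netloc. The sketch below shows that mechanism using the standard library alone. Whether the arcp module registers its scheme in exactly this way is an assumption.

```python
import urllib.parse

# Assumption: registering the scheme like this is what lets urljoin treat
# arcp URIs as hierarchical; the arcp module may do it differently.
for registry in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
    if "arcp" not in registry:
        registry.append("arcp")

css = "arcp://name,app.example.com/css/style.css"
font = urllib.parse.urljoin(css, "../fonts/foo.woff")
print(font)
```

Without such a registration, urljoin hands back the relative reference unresolved for a scheme it does not recognise.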

In addition, this module provides functions that can be used to parse arcp URIs into their constituent fields:

>>> is_arcp_uri("arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt")
True
>>> is_arcp_uri("http://example.com/t")
False
>>> u = parse_arcp("arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt")
>>> u
ARCPSplitResult(scheme='arcp',prefix='uuid',name='b7749d0b-0e47-5fc4-999d-f154abe68065',
  uuid='b7749d0b-0e47-5fc4-999d-f154abe68065',path='/file.txt',query='',fragment='')
>>> u.path
'/file.txt'
>>> u.prefix
'uuid'
>>> u.uuid
UUID('b7749d0b-0e47-5fc4-999d-f154abe68065')
>>> u.uuid.version
5
>>> parse_arcp("arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/").hash
('sha-256', '7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069')

The object returned from parse_arcp is similar to ParseResult from urlparse, but contains additional properties prefix, uuid, ni, hash and name, some of which will be None depending on the arcp prefix.
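
How these properties relate to the URI can be sketched with the standard library. This is a simplified illustration under stated assumptions, not the module's code: the authority is taken to be prefix,name, the ni name is taken to be algorithm;base64url-digest, and the function name split_arcp_authority is hypothetical.

```python
import base64
import uuid
from urllib.parse import urlsplit


def split_arcp_authority(uri):
    """Hypothetical helper returning prefix, name, uuid, ni and hash as a dict."""
    parts = urlsplit(uri)
    prefix, _, name = parts.netloc.partition(",")
    fields = {"prefix": prefix, "name": name, "uuid": None, "ni": None,
              "hash": None, "path": parts.path}
    if prefix == "uuid":
        fields["uuid"] = uuid.UUID(name)
    elif prefix == "ni":
        fields["ni"] = name
        algorithm, _, encoded = name.partition(";")
        # The digest is unpadded base64url, so restore padding before decoding.
        padded = encoded + "=" * (-len(encoded) % 4)
        fields["hash"] = (algorithm, base64.urlsafe_b64decode(padded).hex())
    return fields


by_uuid = split_arcp_authority(
    "arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt")
by_hash = split_arcp_authority(
    "arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/")
print(by_uuid["uuid"].version, by_uuid["ni"])
print(by_hash["hash"], by_hash["uuid"])
```

In this sketch, fields that do not apply to a given prefix stay None, mirroring the behaviour described above.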

The function arcp.parse.urlparse can be imported as an alternative to urllib.parse.urlparse. If the scheme is arcp, the extra arcp fields such as prefix, uuid, hash and name are available just as with parse_arcp; otherwise the output is the same as from the regular urlparse:

>>> from arcp.parse import urlparse
>>> urlparse("arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/soup;sads")
ARCPParseResult(scheme='arcp',prefix='ni',
   name='sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk',
   ni='sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk',
   hash=('sha-256', '7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069'),
   path='/folder/soup;sads',query='',fragment='')
>>> urlparse("http://example.com/help?q=a")
ParseResult(scheme='http', netloc='example.com', path='/help', params='',
  query='q=a', fragment='')
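
The dispatch described above can be pictured as a thin wrapper around the standard parser. The sketch below is only an approximation with hypothetical names (ArcpResult, urlparse_sketch); it covers prefix and name and omits the other arcp fields.

```python
from collections import namedtuple
from urllib.parse import urlparse as std_urlparse

# Hypothetical result type; the real ARCPParseResult has more fields.
ArcpResult = namedtuple("ArcpResult", "scheme prefix name path query fragment")


def urlparse_sketch(uri):
    """Return an ArcpResult for arcp URIs, else the standard ParseResult."""
    parsed = std_urlparse(uri)
    if parsed.scheme != "arcp":
        return parsed  # unchanged standard ParseResult
    prefix, _, name = parsed.netloc.partition(",")
    return ArcpResult(parsed.scheme, prefix, name,
                      parsed.path, parsed.query, parsed.fragment)


arcp_result = urlparse_sketch(
    "arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/soup;sads")
http_result = urlparse_sketch("http://example.com/help?q=a")
print(arcp_result.prefix, arcp_result.path)
print(http_result.query)
```

The standard parser only splits off ;params for schemes it lists in uses_params, which is why ;sads stays part of the path for the arcp URI here.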

Files

stain/arcp-py-0.1.0.zip (20.4 kB)
md5:ffabe5d9333e1cc7ef7744c3a906e90d

Additional details

Funding

BioExcel – Centre of Excellence for Biomolecular Research 675728
European Commission