Stian Soiland-Reyes
2018-02-05
<p>Create/parse <a href="https://tools.ietf.org/html/draft-soilandreyes-arcp-03">arcp</a> (Archive and Package) URIs.</p>
<p><strong>Introduction</strong></p>
<p><code>arcp</code> provides functions for creating <a href="https://tools.ietf.org/html/draft-soilandreyes-arcp-03">arcp</a> URIs, which can be used for identifying or parsing hypermedia files packaged in an archive or package, like a ZIP file.</p>
<p>arcp URIs can be used to consume or reference hypermedia resources bundled inside a file archive or an application package, as well as to resolve URIs for archive resources within a programmatic framework.</p>
<p>This URI scheme provides mechanisms to generate a unique base URI to represent the root of the archive, so that relative URI references in a bundled resource can be resolved within the archive without having to extract the archive content on the local file system.</p>
<p>An arcp URI can be used for isolation (e.g. when consuming multiple archives), for security constraints (avoiding "climb out" from the archive), or for externally identifying sub-resources referenced by hypermedia formats.</p>
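<p>The base URI idea can be illustrated with the standard library alone. This is a minimal sketch, not this library's API: it assumes Python 3, and because plain <code>urllib.parse</code> does not treat unknown schemes as hierarchical, the sketch registers <code>arcp</code> itself. The file names are made up for illustration.</p>

```python
import urllib.parse

# Assumption: stock urllib.parse only resolves relative references for
# schemes listed in these registries, so "arcp" is added here by hand.
for registry in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
    if "arcp" not in registry:
        registry.append("arcp")

# Hypothetical resource inside an archive identified by a random UUID.
base = "arcp://uuid,32a423d6-52ab-47e3-a9cd-54f418a48571/pics/index.html"

# Relative references resolve against the archive root, not the file system.
sibling = urllib.parse.urljoin(base, "flower.jpeg")
parent = urllib.parse.urljoin(base, "../doc.html")

# Excess ".." segments are dropped, so the result keeps the same authority.
escape = urllib.parse.urljoin(base, "../../../etc/passwd")

print(sibling)
print(parent)
print(escape)
```

<p>In this sketch every resolved URI keeps the archive's authority, which is what makes the "climb out" case harmless.</p>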
<p><strong>Examples</strong>:</p>
<ul>
<li><code>arcp://uuid,32a423d6-52ab-47e3-a9cd-54f418a48571/doc.html</code></li>
<li><code>arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/pics/</code></li>
<li><code>arcp://ni,sha-256;F-34D4TUeOfG0selz7REKRDo4XePkewPeQYtjL3vQs0/</code></li>
<li><code>arcp://name,gallery.example.org/</code></li>
</ul>
<p>Which form of URI <a href="https://tools.ietf.org/id/draft-soilandreyes-arcp-03.html#rfc.section.4.1">authority</a> to use in an arcp URI depends on the uniqueness constraints that should apply when addressing an archive. See the <a href="https://tools.ietf.org/html/draft-soilandreyes-arcp-03">arcp</a> specification (<em>draft-soilandreyes-arcp</em>) for details.</p>
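<p>As a rough illustration of how the authority forms differ, the prefix before the comma can be separated from the rest with the standard library. The helper <code>split_authority</code> below is hypothetical and is not part of this library; it is a sketch assuming Python 3.</p>

```python
from urllib.parse import urlsplit


def split_authority(uri):
    """Hypothetical helper: return (prefix, name) from an arcp URI authority."""
    parts = urlsplit(uri)
    if parts.scheme != "arcp":
        raise ValueError("not an arcp URI: %r" % uri)
    # The authority has the form "<prefix>,<name>", e.g. "uuid,<UUID>".
    prefix, _, name = parts.netloc.partition(",")
    return prefix, name


examples = [
    "arcp://uuid,32a423d6-52ab-47e3-a9cd-54f418a48571/doc.html",
    "arcp://ni,sha-256;F-34D4TUeOfG0selz7REKRDo4XePkewPeQYtjL3vQs0/",
    "arcp://name,gallery.example.org/",
]
for uri in examples:
    print(split_authority(uri))
```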
<p>Note that this library only provides mechanisms to <em>generate</em> and <em>parse</em> arcp URIs, and does <em>not</em> integrate with any particular archive or URL handling modules like <code>zipfile</code> or <code>urllib.request</code>.</p>
<p><strong>Installing</strong></p>
<p>You will need Python 2.7, or Python 3.4 or later (3.6 is recommended).</p>
<p>If you have <a href="https://docs.python.org/3/installing/">pip</a>, the easiest way is normally to install from <a href="https://pypi.org/project/arcp/">https://pypi.org/project/arcp/</a> using:</p>
<pre><code>pip install arcp
</code></pre>
<p>If you want to install manually from this code base, then try:</p>
<pre><code>python setup.py install
</code></pre>
<p><strong>Usage</strong></p>
<p>For full documentation, see <a href="http://arcp.readthedocs.io/">http://arcp.readthedocs.io/</a> or use <code>help(arcp)</code>.</p>
<p>This module provides functions for creating <a href="https://tools.ietf.org/html/draft-soilandreyes-arcp-03">arcp</a> URIs, which can be used for identifying or parsing hypermedia files packaged in an archive or package, like a ZIP file:</p>
<pre><code>>>> from arcp import *
>>> arcp_random()
'arcp://uuid,dcd6b1e8-b3a2-43c9-930b-0119cf0dc538/'
>>> arcp_random("/foaf.ttl", fragment="me")
'arcp://uuid,dcd6b1e8-b3a2-43c9-930b-0119cf0dc538/foaf.ttl#me'
>>> arcp_hash(b"Hello World!", "/folder/")
'arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/'
>>> arcp_location("http://example.com/data.zip", "/file.txt")
'arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt'
</code></pre>
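<p>To make the authority forms above less opaque, here is a standard-library sketch of how such authorities might be derived. It assumes the derivations described in the arcp draft (a SHA-256 digest encoded as unpadded base64url for <code>ni</code>, and a version 5 UUID in the URL namespace for location-based identifiers). The function names are hypothetical, and the library's own implementation may differ.</p>

```python
import base64
import hashlib
import uuid


def hash_authority(data):
    """Hypothetical: 'ni' authority from the SHA-256 digest of the bytes."""
    digest = hashlib.sha256(data).digest()
    # base64url encoding with the trailing '=' padding removed
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "ni,sha-256;" + encoded


def location_authority(url):
    """Hypothetical: 'uuid' authority as a name-based (version 5) UUID."""
    return "uuid,%s" % uuid.uuid5(uuid.NAMESPACE_URL, url)


def random_authority():
    """Hypothetical: 'uuid' authority from a random (version 4) UUID."""
    return "uuid,%s" % uuid.uuid4()


print("arcp://" + hash_authority(b"Hello World!") + "/folder/")
print("arcp://" + location_authority("http://example.com/data.zip") + "/file.txt")
print("arcp://" + random_authority() + "/")
```

<p>The hash-based and location-based forms are deterministic for the same input, while the random form yields a new identifier on each call.</p>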
<p>arcp URIs can be used with <code>urllib.parse</code>, for instance using <code>urljoin</code> to resolve relative references:</p>
<pre><code>>>> import urllib.parse
>>> import arcp
>>> css = arcp.arcp_name("app.example.com", "css/style.css")
>>> urllib.parse.urljoin(css, "../fonts/foo.woff")
'arcp://name,app.example.com/fonts/foo.woff'
</code></pre>
<p>In addition, this module provides functions that can be used to parse arcp URIs into their constituent fields:</p>
<pre><code>>>> is_arcp_uri("arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt")
True
>>> is_arcp_uri("http://example.com/t")
False
>>> u = parse_arcp("arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt")
>>> u
ARCPSplitResult(scheme='arcp',prefix='uuid',name='b7749d0b-0e47-5fc4-999d-f154abe68065',
uuid='b7749d0b-0e47-5fc4-999d-f154abe68065',path='/file.txt',query='',fragment='')
>>> u.path
'/file.txt'
>>> u.prefix
'uuid'
>>> u.uuid
UUID('b7749d0b-0e47-5fc4-999d-f154abe68065')
>>> u.uuid.version
5
>>> parse_arcp("arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/").hash
('sha-256', '7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069')
</code></pre>
<p>The object returned from <code>parse_arcp</code> is similar to <code>ParseResult</code> from <code>urlparse</code>, but contains additional properties <code>prefix</code>, <code>uuid</code>, <code>ni</code>, <code>hash</code> and <code>name</code>, some of which will be <code>None</code> depending on the arcp prefix.</p>
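<p>The relationship between the <code>ni</code> value and the hexadecimal form shown for <code>hash</code> can be sketched with the standard library. The helper <code>ni_to_hex</code> is hypothetical and only assumes that the part after the semicolon is unpadded base64url, as in the examples above; it is written for Python 3.</p>

```python
import base64
import binascii


def ni_to_hex(ni):
    """Hypothetical helper: split 'alg;base64url' into (alg, hex digest)."""
    algorithm, _, encoded = ni.partition(";")
    # Restore the '=' padding that the URI form leaves out.
    padded = encoded + "=" * (-len(encoded) % 4)
    raw = base64.urlsafe_b64decode(padded)
    return algorithm, binascii.hexlify(raw).decode("ascii")


print(ni_to_hex("sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk"))
```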
<p>The function <code>arcp.parse.urlparse</code> can be imported as an alternative to <code>urllib.parse.urlparse</code>. If the scheme is <code>arcp</code> then the extra arcp fields like <code>prefix</code>, <code>uuid</code>, <code>hash</code> and <code>name</code> are available as from <code>parse_arcp</code>, otherwise the output is the same as from regular <code>urlparse</code>:</p>
<pre><code>>>> from arcp.parse import urlparse
>>> urlparse("arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/soup;sads")
ARCPParseResult(scheme='arcp',prefix='ni',
name='sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk',
ni='sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk',
hash=('sha-256', '7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069'),
path='/folder/soup;sads',query='',fragment='')
>>> urlparse("http://example.com/help?q=a")
ParseResult(scheme='http', netloc='example.com', path='/help', params='',
query='q=a', fragment='')
</code></pre>
https://doi.org/10.5281/zenodo.1165986
oai:zenodo.org:1165986
Zenodo
https://github.com/stain/arcp-py/tree/0.2.0
http://arcp.readthedocs.io/en/0.2.0/
https://pypi.org/project/arcp/0.2.0/
https://pypi.python.org/pypi/arcp/0.1.0
https://tools.ietf.org/html/draft-soilandreyes-arcp-03
https://github.com/stain/arcp-py/releases/tag/0.2.0
https://zenodo.org/communities/linkeddata
https://zenodo.org/communities/eu
https://doi.org/10.5281/zenodo.1162749
info:eu-repo/semantics/openAccess
Apache License 2.0
http://www.apache.org/licenses/LICENSE-2.0
stain/arcp-py: arcp 0.2.0
info:eu-repo/semantics/other