CLAM Data API¶
The CLAM Data API is at the heart of CLAM. It contains the data structures CLAM uses, such as Profiles, Input Templates, Output Templates, and Metadata. CLAM uses this API internally, but it is also designed to be used in your system wrapper scripts and clients.
-
class
clam.common.data.
AbstractMetaField
(key, value=None)¶ This abstract class is the basis for derived classes representing metadata fields of particular types. A metadata field is in essence a (key, value) pair. These classes are used in output templates (described by the XML tag meta). They are not used by CLAMMetaData.
-
static
fromxml
(node)¶ Static method returning a MetaField instance (any subclass of AbstractMetaField) from the given XML description. Node can be a string or an etree._Element.
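The "string or element" convention used by the fromxml methods can be sketched as follows. This is only an illustration: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml, the helper name as_element is hypothetical, and the id attribute on the meta element is an assumption made for the example.

```python
import xml.etree.ElementTree as ET


def as_element(node):
    """Return an element, parsing first if the caller passed XML text.

    Hypothetical helper, not part of CLAM.
    """
    if isinstance(node, (str, bytes)):
        return ET.fromstring(node)
    return node  # already a parsed element, hand it back unchanged


# The attribute name "id" is an assumption made for this example.
element = as_element('<meta id="language">en</meta>')
key, value = element.get("id"), element.text
```

Accepting both forms lets callers pass either raw response text or a node they have already parsed.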
-
resolve
(data, parameters, parentfile, relevantinputfiles)¶
-
xml
(operator='set', indent='')¶ Serialize the metadata field to XML
-
class
clam.common.data.
Action
(*args, **kwargs)¶ -
static
fromxml
(node)¶ Static method returning an Action instance from the given XML description. Node can be a string or an etree._Element.
-
xml
(indent='')¶
-
exception
clam.common.data.
AuthRequired
(msg='')¶ Raised on 401 - Authentication Required error. The service requires authentication; pass user credentials to the CLAMClient constructor.
-
exception
clam.common.data.
AuthenticationRequired
¶ This Exception is raised when authentication is required but has not been provided
-
exception
clam.common.data.
BadRequest
¶
-
class
clam.common.data.
CLAMData
(xml, client=None, localroot=False)¶ Instances of this class hold all the CLAM Data that is automatically extracted from CLAM XML responses. Its member variables are:
baseurl - The base URL to the service (string)
projecturl - The full URL to the selected project, if any (string)
status - Can be: clam.common.status.READY (0), clam.common.status.RUNNING (1), or clam.common.status.DONE (2)
statusmessage - The latest status message (string)
completion - An integer between 0 and 100 indicating the percentage towards completion
parameters - List of parameters (but use the methods instead)
profiles - List of profiles ([ Profile ])
input - List of input files ([ CLAMInputFile ]); use inputfiles() instead for easier access
output - List of output files ([ CLAMOutputFile ])
projects - List of project IDs ([ string ])
corpora - List of pre-installed corpora
errors - Boolean indicating whether there are errors in parameter specification
errormsg - String containing an error message
oauth_access_token - OAuth2 access token (empty if not used, string)
Note that depending on the current status of the project, not all of these may be available.
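As a rough illustration of how a client might interpret the status and completion members, here is a minimal sketch. It does not import CLAM: the constants are local stand-ins mirroring the values listed above, and describe() is a hypothetical helper.

```python
# Local stand-ins mirroring the values listed above for clam.common.status.
READY, RUNNING, DONE = 0, 1, 2


def describe(status, completion=0, statusmessage=""):
    """Summarise a project's state (hypothetical helper, not part of CLAM)."""
    if status == READY:
        return "ready to accept input"
    if status == RUNNING:
        # completion is described as an integer between 0 and 100
        summary = f"running, {completion}% complete"
        return f"{summary}: {statusmessage}" if statusmessage else summary
    if status == DONE:
        return "done, output files can be retrieved"
    raise ValueError(f"unknown status code: {status!r}")
```

With a real CLAMData instance, the same checks would presumably be made against its status, completion and statusmessage members.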
-
baseurl
= None¶ String containing the base URL of the webservice
-
commandlineargs
()¶
-
corpora
= None¶ List of pre-installed corpora
-
errormsg
= None¶ String containing an error message if an error occurred
-
errors
= None¶ Boolean indicating whether there are errors in parameter specification
-
input
= None¶ List of input files ([ CLAMInputFile ])
-
inputfile
(inputtemplate=None)¶ Return the input file for the specified inputtemplate. If inputtemplate=None, an input file is returned regardless of inputtemplate. This function returns a single file only and raises an error when multiple input files match; use inputfiles() instead in that case.
-
inputfiles
(inputtemplate=None)¶ Generator yielding all input files for the specified inputtemplate. If inputtemplate=None, input files are returned regardless of inputtemplate.
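The relationship between inputfile() and inputfiles() can be sketched with stand-in objects. This is not CLAM's implementation: InputFile is a hypothetical substitute for CLAMInputFile, the functions take the file list as an explicit argument instead of being methods, and the exception type and the None result for "no match" are assumptions.

```python
from collections import namedtuple

# Hypothetical stand-in for CLAMInputFile: a filename plus the ID of the
# input template it was uploaded under.
InputFile = namedtuple("InputFile", ["filename", "template_id"])


def inputfiles(files, inputtemplate=None):
    """Yield every file, or only those belonging to the given template ID."""
    for f in files:
        if inputtemplate is None or f.template_id == inputtemplate:
            yield f


def inputfile(files, inputtemplate=None):
    """Return the single matching file, or None; fail when ambiguous."""
    matches = list(inputfiles(files, inputtemplate))
    if len(matches) > 1:
        raise ValueError("multiple input files match; use inputfiles() instead")
    return matches[0] if matches else None


uploaded = [
    InputFile("a.txt", "textinput"),
    InputFile("b.txt", "textinput"),
    InputFile("lexicon.tsv", "lexicon"),
]
lexicon = inputfile(uploaded, "lexicon")
texts = [f.filename for f in inputfiles(uploaded, "textinput")]
```

The generator form is the safer default whenever a template may accept more than one file.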
-
inputtemplate
(template_id)¶ Return the input template with the specified ID. This is used to resolve an input template ID to an InputTemplate object instance.
-
inputtemplates
()¶ Return all input templates as a list (of InputTemplate instances)
-
output
= None¶ List of output files ([ CLAMOutputFile ])
-
parameter
(parameter_id)¶ Return the specified global parameter (the entire object, not just the value)
-
parametererror
()¶ Return the first parameter error, or False if there is none
-
parameters
= None¶ This contains a list of (parametergroup, [parameters]) tuples.
-
parseresponse
(xml, localroot=False)¶ Parses CLAM XML. There is usually no need to call this directly.
-
passparameters
()¶ Return all parameters as {id: value} dictionary
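The parameters member is described as a list of (parametergroup, [parameters]) tuples, while passparameters() returns a flat {id: value} dictionary. A minimal sketch of that flattening, assuming each parameter exposes id and value attributes (Parameter here is a hypothetical stand-in, and as_dict is not a CLAM function):

```python
from collections import namedtuple

# Hypothetical stand-in for a CLAM parameter object.
Parameter = namedtuple("Parameter", ["id", "value"])


def as_dict(parameters):
    """Flatten [(group, [parameters])] into an {id: value} dictionary."""
    return {p.id: p.value for _group, params in parameters for p in params}


grouped = [
    ("Main", [Parameter("casesensitive", True), Parameter("n", 3)]),
    ("Output", [Parameter("encoding", "utf-8")]),
]
flat = as_dict(grouped)
```

Group names are dropped in the flat form, so this assumes parameter IDs are unique across groups.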
-
profiles
= None¶ List of profiles ([ Profile ])
-
projects
= None¶ List of projects ([ string ])
-
projecturl
= None¶ String containing the full URL to the project, if a project was indeed selected
-
status
= None¶ The current status of the service: clam.common.status.READY (0), clam.common.status.RUNNING (1), or clam.common.status.DONE (2)
-
statusmessage
= None¶ The current status of the service in a human readable message
-
class
clam.common.data.
CLAMFile
(projectpath, filename, loadmetadata=True, client=None, requiremetadata=False)¶ -
attachviewers
(profiles)¶ Attach viewers and converters to the file by automatically scanning all profiles for the corresponding outputtemplate or inputtemplate
-
basedir
= ''¶
-
copy
(target, timeout=500)¶ Copy or download this file to a new local file
-
delete
()¶ Delete this file
-
loadmetadata
()¶ Load metadata for this file. This is usually called automatically upon instantiation, except if explicitly disabled. Works both locally as well as for clients connecting to a CLAM service.
-
metafilename
()¶ Returns the filename for the metadata file (not full path). Only used for local files.
-
read
()¶ Loads all lines in memory
-
readlines
()¶ Loads all lines in memory
-
validate
()¶ Validate this file. Returns a boolean.
-
-
class
clam.common.data.
CLAMInputFile
(projectpath, filename, loadmetadata=True, client=None, requiremetadata=False)¶ -
basedir
= 'input'¶
-
-
class
clam.common.data.
CLAMMetaData
(file, **kwargs)¶ A simple hash structure to hold arbitrary metadata
-
allowcustomattributes
= True¶
-
attributes
= None¶
-
static
fromxml
(node, file=None)¶ Read metadata from XML. Static method returning a CLAMMetaData instance (or rather, the appropriate subclass of CLAMMetaData) from the given XML description. Node can be a string or an etree._Element.
-
httpheaders
()¶ HTTP headers to output for this format. Yields (key,value) tuples. Should be overridden in sub-classes!
-
items
()¶ Returns all items as (key, value) tuples
-
loadinlinemetadata
()¶ Not implemented
-
mimetype
= 'text/plain'¶
-
save
(filename)¶ Save metadata to XML file
-
saveinlinemetadata
()¶ Not implemented
-
schema
= ''¶
-
validate
()¶ Validate the metadata
-
xml
(indent='')¶ Render an XML representation of the metadata
-
-
class
clam.common.data.
CLAMOutputFile
(projectpath, filename, loadmetadata=True, client=None, requiremetadata=False)¶ -
basedir
= 'output'¶
-
-
class
clam.common.data.
CLAMProvenanceData
(serviceid, servicename, serviceurl, outputtemplate_id, outputtemplate_label, inputfiles, parameters=None, timestamp=None)¶ Holds provenance data
-
static
fromxml
(node)¶ Return a CLAMProvenanceData instance from the given XML description. Node can be a string or an lxml.etree._Element.
-
xml
(indent='')¶ Serialise provenance data to XML. This is included in CLAM Metadata files
-
class
clam.common.data.
CMDIMetaData
(file, **kwargs)¶ Direct CMDI Metadata support, not implemented yet, reserved for future use
-
class
clam.common.data.
CopyMetaField
(key, value=None)¶ In CopyMetaField, the value is in the form of templateid.keyid, denoting where to copy from. If not keyid but only a templateid is specified, the keyid of the metafield itself will be assumed.
-
resolve
(data, parameters, parentfile, relevantinputfiles)¶
-
xml
(indent='')¶
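The templateid.keyid convention described for CopyMetaField can be illustrated with a small parser. This is a sketch only: split_copy_source is a hypothetical helper, and it assumes template IDs contain no dots.

```python
def split_copy_source(key, value):
    """Split 'templateid.keyid' into (templateid, keyid).

    When no keyid is given, fall back to the metafield's own key.
    Hypothetical helper; assumes template IDs contain no dots.
    """
    template_id, _sep, key_id = value.partition(".")
    return template_id, key_id or key


explicit = split_copy_source("language", "textinput.lang")
implicit = split_copy_source("language", "textinput")
```

In the second call only a template ID is given, so the key of the metafield itself ("language") is used as the key to copy from.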
-
-
exception
clam.common.data.
FormatError
(value)¶ This Exception is raised when the CLAM response is not in the valid CLAM XML format
-
exception
clam.common.data.
HTTPError
¶ This Exception is raised when certain data (such as metadata) can’t be retrieved over HTTP
-
class
clam.common.data.
InputSource
(**kwargs)¶ -
check
()¶ Checks if this inputsource is usable in INPUTSOURCES
-
isdir
()¶
-
isfile
()¶
-
xml
(indent='')¶
-
-
class
clam.common.data.
InputTemplate
(template_id, formatclass, label, *args, **kwargs)¶ This class represents an input template: a slot with a certain format and function to which input files can be uploaded.
-
static
fromxml
(node)¶ Static method returning an InputTemplate instance from the given XML description. Node can be a string or an etree._Element.
-
generate
(file, validatedata=None, inputdata=None, user=None)¶ Convert the template into instantiated metadata, validating the data in the process and returning errors otherwise. inputdata is a dictionary-compatible structure, such as the relevant postdata. Returns (success, metadata, parameters); error messages can be extracted from parameters[].error. validatedata is an (errors, parameters) tuple that can be passed if you did validation in a prior stage; if not specified, validation is done automatically.
-
json
()¶ Produce a JSON representation for the web interface
-
match
(metadata, user=None)¶ Does the specified metadata match this template? Returns (success, metadata, parameters).
-
matchingfiles
(projectpath)¶ Checks if the input conditions are satisfied, i.e. the required input files are present. We use the symbolic links .*.INPUTTEMPLATE.id.seqnr to determine this. Returns a list of matching results (seqnr, filename, inputtemplate).
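To make the symbolic link naming scheme concrete, the sketch below parses one such link name into the (seqnr, filename, template ID) shape of the returned results. It assumes the layout ".<filename>.INPUTTEMPLATE.<id>.<seqnr>" with a dot-free template ID; parse_link is a hypothetical helper, not CLAM's code, and it returns the template ID as a string where CLAM returns the input template itself.

```python
import re

# Assumed link name layout: ".<filename>.INPUTTEMPLATE.<template id>.<seqnr>"
LINK_PATTERN = re.compile(
    r"^\.(?P<filename>.+)\.INPUTTEMPLATE\.(?P<template_id>[^.]+)\.(?P<seqnr>\d+)$"
)


def parse_link(name):
    """Return (seqnr, filename, template_id), or None if name is no such link."""
    m = LINK_PATTERN.match(name)
    if m is None:
        return None
    return int(m.group("seqnr")), m.group("filename"), m.group("template_id")


result = parse_link(".report.txt.INPUTTEMPLATE.textinput.2")
```

Names that do not follow the pattern, such as the input files themselves, yield None and can be skipped when scanning a directory.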
-
validate
(postdata, user=None)¶ Validate posted data against the inputtemplate
-
xml
(indent='')¶ Produce Template XML
-
exception
clam.common.data.
NoConnection
¶
-
exception
clam.common.data.
NotFound
(msg='')¶ Raised on 404 - Not Found Errors
-
class
clam.common.data.
OutputTemplate
(template_id, formatclass, label, *args, **kwargs)¶ -
findparent
(inputtemplates)¶ Find the most suitable parent, that is: the first matching unique/multi inputtemplate
-
static
fromxml
(node)¶ Static method returning an OutputTemplate instance from the given XML description. Node can be a string or an etree._Element.
-
generate
(profile, parameters, projectpath, inputfiles, provenancedata=None)¶ Yields (outputfilename, metadata) tuples
-
generatemetadata
(parameters, parentfile, relevantinputfiles, provenancedata=None)¶ Generate metadata, given a filename, parameters and a dictionary of inputdata (necessary in case we copy from it)
-
getparent
(profile)¶ Resolve a parent ID
-
xml
(indent='')¶ Produce Template XML
-
-
class
clam.common.data.
ParameterCondition
(**kwargs)¶ -
allpossibilities
()¶ Returns all possible outputtemplates that may occur (recursively applied)
-
evaluate
(parameters)¶ Returns False if there’s no match, or whatever the ParameterCondition evaluates to (recursively applied!)
-
static
fromxml
(node)¶ Static method returning a ParameterCondition instance from the given XML description. Node can be a string or an etree._Element.
-
match
(parameters)¶
-
xml
(indent='')¶
-
-
exception
clam.common.data.
ParameterError
(msg='')¶ Raised on Parameter Errors, i.e. when a parameter does not validate, is missing, or is otherwise set incorrectly.
-
class
clam.common.data.
ParameterMetaField
(key, value=None)¶ -
resolve
(data, parameters, parentfile, relevantinputfiles)¶
-
xml
(indent='')¶
-
-
exception
clam.common.data.
PermissionDenied
(msg='')¶ Raised on 403 - Permission Denied Errors (but only if no CLAM XML response is provided)
-
class
clam.common.data.
Profile
(*args)¶ -
static
fromxml
(node)¶ Return a profile instance from the given XML description. Node can be a string or an etree._Element.
-
generate
(projectpath, parameters, serviceid, servicename, serviceurl)¶ Generate output metadata on the basis of input files and parameters. Projectpath must be absolute.
-
match
(projectpath, parameters)¶ Check if the profile matches all inputdata and produces output given the set parameters. Returns a boolean
-
matchingfiles
(projectpath)¶ Return a list of all inputfiles matching the profile (filenames)
-
out
(indent='')¶
-
outputtemplates
()¶ Returns all outputtemplates, resolving ParameterConditions to all possibilities
-
xml
(indent='')¶ Produce XML output for the profile
-
exception
clam.common.data.
ServerError
(msg='')¶ Raised on 500 - Internal Server Error. Indicates that something went wrong on the server side.
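The exception descriptions in this module name four HTTP status codes (401, 403, 404 and 500). One way a client could turn a status code into the matching exception is sketched below. The classes are local stand-ins named after the documented exceptions, and raise_for_code is a hypothetical helper; it is not a description of how processhttpcode works.

```python
# Local stand-ins named after the documented exceptions; real code would
# import them from clam.common.data instead.
class AuthRequired(Exception):
    pass


class PermissionDenied(Exception):
    pass


class NotFound(Exception):
    pass


class ServerError(Exception):
    pass


ERRORS_BY_CODE = {
    401: AuthRequired,
    403: PermissionDenied,
    404: NotFound,
    500: ServerError,
}


def raise_for_code(code, allowed=()):
    """Raise the exception associated with an HTTP status code, if any."""
    if code in allowed:
        return  # the caller explicitly tolerates this code
    exc = ERRORS_BY_CODE.get(code)
    if exc is not None:
        raise exc(f"HTTP {code}")
```

A caller would typically wrap a request in try/except and, for example, prompt for credentials on AuthRequired.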
-
class
clam.common.data.
SetMetaField
(key, value=None)¶ -
resolve
(data, parameters, parentfile, relevantinputfiles)¶
-
xml
(indent='')¶
-
-
exception
clam.common.data.
TimeOut
¶
-
class
clam.common.data.
UnsetMetaField
(key, value=None)¶ -
resolve
(data, parameters, parentfile, relevantinputfiles)¶
-
xml
(indent='')¶
-
-
exception
clam.common.data.
UploadError
(msg='')¶
-
clam.common.data.
escape
(s, quote)¶
-
clam.common.data.
getclamdata
(filename, custom_formats=None)¶
-
clam.common.data.
parsexmlstring
(node)¶
-
clam.common.data.
processhttpcode
(code, allowcodes=None)¶
-
clam.common.data.
processparameter
(postdata, parameter, user=None)¶
-
clam.common.data.
processparameters
(postdata, parameters, user=None)¶
-
clam.common.data.
profiler
(profiles, projectpath, parameters, serviceid, servicename, serviceurl, printdebug=None)¶ Given input files and parameters, produce metadata for output files. Returns a list of matched profiles if successful, an empty list otherwise.
-
clam.common.data.
resolveinputfilename
(filename, parameters, inputtemplate, nextseq=0, project=None)¶
-
clam.common.data.
resolveoutputfilename
(filename, globalparameters, localparameters, outputtemplate, nextseq, project, inputfilename)¶
-
clam.common.data.
sanitizeparameters
(parameters)¶ Construct a dictionary of parameters, for internal use only
-
clam.common.data.
shellsafe
(s, quote='', doescape=True)¶ Returns the value string, wrapped in the specified quotes (if not empty), but checks and raises an Exception if the string is at risk of causing code injection
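The general idea of "wrap in quotes, but refuse risky input" can be sketched as follows. This is not shellsafe's implementation: wrap_checked is a hypothetical helper, the character sets are an assumption based on POSIX shell quoting rules, and the doescape behaviour is not modelled at all.

```python
# Characters that remain special to a POSIX shell in each quoting context
# (an assumption made for this sketch, not CLAM's actual rule set).
UNSAFE = {
    "'": set("'"),
    '"': set('"$`\\'),
    "": set(" \t\n;|&<>()$`\\\"'*?[]#~=%!{}"),
}


def wrap_checked(s, quote=""):
    """Return s wrapped in quote, raising ValueError on risky characters."""
    if quote not in UNSAFE:
        raise ValueError("quote must be empty, a single quote or a double quote")
    risky = UNSAFE[quote] & set(s)
    if risky:
        raise ValueError(f"unsafe characters in value: {sorted(risky)}")
    return quote + s + quote


safe = wrap_checked("file.txt", "'")
```

Rejecting input is the conservative choice; the standard library's shlex.quote takes the other route and escapes the value instead.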