pymatgen.io.abinit.scheduler_error_parsers module¶
-
class
AbstractError(errmsg, meta_data)[source]¶ Bases: object
Error base class.
-
property
application_adapter_solutions¶ To be implemented by concrete errors. Returns a list of tuples defining corrections: the first element of each tuple is the name (a string) of one of the methods in CorrectorProtocolApplication, and the second element contains the arguments.
-
property
name¶
-
property
scheduler_adapter_solutions¶ To be implemented by concrete errors. Returns a list of tuples defining corrections: the first element of each tuple is the name (a string) of one of the methods in CorrectorProtocolScheduler, and the second element contains the arguments.
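The tuple convention described above can be illustrated with a minimal sketch. The classes below are stand-ins written for this example, not the pymatgen classes, and representing "the arguments" as a tuple of positional arguments is an assumption.

```python
class SketchError:
    """Stand-in for AbstractError (hypothetical, not the pymatgen class)."""

    def __init__(self, errmsg, meta_data):
        self.errmsg = errmsg
        self.meta_data = meta_data

    @property
    def name(self):
        return self.__class__.__name__

    @property
    def scheduler_adapter_solutions(self):
        # Default: no scheduler-side corrections are known.
        return []

    @property
    def application_adapter_solutions(self):
        # Default: no application-side corrections are known.
        return []


class SketchTimeError(SketchError):
    """Hypothetical concrete error for a job that hit its time limit."""

    @property
    def scheduler_adapter_solutions(self):
        # Each entry: (method name on the scheduler protocol, arguments).
        # Using an empty tuple for "no arguments" is an assumption.
        return [("increase_time", ())]


err = SketchTimeError("walltime exceeded", {"limit": ["900"]})
print(err.name, err.scheduler_adapter_solutions)
```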
-
class
AbstractErrorParser(err_file, out_file=None, run_err_file=None, batch_err_file=None)[source]¶ Bases: object
Abstract class for parsing errors that originate from the scheduler system and errors that are not reported by the program itself, e.g. segmentation faults.
A concrete implementation of this class for a specific scheduler needs a class attribute ERRORS containing a dictionary that specifies the errors:
    ERRORS = {
        ErrorClass: {
            'file_specifier': {
                'string': "the string to be looked for",
                'meta_filter': "string specifying the regular expression to obtain the meta data"
            }
        }
    }
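As a rough illustration of how such a dictionary could drive a scan, the sketch below searches each file's lines for the 'string' entry and applies the 'meta_filter' regular expression to matching lines. The error class, the 'err' file specifier, the regular expression, and the scan function are all hypothetical; this is not the parsing code used by the library.

```python
import re


class SketchMemoryError:
    """Hypothetical error class holding the message and extracted meta data."""

    def __init__(self, errmsg, meta_data):
        self.errmsg = errmsg
        self.meta_data = meta_data


# Same shape as the ERRORS layout described above; all values are made up.
ERRORS = {
    SketchMemoryError: {
        "err": {
            "string": "exceeded limit",
            "meta_filter": r"vmem (\d+)kb exceeded limit (\d+)kb",
        }
    }
}


def scan(lines_by_file, errors=ERRORS):
    """Return one error instance per line that contains a search string."""
    found = []
    for error_class, per_file in errors.items():
        for file_specifier, spec in per_file.items():
            for line in lines_by_file.get(file_specifier, []):
                if spec["string"] not in line:
                    continue
                match = re.search(spec["meta_filter"], line)
                meta = match.groups() if match else None
                found.append(error_class(line.strip(), meta))
    return found


sample = {"err": ["PBS: job killed: vmem 2085244kb exceeded limit 1945600kb\n"]}
hits = scan(sample)
print(hits[0].errmsg, hits[0].meta_data)
```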
-
abstract property
error_definitions¶
-
class
CorrectorProtocolApplication[source]¶ Bases: object
Abstract class to define the protocol / interface for correction operators. The client code (quadapters / submission script generator method / …) should implement these methods.
-
abstract
decrease_mem()[source]¶ Method to decrease the memory used in the calculation. It is called when a calculation seems to have crashed due to insufficient memory.
Returns True if the memory could be decreased, False otherwise.
-
abstract property
name¶
-
class
CorrectorProtocolScheduler[source]¶ Bases: object
Abstract class to define the protocol / interface for correction operators. The client code (quadapters / submission script generator method / …) should implement these methods.
-
abstract
exclude_nodes(nodes)[source]¶ Method to exclude certain nodes from being used in the calculation. It is called when a calculation seems to have crashed due to a hardware failure at the specified nodes.
nodes: list of node numbers that were found to cause problems.
Returns True if the nodes could be excluded, False otherwise.
-
abstract
increase_cpus()[source]¶ Method to increase the number of cpus used in the calculation. It is called when a calculation seems to have crashed because time or memory limits were exceeded.
Returns True if the number of cpus could be increased, False otherwise.
-
abstract
increase_mem()[source]¶ Method to increase the memory in the calculation. It is called when a calculation seems to have crashed due to insufficient memory.
Returns True if the memory could be increased, False otherwise.
-
abstract
increase_time()[source]¶ Method to increase the time for the calculation. It is called when a calculation seems to have crashed due to a time limit.
Returns True if the time could be increased, False otherwise.
-
abstract property
name¶
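A minimal sketch of what a client implementation of this protocol might look like, assuming each method adjusts a resource request and reports success as a boolean. The base class below only mirrors the method names listed here; the adapter, its attributes, the doubling strategy, and the limits are invented for illustration.

```python
from abc import ABC, abstractmethod


class SchedulerCorrectorSketch(ABC):
    """Stand-in mirroring the method names of CorrectorProtocolScheduler."""

    @property
    @abstractmethod
    def name(self):
        """Return the name of the adapter."""

    @abstractmethod
    def exclude_nodes(self, nodes):
        """Return True if the nodes could be excluded."""

    @abstractmethod
    def increase_cpus(self):
        """Return True if the number of cpus could be increased."""

    @abstractmethod
    def increase_mem(self):
        """Return True if the memory could be increased."""

    @abstractmethod
    def increase_time(self):
        """Return True if the time could be increased."""


class ToySchedulerAdapter(SchedulerCorrectorSketch):
    """Hypothetical adapter with made-up limits, for illustration only."""

    def __init__(self, time_s=900, max_time_s=3600, mem_mb=1000,
                 max_mem_mb=4000, cpus=1, max_cpus=8):
        self.time_s, self.max_time_s = time_s, max_time_s
        self.mem_mb, self.max_mem_mb = mem_mb, max_mem_mb
        self.cpus, self.max_cpus = cpus, max_cpus
        self.excluded = set()

    @property
    def name(self):
        return "toy"

    def _double(self, attr, cap):
        # Double the resource unless that would exceed the cap.
        new_value = getattr(self, attr) * 2
        if new_value > cap:
            return False
        setattr(self, attr, new_value)
        return True

    def exclude_nodes(self, nodes):
        self.excluded.update(nodes)
        return True

    def increase_cpus(self):
        return self._double("cpus", self.max_cpus)

    def increase_mem(self):
        return self._double("mem_mb", self.max_mem_mb)

    def increase_time(self):
        return self._double("time_s", self.max_time_s)


adapter = ToySchedulerAdapter()
results = [adapter.increase_time() for _ in range(3)]
print(results, adapter.time_s)
```

Returning False once a cap is reached gives the caller a signal to try a different correction instead of resubmitting the same request.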
-
class
DiskError(errmsg, meta_data)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractError
Errors involving problems writing to disk.
-
class
FullQueueError(errmsg, meta_data)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractError
Errors occurring at submission: too many jobs in the queue / total cpus / ...
-
class
MasterProcessMemoryCancelError(errmsg, meta_data)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractError
Error due to exceeding the memory limit for the job on the master node.
-
class
MemoryCancelError(errmsg, meta_data)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractError
Error due to exceeding the memory limit for the job.
.limit will return a list of the limits that were broken, or None if it could not be determined.
-
property
application_adapter_solutions¶ To be implemented by concrete errors. Returns a list of tuples defining corrections: the first element of each tuple is the name (a string) of one of the methods in CorrectorProtocolApplication, and the second element contains the arguments.
-
property
limit¶
-
property
scheduler_adapter_solutions¶ To be implemented by concrete errors. Returns a list of tuples defining corrections: the first element of each tuple is the name (a string) of one of the methods in CorrectorProtocolScheduler, and the second element contains the arguments.
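One way client code might consume such a solution list is to look up each method name on an adapter and call it until one reports success. The sketch below assumes that dispatch strategy and that the arguments are a tuple of positional arguments; apply_solutions and ToyAdapter are hypothetical names, and the library's own handling may differ.

```python
def apply_solutions(adapter, solutions):
    """Try each (method_name, args) pair in order until one succeeds.

    Returns the name of the method that succeeded, or None.
    """
    for method_name, args in solutions:
        method = getattr(adapter, method_name, None)
        if method is None:
            # The adapter does not implement this correction; skip it.
            continue
        if method(*args):
            return method_name
    return None


class ToyAdapter:
    """Hypothetical adapter; the cap of 2000 MB is made up."""

    def __init__(self):
        self.mem_mb = 1000
        self.cpus = 1

    def increase_mem(self):
        if self.mem_mb >= 2000:
            return False
        self.mem_mb *= 2
        return True

    def increase_cpus(self):
        self.cpus += 1
        return True


# A solution list in the documented (method name, arguments) shape.
solutions = [("increase_mem", ()), ("increase_cpus", ())]
adapter = ToyAdapter()
first = apply_solutions(adapter, solutions)
second = apply_solutions(adapter, solutions)
print(first, second)
```

On the second call the memory is already at the cap, so the sketch falls through to the next entry in the list.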
-
class
NodeFailureError(errmsg, meta_data)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractError
Error due to the hardware failure of a specific node.
.nodes will return a list of problematic nodes, or None if it could not be determined.
-
property
nodes¶
-
property
scheduler_adapter_solutions¶ To be implemented by concrete errors. Returns a list of tuples defining corrections: the first element of each tuple is the name (a string) of one of the methods in CorrectorProtocolScheduler, and the second element contains the arguments.
-
class
PBSErrorParser(err_file, out_file=None, run_err_file=None, batch_err_file=None)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractErrorParser
Implementation for the PBS scheduler. Example messages:
PBS: job killed: walltime 932 exceeded limit 900
PBS: job killed: walltime 46 exceeded limit 30
PBS: job killed: vmem 2085244kb exceeded limit 1945600kb
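The PBS messages quoted above follow a regular pattern, so the resource name, the value used, and the limit can be pulled out with a regular expression. The pattern and the parse_pbs_kill helper below are written for this example only and are not the expressions used by PBSErrorParser.

```python
import re

# Hypothetical pattern for the "job killed" lines shown above.
PBS_KILL = re.compile(
    r"PBS: job killed: (?P<resource>\w+) (?P<used>\d+)(?P<unit>kb)? "
    r"exceeded limit (?P<limit>\d+)(?:kb)?"
)


def parse_pbs_kill(line):
    """Return a dict describing the broken limit, or None if no match."""
    match = PBS_KILL.search(line)
    if match is None:
        return None
    return {
        "resource": match["resource"],
        "used": int(match["used"]),
        "limit": int(match["limit"]),
        # None when the message carries no unit suffix (walltime lines).
        "unit": match["unit"],
    }


time_hit = parse_pbs_kill("PBS: job killed: walltime 932 exceeded limit 900")
mem_hit = parse_pbs_kill(
    "PBS: job killed: vmem 2085244kb exceeded limit 1945600kb"
)
print(time_hit)
print(mem_hit)
```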
-
property
error_definitions¶
-
class
SlaveProcessMemoryCancelError(errmsg, meta_data)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractError
Error due to exceeding the memory limit for the job on a node different from the master.
-
class
SlurmErrorParser(err_file, out_file=None, run_err_file=None, batch_err_file=None)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractErrorParser
Implementation of the error definitions for the Slurm scheduler.
-
property
error_definitions¶
-
class
SubmitError(errmsg, meta_data)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractError
Errors occurring at submission. The limits on the cluster may have changed.
-
class
TimeCancelError(errmsg, meta_data)[source]¶ Bases: pymatgen.io.abinit.scheduler_error_parsers.AbstractError
Error due to exceeding the time limit for the job.
.limit will return a list of the limits that were broken, or None if it could not be determined.
-
property
application_adapter_solutions¶ To be implemented by concrete errors. Returns a list of tuples defining corrections: the first element of each tuple is the name (a string) of one of the methods in CorrectorProtocolApplication, and the second element contains the arguments.
-
property
limit¶
-
property
scheduler_adapter_solutions¶ To be implemented by concrete errors. Returns a list of tuples defining corrections: the first element of each tuple is the name (a string) of one of the methods in CorrectorProtocolScheduler, and the second element contains the arguments.
-
get_parser(scheduler, err_file, out_file=None, run_err_file=None, batch_err_file=None)[source]¶ Factory function to provide the parser for the specified scheduler. If the scheduler is not implemented, None is returned. The file arguments are strings giving the names of the output and error files:
    err_file: stderr of the scheduler
    out_file: stdout of the scheduler
    run_err_file: stderr of the application
    batch_err_file: stderr of the submission
- Returns
None if scheduler is not supported.
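The factory behaviour described above (return a parser for a known scheduler, None otherwise) can be sketched with a plain dictionary lookup. Everything below is a stand-in: the parser classes, the registry, the scheduler keys "slurm" and "pbs", and the file names are assumptions made for this example, not the module's actual internals.

```python
class SketchParser:
    """Stand-in for AbstractErrorParser; it only stores the file names."""

    def __init__(self, err_file, out_file=None, run_err_file=None,
                 batch_err_file=None):
        self.files = {
            "err": err_file,                # stderr of the scheduler
            "out": out_file,                # stdout of the scheduler
            "run_err": run_err_file,        # stderr of the application
            "batch_err": batch_err_file,    # stderr of the submission
        }


class SketchSlurmParser(SketchParser):
    pass


class SketchPBSParser(SketchParser):
    pass


# Hypothetical registry; the keys are assumed scheduler names.
PARSERS = {"slurm": SketchSlurmParser, "pbs": SketchPBSParser}


def get_parser_sketch(scheduler, err_file, out_file=None, run_err_file=None,
                      batch_err_file=None):
    """Return a parser instance for the scheduler, or None if unsupported."""
    parser_class = PARSERS.get(scheduler)
    if parser_class is None:
        return None
    return parser_class(err_file, out_file=out_file,
                        run_err_file=run_err_file,
                        batch_err_file=batch_err_file)


parser = get_parser_sketch("slurm", "queue.err", out_file="queue.out")
missing = get_parser_sketch("unknown_scheduler", "queue.err")
print(type(parser).__name__, missing)
```

Because None signals an unsupported scheduler, callers should check the result before using it.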