pymatgen.io.abinit.flows module¶
A Flow is a container for Works, and works consist of tasks. Flows are the final objects that can be dumped directly to a pickle file on disk. Flows are executed with abirun (abipy).
-
class
Flow(workdir, manager=None, pickle_protocol=-1, remove=False)[source]¶ Bases:
pymatgen.io.abinit.nodes.Node, pymatgen.io.abinit.works.NodeContainer, monty.json.MSONable
This object is a container of works. Its main task is managing the possible inter-dependencies among the works and the creation of dynamic workflows that are generated by callbacks registered by the user.
Important methods for constructing flows:
Parameters: - workdir – String specifying the directory where the works will be produced. If workdir is None, the initialization of the working directory is performed by flow.allocate(workdir).
- manager –
TaskManager object responsible for the submission of the jobs. If manager is None, the object is initialized from the yaml file located either in the working directory or in the user configuration dir. - pickle_protocol – Pickle protocol version used for saving the status of the object. -1 denotes the latest version supported by the python interpreter.
- remove – Attempt to remove the working directory workdir if the directory already exists.
-
Error¶ alias of
FlowError
-
PICKLE_FNAME= '__AbinitFlow__.pickle'¶
-
Results¶ alias of
FlowResults
-
VERSION= '0.1'¶
-
abivalidate_inputs()[source]¶ Run ABINIT in dry mode to validate all the inputs of the flow.
Returns: (isok, tuples). isok is True if all inputs are OK. tuples is a list of namedtuple objects, one for each task in the flow. Each namedtuple has the following attributes:
retcode: Return code. 0 if OK.
log_file: log file of the Abinit run; use log_file.read() to access its content.
stderr_file: stderr file of the Abinit run; use stderr_file.read() to access its content.
Raises: RuntimeError if the executable is not in $PATH.
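The (isok, tuples) convention above can be sketched with plain Python namedtuples. Only the fields retcode, log_file and stderr_file come from the docstring; the names ValidationResult and summarize are hypothetical, not part of the pymatgen API.

```python
from collections import namedtuple

# Field names are taken from the docstring; the container name is illustrative.
ValidationResult = namedtuple("ValidationResult", "retcode log_file stderr_file")

def summarize(results):
    """Mimic the (isok, tuples) return value: isok is True iff every retcode is 0."""
    isok = all(res.retcode == 0 for res in results)
    return isok, results

ok, tuples = summarize([ValidationResult(0, "run1.log", "run1.err"),
                        ValidationResult(0, "run2.log", "run2.err")])
# ok is True because both return codes are 0
```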
-
all_ok¶ True if all the tasks in works have reached S_OK.
-
allocate(workdir=None)[source]¶ Allocate the Flow, i.e. assign the workdir and (optionally) the
TaskManager to the different tasks in the Flow.
Parameters: workdir – Working directory of the flow. Must be specified here if we haven't initialized the workdir in the __init__.
-
allocated¶ Number of allocations. Set by allocate.
-
as_dict(**kwargs)[source]¶ JSON serialization. Note that we only need to save a string with the working directory since the object will be reconstructed from the pickle file located in workdir.
-
batch(timelimit=None)[source]¶ Run the flow in batch mode and return the exit status of the job script. Requires a manager.yml file and a batch_adapter adapter.
Parameters: timelimit – Time limit (int with seconds or string with the time given in the Slurm convention: "days-hours:minutes:seconds"). If timelimit is None, the default value specified in the batch_adapter entry of manager.yml is used.
-
build_and_pickle_dump(abivalidate=False)[source]¶ Build dirs and files of the Flow and save the object in pickle format. Returns 0 on success.
Parameters: abivalidate – If True, all the input files are validated by calling the abinit parser. If the validation fails, ValueError is raised.
-
cancel(nids=None)[source]¶ Cancel all the tasks that are in the queue. nids is an optional list of node identifiers used to filter the tasks.
Returns: Number of jobs cancelled, negative value on error.
-
check_pid_file()[source]¶ This function checks if we are already running the
Flow with a PyFlowScheduler. Raises: Flow.Error if the pid file of the scheduler exists.
-
check_status(**kwargs)[source]¶ Check the status of the works in self.
Parameters: - show – True to show the status of the flow.
- kwargs – keyword arguments passed to show_status
-
chroot(new_workdir)[source]¶ Change the workdir of the
Flow. Mainly used for allowing the user to open the GUI on the local host and access the flow from remote via sshfs.
Note
Calling this method will make the flow go in read-only mode.
-
connect_signals()[source]¶ Connect the signals within the Flow. The Flow is responsible for catching the important signals raised from its works.
-
debug(status=None, nids=None)[source]¶ This method is usually used when the flow didn't complete successfully. It analyzes the files produced by the tasks to facilitate debugging. Info is printed to stdout.
Parameters: - status – If not None, only the tasks with this status are selected
- nids – optional list of node identifiers used to filter the tasks.
-
errored_tasks¶ List of errored tasks.
-
find_deadlocks()[source]¶ This function detects deadlocks.
Returns: named tuple with the tasks grouped in: deadlocks, runnables, running.
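As a rough illustration of how tasks might end up in these groups, here is a self-contained toy classifier over a dependency graph. It is not the library's implementation: the names classify, status_of and deps_of are invented for this sketch, and tasks still waiting on unfinished (but healthy) dependencies fall in none of the three groups.

```python
from collections import namedtuple

# Illustrative container mirroring the documented grouping.
DeadlockInfo = namedtuple("DeadlockInfo", "deadlocked runnables running")

def classify(tasks, status_of, deps_of):
    """Toy classification: a task is runnable if all its dependencies are done,
    deadlocked if it (or one of its dependencies) has errored out."""
    running, runnables, deadlocked = [], [], []
    for t in tasks:
        st = status_of[t]
        if st == "running":
            running.append(t)
        elif st == "error" or any(status_of[d] == "error" for d in deps_of.get(t, [])):
            deadlocked.append(t)
        elif all(status_of[d] == "ok" for d in deps_of.get(t, [])):
            runnables.append(t)
    return DeadlockInfo(deadlocked, runnables, running)
```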
-
fix_abicritical()[source]¶ This function tries to fix critical events originating from ABINIT. Returns the number of tasks that have been fixed.
-
fix_queue_critical()[source]¶ This function tries to fix critical events originating from the queue submission system.
Returns the number of tasks that have been fixed.
-
classmethod
from_inputs(workdir, inputs, manager=None, pickle_protocol=-1, task_class=<class 'pymatgen.io.abinit.tasks.ScfTask'>, work_class=<class 'pymatgen.io.abinit.works.Work'>)[source]¶ Construct a simple flow from a list of inputs. The flow contains a single Work with tasks whose class is given by task_class.
Warning
Don’t use this interface if you have dependencies among the tasks.
Parameters: - workdir – String specifying the directory where the works will be produced.
- inputs – List of inputs.
- manager –
TaskManager object responsible for the submission of the jobs. If manager is None, the object is initialized from the yaml file located either in the working directory or in the user configuration dir. - pickle_protocol – Pickle protocol version used for saving the status of the object. -1 denotes the latest version supported by the python interpreter.
- task_class – The class of the Task.
- work_class – The class of the Work.
-
get_dict_for_mongodb_queries()[source]¶ This function returns a dictionary with the attributes that will be put in the mongodb document to facilitate the query. Subclasses may want to replace or extend the default behaviour.
-
get_mongo_info()[source]¶ Return a JSON dictionary with information on the flow. Mainly used for constructing the info section in FlowEntry. The default implementation is empty. Subclasses must implement it.
-
get_njobs_in_queue(username=None)[source]¶ Returns the number of jobs in the queue, or None when the number of jobs cannot be determined.
Parameters: username – (str) the username of the jobs to count (default is to autodetect)
-
groupby_status()[source]¶ Returns an ordered dictionary mapping the task status to the list of named tuples (task, work_index, task_index).
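A minimal sketch of this grouping with collections.OrderedDict. The field names come from the docstring; the Entry namedtuple name and the flat (status, task, work_index, task_index) input model are invented for this illustration.

```python
from collections import OrderedDict, namedtuple

# Field names are the ones documented above; "Entry" is an illustrative name.
Entry = namedtuple("Entry", "task work_index task_index")

def groupby_status(flat_tasks):
    """Toy model: flat_tasks is an iterable of (status, task, work_index, task_index)
    tuples. Returns an ordered dict mapping status -> list of Entry namedtuples."""
    groups = OrderedDict()
    for status, task, wi, ti in flat_tasks:
        groups.setdefault(status, []).append(Entry(task, wi, ti))
    return groups
```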
-
groupby_task_class()[source]¶ Returns a dictionary mapping the task class to the list of tasks in the flow
-
has_chrooted¶ Returns a string that evaluates to True if we have changed the workdir for visualization purposes, e.g. we are using sshfs to mount the remote directory where the Flow is located. The string gives the previous workdir of the flow.
-
has_db¶ True if flow uses MongoDB to store the results.
-
iflat_nodes(status=None, op='==', nids=None)[source]¶ Generator that produces a flat sequence of nodes. If status is not None, only the tasks with the specified status are selected. nids is an optional list of node identifiers used to filter the nodes.
-
iflat_tasks(status=None, op='==', nids=None)[source]¶ Generator to iterate over all the tasks of the
Flow. If status is not None, only the tasks whose status satisfies the condition (task.status op status) are selected. status can be either one of the flags defined in the
Task class (e.g. Task.S_OK) or a string, e.g. "S_OK". nids is an optional list of node identifiers used to filter the tasks.
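The (task.status op status) filter can be modeled with the standard operator module. This is a toy stand-in, not the library code: works is modeled as nested lists of (node_id, status) pairs, and statuses are plain comparable values.

```python
import operator

# Map the string form of `op` to a comparison function (illustrative mapping).
OPS = {"==": operator.eq, "!=": operator.ne, ">=": operator.ge,
       "<=": operator.le, ">": operator.gt, "<": operator.lt}

def iflat_tasks(works, status=None, op="==", nids=None):
    """Toy model: yield (node_id, status) pairs from a list of works,
    filtering by (task_status op status) and by node identifiers."""
    cmp = OPS[op]
    for work in works:
        for node_id, task_status in work:
            if status is not None and not cmp(task_status, status):
                continue
            if nids is not None and node_id not in nids:
                continue
            yield node_id, task_status
```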
-
iflat_tasks_wti(status=None, op='==', nids=None)[source]¶ Generator to iterate over all the tasks of the Flow. :Yields: (task, work_index, task_index)
If status is not None, only the tasks whose status satisfies the condition (task.status op status) are selected. status can be either one of the flags defined in the
Task class (e.g. Task.S_OK) or a string, e.g. "S_OK". nids is an optional list of node identifiers used to filter the tasks.
-
inspect(nids=None, wslice=None, **kwargs)[source]¶ Inspect the tasks (SCF iterations, structural relaxation, …) and produce matplotlib plots.
Parameters: - nids – List of node identifiers.
- wslice – Slice object used to select works.
- kwargs – keyword arguments passed to task.inspect method.
Note
nids and wslice are mutually exclusive. If nids and wslice are both None, all tasks in self are inspected.
Returns: List of matplotlib figures.
-
listext(ext, stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]¶ Print to the given stream a table with the list of the output files with the given ext produced by the flow.
-
look_before_you_leap()[source]¶ This method should be called before running the calculation to make sure that the most important requirements are satisfied.
Returns: List of strings with inconsistencies/errors.
-
make_light_tarfile(name=None)[source]¶ Lightweight tarball file. Mainly used for debugging. Return the name of the tarball file.
-
make_scheduler(**kwargs)[source]¶ Build and return a
PyFlowScheduler to run the flow.
Parameters: kwargs – If empty, we use the user configuration file. If filepath is in kwargs, we init the scheduler from filepath; else we pass **kwargs to the PyFlowScheduler __init__ method.
-
make_tarfile(name=None, max_filesize=None, exclude_exts=None, exclude_dirs=None, verbose=0, **kwargs)[source]¶ Create a tarball file.
Parameters: - name – Name of the tarball file. Set to os.path.basename(flow.workdir) + ".tar.gz" if name is None.
- max_filesize (int or string with unit) – a file is included in the tar file if its size <= max_filesize. Can be specified in bytes, e.g. max_filesize=1024, or with a string with unit, e.g. max_filesize="1 Mb". No check is done if max_filesize is None.
- exclude_exts – List of file extensions to be excluded from the tar file.
- exclude_dirs – List of directory basenames to be excluded.
- verbose (int) – Verbosity level.
- kwargs – keyword arguments passed to the
TarFile constructor.
Returns: The name of the tarfile.
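The size and extension filtering described above can be sketched with the standard tarfile module. This is a simplified stand-in: make_filtered_tarball is an invented name, max_filesize is taken in plain bytes, and parsing of unit strings such as "1 Mb" is omitted.

```python
import os
import tarfile
import tempfile

def make_filtered_tarball(top, name, max_filesize=None, exclude_exts=()):
    """Toy version of the selection logic: walk `top` and add files to a
    gzipped tarball, skipping files larger than max_filesize (bytes) or
    whose name ends with one of the excluded extensions."""
    with tarfile.open(name, mode="w:gz") as tar:
        for root, _dirs, files in os.walk(top):
            for fname in files:
                path = os.path.join(root, fname)
                if exclude_exts and fname.endswith(tuple(exclude_exts)):
                    continue
                if max_filesize is not None and os.path.getsize(path) > max_filesize:
                    continue
                tar.add(path, arcname=os.path.relpath(path, top))
    return name
```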
-
mongo_assimilate()[source]¶ This function is called by client code when the flow is completed. Return a JSON dictionary with the most important results produced by the flow. The default implementation is empty. Subclasses must implement it.
-
mongo_id¶
-
ncores_allocated¶ Returns the number of cores allocated at this moment. A core is allocated if it's running a task or if we have submitted a task to the queue manager but the job is still pending.
-
ncores_reserved¶ Returns the number of cores reserved at this moment. A core is reserved if the task is not running but we have submitted the task to the queue manager.
-
ncores_used¶ Returns the number of cores used at this moment. A core is used if there's a job running on it.
-
num_errored_tasks¶ The number of tasks whose status is S_ERROR.
-
num_tasks¶ Total number of tasks
-
num_unconverged_tasks¶ The number of tasks whose status is S_UNCONVERGED.
-
open_files(what='o', status=None, op='==', nids=None, editor=None)[source]¶ Open the files of the flow inside an editor (command line interface).
Parameters: - what –
string with the list of characters selecting the file type Possible choices:
i ==> input_file, o ==> output_file, f ==> files_file, j ==> job_file, l ==> log_file, e ==> stderr_file, q ==> qout_file, all ==> all files. - status – if not None, only the tasks with this status are select
- op – status operator. Requires status. A task is selected if task.status op status evaluates to true.
- nids – optional list of node identifiers used to filter the tasks.
- editor – Select the editor. None to use the default editor ($EDITOR shell env var)
-
parse_timing(nids=None)[source]¶ Parse the timer data in the main output file(s) of Abinit. Requires timopt /= 0 in the input file (usually timopt = -1).
Parameters: nids – optional list of node identifiers used to filter the tasks.
Return: AbinitTimerParser instance, None if error.
-
pickle_dumps(protocol=None)[source]¶ Return a string with the pickle representation. protocol selects the pickle protocol. self.pickle_protocol is used if protocol is None.
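A minimal sketch of the documented fallback behaviour. PickleMixin is an invented name standing in for the Flow; only the protocol-defaulting logic comes from the docstring.

```python
import pickle

class PickleMixin:
    """Toy model: fall back to self.pickle_protocol when no explicit
    protocol is passed, as the docstring above describes."""
    def __init__(self, pickle_protocol=-1):
        self.pickle_protocol = pickle_protocol

    def pickle_dumps(self, protocol=None):
        # Use the instance-level default when the caller passes None.
        if protocol is None:
            protocol = self.pickle_protocol
        return pickle.dumps(self, protocol=protocol)
```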
-
pickle_file¶ The path of the pickle file.
-
classmethod
pickle_load(filepath, spectator_mode=True, remove_lock=False)[source]¶ Loads the object from a pickle file and performs initial setup.
Parameters: - filepath – Filename or directory name. If filepath is a directory, we scan the directory tree starting from filepath and we read the first pickle database. Raise RuntimeError if multiple databases are found.
- spectator_mode – If True, the nodes of the flow are not connected by signals. This option is usually used when we want to read a flow in read-only mode and we want to avoid callbacks that can change the flow.
- remove_lock – True to remove the file lock if any (use it carefully).
-
pid_file¶ The path of the pid file created by PyFlowScheduler.
-
plot_networkx(mode='network', with_edge_labels=False, ax=None, node_size='num_cores', node_label='name_class', layout_type='spring', **kwargs)[source]¶ Use networkx to draw the flow with the connections among the nodes and the status of the tasks.
Parameters: - mode – networkx to show connections, status to group tasks by status.
- with_edge_labels – True to draw edge labels.
- ax – matplotlib Axes or None if a new figure should be created.
- node_size – By default, the size of the node is proportional to the number of cores used.
- node_label – By default, the task class is used to label the node.
- layout_type – Get positions for all nodes using layout_type, e.g. pos = nx.spring_layout(g).
Warning
Requires networkx package.
keyword arguments controlling the display of the figure:
kwargs        Meaning
title         Title of the plot (Default: None).
show          True to show the figure (default: True).
savefig       'abc.png' or 'abc.eps' to save the figure to a file.
size_kwargs   Dictionary with options passed to fig.set_size_inches, e.g. size_kwargs=dict(w=3, h=4).
tight_layout  True to call fig.tight_layout (default: False).
-
pyfile¶ Absolute path of the python script used to generate the flow. Set by set_pyfile
-
rapidfire(check_status=True, **kwargs)[source]¶ Use
PyLauncher to submit tasks in rapidfire mode. kwargs contains the options passed to the launcher.
Returns: number of tasks submitted.
-
register_task(input, deps=None, manager=None, task_class=None)[source]¶ Utility function that generates a Work made of a single task.
Parameters: - input – AbinitInput
- deps – List of Dependency objects specifying the dependencies of this node. An empty list of deps implies that this node has no dependencies.
- manager – The TaskManager responsible for the submission of the task. If manager is None, we use the TaskManager specified during the creation of the work.
- task_class – Task subclass to instantiate. Default: AbinitTask
Returns: The generated Work for the task; work[0] is the actual task.
-
register_work(work, deps=None, manager=None, workdir=None)[source]¶ Register a new
Work and add it to the internal list, taking into account possible dependencies.
Parameters: - work – Work object.
- deps – List of Dependency objects specifying the dependencies of this node. An empty list of deps implies that this node has no dependencies.
- manager – The TaskManager responsible for the submission of the task. If manager is None, we use the TaskManager specified during the creation of the work.
- workdir – The name of the directory used for the Work.
Returns: The registered Work.
-
register_work_from_cbk(cbk_name, cbk_data, deps, work_class, manager=None)[source]¶ Registers a callback function that will generate the
Tasks of the Work.
Parameters: - cbk_name – Name of the callback function (must be a bound method of self).
- cbk_data – Additional data passed to the callback function.
- deps – List of Dependency objects specifying the dependencies of the work.
- work_class – Work class to instantiate.
- manager – The TaskManager responsible for the submission of the task. If manager is None, we use the TaskManager specified during the creation of the Flow.
Returns: The Work that will be finalized by the callback.
-
reload()[source]¶ Reload the flow from the pickle file. Used when we are monitoring the flow executed by the scheduler. In this case, indeed, the flow might have been changed by the scheduler and we have to reload the new flow in memory.
-
select_tasks(nids=None, wslice=None)[source]¶ Return a list with a subset of tasks.
Parameters: - nids – List of node identifiers.
- wslice – Slice object used to select works.
Note
nids and wslice are mutually exclusive. If no argument is provided, the full list of tasks is returned.
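The nids/wslice selection logic can be sketched as follows. The data model (works as nested lists of (node_id, task) pairs) and the explicit ValueError for the mutually exclusive arguments are assumptions of this toy version.

```python
def select_tasks(all_tasks, nids=None, wslice=None):
    """Toy model: all_tasks is a list of works, each a list of
    (node_id, task) pairs. nids and wslice are mutually exclusive;
    with no argument the full flat list of tasks is returned."""
    if nids is not None and wslice is not None:
        raise ValueError("nids and wslice are mutually exclusive")
    if nids is not None:
        # Keep only the tasks whose node identifier is in nids.
        return [task for work in all_tasks for node_id, task in work if node_id in nids]
    if wslice is not None:
        # Slice over the works, then flatten their tasks.
        return [task for work in all_tasks[wslice] for _nid, task in work]
    return [task for work in all_tasks for _nid, task in work]
```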
-
set_garbage_collector(exts=None, policy='task')[source]¶ Enable the garbage collector that will remove the big output files that are not needed.
Parameters: - exts – string or list with the Abinit file extensions to be removed. A default is provided if exts is None.
- policy – Either 'flow' or 'task'. If policy is set to 'task', we remove the output files as soon as the task reaches S_OK. If 'flow', the files are removed only when the flow is finalized. This option should be used when we are dealing with a dynamic flow with callbacks generating other tasks, since a Task might not be aware of its children when it reaches S_OK.
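A toy sketch of the 'task' policy: once a task reaches S_OK, remove its output files matching the given extensions. The name cleanup_outputs and the default extension list are illustrative, not the library defaults.

```python
import os

def cleanup_outputs(task_dir, exts=("WFK", "SUS", "SCR"), policy="task", task_status="S_OK"):
    """Toy model of the 'task' policy: remove output files whose name ends
    with one of the given Abinit extensions once the task reaches S_OK.
    Returns the list of removed file names."""
    if policy != "task" or task_status != "S_OK":
        return []
    removed = []
    for fname in os.listdir(task_dir):
        if fname.endswith(tuple(exts)):
            os.remove(os.path.join(task_dir, fname))
            removed.append(fname)
    return removed
```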
-
set_spectator_mode(mode=True)[source]¶ When the flow is in spectator mode, we have to disable signals, pickle dump and possible callbacks. A spectator can still operate on the flow but the new status of the flow won't be saved in the pickle file. Usually the flow is in spectator mode when we are already running it via the scheduler or other means, and we should not interfere with its evolution. This is the reason why signals and callbacks must be disabled. Unfortunately, preventing client code from calling methods with side effects when the flow is in spectator mode is not easy (e.g. flow.cancel will cancel the tasks submitted to the queue, and the flow used by the scheduler won't see this change!).
-
set_workdir(workdir, chroot=False)[source]¶ Set the working directory. Cannot be set more than once unless chroot is True.
-
show_abierrors(nids=None, stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]¶ Write to the given stream the list of ABINIT errors for all tasks whose status is S_ABICRITICAL.
Parameters: - nids – optional list of node identifiers used to filter the tasks.
- stream – File-like object. Default: sys.stdout
-
show_corrections(status=None, nids=None)[source]¶ Show the corrections applied to the flow at run-time.
Parameters: - status – if not None, only the tasks with this status are selected.
- nids – optional list of node identifiers used to filter the tasks.
Return: The number of corrections found.
-
show_dependencies(stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]¶ Writes to the given stream the ASCII representation of the dependency tree.
-
show_events(status=None, nids=None)[source]¶ Print the Abinit events (ERRORS, WARNINGS, COMMENTS) to stdout.
Parameters: - status – if not None, only the tasks with this status are selected.
- nids – optional list of node identifiers used to filter the tasks.
-
show_history(status=None, nids=None, full_history=False, metadata=False)[source]¶ Print the history of the flow to stdout.
Parameters: - status – if not None, only the tasks with this status are selected.
- full_history – Print full info set, including nodes with an empty history.
- nids – optional list of node identifiers used to filter the tasks.
- metadata – print history metadata (experimental)
-
show_info(**kwargs)[source]¶ Print info on the flow i.e. total number of tasks, works, tasks grouped by class.
Example
Task Class    Number
------------  --------
ScfTask       1
NscfTask      1
ScrTask       2
SigmaTask     6
-
show_inputs(varnames=None, nids=None, wslice=None, stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]¶ Print the input of the tasks to the given stream.
Parameters: - varnames – List of Abinit variables. If not None, only the variable in varnames are selected and printed.
- nids – List of node identifiers. By default all nodes are shown.
- wslice – Slice object used to select works.
- stream – File-like object, Default: sys.stdout
-
show_qouts(nids=None, stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]¶ Write to the given stream the content of the queue output file for all tasks whose status is S_QCRITICAL.
Parameters: - nids – optional list of node identifiers used to filter the tasks.
- stream – File-like object. Default: sys.stdout
-
show_status(**kwargs)[source]¶ Report the status of the works and the status of the different tasks on the specified stream.
Parameters: - stream – File-like object, Default: sys.stdout
- nids – List of node identifiers. By default all nodes are shown.
- wslice – Slice object used to select works.
- verbose – Verbosity level (default 0). > 0 to show only the works that are not finalized.
-
show_summary(**kwargs)[source]¶ Print a short summary with the status of the flow and a counter task_status --> number_of_tasks.
Parameters: stream – File-like object, Default: sys.stdout
Example
Status      Count
---------   -------
Completed   10
<Flow, node_id=27163, workdir=flow_gwconv_ecuteps>, num_tasks=10, all_ok=True
-
single_shot(check_status=True, **kwargs)[source]¶ Use
PyLauncher to submit one task. kwargs contains the options passed to the launcher.
Returns: number of tasks submitted.
-
status_counter¶ Returns a
Counter object that counts the number of tasks with given status (use the string representation of the status as key).
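A minimal model with collections.Counter, keyed by the string representation of the status as the docstring says. The function name status_counter mirrors the property; the input model (a flat iterable of statuses) is a simplification.

```python
from collections import Counter

def status_counter(statuses):
    """Toy model: count tasks by the string representation of their status."""
    return Counter(str(status) for status in statuses)

counter = status_counter(["S_OK", "S_OK", "S_ERROR"])
# counter["S_OK"] == 2
```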
-
tasks_from_nids(nids)[source]¶ Return the list of tasks associated to the given list of node identifiers (nids).
Note
Invalid ids are ignored
-
classmethod
temporary_flow(manager=None)[source]¶ Return a Flow in a temporary directory. Useful for unit tests.
-
to_dict(**kwargs)¶ JSON serialization. Note that we only need to save a string with the working directory since the object will be reconstructed from the pickle file located in workdir.
-
unconverged_tasks¶ List of unconverged tasks.
-
use_smartio()[source]¶ This function should be called when the entire Flow has been built. It tries to reduce the pressure on the hard disk by using Abinit smart-io capabilities for those files that are not needed by other nodes. Smart-io means that big files (e.g. WFK) are written only if the calculation is unconverged so that we can restart from it. No output is produced if convergence is achieved.
-
works¶ List of
Work objects contained in self.
-
class
G0W0WithQptdmFlow(workdir, scf_input, nscf_input, scr_input, sigma_inputs, manager=None)[source]¶ Bases:
pymatgen.io.abinit.flows.Flow
Build a Flow for one-shot G0W0 calculations. The computation of the q-points for the screening is parallelized with qptdm, i.e. we run independent calculations for each q-point and then we merge the final results.
Parameters: - workdir – Working directory.
- scf_input – Input for the GS SCF run.
- nscf_input – Input for the NSCF run (band structure run).
- scr_input – Input for the SCR run.
- sigma_inputs – Input(s) for the SIGMA run(s).
- manager –
TaskManager object used to submit the jobs. Initialized from manager.yml if manager is None.
-
bandstructure_flow(workdir, scf_input, nscf_input, dos_inputs=None, manager=None, flow_class=<class 'pymatgen.io.abinit.flows.Flow'>, allocate=True)[source]¶ Build a
Flow for band structure calculations.
Parameters: - workdir – Working directory.
- scf_input – Input for the GS SCF run.
- nscf_input – Input for the NSCF run (band structure run).
- dos_inputs – Input(s) for the NSCF run (dos run).
- manager –
TaskManager object used to submit the jobs. Initialized from manager.yml if manager is None. - flow_class – Flow subclass
- allocate – True if the flow should be allocated before returning.
Returns: Flow object.
-
g0w0_flow(workdir, scf_input, nscf_input, scr_input, sigma_inputs, manager=None, flow_class=<class 'pymatgen.io.abinit.flows.Flow'>, allocate=True)[source]¶ Build a
Flow for one-shot $G_0W_0$ calculations.
Parameters: - workdir – Working directory.
- scf_input – Input for the GS SCF run.
- nscf_input – Input for the NSCF run (band structure run).
- scr_input – Input for the SCR run.
- sigma_inputs – List of inputs for the SIGMA run.
- flow_class – Flow class
- manager –
TaskManager object used to submit the jobs. Initialized from manager.yml if manager is None. - allocate – True if the flow should be allocated before returning.
Returns: Flow object.
-
phonon_flow(workdir, scf_input, ph_inputs, with_nscf=False, with_ddk=False, with_dde=False, manager=None, flow_class=<class 'pymatgen.io.abinit.flows.PhononFlow'>, allocate=True)[source]¶ Build a
PhononFlow for phonon calculations.
Parameters: - workdir – Working directory.
- scf_input – Input for the GS SCF run.
- ph_inputs – List of Inputs for the phonon runs.
- with_nscf – add an nscf task in front of all phonon tasks to make sure the q-point is covered.
- with_ddk – add the ddk step
- with_dde – add the dde step; if dde is set, ddk is switched on automatically.
- manager –
TaskManager used to submit the jobs. Initialized from manager.yml if manager is None. - flow_class – Flow class
Returns: Flow object.