
The aiida-lsmo plugin for AiiDA

Getting started

This plugin is a collection of work chains and calculation functions that combine the use of multiple codes (e.g., CP2K, DDEC, Raspa, Zeo++, …) to achieve advanced automated tasks.

Installation

Use the following commands to install the plugin:

git clone https://github.com/yakutovicha/aiida-lsmo
cd aiida-lsmo
pip install -e .

Note

This will also install the related plugins (e.g., aiida-cp2k, aiida-raspa, …) if they are not already present, but the codes (e.g., CP2K, RASPA, …) need to be set up before using these work chains.

Usage

For each work chain, at least one example is provided in the examples directory: these examples are usually quick, and you can run them on your localhost in a couple of minutes.

A quick demo on how to submit a work chain:

verdi daemon start         # make sure the daemon is running
cd examples
verdi run run_IsothermWorkChain_HKUST-1.py raspa@localhost zeopp@localhost

Note that in the running script, the work chain is imported using the WorkflowFactory:

from aiida.plugins import WorkflowFactory

IsothermWorkChain = WorkflowFactory('lsmo.isotherm')

while a calculation function is imported with the CalculationFactory:

from aiida.plugins import CalculationFactory

FFBuilder = CalculationFactory('lsmo.ff_builder')

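You can also build and submit a work chain from a Python script or the verdi shell, instead of using a run script. The following is a minimal sketch for the Isotherm work chain (the code labels, the CIF path and the parameter values are assumptions to adapt to your setup):

    from aiida import load_profile
    from aiida.engine import submit
    from aiida.orm import Dict, Str, load_code
    from aiida.plugins import DataFactory, WorkflowFactory

    load_profile()

    CifData = DataFactory('cif')
    IsothermWorkChain = WorkflowFactory('lsmo.isotherm')

    builder = IsothermWorkChain.get_builder()
    builder.structure = CifData(file='/path/to/HKUST-1.cif')  # framework with partial charges
    builder.molecule = Str('co2')                             # molecule tag, looked up in the YAML
    builder.parameters = Dict(dict={'ff_framework': 'UFF', 'temperature': 298})
    builder.raspa_base.raspa.code = load_code('raspa@localhost')
    builder.raspa_base.raspa.metadata.options.resources = {'num_machines': 1}
    builder.zeopp.code = load_code('zeopp@localhost')

    node = submit(builder)
    print('Submitted IsothermWorkChain<{}>'.format(node.pk))
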
After you run the work chain you can inspect the log, for example:

$ verdi process report 266248

2019-11-22 16:54:52 [90962 | REPORT]: [266248|Cp2kMultistageWorkChain|setup_multistage]: Unit cell was NOT resized
2019-11-22 16:54:52 [90963 | REPORT]: [266248|Cp2kMultistageWorkChain|run_stage]: submitted Cp2kBaseWorkChain for stage_0/settings_0
2019-11-22 16:54:52 [90964 | REPORT]:   [266252|Cp2kBaseWorkChain|run_calculation]: launching Cp2kCalculation<266253> iteration #1
2019-11-22 16:55:13 [90965 | REPORT]:   [266252|Cp2kBaseWorkChain|inspect_calculation]: Cp2kCalculation<266253> completed successfully
2019-11-22 16:55:13 [90966 | REPORT]:   [266252|Cp2kBaseWorkChain|results]: work chain completed after 1 iterations
2019-11-22 16:55:14 [90967 | REPORT]:   [266252|Cp2kBaseWorkChain|on_terminated]: remote folders will not be cleaned
2019-11-22 16:55:14 [90968 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_settings_stage0]: Bandgaps spin1/spin2: -0.058 and -0.058 ev
2019-11-22 16:55:14 [90969 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_settings_stage0]: BAD SETTINGS: band gap is < 0.100 eV
2019-11-22 16:55:14 [90970 | REPORT]: [266248|Cp2kMultistageWorkChain|run_stage]: submitted Cp2kBaseWorkChain for stage_0/settings_1
2019-11-22 16:55:15 [90971 | REPORT]:   [266259|Cp2kBaseWorkChain|run_calculation]: launching Cp2kCalculation<266260> iteration #1
2019-11-22 16:55:34 [90972 | REPORT]:   [266259|Cp2kBaseWorkChain|inspect_calculation]: Cp2kCalculation<266260> completed successfully
2019-11-22 16:55:34 [90973 | REPORT]:   [266259|Cp2kBaseWorkChain|results]: work chain completed after 1 iterations
2019-11-22 16:55:34 [90974 | REPORT]:   [266259|Cp2kBaseWorkChain|on_terminated]: remote folders will not be cleaned
2019-11-22 16:55:35 [90975 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_settings_stage0]: Bandgaps spin1/spin2: 0.000 and 0.000 ev
2019-11-22 16:55:35 [90976 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_stage]: Structure updated for next stage
2019-11-22 16:55:35 [90977 | REPORT]: [266248|Cp2kMultistageWorkChain|run_stage]: submitted Cp2kBaseWorkChain for stage_1/settings_1
2019-11-22 16:55:35 [90978 | REPORT]:   [266266|Cp2kBaseWorkChain|run_calculation]: launching Cp2kCalculation<266267> iteration #1
2019-11-22 16:55:53 [90979 | REPORT]:   [266266|Cp2kBaseWorkChain|inspect_calculation]: Cp2kCalculation<266267> completed successfully
2019-11-22 16:55:53 [90980 | REPORT]:   [266266|Cp2kBaseWorkChain|results]: work chain completed after 1 iterations
2019-11-22 16:55:54 [90981 | REPORT]:   [266266|Cp2kBaseWorkChain|on_terminated]: remote folders will not be cleaned
2019-11-22 16:55:54 [90982 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_stage]: Structure updated for next stage
2019-11-22 16:55:54 [90983 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_stage]: All stages computed, finishing...
2019-11-22 16:55:55 [90984 | REPORT]: [266248|Cp2kMultistageWorkChain|results]: Outputs: Dict<266273> and StructureData<266271>

You can also inspect the inputs/outputs at a glance with verdi node show, for example:

$ verdi node show 266248

Property     Value
-----------  ------------------------------------
type         Cp2kMultistageWorkChain
state        Finished [0]
pk           266248
uuid         f707d727-f7c2-4232-a90c-d9e2711e5fe6
label
description
ctime        2019-11-22 16:54:51.692140+00:00
mtime        2019-11-22 16:55:55.239555+00:00
computer     [21] localhost

Inputs                 PK      Type
---------------------  ------  -------------
cp2k_base
    clean_workdir      266246  Bool
    max_iterations     266245  Int
    cp2k
        code           265588  Code
        parameters     266244  Dict
min_cell_size          266247  Float
protocol_modify        266243  Dict
protocol_tag           266241  Str
starting_settings_idx  266242  Int
structure              266240  StructureData

Outputs                    PK  Type
---------------------  ------  -------------
last_input_parameters  266265  Dict
output_parameters      266273  Dict
output_structure       266271  StructureData
remote_folder          266268  RemoteData

Called                      PK  Type
----------------------  ------  ----------------
CALL                    266272  CalcFunctionNode
run_stage_1_settings_1  266266  WorkChainNode
run_stage_0_settings_1  266259  WorkChainNode
run_stage_0_settings_0  266252  WorkChainNode
CALL                    266249  CalcFunctionNode

Log messages
----------------------------------------------
There are 11 log messages for this calculation
Run 'verdi process report 266248' to see them

Another good idea is to generate the provenance graph of your work chain with verdi node graph generate 266248, to inspect all its internal steps:

_images/multistage_wc_al.png (provenance graph of a Cp2kMultistageWorkChain run)

LSMO calc functions and work chains

In the following section all the calc functions and work chains of the aiida-lsmo plugin are listed and documented.

Force Field Builder

The ff_builder() calculation function combines the force field parameters (typically for a Lennard-Jones potential) of a framework and of the molecule(s), giving as output the .def files required by Raspa. To see the list of available parameterizations for the frameworks and the available molecules, have a look at the file ff_data.yaml.

What it can do:

  1. Switch the settings that are written in Raspa's .def files, such as tail corrections, truncation/shifting and mixing rules.

  2. Decide to separate the interactions, so that framework-molecule and molecule-molecule interactions are parametrized differently (e.g., TraPPE for the molecule-molecule interactions and UFF for the framework-molecule interactions, instead of mixed UFF/TraPPE for framework-molecule).

What it currently can not do:

  1. Deal with flexible molecules.

  2. Take parameters from other files (e.g., YAML).

  3. Generate .def files for a molecule, given just the geometry: it has to be included in the ff_data.yaml file.

Inputs details

  • Parameters Dict:

    PARAMS_EXAMPLE = Dict( dict = {
       'ff_framework': 'UFF',              # See force fields available in ff_data.yaml as framework.keys()
       'ff_molecules': {                   # See molecules available in ff_data.yaml as ff_data.keys()
           'CO2': 'TraPPE',                    # See force fields available in ff_data.yaml as {molecule}.keys()
           'N2': 'TraPPE'
       },
       'shifted': True,                    # If True shift dispersion interactions, if False simply truncate them
       'tail_corrections': False,          # If True apply tail corrections based on homogeneous-liquid assumption
       'mixing_rule': 'Lorentz-Berthelot', # Options: 'Lorentz-Berthelot' or 'Jorgensen'
       'separate_interactions': True       # If True use framework's force field for framework-molecule interactions
    })
    

Outputs details

  • Dictionary containing the .def files as SinglefileData. This output dictionary is ready to be used as the file input of a RaspaCalculation: you can find an example of the usage of this calc function in the IsothermWorkChain, a minimal test usage in the examples, and a sketch below.
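
A minimal sketch of calling the calc function directly, assuming the PARAMS_EXAMPLE Dict defined above and that ff_builder takes it as its single input:

    from aiida import load_profile
    from aiida.plugins import CalculationFactory

    load_profile()

    FFBuilder = CalculationFactory('lsmo.ff_builder')

    # Running the calc function stores the provenance and returns a dictionary
    # of SinglefileData nodes, one per generated .def file.
    ff_files = FFBuilder(PARAMS_EXAMPLE)

    # Ready to be used, e.g., as the file input of a RaspaCalculation builder:
    # builder.raspa.file = ff_files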

Selectivity calculators

The calc_selectivity() calculation function computes the selectivity of two gases in a material as the ratio of their Henry coefficients. In the future, this module will also host different metrics to assess selectivity for specific applications.
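
In plain Python, the computed quantity is simply S(A/B) = kH_A / kH_B. A minimal post-processing sketch, reading the henry_coefficient_average key from the output_parameters of two Isotherm work chains (the PKs are hypothetical):

    from aiida.orm import load_node

    out_co2 = load_node(266273)  # output_parameters of a CO2 isotherm (hypothetical PK)
    out_n2 = load_node(266280)   # output_parameters of an N2 isotherm (hypothetical PK)

    kh_co2 = out_co2.get_dict()['henry_coefficient_average']  # mol/kg/Pa
    kh_n2 = out_n2.get_dict()['henry_coefficient_average']    # mol/kg/Pa

    selectivity = kh_co2 / kh_n2  # dimensionless CO2/N2 selectivity
    print('CO2/N2 selectivity: {:.1f}'.format(selectivity))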

Working Capacity calculators

The module calcfunctions/working_cap.py contains a collection of calculation functions to compute the working capacities of different compounds (e.g., CH4, H2) at industrially relevant reference conditions. The working capacity is the usable amount of an adsorbed compound between the loading and discharging temperature and pressure. These are post-processing calculations based on the output_parameters of the Isotherm or IsothermMultiTemp work chains, which need to be run at specific conditions: see the header of each calc function for those conditions. Their inner workings are very simple, but they are collected in this repository to be used as a reference in our group. If you are investigating a different gas storage application, consider including a similar script here.

An example is calc_ch4_working_cap() for methane storage.
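
As an illustration of what these functions do internally: the working capacity is the difference between the loading at the charging conditions and the loading at the discharging conditions. A minimal sketch, assuming the isotherm was computed exactly at the two reference pressures (5.8 bar and 65 bar at 298 K, commonly used for CH4 storage; all values are illustrative):

    # `output_parameters` is the Dict output of an Isotherm work chain
    results = output_parameters.get_dict()

    pressures = results['isotherm']['pressure']                 # bar
    loadings = results['isotherm']['loading_absolute_average']  # mol/kg

    # Deliverable capacity between the storage and depletion pressures
    wc = loadings[pressures.index(65.0)] - loadings[pressures.index(5.8)]
    print('CH4 working capacity: {:.2f} mol/kg'.format(wc))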

Isotherm work chain

The IsothermWorkChain() work chain computes a single-component isotherm in a framework, given just a few settings.

What it does, in order:

  1. Run a geometry calculation (Zeo++) to assess the accessible probe-occupiable pore volume and the needed blocking spheres.

  2. Stop if the structure is non-porous, i.e., not permeable to the molecule.

  3. Get the parameters of the force field using the FFBuilder.

  4. Get the number of unit cell replicas needed to have correct periodic boundary conditions at the given cutoff.

  5. Compute the adsorption at zero loading (e.g., the Henry coefficient, kH) from a Widom insertion calculation using Raspa.

  6. Stop if the kH is not above a certain user-defined threshold: this can be used for screening purposes, or to intentionally compute only the kH using this work chain.

  7. Given a min/max range, propose a list of pressures that sample the isotherm uniformly. Alternatively, the user can specify an explicit list of pressures and skip this automatic selection.

  8. Compute the isotherm using Grand Canonical Monte Carlo (GCMC) sampling in series, and restarting each system from the previous one for a short and efficient equilibration.

What it can not do:

  1. Compute isotherms at different temperatures (see IsothermMultiTemp work chain for this).

  2. Compute multi-component isotherms, as this would considerably complicate the input, output and logic, and it is not trivial to assign the mixture composition of the bulk gas at different pressures when studying a real case.

  3. Tune the Monte Carlo probabilities and other advanced settings in Raspa (currently not exposed).

  4. Sample the isotherm uniformly in case of “type II” isotherms, i.e., like for water, having significant cooperative insertion.

  5. Run the different pressures in parallel: this would be less efficient, because each system could not restart from the previous configuration, and not necessarily much faster, considering that equilibrating the highest-pressure calculation is the bottleneck anyway.

workchain aiida_lsmo.workchains.IsothermWorkChain

Work chain that computes the accessible pore volume (volpo) and blocking spheres: if the accessible volpo > 0, it also runs a Raspa Widom calculation for the Henry coefficient.

Inputs:

  • geometric, Dict, optional – [Only used by IsothermMultiTempWorkChain] Already computed geometric properties
  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • molecule, (Str, Dict), required – Adsorbate molecule: settings to be read from the YAML file. Advanced: input a Dict for non-standard settings.
  • parameters, Dict, required – Parameters for the Isotherm workchain: will be merged with IsothermParameters_defaults.
  • raspa_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
          • parser_name, str, optional, non_db – Set a string for the output parser. Can be None if no output plugin is available or needed
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db – Set the calculation to use mpi
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • structure, CifData, required – Adsorbent framework CIF.
  • zeopp, Namespace
    Namespace Ports
    • code, Code, required – The Code to use for this job.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
      • description, str, optional, non_db – Description to set on the process node.
      • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
      • label, str, optional, non_db – Label to set on the process node.
      • options, Namespace
        Namespace Ports
        • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
        • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
        • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
        • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
        • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
        • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
        • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
        • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
        • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
        • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
        • parser_name, str, optional, non_db
        • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
        • priority, str, optional, non_db – Set the priority of the job to be queued
        • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
        • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
        • resources, dict, optional, non_db
        • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
        • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
        • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
        • withmpi, bool, optional, non_db – Set the calculation to use mpi
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.

Outputs:

  • block, SinglefileData, optional – Blocked pockets output file.
  • output_parameters, Dict, required – Results of the single-temperature work chain: keys can vary depending on the is_porous and is_kh_enough booleans.

Outline:

setup(Initialize the parameters)
run_zeopp(Perform Zeo++ block and VOLPO calculations.)
if(should_run_widom)
    run_raspa_widom(Run a Widom calculation in Raspa.)
    if(should_run_gcmc)
        init_raspa_gcmc(Choose the pressures we want to sample, report some details, and update settings for GCMC)
        while(should_run_another_gcmc)
            run_raspa_gcmc(Run a GCMC calculation in Raspa @ T,P.)
return_output_parameters(Merge all the parameters into output_parameters, depending on is_porous and is_kh_enough.)

Inputs details

  • structure (CifData) is the framework with partial charges (provided as the _atom_site_charge column in the CIF file).

  • molecule can be provided as either a Str or a Dict. It contains information about the molecule force field and the approximate spherical-probe radius for the geometry calculation. If provided as a string (e.g., co2, n2), the work chain looks up the corresponding entry in isotherm_data/isotherm_molecules.yaml (a sketch of passing a custom Dict directly is given after this list). The input dictionary reads as, for example:

    co2:
      name: CO2          # Raspa's MoleculeName
      forcefield: TraPPE # Raspa's MoleculeDefinition
      molsatdens: 21.2   # Density of the liquid phase of the molecule in (mol/l). Typically I run a simulation at 300K/200bar
      proberad: 1.525    # radius used for computing VOLPO and Block (Angs). Typically FF's sigma/2
      singlebead: False  # if true: RotationProbability=0
      charged: True      # if true: ChargeMethod=Ewald
    
  • parameters (Dict) modifies the default parameters:

    parameters = {
      "ff_framework": "UFF",  # (str) Forcefield of the structure.
      "ff_separate_interactions": False,  # (bool) Use "separate_interactions" in the FF builder.
      "ff_mixing_rule": "Lorentz-Berthelot",  # (string) Choose 'Lorentz-Berthelot' or 'Jorgensen'.
      "ff_tail_corrections": True,  # (bool) Apply tail corrections.
      "ff_shifted": False,  # (bool) Shift or truncate the potential at cutoff.
      "ff_cutoff": 12.0,  # (float) CutOff truncation for the VdW interactions (Angstrom).
      "temperature": 300,  # (float) Temperature of the simulation.
      "temperature_list": None,  # (list) To be used by IsothermMultiTempWorkChain.
      "zeopp_volpo_samples": int(1e5),  # (int) Number of samples for VOLPO calculation (per UC volume).
      "zeopp_block_samples": int(100),  # (int) Number of samples for BLOCK calculation (per A^3).
      "raspa_minKh": 1e-10,  # (float) If Henry coefficient < raspa_minKh do not run the isotherm (mol/kg/Pa).
      "raspa_verbosity": 10,  # (int) Print stats every: number of cycles / raspa_verbosity.
      "raspa_widom_cycles": int(1e5),  # (int) Number of Widom cycles.
      "raspa_gcmc_init_cycles": int(1e3),  # (int) Number of GCMC initialization cycles.
      "raspa_gcmc_prod_cycles": int(1e4),  # (int) Number of GCMC production cycles.
      "pressure_list": None,  # (list) Pressure list for the isotherm (bar): if given it will skip to guess it.
      "pressure_precision": 0.1,  # (float) Precision in the sampling of the isotherm: 0.1 ok, 0.05 for high resolution.
      "pressure_maxstep": 5,  # (float) Max distance between pressure points (bar).
      "pressure_min": 0.001,  # (float) Lower pressure to sample (bar).
      "pressure_max": 10  # (float) Upper pressure to sample (bar).
    }
    

Note that if the pressure_list value is provided, the other pressure inputs are ignored and the automatic pressure selection of the work chain is skipped.

  • geometric is not meant to be used directly by the user, but by the IsothermMultiTemp work chain.
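
As anticipated, non-standard molecule settings can be passed as a Dict instead of a tag. A sketch mirroring the YAML entry shown above (all values are illustrative):

    from aiida.orm import Dict

    molecule = Dict(dict={
        'name': 'CO2',           # Raspa's MoleculeName
        'forcefield': 'TraPPE',  # Raspa's MoleculeDefinition
        'molsatdens': 21.2,      # liquid-phase density (mol/l)
        'proberad': 1.525,       # probe radius for VOLPO/BLOCK (Angstrom)
        'singlebead': False,
        'charged': True,
    })

    builder.molecule = molecule  # instead of, e.g., Str('co2')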

Outputs details

  • output_parameters (Dict), whose content depends on whether is_porous is True (if not, only the geometric outputs are reported in the dictionary) and on whether is_kh_enough (if False, only the output of the Widom calculation is reported; otherwise the isotherm data is included as well). This is an example of a full isotherm with is_porous=True and is_kh_enough=True, for 6 pressure points at 298 K:

    {
        "Density": 0.385817,
        "Density_unit": "g/cm^3",
        "Estimated_saturation_loading": 51.586704,
        "Estimated_saturation_loading_unit": "mol/kg",
        "Input_block": [
            1.865,
            100
        ],
        "Input_ha": "DEF",
        "Input_structure_filename": "19366N2.cif",
        "Input_volpo": [
            1.865,
            1.865,
            100000
        ],
        "Number_of_blocking_spheres": 0,
        "POAV_A^3": 8626.94,
        "POAV_A^3_unit": "A^3",
        "POAV_Volume_fraction": 0.73173,
        "POAV_Volume_fraction_unit": null,
        "POAV_cm^3/g": 1.89657,
        "POAV_cm^3/g_unit": "cm^3/g",
        "PONAV_A^3": 0.0,
        "PONAV_A^3_unit": "A^3",
        "PONAV_Volume_fraction": 0.0,
        "PONAV_Volume_fraction_unit": null,
        "PONAV_cm^3/g": 0.0,
        "PONAV_cm^3/g_unit": "cm^3/g",
        "Unitcell_volume": 11789.8,
        "Unitcell_volume_unit": "A^3",
        "adsorption_energy_widom_average": -9.7886451805,
        "adsorption_energy_widom_dev": 0.0204010566,
        "adsorption_energy_widom_unit": "kJ/mol",
        "conversion_factor_molec_uc_to_cm3stp_cm3": 3.1569089445,
        "conversion_factor_molec_uc_to_gr_gr": 5.8556741651,
        "conversion_factor_molec_uc_to_mol_kg": 0.3650669679,
        "henry_coefficient_average": 6.72787e-06,
        "henry_coefficient_dev": 3.94078e-08,
        "henry_coefficient_unit": "mol/kg/Pa",
        "is_kh_enough": true,
        "is_porous": true,
        "isotherm": {
            "enthalpy_of_adsorption_average": [
                -12.309803364014,
                ...
                -9.6064899852835
            ],
            "enthalpy_of_adsorption_dev": [
                0.34443269062882,
                ...
                0.2598580313121
            ],
            "enthalpy_of_adsorption_unit": "kJ/mol",
            "loading_absolute_average": [
                0.65880897694654,
                ...
                17.302504097082
            ],
            "loading_absolute_dev": [
                0.041847687204507,
                ...
                0.14638828764266
            ],
            "loading_absolute_unit": "mol/kg",
            "pressure": [
                1.0,
                ...
                65
            ],
            "pressure_unit": "bar"
        },
        "temperature": 298,
        "temperature_unit": "K"
    }
    
  • block (SinglefileData) is output if blocking spheres are found and used for the isotherm. Therefore, it is ready to be used for a new, consistent Raspa calculation.
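
Since the isotherm is stored as plain lists inside output_parameters, post-processing it is straightforward. A minimal sketch extracting the data for a plot (matplotlib is used only for illustration and the PK is hypothetical):

    import matplotlib.pyplot as plt
    from aiida.orm import load_node

    results = load_node(266273).get_dict()  # output_parameters Dict (hypothetical PK)

    if results['is_porous'] and results['is_kh_enough']:
        isotherm = results['isotherm']
        plt.errorbar(isotherm['pressure'], isotherm['loading_absolute_average'],
                     yerr=isotherm['loading_absolute_dev'], marker='o')
        plt.xlabel('Pressure ({})'.format(isotherm['pressure_unit']))
        plt.ylabel('Loading ({})'.format(isotherm['loading_absolute_unit']))
        plt.show()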

IsothermMultiTemp work chain

The IsothermMultiTempWorkChain() work chain runs the Isotherm work chain in parallel at different temperatures. Since the initial geometry calculation that obtains the pore volume and blocking spheres does not depend on the temperature, it is run only once. Inputs and outputs are very similar to those of the Isotherm work chain.

What it can do:

  1. Compute the kH at every temperature and guess, for each temperature, the pressure points needed for a uniform sampling of the isotherm.

What it can not do:

  1. Select specific pressure points (as pressure_list) that differ from temperature to temperature.

  2. Run an isobar curve (same pressure, different temperatures), restarting each GCMC calculation from the previous system.

workchain aiida_lsmo.workchains.IsothermMultiTempWorkChain

Run IsothermWorkChain for multiple temperatures: first compute geometric properties and then submit Widom+GCMC at different temperatures in parallel

Inputs:

  • geometric, Dict, optional – [Only used by IsothermMultiTempWorkChain] Already computed geometric properties
  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • molecule, (Str, Dict), required – Adsorbate molecule: settings to be read from the YAML file. Advanced: input a Dict for non-standard settings.
  • parameters, Dict, required – Parameters for the Isotherm workchain: will be merged with IsothermParameters_defaults.
  • raspa_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
          • parser_name, str, optional, non_db – Set a string for the output parser. Can be None if no output plugin is available or needed
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db – Set the calculation to use mpi
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • structure, CifData, required – Adsorbent framework CIF.
  • zeopp, Namespace
    Namespace Ports
    • code, Code, required – The Code to use for this job.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
      • description, str, optional, non_db – Description to set on the process node.
      • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
      • label, str, optional, non_db – Label to set on the process node.
      • options, Namespace
        Namespace Ports
        • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
        • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
        • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
        • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
        • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
        • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
        • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
        • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
        • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
        • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
        • parser_name, str, optional, non_db
        • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
        • priority, str, optional, non_db – Set the priority of the job to be queued
        • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
        • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
        • resources, dict, optional, non_db
        • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
        • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
        • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
        • withmpi, bool, optional, non_db – Set the calculation to use mpi
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.

Outputs:

  • block, SinglefileData, optional – Blocked pockets output file.
  • output_parameters, Dict, required – Results of isotherms run at different temperatures.

Outline:

run_geometric(Perform Zeo++ block and VOLPO calculation with IsothermWC.)
if(should_continue)
    run_isotherms(Compute isotherms at different temperatures.)
collect_isotherms(Collect all the results in one Dict)

Inputs details

  • parameters (Dict), compared to the input of the Isotherm work chain, contains the key temperature_list and ignores the key temperature:

    "temperature_list": [278, 298.15, 318.0],
    

Outputs details

  • output_parameters (Dict) contains the temperature and isotherm outputs as lists. In this example, 3 pressure points are computed at 77 K, 198 K and 298 K:

    {
        "Density": 0.731022,
        "Density_unit": "g/cm^3",
        "Estimated_saturation_loading": 22.1095656,
        "Estimated_saturation_loading_unit": "mol/kg",
        "Input_block": [
            1.48,
            100
        ],
        "Input_ha": "DEF",
        "Input_structure_filename": "tmpQD_OdI.cif",
        "Input_volpo": [
            1.48,
            1.48,
            100000
        ],
        "Number_of_blocking_spheres": 0,
        "POAV_A^3": 1579.69,
        "POAV_A^3_unit": "A^3",
        "POAV_Volume_fraction": 0.45657,
        "POAV_Volume_fraction_unit": null,
        "POAV_cm^3/g": 0.624564,
        "POAV_cm^3/g_unit": "cm^3/g",
        "PONAV_A^3": 0.0,
        "PONAV_A^3_unit": "A^3",
        "PONAV_Volume_fraction": 0.0,
        "PONAV_Volume_fraction_unit": null,
        "PONAV_cm^3/g": 0.0,
        "PONAV_cm^3/g_unit": "cm^3/g",
        "Unitcell_volume": 3459.91,
        "Unitcell_volume_unit": "A^3",
        "adsorption_energy_widom_average": [
            -6.501026119,
            -3.7417828535,
            -2.9538187687
        ],
        "adsorption_energy_widom_dev": [
            0.0131402719,
            0.0109470973,
            0.009493264
        ],
        "adsorption_energy_widom_unit": "kJ/mol",
        "conversion_factor_molec_uc_to_cm3stp_cm3": 10.757306634,
        "conversion_factor_molec_uc_to_gr_gr": 1.3130795208,
        "conversion_factor_molec_uc_to_mol_kg": 0.6565397604,
        "henry_coefficient_average": [
            0.000590302,
            1.36478e-06,
            4.59353e-07
        ],
        "henry_coefficient_dev": [
            6.20272e-06,
            2.92729e-09,
            1.3813e-09
        ],
        "henry_coefficient_unit": "mol/kg/Pa",
        "is_kh_enough": [
            true,
            true,
            true
        ],
        "is_porous": true,
        "isotherm": [
            {
                "enthalpy_of_adsorption_average": [
                    -4.8763191239929,
                    -4.071414615084,
                    -3.8884980003825
                ],
                "enthalpy_of_adsorption_dev": [
                    0.27048724983995,
                    0.17838206413742,
                    0.30520201541493
                ],
                "enthalpy_of_adsorption_unit": "kJ/mol",
                "loading_absolute_average": [
                    8.8763231830174,
                    13.809017193987,
                    24.592736102413
                ],
                "loading_absolute_dev": [
                    0.10377880404968,
                    0.057485479697981,
                    0.1444399097573
                ],
                "loading_absolute_unit": "mol/kg",
                "pressure": [
                    1.0,
                    5.0,
                    100
                ],
                "pressure_unit": "bar"
            },
            {
                "enthalpy_of_adsorption_average": [
                    -5.3762452088166,
                    -5.304498349588,
                    -5.1469837785704
                ],
                "enthalpy_of_adsorption_dev": [
                    0.16413676386221,
                    0.23624406142692,
                    0.16877234291986
                ],
                "enthalpy_of_adsorption_unit": "kJ/mol",
                "loading_absolute_average": [
                    0.13688033329639,
                    0.64822632568393,
                    8.2218063857542
                ],
                "loading_absolute_dev": [
                    0.0022470007645714,
                    0.015908634630445,
                    0.063314699465606
                ],
                "loading_absolute_unit": "mol/kg",
                "pressure": [
                    1.0,
                    5.0,
                    100
                ],
                "pressure_unit": "bar"
            },
            {
                "enthalpy_of_adsorption_average": [
                    -5.3995609987279,
                    -5.5404431584811,
                    -5.410077906097
                ],
                "enthalpy_of_adsorption_dev": [
                    0.095159861315507,
                    0.081469905963932,
                    0.1393537452296
                ],
                "enthalpy_of_adsorption_unit": "kJ/mol",
                "loading_absolute_average": [
                    0.04589212925196,
                    0.22723251444794,
                    3.8118903657499
                ],
                "loading_absolute_dev": [
                    0.0018452227888317,
                    0.0031557689853122,
                    0.047824194130595
                ],
                "loading_absolute_unit": "mol/kg",
                "pressure": [
                    1.0,
                    5.0,
                    100
                ],
                "pressure_unit": "bar"
            }
        ],
        "temperature": [
            77,
            198,
            298
        ],
        "temperature_unit": "K"
    }
    

IsothermCalcPE work chain

The IsothermCalcPEWorkChain() work chain takes as input a structure with partial charges, computes the isotherms for CO2 and N2 at ambient temperature, and models the process of carbon capture and compression for geological sequestration. The final outcome reports the performance of the adsorbent for this application, including the CO2 parasitic energy, i.e., the energy required to separate and compress one kilogram of CO2 using that material. The default input mixture is coal post-combustion flue gas, but natural gas post-combustion and air mixtures are also available.

workchain aiida_lsmo.workchains.IsothermCalcPEWorkChain

Compute the CO2 parasitic energy (PE) after running IsothermWorkChain for CO2 and N2 at 300K.

Inputs:

  • geometric, Dict, optional – [Only used by IsothermMultiTempWorkChain] Already computed geometric properties
  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • parameters, Dict, optional – Parameters for Isotherm work chain
  • pe_parameters, Dict, optional – Parameters for PE process modelling
  • raspa_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
          • parser_name, str, optional, non_db – Set a string for the output parser. Can be None if no output plugin is available or needed
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db – Set the calculation to use mpi
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • structure, CifData, required – Adsorbent framework CIF.
  • zeopp, Namespace
    Namespace Ports
    • code, Code, required – The Code to use for this job.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
      • description, str, optional, non_db – Description to set on the process node.
      • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
      • label, str, optional, non_db – Label to set on the process node.
      • options, Namespace
        Namespace Ports
        • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
        • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
        • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
        • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
        • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
        • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
        • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
        • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
        • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
        • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
        • parser_name, str, optional, non_db
        • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
        • priority, str, optional, non_db – Set the priority of the job to be queued
        • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
        • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
        • resources, dict, optional, non_db
        • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
        • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
        • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
        • withmpi, bool, optional, non_db – Set the calculation to use mpi
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.

Outputs:

  • co2, Namespace
    Namespace Ports
    • block, SinglefileData, optional – Blocked pockets output file.
    • output_parameters, Dict, required – Results of the single-temperature work chain: keys can vary depending on the is_porous and is_kh_enough booleans.
  • n2, Namespace
    Namespace Ports
    • block, SinglefileData, optional – Blocked pockets output file.
    • output_parameters, Dict, required – Results of the single-temperature work chain: keys can vary depending on the is_porous and is_kh_enough booleans.
  • output_parameters, Dict, required – Output parameters of the calc_PE calculation.

Outline:

run_isotherms(Run Isotherm work chain for CO2 and N2.)
run_calcpe(Expose isotherm outputs, prepare calc_pe, run it and return the output.)

Multistage work chain

The Cp2kMultistageWorkChain() work chain is meant to automate DFT optimizations in CP2K and guess good parameters for the simulation, but it is written in such a versatile fashion that it can be used for many other purposes.

What it can do:

  1. Given a protocol YAML with different settings, the work chain iterates until the SCF calculation converges. The concept is to use general options for settings_0 and increasingly robust ones for the following settings.

  2. The protocol YAML also contains a number of stages, i.e., different MOTION settings, that are executed one after the other, each restarting from the previous calculation. During the first stage, stage_0, different settings are tested until the SCF converges at the last step of stage_0. If this does not happen, the work chain stops. Otherwise, it continues with stage_1 and all the other stages included in the protocol.

  3. These stages can be used for running a robust cell optimization, i.e., first combining some MD steps to escape metastable geometries and later the final optimization, or for ab initio MD, first equilibrating the system with a shorter time constant for the thermostat and then collecting statistics in the second stage.

  4. Some default protocols are provided in workchains/multistage_protocols and can be imported with simple tags such as test, default and robust_conv. Otherwise, users can take inspiration from these to write their own protocol and pass it to the work chain.

  5. Compute the band gap.

  6. You can restart from a previous calculation, e.g., from an already computed wavefunction.

What it cannot do:

  1. Run CP2K calculations with k-points.

  2. Run advanced CP2K calculations, i.e., run types other than ENERGY, GEO_OPT, CELL_OPT and MD.
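
A minimal sketch of a submission script (the entry point name lsmo.cp2k_multistage and the code label cp2k@localhost are assumptions based on the examples; adapt them to your setup):

    from ase.build import bulk

    from aiida.engine import submit
    from aiida.orm import Str, load_code
    from aiida.plugins import DataFactory, WorkflowFactory

    Cp2kMultistageWorkChain = WorkflowFactory('lsmo.cp2k_multistage')  # assumed entry point
    StructureData = DataFactory('structure')

    # Build the inputs: a 4-atom aluminum cell, a quick test protocol and a CP2K code.
    builder = Cp2kMultistageWorkChain.get_builder()
    builder.structure = StructureData(ase=bulk('Al', 'fcc', a=4.05, cubic=True))
    builder.protocol_tag = Str('test')
    builder.cp2k_base.cp2k.code = load_code('cp2k@localhost')  # assumed code label
    builder.cp2k_base.cp2k.metadata.options.resources = {'num_machines': 1}
    builder.cp2k_base.cp2k.metadata.options.max_wallclock_seconds = 3600

    submit(builder)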

workchain aiida_lsmo.workchains.Cp2kMultistageWorkChain

Submits Cp2kBase work chains for ENERGY, GEO_OPT, CELL_OPT and MD jobs iteratively. The protocol_yaml file contains a series of settings_x and stage_x: the work chain starts by running the settings_0/stage_0 calculation and, in case of failure, changes the settings until the SCF of stage_0 converges. Then it uses the same settings to run the next stages (i.e., stage_1, etc.).

Inputs:

  • cp2k_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • cp2k, Namespace
      Namespace Ports
      • basissets, Namespace – A dictionary of basissets to be used in the calculations: key is the atomic symbol, value is either a single basisset or a list of basissets. If multiple basissets for a single symbol are passed, it is mandatory to specify a KIND section with a BASIS_SET keyword matching the names (or aliases) of the basissets.
      • code, Code, required – The Code to use for this job.
      • file, Namespace – additional input files
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db
          • parser_name, str, optional, non_db – Parser of the calculation: the default is cp2k_advanced_parser to get the necessary info
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • min_cell_size, Float, optional – To avoid using k-points, extend the cell so that min(perp_width)>min_cell_size
  • parent_calc_folder, RemoteData, optional – Provide an initial parent folder that contains the wavefunction for restart
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol to be read from {tag}.yaml unless protocol_yaml input is specified
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file with the multistage settings (and ignore protocol_tag)
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, StructureData, optional – Input structure

Outputs:

  • last_input_parameters, Dict, optional – CP2K input parameters used (and possibly working) in the last stage
  • output_parameters, Dict, optional – Output CP2K parameters of all the stages, merged together
  • output_structure, StructureData, optional – Processed structure (missing if only ENERGY calculation is performed)
  • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.

Outline:

setup_multistage(Setup initial parameters.)
while(should_run_stage0)
    run_stage(Check for restart, prepare input, submit and direct output to context.)
    inspect_and_update_settings_stage0(Inspect the stage0/settings_{idx} calculation and check whether the settings need to be updated and the calculation resubmitted.)
inspect_and_update_stage(Update geometry, parent folder and the new &MOTION settings.)
while(should_run_stage)
    run_stage(Check for restart, prepare input, submit and direct output to context.)
    inspect_and_update_stage(Update geometry, parent folder and the new &MOTION settings.)
results(Gather final outputs of the workchain.)

Inputs details

  • structure (StructureData, NOTE this is not a CifData) is the system to investigate. It can also be a molecule in a box, not necessarily a 2D/3D framework.

  • protocol_tag (Str) calls a default protocol. Currently available:

default

Main choice: PBE-D3(BJ) with a 600 Ry cutoff, DZVP basis sets and GTH pseudopotentials. The first settings use OT; if that does not work, the work chain switches to diagonalization and smearing. As for the stages, it runs a cell optimization, a short NPT MD and a final cell optimization.

test

Quick protocol for testing purposes.

robust_conv

Similar to default but using more robust and more expensive settings for the SCF convergence.

singlepoint

Same settings as default, but running only one stage for a single-point calculation. Used to exploit the automation of this work chain for a simple energy calculation.

  • protocol_yaml (SinglefileData) is used to specify a custom protocol through a YAML file. See the default YAML file as an example. Note that the dictionary needs to contain the following keys:

protocol_description

A user-friendly description of the protocol.

initial_magnetization

Dictionary of KIND/MAGNETIZATION for each element.

basis_set

Dictionary of KIND/BASIS_SET for each element.

pseudopotential

Dictionary of KIND/POTENTIAL for each element.

bandgap_thr_ev

Any stage_0 calculation using OT that yields a band gap below this threshold is considered a failure.

  • settings_0

  • settings_1

Settings updated in stage_0 until the SCF converges.

  • stage_0

  • stage_1

CP2K settings that are updated at every stage.

Other keys may be added in the future to introduce new functionalities to the Multistage work chain.
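
If only a few entries need to change, the protocol_modify input (Dict) can be used on top of protocol_tag instead of writing a whole custom YAML file: the dictionary is merged over the chosen protocol. A minimal sketch, continuing the builder example above (the keys shown are illustrative and must match the YAML structure):

    from aiida.orm import Dict

    # Merged over the chosen protocol just after it is read; keys are illustrative.
    builder.protocol_modify = Dict(dict={
        'initial_magnetization': {'Fe': 4},  # start Fe atoms with magnetization 4
        'bandgap_thr_ev': 0.2,               # stricter OT band-gap threshold
    })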

  • starting_settings_idx (Int) is used to start from a custom index of the settings. If, for example, you know that the material is conductive and needs smearing, you can use Int(1) to jump directly to settings_1, which applies electron smearing: this is the case for the default protocol.

  • min_cell_size (Float) is used to extend the unit cell, so that the minimum perpendicular width of the cell is bigger than a specified value. This is needed when a cell length is too short and the plane-wave auxiliary basis set is not accurate enough when sampling at the Gamma point only. It may also be needed for hybrid range-separated potentials, which require a sufficiently large non-overlapping cutoff.

Note

Need to explain it further in Technicalities.

  • parent_calc_folder (RemoteData) is used to restart from a previously computed wave function.

  • cp2k_base.cp2k.parameters (Dict) can be used to specify CP2K parameters that always overwrite the generated input just before every calculation is submitted.
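
For example, to tighten the SCF convergence of every submitted calculation, one could pass (a sketch, continuing the builder example above; the nesting follows the usual aiida-cp2k convention of upper-case CP2K section names):

    from aiida.orm import Dict

    # Merged into the generated CP2K input of every stage, just before submission.
    builder.cp2k_base.cp2k.parameters = Dict(dict={
        'FORCE_EVAL': {
            'DFT': {
                'SCF': {'EPS_SCF': 1.0E-7},  # tighter SCF convergence
            },
        },
    })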

Outputs details

  • output_structure (StructureData) is the final structure at the end of the last stage. It is not output in the case of a single-point calculation, since that does not update the geometry of the system.

  • output_parameters (Dict); here is an example for aluminum, where the settings_0 calculation is discarded because of a negative band gap, and the work chain therefore switches to settings_1, which makes the SCF converge and is used for two stages:

    {
        "cell_resized": "1x1x1",
        "dft_type": "RKS",
        "final_bandgap_spin1_au": 6.1299999999931e-06,
        "final_bandgap_spin2_au": 6.1299999999931e-06,
        "last_tag": "stage_1_settings_1_valid",
        "natoms": 4,
        "nsettings_discarded": 1,
        "nstages_valid": 2,
        "stage_info": {
            "bandgap_spin1_au": [
                0.0,
                6.1299999999931e-06
            ],
            "bandgap_spin2_au": [
                0.0,
                6.1299999999931e-06
            ],
            "final_edens_rspace": [
                -3e-09,
                -3e-09
            ],
            "nsteps": [
                1,
                2
            ],
            "opt_converged": [
                true,
                false
            ]
        },
        "step_info": {
            "cell_a_angs": [
                4.05,
                4.05,
                4.05,
                4.05
            ],
            "cell_alp_deg": [
                90.0,
                90.0,
                90.0,
                90.0
            ],
            "cell_b_angs": [
                4.05,
                4.05,
                4.05,
                4.05
            ],
            "cell_bet_deg": [
                90.0,
                90.0,
                90.0,
                90.0
            ],
            "cell_c_angs": [
                4.05,
                4.05,
                4.05,
                4.05
            ],
            "cell_gam_deg": [
                90.0,
                90.0,
                90.0,
                90.0
            ],
            "cell_vol_angs3": [
                66.409,
                66.409,
                66.409,
                66.409
            ],
            "dispersion_energy_au": [
                -0.04894693184602,
                -0.04894693184602,
                -0.04894696543385,
                -0.04894705992872
            ],
            "energy_au": [
                -8.0811276714482,
                -8.0811276714483,
                -8.0811249649336,
                -8.0811173120933
            ],
            "max_grad_au": [
                null,
                0.0,
                null,
                null
            ],
            "max_step_au": [
                null,
                0.0,
                null,
                null
            ],
            "pressure_bar": [
                null,
                null,
                58260.2982324,
                58201.2710544
            ],
            "rms_grad_au": [
                null,
                0.0,
                null,
                null
            ],
            "rms_step_au": [
                null,
                0.0,
                null,
                null
            ],
            "scf_converged": [
                true,
                true,
                true,
                true
            ],
            "step": [
                0,
                1,
                1,
                2
            ]
        }
    }
    
  • last_input_parameters (Dict) reports the inputs that were used for the last CP2K calculation. These are likely the settings that made the SCF converge, so the user can inspect them and reuse them for other direct CP2K calculations in AiiDA.
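
One way to retrieve them once the work chain has finished (a sketch; replace the pk with that of your own run):

    from aiida.orm import load_node

    wc = load_node(266248)  # pk of a finished Cp2kMultistageWorkChain (from the report below)
    print(wc.outputs.output_parameters.get_dict()['last_tag'])
    last_inputs = wc.outputs.last_input_parameters.get_dict()  # reusable for direct CP2K runs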

Usage

See the examples provided with the plugin. The report gives very useful insight into what happened during the run. Here is the example for aluminum:

2019-11-22 16:54:52 [90962 | REPORT]: [266248|Cp2kMultistageWorkChain|setup_multistage]: Unit cell was NOT resized
2019-11-22 16:54:52 [90963 | REPORT]: [266248|Cp2kMultistageWorkChain|run_stage]: submitted Cp2kBaseWorkChain for stage_0/settings_0
2019-11-22 16:54:52 [90964 | REPORT]:   [266252|Cp2kBaseWorkChain|run_calculation]: launching Cp2kCalculation<266253> iteration #1
2019-11-22 16:55:13 [90965 | REPORT]:   [266252|Cp2kBaseWorkChain|inspect_calculation]: Cp2kCalculation<266253> completed successfully
2019-11-22 16:55:13 [90966 | REPORT]:   [266252|Cp2kBaseWorkChain|results]: work chain completed after 1 iterations
2019-11-22 16:55:14 [90967 | REPORT]:   [266252|Cp2kBaseWorkChain|on_terminated]: remote folders will not be cleaned
2019-11-22 16:55:14 [90968 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_settings_stage0]: Bandgaps spin1/spin2: -0.058 and -0.058 ev
2019-11-22 16:55:14 [90969 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_settings_stage0]: BAD SETTINGS: band gap is < 0.100 eV
2019-11-22 16:55:14 [90970 | REPORT]: [266248|Cp2kMultistageWorkChain|run_stage]: submitted Cp2kBaseWorkChain for stage_0/settings_1
2019-11-22 16:55:15 [90971 | REPORT]:   [266259|Cp2kBaseWorkChain|run_calculation]: launching Cp2kCalculation<266260> iteration #1
2019-11-22 16:55:34 [90972 | REPORT]:   [266259|Cp2kBaseWorkChain|inspect_calculation]: Cp2kCalculation<266260> completed successfully
2019-11-22 16:55:34 [90973 | REPORT]:   [266259|Cp2kBaseWorkChain|results]: work chain completed after 1 iterations
2019-11-22 16:55:34 [90974 | REPORT]:   [266259|Cp2kBaseWorkChain|on_terminated]: remote folders will not be cleaned
2019-11-22 16:55:35 [90975 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_settings_stage0]: Bandgaps spin1/spin2: 0.000 and 0.000 ev
2019-11-22 16:55:35 [90976 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_stage]: Structure updated for next stage
2019-11-22 16:55:35 [90977 | REPORT]: [266248|Cp2kMultistageWorkChain|run_stage]: submitted Cp2kBaseWorkChain for stage_1/settings_1
2019-11-22 16:55:35 [90978 | REPORT]:   [266266|Cp2kBaseWorkChain|run_calculation]: launching Cp2kCalculation<266267> iteration #1
2019-11-22 16:55:53 [90979 | REPORT]:   [266266|Cp2kBaseWorkChain|inspect_calculation]: Cp2kCalculation<266267> completed successfully
2019-11-22 16:55:53 [90980 | REPORT]:   [266266|Cp2kBaseWorkChain|results]: work chain completed after 1 iterations
2019-11-22 16:55:54 [90981 | REPORT]:   [266266|Cp2kBaseWorkChain|on_terminated]: remote folders will not be cleaned
2019-11-22 16:55:54 [90982 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_stage]: Structure updated for next stage
2019-11-22 16:55:54 [90983 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_stage]: All stages computed, finishing...
2019-11-22 16:55:55 [90984 | REPORT]: [266248|Cp2kMultistageWorkChain|results]: Outputs: Dict<266273> and StructureData<266271>

Cp2kMultistageDdec work chain

The Cp2kMultistageDdecWorkChain() work chain combines the CP2K Multistage work chain and the DDEC calculation, with the aim of optimizing the geometry of a structure and computing its partial charges using the DDEC protocol.
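
Besides the Multistage inputs, the DDEC step requires its own code and a parameters Dict. A sketch of plausible values, assuming a builder for this work chain as in the sketches above (the keys follow the job_control conventions of the aiida-ddec plugin and the path is a placeholder; check that plugin's documentation for the exact format):

    from aiida.orm import Dict, load_code

    builder.ddec.code = load_code('ddec@localhost')  # assumed code label
    builder.ddec.parameters = Dict(dict={
        # Example values only; verify the exact keys against aiida-ddec.
        'net charge': 0.0,
        'charge type': 'DDEC6',
        'atomic densities directory complete path': '/path/to/atomic_densities/',
    })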

workchain aiida_lsmo.workchains.Cp2kMultistageDdecWorkChain

A workchain that combines: Cp2kMultistageWorkChain + Cp2kDdecWorkChain

Inputs:

  • cp2k_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • cp2k, Namespace
      Namespace Ports
      • basissets, Namespace – A dictionary of basissets to be used in the calculations: key is the atomic symbol, value is either a single basisset or a list of basissets. If multiple basissets for a single symbol are passed, it is mandatory to specify a KIND section with a BASIS_SET keyword matching the names (or aliases) of the basissets.
      • code, Code, required – The Code to use for this job.
      • file, Namespace – additional input files
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db
          • parser_name, str, optional, non_db – Parser of the calculation: the default is cp2k_advanced_parser to get the necessary info
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • ddec, Namespace
    Namespace Ports
    • charge_density_folder, RemoteData, optional – Use a remote folder (for restarts and similar)
    • code, Code, required – The Code to use for this job.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
      • description, str, optional, non_db – Description to set on the process node.
      • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
      • label, str, optional, non_db – Label to set on the process node.
      • options, Namespace
        Namespace Ports
        • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
        • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
        • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
        • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
        • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
        • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
        • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
        • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
        • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
        • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
        • parser_name, (str), optional, non_db
        • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
        • priority, str, optional, non_db – Set the priority of the job to be queued
        • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
        • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
        • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
        • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
        • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
        • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
        • withmpi, bool, optional, non_db
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • parameters, Dict, required – Input parameters such as net charge, protocol, atomic densities path, …
  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • min_cell_size, Float, optional – To avoid using k-points, extend the cell so that min(perp_width)>min_cell_size
  • parent_calc_folder, RemoteData, optional – Provide an initial parent folder that contains the wavefunction for restart
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol to be read from {tag}.yaml unless protocol_yaml input is specified
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file with the multistage settings (and ignore protocol_tag)
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, StructureData, optional – Input structure

Outputs:

  • last_input_parameters, Dict, optional – CP2K input parameters used (and possibly working) in the last stage
  • output_parameters, Dict, optional – Output CP2K parameters of all the stages, merged together
  • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.
  • structure_ddec, CifData, required – structure with DDEC charges

Outline:

run_cp2kmultistage(Run CP2K-Multistage)
run_cp2kddec(Pass the Cp2kMultistageWorkChain outputs as inputs for Cp2kDdecWorkChain: cp2k_base (metadata), cp2k_params, structure and WFN.)
return_results(Return exposed outputs and print the pk of the CifData w/DDEC)

ZeoppMultistageDdec work chain

The ZeoppMultistageDdecWorkChain() work chain is similar to Cp2kMultistageDdec, but it runs a geometry characterization of the structure using Zeo++ (NetworkCalculation) before and after the optimization, with the aim of assessing the structural changes due to the cell/geometry optimization.
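
The Zeo++ options are passed via the zeopp.parameters input as a NetworkParameters dictionary of command-line flags. A sketch of a typical geometry characterization, assuming a builder for this work chain as in the sketches above (probe radii and sampling values are illustrative):

    from aiida.plugins import DataFactory

    NetworkParameters = DataFactory('zeopp.parameters')

    builder.zeopp.parameters = NetworkParameters(dict={
        'ha': 'DEF',                    # high-accuracy default settings
        'res': True,                    # largest included/free sphere diameters
        'sa': [1.86, 1.86, 100000],     # surface area with an N2-sized probe
        'volpo': [1.86, 1.86, 100000],  # probe-occupiable pore volume
    })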

workchain aiida_lsmo.workchains.ZeoppMultistageDdecWorkChain

A workchain that combines: Zeopp + Cp2kMultistageWorkChain + Cp2kDdecWorkChain + Zeopp

Inputs:

  • cp2k_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • cp2k, Namespace
      Namespace Ports
      • basissets, Namespace – A dictionary of basissets to be used in the calculations: key is the atomic symbol, value is either a single basisset or a list of basissets. If multiple basissets for a single symbol are passed, it is mandatory to specify a KIND section with a BASIS_SET keyword matching the names (or aliases) of the basissets.
      • code, Code, required – The Code to use for this job.
      • file, Namespace – additional input files
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db
          • parser_name, str, optional, non_db – Parser of the calculation: the default is cp2k_advanced_parser to get the necessary info
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • ddec, Namespace
    Namespace Ports
    • charge_density_folder, RemoteData, optional – Use a remote folder (for restarts and similar)
    • code, Code, required – The Code to use for this job.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
      • description, str, optional, non_db – Description to set on the process node.
      • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
      • label, str, optional, non_db – Label to set on the process node.
      • options, Namespace
        Namespace Ports
        • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
        • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
        • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
        • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
        • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
        • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
        • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
        • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
        • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
        • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
        • parser_name, (str), optional, non_db
        • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
        • priority, str, optional, non_db – Set the priority of the job to be queued
        • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
        • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
        • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
        • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
        • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
        • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
        • withmpi, bool, optional, non_db
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • parameters, Dict, required – Input parameters such as net charge, protocol, atomic densities path, …
  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • min_cell_size, Float, optional – To avoid using k-points, extend the cell so that min(perp_width)>min_cell_size
  • parent_calc_folder, RemoteData, optional – Provide an initial parent folder that contains the wavefunction for restart
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol to be read from {tag}.yaml unless protocol_yaml input is specified
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file with the multistage settings (and ignore protocol_tag)
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, CifData, required – input structure
  • zeopp, Namespace
    Namespace Ports
    • atomic_radii, SinglefileData, optional – atomic radii file
    • code, Code, required – The Code to use for this job.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
      • description, str, optional, non_db – Description to set on the process node.
      • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
      • label, str, optional, non_db – Label to set on the process node.
      • options, Namespace
        Namespace Ports
        • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
        • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
        • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
        • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
        • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
        • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
        • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
        • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
        • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
        • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
        • parser_name, str, optional, non_db
        • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
        • priority, str, optional, non_db – Set the priority of the job to be queued
        • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
        • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
        • resources, dict, optional, non_db
        • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
        • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
        • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
        • withmpi, bool, optional, non_db – Set the calculation to use mpi
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • parameters, NetworkParameters, optional – command line parameters for zeo++

Outputs:

  • last_input_parameters, Dict, optional – CP2K input parameters used (and possibly working) in the last stage
  • output_parameters, Dict, optional – Output CP2K parameters of all the stages, merged together
  • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.
  • structure_ddec, CifData, required – structure with DDEC charges
  • zeopp_after_opt, Namespace
    Namespace Ports
    • output_parameters, Dict, required – key-value pairs parsed from zeo++ output file(s).
  • zeopp_before_opt, Namespace
    Namespace Ports
    • output_parameters, Dict, required – key-value pairs parsed from zeo++ output file(s).

Outline:

run_zeopp_before(Run Zeo++ for the original structure)
run_multistageddec(Run MultistageDdec work chain)
run_zeopp_after(Run Zeo++ for the optimized structure)
return_results(Return exposed outputs)

SimAnnealing work chain

The SimAnnealingWorkChain() work chain allows one to find the minimum-energy configuration of a number of gas molecules in the pore volume of a framework. It runs several NVT simulations in RASPA at decreasing temperatures to drive the system towards its global minimum (simulated annealing), and finally performs an energy minimization for the final fine-tuning of the optimal position.
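
A minimal submission sketch (the entry point lsmo.sim_annealing, the molecule tag co2, the code label and the CIF path are assumptions based on the examples):

    from aiida.engine import submit
    from aiida.orm import Dict, Str, load_code
    from aiida.plugins import DataFactory, WorkflowFactory

    SimAnnealingWorkChain = WorkflowFactory('lsmo.sim_annealing')  # assumed entry point
    CifData = DataFactory('cif')

    builder = SimAnnealingWorkChain.get_builder()
    builder.structure = CifData(file='/path/to/HKUST-1.cif')  # placeholder path
    builder.molecule = Str('co2')  # settings read from the molecule YAML
    builder.parameters = Dict(dict={'mc_steps': int(1e3)})
    builder.raspa_base.raspa.code = load_code('raspa@localhost')  # assumed code label
    builder.raspa_base.raspa.metadata.options.resources = {'num_machines': 1}

    submit(builder)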

workchain aiida_lsmo.workchains.SimAnnealingWorkChain

A work chain to compute the minimum-energy geometry of a molecule inside a framework, using simulated annealing, i.e., decreasing the temperature of a Monte Carlo simulation and finally running an energy minimization step.

Inputs:

  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • molecule, (Str, Dict), required – Adsorbate molecule: settings to be read from the YAML. Advanced: input a Dict for non-standard settings.
  • parameters, Dict, required – Parameters for the SimAnnealing workchain: will be merged with default ones.
  • raspa_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
          • parser_name, str, optional, non_db – Set a string for the output parser. Can be None if no output plugin is available or needed
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db – Set the calculation to use mpi
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • structure, CifData, required – Adsorbent framework CIF.

Outputs:

  • loaded_molecule, CifData, required – CIF containing the final position of the molecule.
  • loaded_structure, CifData, required – CIF containing the loaded structure.
  • output_parameters, Dict, optional – Information about the final configuration.

Outline:

setup(Initialize the parameters)
while(should_run_nvt)
    run_raspa_nvt(Run an NVT calculation in Raspa.)
run_raspa_min(Run an energy minimization in Raspa.)
return_results(Return molecule position and energy info.)

Inputs details

  • parameters (Dict) modifies the default parameters:

    PARAMETERS_DEFAULT = {
        "ff_framework": "UFF",  # (str) Forcefield of the structure.
        "ff_separate_interactions": False,  # (bool) Use "separate_interactions" in the FF builder.
        "ff_mixing_rule": "Lorentz-Berthelot",  # (string) Choose 'Lorentz-Berthelot' or 'Jorgensen'.
        "ff_tail_corrections": True,  # (bool) Apply tail corrections.
        "ff_shifted": False,  # (bool) Shift or truncate the potential at cutoff.
        "ff_cutoff": 12.0,  # (float) CutOff truncation for the VdW interactions (Angstrom).
        "temperature_list": [300, 250, 200, 250, 100, 50],  # (list) List of decreasing temperatures for the annealing.
        "mc_steps": int(1e3),  # (int) Number of MC cycles.
        "number_of_molecules": 1  # (int) Number of molecules loaded in the framework.
    }
    
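Only the keys to be changed need to be passed, since they are merged with the defaults above. For instance, assuming a builder for this work chain as in the sketch above:

    from aiida.orm import Dict

    # Only the given keys overwrite PARAMETERS_DEFAULT; the rest is kept.
    builder.parameters = Dict(dict={
        'temperature_list': [300, 200, 100],  # shorter annealing schedule
        'mc_steps': int(5e3),
    })
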

Outputs details

  • output_parameters (Dict), example:

    {
        "description": [
            "NVT simulation at 300 K",
            "NVT simulation at 250 K",
            "NVT simulation at 200 K",
            "NVT simulation at 250 K",
            "NVT simulation at 100 K",
            "NVT simulation at 50 K",
            "Final energy minimization"
        ],
        "energy_adsorbate/adsorbate_final_coulomb": [
            -0.00095657162276787,
            ...
            3.5423777787399e-06
        ],
        "energy_adsorbate/adsorbate_final_tot": [
            -0.00095657162276787,
            ...
            3.5423777787399e-06
        ],
        "energy_adsorbate/adsorbate_final_vdw": [
            0.0,
            ...
            0.0
        ],
        "energy_host/adsorbate_final_coulomb": [
            -12.696035310164,
            ...
            -15.592788991158
        ],
        "energy_host/adsorbate_final_tot": [
            -30.545798720022,
            ...
            -36.132005060753
        ],
        "energy_host/adsorbate_final_vdw": [
            -17.849763409859,
            ...
            -20.539216069678
        ],
        "energy_unit": "kJ/mol",
        "number_of_molecules": 1
    }
    

Cp2kBindingEnergy work chain

The Cp2kBindingEnergyWorkChain() work chain takes as input a framework structure and the initial position of a molecule in its pore, optimizes the molecule’s geometry while keeping the framework rigid, and computes the BSSE-corrected interaction energy. The work chain is similar to CP2K’s MultistageWorkChain in that it reads the settings from a YAML protocol and resubmits the calculation with updated settings in case of failure, but its only stage is a hard-coded GEO_OPT simulation with a maximum of 200 steps. A minimal submission sketch is given after the notes below.

NOTE:

  1. It is better to start from the settings of a previous successful MultistageWorkChain, if already available. Otherwise, the work chain may run for 200 steps before realizing that the settings are not good and switching them.

  2. No restart is allowed, since the number of atoms in the system changes for the BSSE calculation: therefore, the wave function is recomputed from scratch 5 times. This needs to be fixed in the future.

  3. If the structure and molecule StructureData do not have the same unit cell size, the work chain will complain and stop.
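
A minimal submission sketch. The entry point name 'lsmo.cp2k_binding_energy', the protocol tag and the node pks are assumptions for illustration:

    from aiida import orm
    from aiida.engine import submit
    from aiida.plugins import WorkflowFactory

    Cp2kBindingEnergyWorkChain = WorkflowFactory('lsmo.cp2k_binding_energy')  # assumed entry point

    builder = Cp2kBindingEnergyWorkChain.get_builder()
    # Framework and molecule must be StructureData with identical unit cells (note 3).
    builder.structure = orm.load_node(1234)  # hypothetical pk of the framework
    builder.molecule = orm.load_node(1235)   # hypothetical pk of the molecule
    builder.protocol_tag = orm.Str('standard')  # assumed protocol tag
    builder.cp2k_base.cp2k.code = orm.load_code('cp2k@localhost')
    builder.cp2k_base.cp2k.metadata.options.resources = {'num_machines': 1}
    builder.cp2k_base.cp2k.metadata.options.max_wallclock_seconds = 3600
    submit(builder)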

workchainaiida_lsmo.workchains.Cp2kBindingEnergyWorkChain

Submits the Cp2kBase work chain for the structure + molecule system, first optimizing the geometry of the molecule and later computing the BSSE-corrected interaction energy. This work chain is inspired by Cp2kMultistage, and shares some logic and data with it.

Inputs:

  • cp2k_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • cp2k, Namespace
      Namespace Ports
      • basissets, Namespace – A dictionary of basissets to be used in the calculations: key is the atomic symbol, value is either a single basisset or a list of basissets. If multiple basissets for a single symbol are passed, it is mandatory to specify a KIND section with a BASIS_SET keyword matching the names (or aliases) of the basissets.
      • code, Code, required – The Code to use for this job.
      • file, Namespace – additional input files
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • molecule, StructureData, required – Input molecule in the unit cell of the structure.
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol tag.yaml. NOTE: only the settings are read, stage is set to GEO_OPT.
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file. NOTE: only the settings are read, stage is set to GEO_OPT.
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, StructureData, required – Input structure that contains the molecule.

Outputs:

  • loaded_molecule, StructureData, required – Molecule geometry in the unit cell.
  • loaded_structure, StructureData, required – Geometry of the system with both fragments.
  • output_parameters, Dict, required – Info regarding the binding energy of the system.
  • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.

Outline:

setup(Setup initial parameters.)
while(should_run_geo_opt)
    run_geo_opt(Prepare inputs, submit and direct output to context.)
    inspect_and_update_settings_geo_opt(Inspect the settings_{idx} calculation and check whether the settings need to be updated and the calculation resubmitted.)
run_bsse(Update parameters and run BSSE calculation. BSSE assumes that the molecule has no charge and unit multiplicity: this can be customized from builder.cp2k_base.cp2k.parameters.)
results(Gather final outputs of the workchain.)

Inputs details

Look at the inputs details of the Multistage work chain for more information about the choice of the protocol (i.e., DFT settings).

Outputs details

  • output_parameters (Dict), example:

    {
        "binding_energy_bsse": -1.7922110202537,
        "binding_energy_corr": -23.072114381515,
        "binding_energy_dispersion": -18.318476834858,
        "binding_energy_raw": -24.864325401768,
        "binding_energy_unit": "kJ/mol",
        "motion_opt_converged": false,
        "motion_step_info": {
            "dispersion_energy_au": [
                -0.1611999344803,
                ...
                -0.16105256797101
            ],
            "energy_au": [
                -829.9150365907,
                ...
                -829.91870835924
            ],
            "max_grad_au": [
                null,
                0.0082746554,
                ...
                0.0030823925
            ],
            "max_step_au": [
                null,
                0.0604411557,
                ...
                0.0215865148
            ],
            "rms_grad_au": [
                null,
                0.000915767,
                ...
                0.0003886735
            ],
            "rms_step_au": [
                null,
                0.0071240711,
                ...
                0.0026174255
            ],
            "scf_converged": [
                true,
                ...
                true
            ]
        }
    }
    

BindingSiteWorkChain work chain

The BindingSiteWorkChain() work chain simply combines SimAnnealingWorkChain() and Cp2kBindingEnergyWorkChain(). The outputs of the two work chains are collected under the ff and dft namespaces, respectively, as shown in the snippet below.
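
For example, the results of a finished BindingSiteWorkChain can be accessed through these namespaces (the pk below is a placeholder):

    from aiida import orm

    wc = orm.load_node(1234)  # hypothetical pk of a finished BindingSiteWorkChain

    ff_out = wc.outputs.ff.output_parameters.get_dict()    # SimAnnealing (classical FF)
    dft_out = wc.outputs.dft.output_parameters.get_dict()  # Cp2kBindingEnergy (DFT)
    print(dft_out['binding_energy_corr'], dft_out['binding_energy_unit'])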

workchainaiida_lsmo.workchains.BindingSiteWorkChain

A workchain that combines SimAnnealing & Cp2kBindingEnergy

Inputs:

  • cp2k_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • cp2k, Namespace
      Namespace Ports
      • basissets, Namespace – A dictionary of basissets to be used in the calculations: key is the atomic symbol, value is either a single basisset or a list of basissets. If multiple basissets for a single symbol are passed, it is mandatory to specify a KIND section with a BASIS_SET keyword matching the names (or aliases) of the basissets.
      • code, Code, required – The Code to use for this job.
      • file, Namespace – additional input files
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • metadata, Namespace
    Namespace Ports
    • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
    • description, str, optional, non_db – Description to set on the process node.
    • label, str, optional, non_db – Label to set on the process node.
    • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
  • molecule, (Str, Dict), required – Adsorbate molecule: settings to be read from the yaml. Advanced: input a Dict for non-standard settings.
  • parameters, Dict, required – Parameters for the SimAnnealing workchain: will be merged with default ones.
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol tag.yaml. NOTE: only the settings are read, stage is set to GEO_OPT.
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file. NOTE: only the settings are read, stage is set to GEO_OPT.
  • raspa_base, Namespace
    Namespace Ports
    • clean_workdir, Bool, optional – If True, work directories of all called calculation jobs will be cleaned at the end of execution.
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
    • metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • description, str, optional, non_db – Description to set on the process node.
      • label, str, optional, non_db – Label to set on the process node.
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      • metadata, Namespace
        Namespace Ports
        • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
        • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
        • description, str, optional, non_db – Description to set on the process node.
        • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
        • label, str, optional, non_db – Label to set on the process node.
        • options, Namespace
          Namespace Ports
          • account, str, optional, non_db – Set the account to use in for the queue on the remote computer
          • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
          • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
          • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
          • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
          • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
          • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
          • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
          • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
          • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
          • parser_name, str, optional, non_db – Set a string for the output parser. Can be None if no output plugin is available or needed
          • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
          • priority, str, optional, non_db – Set the priority of the job to be queued
          • qos, str, optional, non_db – Set the quality of service to use in for the queue on the remote computer
          • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
          • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
          • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
          • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
          • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
          • withmpi, bool, optional, non_db – Set the calculation to use mpi
        • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, CifData, required – Adsorbent framework CIF.

Outputs:

  • dft, Namespace
    Namespace Ports
    • loaded_molecule, StructureData, required – Molecule geometry in the unit cell.
    • loaded_structure, StructureData, required – Geometry of the system with both fragments.
    • output_parameters, Dict, required – Info regarding the binding energy of the system.
    • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.
  • ff, Namespace
    Namespace Ports
    • loaded_molecule, CifData, required – CIF containing the final position of the molecule.
    • loaded_structure, CifData, required – CIF containing the loaded structure.
    • output_parameters, Dict, optional – Information about the final configuration.

Outline:

run_sim_annealing(Run SimAnnealing)
run_cp2k_binding_energy(Pass the output molecule's geometry to Cp2kBindingEnergy.)
return_results(Return exposed outputs and info.)

Technicalities

Unit cell expansion

With periodic boundary conditions, the lengths of the simulation box should be bigger than twice the cutoff value. Therefore, for an orthogonal cell, one should multiply the cell until its length meets this criterion in every direction.

In the case of non-orthogonal cells, however, one should not speak in terms of “lengths” but in terms of “perpendicular widths”, as shown in the figure for the two-dimensional case. While in the orthogonal case one can simply take pwa = b and pwb = a, in a tilted unit cell we have to compute pwa and pwb explicitly and then evaluate whether the cell needs to be expanded, and by which multiplication coefficients.


Perpendicular widths in orthogonal and tilted 2D cells.

This explains why so much math is needed in the function check_resize_unit_cell(), which computes the Raspa input “UnitCells”. The underlying geometry is sketched below.
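
A minimal illustration of the criterion for a generic 3D cell (a sketch, not the plugin's actual implementation):

    import math
    import numpy as np

    def multiplication_factors(cell, cutoff=12.0):
        """Return (na, nb, nc) such that every perpendicular width of the
        multiplied cell exceeds twice the cutoff (minimum-image criterion).
        `cell` is a 3x3 matrix with the cell vectors a, b, c as rows."""
        a, b, c = np.asarray(cell, dtype=float)
        volume = abs(np.dot(a, np.cross(b, c)))
        # Perpendicular width in each direction: cell volume divided by the
        # area of the face spanned by the other two vectors.
        widths = [volume / np.linalg.norm(np.cross(b, c)),
                  volume / np.linalg.norm(np.cross(c, a)),
                  volume / np.linalg.norm(np.cross(a, b))]
        return tuple(int(math.ceil(2 * cutoff / w)) for w in widths)

    # A 20 Angstrom orthogonal cell with a 12 Angstrom cutoff needs 2x2x2 repeats.
    print(multiplication_factors(np.eye(3) * 20.0))  # (2, 2, 2)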

Note that if you do not multiply correctly the unit cell, Raspa will complain in the output:

WARNING: INAPPROPRIATE NUMBER OF UNIT CELLS USED

which typically results in a lower uptake than the correct one: if the cell is smaller than twice the cutoff, fewer interactions are computed because each particle sees some artificial vacuum beyond the unit cell. This results in weaker average interactions and therefore a lower uptake at a given pressure and temperature.

Isotherm’s pressures selection

In the Isotherm work chain we use the function choose_pressure_points(), which can automatically select the pressure points for an adequate sampling of the isotherm curve. The method, presented in our publication and summarized in the figure, is based on a preliminary estimation of the Henry coefficient and pore volume. From these, a Langmuir isotherm is derived and used as a proxy to determine the pressure points. The user has to specify pressure_min, pressure_max, pressure_maxstep and pressure_precision. The last is the A coefficient in the figure: 0.1 is the default value, but we recommend testing values around 0.05 for a more accurate sampling, i.e., a higher resolution of the isotherm curve in the low-pressure region. A sketch of this selection logic is given after the figure.

_images/isotherm_sampling.png
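
One plausible reading of the selection logic, as a hedged sketch (the function below is illustrative, not the plugin's choose_pressure_points() implementation): take the proxy Langmuir isotherm q(p) = qsat * b * p / (1 + b * p) with b = kH / qsat, so that the initial slope equals the Henry coefficient, and advance the pressure so that each step gains roughly A * qsat of loading, capped by pressure_maxstep:

    def choose_pressures(kh, qsat, pmin=0.001, pmax=30.0, maxstep=5.0, precision=0.1):
        """Sketch of a Langmuir-proxy pressure selection (illustrative only).
        kh: Henry coefficient (loading/pressure), qsat: saturation loading,
        precision: the 'A' coefficient (0.1 default, ~0.05 for finer sampling)."""
        b = kh / qsat  # Langmuir affinity: initial slope of the proxy equals kh
        pressures = [pmin]
        while pressures[-1] < pmax:
            p = pressures[-1]
            slope = qsat * b / (1.0 + b * p) ** 2  # dq/dp of the Langmuir proxy
            step = min(maxstep, precision * qsat / slope)  # gain ~A*qsat per step
            pressures.append(min(p + step, pmax))
        return pressures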

Note

This method works only for sampling Type I isotherms: it fails to correctly sample the inflection of the curve in case of strong cooperative adsorption, e.g., for a typical water isotherm.

aiida_lsmo package

Subpackages

aiida_lsmo.calcfunctions package

Submodules
aiida_lsmo.calcfunctions.ff_builder_module module

ff_builder calcfunction.

aiida_lsmo.calcfunctions.ff_builder_module.ff_builder(params)[source]

AiiDA calcfunction to assemble force field parameters into SinglefileData for Raspa.

aiida_lsmo.calcfunctions.ff_builder_module.load_yaml()[source]

Load the ff_data.yaml as a dict.

aiida_lsmo.calcfunctions.ff_builder_module.mix_molecule_ff(ff_list, mixing_rule)[source]

Mix molecule-molecule interactions in case of separate_interactions: return mixed ff_list

aiida_lsmo.calcfunctions.ff_builder_module.render_ff_def(ff_data, params, ff_mix_found)[source]

Render the force_field.def file.

aiida_lsmo.calcfunctions.ff_builder_module.render_ff_mixing_def(ff_data, params)[source]

Render the force_field_mixing_rules.def file.

aiida_lsmo.calcfunctions.ff_builder_module.render_molecule_def(ff_data, params, molecule_name)[source]

Render the molecule.def file containing the thermophysical data, geometry and intramolecular force field.

aiida_lsmo.calcfunctions.ff_builder_module.render_pseudo_atoms_def(ff_data, params)[source]

Render the pseudo_atoms.def file.

aiida_lsmo.calcfunctions.ff_builder_module.string_to_singlefiledata(string, filename)[source]

Convert a string to a SinglefileData.

aiida_lsmo.calcfunctions.selectivity module

Calcfunctions to compute gas-selectivity related applications.

aiida_lsmo.calcfunctions.selectivity.calc_selectivity(isot_dict_a, isot_dict_b)[source]

Compute the selectivity of gas A over gas B as S = kH_a / kH_b. Note that if the material is not porous to one of the gases, the result is simply {'is_porous': False}. To maintain compatibility with v1, instead of checking 'is_porous', it checks for the henry_coefficient_average key in the Dict.
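
A plain-dict sketch of this logic (illustrative, not the actual calcfunction):

    def selectivity_a_over_b(isot_a, isot_b):
        """Sketch of the described behavior, on plain dicts."""
        # A missing Henry coefficient means the material is not porous to
        # that molecule (kept for compatibility with v1 outputs).
        if ('henry_coefficient_average' not in isot_a
                or 'henry_coefficient_average' not in isot_b):
            return {'is_porous': False}
        return {
            'is_porous': True,
            'selectivity': isot_a['henry_coefficient_average'] /
                           isot_b['henry_coefficient_average'],
        }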

aiida_lsmo.calcfunctions.working_cap module

Calcfunctions to compute working capacities for different gasses.

aiida_lsmo.calcfunctions.working_cap.calc_ch4_working_cap(isot_dict)[source]

Compute the CH4 working capacity from the output_parameters Dict of IsothermWorkChain. This must have run calculations at 5.8 and 65.0 bar (at 298 K), which are the standard reference conditions for the evaluation.

The results can be compared with Simon2015 (10.1039/C4EE03515A).
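
Schematically, the working capacity is the uptake difference between the charge and discharge conditions (hypothetical numbers, for illustration only):

    # Illustrative only: working capacity as uptake difference between
    # charge (65.0 bar) and discharge (5.8 bar) conditions at 298 K.
    uptake_65bar = 180.0    # hypothetical loading, cm3_STP/cm3
    uptake_5p8bar = 95.0    # hypothetical loading, cm3_STP/cm3
    working_capacity = uptake_65bar - uptake_5p8bar
    print(working_capacity)  # 85.0 cm3_STP/cm3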

aiida_lsmo.calcfunctions.working_cap.calc_h2_working_cap(isotmt_dict)[source]

Compute the H2 working capacity from the output_parameters Dict of MultiTempIsothermWorkChain. This must have run calculations at 1, 5 and 100 bar at 77, 198 and 298 K. The US DOE target for the onboard storage of hydrogen in vehicles sets the bar at 4.5 wt% and 30 g/L (Kapelewski2018).

Case A: near-ambient-temperature adsorption, 100 bar/198 K to 5 bar/298 K (cf. Kapelewski2018, 10.1021/acs.chemmater.8b03276). Ni2(m-dobdc), best experimental: 23.0 g/L.
Case B: low-temperature adsorption, 100-5 bar at 77 K (cf. Ahmed2019, 10.1038/s41467-019-09365-w). NU-100, best experimental: 35.5 g/L.
Case C: low-temperature adsorption at low discharge, 100-1 bar at 77 K (cf. Thornton2017, 10.1021/acs.chemmater.6b04933). hypMOF-5059389, best simulated: 40.0 g/L.

aiida_lsmo.calcfunctions.working_cap.calc_o2_working_cap(isot_dict)[source]

Compute the O2 working capacity from the output_parameters Dict of IsothermWorkChain. This must have run calculations at 5 and 140.0 bar (at 298 K), to be consistent with the screening of Moghadam2018 (10.1038/s41467-018-03892-8), in which the MOF ANUGIA (UMCM-152) was found to have a volumetric working capacity of 249 vSTP/v (simulations nearly identical to experiments). Consider that, at the same conditions, an empty tank can only store 136 vSTP/v, and a comparable working capacity can only be obtained by compressing up to 300 bar.

aiida_lsmo.calcfunctions.working_cap.get_molec_uc_to_mg_g(isot_dict)[source]

Fix the discrepancy coming from old Raspa calculations, having a typo in the conversion label.

aiida_lsmo.calcfunctions.wrappers module

Calculation functions that wrap some advanced script for process evaluation.

aiida_lsmo.calcfunctions.wrappers.calc_co2_parasitic_energy(isot_co2, isot_n2, pe_parameters)[source]

Submit calc_pe calculation using AiiDA, for the CO2 parasitic energy.

:isot_co2: (Dict) CO2 IsothermWorkChainNode.outputs['output_parameters']
:isot_n2: (Dict) N2 IsothermWorkChainNode.outputs['output_parameters']
:pe_parameters: (Dict) See PE_PARAMETERS_DEFAULT

Module contents

AiiDA calcfunctions

aiida_lsmo.parsers package

Submodules
aiida_lsmo.parsers.parser_functions module

Functions used for specific parsing of output files.

aiida_lsmo.parsers.parser_functions.parse_cp2k_output_advanced(fstring)[source]

Parse CP2K output into a dictionary (ADVANCED: more info parsed @ PRINT_LEVEL MEDIUM)

aiida_lsmo.parsers.parser_functions.parse_cp2k_output_bsse(fstring)[source]

Parse CP2K BSSE output into a dictionary (tested with PRINT_LEVEL MEDIUM).

Module contents

Parsers for the specific usage of aiida-lsmo workchains.

class aiida_lsmo.parsers.Cp2kAdvancedParser(node)[source]

Bases: aiida_cp2k.parsers.Cp2kBaseParser

Advanced AiiDA parser class for the output of CP2K.

_parse_stdout()[source]

Advanced CP2K output file parser

class aiida_lsmo.parsers.Cp2kBsseParser(node)[source]

Bases: aiida_cp2k.parsers.Cp2kBaseParser

Advanced AiiDA parser class for a BSSE calculation in CP2K.

_parse_stdout()[source]

BSSE CP2K output file parser

aiida_lsmo.utils package

Submodules
aiida_lsmo.utils.cp2k_utils module

Utilities related to CP2K.

aiida_lsmo.utils.cp2k_utils.get_bsse_section(natoms_a, natoms_b, mult_a=1, mult_b=1, charge_a=0, charge_b=0)[source]

Get the &FORCE_EVAL/&BSSE section.

aiida_lsmo.utils.cp2k_utils.get_input_multiplicity(structure, protocol_settings)[source]

Compute the total multiplicity of the structure, by summing the atomic magnetizations: multiplicity = 1 + sum_i ( natoms_i * magnetization_i ), for each atom_type i
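
For example, with assumed atom counts and per-kind magnetizations (hypothetical numbers, not the actual API):

    # Hypothetical atom counts and per-kind magnetizations for an illustration.
    natoms = {'Cu': 4, 'O': 8, 'C': 24, 'H': 12}
    magnetization = {'Cu': 1, 'O': 0, 'C': 0, 'H': 0}

    multiplicity = 1 + sum(natoms[k] * magnetization[k] for k in natoms)
    print(multiplicity)  # 1 + 4*1 = 5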

aiida_lsmo.utils.cp2k_utils.get_kinds_section(structure, protocol_settings)[source]

Write the &KIND sections given the structure and the settings_dict

aiida_lsmo.utils.cp2k_utils.get_kinds_with_ghost_section(structure, protocol_settings)[source]

Write the &KIND sections given the structure and the settings_dict, and add also GHOST atoms

aiida_lsmo.utils.cp2k_utils.ot_has_small_bandgap(cp2k_input, cp2k_output, bandgap_thr_ev)[source]

Returns True if the calculation used OT and had a smaller band gap than the guess needed for the OT. (NOTE: negative band gaps have also been observed with OT in CP2K!)

cp2k_input: dict
cp2k_output: dict
bandgap_thr_ev: float [eV]

aiida_lsmo.utils.multiply_unitcell module

Utilities for unit cell multiplication, typically for cut-off issues.

aiida_lsmo.utils.multiply_unitcell.check_resize_unit_cell(cif, threshold)[source]

Returns the multiplication factors for the cell vectors needed so that, in every direction, min(perpendicular_width) > threshold.

aiida_lsmo.utils.multiply_unitcell.check_resize_unit_cell_legacy(struct, threshold)[source]

Returns the multiplication factors for the cell vectors needed so that, in every direction, min(perpendicular_width) > threshold. TODO: this has been used for CP2K; make it uniform with the other one used for Raspa (from CifFile).

aiida_lsmo.utils.multiply_unitcell.resize_unit_cell(struct, resize)[source]

Resize the StructureData according to the resize Dict

aiida_lsmo.utils.other_utilities module

Other utilities

aiida_lsmo.utils.other_utilities.aiida_cif_merge(aiida_cif_a, aiida_cif_b)[source]

Merge the coordinates of two CifData into a single one. Note: the two unit cells must be the same.

aiida_lsmo.utils.other_utilities.aiida_dict_merge(to_dict, from_dict)[source]

Merge two aiida Dict objects.

aiida_lsmo.utils.other_utilities.aiida_structure_merge(aiida_structure_a, aiida_structure_b)[source]

Merge the coordinates of two StructureData into a single one. Note: the two unit cells must be the same.

aiida_lsmo.utils.other_utilities.ase_cells_are_similar(ase_a, ase_b, thr=2)[source]

Return True if the cells of two ASE objects are similar up to “thr” decimals. This avoids raising an error when two cells differ at the nth decimal place, typically because of some truncation.
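
A minimal sketch of this comparison on plain 3x3 cell matrices (the actual function takes ASE objects):

    import numpy as np

    def cells_are_similar(cell_a, cell_b, thr=2):
        """Sketch: True if the two cells match once rounded to `thr` decimals."""
        return np.array_equal(np.round(cell_a, thr), np.round(cell_b, thr))

    print(cells_are_similar(np.eye(3) * 20.0, np.eye(3) * 20.001, thr=2))  # True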

aiida_lsmo.utils.other_utilities.dict_merge(dct, merge_dct)[source]

Taken from https://gist.github.com/angstwad/bf22d1822c38a92ec0a9. Recursive dict merge. Inspired by dict.update(): instead of updating only top-level keys, dict_merge recurses down into dicts nested to an arbitrary depth, updating keys. The merge_dct is merged into dct.

:param dct: dict onto which the merge is executed
:param merge_dct: dict merged into dct
:return: None
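
A minimal re-implementation of the behavior described (a sketch, not the vendored code):

    import collections.abc

    def dict_merge(dct, merge_dct):
        """Recursively merge merge_dct into dct, in place."""
        for key, value in merge_dct.items():
            if (key in dct and isinstance(dct[key], dict)
                    and isinstance(value, collections.abc.Mapping)):
                dict_merge(dct[key], value)  # recurse into nested dicts
            else:
                dct[key] = value  # plain values are added or overwritten

    settings = {'ff': {'cutoff': 12.0, 'shifted': False}}
    dict_merge(settings, {'ff': {'cutoff': 14.0}})
    print(settings)  # {'ff': {'cutoff': 14.0, 'shifted': False}}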

aiida_lsmo.utils.other_utilities.get_cif_from_structure(structuredata)[source]

Convert a StructureData to CifData, maintaining the provenance.

aiida_lsmo.utils.other_utilities.get_structure_from_cif(cifdata)[source]

Convert a CifData to StructureData, maintaining the provenance.

Module contents

aiida-lsmo utils

aiida_lsmo.workchains package

Submodules
aiida_lsmo.workchains.binding_site module

BindingSite workchain.

class aiida_lsmo.workchains.binding_site.BindingSiteWorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

A workchain that combines SimAnnealing & Cp2kBindingEnergy

classmethod define(spec)[source]

Define workflow specification.

return_results()[source]

Return exposed outputs and info.

run_cp2k_binding_energy()[source]

Pass the ouptput molecule’s geometry to Cp2kBindingEnergy.

run_sim_annealing()[source]

Run SimAnnealing

aiida_lsmo.workchains.cp2k_binding_energy module

Binding energy workchain

class aiida_lsmo.workchains.cp2k_binding_energy.Cp2kBindingEnergyWorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

Submits the Cp2kBase work chain for the structure + molecule system, first optimizing the geometry of the molecule and later computing the BSSE-corrected interaction energy. This work chain is inspired by Cp2kMultistage, and shares some logic and data with it.

classmethod define(spec)[source]
inspect_and_update_settings_geo_opt()[source]

Inspect the settings_{idx} calculation and check whether the settings need to be updated and the calculation resubmitted.

results()[source]

Gather final outputs of the workchain.

run_bsse()[source]

Update parameters and run BSSE calculation. BSSE assumes that the molecule has no charge and unit multiplicity: this can be customized from builder.cp2k_base.cp2k.parameters.

run_geo_opt()[source]

Prepare inputs, submit and direct output to context.

setup()[source]

Setup initial parameters.

should_run_geo_opt()[source]

Returns True if it is the first iteration or the settings are not ok.

aiida_lsmo.workchains.cp2k_binding_energy.get_loaded_molecule(loaded_structure, input_molecule)[source]

Return only the molecule’s atoms in the unit cell as a StructureData object.

aiida_lsmo.workchains.cp2k_binding_energy.get_output_parameters(**cp2k_out_dict)[source]

Extracts important results to include in the output_parameters.

aiida_lsmo.workchains.cp2k_multistage module

Multistage work chain.

class aiida_lsmo.workchains.cp2k_multistage.Cp2kMultistageWorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

Submits Cp2kBase work chains for ENERGY, GEO_OPT, CELL_OPT and MD jobs iteratively. The protocol_yaml file contains a series of settings_x and stage_x: the work chain starts by running the settings_0/stage_0 calculation and, in case of failure, changes the settings until the SCF of stage_0 converges. Then it uses the same settings to run the next stages (i.e., stage_1, etc.).

classmethod define(spec)[source]
inspect_and_update_settings_stage0()[source]

Inspect the stage0/settings_{idx} calculation and check whether the settings need to be updated and the calculation resubmitted.

inspect_and_update_stage()[source]

Update geometry, parent folder and the new &MOTION settings.

results()[source]

Gather final outputs of the workchain.

run_stage()[source]

Check for restart, prepare input, submit and direct output to context.

setup_multistage()[source]

Setup initial parameters.

should_run_stage()[source]

Return True if there is a new stage to compute.

should_run_stage0()[source]

Returns True if it is the first iteration or the settings are not ok.

aiida_lsmo.workchains.cp2k_multistage.extract_results(resize, **kwargs)[source]

Extracts results from the output_parameters of the single calculations (i.e., the SCF-converged stages) into a single Dict output.

- resize (Dict) contains the unit cell resizing values
- kwargs contains all the output_parameters for the stages and the extra initial change of settings, e.g.:
  'out_0': cp2k output_parameters with Dict.label = 'settings_0_stage_0_discard'
  'out_1': cp2k output_parameters with Dict.label = 'settings_1_stage_0_valid'
  'out_2': cp2k output_parameters with Dict.label = 'settings_1_stage_1_valid'
  'out_3': cp2k output_parameters with Dict.label = 'settings_1_stage_2_valid'

This will be read as: output_dict = {'nstages_valid': 3, 'nsettings_discarded': 1}.

aiida_lsmo.workchains.cp2k_multistage_ddec module

Cp2kMultistageDdecWorkChain workchain

class aiida_lsmo.workchains.cp2k_multistage_ddec.Cp2kMultistageDdecWorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

A workchain that combines: Cp2kMultistageWorkChain + Cp2kDdecWorkChain

classmethod define(spec)[source]

Define workflow specification.

return_results()[source]

Return exposed outputs and print the pk of the CifData w/DDEC

run_cp2kddec()[source]

Pass the Cp2kMultistageWorkChain outputs as inputs for Cp2kDdecWorkChain: cp2k_base (metadata), cp2k_params, structure and WFN.

run_cp2kmultistage()[source]

Run CP2K-Multistage

aiida_lsmo.workchains.isotherm module

Isotherm workchain

class aiida_lsmo.workchains.isotherm.IsothermWorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

Work chain that computes volpo and blocking spheres: if the accessible volpo > 0, it also runs a Raspa Widom calculation for the Henry coefficient.

_get_widom_param()[source]

Write Raspa input parameters from scratch, for a Widom calculation

_update_param_for_gcmc()[source]

Update Raspa input parameter, from Widom to GCMC

classmethod define(spec)[source]
init_raspa_gcmc()[source]

Choose the pressures we want to sample, report some details, and update settings for GCMC

return_output_parameters()[source]

Merge all the parameters into output_parameters, depending on is_porous and is_kh_enough.

run_raspa_gcmc()[source]

Run a GCMC calculation in Raspa @ T,P.

run_raspa_widom()[source]

Run a Widom calculation in Raspa.

run_zeopp()[source]

Perform Zeo++ block and VOLPO calculations.

setup()[source]

Initialize the parameters

should_run_another_gcmc()[source]

We run another Raspa calculation only if the current iteration is smaller than the total number of pressures we want to compute.

should_run_gcmc()[source]

Output the Widom results and decide whether to compute the isotherm, i.e., whether kH > kHmin as defined by the user.

should_run_widom()[source]

Submit the Widom calculation only if there is some accessible volume; also check the number of blocking spheres and estimate the saturation loading. Stop here if called by IsothermMultiTemp for geometric results only.

aiida_lsmo.workchains.isotherm.choose_pressure_points(inp_param, geom, raspa_widom_out)[source]

If 'pressure_list' is not provided, model the isotherm as a single-site Langmuir and return the most important pressure points to evaluate for an isotherm, in a List.

aiida_lsmo.workchains.isotherm.get_atomic_radii(isotparam)[source]

Get {ff_framework}.rad as SinglefileData from workchain/isotherm_data. If it does not exist, use DEFAULT.rad.

aiida_lsmo.workchains.isotherm.get_ff_parameters(molecule_dict, isotparam)[source]

Get the parameters for ff_builder.

aiida_lsmo.workchains.isotherm.get_geometric_dict(zeopp_out, molecule)[source]

Return the geometric Dict from Zeopp results, including Qsat and is_porous

aiida_lsmo.workchains.isotherm.get_molecule_dict(molecule_name)[source]

Get a Dict from the isotherm_molecules.yaml

aiida_lsmo.workchains.isotherm.get_output_parameters(geom_out, inp_params, widom_out=None, pressures=None, **gcmc_out_dict)[source]

Merge results from all the steps of the work chain.

aiida_lsmo.workchains.isotherm.get_zeopp_parameters(molecule_dict, isotparam)[source]

Get the ZeoppParameters from the inputs of the workchain

aiida_lsmo.workchains.isotherm_calc_pe module

IsothermCalcPE work chain.

class aiida_lsmo.workchains.isotherm_calc_pe.IsothermCalcPEWorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

Compute the CO2 parasitic energy (PE) after running IsothermWorkChain for CO2 and N2 at 300 K.

classmethod define(spec)[source]
run_calcpe()[source]

Expose isotherm outputs, prepare calc_pe, run it and return the output.

run_isotherms()[source]

Run Isotherm work chain for CO2 and N2.

aiida_lsmo.workchains.isotherm_multi_temp module

IsothermMultiTemp workchain.

class aiida_lsmo.workchains.isotherm_multi_temp.IsothermMultiTempWorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

Run IsothermWorkChain for multiple temperatures: first compute geometric properties and then submit Widom+GCMC at different temperatures in parallel

collect_isotherms()[source]

Collect all the results in one Dict

classmethod define(spec)[source]
run_geometric()[source]

Perform Zeo++ block and VOLPO calculation with IsothermWC.

run_isotherms()[source]

Compute isotherms at different temperatures.

should_continue()[source]

Continue if porous

aiida_lsmo.workchains.isotherm_multi_temp.get_output_parameters(geom_dict, **isotherm_dict)[source]

Gather together all the results, returning lists for the multi temperature values

aiida_lsmo.workchains.isotherm_multi_temp.get_parameters_singletemp(i, parameters)[source]
aiida_lsmo.workchains.nanoporous_screening_1 module

ZeoppMultistageDdecPeWorkChain workchain

class aiida_lsmo.workchains.nanoporous_screening_1.NanoporousScreening1WorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

A workchain that combines ZeoppMultistageDdecWorkChain (wc1) and IsothermCalcPEWorkChain (wc2). In the future I will use this to include more applications running in parallel.

classmethod define(spec)[source]

Define workflow specification.

include_results_wc1()[source]

Include results of work chain 1 in group.

include_results_wc2()[source]

Include results of work chain 2 in group.

make_group()[source]

Create the curated-xxx_XXXX_vx group and put the orig_cif inside; exit if the group already exists.

run_wc1()[source]

Run work chain 1.

run_wc2()[source]

Run work chain 2.

aiida_lsmo.workchains.nanoporous_screening_1.include_node(tag, node, group)[source]

Given an AiiDA node and a (string) tag, add the node to the curated-cof_XXX_vX group, and set the tag as an extra of the node for querying.

aiida_lsmo.workchains.sim_annealing module

Isotherm workchain

class aiida_lsmo.workchains.sim_annealing.SimAnnealingWorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

A work chain to compute the minimum-energy geometry of a molecule inside a framework using simulated annealing, i.e., decreasing the temperature of a Monte Carlo simulation and finally running an energy minimization step.

_get_raspa_nvt_param()[source]

Write Raspa input parameters from scratch, for an MC NVT calculation

classmethod define(spec)[source]
return_results()[source]

Return molecule position and energy info.

run_raspa_min()[source]

Run an energy minimization in Raspa.

run_raspa_nvt()[source]

Run an NVT calculation in Raspa.

setup()[source]

Initialize the parameters

should_run_nvt()[source]

Update the temperature until the last one in the list.

aiida_lsmo.workchains.sim_annealing.get_ff_parameters(molecule_dict, isotparam)[source]

Get the parameters for ff_builder.

aiida_lsmo.workchains.sim_annealing.get_molecule_dict(molecule_name)[source]

Get a Dict from the isotherm_molecules.yaml

aiida_lsmo.workchains.sim_annealing.get_molecule_from_restart_file(structure_cif, molecule_folderdata, input_dict, molecule_dict)[source]

Get a CifData file having the cell of the initial (unexpanded) structure and the geometry of the loaded molecule. TODO: this is a source of error if there is more than one molecule AND the cell has been expanded, as you cannot wrap them in the small cell.

aiida_lsmo.workchains.sim_annealing.get_output_parameters(input_dict, min_out_dict, **nvt_out_dict)[source]

Merge energy info from the calculations.

aiida_lsmo.workchains.sim_annealing.load_yaml()[source]

Load the ff_data.yaml as a dict.

aiida_lsmo.workchains.zeopp_multistage_ddec module

ZeoppMultistageDdecWorkChain work chain

class aiida_lsmo.workchains.zeopp_multistage_ddec.ZeoppMultistageDdecWorkChain(*args, **kwargs)[source]

Bases: aiida.engine.processes.workchains.workchain.WorkChain

A workchain that combines: Zeopp + Cp2kMultistageWorkChain + Cp2kDdecWorkChain + Zeopp

classmethod define(spec)[source]

Define workflow specification.

return_results()[source]

Return exposed outputs

run_multistageddec()[source]

Run MultistageDdec work chain

run_zeopp_after()[source]

Run Zeo++ for the optimized structure

run_zeopp_before()[source]

Run Zeo++ for the original structure

Module contents

Workchains developed at LSMO laboratory.

Module contents

aiida_lsmo

AiiDA workflows for the LSMO laboratory at EPFL

If you use this plugin for your research, please cite the following work:

Daniele Ongari, Aliksandr V. Yakutovich, Leopold Talirz, and Berend Smit, Building a Consistent and Reproducible Database for Adsorption Evaluation in Covalent–Organic Frameworks, ACS Cent. Sci. 2019, 5, 10, 1663-1675 (2019); https://doi.org/10.1021/acscentsci.9b00619.

If you use AiiDA for your research, please cite the following work:

Giovanni Pizzi, Andrea Cepellotti, Riccardo Sabatini, Nicola Marzari, and Boris Kozinsky, AiiDA: automated interactive infrastructure and database for computational science, Comp. Mat. Sci 111, 218-230 (2016); https://doi.org/10.1016/j.commatsci.2015.09.013; http://www.aiida.net.

aiida-lsmo is released under the MIT license.
