LSMO calc functions and work chains

In the following section all the calc functions and work chains of the aiida-lsmo plugin are listed and documented.

Force Field Builder

The ff_builder() calculation function combines the force field parameters (typically for a Lennard-Jones potential) of a framework and of the molecule(s), and outputs the .def files required by Raspa. For the list of available framework parameterizations and molecules, see the file ff_data.yaml.

What it can do:

  1. Switch settings that are written in the .def files of Raspa, such as tail-corrections, truncation/shifting and mixing rules.

  2. Separate the interactions, so that framework-molecule and molecule-molecule interactions are parametrized differently (e.g., TraPPE for molecule-molecule and UFF for framework-molecule, instead of mixed UFF/TraPPE parameters for framework-molecule).

What it currently cannot do:

  1. Deal with flexible molecules.

  2. Take parameters from other files (e.g., YAML).

  3. Generate .def files for a molecule, given just the geometry: it has to be included in the ff_data.yaml file.

Inputs details

  • Parameters Dict:

    from aiida.orm import Dict

    PARAMS_EXAMPLE = Dict(dict={
        'ff_framework': 'UFF',              # See force fields available in ff_data.yaml as framework.keys()
        'ff_molecules': {                   # See molecules available in ff_data.yaml as ff_data.keys()
            'CO2': 'TraPPE',                # See force fields available in ff_data.yaml as {molecule}.keys()
            'N2': 'TraPPE',
        },
        'shifted': True,                    # If True shift dispersion interactions, if False simply truncate them
        'tail_corrections': False,          # If True apply tail corrections based on homogeneous-liquid assumption
        'mixing_rule': 'Lorentz-Berthelot', # Options: 'Lorentz-Berthelot' or 'Jorgensen'
        'separate_interactions': True,      # If True use framework's force field for framework-molecule interactions
    })

Outputs details

  • Dictionary containing the .def files as SinglefileData. This output dictionary is ready to be used as the file input of the RaspaCalculation: you can find an example of usage of this CalcFunction in the IsothermWorkChain, or a minimal test usage in the examples.
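
As a reference for the mixing_rule option, the two named rules can be sketched with their textbook formulas. This is only an illustration, not the plugin's code: the function name mix_lj and the numerical parameters are hypothetical.

```python
import math

def mix_lj(eps_i, sig_i, eps_j, sig_j, rule="Lorentz-Berthelot"):
    """Return (epsilon_ij, sigma_ij) for a pair of Lennard-Jones sites."""
    eps_ij = math.sqrt(eps_i * eps_j)  # geometric mean of epsilon in both rules
    if rule == "Lorentz-Berthelot":
        sig_ij = 0.5 * (sig_i + sig_j)  # arithmetic mean of sigma
    elif rule == "Jorgensen":
        sig_ij = math.sqrt(sig_i * sig_j)  # geometric mean of sigma
    else:
        raise ValueError(f"Unknown mixing rule: {rule}")
    return eps_ij, sig_ij

# Hypothetical parameters (epsilon in K, sigma in Angstrom), for illustration only.
eps_lb, sig_lb = mix_lj(100.0, 3.0, 25.0, 4.0)
eps_jo, sig_jo = mix_lj(100.0, 3.0, 25.0, 4.0, rule="Jorgensen")
```

The two rules differ only in how sigma is combined, so switching the rule changes the cross-interaction size but not its well depth.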

Selectivity calculators

The calc_selectivity() calculation function computes the selectivity of two gases in a material, as the ratio of their Henry coefficients. In the future this module will also host other metrics to assess selectivity for specific applications.
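
The ratio described above can be written in a few lines. This is a minimal sketch, not the signature of calc_selectivity(): the function name henry_selectivity and the input values are hypothetical, and both Henry coefficients are assumed to be in the same units.

```python
def henry_selectivity(kh_a, kh_b):
    """Selectivity of gas A over gas B as the ratio of their Henry coefficients."""
    if kh_b <= 0:
        raise ValueError("The Henry coefficient of the reference gas must be positive")
    return kh_a / kh_b

# Hypothetical Henry coefficients in mol/kg/Pa, for illustration only.
sel = henry_selectivity(6.0e-5, 3.0e-6)
```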

Working Capacity calculators

The calcfunctions/ module contains a collection of calculation functions that compute the working capacity of different compounds (e.g., CH4, H2) at industrially relevant reference conditions. The working capacity is the usable amount of an adsorbed compound between the loading and discharging temperature and pressure. These are post-processing calculations on the output_parameters of the Isotherm or IsothermMultiTemp work chains, which need to be run at specific conditions: see the header of each calc function for these conditions. Their inner workings are very simple, but they are collected in this repository to be used as a reference in our group. If you are investigating a different gas storage application, consider including a similar script here.

An example is calc_ch4_working_cap() for methane storage.
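
To make the definition concrete, here is a minimal sketch of a working capacity for an isothermal pressure swing, computed by linear interpolation on sampled isotherm points. It is not the code of calc_ch4_working_cap(): the function names, the pressures and the loadings are hypothetical.

```python
def interpolate_loading(pressures, loadings, p):
    """Linearly interpolate the loading at pressure p from sorted isotherm points."""
    if not pressures[0] <= p <= pressures[-1]:
        raise ValueError("Pressure outside the sampled range")
    for i in range(len(pressures) - 1):
        p0, p1 = pressures[i], pressures[i + 1]
        if p0 <= p <= p1:
            q0, q1 = loadings[i], loadings[i + 1]
            return q0 + (q1 - q0) * (p - p0) / (p1 - p0)

def working_capacity(pressures, loadings, p_ads, p_des):
    """Loading at the adsorption pressure minus loading at the desorption pressure."""
    q_ads = interpolate_loading(pressures, loadings, p_ads)
    q_des = interpolate_loading(pressures, loadings, p_des)
    return q_ads - q_des

# Hypothetical isotherm (pressure in bar, loading in mol/kg), for illustration only.
pressures = [1.0, 5.8, 30.0, 65.0]
loadings = [0.5, 2.0, 6.0, 8.0]
wc = working_capacity(pressures, loadings, p_ads=65.0, p_des=5.8)
```

When the two conditions are sampled directly in the isotherm, as here, the interpolation simply returns the sampled loadings.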

Isotherm work chain

The IsothermWorkChain() work chain computes a single-component isotherm in a framework, from a few settings.

What it does, in order:

  1. Run a geometry calculation (Zeo++) to assess the accessible probe-occupiable pore volume and the needed blocking spheres.

  2. Stop if the structure is non-porous, i.e., not permeable to the molecule.

  3. Get the parameters of the force field using the FFBuilder.

  4. Get the number of unit cell replicas needed to have correct periodic boundary conditions at the given cutoff.

  5. Compute the adsorption at zero loading (i.e., the Henry coefficient, kH) from a Widom insertion calculation using Raspa.

  6. Stop if the kH is not greater than a user-defined threshold: this can be used for screening purposes, or to intentionally compute only the kH using this work chain.

  7. Given a min/max range, propose a list of pressures that sample the isotherm uniformly. Alternatively, the user can specify an explicit list of pressures and skip this automatic selection.

  8. Compute the isotherm using Grand Canonical Monte Carlo (GCMC) sampling in series, and restarting each system from the previous one for a short and efficient equilibration.
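
Step 4 can be illustrated with the usual minimum-image criterion, where every perpendicular width of the simulation box must exceed twice the cutoff. The following is a sketch under that assumption, not the plugin's implementation; the helper names and the cell are hypothetical.

```python
import math

def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Euclidean norm of a 3D vector."""
    return math.sqrt(sum(x * x for x in u))

def replicas_for_cutoff(cell, cutoff):
    """Number of replicas along a, b, c so that each perpendicular width covers 2 * cutoff."""
    a, b, c = cell
    volume = abs(sum(x * y for x, y in zip(a, cross(b, c))))
    # Perpendicular width along each cell vector: volume / area of the opposite face.
    widths = [volume / norm(cross(b, c)),
              volume / norm(cross(c, a)),
              volume / norm(cross(a, b))]
    return tuple(int(math.ceil(2.0 * cutoff / w)) for w in widths)

# Hypothetical orthorhombic cell (Angstrom), with the 12.0 Angstrom cutoff used as default.
cell = [(10.0, 0.0, 0.0), (0.0, 25.0, 0.0), (0.0, 0.0, 30.0)]
nrep = replicas_for_cutoff(cell, cutoff=12.0)
```

For a non-orthogonal cell the perpendicular widths are shorter than the cell lengths, which is why the sketch uses face areas rather than the lengths of a, b and c.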

What it cannot do:

  1. Compute isotherms at different temperatures (see IsothermMultiTemp work chain for this).

  2. Compute multi-component isotherms: it would considerably complicate the input, output and logic, and for a real case it is not trivial to assign the mixture composition of the bulk gas at different pressures.

  3. Fine-tune Monte Carlo probabilities and other advanced settings in Raspa.

  4. Sample the isotherm uniformly in case of “type II” isotherms, i.e., those with significant cooperative insertion, as for water.

  5. Run the different pressures in parallel: this would be less efficient because you cannot restart from the previous configuration, and not necessarily much faster, since equilibrating the highest-pressure calculation would be the bottleneck anyway.


Workchain that computes volpo and blocking spheres: if accessible volpo>0 it also runs a raspa widom calculation for the Henry coefficient.


  metadata, Namespace
    metadata, Namespace
    raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • structure, CifData, required – Adsorbent framework CIF.
  zeopp, Namespace
    Namespace Ports
    • code, Code, required – The Code to use for this job.
  • block, SinglefileData, optional – Blocked pockets output file.
  • output_parameters, Dict, required – Results of the single temperature wc: keys can vary depending on is_porous and is_kh_enough booleans.


setup(Initialize the parameters)
run_zeopp(Perform Zeo++ block and VOLPO calculations.)
    run_raspa_widom(Run a Widom calculation in Raspa.)
        init_raspa_gcmc(Choose the pressures we want to sample, report some details, and update settings for GCMC)
            run_raspa_gcmc(Run a GCMC calculation in Raspa @ T,P.)
return_output_parameters(Merge all the parameters into output_parameters, depending on is_porous and is_kh_enough.)

Inputs details

  • structure (CifData) is the framework with partial charges (provided as _atom_site_charge column in the CIF file)

  • molecule can be provided either as a Str or as a Dict. It contains information about the molecule force field and the approximate spherical-probe radius for the geometry calculation. If provided as a string (e.g., co2, n2), the work chain looks up the corresponding dictionary in isotherm_data/isotherm_molecules.yaml. The input dictionary reads, for example:

      name: CO2          # Raspa's MoleculeName
      forcefield: TraPPE # Raspa's MoleculeDefinition
      molsatdens: 21.2   # Density of the liquid phase of the molecule in (mol/l). Typically I run a simulation at 300K/200bar
      proberad: 1.525    # radius used for computing VOLPO and Block (Angs). Typically FF's sigma/2
      singlebead: False  # if true: RotationProbability=0
      charged: True      # if true: ChargeMethod=Ewald
  • parameters (Dict) modifies the default parameters:

    parameters = {
      "ff_framework": "UFF",  # (str) Forcefield of the structure.
      "ff_separate_interactions": False,  # (bool) Use "separate_interactions" in the FF builder.
      "ff_mixing_rule": "Lorentz-Berthelot",  # (str) Choose 'Lorentz-Berthelot' or 'Jorgensen'.
      "ff_tail_corrections": True,  # (bool) Apply tail corrections.
      "ff_shifted": False,  # (bool) Shift or truncate the potential at cutoff.
      "ff_cutoff": 12.0,  # (float) CutOff truncation for the VdW interactions (Angstrom).
      "temperature": 300,  # (float) Temperature of the simulation.
      "temperature_list": None,  # (list) To be used by IsothermMultiTempWorkChain.
      "zeopp_volpo_samples": int(1e5),  # (int) Number of samples for VOLPO calculation (per UC volume).
      "zeopp_block_samples": int(100),  # (int) Number of samples for BLOCK calculation (per A^3).
      "raspa_minKh": 1e-10,  # (float) If Henry coefficient < raspa_minKh do not run the isotherm (mol/kg/Pa).
      "raspa_verbosity": 10,  # (int) Print stats every: number of cycles / raspa_verbosity.
      "raspa_widom_cycles": int(1e5),  # (int) Number of Widom cycles.
      "raspa_gcmc_init_cycles": int(1e3),  # (int) Number of GCMC initialization cycles.
      "raspa_gcmc_prod_cycles": int(1e4),  # (int) Number of GCMC production cycles.
      "pressure_list": None,  # (list) Pressure list for the isotherm (bar): if given, the automatic guess is skipped.
      "pressure_precision": 0.1,  # (float) Precision in the sampling of the isotherm: 0.1 ok, 0.05 for high resolution.
      "pressure_maxstep": 5,  # (float) Max distance between pressure points (bar).
      "pressure_min": 0.001,  # (float) Lower pressure to sample (bar).
      "pressure_max": 10  # (float) Upper pressure to sample (bar).
    }

Note that if the pressure_list value is provided, the other pressure inputs are neglected and the automatic pressure selection of the work chain is skipped.
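
One way to obtain pressure points that sample the loading roughly uniformly is to assume a Langmuir-shaped isotherm built from the Henry coefficient and the estimated saturation loading. The sketch below follows that assumption and is not necessarily what the work chain does internally; the function name and the kH and saturation values are hypothetical, while the other arguments mirror the pressure_* parameters above.

```python
def propose_pressures(kh, qsat, p_min, p_max, precision, max_step):
    """Propose pressures (bar) giving roughly constant loading increments.

    kh is in mol/kg/Pa and qsat in mol/kg. For a Langmuir isotherm
    q = qsat * b * p / (1 + b * p), a loading increment of precision * qsat
    corresponds to a pressure step of precision * (1 + b * p)**2 / b.
    """
    b = kh / qsat * 1e5  # Langmuir constant converted from 1/Pa to 1/bar
    pressures = [p_min]
    while True:
        p_old = pressures[-1]
        step = min(max_step, precision * (b * p_old**2 + 2 * p_old + 1 / b))
        p_new = p_old + step
        if p_new >= p_max:
            pressures.append(p_max)  # always close the list at the upper bound
            return pressures
        pressures.append(p_new)

# Hypothetical kH and saturation loading, with the default pressure settings.
points = propose_pressures(kh=1e-4, qsat=10.0, p_min=0.001, p_max=10,
                           precision=0.1, max_step=5)
```

The steps are small at low pressure, where a Langmuir isotherm is steep, and grow towards saturation until they are capped by max_step.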

  • geometric is not meant to be used by the user, but by the IsothermMultiTemp work chains.

Outputs details

  • output_parameters (Dict), whose content depends on whether is_porous is True (if not, only geometric outputs are reported in the dictionary) and on whether is_kh_enough is True (if False, only the output of the Widom calculation is reported; otherwise the isotherm data are reported as well). This is an example of a full isotherm with is_porous=True and is_kh_enough=True, for 6 pressure points at 298K:

    {
        "Density": 0.385817,
        "Density_unit": "g/cm^3",
        "Estimated_saturation_loading": 51.586704,
        "Estimated_saturation_loading_unit": "mol/kg",
        "Input_block": [...],
        "Input_ha": "DEF",
        "Input_structure_filename": "19366N2.cif",
        "Input_volpo": [...],
        "Number_of_blocking_spheres": 0,
        "POAV_A^3": 8626.94,
        "POAV_A^3_unit": "A^3",
        "POAV_Volume_fraction": 0.73173,
        "POAV_Volume_fraction_unit": null,
        "POAV_cm^3/g": 1.89657,
        "POAV_cm^3/g_unit": "cm^3/g",
        "PONAV_A^3": 0.0,
        "PONAV_A^3_unit": "A^3",
        "PONAV_Volume_fraction": 0.0,
        "PONAV_Volume_fraction_unit": null,
        "PONAV_cm^3/g": 0.0,
        "PONAV_cm^3/g_unit": "cm^3/g",
        "Unitcell_volume": 11789.8,
        "Unitcell_volume_unit": "A^3",
        "adsorption_energy_widom_average": -9.7886451805,
        "adsorption_energy_widom_dev": 0.0204010566,
        "adsorption_energy_widom_unit": "kJ/mol",
        "conversion_factor_molec_uc_to_cm3stp_cm3": 3.1569089445,
        "conversion_factor_molec_uc_to_gr_gr": 5.8556741651,
        "conversion_factor_molec_uc_to_mol_kg": 0.3650669679,
        "henry_coefficient_average": 6.72787e-06,
        "henry_coefficient_dev": 3.94078e-08,
        "henry_coefficient_unit": "mol/kg/Pa",
        "is_kh_enough": true,
        "is_porous": true,
        "isotherm": {
            "enthalpy_of_adsorption_average": [...],
            "enthalpy_of_adsorption_dev": [...],
            "enthalpy_of_adsorption_unit": "kJ/mol",
            "loading_absolute_average": [...],
            "loading_absolute_dev": [...],
            "loading_absolute_unit": "mol/kg",
            "pressure": [...],
            "pressure_unit": "bar"
        },
        "temperature": 298,
        "temperature_unit": "K"
    }

  • block (SinglefileData) file is outputted if blocking spheres are found and used for the isotherm. Therefore, this is ready to be used for a new, consistent, Raspa calculation.
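
Since the keys of output_parameters depend on is_porous and is_kh_enough, code that consumes this dictionary should check both flags first. A minimal sketch, using only key names from the example above; the function name and the numerical values are hypothetical. In AiiDA the plain dictionary would typically be obtained with get_dict() on the output Dict node.

```python
def summarize_isotherm(out):
    """Return a short text summary of an Isotherm output_parameters dictionary."""
    if not out["is_porous"]:
        return "non-porous: only geometric outputs are available"
    if not out["is_kh_enough"]:
        kh = out["henry_coefficient_average"]
        return f"kH = {kh:.3e} {out['henry_coefficient_unit']}: below threshold, no isotherm"
    iso = out["isotherm"]
    lines = [
        f"{p:g} {iso['pressure_unit']}: {q:g} {iso['loading_absolute_unit']}"
        for p, q in zip(iso["pressure"], iso["loading_absolute_average"])
    ]
    return "\n".join(lines)

# Hypothetical, heavily shortened output dictionary, for illustration only.
example = {
    "is_porous": True,
    "is_kh_enough": True,
    "isotherm": {
        "pressure": [0.1, 1.0],
        "pressure_unit": "bar",
        "loading_absolute_average": [0.5, 2.5],
        "loading_absolute_unit": "mol/kg",
    },
}
summary = summarize_isotherm(example)
```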

IsothermMultiTemp work chain

The IsothermMultiTempWorkChain() work chain runs the Isotherm work chain at different temperatures in parallel. Since the initial geometry calculation that provides the pore volume and blocking spheres does not depend on the temperature, it is run only once. Inputs and outputs are very similar to those of the Isotherm work chain.

What it can do:

  1. Compute the kH at every temperature and guess, for each temperature, the pressure points needed for a uniform sampling of the isotherm.

What it cannot do:

  1. Select specific pressure points (as pressure_list) that are different at different temperatures.

  2. Run an isobar curve (same pressure, different temperatures) restarting each GCMC calculation from the previous system.


Run IsothermWorkChain for multiple temperatures: first compute geometric properties and then submit Widom+GCMC at different temperatures in parallel


  metadata, Namespace
  raspa_base, Namespace
    metadata, Namespace
    raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • structure, CifData, required – Adsorbent framework CIF.
  zeopp, Namespace
    Namespace Ports
    • code, Code, required – The Code to use for this job.
  • block, SinglefileData, optional – Blocked pockets output file.
  • output_parameters, Dict, required – Results of isotherms run at different temperatures.


run_geometric(Perform Zeo++ block and VOLPO calculation with IsothermWC.)
    run_isotherms(Compute isotherms at different temperatures.)
collect_isotherms(Collect all the results in one Dict)

Inputs details

  • parameters (Dict), compared to the input of the Isotherm work chain, contains the key temperature_list and neglects the key temperature:

    "temperature_list": [278, 298.15, 318.0],

Outputs details

  • output_parameters (Dict) contains the temperature and isotherm as lists. In this example 3 pressure points are computed at 77K, 198K and 298K:

    {
        "Density": 0.731022,
        "Density_unit": "g/cm^3",
        "Estimated_saturation_loading": 22.1095656,
        "Estimated_saturation_loading_unit": "mol/kg",
        "Input_block": [...],
        "Input_ha": "DEF",
        "Input_structure_filename": "tmpQD_OdI.cif",
        "Input_volpo": [...],
        "Number_of_blocking_spheres": 0,
        "POAV_A^3": 1579.69,
        "POAV_A^3_unit": "A^3",
        "POAV_Volume_fraction": 0.45657,
        "POAV_Volume_fraction_unit": null,
        "POAV_cm^3/g": 0.624564,
        "POAV_cm^3/g_unit": "cm^3/g",
        "PONAV_A^3": 0.0,
        "PONAV_A^3_unit": "A^3",
        "PONAV_Volume_fraction": 0.0,
        "PONAV_Volume_fraction_unit": null,
        "PONAV_cm^3/g": 0.0,
        "PONAV_cm^3/g_unit": "cm^3/g",
        "Unitcell_volume": 3459.91,
        "Unitcell_volume_unit": "A^3",
        "adsorption_energy_widom_average": [...],
        "adsorption_energy_widom_dev": [...],
        "adsorption_energy_widom_unit": "kJ/mol",
        "conversion_factor_molec_uc_to_cm3stp_cm3": 10.757306634,
        "conversion_factor_molec_uc_to_gr_gr": 1.3130795208,
        "conversion_factor_molec_uc_to_mol_kg": 0.6565397604,
        "henry_coefficient_average": [...],
        "henry_coefficient_dev": [...],
        "henry_coefficient_unit": "mol/kg/Pa",
        "is_kh_enough": [...],
        "is_porous": true,
        "isotherm": [
            {
                "enthalpy_of_adsorption_average": [...],
                "enthalpy_of_adsorption_dev": [...],
                "enthalpy_of_adsorption_unit": "kJ/mol",
                "loading_absolute_average": [...],
                "loading_absolute_dev": [...],
                "loading_absolute_unit": "mol/kg",
                "pressure": [...],
                "pressure_unit": "bar"
            },
            {
                "enthalpy_of_adsorption_average": [...],
                "enthalpy_of_adsorption_dev": [...],
                "enthalpy_of_adsorption_unit": "kJ/mol",
                "loading_absolute_average": [...],
                "loading_absolute_dev": [...],
                "loading_absolute_unit": "mol/kg",
                "pressure": [...],
                "pressure_unit": "bar"
            },
            {
                "enthalpy_of_adsorption_average": [...],
                "enthalpy_of_adsorption_dev": [...],
                "enthalpy_of_adsorption_unit": "kJ/mol",
                "loading_absolute_average": [...],
                "loading_absolute_dev": [...],
                "loading_absolute_unit": "mol/kg",
                "pressure": [...],
                "pressure_unit": "bar"
            }
        ],
        "temperature": [...],
        "temperature_unit": "K"
    }

IsothermCalcPE work chain

The IsothermCalcPEWorkChain() work chain takes as an input a structure with partial charges, computes the isotherms for CO2 and N2 at ambient temperature and models the process of carbon capture and compression for geological sequestration. The final outcome informs about the performance of the adsorbent for this application, including the CO2 parasitic energy, i.e., the energy that is required to separate and compress one kilogram of CO2, using that material. Default input mixture is coal post-combustion flue gas, but also natural gas post-combustion and air mixtures are available.


Compute CO2 parasitic energy (PE) after running IsothermWorkChain for CO2 and N2 at 300K.


  metadata, Namespace
  raspa_base, Namespace
    metadata, Namespace
    raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • structure, CifData, required – Adsorbent framework CIF.
  zeopp, Namespace
    Namespace Ports
    • code, Code, required – The Code to use for this job.
  • co2, Namespace
    Namespace Ports
    • block, SinglefileData, optional – Blocked pockets output file.
    • output_parameters, Dict, required – Results of the single temperature wc: keys can vary depending on is_porous and is_kh_enough booleans.
  • n2, Namespace
    Namespace Ports
    • block, SinglefileData, optional – Blocked pockets output file.
    • output_parameters, Dict, required – Results of the single temperature wc: keys can vary depending on is_porous and is_kh_enough booleans.
  • output_parameters, Dict, required – Output parameters of a calc_PE calculation


run_isotherms(Run Isotherm work chain for CO2 and N2.)
run_calcpe(Expose isotherm outputs, prepare calc_pe, run it and return the output.)

Multistage work chain

The Cp2kMultistageWorkChain() work chain is meant to automate DFT optimizations in CP2K and guess some good parameters for the simulation, but it is written in such a versatile fashion that it can be used for many other purposes.

What it can do:

  1. Given a protocol YAML with different settings, the work chain iterates until the SCF calculation converges. The concept is to use general options for settings_0 and increasingly robust ones for the following settings.

  2. The protocol YAML also contains a number of stages, i.e., different MOTION settings, that are executed one after the other, each restarting from the previous calculation. During the first stage, stage_0, different settings are tested until the SCF converges at the last step of stage_0. If this does not happen, the work chain stops. Otherwise it continues running stage_1 and all the other stages included in the protocol.

  3. These stages can be used to run a robust cell optimization, i.e., first some MD steps to escape metastable geometries and then the final optimization, or to run ab-initio MD, first equilibrating the system with a shorter time constant for the thermostat and then collecting statistics in the second stage.

  4. Some default protocols are provided in workchains/multistage_protocols and can be imported with simple tags such as test, default, robust_conv. Alternatively, users can take inspiration from these to write their own protocol and pass it to the work chain.

  5. Compute the band gap.

  6. You can restart from a previous calculation, e.g., from an already computed wavefunction.
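
The settings/stage logic of points 1 and 2 can be summarized by the following sketch. It is a simplified illustration and not the work chain's code: run_stage stands for any callable that runs one CP2K calculation and reports whether it is acceptable, and all names here are hypothetical.

```python
def ordered_keys(protocol, prefix):
    """Return the keys 'prefix0', 'prefix1', ... sorted by their integer index."""
    keys = [k for k in protocol if k.startswith(prefix)]
    return sorted(keys, key=lambda k: int(k[len(prefix):]))

def run_multistage(protocol, run_stage):
    """Try each settings_x on stage_0, then run the remaining stages with the first that works."""
    settings = ordered_keys(protocol, "settings_")
    stages = ordered_keys(protocol, "stage_")
    chosen = None
    for key in settings:
        if run_stage(stages[0], key):  # True if the SCF of stage_0 converged
            chosen = key
            break
    if chosen is None:
        return None, []  # no settings worked: stop
    done = [stages[0]]
    for stage in stages[1:]:
        run_stage(stage, chosen)  # same settings, next MOTION section
        done.append(stage)
    return chosen, done

# Hypothetical protocol and runner: settings_0 fails, settings_1 works.
protocol = {"settings_0": {}, "settings_1": {}, "stage_0": {}, "stage_1": {}}

def fake_run(stage, settings):
    return settings != "settings_0"

chosen, done = run_multistage(protocol, fake_run)
```

In the real work chain each stage also restarts from the geometry and wavefunction of the previous one, which this sketch leaves out.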

What it cannot do:

  1. Run CP2K calculations with k-points.

  2. Run CP2K advanced calculations, e.g., other than ENERGY, GEO_OPT, CELL_OPT and MD.


Submits Cp2kBase workchains for ENERGY, GEO_OPT, CELL_OPT and MD jobs iteratively. The protocol_yaml file contains a series of settings_x and stage_x: the workchain starts running the settings_0/stage_0 calculation and, in case of a failure, changes the settings until the SCF of stage_0 converges. Then it uses the same settings to run the next stages (i.e., stage_1, etc.).


    cp2k, Namespace
      metadata, Namespace
        options, Namespace
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
  • min_cell_size, Float, optional – To avoid using k-points, extend the cell so that min(perp_width)>min_cell_size
  • parent_calc_folder, RemoteData, optional – Provide an initial parent folder that contains the wavefunction for restart
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol to be read from {tag}.yaml unless protocol_yaml input is specified
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file with the multistage settings (and ignore protocol_tag)
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, StructureData, optional – Input structure


  • last_input_parameters, Dict, optional – CP2K input parameters (possibly working) used in the last stage
  • output_parameters, Dict, optional – Output CP2K parameters of all the stages, merged together
  • output_structure, StructureData, optional – Processed structure (missing if only ENERGY calculation is performed)
  • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.


setup_multistage(Setup initial parameters.)
    run_stage(Check for restart, prepare input, submit and direct output to context.)
    inspect_and_update_settings_stage0(Inspect the stage0/settings_{idx} calculation and check if it is needed to update the settings and resubmit the calculation.)
inspect_and_update_stage(Update geometry, parent folder and the new &MOTION settings.)
    run_stage(Check for restart, prepare input, submit and direct output to context.)
    inspect_and_update_stage(Update geometry, parent folder and the new &MOTION settings.)
results(Gather final outputs of the workchain.)

Inputs details

  • structure (StructureData, NOTE this is not a CifData) is the system to investigate. It can be also a molecule in a box and not necessarily a 2D/3D framework.

  • protocol_tag (Str) calls a default protocol. Currently available:


    • default: Main choice, uses PBE-D3(BJ) with 600Ry/DZVP basis set and GTH pseudopotentials. The first settings use OT; if they do not work, it switches to diagonalization and smearing. The stages are a cell optimization, a short NPT MD and a second cell optimization.

    • test: Quick protocol for testing purposes.

    • robust_conv: Similar to default but using more robust and more expensive settings for the SCF convergence.


    • singlepoint: Same settings as default but running only one stage for a single point calculation. It is used to exploit the automation of this work chain for a simple energy calculation.

  • protocol_yaml (SinglefileData) is used to specify a custom protocol through a YAML file. See the default YAML file as an example. Note that the dictionary needs to contain the following keys:


A user-friendly description of the protocol.


Dictionary of KIND/MAGNETIZATION for each element.


Dictionary of KIND/BASIS_SET for each element.


Dictionary of KIND/POTENTIAL for each element.


Any stage_0 calculation using OT and yielding a band gap below this threshold will be considered a failure.

  • settings_0

  • settings_1

Settings updated in stage_0 until the SCF converges.

  • stage_0

  • stage_1

CP2K settings that are updated at every stage.

Other keys may be added in the future to introduce new functionalities to the Multistage work chain.

  • starting_settings_idx (Int) is used to start from a custom index of the settings. If, for example, you know that the material is conductive and needs smearing, you can use Int(1) to switch directly to settings_1, which applies electron smearing: this is the case for the default protocol.

  • min_cell_size (Float) is used to extend the unit cell, so that the minimum perpendicular width of the cell is larger than the specified value. This is needed when a cell length is too short and the plane wave auxiliary basis set is not accurate enough at the Gamma point only. It may also be needed for hybrid range-separated potentials that require a sufficient non-overlapping cutoff.


Need to explain it further in Technicalities.

  • parent_calc_folder (RemoteData) is used to restart from a previously computed wave function.

  • cp2k_base.cp2k.parameters (Dict) can be used to specify some cp2k parameters that will be always overwritten just before submitting every calculation.

Outputs details

  • output_structure (StructureData) is the final structure at the end of the last stage. It is not outputted in case of a single point calculation, since it does not update the geometry of the system.

  • output_parameters (Dict): here is an example for Aluminum, where the settings_0 calculation is discarded because of a negative band gap; the work chain therefore switches to settings_1, which makes the SCF converge and is used for 2 stages:

    {
        "cell_resized": "1x1x1",
        "dft_type": "RKS",
        "final_bandgap_spin1_au": 6.1299999999931e-06,
        "final_bandgap_spin2_au": 6.1299999999931e-06,
        "last_tag": "stage_1_settings_1_valid",
        "natoms": 4,
        "nsettings_discarded": 1,
        "nstages_valid": 2,
        "stage_info": {
            "bandgap_spin1_au": [...],
            "bandgap_spin2_au": [...],
            "final_edens_rspace": [...],
            "nsteps": [...],
            "opt_converged": [...]
        },
        "step_info": {
            "cell_a_angs": [...],
            "cell_alp_deg": [...],
            "cell_b_angs": [...],
            "cell_bet_deg": [...],
            "cell_c_angs": [...],
            "cell_gam_deg": [...],
            "cell_vol_angs3": [...],
            "dispersion_energy_au": [...],
            "energy_au": [...],
            "max_grad_au": [...],
            "max_step_au": [...],
            "pressure_bar": [...],
            "rms_grad_au": [...],
            "rms_step_au": [...],
            "scf_converged": [...],
            "step": [...]
        }
    }

  • last_input_parameters (Dict) reports the inputs that were used for the last CP2K calculation. They are possibly the ones that make the SCF converge, so the user can inspect them and use them for other direct CP2K calculations in AiiDA.
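
The band gaps in output_parameters are reported in atomic units, while the work chain report prints them in eV. A minimal conversion sketch, assuming that "au" means Hartree as is usual for CP2K energies; the function name is hypothetical and the values are taken from the Aluminum example above.

```python
HARTREE_TO_EV = 27.211386  # 1 Hartree in eV, rounded

def bandgap_ev(output_parameters, spin=1):
    """Convert the final band gap of the given spin channel from Hartree to eV."""
    return output_parameters[f"final_bandgap_spin{spin}_au"] * HARTREE_TO_EV

out = {
    "final_bandgap_spin1_au": 6.1299999999931e-06,
    "final_bandgap_spin2_au": 6.1299999999931e-06,
}
gap = bandgap_ev(out)
print(f"{gap:.3f} eV")  # prints "0.000 eV"
```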


See the examples provided with the plugin. The report provides very useful insight into what happened during the run. Here is the example for Aluminum:

2019-11-22 16:54:52 [90962 | REPORT]: [266248|Cp2kMultistageWorkChain|setup_multistage]: Unit cell was NOT resized
2019-11-22 16:54:52 [90963 | REPORT]: [266248|Cp2kMultistageWorkChain|run_stage]: submitted Cp2kBaseWorkChain for stage_0/settings_0
2019-11-22 16:54:52 [90964 | REPORT]:   [266252|Cp2kBaseWorkChain|run_calculation]: launching Cp2kCalculation<266253> iteration #1
2019-11-22 16:55:13 [90965 | REPORT]:   [266252|Cp2kBaseWorkChain|inspect_calculation]: Cp2kCalculation<266253> completed successfully
2019-11-22 16:55:13 [90966 | REPORT]:   [266252|Cp2kBaseWorkChain|results]: work chain completed after 1 iterations
2019-11-22 16:55:14 [90967 | REPORT]:   [266252|Cp2kBaseWorkChain|on_terminated]: remote folders will not be cleaned
2019-11-22 16:55:14 [90968 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_settings_stage0]: Bandgaps spin1/spin2: -0.058 and -0.058 ev
2019-11-22 16:55:14 [90969 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_settings_stage0]: BAD SETTINGS: band gap is < 0.100 eV
2019-11-22 16:55:14 [90970 | REPORT]: [266248|Cp2kMultistageWorkChain|run_stage]: submitted Cp2kBaseWorkChain for stage_0/settings_1
2019-11-22 16:55:15 [90971 | REPORT]:   [266259|Cp2kBaseWorkChain|run_calculation]: launching Cp2kCalculation<266260> iteration #1
2019-11-22 16:55:34 [90972 | REPORT]:   [266259|Cp2kBaseWorkChain|inspect_calculation]: Cp2kCalculation<266260> completed successfully
2019-11-22 16:55:34 [90973 | REPORT]:   [266259|Cp2kBaseWorkChain|results]: work chain completed after 1 iterations
2019-11-22 16:55:34 [90974 | REPORT]:   [266259|Cp2kBaseWorkChain|on_terminated]: remote folders will not be cleaned
2019-11-22 16:55:35 [90975 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_settings_stage0]: Bandgaps spin1/spin2: 0.000 and 0.000 ev
2019-11-22 16:55:35 [90976 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_stage]: Structure updated for next stage
2019-11-22 16:55:35 [90977 | REPORT]: [266248|Cp2kMultistageWorkChain|run_stage]: submitted Cp2kBaseWorkChain for stage_1/settings_1
2019-11-22 16:55:35 [90978 | REPORT]:   [266266|Cp2kBaseWorkChain|run_calculation]: launching Cp2kCalculation<266267> iteration #1
2019-11-22 16:55:53 [90979 | REPORT]:   [266266|Cp2kBaseWorkChain|inspect_calculation]: Cp2kCalculation<266267> completed successfully
2019-11-22 16:55:53 [90980 | REPORT]:   [266266|Cp2kBaseWorkChain|results]: work chain completed after 1 iterations
2019-11-22 16:55:54 [90981 | REPORT]:   [266266|Cp2kBaseWorkChain|on_terminated]: remote folders will not be cleaned
2019-11-22 16:55:54 [90982 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_stage]: Structure updated for next stage
2019-11-22 16:55:54 [90983 | REPORT]: [266248|Cp2kMultistageWorkChain|inspect_and_update_stage]: All stages computed, finishing...
2019-11-22 16:55:55 [90984 | REPORT]: [266248|Cp2kMultistageWorkChain|results]: Outputs: Dict<266273> and StructureData<266271>
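
The report lines above follow a regular layout (timestamp, log id, level, then process pk, process class and step name). As a minimal sketch, assuming only that layout, they can be split into fields with a regular expression, for instance to list the entries where settings were flagged as bad. The function and group names below are illustrative and not part of aiida-lsmo.

```python
import re

# Pattern inferred from the report lines shown above (an assumption, not an AiiDA API).
REPORT_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<log_id>\d+) \| (?P<level>\w+)\]:\s+"
    r"\[(?P<pk>\d+)\|(?P<process>\w+)\|(?P<step>\w+)\]: "
    r"(?P<message>.*)$"
)


def parse_report_line(line):
    """Split one report line into its fields, or return None if it does not match."""
    match = REPORT_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pk"] = int(fields["pk"])
    fields["log_id"] = int(fields["log_id"])
    return fields


def flagged_bad_settings(lines):
    """Return the parsed entries whose message starts with 'BAD SETTINGS'."""
    parsed = (parse_report_line(line) for line in lines)
    return [f for f in parsed if f and f["message"].startswith("BAD SETTINGS")]
```

Nested entries (those indented after the level marker) are matched by the same pattern, since the whitespace before the process block is variable.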

Cp2kMultistageDdec work chain

The Cp2kMultistageDdecWorkChain() work chain combines the CP2K Multistage work chain and the DDEC calculation, in order to optimize the geometry of a structure and compute its partial charges using the DDEC protocol.


A workchain that combines: Cp2kMultistageWorkChain + Cp2kDdecWorkChain


  cp2k_base, Namespace
    cp2k, Namespace
      metadata, Namespace
        options, Namespace
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
  ddec, Namespace
    Namespace Ports
    • charge_density_folder, RemoteData, optional – Use a remote folder (for restarts and similar)
    • code, Code, required – The Code to use for this job.
    metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
      • description, str, optional, non_db – Description to set on the process node.
      • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
      • label, str, optional, non_db – Label to set on the process node.
      options, Namespace
        Namespace Ports
        • account, str, optional, non_db – Set the account to use for the queue on the remote computer
        • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
        • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
        • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
        • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
        • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
        • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
        • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
        • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
        • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
        • parser_name, (str), optional, non_db
        • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
        • priority, str, optional, non_db – Set the priority of the job to be queued
        • qos, str, optional, non_db – Set the quality of service to use for the queue on the remote computer
        • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
        • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
        • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
        • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
        • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
        • withmpi, bool, optional, non_db
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • parameters, Dict, required – Input parameters such as net charge, protocol, atomic densities path, …
  • min_cell_size, Float, optional – To avoid using k-points, extend the cell so that min(perp_width)>min_cell_size
  • parent_calc_folder, RemoteData, optional – Provide an initial parent folder that contains the wavefunction for restart
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol to be read from {tag}.yaml unless protocol_yaml input is specified
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file with the multistage settings (and ignore protocol_tag)
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, StructureData, optional – Input structure


  • last_input_parameters, Dict, optional – CP2K input parameters used (and possibly working) in the last stage
  • output_parameters, Dict, optional – Output CP2K parameters of all the stages, merged together
  • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.
  • structure_ddec, CifData, required – structure with DDEC charges


run_cp2kmultistage(Run CP2K-Multistage)
run_cp2kddec(Pass the Cp2kMultistageWorkChain outputs as inputs for Cp2kDdecWorkChain: cp2k_base (metadata), cp2k_params, structure and WFN.)
return_results(Return exposed outputs and print the pk of the CifData w/DDEC)
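
Only a few ports in the listing above are marked as required (ddec.code, ddec.parameters and ddec.metadata.options.resources). The following is a minimal sketch, in plain Python and without AiiDA, of how a nested inputs dictionary could be checked against those ports before submission. The helper name and the placeholder values are hypothetical, and the listing may not show every port of the work chain.

```python
# Required ports as marked in the listing above, written as dotted paths.
REQUIRED_PORTS = [
    "ddec.code",
    "ddec.parameters",
    "ddec.metadata.options.resources",
]


def missing_required(inputs, required=REQUIRED_PORTS):
    """Return the dotted paths from `required` that are absent in the nested dict `inputs`."""
    missing = []
    for path in required:
        node = inputs
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing


# Placeholder values only: real inputs would be AiiDA nodes (Code, Dict, ...).
example_inputs = {
    "ddec": {
        "code": "placeholder-ddec-code",
        "parameters": {},
        "metadata": {"options": {"resources": {"num_machines": 1}}},
    }
}
print(missing_required(example_inputs))
```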

ZeoppMultistageDdec work chain

The ZeoppMultistageDdecWorkChain() work chain is similar to Cp2kMultistageDdec, but it runs a geometry characterization of the structure using Zeo++ (NetworkCalculation) before and after, in order to assess the structural changes due to the cell/geometry optimization.


A workchain that combines: Zeopp + Cp2kMultistageWorkChain + Cp2kDdecWorkChain + Zeopp


  cp2k_base, Namespace
    cp2k, Namespace
      metadata, Namespace
        options, Namespace
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
  ddec, Namespace
    Namespace Ports
    • charge_density_folder, RemoteData, optional – Use a remote folder (for restarts and similar)
    • code, Code, required – The Code to use for this job.
    metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
      • description, str, optional, non_db – Description to set on the process node.
      • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
      • label, str, optional, non_db – Label to set on the process node.
      options, Namespace
        Namespace Ports
        • account, str, optional, non_db – Set the account to use for the queue on the remote computer
        • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
        • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
        • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
        • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
        • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
        • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
        • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
        • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
        • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
        • parser_name, (str), optional, non_db
        • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
        • priority, str, optional, non_db – Set the priority of the job to be queued
        • qos, str, optional, non_db – Set the quality of service to use for the queue on the remote computer
        • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
        • resources, dict, required, non_db – Set the dictionary of resources to be used by the scheduler plugin, like the number of nodes, cpus etc. This dictionary is scheduler-plugin dependent. Look at the documentation of the scheduler for more details.
        • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
        • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
        • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
        • withmpi, bool, optional, non_db
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • parameters, Dict, required – Input parameters such as net charge, protocol, atomic densities path, …
  • min_cell_size, Float, optional – To avoid using k-points, extend the cell so that min(perp_width)>min_cell_size
  • parent_calc_folder, RemoteData, optional – Provide an initial parent folder that contains the wavefunction for restart
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol to be read from {tag}.yaml unless protocol_yaml input is specified
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file with the multistage settings (and ignore protocol_tag)
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, CifData, required – input structure
  zeopp, Namespace
    Namespace Ports
    • atomic_radii, SinglefileData, optional – atomic radii file
    • code, Code, required – The Code to use for this job.
    metadata, Namespace
      Namespace Ports
      • call_link_label, str, optional, non_db – The label to use for the CALL link if the process is called by another process.
      • computer, Computer, optional, non_db – When using a “local” code, set the computer on which the calculation should be run.
      • description, str, optional, non_db – Description to set on the process node.
      • dry_run, bool, optional, non_db – When set to True will prepare the calculation job for submission but not actually launch it.
      • label, str, optional, non_db – Label to set on the process node.
      options, Namespace
        Namespace Ports
        • account, str, optional, non_db – Set the account to use for the queue on the remote computer
        • append_text, str, optional, non_db – Set the calculation-specific append text, which is going to be appended in the scheduler-job script, just after the code execution
        • custom_scheduler_commands, str, optional, non_db – Set a (possibly multiline) string with the commands that the user wants to manually set for the scheduler. The difference of this option with respect to the prepend_text is the position in the scheduler submission file where such text is inserted: with this option, the string is inserted before any non-scheduler command
        • environment_variables, dict, optional, non_db – Set a dictionary of custom environment variables for this calculation
        • import_sys_environment, bool, optional, non_db – If set to true, the submission script will load the system environment variables
        • input_filename, str, optional, non_db – Filename to which the input for the code that is to be run is written.
        • max_memory_kb, int, optional, non_db – Set the maximum memory (in KiloBytes) to be asked to the scheduler
        • max_wallclock_seconds, int, optional, non_db – Set the wallclock in seconds asked to the scheduler
        • mpirun_extra_params, (list, tuple), optional, non_db – Set the extra params to pass to the mpirun (or equivalent) command after the one provided in computer.mpirun_command. Example: mpirun -np 8 extra_params[0] extra_params[1] … exec.x
        • output_filename, str, optional, non_db – Filename to which the content of stdout of the code that is to be run is written.
        • parser_name, str, optional, non_db
        • prepend_text, str, optional, non_db – Set the calculation-specific prepend text, which is going to be prepended in the scheduler-job script, just before the code execution
        • priority, str, optional, non_db – Set the priority of the job to be queued
        • qos, str, optional, non_db – Set the quality of service to use for the queue on the remote computer
        • queue_name, str, optional, non_db – Set the name of the queue on the remote computer
        • resources, dict, optional, non_db
        • scheduler_stderr, str, optional, non_db – Filename to which the content of stderr of the scheduler is written.
        • scheduler_stdout, str, optional, non_db – Filename to which the content of stdout of the scheduler is written.
        • submit_script_filename, str, optional, non_db – Filename to which the job submission script is written.
        • withmpi, bool, optional, non_db – Set the calculation to use mpi
      • store_provenance, bool, optional, non_db – If set to False provenance will not be stored in the database.
    • parameters, NetworkParameters, optional – command line parameters for zeo++


  • last_input_parameters, Dict, optional – CP2K input parameters used (and possibly working) in the last stage
  • output_parameters, Dict, optional – Output CP2K parameters of all the stages, merged together
  • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.
  • structure_ddec, CifData, required – structure with DDEC charges
  • zeopp_after_opt, Namespace
    Namespace Ports
    • output_parameters, Dict, required – key-value pairs parsed from zeo++ output file(s).
  • zeopp_before_opt, Namespace
    Namespace Ports
    • output_parameters, Dict, required – key-value pairs parsed from zeo++ output file(s).


run_zeopp_before(Run Zeo++ for the original structure)
run_multistageddec(Run MultistageDdec work chain)
run_zeopp_after(Run Zeo++ for the optimized structure)
return_results(Return exposed outputs)
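
Since the purpose of the two Zeo++ runs is to assess the structural changes caused by the optimization, the zeopp_before_opt and zeopp_after_opt dictionaries can be compared key by key. The sketch below works on plain dictionaries (for example obtained from the two output_parameters nodes) and returns the relative change of every numeric entry. The function name and the keys in the usage example are hypothetical and are not taken from the Zeo++ parser.

```python
def relative_changes(before, after):
    """Relative change (after - before) / before for numeric keys present in both dicts.

    Non-numeric entries, booleans and entries whose initial value is zero are skipped.
    """
    changes = {}
    for key, old in before.items():
        new = after.get(key)
        if isinstance(old, bool) or isinstance(new, bool):
            continue
        if isinstance(old, (int, float)) and isinstance(new, (int, float)) and old != 0:
            changes[key] = (new - old) / old
    return changes


# Hypothetical keys and values, for illustration only.
before_opt = {"example_volume": 1000.0, "example_label": "original"}
after_opt = {"example_volume": 950.0, "example_label": "optimized"}
print(relative_changes(before_opt, after_opt))
```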

SimAnnealing work chain

The SimAnnealingWorkChain() work chain finds the minimum-energy configuration of a number of gas molecules in the pore volume of a framework. It runs several NVT simulations in RASPA at decreasing temperature to move the system towards its global minimum (simulated annealing), and finally performs a minimization to fine-tune the optimal position.


A work chain to compute the minimum energy geometry of a molecule inside a framework, using simulated annealing, i.e., decreasing the temperature of a Monte Carlo simulation and finally running an energy minimization step.


  raspa_base, Namespace
    metadata, Namespace
    raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      metadata, Namespace
        options, Namespace
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • structure, CifData, required – Adsorbent framework CIF.


  • loaded_molecule, CifData, required – CIF containing the final position of the molecule.
  • loaded_structure, CifData, required – CIF containing the loaded structure.
  • output_parameters, Dict, optional – Information about the final configuration.


setup(Initialize the parameters)
    run_raspa_nvt(Run a NVT calculation in Raspa.)
run_raspa_min(Run an Energy Minimization in Raspa.)
return_results(Return molecule position and energy info.)

Inputs details

  • parameters (Dict) modifies the default parameters:

        "ff_framework": "UFF",  # (str) Forcefield of the structure.
        "ff_separate_interactions": False,  # (bool) Use "separate_interactions" in the FF builder.
        "ff_mixing_rule": "Lorentz-Berthelot",  # (str) Choose 'Lorentz-Berthelot' or 'Jorgensen'.
        "ff_tail_corrections": True,  # (bool) Apply tail corrections.
        "ff_shifted": False,  # (bool) Shift or truncate the potential at cutoff.
        "ff_cutoff": 12.0,  # (float) CutOff truncation for the VdW interactions (Angstrom).
        "temperature_list": [300, 250, 200, 250, 100, 50],  # (list) List of decreasing temperatures for the annealing.
        "mc_steps": int(1e3),  # (int) Number of MC cycles.
        "number_of_molecules": 1  # (int) Number of molecules loaded in the framework.
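
The parameters input only needs to contain the keys that differ from the defaults listed above. A minimal sketch of such a merge in plain Python follows. The helper name is hypothetical, and rejecting unknown keys is an assumption made here to catch typos, not a documented behaviour of the work chain.

```python
import copy

# Default values copied from the listing above.
DEFAULTS = {
    "ff_framework": "UFF",
    "ff_separate_interactions": False,
    "ff_mixing_rule": "Lorentz-Berthelot",
    "ff_tail_corrections": True,
    "ff_shifted": False,
    "ff_cutoff": 12.0,
    "temperature_list": [300, 250, 200, 250, 100, 50],
    "mc_steps": int(1e3),
    "number_of_molecules": 1,
}


def merge_parameters(user, defaults=DEFAULTS):
    """Return a copy of `defaults` updated with the entries of `user`."""
    unknown = sorted(set(user) - set(defaults))
    if unknown:
        # Assumption: an unknown key is most likely a typo, so fail early.
        raise KeyError(f"unknown parameter(s): {unknown}")
    merged = copy.deepcopy(defaults)
    merged.update(user)
    return merged


# Only the cutoff and the number of MC cycles are changed here.
merged = merge_parameters({"ff_cutoff": 10.0, "mc_steps": 5000})
```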

Outputs details

  • output_parameters (Dict), example:

        "description": [
            "NVT simulation at 300 K",
            "NVT simulation at 250 K",
            "NVT simulation at 200 K",
            "NVT simulation at 250 K",
            "NVT simulation at 100 K",
            "NVT simulation at 50 K",
            "Final energy minimization"
        "energy_adsorbate/adsorbate_final_coulomb": [
        "energy_adsorbate/adsorbate_final_tot": [
        "energy_adsorbate/adsorbate_final_vdw": [
        "energy_host/adsorbate_final_coulomb": [
        "energy_host/adsorbate_final_tot": [
        "energy_host/adsorbate_final_vdw": [
        "energy_unit": "kJ/mol",
        "number_of_molecules": 1
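
The description list in the example contains one entry per temperature, followed by the final minimization. The sketch below rebuilds those labels from a temperature list and pairs them with a list of energies, assuming (this is not stated above) that each energy list holds one value per step in the same order. Function names and energy values are hypothetical.

```python
def annealing_descriptions(temperature_list):
    """Build one label per NVT step, plus the final minimization label."""
    labels = [f"NVT simulation at {temp} K" for temp in temperature_list]
    labels.append("Final energy minimization")
    return labels


def label_energies(temperature_list, energies):
    """Pair each step label with its energy; assumes one energy value per step."""
    labels = annealing_descriptions(temperature_list)
    if len(labels) != len(energies):
        raise ValueError("expected one energy value per step")
    return list(zip(labels, energies))


# Hypothetical energies in kJ/mol for a two-temperature schedule.
for label, energy in label_energies([300, 50], [-10.0, -12.0, -12.5]):
    print(f"{label}: {energy}")
```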

Cp2kBindingEnergy work chain

The Cp2kBindingEnergyWorkChain() work chain takes as input a CIF structure and the initial position of a molecule in its pore, optimizes the molecule’s geometry while keeping the framework rigid, and computes the BSSE-corrected interaction energy. The work chain is similar to CP2K’s MultistageWorkChain in that it reads the settings from a YAML protocol and resubmits the calculation with updated settings in case of failure, but its only step is a hard-coded GEO_OPT simulation with 200 max steps.


  1. It is better to start with the settings of a previously working MultistageWorkChain, if already available. Otherwise, it may run for 200 steps before realizing that the settings are not good and switching them.

  2. No restart is allowed, since the system is changing the number of atoms for the BSSE calculation: therefore, the wave function is recomputed 5 times from scratch. This needs to be fixed in the future.

  3. If structure and molecule StructureData do not have the same size for the unit cell, the work chain will complain and stop.
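
Point 3 can be checked before submission. The sketch below compares two 3x3 cell matrices, such as those returned by the cell attribute of StructureData, element by element. The function name and the tolerance are assumptions, and the work chain may apply a different criterion.

```python
def same_cell(cell_a, cell_b, tol=1e-6):
    """Return True if two 3x3 cell matrices agree element-wise within `tol` (Angstrom)."""
    if len(cell_a) != 3 or len(cell_b) != 3:
        return False
    for row_a, row_b in zip(cell_a, cell_b):
        if len(row_a) != 3 or len(row_b) != 3:
            return False
        if any(abs(a - b) > tol for a, b in zip(row_a, row_b)):
            return False
    return True


# Illustrative cells: a cubic framework cell and a slightly different one.
framework_cell = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
molecule_cell = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.5]]
print(same_cell(framework_cell, framework_cell), same_cell(framework_cell, molecule_cell))
```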


Submits Cp2kBase work chain for structure + molecule system, first optimizing the geometry of the molecule and later computing the BSSE corrected interaction energy. This work chain is inspired by Cp2kMultistage, and shares some logic and data with it.


  cp2k_base, Namespace
    cp2k, Namespace
      metadata, Namespace
        options, Namespace
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
  • molecule, StructureData, required – Input molecule in the unit cell of the structure.
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol tag.yaml. NOTE: only the settings are read, stage is set to GEO_OPT.
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file. NOTE: only the settings are read, stage is set to GEO_OPT.
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, StructureData, required – Input structure that contains the molecule.


  • loaded_molecule, StructureData, required – Molecule geometry in the unit cell.
  • loaded_structure, StructureData, required – Geometry of the system with both fragments.
  • output_parameters, Dict, required – Info regarding the binding energy of the system.
  • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.


setup(Setup initial parameters.)
    run_geo_opt(Prepare inputs, submit and direct output to context.)
    inspect_and_update_settings_geo_opt(Inspect the settings_{idx} calculation and check if it is needed to update the settings and resubmit the calculation.)
run_bsse(Update parameters and run BSSE calculation. BSSE assumes that the molecule has no charge and unit multiplicity: this can be customized from builder.cp2k_base.cp2k.parameters.)
results(Gather final outputs of the workchain.)

Inputs details

Look at the inputs details of the Multistage work chain for more information about the choice of the protocol (i.e., DFT settings).
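
The protocol_modify input is described above as custom settings that overwrite the YAML settings. One plausible reading, sketched here as an assumption rather than as the plugin's implementation, is a recursive dictionary update in which nested keys are replaced one by one and all other entries are kept. The helper name and the protocol keys in the example are hypothetical.

```python
import copy


def recursive_update(base, changes):
    """Return a copy of `base` where the entries of `changes` overwrite matching keys.

    Nested dictionaries are merged key by key; any other value is replaced.
    """
    result = copy.deepcopy(base)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = recursive_update(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical protocol content, for illustration only.
protocol = {"settings_0": {"label": "example", "overrides": {"x": 1, "y": 2}}}
modified = recursive_update(protocol, {"settings_0": {"overrides": {"y": 3}}})
```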

Outputs details

  • output_parameters (Dict), example:

        "binding_energy_bsse": -1.7922110202537,
        "binding_energy_corr": -23.072114381515,
        "binding_energy_dispersion": -18.318476834858,
        "binding_energy_raw": -24.864325401768,
        "binding_energy_unit": "kJ/mol",
        "motion_opt_converged": false,
        "motion_step_info": {
            "dispersion_energy_au": [
            "energy_au": [
            "max_grad_au": [
            "max_step_au": [
            "rms_grad_au": [
            "rms_step_au": [
            "scf_converged": [
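
The three energies in the example are numerically consistent with the relation binding_energy_corr = binding_energy_raw - binding_energy_bsse. The short check below verifies this on the values shown above. It is an observation on the example numbers, and the function name is hypothetical.

```python
# Values copied from the output_parameters example above (kJ/mol).
binding_energy_raw = -24.864325401768
binding_energy_bsse = -1.7922110202537
binding_energy_corr = -23.072114381515


def bsse_corrected(raw_energy, bsse_energy):
    """Subtract the BSSE contribution from the raw binding energy."""
    return raw_energy - bsse_energy


difference = bsse_corrected(binding_energy_raw, binding_energy_bsse) - binding_energy_corr
print(f"difference from the reported corrected energy: {difference:.2e} kJ/mol")
```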

BindingSite work chain

The BindingSiteWorkChain() work chain simply combines SimAnnealingWorkChain() and Cp2kBindingEnergyWorkChain(). The outputs from the two work chains are collected under the ff and dft namespaces, respectively.


A workchain that combines SimAnnealing & Cp2kBindingEnergy


  cp2k_base, Namespace
    cp2k, Namespace
      metadata, Namespace
        options, Namespace
      • parameters, Dict, optional – Specify custom CP2K settings to overwrite the input dictionary just before submitting the CalcJob
      • parent_calc_folder, RemoteData, optional – remote folder used for restarts
      • pseudos, Namespace – A dictionary of pseudopotentials to be used in the calculations: key is the atomic symbol, value is either a single pseudopotential or a list of pseudopotentials. If multiple pseudos for a single symbol are passed, it is mandatory to specify a KIND section with a PSEUDOPOTENTIAL keyword matching the names (or aliases) of the pseudopotentials.
      • resources, dict, optional – special settings
      • settings, Dict, optional – additional input parameters
    • handler_overrides, Dict, optional – Mapping where keys are process handler names and the values are a boolean, where True will enable the corresponding handler and False will disable it. This overrides the default value set by the enabled keyword of the process_handler decorator with which the method is decorated.
    • max_iterations, Int, optional – Maximum number of iterations the work chain will restart the process to finish successfully.
  • molecule, (Str, Dict), required – Adsorbate molecule: settings to be read from the yaml. Advanced: input a Dict for non-standard settings.
  • parameters, Dict, required – Parameters for the SimAnnealing workchain: will be merged with default ones.
  • protocol_modify, Dict, optional – Specify custom settings that overwrite the yaml settings
  • protocol_tag, Str, optional – The tag of the protocol tag.yaml. NOTE: only the settings are read, stage is set to GEO_OPT.
  • protocol_yaml, SinglefileData, optional – Specify a custom yaml file. NOTE: only the settings are read, stage is set to GEO_OPT.
  raspa_base, Namespace
    raspa, Namespace
      Namespace Ports
      • block_pocket, Namespace – Zeo++ block pocket file
      • code, Code, required – The Code to use for this job.
      • file, Namespace – Additional input file(s)
      • framework, Namespace – Input framework(s)
      metadata, Namespace
        options, Namespace
      • parent_folder, RemoteData, optional – Remote folder used to continue the same simulation starting from the binary restarts.
      • retrieved_parent_folder, FolderData, optional – To use an old calculation as a starting point for a new one.
      • settings, Dict, optional – Additional input parameters
  • starting_settings_idx, Int, optional – If idx>0 is chosen, jumps directly to overwrite settings_0 with settings_{idx}
  • structure, CifData, required – Adsorbent framework CIF.


  • dft, Namespace
    Namespace Ports
    • loaded_molecule, StructureData, required – Molecule geometry in the unit cell.
    • loaded_structure, StructureData, required – Geometry of the system with both fragments.
    • output_parameters, Dict, required – Info regarding the binding energy of the system.
    • remote_folder, RemoteData, required – Input files necessary to run the process will be stored in this folder node.
  • ff, Namespace
    Namespace Ports
    • loaded_molecule, CifData, required – CIF containing the final position of the molecule.
    • loaded_structure, CifData, required – CIF containing the loaded structure.
    • output_parameters, Dict, optional – Information about the final configuration.


run_sim_annealing(Run SimAnnealing)
run_cp2k_binding_energy(Pass the output molecule's geometry to Cp2kBindingEnergy.)
return_results(Return exposed outputs and info.)