QM/MM overview

The following overview will outline the main steps in setting up and running a QM/MM simulation in CP2K, starting from preparing your system from a raw pdb file, and leading towards running a production QM/MM run, for example a molecular dynamics simulation. Each section links to the relevant specfic parts of the guide.

This overview is designed to be read from the point of view of a beginner, a more experienced user may benefit from using the contents page to directly navigate to areas of interest.

Prepare system

The first step is to prepare your system - i.e. to generate the desired starting system and forcefield files. The starting point for this is usually a pdb file which may have been downloaded from the PDB data bank. Usually these structures are not ready to use and will require fixing to obtain the desired protonation state of the protein. Details of how to do this are described in the system preparation guide: System preparation

You will also have to generate a forcefield for your system. There are different tools available to do this depending on the type of forcefield. For CP2K the Amber, Charmm and Gromos forcefields can be used. The relevant tools and their tutorials can be found below:

Ambertools - https://ambermd.org/tutorials/ForceField.php
Charmm - https://www.charmm.org/charmm/documentation/tutorials/
Gromos - http://www.gromos.net

Minimisation, Equilibration and Thermalisation

After you have built your topology and coordinate files you must first minimise your system at the MM level before running any QM/MM simulations. This will find the energy minima for your system and fix any possible bad contacts in your initial structure. Once the system is minimised, it has to be subsequently heated (from 0 K to your target conditions, e.g. 300 K) and equilibrated. This gradual and slow heating process prevents instabilities arising due to the sudden increase in the kinetic energy.

Select QM atoms

After the system has been prepared you can consider which atoms will be included in the QM region. This will usually be the atoms in your ligand, however it is important to consider whether enough atoms are included to accurately represent the chemistry of the system, while at the same time not having too many QM atoms (which will require more compute resources).

Details on how to select QM atoms from a pdb file and get them in the correct format for using in a CP2K input are given here: Selecting QM atoms

Run MM only in CP2K

Before running a QM/MM simulation in CP2K it is recommended to try and run a simple single energy MM calculation. This will verify that the input forcefield and coordinates are set up correctly and compatible with CP2K before attempting to add more complicated QM/MM parameters. The MM calculation should be fast to run and if the calculation finishes without error and the energy is sensible then it is a good indicator that the system has been prepared correctly for CP2K.

Details of how to do this are given here: CP2K MM setup

If this is your first time using CP2K then it is recommended to read the Running CP2K section of guide (Running QM/MM simulations), as well as the CP2K Output Guide (Understanding the CP2K output). If you encouter any errors while runnning you can debug these following advice in the Troubleshooting guide (CP2K Troubleshooting).

Run a simple QM/MM calculation in CP2K

Once an MM calculation has been run successfully the input can be used as a basis for a QM/MM calculation.

The METHOD should be set to QMMM. You will need to add the QMMM section for the parameterisation of the QM region, and the DFT section which will define all the necessary settings for the QM treatment. Additionally, information about the atomic kind parameterisation will need to be added for each atomic kind in the SUBSYS section.

The structure of your input will look something like this:

&GLOBAL                                   ! global section (required)
   PROJECT project_name                      ! name of project for output files
   RUN_TYPE ENERGY                           ! run type (important)
   PRINT_LEVEL LOW                           ! controls amount of output produced
&END GLOBAL

&FORCE_EVAL                               ! parameters for force evaluation
   METHOD QMMM                               ! method employed e.g. QMMM, QS, FIST

   &DFT                                   ! DFT section - all QM
     .... contents of DFT section
   &END DFT

   &QMMM                                  ! QMMM section - set up for QM region
     .... contents of QMMM section
   &END QMMM

   &MM                                    ! MM section - MM forcefields,  etc.
     .... contents of MM section
   &END MM

   &SUBSYS                                ! subsystem - coordinates, atom kinds etc.
     .... contents of SUBSYS section
   &SUBSYS

&END FORCE_EVAL

&MOTION                                   ! control of atom movement e.g. geometry optimisations, MD
  .... contents of MOTION section
&END MOTION

Information on setting up the parameters for the QMMM section can be found here: CP2K QM/MM parameterisation

Settings for this will depend highly on your choice of the QM region.

Information on setting the QM treatment can be found here: QM treatment

It is good practice to start with a simple method for the XC functional and then check that the QM set up has been done correctly before increasing the complexity and deciding on a more accurate or appropirate method for your system.

You should first calculate just the ENERGY of the system and check that the SCF converges and that the energy is sensible. This will ensure that there are no errors in your DFT setup or QM atom selection.

Before running a production QM/MM calculation the value of the CUTOFF should be converged for the final choice of BASIS_SET, XC_FUNCTIONAL and any other parameters. How to do this is documented here: https://www.cp2k.org/howto:converging_cutoff

Run MD with CP2K

Once you have setup a simple single energy QM/MM CP2K calculation it is fairly straightfoward to adjust the input file to run a production molecular dynamics simulation.

The first change is to set the RUN_TYPE to MD. You will also need to add an MD section in the MOTION section, which will list the parameters to do with the dynamics of the simulation. For a simple NVE ensemble MD this would look as follows:

&MOTION
   &MD
      ENSEMBLE NVE                            ! Ensemble type
      STEPS 5                                 ! Number of MD steps
      TEMPERATURE 300                         ! Target temperature in Kelvin
      TIMESTEP 1                              ! Timestep in femtoseconds
   &END MD
&END MOTION

Usually a timestep of 1 femtoseconds or less is recommended in order to ensure energy conservation in the system.

More information about MD simulations in CP2K is given here: https://www.cp2k.org/howto:md

Ensembles

CP2K offers a range of MD integrators for sampling different ensembles which are listed here: https://manual.cp2k.org/trunk/CP2K_INPUT/MOTION/MD.html#ENSEMBLE

Common ones are the NVT and NPT_I (isobaric) ensemble. For an NVT ensemble you will need to add information about the thermostat in the &THERMOSTAT section within the MD section, while the NPT_I ensemble will need both and THERMOSTAT and BAROSTAT section as shown below.

&MOTION
..
 &MD
    ENSEMBLE NPT_I
    TIMESTEP  0.5
    STEPS  1000
    TEMPERATURE 298
    &BAROSTAT
      TIMECON [fs] 100        ! time constant for barostat
      PRESSURE [bar] 1.0      ! target pressure
    &END BAROSTAT
    &THERMOSTAT
      TYPE CSVR               ! type of thermostat -  options include NOSE, CSVR (rescaling), GLE, AD_LANGEVIN
      &CSVR
        TIMECON [fs] 10.      ! time constant for thermostat
      &END CSVR
    &END THERMOSTAT
 &END MD
 ..
&END MOTION