# Blog Archives

## XIX RMFQT – National Meeting on TheoPhysChem

The Mexican Meeting on Theoretical Physical Chemistry is a national staple of our local scientific discipline. The nineteenth edition had to be a virtual conference due to sanitary restrictions still enforced in Mexico. Nevertheless, this was a successful meeting in which we tried new things, such as a live broadcast via our new official YouTube channel and a Twitter poster session covered under the hashtag #RMFQTXIX.

Please browse the previous links (talks in Spanish, most Tweets are also in Spanish but some are available in English.) Twitter conferences are here to stay and the creativity from the participants will be key in moving them forward; unfortunately, most of us are still grounded in the traditional idea of a physical poster and that notion taken literally translates poorly to a Tweet. I wanted to embed some of the presented posters but I don’t want to leave people out and they were fortunately too many for them to fit in a blog post. So head on to Twitter and check the hashtags #RMFQTXIX and #CompChemMX and follow the official Twitter account for the RMFQT.

A big shout-out to the staff, PhD students Jessica Arcudia and Gustavo Mondragón for keeping up the live sessions and online broadcast. The future of Mexican CompChem is in safe hands!

## Fixing the error: Bad data into FinFrg

I found this error in the calculation of two interacting fragments, both with unpaired electrons. So, two radicals interact at a certain distance and the full system is deemed as a singlet, therefore the unpaired electron on each fragment have opposite spins. The problem came when trying to calculate the Basis Set Superposition Error (BSSE) because in the Counterpoise method you need to assign a charge and multiplicity to each fragment, however it’s not obvious how to assign opposite spins.

The core of the problem is related to the *guess *construction; normally a Counterpoise calculation would look like the following example:

#p B3LYP/6-31G(d,p) counterpoise=2 -2,1 -1,2 -1,2 C(Fragment=1) 0.00 0.00 0.00 O(Fragment=2) 1.00 1.00 1.00 ...

In which the first pair of charge-multiplicity numbers correspond to the whole molecule and the following to those of each fragment in increasing order of *N* (in this case, *N* = 2). So for this hypothetical example we have two anions (but could easily be two cations) each with an unpaired electron, yielding a complex of charge = -2 and a singlet multiplicity which implies those two unpaired electrons have opposite spin. But if the *guess *(the initial trial wavefunction from which the SCF will begin) has a problem understanding this then the title error shows up:

Bad data into FinFrg Error termination via Lnk1e ...

The solution to this problem is as simple as it may be obscure: Create a convenient guess wavefunction by placing a negative sign to the multiplicity of one of the fragments in the following example. You may then use the guess as the starting point of other calculations since it will be stored in the checkpoint file. By using this negative sign we’re not requesting a *negative multiplicity*, but a given multiplicity of *opposite spin *to the other fragment.

#p B3LYP/6-31G(d,p) guess=(only,fragment=2) -2,1 -1,2 -1,-2 C(Fragment=1) 0.00 0.00 0.00 O(Fragment=2) 1.00 1.00 1.00 ...

This way, the second fragment will have the opposite spin (but the same *multiplicity*) as the first fragment. The only keyword tells gaussian to only calculate the guess wave function and then exit the program. You may then use that guess as the starting point for other calculations such as my failed Counterpoise one.

## Basis Set Superposition Error (BSSE). A short intro

Molecular Orbitals (MOs) are linear combinations of Atomic Orbitals (AOs), which in turn are linear combinations of other functions called ‘basis functions’. A basis, or more accurately a basis set, is a collection of functions which obey a set of rules (such as being orthogonal to each other and possibly being normalized) with which all AOs are constructed, and although these are centered on each atomic nucleus, the canonical way in which they are combined yield delocalized MOs; in other words, an MO can occupy a large space spanning several atoms at once. We don’t mind this expansion across a molecule, but what about between two molecules? Calculating the interaction energy between two or more molecular fragments leads to an artificial extra–stabilization term that stems from the fact that electrons in molecule 1 can occupy AO’s (or the basis functions which form them) centered on atoms from molecule 2.

Fundamentally, the interaction energy of any A—B dimer, *E _{int}*, is calculated as the energy difference between the dimer and the separately calculated energies for each component (Equation 1).

*E _{int} = E_{AB} – E_{A} – E_{B}* (

**1**)

However the calculation of *E _{int} *by this method is highly sensitive to the choice of basis set due to the Basis Set Superposition Error (BSSE) described in the first paragraph. The BSSE is particularly troublesome when small basis sets are used, due to the poor description of dispersion interactions but treating this error by just choosing a larger basis set is seldom useful for systems of considerable sizes. The Counterpoise method is a nifty correction to equation 1, in which EA and EB are calculated with the basis set of A and B respectively, i.e., only in EAB a larger basis set (that of A and B simultaneously) is used. The Counterpoise method calculates each component with the AB basis set (Equation 2)

*E _{int}^{CP} = E_{AB}^{AB} – E_{A}^{AB}– E_{B}^{AB}* (

**2**)

where the superscript AB means the whole basis set is used. This is accomplished by using ‘*ghost*‘ atoms with no nuclei and no electrons but empty basis set functions centered on them.

In Gaussian, BSSE is calculated with the Counterpoise method developed by Boys and Simon. It requires the keyword Counterpoise=N where N is the number of fragments to be considered (for an A—B system, N=2). Each atom in the coordinates list must be specified to which fragment pertains; additionally, the charge and multiplicity for each fragment and the whole supermolecular ensemble must be specified. Follow the example of this hydrogen fluoride dimer.

%chk=HF2.chk #P opt wB97XD/6-31G(d,p) Counterpoise=2 HF dimer 0,1 0,1 0,1 H(Fragment=1) 0.00 0.00 0.00 F(Fragment=1) 0.00 0.00 0.70 H(Fragment=2) 0.00 0.00 1.00 F(Fragment=2) 0.00 0.00 1.70

For closed shell fragments the first line is straightforward but one must pay attention that the first pair of numbers in the charge multiplicity line correspond to the whole ensemble, whereas the folowing pairs correspond to each fragment in consecutive order. Fragments do not need to be specified contiguously, i.e., you don’t need to define all atoms for fragment 1 and after those the atoms for fragment 2, etc. They could be mixed and the program still assigns them correctly. Just as an example I typed wB97XD but any other method, DFT or ab initio, may be used; only semiempirical methods do not admit a BSSE calculation because they don’t make use of a basis set in the first place!

The output provides the corrected energy (in atomic units) for the whole system, as well as the BSSE correction (which added to the previous term yields the un-corrected energy of the system). Gaussian16 also provides these values in kcal/mol as ‘Complexation energies’ first raw (uncorrected) and then the corrected energy.

BSSE is always present and cannot be entirely eliminated because of the use of finite basis sets but it can be correctly dealt with if the Counterpoise method is included.

## Orbital Contributions to Excited States

This is a guest post by our very own Gustavo “*Gus*” Mondragón whose work centers around the study of excited states chemistry of photosynthetic pigments.

When you’re calculating excited states (no matter the method you’re using, TD-DFT, CI-S(D), EOM-CCS(D)) the analysis of the orbital contributions to electronic transitions poses a challenge. In this post, I’m gonna guide you through the CI-singles excited states calculation and the analysis of the electronic transitions.

I’ll use adenine molecule for this post. After doing the corresponding geometry optimization by the method of your choice, you can do the excited states calculation. For this, I’ll use two methods: CI-Singles and TD-DFT.

The route section for the CI-Singles calculation looks as follows:

%chk=adenine.chk

%nprocshared=8

%mem=1Gb

#p CIS(NStates=10,singlets)/6-31G(d,p) geom=check guess=read scrf=(cpcm,solvent=water)

adenine excited states with CI-Singles method

0 1

--blank line--

I use the same geometry from the optimization step, and I request only for 10 singlet excited states. The CPCP implicit solvation model (solvent=water) is requested. If you want to do TD-DFT, the route section should look as follows:

%chk=adenine.chk

%nprocshared=8

%mem=1Gb

#p FUNCTIONAL/6-31G(d,p) TD(NStates=10,singlets) geom=check guess=read scrf=(cpcm,solvent=water)

adenine excited states with CI-Singles method

0 1

--blank line--

Where FUNCTIONAL is the DFT exchange-correlation functional of your choice. Here I strictly not recommend using B3LYP, but CAM-B3LYP is a noble choice to start.

Both calculations give to us the excited states information: excitation energy, oscillator strength (as *f* value), excitation wavelength and multiplicity:

Excitation energies and oscillator strengths:

Excited State 1: Singlet-A 6.3258 eV 196.00 nm f=0.4830 <S**2>=0.000

11 -> 39 -0.00130

11 -> 42 -0.00129

11 -> 43 0.00104

11 -> 44 -0.00256

11 -> 48 0.00129

11 -> 49 0.00307

11 -> 52 -0.00181

11 -> 53 0.00100

11 -> 57 -0.00167

11 -> 59 0.00152

11 -> 65 0.00177

The data below corresponds to all the electron transitions involved in this excited state. I have to cut all the electron transitions because there are a lot of them for all excited states. If you have done excited states calculations before, you realize that the HOMO-LUMO transition is always an important one, but not the only one to be considered. Here is when we calculate the *Natural Transition Orbitals *(NTO), by these orbitals we can analyze the electron transitions.

For the example, I’ll show you first the HOMO-LUMO transition in the first excited state of adenine. It appears in the long list as follows:

35 -> 36 0.65024

The 0.65024 value corresponds to the transition amplitude, but it doesn’t mean anything for excited state analysis. We must calculate the NTOs of an excited state from a new Gaussian input file, requesting from the checkpoint file we used to calculate excited states. The file looks as follows:

%Oldchk=adenine.chk

%chk=adNTO1.chk

%nproc=8

%mem=1Gb

#p SP geom=allcheck guess=(read,only) density=(Check,Transition=1) pop=(minimal,NTO,SaveNTO)

I want to say some important things right here for this last file. See that no level of theory is needed, all the calculation data is requested from the checkpoint file “adenine.chk”, and saved into the new checkpoint file “adNTO1.chk”, we must use the previous calculated density and specify the transition of interest, it means the excited state we want to analyze. As we don’t need to specify charge, multiplicity or even the comment line, this file finishes really fast.

After doing this last calculation, we use the new checkpoint file “adNTO1.chk” and we format it:

formchk -3 adNTO1.chk adNTO1.fchk

If we open this formatted checkpoint file with GaussView, chemcraft or the visualizer you want, we will see something interesting by watching he MOs diagram, as follows:

We can realize that frontier orbitals shows the same value of 0.88135, which means the real transition contribution to the first excited state. As these orbitals are contributing the most, we can plot them by using the cubegen routine:

cubegen 0 mo=homo adNTO1.fchk adHOMO.cub 0 h

This last command line is for plotting the equivalent as the HOMO orbital. If we want to plot he LUMO, just change the “homo” keyword for “lumo”, it doesn’t matter if it is written with capital letters or not.

You must realize that the Natural Transition Orbitals are quite different from Molecular Orbitals. For visual comparisson, I’ve printed also the molecular orbitals, given from the optimization and from excited states calculations, without calculating NTOs:

These are the molecular frontier orbitals, plotted with Chimera with 0.02 as the isovalue for both phase spaces:

The frontier NTOs look qualitatively the same, but that’s not necessarily always the case:

If we analyze these NTOs on a hole-electron model, the HOMO refers to the *hole* space and the LUMO refers to the *electron* space.

Maybe both orbitals look the same, but both frontier orbitals are quite different between them, and these last orbitals are the ones implied on first excited state of adenine. The electron transition will be reported as follows:

If I can do a graphic summary for this topic, it will be the next one:

NTOs analysis is useful no matter if you calculate excited states by using CIS(D), EOM-CCS(D), TD-DFT, CASSCF, or any of the excited states method of your election. These NTOs are useful for population analysis in excited states, but these calculations require another software, MultiWFN is an open-source code that allows you to do this analysis, and another one is called TheoDORE, which we’ll cover in a later post.

## XVIII RMFQT

It was my distinct pleasure for me to participate in the organization of the latest edition of the Mexican Meeting on Theoretical Physical Chemistry, RMFQT which took place last week here in Toluca. With the help of the School of Chemistry from the Universidad Autónoma del Estado de México.

This year the national committee created a Lifetime Achievement Award for Dr. Annik Vivier, Dr. Carlos Bunge, and Dr. José Luis Gázquez. This recognition from our community is awarded to these fine scientists for their contributions to theoretical chemistry but also for their pioneering work in the field in Mexico. The three of them were invited to talk about any topic of their choosing, particularly, Dr. Vivier stirred the imagination of younger students by showing her pictures of the times when she used to hangout with Slater, Roothan, Löwdin, etc., it is always nice to put faces onto equations.

Continuing with a recent tradition we also had the pleasure to host three invited plenary lectures by great scientists and good friends of our community: Prof. William Tiznado (Chile), Prof. Samuel B. Trickey (USA), and Prof. Julia Contreras (France) who shared their progress on their recent work.

As I’ve abundantly pointed out in the past, the RMFQT is a joyous occasion for the Mexican theoretical community to get together with old friends and discuss very exciting research being done in our country and by our colleagues abroad. I’d like to add a big shoutout to Dr. Jacinto Sandoval-Lira for his valuable help with the organization of our event.

## Useful Thermochemistry from Gaussian Calculations

Statistical Mechanics is the bridge between microscopic calculations and thermodynamics of a particle ensemble. By means of calculating a partition function divided in electronic, rotational, translational and vibrational functions, one can calculate all thermodynamic functions required to fully characterize a chemical reaction. From these functions, the vibrational contribution, together with the electronic contribution, is the key element to getting thermodynamic functions.

Calculating the Free Energy change of any given reaction is a useful approach to asses their thermodynamic feasibility. A large negative change in Free Energy when going from reagents to products makes up for a quantitative spontaneous (and exothermic) reaction, nevertheless the rate of the reaction is a different story, one that can be calculated as well.

Using the **freq** option in your route section for a Gaussian calculation is mandatory to ascertain the current wave function corresponds to a minimum on a potential energy hypersurface, but also yields the thermochemistry and thermodynamic values for the current structure. However, thermochemistry calculations are not restricted to minima but it can also be applied to transition states, therefore yielding a full thermodynamic characterization of a reaction mechanism.

A regular **freq** calculation yields the following output (all values in atomic units):

Zero-point correction= 0.176113 (Hartree/Particle) Thermal correction to Energy= 0.193290 Thermal correction to Enthalpy= 0.194235 Thermal correction to Gibbs Free Energy= 0.125894 Sum of electronic and zero-point Energies= -750.901777 Sum of electronic and thermal Energies= -750.884600 Sum of electronic and thermal Enthalpies= -750.883656Sum of electronic and thermal Free Energies= -750.951996

For any given reaction say A+B -> C one could take the values from the last row (lets call it G) for all three components of the reaction and perform the arithmetic: DG = GC – [GA + GB], so products minus reagents.

By default, Gaussian calculates these values (from the previously mentioned partition function) using normal conditions, T = 298.15 K and P = 1 atm. For an assessment of the thermochemistry at other conditions you can include in your route section the corresponding keywords **Temperature=**x.x and **Pressure=**x.x, in Kelvin and atmospheres, respectively.

(Huge) **Disclaimer**: Although calculating the thermochemistry of any reaction by means of DFT calculations is a good (and potentially very useful) guide to chemical reactivity, getting quantitative results require of high accuracy methods like G3 or G4 methods, collectively known as Gn mehtods, which are composed of pre-defined stepwise calculations. The sequence of these calculations is carried out automatically; no basis set should be specified. Other high accuracy methods like CBS-QB3 or W1U can also be considered whenever Gn methods are too costly.

## Using PDB files for Electronic Structure Calculations

#### Quick Post on preparing Gaussian input files from PDB files.

If you’re modeling biological systems chances are that, more often than not, you start by retrieving a PDB file. The Protein Data Bank is a repository for all things biochemistry – from oligo-peptides to full DNA sequences with over 140,000 available files encoding the corresponding structure obtained by various experimental means ranging from X-Ray diffraction, NMR and more recently, Cryo Electron Microscopy (CEM).

The PDB file encodes the Cartesian coordinates for each atom present in the structure as well as their in the same way molecular dynamics codes -like AMBER or GROMACS- code the parameters for a force field; this makes the PDB a natural input file for MD.

There are however some considerations to have in mind for when you need to use these coordinates in electronic structure calculations. Personally I give it a pass with OpenBabel to add (or possibly just re-add) all Hydrogen atoms with the following instruction:

$>obabel -ipdb filename.pdb -ogjf -Ofilename.gjf -hAlternatively, you can select a pH value, say 7.5 with:

$>obabel -ipdb filename.pdb -ogjf -Ofilename.gjf -h -p7.5

You may also use the GUI if by any chance you’re working in Windows:

This sends all H atoms to the end of the atoms list. Usually for us the next step is to optimize their positions with a partial optimization at a low level of theory for which you need to use the ReadOptimize ReadOpt or RdOpt in the route section and then add the atom list at the end of the input file:

Atomic coordinates

--blank line--

noatoms atoms=H

--blank line--

Finally, visual inspection of your input structure is always helpful to find any meaningful errors, remember that PDB files come from experimental measurements which are not free of problems.

As usual thanks for reading, commenting, and sharing.

## Estimation of pKa Values through Local Electrostatic Potential Calculations

Calculating the p*K*a value for a Brønsted acid is very hard, like really hard. A full thermodynamic cycle (fig. 1) needs to be calculated along with the high-accuracy solvation free energy for each of the species under consideration, not to mention the use of expensive methods which will be reviewed here in another post in two weeks time.

Finding descriptors that help us circumvent the need for such sophisticated calculations can help great deal in estimating the p*K*a value of any given acid. We’ve been interested in the reactivity of σ-hole bearing groups in the past and just like Halogen, Tetrel, Pnicogen and Chalcogen bonds, Hydrogen bonds are highly directional and their strength depends on the polarization of the O-H bond. Therefore, we suggested the use of the maximum surface electrostatic potential (*V*_{S,max}) on the acid hydrogen atom of carboxylic acids as a descriptor for the strength of their interaction with water, the first step in the deprotonation process.

We selected six basis sets; five density functionals; the MP2 method for a total of thirty-six levels of theory to optimize and calculate *V*_{S,max} on thirty carboxylic acids for a grand total of 1,080 wavefunctions, which were later passed onto MultiWFN (all calculations were taken with PCM = *water*). Correlation with the experimental pKa values showed a great correlation across the levels of theory (R2 > 0.9), except for B3LYP. Still, the best correlations were obtained with LC-wPBE/cc-pVDZ and wB97XD/cc-pVDZ. From this latter level of theory the linear correlation yielded the following equation:

p*K*a = -0.2185(*V*_{S,max}) + 16.1879

Differences in pKa turned out to be less than 0.5 units, which is remarkable for such a straightforward method; bear in mind that calculation of full thermodynamic cycles above chemical accuracy (1.0 kcal/mol) yields pKa differences above 1.0 units.

We then took this equation for a test with 10 different carboxylic acids and the prediction had a correlation of 98% (fig. 2)

I think this method can really catch on for a quick way to predict the pKa values of any carboxylic acid imaginable. We’re now working on the model extension to other groups (i.e. Bronsted bases) and putting together a black-box workflow so as to make it even more accessible and straightforward to use.

We’ve recently published this work in the journal Molecules, an open access publication. Thanks to Prof. Steve Scheiner for inviting us to participate in the special issue devoted to *tetrel bonding*. Thanks to Guillermo Caballero for the inception of this project and to Dr. Jacinto Sandoval for taking the time from his research in photosynthesis to work on this pet project of ours and of course the rest of the students (Gustavo Mondragón, Marco Diaz, Raúl Torres) whose hard work produced this work.

## Dr. Gabriel Merino wins The Walter Kohn Prize 2018

Just as I was thinking about the state of Mexican scientific environment in the global scale, Prof. Dr. Gabriel Merino from CINVESTAV comes and gets this prize awarded by the International Center for Theoretical Physics (ICTP) and the Quantum ESPRESSO Foundation, showing us all that great science is possible even under pressing circumstances.

This prize is awarded biennially to a young scientist for outstanding contributions in the field of quantum-mechanical materials and molecular modeling, performed in a developing country or emerging economy,and in the case of Dr. Merino it is awarded not only for his contributions to theory and applications but also by his contributions to the prediction of novel systems that violate standard chemical paradigms, broadening the scope of concepts like aromaticity, coordination and chemical bond. The list of his contributions is very long despite his young age and there are barely any topic in chemistry or materials science that escapes his interest.

Gabriel is also one of the leading organizers of the Mexican Theoretical Physical Chemistry Meeting, an unstoppable mentor with many of his former students now leading research teams of their own. He is pretty much a force of nature.

Congratulations to Dr. Gabriel Merino, his team, CINVESTAV and thanks for being such an inspiration and a good friend at the same time.

¡Felicidades, Gabriel!