Browsing by Author "Shao, Yihan"
Item Accelerated computation of free energy profile at ab initio quantum mechanical/molecular mechanical accuracy via a semi-empirical reference potential. II. Recalibrating semi-empirical parameters with force matching
(The Royal Society of Chemistry, 2019-08-15) Pan, Xiaoliang; Li, Pengfei; Ho, Junming; Pu, Jingzhi; Mei, Ye; Shao, Yihan; Chemistry and Chemical Biology, School of Science
An efficient and accurate reference potential simulation protocol is proposed for producing ab initio quantum mechanical/molecular mechanical (AI-QM/MM) quality free energy profiles for chemical reactions in a solvent or macromolecular environment. This protocol involves three stages: (a) using force matching to recalibrate a semi-empirical quantum mechanical (SE-QM) Hamiltonian for the specific reaction under study; (b) employing the recalibrated SE-QM Hamiltonian (in combination with molecular mechanical force fields) as the reference potential to drive umbrella sampling along the reaction pathway; and (c) computing AI-QM/MM energy values for configurations collected from the sampling and performing weighted thermodynamic perturbation to acquire the AI-QM/MM corrected reaction free energy profile.
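Stage (c) above can be illustrated with a minimal sketch of a weighted exponential average for a single bin of the reaction coordinate. This is an illustration under stated assumptions, not the authors' implementation: the function name `tp_correction`, the idea that each sample carries a precomputed unbiasing weight, and the 300 K default are all hypothetical choices made here.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def tp_correction(delta_e, weights, temperature=300.0):
    """Free energy correction for one bin (hypothetical helper).

    delta_e : high-level minus reference energies, kcal/mol, one per sample
    weights : non-negative unbiasing weights for the same samples (assumed given)
    Returns -kT ln( sum_i w_i exp(-dE_i / kT) / sum_i w_i ).
    """
    kt = KB * temperature
    # Work in log space and shift by the maximum to avoid overflow.
    logs = [math.log(w) - de / kt for de, w in zip(delta_e, weights) if w > 0.0]
    shift = max(logs)
    log_num = shift + math.log(sum(math.exp(v - shift) for v in logs))
    log_den = math.log(sum(w for w in weights if w > 0.0))
    return -kt * (log_num - log_den)

# Sanity check: a constant energy gap gives back that constant.
constant_gap = tp_correction([2.5, 2.5, 2.5], [0.2, 0.5, 0.3])
```

The exponential average is dominated by samples with low energy gaps, which is one reason a reference potential close to the target level tends to make this step better behaved.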
For three model reactions (identity SN2 reaction, Menshutkin reaction, and glycine proton transfer reaction) in aqueous solution and one enzyme reaction (Claisen rearrangement in chorismate mutase), our simulations using recalibrated PM3 SE-QM Hamiltonians reproduced QM/MM free energy profiles at the B3LYP/6-31G* level of theory, all within 1 kcal/mol, with a 20- to 45-fold reduction in computer time.

Item Accelerating ab initio QM/MM Molecular Dynamics Simulations with Multiple Time Step Integration and a Recalibrated Semi-empirical QM/MM Hamiltonian
(American Chemical Society, 2022-06-02) Pan, Xiaoliang; Van, Richard; Epifanovsky, Evgeny; Liu, Jian; Pu, Jingzhi; Nam, Kwangho; Shao, Yihan; Chemistry and Chemical Biology, School of Science
Molecular dynamics (MD) simulations employing ab initio quantum mechanical and molecular mechanical (ai-QM/MM) potentials are considered to be the state of the art, but the high computational cost associated with the ai-QM calculations remains a theoretical challenge for their routine application. Here, we present a modified protocol of the multiple time step (MTS) method for accelerating ai-QM/MM MD simulations of condensed-phase reactions. Within a previous MTS protocol [Nam, J. Chem. Theory Comput. 2014, 10, 4175], reference forces are evaluated using a low-level (semiempirical QM/MM) Hamiltonian and employed at inner time steps to propagate the nuclear motions. Correction forces, which arise from the force differences between high-level (ai-QM/MM) and low-level Hamiltonians, are applied at outer time steps, where the MTS algorithm allows the time-reversible integration of the correction forces. To increase the outer step size, which is bound by the highest-frequency component in the correction forces, the semiempirical QM Hamiltonian is recalibrated in this work to minimize the magnitude of the correction forces.
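The inner/outer splitting described above can be sketched on a toy problem. The following is a generic impulse-style MTS step for a one-dimensional harmonic oscillator, offered as an illustration only: the force constants, the function name `mts_step`, and the step sizes are assumptions made here, and no thermostat or removal of high-frequency modes is included.

```python
def mts_step(x, v, mass, f_low, f_high, dt_inner, n_inner):
    """One time-reversible MTS step (hypothetical helper).

    The correction force (high level minus low level) is applied as half-kicks
    at the outer step; the cheap low-level force drives the inner steps.
    """
    dt_outer = dt_inner * n_inner
    # Opening outer half-kick with the correction force.
    v += 0.5 * dt_outer * (f_high(x) - f_low(x)) / mass
    # Inner velocity-Verlet steps on the low-level (reference) force.
    for _ in range(n_inner):
        v += 0.5 * dt_inner * f_low(x) / mass
        x += dt_inner * v
        v += 0.5 * dt_inner * f_low(x) / mass
    # Closing outer half-kick at the updated position.
    v += 0.5 * dt_outer * (f_high(x) - f_low(x)) / mass
    return x, v

# Toy model (assumed values): the "low level" is a slightly softer spring.
K_HIGH, K_LOW, MASS = 1.0, 0.9, 1.0

def f_high(x):
    return -K_HIGH * x

def f_low(x):
    return -K_LOW * x

def total_energy(x, v):
    # Energy measured on the high-level surface.
    return 0.5 * MASS * v * v + 0.5 * K_HIGH * x * x

x, v = 1.0, 0.0
e_start = total_energy(x, v)
for _ in range(1000):
    x, v = mts_step(x, v, MASS, f_low, f_high, dt_inner=0.01, n_inner=4)
e_end = total_energy(x, v)
```

In this toy setting the correction force is small because the two springs are similar, which mirrors the motivation stated in the abstract for recalibrating the low-level Hamiltonian.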
The remaining high-frequency modes, which are mainly bond stretches involving hydrogen atoms, are then removed from the correction forces. When combined with a Langevin or SIN(R) thermostat, the modified MTS-QM/MM scheme remains robust with an outer time step of up to 8 fs (with Langevin) or 10 fs (with SIN(R)), using 1 fs inner time steps, for the chorismate mutase system. This leads to a more than 5-fold speedup over standard ai-QM/MM simulations, without sacrificing the accuracy in the predicted free energy profile of the reaction.

Item Bridging semiempirical and ab initio QM/MM potentials by Gaussian process regression and its sparse variants for free energy simulation
(AIP, 2023) Snyder, Ryan; Kim, Bryant; Pan, Xiaoliang; Shao, Yihan; Pu, Jingzhi; Chemistry and Chemical Biology, School of Science
Free energy simulations that employ combined quantum mechanical and molecular mechanical (QM/MM) potentials at ab initio QM (AI) levels are computationally highly demanding. Here, we present a machine-learning-facilitated approach for obtaining AI/MM-quality free energy profiles at the cost of efficient semiempirical QM/MM (SE/MM) methods. Specifically, we use Gaussian process regression (GPR) to learn the potential energy corrections needed for an SE/MM level to match an AI/MM target along the minimum free energy path (MFEP). Force modification using gradients of the GPR potential allows us to improve configurational sampling and update the MFEP. To adaptively train our model, we further employ the sparse variational GP (SVGP) and streaming sparse GPR (SSGPR) methods, which efficiently incorporate previous sample information without significantly increasing the training data size. We applied the QM-(SS)GPR/MM method to the solution-phase SN2 Menshutkin reaction, NH3 + CH3Cl → CH3NH3+ + Cl-, using AM1/MM and B3LYP/6-31+G(d,p)/MM as the base and target levels, respectively.
For 4000 configurations sampled along the MFEP, the iteratively optimized AM1-SSGPR-4/MM model reduces the energy error in AM1/MM from 18.2 to 4.4 kcal/mol. Although not explicitly fitting forces, our method also reduces the key internal force errors from 25.5 to 11.1 kcal/mol/Å and from 30.2 to 10.3 kcal/mol/Å for the N-C and C-Cl bonds, respectively. Compared to the uncorrected simulations, the AM1-SSGPR-4/MM method lowers the predicted free energy barrier from 28.7 to 11.7 kcal/mol and decreases the reaction free energy from -12.4 to -41.9 kcal/mol, bringing these results into closer agreement with their AI/MM and experimental benchmarks.

Item Doubly Polarized QM/MM with Machine Learning Chaperone Polarizability
(American Chemical Society, 2021) Kim, Bryant; Shao, Yihan; Pu, Jingzhi; Chemistry and Chemical Biology, School of Science
A major shortcoming of semiempirical (SE) molecular orbital methods is their severe underestimation of molecular polarizability compared with experimental and ab initio (AI) benchmark data. In a combined quantum mechanical and molecular mechanical (QM/MM) treatment of solution-phase reactions, solute described by SE methods therefore tends to generate inadequate electronic polarization response to solvent electric fields, which often leads to large errors in free energy profiles. To address this problem, here we present a hybrid framework that improves the response property of SE/MM methods through high-level molecular-polarizability fitting. Specifically, we place on QM atoms a set of corrective polarizabilities (referred to as chaperone polarizabilities), whose magnitudes are determined from machine learning (ML) to reproduce the condensed-phase AI molecular polarizability along the minimum free energy path. These chaperone polarizabilities are then used in a machinery similar to a polarizable force field calculation to compensate for the missing polarization energy in the conventional SE/MM simulations.
Because QM atoms in this treatment host SE wave functions as well as classical polarizabilities, both polarized by MM electric fields, we name this method doubly polarized QM/MM (dp-QM/MM). We demonstrate the new method on the free energy simulations of the Menshutkin reaction in water. Using AM1/MM as a base method, we show that ML chaperones greatly reduce the error in the solute molecular polarizability from 6.78 to 0.03 Å³ with respect to the density functional theory benchmark. The chaperone correction leads to ~10 kcal/mol of additional polarization energy in the product region, bringing the simulated free energy profiles to closer agreement with the experimental results. Furthermore, the solute-solvent radial distribution functions show that the chaperone polarizabilities modify the free energy profiles through enhanced solvation corrections when the system evolves from the charge-neutral reactant state to the charge-separated transition and product states. These results suggest that the dp-QM/MM method, enabled by ML chaperone polarizabilities, provides a very physical remedy for the underpolarization problem in SE/MM-based free energy simulations.

Item Facilitating Ab Initio QM/MM Free Energy Simulations by Gaussian Process Regression with Derivative Observations
(Royal Society of Chemistry, 2022-10-27) Snyder, Ryan; Kim, Bryant; Pan, Xiaoliang; Shao, Yihan; Pu, Jingzhi; Chemistry and Chemical Biology, School of Science
In combined quantum mechanical and molecular mechanical (QM/MM) free energy simulations, how to synthesize the accuracy of ab initio (AI) methods with the speed of semiempirical (SE) methods for a cost-effective QM treatment remains a long-standing challenge. In this work, we present a machine-learning-facilitated method for obtaining AI/MM-quality free energy profiles through efficient SE/MM simulations.
In particular, we use Gaussian process regression (GPR) to learn the energy and force corrections needed for SE/MM to match with AI/MM results during molecular dynamics simulations. Force matching is enabled in our model by including energy derivatives into the observational targets through the extended-kernel formalism. We demonstrate the effectiveness of this method on the solution-phase SN2 Menshutkin reaction using AM1/MM and B3LYP/6-31+G(d,p)/MM as the base and target levels, respectively. Trained on only 80 configurations sampled along the minimum free energy path (MFEP), the resulting GPR model reduces the average energy error in AM1/MM from 18.2 to 5.8 kcal mol⁻¹ for the 4000-sample testing set with the average force error on the QM atoms decreased from 14.6 to 3.7 kcal mol⁻¹ Å⁻¹. Free energy sampling with the GPR corrections applied (AM1-GPR/MM) produces a free energy barrier of 14.4 kcal mol⁻¹ and a reaction free energy of -34.1 kcal mol⁻¹, in closer agreement with the AI/MM benchmarks and experimental results.

Item Machine learning based implicit solvent model for aqueous-solution alanine dipeptide molecular dynamics simulations
(Royal Society of Chemistry, 2023-02-03) Yao, Songyuan; Van, Richard; Pan, Xiaoliang; Park, Ji Hwan; Mao, Yuezhi; Pu, Jingzhi; Mei, Ye; Shao, Yihan; Chemistry and Chemical Biology, School of Science
Inspired by the recent work from Noé and coworkers on the development of machine learning based implicit solvent model for the simulation of solvated peptides [Chen et al., J. Chem. Phys., 2021, 155, 084101], here we report another investigation of the possibility of using machine learning (ML) techniques to "derive" an implicit solvent model directly from explicit solvent molecular dynamics (MD) simulations. For alanine dipeptide, a machine learning potential (MLP) based on the DeepPot-SE representation of the molecule was trained to capture its interactions with its average solvent environment configuration (ASEC).
The predicted forces on the solute deviated only by an RMSD of 0.4 kcal mol⁻¹ Å⁻¹ from the reference values, and the MLP-based free energy surface differed from that obtained from explicit solvent MD simulations by an RMSD of less than 0.9 kcal mol⁻¹. Our MLP training protocol could also accurately reproduce combined quantum mechanical/molecular mechanical (QM/MM) forces on the quantum mechanical (QM) solute in the ASEC environment, thus enabling the development of accurate ML-based implicit solvent models for ab initio-QM MD simulations. Such ML-based implicit solvent models for QM calculations are cost-effective in both the training stage, where the use of ASEC reduces the number of data points to be labelled, and the inference stage, where the MLP can be evaluated at a relatively small additional cost on top of the QM calculation of the solute.

Item Machine learning based implicit solvent model for aqueous-solution alanine dipeptide molecular dynamics simulations
(RSC, 2023) Yao, Songyuan; Van, Richard; Pan, Xiaoliang; Park, Ji Hwan; Mao, Yuezhi; Pu, Jingzhi; Mei, Ye; Shao, Yihan; Chemistry and Chemical Biology, School of Science
Inspired by the recent work from Noé and coworkers on the development of machine learning based implicit solvent model for the simulation of solvated peptides [Chen et al., J. Chem. Phys., 2021, 155, 084101], here we report another investigation of the possibility of using machine learning (ML) techniques to “derive” an implicit solvent model directly from explicit solvent molecular dynamics (MD) simulations. For alanine dipeptide, a machine learning potential (MLP) based on the DeepPot-SE representation of the molecule was trained to capture its interactions with its average solvent environment configuration (ASEC).
The predicted forces on the solute deviated only by an RMSD of 0.4 kcal mol⁻¹ Å⁻¹ from the reference values, and the MLP-based free energy surface differed from that obtained from explicit solvent MD simulations by an RMSD of less than 0.9 kcal mol⁻¹. Our MLP training protocol could also accurately reproduce combined quantum mechanical/molecular mechanical (QM/MM) forces on the quantum mechanical (QM) solute in the ASEC environment, thus enabling the development of accurate ML-based implicit solvent models for ab initio-QM MD simulations. Such ML-based implicit solvent models for QM calculations are cost-effective in both the training stage, where the use of ASEC reduces the number of data points to be labelled, and the inference stage, where the MLP can be evaluated at a relatively small additional cost on top of the QM calculation of the solute.

Item Machine-Learning-Assisted Free Energy Simulation of Solution-Phase and Enzyme Reactions
(ACS, 2021-09) Pan, Xiaoliang; Yang, Junjie; Van, Richard; Epifanovsky, Evgeny; Ho, Junming; Huang, Jing; Pu, Jingzhi; Mei, Ye; Nam, Kwangho; Shao, Yihan; Chemistry and Chemical Biology, School of Science
Despite recent advances in the development of machine learning potentials (MLPs) for biomolecular simulations, there has been limited effort on developing stable and accurate MLPs for enzymatic reactions. Here we report a protocol for performing machine-learning-assisted free energy simulation of solution-phase and enzyme reactions at the ab initio quantum-mechanical/molecular-mechanical (ai-QM/MM) level of accuracy. Within our protocol, the MLP is built to reproduce the ai-QM/MM energy and forces on both QM (reactive) and MM (solvent/enzyme) atoms. As an alternative strategy, a delta machine learning potential (ΔMLP) is trained to reproduce the differences between the ai-QM/MM and semiempirical (se) QM/MM energies and forces.
To account for the effect of the condensed-phase environment in both MLP and ΔMLP, the DeePMD representation of a molecular system is extended to incorporate the external electrostatic potential and field on each QM atom. Using the Menshutkin and chorismate mutase reactions as examples, we show that the developed MLP and ΔMLP reproduce the ai-QM/MM energy and forces with errors that on average are less than 1.0 kcal/mol and 1.0 kcal mol⁻¹ Å⁻¹, respectively, for representative configurations along the reaction pathway. For both reactions, MLP/ΔMLP-based simulations yielded free energy profiles that differed by less than 1.0 kcal/mol from the reference ai-QM/MM results at only a fraction of the computational cost.

Item Reaction Path-Force Matching in Collective Variables: Determining Ab Initio QM/MM Free Energy Profiles by Fitting Mean Force
(American Chemical Society, 2021) Kim, Bryant; Snyder, Ryan; Nagaraju, Mulpuri; Zhou, Yan; Ojeda-May, Pedro; Keeton, Seth; Hege, Mellisa; Shao, Yihan; Pu, Jingzhi; Chemistry and Chemical Biology, School of Science
First-principles determination of free energy profiles for condensed-phase chemical reactions is hampered by the daunting costs associated with configurational sampling on ab initio quantum mechanical/molecular mechanical (AI/MM) potential energy surfaces. Here, we report a new method that enables efficient AI/MM free energy simulations through mean force fitting. In this method, a free energy path in collective variables (CVs) is first determined on an efficient reactive aiding potential. Based on the configurations sampled along the free energy path, correcting forces to reproduce the AI/MM forces on the CVs are determined through force matching. The AI/MM free energy profile is then predicted from simulations on the aiding potential in conjunction with the correcting forces. Such cycles of correction-prediction are repeated until convergence is established.
As the instantaneous forces on the CVs sampled in equilibrium ensembles along the free energy path are fitted, this procedure faithfully restores the target free energy profile by reproducing the free energy mean forces. Due to its close connection with the reaction path-force matching (RP-FM) framework recently introduced by us, we designate the new method as RP-FM in collective variables (RP-FM-CV). We demonstrate the effectiveness of this method on a type-II solution-phase SN2 reaction, NH3 + CH3Cl (the Menshutkin reaction), simulated with an explicit water solvent. To obtain the AI/MM free energy profiles, we employed the semiempirical AM1/MM Hamiltonian as the base level for determining the string minimum free energy pathway, along which the free energy mean forces are fitted to various target AI/MM levels using the Hartree-Fock (HF) theory, density functional theory (DFT), and the second-order Møller-Plesset perturbation (MP2) theory as the AI method. The forces on the bond-breaking and bond-forming CVs at both the base and target levels are obtained by force transformation from Cartesian to redundant internal coordinates under the Wilson B-matrix formalism, where the linearized FM is facilitated by the use of spline functions. For the Menshutkin reaction tested, our FM treatment greatly reduces the deviations in the CV forces, from an original range of 12-33 kcal/mol/Å to ∼2 kcal/mol/Å.
Comparisons with the experimental and benchmark AI/MM results, tests of the new method under a variety of simulation protocols, and analyses of the solute-solvent radial distribution functions suggest that RP-FM-CV can be used as an efficient, accurate, and robust method for simulating solution-phase chemical reactions.

Item Training Machine Learning Potentials for Reactive Systems: A Colab Tutorial on Basic Models
(Wiley, 2024) Pan, Xiaoliang; Snyder, Ryan; Wang, Jia-Ning; Lander, Chance; Wickizer, Carly; Van, Richard; Chesney, Andrew; Xue, Yuanfei; Mao, Yuezhi; Mei, Ye; Pu, Jingzhi; Shao, Yihan; Chemistry and Chemical Biology, School of Science
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field - the training of system-specific MLPs for reactive systems - with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect to be also useful to other young researchers in the community. Our tutorial begins with the introduction of simple feedforward neural network (FNN) and kernel-based (using Gaussian process regression, GPR) models by fitting the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features will be fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.
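As a rough illustration of the first exercise mentioned in the tutorial abstract, the sketch below fits the two-dimensional Müller-Brown potential with a bare-bones GPR written in standard-library Python. It is not taken from the tutorial: the potential uses the commonly quoted textbook parameters, while the 8 x 8 training grid, the RBF length scale of 0.25, the jitter value, and all function names are assumptions chosen here to keep the example small.

```python
import math

# Commonly quoted Muller-Brown parameters (assumed; not copied from the tutorial).
A = (-200.0, -100.0, -170.0, 15.0)
a = (-1.0, -1.0, -6.5, 0.7)
b = (0.0, 0.0, 11.0, 0.6)
c = (-10.0, -10.0, -6.5, 0.7)
X0 = (1.0, 0.0, -0.5, -1.0)
Y0 = (0.0, 0.5, 1.5, 1.0)

def muller_brown(x, y):
    """Sum of four anisotropic Gaussian-like terms."""
    return sum(
        A[k] * math.exp(a[k] * (x - X0[k]) ** 2
                        + b[k] * (x - X0[k]) * (y - Y0[k])
                        + c[k] * (y - Y0[k]) ** 2)
        for k in range(4))

def rbf(p, q, length=0.25):
    """Squared-exponential kernel between two 2D points."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return math.exp(-0.5 * d2 / length ** 2)

def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def solve_cholesky(low, rhs):
    """Solve (low low^T) x = rhs by forward then backward substitution."""
    n = len(low)
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def fit_gpr(points, energies, jitter=1e-10):
    """Return the weights alpha = (K + jitter I)^-1 y of a zero-mean GPR."""
    kmat = [[rbf(p, q) + (jitter if i == j else 0.0)
             for j, q in enumerate(points)] for i, p in enumerate(points)]
    return solve_cholesky(cholesky(kmat), energies)

def predict(points, alpha, query):
    """Posterior mean at a query point."""
    return sum(w * rbf(query, p) for w, p in zip(alpha, points))

# Training data on an 8 x 8 grid over x in [-1.5, 1.0], y in [-0.5, 2.0].
axis_x = [-1.5 + 2.5 * i / 7 for i in range(8)]
axis_y = [-0.5 + 2.5 * j / 7 for j in range(8)]
train_pts = [(x, y) for x in axis_x for y in axis_y]
train_e = [muller_brown(x, y) for x, y in train_pts]
alpha = fit_gpr(train_pts, train_e)
```

With so coarse a grid the model interpolates the training energies closely but should not be expected to be accurate between grid points; a denser grid, tuned hyperparameters, or derivative observations would be natural next steps.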