From Computational Physics to Structural Bioinformatics

Part 0

Fernando Luís BARROSO da Silva Department of Biomolecular Sciences, FCFRP/USP, Brazil [email protected] March 09, 2020

1

or Computational physics applied to biological science or

where Physics, Biology and computer science meet [pictures from internet, unknown internet,from[picturesunknown author]

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

2

1 where Physics, Biology and

computer science meet

Pressure [pictures from internet, unknown internet,from[picturesunknown author] Temperature

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

3

Multi-scale & multi-approach modeling

4

2 Molecular base to understand diseases, pharmaceuticals, bioseparation processes, new functionalized materials, ....

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

5

General approach: to rationalize key applied systems From the REAL system to CG models

+ - + - + - - + - microencapsulation pH and salt

Example of measurements Electrostatic interactions Electrostatic

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

6

3 From Computational Physics to Structural Bioinformatics

1. M.P.Allen e D.J. Tildesley, Computer Simulation of Liquids, (Oxford University Press, 1989). 2. D. Frenkel e B. Smid, Understanding Molecular Simulation, (Academic Press, 2001). 3. T. Schlick, Molecular Modeling and Simulation, (Springer, 2010). 4. J.M. Haille, Molecular Dynamics Simulation: Elementary Methods, (Wiley-Interscience, 1997). 5. D.C. Rappaport, The Art of Molecular Dynamics Simulation, (Cambridge University Press, 1997). + specific scientific papers

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

7

….Structural Bioinformatics

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

8

4 Syllabus

✓ About us ✓ Bibliography

✓ From complex problems to key answers ✓ Early computer experiments ✓ What is Structural Bioinformatics? ✓ Molecular modeling ✓ Solving the model

✓ Electrostatic interactions & constant-pH simulation methods

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

9

Life is complex! [unknown author] [unknown

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

10

5 Physicist΄s toolbox

predict events

퐴 = −푘푇푙푛 푄 = 푈 − 푇푆

[unknown author] [unknown Free energy Free

11

푤(푟) = −푘퐵푇 ln[ 푔(푟)]

The radial distribution function hist(r) g(r) =

nobsV [Barroso da Silva, daSilva, 2000] [Barroso Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

12

6 Intermolecular interactions and the radial distribution function

[Skaf & Barroso da Silva, 1994]

Water

intermolecular interactions

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

13 [FB, SP LGD, & 2016] PD,

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

14

7 The radial distribution function How do I generate the configurations? hist(r) g(r) = nobsV …

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

15

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

16

8 How does computer simulation enter in this picture?

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

17

Easier to be done in a computer experiment! [unknown author] [unknown

18

9 Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

19

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

20

10 rdf

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

21

Structural Bioinformatics

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

22

11 We need to create a common language!

Computer simulation B.P.C. – DCBM/FCFRP - USP Fernando Barroso ([email protected]) SAIFR, March 9-15, 2020

23

“…is conceptualising biology in terms of molecules (in the sense of Physical chemistry) and applying “informatics techniques” (derived from disciplinessuch as applied maths, computer science and statistics) to understand and organise the information associated with these molecules, on a large scale.”

24

12 Homology modeling

biomolecular “Omics” Molecular proteomics simulation data mining modeling Binding affinities alignments molecular dynamics structure prediction protein interactions genomics docking

Statistics Physics

25

PSP Biological approach

1) Find a known structure with a similar sequence 2) Align the sequences 3) Model the unknown structure on the known structure using the alignment

26

13 Physical approach

1) Choose or make a model 2) Simulate

f (x) Solution of the Modelling model

What Force field? Molecular dynamics? Langevin dynamics? Genetic algorithm? simplification Monte Carlo? reduction in # sites accuracy practical relevance Poisson-Boltzmann?

27

Real world Physical approach

Computer Experiments — Our main tools Solve • Modeling the model • Implementation • Setup • Runs (simulation) • Analyses

Bioinformatics B.P.C. – DCBM/FCFRP - USP Fernando Barroso ([email protected]) SAIFR - March 9-15, 2020

28

14 Molecular modeling

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

29

Gwild ✓ Fundamental interactions General questions ✓ Each aa contribution G ✓ Molecular mechanisms mutant

More specific questions CaM-smMLK

• Complex? In what condition? Is it stable? • Contact residues? • Mean smMLK conformation? • What sites have changes on pKas? Where are they localized? • Effect of Ca2+ on the CaM-smMLK? • Effect of smMLK on the Ca2+-CaM? • Critical mutation at contact • Other critical mutations

30

15 Models (general idea)

Mesoscopic models

31

1 1 푈 = 푘 푏 − 푏 2 + 푘 휃 − 휃 2 2 푏 0 2 휃 0 bonds angles 1 + 푘 1 + cos 푛휑 − 훿 2 휑 dihedrals 퐴 퐵 푞 푞 + − + 1 2 Direct approach 푟12 푟6 휀휀 푟 non−bonded 0 pairs

B2

Compare with experiments

32

16 33

Examples of force fields

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

34

17 1 1 푈 = 푘 푏 − 푏 2 + 푘 휃 − 휃 2 2 푏 0 2 휃 0 bonds angles 1 + 푘 1 + cos 푛휑 − 훿 2 휑 dihedrals 퐴 퐵 푞 푞 + − + 1 2 푟12 푟6 휀휀 푟 non−bonded 0 pairs

35

36

18 *

Bioinformatics B.P.C. – DCBM/FCFRP - USP Fernando Barroso ([email protected]) SAIFR - March 9-15, 2020

37

38

19 RMC aim

39

40

20 Solving the model

f (x) Solution of the Modelling model

Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

41

Molecular Dynamics

Protein Protein Thermodynamics Protein Dynamics Structure Protein Function

compute Forces

푑푟 퐹 Update 푣(푡) = = 푡 + 푣0 Analysis 푑푡 푚 F = ma Trajectory 퐹 푟(푡) = 푡2 + 푣 푡 + 푟 2푚 0 0 Time averages Update positions and velocities

42

21 ✓ Gromacs ✓ Charmm ✓ Amber ✓ Discover, Insight ✓ Sigma, Tripos, … ✓ NAMD ✓ Tinker ✓ Lammps... http://www.mdtutorials.com/gmx/lysozyme/index.html

43

Monte Carlo

Trial move

Ei Ej

E j − Ei 0 → Accept move −E E j − Ei 0 → Accept move if e random number generated uniformly on (0,1)

44

22 Monte Carlo

Protein Protein Thermodynamics Structure Protein Function

Random displacement 106 steps Evaluate Update ensemble Analysis Energies −E j Pj e

Accept or Ensemble averages reject

45

Monte Carlo

46

23 Simulation Methods for Protein Structure Fluctuations Scott H. Northrup and J. Andrew McCammon Biopolymers, Vol. 19, Issue 5, pp. 1001-1016 (1980)

Three numerical techniques for generating thermally accessible configurations of globular proteins are considered; these techniques are the molecular dynamics method, the Metropolis Monte Carlo method, and a modified Monte Carlo method which takes account of the forces acting on the protein atoms. The molecular dynamics method is shown to be more efficient than either of the Monte Carlo methods. Because it may be necessary to use Monte Carlo methods in certain important types of sampling problems, the behavior of these methods is examined in some detail. It is found that an acceptance ratio close to 1/6 yields optimum efficiency for the Metropolis method, in contrast to what is often assumed. This result, together with the overall inefficiency of the Monte Carlo methods, appears to arise from the anisotropic forces acting on the protein atoms due to their covalent bonding. Possible ways of improving the Monte Carlo methods are suggested.

47

x Computational physics

✓ Rationalize processes ✓ Molecular design patients ✓ To deeply understand protein function ✓ To identify new targets ✓ To develop methods and algorithms to determine Next Generation Sequencers protein characteristics ✓ To analyze drug binding

in silico drug discovery

biophysical characterization whole genome sequencing

48

24 Fernando Barroso, FCFRP/USP SAIFR - March 9-15, 2020

49

From Molecular to Macroscopic Scale (Bio2020)

Experiments, theory and computer simulation

Macroscopic scale

Biomolecular interactions

Single molecule Electrolyte solutions

50

25 From Molecular to Macroscopic Scale (Bio2020)

Experiments, theory and computer simulation

Macroscopic scale

Biomolecular interactions

Single molecule Electrolyte solutions

51

26