How is ligand binding modeled?


I have the following exercise to solve:

To be honest, for both parts, my only idea so far would be to divide the rate expression by the sphere area and multiply by the new available areas (that of a circle for part a and that of many circles for part b). But if I do this, I will have an "a" parameter always, which makes no sense because it is representative only of the sphere. Any ideas on how to approach this? All help is greatly appreciated.

Don't use the perfect square model. You will keep getting pi, but it is not a real-life formula; it is used only for theory. Try a Cartesian graph in three dimensions to take curve flexibility into account. Integral calculus is similar, or better.

Study of protein-ligand binding by fluorescence

Physical Biochemistry Laboratory, Facultad de Ciencias, Universidad de la República, Iguá 4225, 11400 Montevideo, Uruguay. Tel.: 5982-525-86-18 Fax: 5982-525-07-49

For two binding sites and two ligands, L and D, the binding of L is:

Note that the unbound concentration of each ligand is needed to predict the binding. Usually experiments are designed using accurately known total ligand concentrations and measurements are made of the bound ligand. The binding of all the ligands in the system must be known to predict the unbound concentration. This can be done using a general binding model described by Feldman (1972) and implemented by Munson & Rodbard (1980). This model is implemented in the BIND library in MKMODEL.
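The mass-balance step described above can be sketched as follows. This is a minimal illustration, not the BIND/MKMODEL or LIGAND implementation: it assumes simple competitive binding of L and D at independent site classes and solves for the unbound concentrations by fixed-point iteration. The function names, argument names and tolerance are hypothetical.

```python
def free_concentrations(total_l, total_d, capacities, kd_l, kd_d,
                        tol=1e-12, max_iter=10000):
    """Solve the mass-balance equations for unbound L and D.

    capacities[j] is the concentration of site class j; kd_l[j] and kd_d[j]
    are the dissociation constants of L and D at that site class.
    Assumes competitive binding with no cooperativity (an assumption).
    """
    free_l, free_d = total_l, total_d  # start from "nothing bound"
    for _ in range(max_iter):
        # Competitive occupancy denominator for each site class.
        denom = [1.0 + free_l / kl + free_d / kd
                 for kl, kd in zip(kd_l, kd_d)]
        # total = free * (1 + sum_j B_j / (Kd_j * denom_j)), rearranged for free.
        new_l = total_l / (1.0 + sum(b / (kl * d)
                                     for b, kl, d in zip(capacities, kd_l, denom)))
        new_d = total_d / (1.0 + sum(b / (kd * d)
                                     for b, kd, d in zip(capacities, kd_d, denom)))
        converged = abs(new_l - free_l) < tol and abs(new_d - free_d) < tol
        free_l, free_d = new_l, new_d
        if converged:
            break
    return free_l, free_d


def bound_l(free_l, free_d, capacities, kd_l, kd_d):
    """Bound concentration of L summed over all site classes."""
    return sum(b * (free_l / kl) / (1.0 + free_l / kl + free_d / kd)
               for b, kl, kd in zip(capacities, kd_l, kd_d))
```

With one site class and no competitor the result reduces to the familiar quadratic solution for a single binding site, which is a convenient sanity check.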

  • Feldman HA. Mathematical theory of complex ligand-binding systems at equilibrium. Analyt Biochem 1972; 48:317-338.
  • Munson PJ, Rodbard D. LIGAND: A versatile computerised approach for characterisation of ligand binding systems. Analyt Biochem 1980; 107:220-239.

11: Binding proteins- Antibodies, Myoglobin/Hemoglobin

IgG Antibody Structure: light chains are in green and dark blue, heavy chains in light blue and orange, disulfide bonds in yellow spacefill, carbohydrate in red wireframe.

3. For immunoaffinity chromatography, antibodies specific to your protein are bound to beads and used to purify your protein. The protein can sometimes be eluted with salt, but elution may require stronger means (concentrated urea, SDS).

Procedure for antibody production:

  1. Purify a small amount of your protein.
  2. Inject a sample into a mammal (usually a rabbit, mouse, rat or goat) or a chicken.
  3. Wait and do a second injection to increase immune response.
  4. The animal's immune system produces antibodies that will bind tightly to the injected protein.
  5. Harvest blood (mammal) or egg (chicken).
  6. Use blood serum directly or do additional purification from this "antiserum".
  7. Lipids and other contaminants in egg yolk must be separated from antibodies prior to use.

Antibodies are bound to beads for use in immunoaffinity chromatography (antibody affinity chromatography).
Antibodies labeled with a radioactive, fluorescent or histochemical tag (an attached enzyme that will make color from a colorless compound) can be used to detect proteins that have been run on a gel.

Three Dimensional Protein Structure and Ligand Binding

Students will build a protein, with each student acting as an individual amino acid, and the assembled protein will function as a catalyst.

Enzymes: 3-D Structure and Function

Understanding how enzymes function is extremely important for developing medical treatments and technologies. This Enzyme 3-D Structure and Function Activity is designed for high school biology students and familiarizes them with the features of protein structure that determine its function. The overall activity consists of three portions: the 1st is an Introductory Presentation, the 2nd is a Kinesthetic activity, and the 3rd is a 3-D Structure Visualization. The Introductory Presentation sets the tone for the activity and gives the students background on proteins and amino acids. It also introduces enzymes and their various physiological functions. The Kinesthetic activity presents the students with the challenge of folding a protein correctly. Each student will represent an amino acid; together they will form a polypeptide chain by holding hands and then arrange themselves into the lowest energy conformation. The 3-D Structure Visualization allows students to see a molecular view of an enzyme and its active site from an atomic perspective. Students will be assessed with a survey that asks what they knew beforehand, what they learned, and what they want to know more about.

Duration: 60-95 minutes

Learning Objectives:

  • What proteins are made of and how they are chemically structured in general
  • An enzyme is a special class of proteins that can perform catalytic reactions converting substrates into products
  • Catalysts are not consumed in the reaction but are reused
  • The active site is necessary to performing this function but can be disrupted by inhibitors or mutations
  • The SHAPE of a protein affects its FUNCTION
  • Changing the enzyme’s environment will alter its function


Sea Urchin Fertilization and Development Activity

In this activity the students will learn about model organisms and cell development by fertilizing sea urchin eggs and monitoring their development. Sea urchin embryos are used as a model organism in Amro Hamdoun’s lab at the Scripps Institution of Oceanography at UC San Diego to study a class of proteins called ABC-transporters. These proteins are important for effluxing toxic chemical substrates from cells and also cause certain cancer cells to become resistant to chemotherapy. Sea urchin growth and development can be easily seen with a simple light microscope, and the main stages of embryogenesis can be monitored over a few days. The extraction of sperm and eggs from sea urchins is relatively easy and, if done appropriately, the urchins can be reused to collect gametes multiple times. Briefly, the students will fertilize Lytechinus pictus urchin eggs and observe the changes in cellular morphology over time. They will describe and illustrate what they observe throughout the two-day activity. To incorporate inquiry into the activity, the students will then make serial dilutions of the sperm and observe any differences in fertilization when these dilutions are added to the eggs. Students will be assessed by answering questions at the end of each section that require them to make and record their observations. This activity is a modification of multiple sea urchin fertilization protocols.

Discussion and conclusion

In this work, we treat binding as a local event and emphasize local information in target-ligand interaction prediction. We use site-ligand interactions instead of target-ligand interactions and propose a chemically interpretable model to describe them. We first extract the ligand-binding sites from target-ligand complexes. Then we break the binding sites and ligands into fragments so that they can be encoded as fragment vectors based on the target and ligand dictionaries, respectively. Finally, we assume that fragment interactions determine the site-ligand interaction and propose a model, the fragment interaction model (FIM), to generalize this assumption. The proposed model achieves a higher AUC score (92%) than three prevalent algorithms: CS-PD (80%), BLM-NII (85%) and RF (85%). In addition, the fragment interaction network derived from FIM is chemically interpretable. Unlike the BLM-NII, RF and CS-PD models, FIM requires a crystal structure to extract the local information (the binding site), which sometimes hinders its application. However, with the growing number of determined protein crystal structures and the development of molecular modeling techniques, a 3D structure can be modeled computationally and the binding site extracted from it.

Compared with traditional target-based or ligand-based approaches, the proposed FIM method has the advantage of finding target candidates and ligand candidates simultaneously. Moreover, FIM can predict the interaction between previously unseen targets and ligand candidates. Unlike other target-ligand based methods, our method emphasizes the basic chemical interactions between amino acids and ligand fragments, which is more general and could be applied beyond drug-target interactions. Furthermore, we no longer represent the target as a whole but extract the ligand-binding sites from target-ligand complexes and use the binding sites to describe the genomic space. On the one hand, representing the genomic space by binding sites allows us to provide site-ligand interaction prediction, which is important for multi-site targets. On the other hand, the binding sites are local, which helps to achieve a chemically interpretable model. Along these lines, we break the binding sites and ligands into fragments and regard the fragment interactions as interactions between the genomic and chemical spaces. Under FIM, it is clear how the genomic space interacts with the chemical space.

In all, we highlight the local information in the binding process and attempt to establish a clear relationship between the genomic and chemical spaces. The proposed model (FIM) uses the ligand binding sites as local information and views the interactions between binding-site and ligand fragments as interactions between the genomic and chemical spaces. The fragment interactions are straightforward and chemically interpretable, and the fragment interaction network reflects the chemical interactions. The comparison shows that FIM outperforms the other three approaches. The investigation of the role of global information shows that local information dominates the predictive accuracy and that integrating global information might improve the predictive ability only to a very limited extent.

Aims of this article

A characteristic feature of most integrin receptors is their ability to bind a wide variety of ligands. Moreover, many extracellular matrix and cell surface adhesion proteins bind to multiple integrin receptors (Humphries, 1990; Plow et al., 2000; van der Flier and Sonnenberg, 2001). In recent years, structure-function analyses of both integrins and their ligands have revealed a similar mode of molecular interaction that explains this promiscuity. Nonetheless, the integrin literature is replete with studies describing different integrin-ligand pairs, and the major aim of this article is to provide a clarification of this picture.

Materials and Methods

Data sets and decoys generation

PDBbind v2017 is the source of data for the present study; it contains over 16,151 protein–ligand complexes (Wang et al., 2005). Around 270 complexes that contain rarely occurring atom types (such as) that could not be processed by RDKit were removed (Wildman & Crippen, 1999), and after cleaning the data we have 14,491 protein–ligand complexes for further processing. The fpocket tool was used with its default parameters to generate the pockets for a given protein (Le Guilloux, Schmidtke & Tuffery, 2009). The pockets and their corresponding ligands were used to generate the input for our model. The data preparation process is similar to our previous work (Zhang et al., 2019a). The ligands were converted into SMILES format by Open Babel (O’Boyle et al., 2011) and then converted into a 300-dimensional vector by the mol2vec tool (Jaeger, Fulle & Turk, 2018). The basic idea of mol2vec is to treat the SMILES string as a molecular sentence composed of words (substructures). As in the natural language processing method word2vec, an unsupervised machine learning method is used to learn a vector for each word from a large dataset (corpus) of available chemical compounds (Krallinger et al., 2015). Each residue in the pocket is converted into a 300-dimensional vector by the mol2vec tool, and the residue vectors are summed into a single 300-dimensional vector that represents the pocket. The protein–ligand binding is then represented as a 600-dimensional vector, which concatenates the ligand vector and the pocket vector using an in-house Python script. The three-dimensional structures shown in the article were plotted using Visual Molecular Dynamics (VMD) and Chimera (Humphrey, Dalke & Schulten, 1996; Pettersen et al., 2004).
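The vector assembly described above (sum the residue vectors, then concatenate with the ligand vector) can be sketched in a few lines. This is only an illustration of the arithmetic, not the authors' in-house script; the function names are hypothetical, and the embeddings are assumed to be already computed as plain lists of floats.

```python
def pocket_vector(residue_vectors):
    """Sum per-residue embeddings element-wise into one pocket vector."""
    dim = len(residue_vectors[0])
    if any(len(vec) != dim for vec in residue_vectors):
        raise ValueError("all residue vectors must have the same length")
    return [sum(vec[k] for vec in residue_vectors) for k in range(dim)]


def complex_vector(ligand_vec, residue_vectors):
    """Concatenate the ligand vector with the summed pocket vector.

    With 300-dimensional embeddings this yields the 600-dimensional
    input described in the text.
    """
    return list(ligand_vec) + pocket_vector(residue_vectors)
```

Summation makes the pocket vector independent of residue order and of the number of residues, at the cost of discarding the spatial arrangement of the pocket.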

Positive and negative datasets

We constructed two positive datasets with different strategies. In strategy 1, the known pocket is defined as the collection of atoms that fall within 1 nm of the known ligand, and the potential pockets that are close to the known pocket (C alpha center distance to the known pocket smaller than 0.3 nm) are taken as the positive dataset. In strategy 2, the known pockets are chosen directly as the positive dataset. The negative dataset is constructed from fpocket predictions of the potential pockets of the proteins in the PDBbind database, with the fpocket parameters set to their defaults. We randomly choose three pockets whose C alpha center distance to the native pocket is larger than 1 nm. If fewer than three pockets are more than 1 nm from the native pocket, we take all of them. The selected predicted pockets, together with the ligands, are taken as the negative dataset. If the decoy pocket is far away from the known ligand binding pocket (the center distance between the known pocket and the decoy pocket is larger than 3 nm) and the pocket’s vector is not similar to that of the known ligand binding pocket, we assume the pocket is not the near-native pocket of the ligand. We understand that this is a big approximation: we cannot guarantee that a defined non-ligand-binding pocket is not a druggable site, but it is highly likely that it is not the binding site of the given ligand. Deep learning can tolerate noise (very small portions of unreliable data), so this approximation can still be used in the construction of our model.
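The decoy-sampling rule for the negative dataset (three random pockets farther than 1 nm from the native pocket, or all of them if fewer than three qualify) might look like the sketch below. The function name, the dictionary layout and the use of precomputed C alpha centers in nm are assumptions made for illustration.

```python
import math
import random


def select_decoys(native_center, pocket_centers, min_dist_nm=1.0,
                  n_decoys=3, rng=None):
    """Pick decoy pockets whose center is farther than min_dist_nm from the native pocket.

    pocket_centers maps a pocket name to its (x, y, z) center in nm.
    If no more than n_decoys pockets qualify, all of them are returned.
    """
    rng = rng or random.Random()
    far = [name for name, center in pocket_centers.items()
           if math.dist(native_center, center) > min_dist_nm]
    if len(far) <= n_decoys:
        return sorted(far)
    return sorted(rng.sample(far, n_decoys))
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible between runs.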

To make sure the vector of each decoy pocket is far from that of its corresponding known pocket, we used the following formula to calculate their vector similarity. Using a cutoff value of 0.995, we removed those pockets that are highly similar to the native one.

$$S_{ij} = \frac{V_i \cdot V_j}{|V_i|\,|V_j|} \qquad (1)$$

where $S_{ij}$ is the measured similarity between pocket i and pocket j, $V_i$ is the vector of pocket i, and $V_j$ is the vector of pocket j. The dataset was divided into two groups: near-native pockets as positive “A” and native pockets as positive “B”. The training, validation and testing splits for the two groups of datasets are shown in Fig. 1. The model generated with training set A is validated and tested with dataset B and vice versa.
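Equation (1) is the cosine similarity between two pocket vectors. A minimal sketch of the similarity filter follows; the function names are hypothetical, and treating a similarity exactly equal to the cutoff as "highly similar" is an assumption, since the text does not say which side of the threshold is inclusive.

```python
import math


def cosine_similarity(v_i, v_j):
    """S_ij = (V_i . V_j) / (|V_i| |V_j|); undefined for zero vectors."""
    dot = sum(a * b for a, b in zip(v_i, v_j))
    norm_i = math.sqrt(sum(a * a for a in v_i))
    norm_j = math.sqrt(sum(b * b for b in v_j))
    return dot / (norm_i * norm_j)


def filter_decoys(native_vec, decoy_vecs, cutoff=0.995):
    """Keep only decoys whose similarity to the native pocket is below the cutoff."""
    return {name: vec for name, vec in decoy_vecs.items()
            if cosine_similarity(native_vec, vec) < cutoff}
```

Because cosine similarity ignores vector length, two pockets with the same residue composition but different sizes would still be flagged as highly similar.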

Figure 1: Classification of datasets into different groups.

Preparation of extra test sets

We collected protein–ligand complexes that were deposited in the PDB database after the year 2018 (Berman et al., 2000). We removed redundancy by keeping only one PDB structure when several structures come from the same gene. These protein–ligand complexes are not in the PDBbind 2018 dataset (Wang et al., 2005) and are used as an extra testing set. The extra testing set is further divided into three parts: 11 cases for which fpocket generated near-native pockets (extra test set A); 6 cases for which fpocket did not generate near-native pockets (extra test set B); and 2 cases that have two pockets corresponding to different ligands (extra test set C). The classification and grouping of the data are presented in Fig. 1. For the 6 cases in which fpocket could not generate near-native decoys, the native pocket was used as the positive. The proteins with PDB identifiers 6QTN (A, F chain) and 5ZG2 contain two pockets with different ligands bound. We checked whether our method can successfully identify the correct pocket for each of the ligands. The known pocket and near-native pocket are defined by the same method as above. The proteins in extra test sets A and B were also subjected to prediction by P2Rank with its default parameters for comparison (Krivák & Hoksza, 2018; Jendele et al., 2019).

Preparation of G-protein coupled receptor independent test dataset

We retrieved 98 GPCR-ligand complexes from the GPCRDB database (Pándy-Szekeres et al., 2018). The near-native pocket was defined by the same procedure as described above. We are interested in testing whether our method performs well on these challenging GPCR proteins, which are characteristic of a true structure-based drug discovery application scenario.

Construction of deep learning model

The details of the model construction procedure are illustrated in Fig. 2. It comprises data processing, model training, validation and testing. We use the DFCNN, inspired by DenseNet, as our model (Huang et al., 2017). The DFCNN model architecture is similar to that of our previous work (Zhang et al., 2019a). A fully connected neural network is suitable for vectors as inputs. Moreover, DenseNet can overcome the vanishing gradient problem and allows many deep layers for learning more abstract features. This model showed good performance in identifying protein–ligand binding affinity in our previous work (Zhang et al., 2019a). It has advantages in protein–ligand binding estimation over most other machine learning methods, including Support Vector Machine (Suykens & Vandewalle, 1999), RandomForest (Breiman, 2001), XGBoost (Chen & Guestrin, 2016) and Convolutional Neural Network (CNN) (Krizhevsky, Sutskever & Hinton, 2012). The densely fully-connected neural network (DFCNN) and the CNN were built using Keras (Chollet, 2015) with the Tensorflow back end (Abadi et al., 1983). The DFCNN has 16 densely connected layers, each outputting 100 units, plus a normal fully-connected layer outputting one unit as the final output. Specifically, a densely connected layer is a layer that takes all outputs of its preceding layers as its input, which substantially mitigates the vanishing gradient problem. Rates for the dropout layers are all set to 0.25. The dense layers employ the ReLU activation function, except for the output layer, which employs the sigmoid activation function. The input to the network is normalized so that its mean and standard deviation are 0 and 1, respectively. The Adam optimizer was used to minimize the binary cross-entropy of the DFCNN.
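To make the dense connectivity concrete, here is a framework-free sketch of the forward pass. It is not the authors' Keras model: weights are random, dropout and training are omitted, and two details are assumptions, namely that the original input is part of what every later layer sees and that the output layer reads the full concatenation. All names are hypothetical.

```python
import math
import random


def relu(z):
    return z if z > 0.0 else 0.0


def sigmoid(z):
    # Numerically stable form for large negative z.
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dense_layer(inputs, weights, biases, activation):
    """Plain fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_dfcnn(input_dim, n_dense=16, units=100, seed=0):
    """Random weights; layer k sees input_dim + k * units features."""
    rng = random.Random(seed)
    layers, width = [], input_dim
    for _ in range(n_dense):
        scale = 1.0 / math.sqrt(width)
        weights = [[rng.uniform(-scale, scale) for _ in range(width)]
                   for _ in range(units)]
        layers.append((weights, [0.0] * units))
        width += units  # the next layer also receives this layer's outputs
    scale = 1.0 / math.sqrt(width)
    output = ([[rng.uniform(-scale, scale) for _ in range(width)]], [0.0])
    return layers, output


def forward(x, layers, output):
    """Concatenate each layer's output onto the running feature list."""
    features = list(x)
    for weights, biases in layers:
        features = features + dense_layer(features, weights, biases, relu)
    out_w, out_b = output
    return dense_layer(features, out_w, out_b, sigmoid)[0]
```

Under these assumptions, a 600-dimensional input with 16 layers of 100 units means the last dense layer receives 2,100 features and the output unit receives 2,200.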

Figure 2: The workflow of DeepBindPoc model.

Data normalization and performance evaluation

All the data were normalized before being passed to the model. The normalization is as follows:

$$t = \mathrm{data\_set} \qquad (2)$$

$$t_{\mathrm{normalize}} = \frac{t - \mathrm{mean}}{\mathrm{std}} \qquad (3)$$

where t is the data set value, mean is the data mean value and std is the data standard deviation. We tested two normalization strategies. The first uses fixed mean and standard deviation values taken directly from the training dataset (mean = −0.5696 and std = 30.8744 when using near-native pockets as positive, and mean = −0.9610 and std = 63.6607 when using native pockets as positive). The second normalizes each dataset by its own statistics and is used for comparison in the present study. We find that normalizing by the training dataset is more reliable, so we use it unless specifically stated otherwise. Several metrics were used to evaluate the proposed models, including accuracy, Area Under the receiver operating characteristic Curve (AUC) (Hanley & McNeil, 1982), Matthews Correlation Coefficient (MCC), True Positive Rate (TPR), specificity and sensitivity.
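A short sketch of the first normalization strategy and of the threshold-based metrics is given below. The names are hypothetical, the use of the population standard deviation is an assumption, and AUC is left out because it needs the full ranking of scores rather than confusion-matrix counts.

```python
import math


def fit_stats(training_values):
    """Mean and (population) standard deviation of the training data."""
    mean = sum(training_values) / len(training_values)
    var = sum((v - mean) ** 2 for v in training_values) / len(training_values)
    return mean, math.sqrt(var)


def normalize(values, mean, std):
    """Equation (3), applied with fixed statistics from the training set."""
    return [(v - mean) / std for v in values]


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, TPR (sensitivity), specificity and MCC from confusion counts."""
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "tpr": tp / (tp + fn),  # identical to sensitivity
        "specificity": tn / (tn + fp),
        "mcc": (tp * tn - fp * fn) / mcc_denom,
    }
```

Reusing the training statistics for validation and test data keeps all splits on the same scale, whereas normalizing each dataset by its own statistics shifts the inputs whenever the class balance or pocket composition differs.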

Web server

The protein structure and its known ligands, in PDB format, are required as input to the web server. We first use fpocket to generate the pocket decoys, and we use mol2vec to convert the pockets and the ligand into vectors. After the protein and ligand vectors are concatenated, DeepBindPoc scores each decoy with its probability of being native-like and selects the top three decoys as the potential pockets. The name of each predicted pocket is shown on the page along with its fpocket score and DeepBindPoc score. The results can also be downloaded as a file for the convenience of the user. We also provide a batch mode, in which the user supplies a zip file of proteins with their corresponding ligands. For both the single mode and the batch mode, we provide an example input for the convenience of the user.


Accounting for the effect of solvent on the strength of molecular interactions has been a long-standing problem for molecular calculations in general and for structure-based drug design in particular. Here, we explore the generalized-Born (GB/SA) model of solvation (Still, W. C.; Tempczyk, A.; Hawley, R. C.; Hendrickson, T. J. Am. Chem. Soc. 1990, 112, 6127−9) to calculate ligand−receptor binding energies. The GB/SA approach allows for the estimation of electrostatic, van der Waals, and hydrophobic contributions to the free energy of binding. The GB/SA formulation provides a good balance between computational speed and accuracy in these calculations. We have derived a formula to estimate the binding free energy. We have also developed a procedure to penalize any unoccupied embedded space that might form between the ligand and the receptor during the docking process. To improve the computational speed, the protein contribution to the electrostatic screening is precalculated and stored on a grid. Refinement of the ligand position is required to optimize the nonbonded interactions between ligand and receptor. Our version of the GB/SA algorithm takes approximately 10 s per orientation (with minimization) on a Silicon Graphics R10000 workstation. In two test systems, dihydrofolate reductase (dhfr) and trypsin, we obtain much better results than the current DOCK (Ewing, T. J. A.; Kuntz, I. D. J. Comput. Chem. 1997, 18, 1175−89) force field scoring method (Meng, E. C.; Shoichet, B. K.; Kuntz, I. D. J. Comput. Chem. 1992, 13, 505−24). We also suggest a methodology to identify an appropriate parameter regime to balance the specificity and the generality of the equations.
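For orientation, the electrostatic (polarization) term of the generalized-Born model in the functional form usually attributed to Still et al. can be sketched as below. This is not the grid-based DOCK implementation described in the abstract: the Born radii are taken as given inputs, the surface-area (SA) term is omitted, and the function names, the solvent dielectric of 78.3 and the unit conventions (Å, elementary charges, kcal/mol) are assumptions.

```python
import math

COULOMB_KCAL = 332.06  # Coulomb constant in kcal*Angstrom/(mol*e^2)


def f_gb(r, alpha_i, alpha_j):
    """Effective distance: tends to r at long range and to the Born radius at r = 0."""
    aa = alpha_i * alpha_j
    return math.sqrt(r * r + aa * math.exp(-r * r / (4.0 * aa)))


def gb_polarization(charges, coords, born_radii, eps_solvent=78.3):
    """G_pol = -(332.06 / 2) * (1 - 1/eps) * sum_ij q_i q_j / f_GB(r_ij).

    The double sum runs over all pairs including i == j (self terms).
    """
    prefactor = -0.5 * COULOMB_KCAL * (1.0 - 1.0 / eps_solvent)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r, born_radii[i], born_radii[j])
    return prefactor * total
```

For a single ion the expression reduces to the Born solvation energy, and for well-separated charges the pair term approaches a solvent-screened Coulomb interaction, which is why the form interpolates smoothly between the two limits.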

Current address: Dalton Cardiovascular Research Center and Department of Biochemistry, University of Missouri, Columbia, MO 65211.

Current address: Computer-Assisted Drug Design, Bristol-Myers Squibb Company, 5 Research Parkway, Wallingford, CT 06492.

Many biologically important molecules have multiple binding sites. For example, hemoglobin, the oxygen-carrying molecule in red blood cells, binds four molecules of molecular oxygen, each at its own distinct binding site. Hemoglobin is an example of a cooperative molecule, that is, one where the binding of a ligand at one site alters the affinity of the other binding sites for their ligands. In the case of hemoglobin, the binding of a molecule of O2 at one site increases the affinity of the other sites for O2.

This property is critical for the function of hemoglobin, which picks up four molecules of O2 in the oxygen-rich environment of the lungs, then delivers them to the tissues, which have a much lower concentration of oxygen. As each molecule of O2 binds to hemoglobin, the affinity of the remaining binding sites increases, making it easier for more O2 to bind. Conversely, as each molecule of O2 is released to a tissue, the affinity of the remaining sites for O2 decreases, making it easier for subsequent molecules of O2 to dissociate.

Scanning electron micrograph of blood. The doughnut-shaped cells are red blood cells. Photo credit: Bruce Wetzel. Courtesy of the National Cancer Institute.

The problem of how hemoglobin delivers oxygen throughout the body has been studied for the past 100 years. In 1910, biochemist Archibald Hill modeled this property of hemoglobin using the rational function,

where θ is the fraction of binding sites occupied, [L] is the concentration of ligand, n is the Hill coefficient, which represents the degree of cooperativity, and Kd is the dissociation constant. Recall that Kd is equal to the ligand concentration when half of the binding sites are filled.
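As a worked illustration of these definitions, the sketch below uses the half-saturation form of the Hill equation, θ = [L]^n / (Kd^n + [L]^n). That form is an assumption: it is the one consistent with the statement that Kd equals the ligand concentration at half occupancy, but some texts write the denominator as Kd + [L]^n instead. The function name is hypothetical.

```python
def hill_occupancy(ligand, kd, n):
    """Fraction of sites occupied, theta = [L]^n / (Kd^n + [L]^n).

    ligand and kd must be in the same concentration units; n is the
    Hill coefficient.
    """
    return ligand ** n / (kd ** n + ligand ** n)
```

For example, with Kd = 2 the occupancy at [L] = 2 is 0.5 for any n, while at [L] = 1 it is about 0.33 for n = 1 and about 0.13 for n = 2.8, showing how positive cooperativity suppresses binding below Kd and steepens the rise around it.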

A common application of the Hill equation is modeling cooperative enzymes. These enzymes are under allosteric control, that is the binding of a molecule at one site alters the affinity of the enzyme for its substrate and hence regulates the enzyme activity. In this case, the Hill equation is rewritten as the rational function,

where V is the reaction velocity, Vmax is the maximum reaction velocity, and [S] is the substrate concentration. The constant K is analogous to the Michaelis constant (Km) and n is the Hill coefficient indicating the degree of cooperativity.

Hill coefficient    Cooperativity
n = 1               none
n > 1               positive
n < 1               negative

Positive cooperativity occurs when an enzyme has several sites to which a substrate can bind, and the binding of one substrate molecule increases the rate of binding of other substrate molecules. Cooperativity can be recognized by plotting velocity against substrate concentration. For an enzyme that displays positive cooperativity the plot will be sigmoidal (or S-shaped), while noncooperative enzymes display Michaelis-Menten kinetics and their plots are hyperbolic.
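One way to put a number on how sigmoidal a velocity curve is, sketched here under the assumption that the velocity equation has the form V = Vmax [S]^n / (K^n + [S]^n), is to compare the substrate concentrations needed for 10% and 90% of Vmax. The function names are hypothetical.

```python
def hill_velocity(substrate, vmax, k, n):
    """Reaction velocity V = Vmax * [S]^n / (K^n + [S]^n)."""
    return vmax * substrate ** n / (k ** n + substrate ** n)


def substrate_at_fraction(fraction, k, n):
    """Substrate concentration at which V / Vmax equals the given fraction."""
    return k * (fraction / (1.0 - fraction)) ** (1.0 / n)


def cooperativity_index(n, k=1.0):
    """Ratio [S] at 90% Vmax over [S] at 10% Vmax; equals 81 ** (1 / n)."""
    return substrate_at_fraction(0.9, k, n) / substrate_at_fraction(0.1, k, n)
```

A hyperbolic (n = 1) enzyme needs an 81-fold increase in substrate to go from 10% to 90% of Vmax, whereas n = 2 needs only a 9-fold increase, which is the steep middle section of the S-shaped curve.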

Use the Hill equation to answer the following questions:
