a quantitative matrix based MHC Class I & II Binding Peptide Prediction Server


Contents	Stepwise Help Documentation. Result Display Formats. Source of Matrices used for Prediction. Prediction Algorithm.
Introduction	The server allows identification and prediction of peptides/regions in an antigenic sequence that bind to HLA class I and/or class II alleles. The server identifies experimentally proven binders (available in the MHCBN database) in the query antigen sequence. The prediction of HLA binders (36 class I and 51 class II alleles) in the antigen sequence is based on quantitative matrices. The quantitative matrices for these HLA alleles were obtained from the literature.
Stepwise Help	Enter the protein sequence to run a prediction. The sequence can be loaded into the provided text box by cut and paste, or the user can upload a sequence file by browsing. Then select the format of the sequence. The server accepts both formatted and unformatted sequences. Formatted sequences are accepted in almost every standard format, such as EMBL, PIR, etc. The server uses the ReadSeq program to read the input sequences. Note: all gaps and non-standard characters are ignored, so only the standard protein sequence characters are kept. The sequence is accepted only in the single-letter amino acid code.

Single or multiple alleles can be selected from the list of 36 class I and 51 class II alleles. The user can also select all class I and class II alleles through the "Both" option in the selection box. An example of the selection of multiple alleles from both class I and class II is shown below. The user can select multiple MHC alleles by using the ALT or Meta key.

The server provides the option of identifying and predicting binders in the antigen sequence. The user can perform prediction and identification separately (by selecting either prediction or identification) or together (by selecting the "Both" option).

Identification of experimental binders: The data on HLA binders used for mapping was obtained from MHCBN version 3.1. The database has more than 20,000 MHC binding peptides.

Prediction of MHC binders: The prediction of MHC binders is achieved by implementing quantitative matrices. The matrices for class I alleles are obtained from the ProPred1 server. These matrices are originally derived from the BIMAS server and the literature. The matrices for class II alleles are obtained from the ProPred server. These matrices are originally derived from the virtual matrices of Sturniolo et al., 1999.
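The note above about ignored characters can be sketched as follows. This is only an illustration, not the server's code: the function name `clean_sequence` is hypothetical, converting lower-case input to upper case is an assumption, and the sketch does not parse formatted files (the server delegates that to ReadSeq).

```python
# Hypothetical sketch: keep only the 20 standard one-letter amino acid codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Drop gaps, whitespace, digits and any non-standard residue codes.

    Upper-casing the input first is an assumption of this sketch.
    """
    return "".join(ch for ch in raw.upper() if ch in STANDARD_AMINO_ACIDS)


if __name__ == "__main__":
    # Gap, space, asterisk and newline are all removed.
    print(clean_sequence("ILK-EPV HGV*IR\n"))
```

Under these assumptions, ambiguity codes such as X, B or Z are dropped along with gaps, which shortens the sequence and shifts the positions reported for later residues.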
Parameters for Prediction: Threshold is an important parameter for prediction. During prediction, the user can choose a cutoff threshold, which is defined as the 'percentage of best scoring natural peptides'. For example, a threshold of 1% would predict peptides in any given protein sequence that belong to the 1% best scoring natural peptides. The threshold parameter allows the user to modulate the stringency of the prediction results. A lower threshold (1%) corresponds to a high stringency prediction, i.e. to a lower rate of false positives and a higher rate of false negatives. In contrast, a higher threshold value (low stringency) corresponds to a higher rate of false positives and a lower rate of false negatives. The authors suggest that for a first round of screening, threshold values higher than 3% are not desirable, since the rate of false positives can increase the size of the predicted repertoire to an amount unacceptable for later experimental testing. The value of the threshold can vary from 1% to 10%. All peptides achieving a score greater than the cutoff score at the selected threshold are assigned as binders.

Display: The server displays the results in two formats: i) HTML Mapping and ii) Tabular. These result display formats are described in the Result Display section of this help. Users can limit the number of top peptides shown in the tabular display.
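One way to read the threshold definition is as a percentile over the scores of a background set of natural peptides. The sketch below follows that reading. It is an assumption about how a cutoff score could be derived, not a description of how the server computes it; the function name `cutoff_score` and the toy background scores are hypothetical.

```python
def cutoff_score(background_scores, threshold_percent):
    """Return a cutoff such that roughly the top `threshold_percent` of the
    background scores lie strictly above it (assuming no tied scores).

    The 1-10 range mirrors the range the server offers.
    """
    if not 1 <= threshold_percent <= 10:
        raise ValueError("threshold_percent must be between 1 and 10")
    if not background_scores:
        raise ValueError("background_scores must not be empty")
    ranked = sorted(background_scores, reverse=True)
    n_top = int(len(ranked) * threshold_percent / 100)
    # The score just below the top n_top entries serves as the cutoff.
    return ranked[n_top]


def is_binder(score, cutoff):
    """A peptide is called a binder when its score exceeds the cutoff."""
    return score > cutoff


if __name__ == "__main__":
    background = list(range(1, 101))  # toy scores 1..100, not real data
    for percent in (1, 3, 10):
        print(percent, cutoff_score(background, percent))
```

A lower threshold gives a higher cutoff, so fewer peptides pass, which is the high-stringency behaviour described above.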
Result Display Formats	The results are displayed in the following two formats.

HTML Mapping: This option displays all the predicted binders in the input sequence by coloring the residues. The N-terminal residue of each predicted binder is shown in red and all its other residues in blue. This option is very useful for detecting promiscuous binding regions in the input sequence, i.e. regions that bind to many HLA alleles. It may also help in locating regions that bind to both HLA class I and HLA class II alleles.

Tabular: The server also displays the results in tabular format, which is the most common output format of prediction servers. Each row of the table consists of the following columns. The first column gives the name of the allele. In the case of prediction, the first column also shows the cutoff score at the selected threshold. The second and third columns specify the rank and the starting position of the peptide in the protein sequence, respectively. The fourth column shows the peptide sequence itself. In the case of prediction, the fifth column shows the score of the peptide (the rank shown in the second column is decided on the basis of this score). The last column displays the predicted state (binder or non-binder) on the basis of the peptide score and the selected threshold. An example of the tabular display is shown below.
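The coloring rule of the HTML mapping can be sketched as below. This is a minimal illustration under stated assumptions: start positions are 0-based, all binders are nonamers, red takes precedence where a binder start falls inside another binder, and the `<font>` markup is a placeholder rather than the server's actual HTML. The function names are hypothetical.

```python
def color_map(sequence, binder_starts, frame_length=9):
    """Return one color per residue: "red" for the first residue of a
    predicted binder, "blue" for its remaining residues, None elsewhere."""
    colors = [None] * len(sequence)
    for start in binder_starts:
        end = min(start + frame_length, len(sequence))
        for i in range(start, end):
            colors[i] = "blue"
    # Assumption: a binder start stays red even inside another binder.
    for start in binder_starts:
        colors[start] = "red"
    return colors


def to_html(sequence, colors):
    """Wrap colored residues in placeholder <font> tags."""
    parts = []
    for residue, color in zip(sequence, colors):
        if color is None:
            parts.append(residue)
        else:
            parts.append(f'<font color="{color}">{residue}</font>')
    return "".join(parts)


if __name__ == "__main__":
    seq = "ILKEPVHGVIR"
    print(to_html(seq, color_map(seq, [0, 2])))
```

Stretches where many alleles contribute overlapping colored residues are the candidate promiscuous regions mentioned above.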
HLAPred Data & Matrices	Data for Identification of Experimentally Proven Binders: The server allows searching of the antigen sequence against the MHCBN database version 3.1 (13). MHCBN is a comprehensive database of Major Histocompatibility Complex (MHC) binding and non-binding peptides compiled from published literature and existing databases. The database consists of more than 23000 entries. The HLAPred server searches the query antigen sequence for all the peptides obtained from MHCBN for the selected HLA alleles.

Prediction of HLA Binders: All the quantitative matrices used for prediction are obtained from the ProPred and ProPred1 servers developed by our group. The HLA class I matrices were originally derived from BIMAS and the literature. The matrices for HLA class II were originally obtained from the work of Sturniolo et al., 1999. These matrices are either of multiplication or of addition type. The listing of alleles and their matrix type is provided below, where "allele" is the name of the HLA allele, "type" specifies the type of matrix used in prediction, "++" denotes addition matrices and "××" denotes multiplication matrices.
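The identification step described above amounts to locating known peptides inside the query sequence. The sketch below assumes exact substring matching and reports 1-based start positions. Both are assumptions, as is the function name `find_known_binders`. The peptide records are placeholders for illustration and are not taken from MHCBN.

```python
def find_known_binders(sequence, known_binders):
    """Return (allele, start_position, peptide) for every exact occurrence
    of a known peptide in the sequence. Positions are 1-based."""
    hits = []
    for allele, peptide in known_binders:
        start = sequence.find(peptide)
        while start != -1:
            hits.append((allele, start + 1, peptide))
            # Continue after this hit so repeated occurrences are reported.
            start = sequence.find(peptide, start + 1)
    return hits


if __name__ == "__main__":
    # Placeholder records, not real MHCBN entries.
    known = [
        ("HLA-A*0201", "ILKEPVHGV"),
        ("HLA-DRB1*0101", "KEPVHGVIR"),
        ("HLA-A*0201", "AAAAAAAAA"),
    ]
    for hit in find_known_binders("ILKEPVHGVIR", known):
        print(hit)
```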
HLAPred Algorithm	Prediction: A brief description of the algorithm used in prediction is provided below. The server uses two types of matrices for prediction, addition and multiplication matrices, as shown in the table above.

Overlapping peptide frames are obtained from the input antigenic sequence. Suppose the length of the input sequence is x. Then n peptide frames, each consisting of nine amino acids, are obtained by using this formula:

n = (x - length of each peptide frame) + 1    (I)

For example, the sequence "ILKEPVHGVIR" has x = 11, so n = (11 - 9) + 1 = 3, and the following overlapping peptide frames are obtained:

ILKEPVHGVIR    Sequence
ILKEPVHGV      Frame I
 LKEPVHGVI     Frame II
  KEPVHGVIR    Frame III

Calculation of score by using addition matrices: The initial score for each peptide frame is set to 0.000. A coefficient value from the selected matrix is assigned to each amino acid of the peptide frame, depending on the type of amino acid and its position in the nonamer peptide. The coefficient values are precalculated and stored in tables of size 20 × 9, known as quantitative matrices. The peptide score is obtained by summing the coefficient values. For example, the score of the peptide "ILKEPVHGV" is calculated as follows:

Peptide score = I₁+L₂+K₃+E₄+P₅+V₆+H₇+G₈+V₉

where I₁ is the coefficient value of I at position 1. Any peptide with a score greater than the cutoff score at the selected threshold is a predicted binder.

Calculation of score by using multiplication matrices: The initial score for each peptide is set to 1.000. A coefficient value from the selected matrix is assigned to each amino acid, depending on the amino acid and its position in the nonamer peptide. For example, the score of the peptide "ILKEPVHGV" is calculated as follows:

Score = I₁×L₂×K₃×E₄×P₅×V₆×H₇×G₈×V₉
Final score = Score × final constant
Peptide score = log (Final score)

The final constant is also stored at the bottom of each matrix. Peptides with a score greater than the logarithmic cutoff score at the selected threshold are considered predicted binders.
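The frame generation and the two scoring schemes can be sketched together as follows. This is a minimal illustration, not the server's implementation. The matrix layout (a dict mapping each residue to nine position coefficients), the function names, the toy coefficient values and the use of the natural logarithm are all assumptions; the source does not state the base of the logarithm.

```python
import math


def peptide_frames(sequence, frame_length=9):
    """Overlapping frames; their number is n = (x - frame_length) + 1."""
    n = len(sequence) - frame_length + 1
    return [sequence[i:i + frame_length] for i in range(max(n, 0))]


def additive_score(peptide, matrix):
    """Start at 0.0 and add the coefficient of each residue at its position."""
    score = 0.0
    for position, residue in enumerate(peptide):
        score += matrix[residue][position]
    return score


def multiplicative_score(peptide, matrix, final_constant):
    """Start at 1.0, multiply the coefficients, scale by the final constant,
    then take the logarithm (natural log assumed here)."""
    score = 1.0
    for position, residue in enumerate(peptide):
        score *= matrix[residue][position]
    return math.log(score * final_constant)


if __name__ == "__main__":
    # Toy 20 x 9 matrix: every coefficient is 1.0 except two made-up values.
    toy = {aa: [1.0] * 9 for aa in "ACDEFGHIKLMNPQRSTVWY"}
    toy["L"][1] = 2.0  # L at position 2
    toy["V"][8] = 3.0  # V at position 9
    for frame in peptide_frames("ILKEPVHGVIR"):
        print(frame, additive_score(frame, toy),
              multiplicative_score(frame, toy, final_constant=0.5))
```

Each score would then be compared with the cutoff score for the chosen allele and threshold. A multiplication matrix is assumed to hold only positive coefficients, since the logarithm is undefined otherwise.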