MHC ligand binding prediction

Epitope prediction provides several methods for predicting potential T-cell epitopes. Sequences can be retrieved from the protein databases NCBI RefSeq [1] and UniProt [2], or you can enter your own sequences for prediction. Methods and alleles are available for both MHC class I and MHC class II epitope prediction.

The configuration steps of Epitope prediction are explained below.

Step 1: Data Input

The first step is to specify the sequence(s) for which predictions should be performed. The following input types can be selected via a drop-down menu:

  1. RefSeq. This input field provides access to sequences of the NCBI RefSeq database [1]. Use RefSeq accession IDs as search keys, separated by blanks.
  2. UniProt. This input field provides access to sequences of the UniProtKB database [2]. Use primary accession numbers as search keys, separated by blanks.
  3. Peptide sequence(s). This input field lets you paste your own peptide sequences. Start each peptide sequence on a new line.
  4. Protein Fasta from History. Use this option to select a protein FASTA file from the History panel. Files can be uploaded with the (1.) Upload File tool or the (2.) jQuery Upload tool.
  5. Peptide List from History. Use this option to select a peptide list file from the History panel. The file should contain short peptide sequences in the range of 8-16 AA, one sequence per line. Files can be uploaded with the (1.) Upload File tool or the (2.) jQuery Upload tool.

Figure 1. Upload possibilities. 1. Upload tool. 2. jQuery Upload tool.

Additionally, you can specify an HLA Allele file from History. The Allele file contains HLA alleles in the new nomenclature at up to 4-digit resolution. Alleles specified in this way are used for predictions if the selected prediction model supports them.
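To check an Allele file before uploading it, a small script along the following lines may help. It is only a sketch: the exact file layout is not described here, so one allele per line is assumed, and the pattern (optional "HLA-" prefix, locus, "*", and one or two colon-separated numeric fields, e.g. A*02:01) is an assumption about what "new nomenclature up to 4 digits" looks like in practice. The names ALLELE_RE and parse_allele_lines are hypothetical.

```python
import re

# Assumed pattern: optional "HLA-" prefix, locus name, "*", and one or two
# colon-separated numeric fields (e.g. A*02 or HLA-DRB1*15:01).
ALLELE_RE = re.compile(r"^(?:HLA-)?[A-Z]+[0-9]*\*\d{2,3}(?::\d{2,3})?$")


def parse_allele_lines(lines):
    """Split lines into well-formed allele names and rejected entries."""
    valid, rejected = [], []
    for raw in lines:
        name = raw.strip()
        if not name:
            continue  # ignore blank lines
        if ALLELE_RE.match(name):
            valid.append(name)
        else:
            rejected.append(name)
    return valid, rejected
```

Entries in the old notation (such as A0201) or with more than two fields end up in the rejected list and can be corrected before the file is used.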

Depending on the input format, you can also specify the required length of the epitopes [8-16 AA]. The available prediction methods are filtered according to the selected length.
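For protein input, the selected length determines which candidate peptides are scored. How the tool generates them internally is not documented here; the sketch below assumes the common approach of taking every contiguous window of the chosen length. The function name generate_peptides is hypothetical.

```python
def generate_peptides(protein, length):
    """Return every contiguous window of the given length, in sequence order."""
    if not 8 <= length <= 16:
        raise ValueError("length must be between 8 and 16 AA")
    # A protein shorter than the requested length yields an empty list.
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]
```

Under this assumption a protein of n residues yields n - length + 1 candidate peptides, so longer epitope lengths produce slightly fewer candidates.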

Peptide or protein sequences containing non-standard amino acids are not considered in epitope prediction.
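A peptide list can be pre-checked against these constraints before upload. The following is a minimal sketch, assuming that "standard amino acids" means the 20 canonical one-letter codes in upper case and that the 8-16 AA range applies to each line; the names STANDARD_AA and filter_peptides are hypothetical, and the tool's own checks may differ.

```python
# The 20 canonical amino acids in one-letter code (assumed definition of "standard").
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def filter_peptides(lines, min_len=8, max_len=16):
    """Separate peptides that satisfy the length and alphabet constraints from the rest."""
    kept, skipped = [], []
    for raw in lines:
        peptide = raw.strip()
        if not peptide:
            continue  # ignore blank lines
        if min_len <= len(peptide) <= max_len and set(peptide) <= STANDARD_AA:
            kept.append(peptide)
        else:
            skipped.append(peptide)
    return kept, skipped
```

The skipped list shows which entries would likely be ignored, for example peptides containing X or B, or peptides outside the supported length range.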

Step 2: Prediction Methods

In the second step the prediction methods to be used are selected. Multiple methods can be used at the same time, but at least one prediction method has to be selected. The following prediction methods are available:

  1. SYFPEITHI [3] uses position-specific scoring matrices (PSSMs) that were designed based on expert knowledge and amino acid occurrences in naturally processed HLA ligands.
  2. BIMAS [4] uses PSSMs derived from experimentally determined binding affinities measured as dissociation rates of the peptide:HLA:β2-microglobulin complex relative to a reference peptide.
  3. SVMHC [5] is an SVM-based classification method that was trained on experimentally validated epitopes from the SYFPEITHI database and randomly generated peptides.
  4. NetMHC family [6-9] comprises NetMHC, NetMHCpan, NetMHCII, and NetMHCIIpan, which are all artificial neural network-based regression methods. Furthermore, NetMHCpan and NetMHCIIpan incorporate structural information of the HLA-binding pockets to allow prediction for HLA alleles with insufficient data.
  5. UniTope [10] is an SVM-based prediction method that also combines structural information of the HLA binding groove with epitope sequences. In contrast to NetMHC(II)pan, the peptides are encoded using physicochemical properties.
  6. TEPITOPEpan [11] uses PSSMs for epitope prediction and is based on the virtual binding pocket approach of Sturniolo et al. To allow predictions for alleles that were not originally covered by Sturniolo et al., TEPITOPEpan uses a phylogeny-based weighting approach to reconstruct the allele-specific PSSM from the original matrices.
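Several of the methods above rely on PSSMs. The general idea can be illustrated with a short sketch: each peptide position contributes a score that depends on the amino acid found there, and the contributions are combined into one value. Additive scoring is assumed here; individual methods may combine position scores differently. The function pssm_score and the matrix TOY_MATRIX are hypothetical, and the matrix values are made up for illustration, not taken from any of the listed methods.

```python
def pssm_score(peptide, matrix, default=0.0):
    """Sum the per-position scores of a peptide under a PSSM.

    matrix holds one dict per peptide position, mapping an amino acid
    letter to its score at that position; unlisted residues score `default`.
    """
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match matrix length")
    return sum(position.get(aa, default) for position, aa in zip(matrix, peptide))


# Toy 9-mer matrix with made-up values (NOT from any real prediction method).
TOY_MATRIX = [{} for _ in range(9)]
TOY_MATRIX[1] = {"L": 10.0, "M": 6.0}  # second position
TOY_MATRIX[8] = {"V": 10.0, "L": 6.0}  # ninth position

score = pssm_score("ALAKAAAAV", TOY_MATRIX)
```

In this toy example only the second and ninth positions carry non-zero scores, so a peptide is ranked highly when it has favorable residues at those two positions.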

Step 3: HLA selection

In this step the alleles for which predictions should be performed are selected. A tree is generated based on the alleles supported by the previously selected prediction methods (Figure 2). If multiple prediction methods were selected, only the HLA alleles shared by all of them are displayed. If an HLA Allele file was specified, the supported alleles are filtered to those contained in the Allele file.

Figure 2. HLA allele tree for SYFPEITHI.

Checking a higher level of the tree also selects all HLA alleles on the levels below it. If no HLA tree is generated or the HLA allele you need is not listed, please click back and select a different prediction method.
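The allele filtering described in this step amounts to a set intersection, which also explains why the tree can come out empty. The sketch below is an illustration only, not the tool's actual implementation; the function selectable_alleles and the method names and allele lists in the usage example are hypothetical and do not reflect the real coverage of any prediction method.

```python
def selectable_alleles(supported_by_method, selected_methods, allele_file=None):
    """Alleles supported by all selected methods, optionally restricted to an Allele file."""
    if not selected_methods:
        raise ValueError("at least one prediction method has to be selected")
    # Keep only alleles that every selected method supports.
    shared = set.intersection(
        *(set(supported_by_method[method]) for method in selected_methods)
    )
    # If an Allele file was given, keep only the alleles it contains.
    if allele_file is not None:
        shared &= set(allele_file)
    return sorted(shared)


# Made-up support table for two hypothetical methods.
supported = {
    "method_a": ["A*01:01", "A*02:01", "B*07:02"],
    "method_b": ["A*02:01", "B*07:02", "B*08:01"],
}
shared = selectable_alleles(supported, ["method_a", "method_b"])
```

An empty result corresponds to the case where no tree is generated: the selected methods have no allele in common, or none of the shared alleles appears in the Allele file.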

Step 4: Results

Two outputs are generated. The first output is an internal representation of the predictions that can be used directly as input for Epitope Selection. The second output is a detailed, interactive HTML report of the prediction results.

Figure 3. Example result page.

The results are presented in a sortable and searchable table. Each row represents one prediction result for one epitope and one prediction method. The results can be exported in either CSV or Excel format by clicking Save and selecting the desired format. Clicking Print expands the table completely so that it can be printed with the browser's print function. To return to the normal view, press ESC.


  1. Pruitt K.D., Tatusova T., Maglott D.R. (2007) NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res 35:D61–D65.
  2. The UniProt Consortium (2007) The Universal Protein Resource (UniProt). Nucleic Acids Res 35:D193–D197.
  3. Rammensee H., Bachmann J., Emmerich N.P., Bachor O.A., Stevanovic S. (1999) SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 50:213–219.
  4. Parker K.C., Bednarek M.A., Coligan J.E. (1994) Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains. J Immunol 152:163–175.
  5. Dönnes P., Kohlbacher O. (2006) SVMHC: a server for prediction of MHC-binding peptides. Nucleic Acids Res 34:W194–W197.
  6. Lundegaard C, Lamberth K, Harndahl M, Buus S, Lund O, Nielsen M. (2008) NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8-11. Nucleic Acids Res. 1;36(Web Server issue):W509-12.
  7. Nielsen M, Lundegaard C, Blicher T, Lamberth K, Harndahl M, et al. (2007) NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any HLA-A and -B Locus Protein of Known Sequence. PLoS ONE 2(8): e796. doi: 10.1371/journal.pone.0000796
  8. Nielsen, M. and Lund, O. (2009) NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction. BMC bioinformatics, 10, 296.
  9. Karosiene E, Rasmussen M, Blicher T, Lund O, Buus S, and Nielsen M. (2013) NetMHCIIpan-3.0, a common pan-specific MHC class II prediction method including all three human MHC class II isotypes, HLA-DR, HLA-DP and HLA-DQ. Immunogenetics.
  10. Toussaint N. C, Feldhahn M, Ziehm M, Stevanovic M, and Kohlbacher O. (2011) T-cell epitope prediction based on self-tolerance. Proc. ICIW.
  11. Zhang L, Chen Y, Wong H-S, Zhou S, Mamitsuka H, et al. (2012) TEPITOPEpan: Extending TEPITOPE for Peptide Binding Prediction Covering over 700 HLA-DR Molecules. PLoS ONE 7(2): e30483. doi: 10.1371/journal.pone.0030483
epitope_prediction.txt · Last modified: 2015/07/08 11:10 by schubert