Epitope Selection

Epitope selection is the most important step in vaccine design. The goal is to select a small set of candidate epitopes that maximizes the probability of inducing a strong, long-lasting immune response. Epitope Selection is an interface to OptiTope, a highly flexible mathematical framework for epitope selection [1,2]. OptiTope determines the provably optimal epitope set that maximizes the overall immunogenicity for a target population or a single person while satisfying user-specified requirements.

The overall immunogenicity of an epitope set is defined as the sum of the immunogenicities of its epitopes, each weighted by the HLA allele frequencies of the target population. This additive model is a common simplifying assumption, made because the interplay of different epitopes within a vaccine is still poorly understood.
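The weighted sum described above can be sketched in a few lines of Python. This is only an illustration of the definition, not OptiTope's actual code: the function name, the dictionary layout, the peptide names and all scores and frequencies are made up for the example.

```python
def overall_immunogenicity(epitopes, immunogenicity, allele_freq):
    """Sum of per-epitope immunogenicities, weighted by allele frequency.

    immunogenicity maps (epitope, allele) -> score (hypothetical layout);
    allele_freq maps allele -> frequency in the target population.
    Missing (epitope, allele) pairs count as a score of 0.
    """
    return sum(
        allele_freq[allele] * immunogenicity.get((epitope, allele), 0.0)
        for epitope in epitopes
        for allele in allele_freq
    )


# Made-up scores and frequencies, for illustration only.
scores = {
    ("PEPTIDE_1", "A*02:01"): 0.8,
    ("PEPTIDE_1", "B*07:02"): 0.1,
    ("PEPTIDE_2", "B*07:02"): 0.6,
}
freqs = {"A*02:01": 0.25, "B*07:02": 0.5}

total = overall_immunogenicity(["PEPTIDE_1", "PEPTIDE_2"], scores, freqs)
print(f"overall immunogenicity: {total:.2f}")
```

Because the model is additive, each epitope contributes independently of the others in the set.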

Step 1: Data Input

In the first step you can specify the target antigens from which the epitopes should be selected. There are three input types:

  1. Peptide List. You can specify a list of candidate epitopes either by direct input (one per line) or by selecting a Peptide List file from the History panel.
  2. Multiple Sequence Alignment. You can also enter a list of multiple sequence alignments, either directly or via the History panel. For more details on the specific format follow the link.
  3. Prediction Table. You can also directly input complete prediction results arranged in a table, either via the History panel or the input field. The format is generated by Epitope Prediction and Polymorphic Epitope Prediction. For more details on the specific format follow the link.

You can upload files via the Upload Tool (Figure 1 (1)) or the jQuery Upload tool (Figure 1 (2)).

Figure 1. Upload possibilities. 1. Upload tool. 2. jQuery Upload tool.

Additionally, you can specify an HLA Allele file from History. The Allele file has to contain HLA alleles in the new nomenclature at a detail level of up to 4 digits. The alleles specified in this file are used for predictions if the selected prediction model supports them. HLA Genotyping generates these files automatically based on sequencing data.
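If you prepare an allele list by hand, a small check like the following can catch names that are not in the new colon-separated nomenclature. It is a sketch under assumptions: the exact file layout expected by the tool is not described here, so the function only normalizes individual names, and the regular expression and the function name `to_four_digit` are our own.

```python
import re

# Matches names such as "A*02:01", "HLA-B*07:02" or "DRB1*15:01:01".
# Old-style names such as "A0201" are deliberately rejected.
ALLELE_RE = re.compile(
    r"^(?:HLA-)?([A-Z]+\d*)\*(\d{2,3}):(\d{2,3})(?::\d{2,3})*[A-Z]?$"
)


def to_four_digit(name):
    """Return the allele truncated to 4-digit resolution, or None if invalid."""
    match = ALLELE_RE.match(name.strip())
    if match is None:
        return None
    locus, group, protein = match.groups()
    return f"{locus}*{group}:{protein}"


for raw in ["HLA-A*02:01", "DRB1*15:01:01", "A0201"]:
    print(raw, "->", to_four_digit(raw))
```

Whether the tool expects the `HLA-` prefix is not stated on this page; the sketch drops it purely for the sake of the example.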

If you are using Multiple Sequence Alignments as input, consensus sequences for the specified antigens are generated as well as conservation scores for each epitope based on the alignments.
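To make the idea of consensus sequences and conservation scores concrete, here is a minimal sketch. It assumes a simple definition that this page does not spell out: the consensus is the most frequent character per alignment column, and the conservation of an epitope is the fraction of aligned sequences that match the consensus exactly over the epitope's columns. Gap handling is ignored, and the toy alignment is made up.

```python
from collections import Counter


def consensus(alignment):
    """Most frequent character in each alignment column."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*alignment)
    )


def conservation(alignment, start, length):
    """Consensus window and the fraction of sequences matching it exactly."""
    window = consensus(alignment)[start:start + length]
    hits = sum(seq[start:start + length] == window for seq in alignment)
    return window, hits / len(alignment)


# Toy alignment of four equally long sequences (illustrative only).
alignment = [
    "MKTLLVAGA",
    "MKTLLVAGA",
    "MKSLLVAGA",
    "MKTLLIAGA",
]

print(consensus(alignment))
print(conservation(alignment, 0, 4))
print(conservation(alignment, 6, 3))
```

In this toy case the window starting at position 0 is found in three of four sequences, while the window at position 6 is fully conserved.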

Depending on the input type, you can specify a prediction method and an epitope length. Epitope Selection supports the following prediction tools:

  1. SYFPEITHI [3] uses position-specific scoring matrices (PSSMs) that were designed based on expert knowledge and amino acid occurrences in naturally processed HLA ligands.
  2. BIMAS [4] uses PSSMs derived from experimentally determined binding affinities measured as dissociation rates of the peptide:HLA:β2-microglobulin complex relative to a reference peptide.
  3. SVMHC [5] is an SVM-based classification method that was trained on experimentally validated epitopes from the SYFPEITHI database and randomly generated peptides.
  4. NetMHC family [6-9] comprises NetMHC, NetMHCpan, NetMHCII, and NetMHCIIpan, all of which are artificial neural network based regression methods. Furthermore, NetMHCpan and NetMHCIIpan incorporate structural information of the HLA-binding pockets to allow prediction for HLA alleles with insufficient data.
  5. TEPITOPEpan [10] uses PSSMs for epitope prediction and is based on the virtual binding pocket approach of Sturniolo et al. To allow predictions for alleles that were not originally covered by Sturniolo et al., TEPITOPEpan uses a phylogeny-based weighting approach to reconstruct the allele-specific PSSM from the original matrices.

Step 2: Target Population (HLA Alleles)

In the second step you have to specify the target population. This can be selected based on geographic region, pre-defined population, or manually.

  1. Geographic Region / Pre-defined Population. The allele probabilities for the geographic regions as well as the pre-defined populations have been extracted from dbMHC and can be selected via a drop-down menu.
  2. Manual Input. You can also manually specify the HLA allele frequencies by repeatedly adding new input fields via Add new Series. Please enter HLA frequencies in the range of 0 to 1 and make sure that the locus-wise sum of the frequencies does not exceed 1.
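The two rules for manual input (each frequency between 0 and 1, and a locus-wise sum of at most 1) can be checked before you enter the values. The following is a minimal sketch; the function name is hypothetical, and it assumes allele names of the form `LOCUS*group:protein`, so that the locus is the part before the asterisk.

```python
from collections import defaultdict


def check_frequencies(freqs, tol=1e-9):
    """Return a list of problems found in an allele -> frequency mapping."""
    problems = []
    per_locus = defaultdict(float)
    for allele, freq in freqs.items():
        if not 0.0 <= freq <= 1.0:
            problems.append(f"{allele}: frequency {freq} is outside [0, 1]")
        locus = allele.split("*")[0]  # e.g. "A" for "A*02:01"
        per_locus[locus] += freq
    for locus, total in sorted(per_locus.items()):
        if total > 1.0 + tol:  # small tolerance for rounding errors
            problems.append(f"locus {locus}: frequencies sum to {total:.3f} > 1")
    return problems


# Illustrative values only.
print(check_frequencies({"A*02:01": 0.3, "A*01:01": 0.2, "B*07:02": 0.4}))
print(check_frequencies({"A*02:01": 0.7, "A*01:01": 0.5}))
```

An empty list means that both rules are satisfied.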

The HLA alleles are filtered based on the selected prediction method, the (optionally) entered HLA Allele File, and the HLA alleles specified in the Prediction Table.

Step 3: HLA Allele Selection

In this step you have to select the HLA alleles for which predictions should be made. An HLA-Tree is generated based on the previous configurations (Figure 2).

Figure 2: Example HLA allele tree for SYFPEITHI.

By checking higher levels of the tree all HLA alleles of the lower levels are selected as well. If you select HLA-A for example, prediction will be made for all HLA-A alleles that are supported by the selected prediction methods. If no HLA-Tree is generated or your favorite HLA allele is nowhere to be found, please select a different prediction method.
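The behaviour of the tree can be pictured as follows: checking a node selects every allele below it. The nested dictionary used here (locus, then allele group, then alleles) is a hypothetical stand-in for the tree shown in the interface, and the function names are our own.

```python
def leaves(node):
    """All alleles below a node of the (hypothetical) nested tree."""
    if isinstance(node, dict):
        return [allele for child in node.values() for allele in leaves(child)]
    return list(node)


def expand_selection(tree, checked):
    """Alleles selected when the nodes named in `checked` are ticked."""
    selected = []
    for name, child in tree.items():
        if name in checked:
            # A checked node selects everything below it.
            selected.extend(leaves(child))
        elif isinstance(child, dict):
            selected.extend(expand_selection(child, checked))
        else:
            selected.extend(allele for allele in child if allele in checked)
    return selected


# Illustrative tree; the real one depends on the chosen prediction method.
tree = {
    "HLA-A": {"A*01": ["A*01:01"], "A*02": ["A*02:01", "A*02:03"]},
    "HLA-B": {"B*07": ["B*07:02"]},
}

print(expand_selection(tree, {"HLA-A"}))
print(expand_selection(tree, {"A*02", "B*07:02"}))
```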

Step 4: Constraints

In the fourth step you can customize the vaccine selection strategy to your needs by choosing constraints from a pre-defined list. Four different constraints are configurable:

  • Maximum number of epitopes to select. Enter the maximum number of epitopes you want to select. (This is the only obligatory constraint.)
  • Minimum epitope conservation. Only epitopes that fulfill this conservation requirement will be considered. (Only usable if multiple sequence alignments were chosen as input. Otherwise, conservation for all epitopes is set to 100%.)
  • Minimum proportion of alleles to cover. Epitope Selection will select a set of epitopes which is immunogenic with respect to the specified proportion of alleles or more.
  • Minimum proportion of antigens to cover. The optimal epitope set will include epitopes to cover the specified proportion of antigens or more.

Additionally, you can specify an immunogenicity threshold, meaning a minimum immunogenicity (score) required for a peptide to be considered immunogenic with respect to a specific allele.
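The following sketch illustrates how the constraints interact on a toy problem. Note that it uses exhaustive search over all small subsets, which is only practical for a handful of candidates and is not the optimization method behind OptiTope [1,2]. Only the maximum number of epitopes, the allele coverage constraint and the immunogenicity threshold are modelled; all names, scores and frequencies are made up.

```python
from itertools import combinations


def select_epitopes(scores, freqs, max_epitopes, threshold, min_allele_cov=0.0):
    """Brute-force search for the best epitope set under simple constraints.

    scores: {epitope: {allele: immunogenicity}} (hypothetical layout)
    freqs:  {allele: frequency in the target population}
    Returns (best_set, best_value), or (None, -inf) if nothing is feasible.
    """
    alleles = list(freqs)
    best_set, best_value = None, float("-inf")
    for k in range(1, max_epitopes + 1):
        for subset in combinations(sorted(scores), k):
            # An allele is covered if some epitope reaches the threshold.
            covered = {
                a for e in subset for a in alleles
                if scores[e].get(a, 0.0) >= threshold
            }
            if len(covered) / len(alleles) < min_allele_cov:
                continue  # violates the allele coverage constraint
            value = sum(
                freqs[a] * scores[e].get(a, 0.0) for e in subset for a in alleles
            )
            if value > best_value:
                best_set, best_value = subset, value
    return best_set, best_value


scores = {
    "PEP1": {"A*02:01": 0.9},
    "PEP2": {"A*02:01": 0.7, "B*07:02": 0.2},
    "PEP3": {"B*07:02": 0.5},
}
freqs = {"A*02:01": 0.5, "B*07:02": 0.25}

print(select_epitopes(scores, freqs, max_epitopes=2, threshold=0.4))
print(select_epitopes(scores, freqs, 2, 0.4, min_allele_cov=1.0))
print(select_epitopes(scores, freqs, 1, 0.4, min_allele_cov=1.0))
```

Without a coverage requirement the two highest-scoring peptides win. Requiring all alleles to be covered forces PEP3 into the set, and combining that requirement with a single-epitope limit leaves no feasible set at all.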

Step 5: Results

Two outputs are generated. The first output is an internal representation of the predictions that can be directly used as input for Epitope Assembly. The second output is a detailed and interactive HTML output of the prediction results.

It summarizes the input as well as the vaccine design configuration.

If the optimization problem is feasible, a table containing the optimal set of epitopes will be displayed. For every epitope in the set, the table lists its fraction of the overall immunogenicity, its conservation, the MHC alleles it covers, the allele-wise immunogenicity contribution, and, if antigen information is given, the corresponding antigens.

Figure 3: Result page for a prediction table input.

Information on the number of selected epitopes, covered alleles, and covered antigens is given above the result table. Additionally, the locus coverage and the population coverage are displayed.

  • Covered allele: An MHC allele is considered to be covered by an epitope set if at least one of the epitopes is sufficiently immunogenic with respect to the allele.
  • Covered antigen: An antigen is considered to be covered by an epitope set if at least one of the epitopes is derived from this antigen.
  • Locus coverage: If locus A has a coverage of 75%, the probability of an individual from the target population carrying a covered allele at locus A is 75%.
  • Population coverage: A population coverage of 80% corresponds to a probability of 80% for an individual from the target population to carry at least one of the covered alleles.
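The two coverage measures can be illustrated with a short calculation. How the tool computes them internally is not described on this page, so the sketch below rests on assumptions of our own: every individual carries two alleles per locus, drawn independently according to the population frequencies, and the loci are independent of each other. All frequencies are made up.

```python
from collections import defaultdict


def locus_coverage(covered, freqs):
    """Probability of carrying at least one covered allele, per locus.

    Assumes two independently drawn alleles per locus (our assumption).
    """
    covered_freq = defaultdict(float)
    for allele, freq in freqs.items():
        locus = allele.split("*")[0]
        covered_freq[locus] += freq if allele in covered else 0.0
    return {locus: 1.0 - (1.0 - f) ** 2 for locus, f in covered_freq.items()}


def population_coverage(covered, freqs):
    """Probability of carrying a covered allele at any locus.

    Assumes independence between loci (our assumption).
    """
    uncovered = 1.0
    for coverage in locus_coverage(covered, freqs).values():
        uncovered *= 1.0 - coverage
    return 1.0 - uncovered


# Illustrative frequencies only.
freqs = {"A*02:01": 0.3, "A*01:01": 0.2, "A*03:01": 0.1,
         "B*07:02": 0.5, "B*08:01": 0.2}
covered = {"A*02:01", "A*01:01", "B*07:02"}

print(locus_coverage(covered, freqs))
print(population_coverage(covered, freqs))
```

Under these assumptions, covering alleles with a combined frequency of 0.5 at locus A gives a locus coverage of 75%, because the chance that both of an individual's alleles are uncovered is 0.5 × 0.5.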

In order to download the complete summary click the link below the result table.

If Epitope Selection fails for your specified design, the constraints have led to an infeasible problem, meaning that no epitope set can fulfill all specified requirements. Please retry Epitope Selection with relaxed requirements by increasing the maximum number of epitopes to select, by reducing the coverage constraints, or by dropping optional constraints completely.


  1. Toussaint N.C., Doennes P., Kohlbacher O. (2008) A Mathematical Framework for the Selection of an Optimal Set of Peptides for Epitope-Based Vaccines. PLoS Comp Biol 4:e1000246
  2. Toussaint N. C. and Kohlbacher O. (2009) OptiTope - a web server for the selection of an optimal set of peptides for epitope-based vaccines. Nucleic Acids Research, 37:W617-22. doi:10.1093/nar/gkp293
  3. Rammensee H., Bachmann J., Emmerich N.P., Bachor O.A., Stevanovic S. (1999) SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 50:213–219.
  4. Parker K.C., Bednarek M.A., Coligan J.E. (1994) Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains. J Immunol 152:163–175.
  5. Dönnes P., Kohlbacher O. (2006) SVMHC: a server for prediction of MHC-binding peptides. Nucleic Acids Res 34:W194–W197.
  6. Lundegaard C, Lamberth K, Harndahl M, Buus S, Lund O, Nielsen M. (2008) NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8-11. Nucleic Acids Res. 1;36(Web Server issue):W509-12.
  7. Nielsen M, Lundegaard C, Blicher T, Lamberth K, Harndahl M, et al. (2007) NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any HLA-A and -B Locus Protein of Known Sequence. PLoS ONE 2(8): e796. doi: 10.1371/journal.pone.0000796
  8. Nielsen, M. and Lund, O. (2009) NN-align. An artificial neural network-based alignment algorithm for MHC class II peptide binding prediction. BMC bioinformatics, 10, 296.
  9. Karosiene E, Rasmussen M, Blicher T, Lund O, Buus S, and Nielsen M. (2013) NetMHCIIpan-3.0, a common pan-specific MHC class II prediction method including all three human MHC class II isotypes, HLA-DR, HLA-DP and HLA-DQ. Immunogenetics.
  10. Zhang L, Chen Y, Wong H-S, Zhou S, Mamitsuka H, et al. (2012) TEPITOPEpan: Extending TEPITOPE for Peptide Binding Prediction Covering over 700 HLA-DR Molecules. PLoS ONE 7(2): e30483. doi: 10.1371/journal.pone.0030483
epitope_selection.txt · Last modified: 2014/12/18 16:03 by schubert