Spacer Design for Epitope Assembly

Spacer Design for Epitope Assembly determines the optimal amino acid composition and length of the short spacer sequences that connect neighboring epitopes within a string-of-beads polypeptide, together with the epitope order, so as to maximize the probability that the epitopes are fully recovered after proteasomal cleavage. This is an important step in vaccine design and can have a high impact on the efficacy of the designed vaccine. Epitope Assembly formulates the epitope ordering problem as a traveling salesperson problem, in which the epitopes are the cities to visit and the distances between the cities represent the recovery probabilities.
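The ordering idea can be illustrated with a small brute-force sketch. The epitopes and recovery probabilities below are invented for illustration, and turning a probability p into a distance via -log(p) is an assumption made here so that the shortest path maximizes the product of the junction probabilities. The tool itself may use a different transformation and a dedicated solver.

```python
import itertools
import math

# Hypothetical epitopes and junction recovery probabilities (illustrative values only).
epitopes = ["SIINFEKL", "GILGFVFTL", "NLVPMVATV"]
recovery = {
    ("SIINFEKL", "GILGFVFTL"): 0.9,
    ("GILGFVFTL", "SIINFEKL"): 0.2,
    ("SIINFEKL", "NLVPMVATV"): 0.3,
    ("NLVPMVATV", "SIINFEKL"): 0.5,
    ("GILGFVFTL", "NLVPMVATV"): 0.8,
    ("NLVPMVATV", "GILGFVFTL"): 0.4,
}


def path_cost(order):
    """Sum of -log(recovery probability) over all neighboring epitope pairs."""
    return sum(-math.log(recovery[(a, b)]) for a, b in zip(order, order[1:]))


def best_order(peptides):
    """Try every permutation; only feasible for a handful of epitopes."""
    return min(itertools.permutations(peptides), key=path_cost)


best = best_order(epitopes)
print(" -> ".join(best))
```

Exhaustive search grows factorially with the number of epitopes, so it only serves to make the formulation concrete.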

The configuration steps of Spacer Design are explained in the following:

Step 1: Data Input

Epitope Assembly supports two types of input:

  1. Peptide list from History. Use this option to select a protein FASTA file from the History panel.
  2. Peptide sequence(s). Use this input field to paste your own peptide sequences. Start each peptide sequence on a new line.
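As a rough illustration of the one-peptide-per-line format, the following sketch parses pasted input. Because sequences with non-standard amino acids are not considered in epitope prediction, the sketch also drops them. The function name, the upper-casing of the input and the exact filtering rule are assumptions, not the tool's actual behavior.

```python
# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_peptides(text):
    """Split pasted input into peptides (one per line) and drop invalid ones."""
    peptides = []
    for line in text.splitlines():
        seq = line.strip().upper()  # assumption: input is treated case-insensitively
        # Skip empty lines and sequences with non-standard residues such as X or B.
        if seq and set(seq) <= STANDARD_AMINO_ACIDS:
            peptides.append(seq)
    return peptides


pasted = "SIINFEKL\nGILGXVFTL\n\n nlvpmvatv "
print(parse_peptides(pasted))
```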

Files can be uploaded with the Upload File tool (Figure 1 (1.)) or the jQuery Upload tool (Figure 1 (2.)).

Figure 1. Upload possibilities. 1. Upload tool. 2. jQuery Upload tool.

After specifying the epitope input, you have to choose which proteasomal cleavage and epitope prediction method should be used during optimization. Currently, Spacer Design supports two proteasomal cleavage site prediction methods:

  1. PCM [1] is a position specific scoring matrix derived from degradation experiments of β-casein [2], enolase [3] and prion proteins [4].
  2. ProteaSMM [5] is a linear ridge regression model trained on β-casein [2] and enolase [3] data. It comes in two flavors, representing the constitutive (C) and the immune proteasome.

For epitope prediction, we currently support:

  1. SYFPEITHI [6] uses position-specific scoring matrices (PSSMs) that were designed based on expert knowledge and amino acid occurrences in naturally processed HLA ligands.
  2. BIMAS [7] uses PSSMs derived from experimentally determined binding affinities, measured as dissociation rates of the peptide:HLA:β2-microglobulin complex relative to a reference peptide.
  3. SMM [8] is a linear ridge regression model.
  4. SMMPMBEC [9] is a Bayesian regression model that uses an experimentally derived peptide:MHC binding energy covariance matrix as prior.

In addition to the prediction methods, you have to specify a binding threshold for the chosen epitope prediction method to distinguish immunogenic from non-immunogenic peptides. The appropriate threshold depends on the selected prediction method.
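To make the combination of a matrix-based score and a binding threshold concrete, here is a minimal sketch of additive PSSM scoring. The toy matrix, its values and the function names are invented for illustration. The sketch assumes that higher scores mean stronger binding, which does not hold for every method (some report affinity-like values where lower is better).

```python
# Toy PSSM for 3-mers: one dict per position, mapping amino acid -> score.
# Real matrices cover longer peptides and all 20 amino acids; these values are made up.
TOY_PSSM = [
    {"A": 1.0, "L": 2.0},
    {"A": 0.5, "L": 0.0},
    {"A": 0.0, "L": 3.0},
]


def pssm_score(peptide, pssm, default=0.0):
    """Sum the position-specific scores of each residue in the peptide."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length must match the matrix length")
    return sum(column.get(aa, default) for aa, column in zip(peptide, pssm))


def predicted_binders(peptides, pssm, threshold):
    """Keep peptides whose score reaches the threshold (higher = better here)."""
    return [p for p in peptides if pssm_score(p, pssm) >= threshold]


print(predicted_binders(["LAL", "AAA", "ALA"], TOY_PSSM, threshold=3.0))
```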

Peptide or protein sequences containing non-standard amino acids are not considered in epitope prediction.

Advanced Options

In the Advanced Options you can specify the maximum length of the spacer sequences. Note that the optimal length is determined automatically within the specified boundaries. Furthermore, you can control how much of the maximum obtainable cleavage score should be retained when minimizing neo-immunogenicity by changing the Alpha parameter [0,1].

If you want to minimize non-junction cleavage sites as well, you can enable this option and change the Beta parameter [0,1] accordingly. Beta, similar to Alpha, specifies how closely the minimal obtainable neo-immunogenicity has to be retained in the next optimization step in order to achieve a smaller non-junction cleavage score. For example, a Beta value of 0.99 means that the next solution is only allowed to have a neo-immunogenicity objective that is 1% higher than the score obtained during neo-immunogenicity optimization.
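The effect of Alpha and Beta can be worked through with a small numeric sketch. It assumes positive objective values, that Alpha scales the maximum cleavage score directly, and that Beta allows a relative increase of (1 - Beta) over the minimal neo-immunogenicity, which matches the 0.99 / 1% example above. The function names and the exact form of the constraints are assumptions and may differ from the tool's internal formulation.

```python
import math


def cleavage_lower_bound(max_cleavage_score, alpha):
    """Cleavage score that must be retained while minimizing neo-immunogenicity."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * max_cleavage_score


def neo_immunogenicity_upper_bound(min_neo_immunogenicity, beta):
    """Largest neo-immunogenicity allowed while minimizing non-junction cleavage."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    # Assumption: beta = 0.99 permits a value up to 1% above the minimum.
    return (1.0 + (1.0 - beta)) * min_neo_immunogenicity


# Illustrative numbers only.
print(cleavage_lower_bound(10.0, alpha=0.99))
print(neo_immunogenicity_upper_bound(2.0, beta=0.99))
```

With these illustrative inputs, a maximum cleavage score of 10.0 must stay at or above about 9.9, and a minimal neo-immunogenicity of 2.0 may rise to about 2.02.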

Step 2: Target Population (HLA Alleles)

In the second step you have to specify the target population. This can be selected based on geographic region, pre-defined population, or manually.

  1. Geographic Region / Pre-defined Population. The allele probabilities for the geographic regions and the pre-defined populations have been extracted from dbMHC and can be selected via a drop-down menu.
  2. Manual Input. You can also specify the HLA allele frequencies manually by repeatedly adding new input fields via Add new Series. Please enter HLA frequencies in the range of 0 to 1 and make sure that the locus-wise sum of the frequencies does not exceed 1.
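The two rules for manual input (each frequency between 0 and 1, locus-wise sum at most 1) can be checked before submitting, as in the following sketch. The allele naming scheme ("HLA-A*02:01", with the locus taken as the part before the asterisk), the function name and the example frequencies are assumptions for illustration.

```python
from collections import defaultdict


def validate_frequencies(allele_frequencies, tolerance=1e-9):
    """Check manual HLA frequencies and return the summed frequency per locus."""
    locus_sums = defaultdict(float)
    for allele, freq in allele_frequencies.items():
        if not 0.0 <= freq <= 1.0:
            raise ValueError(f"{allele}: frequency {freq} is outside [0, 1]")
        locus = allele.split("*")[0]  # e.g. "HLA-A*02:01" -> "HLA-A"
        locus_sums[locus] += freq
    for locus, total in locus_sums.items():
        if total > 1.0 + tolerance:
            raise ValueError(f"{locus}: frequencies sum to {total}, which exceeds 1")
    return dict(locus_sums)


# Illustrative frequencies, not taken from dbMHC.
sums = validate_frequencies(
    {"HLA-A*02:01": 0.3, "HLA-A*01:01": 0.2, "HLA-B*07:02": 0.1}
)
print(sums)
```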

The HLA alleles are filtered based on the selected prediction method, the (optionally) entered HLA Allele File, and the HLA allele specified in the Prediction Table.

Step 3: HLA Allele Selection

In this step you have to select the annotated HLA alleles. An HLA tree is generated based on the previous configurations (Figure 2).

Figure 2. Example HLA allele tree for SYFPEITHI.

By checking a higher level of the tree, all HLA alleles on the lower levels are selected as well. If you select HLA-A, for example, predictions will be made for all HLA-A alleles that are supported by the selected prediction methods. If no HLA tree is generated or the HLA allele you are looking for is not listed, please select a different prediction method.
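The selection behavior can be pictured as expanding every checked node into its leaf alleles. The sketch below assumes a simple two-level tree (locus, then allele) with invented contents. The tree generated by the tool depends on the chosen prediction method and may have more levels.

```python
# Hypothetical two-level tree: locus -> supported alleles (illustrative only).
HLA_TREE = {
    "HLA-A": ["HLA-A*01:01", "HLA-A*02:01"],
    "HLA-B": ["HLA-B*07:02"],
}


def expand_selection(selected, tree):
    """Expand checked loci into their alleles; keep checked alleles as they are."""
    known_alleles = {allele for alleles in tree.values() for allele in alleles}
    expanded = []
    for node in selected:
        if node in tree:
            candidates = tree[node]  # a locus selects all alleles below it
        elif node in known_alleles:
            candidates = [node]  # a single allele was checked directly
        else:
            raise ValueError(f"{node} is not part of the tree")
        for allele in candidates:
            if allele not in expanded:
                expanded.append(allele)
    return expanded


print(expand_selection(["HLA-A", "HLA-B*07:02"], HLA_TREE))
```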

Step 4: Result

Two outputs are generated. The first output is an internal representation of the assembly. The second output is an interactive HTML view of the assembly results.

Figure 3. Cleavage and epitope predictions for the optimized string-of-beads construct.

The HTML output summarizes the configuration and shows the cleavage scores for the string-of-beads construct and the epitope predictions for the specified HLA alleles (Figure 3). You can download the cleavage prediction table by clicking Save and selecting either CSV or XLS as output format. Clicking Print expands the table completely so that you can use the browser's print function. To return to the normal view, press ESC.

Additionally, the output provides the finished string-of-beads construct with its optimized spacer sequences, oriented from N- to C-terminus (Figure 3).
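How a construct is put together from an ordered epitope list and the spacers between them can be sketched as follows. The epitopes, spacer sequences and function name are illustrative assumptions, not output of the tool.

```python
def assemble_string_of_beads(epitopes, spacers):
    """Join epitopes from N- to C-terminus with one spacer between each pair."""
    if not epitopes:
        raise ValueError("at least one epitope is required")
    if len(spacers) != len(epitopes) - 1:
        raise ValueError("need exactly one spacer per junction")
    parts = []
    for epitope, spacer in zip(epitopes, spacers):
        parts.extend([epitope, spacer])
    parts.append(epitopes[-1])  # the last epitope has no trailing spacer
    return "".join(parts)


# Hypothetical ordered epitopes and spacers.
construct = assemble_string_of_beads(
    ["SIINFEKL", "GILGFVFTL", "NLVPMVATV"], ["AAY", "HHAA"]
)
print(construct)
```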


  1. Dönnes, P., and Kohlbacher, O. (2005). Integrated modeling of the major events in the MHC class I antigen processing pathway. Protein Sci, 14(8), 2132–2140. doi: 10.1110/ps.051352405
  2. Emmerich, N.P., Nussbaum, A.K., Stevanović, S., Priemer, M., Toes, R.E., Rammensee, H.-G., and Schild, H. (2000). The human 26 S and 20 S proteasomes generate overlapping but different sets of peptide fragments from a model protein substrate. J. Biol. Chem., 275, 21140–21148.
  3. Nussbaum, A.K. (2001). “From the test tube to the World Wide Web.” Ph.D. thesis, Eberhard-Karls-Universität, Tübingen, Germany.
  4. Tenzer, S., Stoltze, L., Schönfisch, B., Dengjel, J., Müller, M., Stevanović, S., Rammensee, H.-G., and Schild, H. (2004). Quantitative analysis of prion-protein degradation by constitutive and immuno-20S proteasomes indicates differences correlated with disease susceptibility. J. Immunol., 172, 1083–1091.
  5. Tenzer, S., Peters, B., Bulik, S., Schoor, O., Lemmel, C., Schatz, M. M., … & Holzhütter, H. G. (2005). Modeling the MHC class I pathway by combining predictions of proteasomal cleavage, TAP transport and MHC class I binding. Cellular and Molecular Life Sciences CMLS, 62(9), 1025–1037.
  6. Rammensee, H., Bachmann, J., Emmerich, N.P., Bachor, O.A., and Stevanović, S. (1999). SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics, 50, 213–219.
  7. Parker, K.C., Bednarek, M.A., and Coligan, J.E. (1994). Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains. J. Immunol., 152, 163–175.
  8. Peters, B., and Sette, A. (2005). Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. BMC Bioinformatics, 6(1), 132.
  9. Kim, Y., Sidney, J., Pinilla, C., Sette, A., and Peters, B. (2009). Derivation of an amino acid similarity matrix for peptide:MHC binding and its application as a Bayesian prior. BMC Bioinformatics, 10(1), 394.
epitope_spacer.txt · Last modified: 2015/06/21 20:31 by schubert