
Development of a novel multi‑epitope vaccine against the pathogenic human polyomavirus V6/7 using reverse vaccinology

Abstract

Background

Human polyomaviruses (HPyVs) contribute to human oncogenesis through persistent infections, but there is currently no effective preventive measure against the malignancies caused by these viruses. Therefore, the development of a safe and effective vaccine against HPyVs is a high priority.

Methods

First, the proteomes of two polyomavirus species (HPyV6 and HPyV7) were downloaded from the NCBI database for the selection of target proteins. Epitope identification focused on proteins that were essential, associated with virulence, present on the surface, antigenic, non-toxic, and non-homologous to the human proteome. Immunoinformatic methods were then used to identify cytotoxic T-lymphocyte (CTL), helper T-lymphocyte (HTL), and linear B-lymphocyte (LBL) epitopes from the target antigens for use in an epitope-based vaccine. The physicochemical features of the designed vaccine were predicted using several online servers. The binding pattern and stability between the vaccine candidate and Toll-like receptors were analyzed through molecular docking and molecular dynamics (MD) simulation, while the immunogenicity of the designed vaccine was assessed using immune simulation.

Results

Online tools were used to predict the best epitopes from the immunogenic targets, namely the LTAg, VP1, and VP2 antigens of HPyV6 and HPyV7. A multi-epitope vaccine was developed by combining 10 CTL, 7 HTL, and 6 LBL epitopes with suitable linkers and an adjuvant. The vaccine's predicted coverage was 98.35% of the world's population. The 3D model of the vaccine structure showed that the majority of residues (87.7%) were located in favored regions of the Ramachandran plot. Molecular docking and MD simulation indicated that the constructed vaccine binds strongly (-1414.0 kcal/mol) to the host's TLR4. Moreover, the vaccine-TLR complexes remained stable under simulated conditions mimicking the natural environment. The immune simulation results indicated that the vaccine design could elicit robust immune responses in the host.

Conclusion

The multi-parametric analysis revealed that the designed vaccine is capable of inducing sustained immunity against the selected polyomaviruses, although further in-vivo investigations are needed to verify its effectiveness.


Background

Polyomaviruses (PyVs) are small, non-enveloped DNA viruses that are widespread in nature. They infect mammals and birds and carry small circular double-stranded DNA genomes of approximately 5.0 kbp [1, 2]. PyVs encode two major regulatory proteins, the large tumor antigen (LT-ag) and the small tumor antigen (sT-ag), as well as several structural proteins (VP1 and VP2) [3]. Studies in heterologous animal models indicate that PyVs may carry strong oncogenes that contribute to cancer in humans. Regulatory proteins are important in viral replication and transcription early in the infection cycle, while structural proteins participate in capsid formation later [4].

Among 12 identified human polyomaviruses (HPyVs), six strains are involved in human diseases, especially in different human cancers. HPyV6, HPyV7, Merkel cell polyomavirus (MCPyV), trichodysplasia spinulosa virus (TSPyV), HPyV9, MWPyV, BK virus (BKV), and JC virus (JCV), as well as newly identified viruses such as KI (KIPyV) and WU (WUPyV), are the most commonly identified HPyVs [5,6,7].

The World Health Organization has reported an increased incidence of skin cancer over the past few decades, with about 8,500 new cases of skin cancer reported daily in the United States. The relationship of HPyV6 and HPyV7 with human skin cancer has been elucidated in the past decade [7, 8]. Several studies have confirmed the presence of HPyV6 and HPyV7 DNA in cutaneous (Table 1) and non-cutaneous (Table 2) malignancies, including malignant melanoma (MM) [9], non-melanoma skin cancer tissues, basal cell carcinoma (BCC), and squamous cell carcinoma (SCC). Studies on keratoacanthoma and trichoblastoma revealed the presence of HPyV6 in tumors [10]. HPyV7 DNA was detected more frequently than HPyV6 DNA in non-cutaneous cancers, whereas HPyV6 DNA was more commonly observed in skin malignancies. One study concluded that all age groups and genders are infected with HPyV6 and HPyV7, and that 52–93% of humans are seropositive for HPyV6 while 33–84% are seropositive for HPyV7 [7]. PCR investigations detected HPyV6 and HPyV7 DNA in the skin of both healthy individuals and those with different types of skin tumors. Polyomaviruses encode small and large T antigens, which makes them potentially oncogenic [11]. Severe HPyV6 and HPyV7 infections are associated with skin disorders characterized by a significant increase in viral load, the presence of dyskeratotic keratinocytes, and encapsidated virions observed through electron microscopy and sequencing [12]. Moreover, HPyV6 was identified in various forms of epithelial neoplasms, while HPyV7 was observed in thymic epithelial tumors. These findings suggest that HPyV6 and HPyV7 could be crucial contributors to the development of inflammatory skin conditions and may also possess oncogenic properties [13].

Table 1 HPyV-6 and HPyV-7 seroprevalence in primary cutaneous malignancies
Table 2 HPyV6 and HPyV7 prevalence in non-cutaneous human malignancies

It has been shown that HPyV6 and HPyV7 bind and inactivate p53, resulting in tumor progression. p53 suppresses tumor growth by regulating gene expression in response to stressors such as DNA damage, leading to apoptosis and cell cycle arrest [20]. Interaction with the LT antigen represses the transactivation domain of p53 and prevents its binding to DNA, which can eventually lead to cancer in humans [21, 22].

The lack of a vaccine against HPyVs might be related to the complexity of the viruses and their ability to evade the host immune system through various strategies, such as downregulating the expression of major histocompatibility complex (MHC) molecules, interfering with interferon signaling, and modulating the activity of immune cells [23]. Multi-epitope vaccines contain multiple antigenic fragments (epitopes) from different proteins of the target pathogen. This type of vaccine has several advantages over traditional vaccines, such as inducing a broader and more robust immune response, reducing the risk of antigenic escape and cross-reactivity, and facilitating the production and delivery of the vaccine. Hence, a multi-epitope vaccine against HPyVs could be beneficial for preventing or treating HPyV-associated diseases, especially in immunocompromised patients, who are more susceptible to viral reactivation and complications [24].

In recent years, cancer vaccines have shown promising results against different cancers but vaccine development using traditional approaches is complex and requires a significant amount of effort [25]. Compared to traditional laboratory approaches, immunoinformatics enables rapid development of a multi-epitope vaccine, increasing efficiency and reducing costs [26, 27]. Epitope-based peptide vaccines demonstrated effectiveness in providing protective immunity against various viruses including Zika, dengue, SARS-CoV-2, and Coxsackie B viruses [28]. Hence, it is supposed that a peptide-based vaccine against oncogenic polyomavirus could provide an efficient protective vaccine against HPyV6/V7 oncogenic virus strains.

To develop a multi-epitope vaccine, CTL, HTL, and LBL epitopes of the HPyV6 and HPyV7 proteins, including the large T antigen (LTAg), VP1, and VP2, were identified, and the vaccine's stability and effectiveness were analyzed by immunoinformatics methods. The results support the likelihood that the multi-epitope vaccine can initiate a strong anti-HPyV immune response.

Methods

Retrieval and analysis of protein sequences

The proteomes of human polyomaviruses 6 and 7 were obtained from the National Center for Biotechnology Information (NCBI) database (https://www.ncbi.nlm.nih.gov/) [29]. Complete amino acid sequences were retrieved from the UniProt database (https://www.uniprot.org/) [30] in FASTA format. The VaxiJen v2.0 server (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) was used to evaluate antigenicity using the default threshold for viruses. VaxiJen v2.0 is based on auto- and cross-covariance (ACC) transformation and has a reported prediction accuracy of 70–89% [31, 32]. The AllergenFP v1.0 server (https://ddg-pharmfac.net/AllergenFP/) was then used to assess the allergenicity of the proteins. This server uses an alignment-free, descriptor-based fingerprint approach with a reported accuracy of 88.9% [33]. The TMHMM v2.0 server (https://services.healthtech.dtu.dk/services/TMHMM-2.0/), based on a hidden Markov model (HMM), was used for transmembrane (TM) helix prediction [34]. Proteins that were non-allergenic, antigenic, and had fewer TM helices were carried forward to the next step.

Identification and evaluation of Cytotoxic T-Lymphocyte (CTL) epitopes

Cytotoxic T lymphocytes (CTLs) play an important role in host defense [35]. CTLs carry the CD8 receptor, which binds MHC class I molecules on the surface of infected cells and enables the CTLs to deliver molecules that destroy the infected cells [36]. The NetCTL v1.2 server (https://services.healthtech.dtu.dk/services/NetCTL-1.2/), which combines weight matrices and artificial neural networks, predicts 9-mer CTL epitopes for 12 supertypes: A1, A2, A3, A24, A26, B7, B8, B27, B39, B44, B58, and B62. Using a threshold of 0.90, which corresponds to a specificity of 0.98 and a sensitivity of 0.74, this server was used to predict CTL epitopes with high combined scores from the selected protein sequences [37]. The MHC-I binding tool of the IEDB resource (http://tools.iedb.org/mhci/) was used with the CONSENSUS method to determine MHC-I binding alleles for each CTL epitope [38]. For each CTL epitope, antigenicity was predicted with the VaxiJen v2.0 server (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) [31], allergenicity with the AllerTOP v2.0 server (https://www.ddg-pharmfac.net/AllerTOP/) [39], toxicity with the ToxinPred server (http://crdd.osdd.net/raghava/toxinpred/) [40], and immunogenicity with the IEDB Class I Immunogenicity tool (http://tools.iedb.org/immunogenicity/) [41]. AllerTOP v2.0 separates allergens from non-allergens using amino acid descriptors, ACC transformation, and a k-nearest neighbor (kNN) classifier, with a reported accuracy of 85.3% in five-fold cross-validation [39]. ToxinPred predicts peptide toxicity using support vector machines (SVM), a machine learning approach, combined with a quantitative matrix [40]. Immunogenicity prediction was performed to determine whether a given epitope is likely to elicit an immune response. CTL epitopes that were antigenic, non-allergenic, non-toxic, and immunogenic and that showed high combined scores were used for vaccine construction.
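The selection rule described above can be sketched as a simple filter over per-epitope predictions. This is an illustrative sketch only: the records, field names, and scores are hypothetical placeholders rather than values from this study, the 0.4 antigenicity cutoff is assumed to be the VaxiJen default for viral targets, and a positive immunogenicity score is assumed to mean "immunogenic". Only the 0.90 NetCTL threshold is taken from the text.

```python
from dataclasses import dataclass

NETCTL_MIN = 0.90          # NetCTL threshold stated in the Methods
VAXIJEN_MIN = 0.4          # assumption: VaxiJen default threshold for viruses
IMMUNOGENICITY_MIN = 0.0   # assumption: a positive score counts as immunogenic


@dataclass
class CtlCandidate:
    peptide: str
    netctl_score: float
    vaxijen_score: float
    allergen: bool
    toxic: bool
    immunogenicity: float


def passes_filters(c: CtlCandidate) -> bool:
    """Return True if a 9-mer candidate meets every selection criterion."""
    return (
        len(c.peptide) == 9
        and c.netctl_score >= NETCTL_MIN
        and c.vaxijen_score >= VAXIJEN_MIN
        and not c.allergen
        and not c.toxic
        and c.immunogenicity > IMMUNOGENICITY_MIN
    )


def select_ctl(candidates, top_n=10):
    """Keep candidates passing all filters, ranked by NetCTL score (highest first)."""
    kept = [c for c in candidates if passes_filters(c)]
    kept.sort(key=lambda c: c.netctl_score, reverse=True)
    return kept[:top_n]


# Hypothetical records; peptides and scores are made up for illustration.
demo = [
    CtlCandidate("AAAAAAAAK", 1.20, 0.55, False, False, 0.12),
    CtlCandidate("CCCCCCCCL", 0.95, 0.35, False, False, 0.20),  # fails antigenicity
    CtlCandidate("DDDDDDDDY", 1.50, 0.80, True, False, 0.30),   # predicted allergen
    CtlCandidate("EEEEEEEEF", 0.92, 0.60, False, False, 0.05),
]
selected = [c.peptide for c in select_ctl(demo)]
print(selected)
```

Ranking by the NetCTL score after filtering is one possible tie-breaking choice; the text only states that epitopes with high scores were preferred.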

Identification and analysis of HTL (Helper T-Lymphocyte) epitopes

Antigen recognition by helper T cells activates CTLs and B cells to eliminate infectious pathogens [35]. We used the MHC class II binding tool of the IEDB resource (http://tools.iedb.org/mhcii/) to predict 15-mer HTL epitopes from the target protein sequences [42]. The CONSENSUS method was used to predict binding alleles, with a percentile rank threshold ≤ 2 to maintain consistency [43]. For each HTL epitope, antigenicity, toxicity, and allergenicity were predicted with the VaxiJen v2.0 server (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) [31], the ToxinPred server (http://crdd.osdd.net/raghava/toxinpred/) [40], and the AllergenFP v1.0 server (https://ddg-pharmfac.net/AllergenFP/) [33], respectively. HTL epitopes were assessed for non-toxicity, antigenicity, non-allergenicity, and cytokine induction ability. Interferon-gamma (IFN-γ) plays a major role in inhibiting virus replication by stimulating the immune responses of natural killer cells and macrophages and by increasing T cell responses [44]. IFN-γ induction was predicted with the IFNepitope tool (http://crdd.osdd.net/raghava/ifnepitope/predict.php) with 81.39% precision. The IL4pred tool (https://webs.iiitd.edu.in/raghava/il4pred/) was applied with a threshold score of 0.2 to evaluate the induction of interleukin 4 (IL-4); this SVM-based method has a reported accuracy of 75.76% [45]. Epitopes able to induce both cytokines were prioritized for vaccine construction. For the proteins that were without cytokine-inducing functions, we prioritized the IFN-γ and IL4-inducing abilities of the HTL epitopes [46].

Identification and analysis of LBL (Linear B-Lymphocyte) epitopes

LBL epitopes are important for inducing B lymphocytes to generate antibodies and have an important role in vaccine design. The ABCpred tool (http://crdd.osdd.net/raghava/abcpred/), based on a recurrent neural network with a threshold value of 0.51, was used to predict LBL epitopes from the selected protein sequences [47, 48]. LBL epitopes with scores > 0.8 were chosen as vaccine candidates. The AllerTOP v2.0 tool (https://www.ddg-pharmfac.net/AllerTOP/) [49], ToxinPred tool (http://crdd.osdd.net/raghava/toxinpred/) [40], and VaxiJen v2.0 tool (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) [31] were used to assess the allergenic, toxic, and antigenic profiles of the predicted LBL epitopes, respectively.

Evaluation of the human homology and epitope conservancy

We used the epitope conservancy analysis tool (http://tools.iedb.org/conservancy/) of the IEDB resource to analyze the conservation of the selected MHC class I/II epitopes. Conservancy indicates how widely an epitope is present across different strains. Epitopes with 100% maximum identity were selected for the vaccine construct [50]. Epitope homology with the human proteome was investigated to avoid cross-reaction with human proteins or a weak response due to tolerance, and non-homologous epitopes were selected. Human homology was determined using the protein BLAST module of the BLAST server (https://blast.ncbi.nlm.nih.gov/Blast.cgi) restricted to Homo sapiens (taxid: 9606), with an e-value threshold of 0.05. Peptides with no hits below the threshold e-value were designated as non-homologous epitopes [51, 52].
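At a 100% identity threshold, conservancy reduces to asking what fraction of strain sequences contain the epitope unchanged. The sketch below illustrates that idea only; it does not reproduce the IEDB tool, and the strain fragments are hypothetical strings built around one of the CTL epitopes reported in this article.

```python
from typing import Sequence


def conservancy(epitope: str, sequences: Sequence[str]) -> float:
    """Fraction of sequences that contain the epitope as an exact match."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    hits = sum(1 for seq in sequences if epitope in seq)
    return hits / len(sequences)


# Hypothetical strain fragments; the third carries a K -> R substitution.
strains = [
    "MKTLSHATLGNKQE",
    "GGLSHATLGNKAA",
    "MKTLSHATLGNRQE",
]
score = conservancy("LSHATLGNK", strains)
print(f"{score:.0%} of sequences contain the epitope")
```

Under the selection rule in the text, only an epitope with a score of 1.0 across all analyzed strains would be retained.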

Molecular docking and peptide modeling analysis

To assess binding ability, the selected MHC-I epitopes were docked with their corresponding HLA alleles. The selected CTL epitopes were modeled with the PEP-FOLD v3.5 tool (https://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD3/), which uses a taboo/backtrack sampling algorithm to predict 3D conformations of peptides of 5–50 residues [53]. For each peptide sequence, five candidate structures were predicted, the energy of each structure was determined using Swiss-PDB Viewer, and the model with the lowest energy was selected for subsequent assessments [54]. The human alleles HLA-A*03:01 (PDB ID: 6O9B), HLA-B*18:01 (PDB ID: 6MT3), HLA-A*02:01 (PDB ID: 7RTD), HLA-A*24:02 (PDB ID: 7MJA), HLA-B*08:01 (PDB ID: 7NUI), HLA-B*07:02 (PDB ID: 7RZD), HLA-B*58:01 (PDB ID: 5VWJ), HLA-A*01:01 (PDB ID: 6AT9), HLA-B*44:03 (PDB ID: 4JQX), and HLA-B*40:02 (PDB ID: 5IEK) were considered for the MHC-I epitopes. The crystal structures of the shortlisted HLA alleles were downloaded in PDB format from the RCSB Protein Data Bank (https://www.rcsb.org/) [55]. The protein preparation wizard of UCSF Chimera (version 1.11.2) was then used to prepare the proteins by removing structurally bound ligands [56]. The HADDOCK tool (https://wenmr.science.uu.nl/haddock2.4/) was used to estimate the interaction between the alleles and the CTL epitopes [57, 58]. Docking results were visualized with LigPlot, and images were prepared with UCSF Chimera and Microsoft PowerPoint 2019 [56, 59].

Population coverage analysis

Variations in HLA allele distribution and expression across regions and ethnic groups may affect the response to epitope-based vaccines [60]. The population coverage of the candidate vaccine was estimated with the IEDB population coverage server (http://tools.iedb.org/population/). The selected CTL and HTL epitopes, together with their corresponding HLA binding alleles in MHC classes I and II, were analyzed individually and in combination [61]. This study emphasized global allele coverage and coverage across different continental regions.

MHC cluster analysis

The MHC gene family is among the most polymorphic in the genomes of many species and comprises several thousand alleles in humans [62]. Cluster analysis of MHC alleles is used to identify MHC molecules of both classes with similar binding specificities. The MHCcluster 2.0 online tool (https://services.healthtech.dtu.dk/services/MHCcluster-2.0/) was used with default parameters to produce intuitive heat maps and phylogenetic tree-based visualizations of the functional clustering of MHC variants. For the MHC class I cluster analysis, the NetMHCpan-2.8 method was used with the HLA-prevalent and -characterized module, while for the MHC class II cluster analysis, the relevant DRB allele modules were chosen [62, 63].

Designing and formulation of the multi-epitope vaccine

The vaccine was constructed from the selected HTL, CTL, and LBL epitopes of HPyV6 and HPyV7 proteins, and an adjuvant was attached to the vaccine structure using a suitable linker [64, 65]. We used a TLR4 agonist as the adjuvant because viral glycoproteins were found to be recognized by TLR4 [66, 67]. Therefore, 50S ribosomal protein L7/L12 (NCBI ID: P9WHE3) was included as the adjuvant and attached to the N-terminus of the vaccine peptides through a bifunctional linker (EAAAK). The HTL, CTL, and LBL epitopes were connected using Gly-Pro-Gly-Pro-Gly (GPGPG), Ala-Ala-Tyr (AAY), and Lys-Lys (KK) linkers, respectively [64, 65]. The GPGPG linker inhibits the formation of junctional epitopes and aids immune processing. The AAY linker improves epitope immunogenicity by affecting peptide stability. The KK linker helps preserve the independent immunogenic activity of the epitopes in the constructed vaccine [68, 69].
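The linker scheme above can be sketched as plain string concatenation. This is a minimal illustration, not the authors' construct: the adjuvant, HTL, and LBL strings are hypothetical placeholders, the two CTL peptides are taken from the CTL epitopes reported in this article, the block order (adjuvant, CTL, HTL, LBL) follows the schematic in Fig. 4, and the choice of linker at the junction between two different epitope blocks is an assumption because the text does not specify it.

```python
ADJUVANT_LINKER = "EAAAK"  # joins the adjuvant to the first epitope
CTL_LINKER = "AAY"
HTL_LINKER = "GPGPG"
LBL_LINKER = "KK"


def assemble_construct(adjuvant, ctl, htl, lbl):
    """Concatenate adjuvant and epitope blocks from N- to C-terminus.

    Assumption: the junction between two epitope blocks uses the linker
    of the block that follows (GPGPG before HTL, KK before LBL).
    """
    parts = [adjuvant, ADJUVANT_LINKER, CTL_LINKER.join(ctl)]
    if htl:
        parts += [HTL_LINKER, HTL_LINKER.join(htl)]
    if lbl:
        parts += [LBL_LINKER, LBL_LINKER.join(lbl)]
    return "".join(parts)


construct = assemble_construct(
    adjuvant="MADJ",                   # hypothetical placeholder, not L7/L12
    ctl=["LSHATLGNK", "FERWVSFGM"],    # two CTL epitopes from this article
    htl=["X" * 15],                    # hypothetical 15-mer placeholder
    lbl=["Z" * 16],                    # hypothetical placeholder
)
print(construct)
print(len(construct))
```

Keeping the linkers as named constants makes it easy to compare alternative arrangements of the same epitopes, which is how the final construct length would change between candidate designs.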

Antigenicity, allergenicity, solubility, and physicochemical property assessment

The ProtParam tool (https://web.expasy.org/protparam/) was applied to analyze the physicochemical profile of the constructed vaccine [70]. Antigenicity was predicted with the VaxiJen v2.0 tool (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) [31] and the ANTIGENpro server (https://scratch.proteomics.ics.uci.edu/) of the SCRATCH protein predictor suite. ANTIGENpro demonstrated an accuracy of 76% in cross-validation tests on the combined dataset [32]. Three servers, AllergenFP v1.0 (https://www.ddg-pharmfac.net/AllergenFP/) [33], AllerTOP v2.0 (https://www.ddg-pharmfac.net/AllerTOP/) [39], and AlgPred (http://crdd.osdd.net/raghava/algpred/) [71], were used to predict the allergenic profile of the vaccine construct. The SOLpro online tool (https://scratch.proteomics.ics.uci.edu/) was applied to estimate the solubility of the proposed vaccine; a prediction score ≥ 0.5 indicates that the vaccine is predicted to be soluble [72]. In addition, we applied the Protein-Sol online tool (https://protein-sol.manchester.ac.uk/) to further assess solubility [73]. Protein-Sol predicts solubility by comparing the scaled solubility value (QuerySol) with the mean solubility of E. coli proteins in the experimental dataset (PopAvrSol), and a predicted score greater than 0.45 is considered soluble [74]. The number of transmembrane helices was predicted with the TMHMM v2.0 tool (https://services.healthtech.dtu.dk/services/TMHMM-2.0/) [34]. Possible signal peptides in the final designed vaccine were investigated with the SignalP 4.1 tool (https://services.healthtech.dtu.dk/services/SignalP-4.1/) [75].

BLAST homology assessment

The PSI-BLAST algorithm of the NCBI Protein BLAST (BLASTp) module was used to determine the homology between the vaccine construct and the human proteome [76, 77]. This cross-check was performed to avoid autoimmune reactions caused by molecular mimicry. The BLASTp search was limited to records from H. sapiens (taxid: 9606). To be considered acceptable, the query must not show more than 40% homology to the human proteome [78].

Prediction of the secondary structure

The secondary structural configurations were identified using two servers PSIPRED v4.0 (http://bioinf.cs.ucl.ac.uk/psipred/) and Prabi (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_gor4.html) with default parameters [79, 80]. The percentage of 2D features, i.e. α-helix, β-turns, and random coils was calculated on the vaccine construct by both servers. The precision of prediction results with the Prabi tool was reported to be a mean of 64.4% [80]. The PSIPRED server, as the most accurate predictive generator of secondary structure, displays an accuracy of 78.1% [81].

Homology modeling and 3D structure refinement and validation

The I-TASSER tool (https://seq2fun.dcmb.med.umich.edu//I-TASSER/) was used to generate the three-dimensional model of the multi-epitope vaccine. This server builds a 3D structure from the amino acid sequence using threading templates and estimates a C-score to assess the confidence of the predicted models. The C-score typically falls within the range of -5 to 2, with a higher C-score indicating greater confidence [82,83,84]. The tertiary structure model was refined with the GalaxyRefine tool (https://galaxy.seoklab.org/cgi-bin/submit.cgi?type=REFINE). For each of the five refined models, the output reports GDT-HA, RMSD, MolProbity score, clash score, poor rotamers, and Rama favored [85]. The models were validated with the ProSA-web online tool (https://prosa.services.came.sbg.ac.at/prosa.php), which estimates the Z-score, and the stereochemical quality of each structure was analyzed by assessing residue-by-residue geometry and overall structure geometry [86]. For Ramachandran plot analysis, the PROCHECK web server (https://saves.mbi.ucla.edu/) was used to determine the overall quality of the refined 3D structure of the vaccine. The Ramachandran plot shows the dihedral angles phi (ϕ) and psi (ψ) of amino acid residues and gives the percentage of residues in the most favored, additional allowed, generously allowed, and disallowed regions [87].

Identification of discontinuous B-Cell epitopes

The ElliPro server (http://tools.iedb.org/ellipro/) was used to predict conformational B-cell epitopes in the designed vaccine with default parameters (minimum score: 0.5; maximum distance: 6 Å). ElliPro implements a modified version of Thornton's method together with a residue clustering algorithm. Predictions are based on an approximation of the protein shape, the residue protrusion index (PI), and clustering of neighboring residues [88].

Molecular docking of the immune receptor (TLR4) and designed vaccine

The crystal structure of the TLR4 complex (PDB ID: 4G8A, 2.4 Å resolution) was retrieved from the RCSB Protein Data Bank [55]. Heteroatoms and chains B, C, and D were deleted in UCSF Chimera (version 1.11.2) [56]. Energy minimization of the protein was carried out in Swiss-PDB Viewer with the GROMOS 43B1 force field [54]. Molecular docking of the vaccine-TLR4 complex was performed with the ClusPro online tool (https://cluspro.bu.edu/login.php) [89], an automated, web-based program for peptide-protein and protein-protein docking. The server performs three computational steps: first, rigid-body docking with PIPER; second, clustering of the 1000 lowest-energy docked structures using pairwise IRMSD as the distance metric; and finally, refinement of the predicted complex structures at the cluster centers by energy minimization. We used the balanced coefficient set to obtain the best protein–protein binding results [90]. The output of this server is a short list of putative complexes ranked according to their clustering properties.

In silico immune simulation

The immune simulation was conducted with the C-ImmSim server (https://kraken.iac.rm.cnr.it/C-IMMSIM/index.php?page=1) to investigate the immunogenicity and immune response profile of the vaccine. This server uses a position-specific scoring matrix (PSSM) and machine learning techniques to simulate realistic immune responses and interactions [91]. With otherwise default parameters, the injection time steps in C-ImmSim were set to 1, 42, and 84; each time step equals 8 h, and time step 1 is the injection at time = 0. The interval between consecutive injections (three injections in total) was considered to be 4 weeks [92].
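Because each simulation time step represents 8 h, converting between time steps and calendar time is simple arithmetic. The short sketch below performs that conversion for the injection steps listed above and, in the other direction, for a dosing interval given in days; the function names are illustrative and not part of C-ImmSim.

```python
HOURS_PER_STEP = 8  # stated in the text: one time step equals 8 h


def step_to_days(step: int) -> float:
    """Days elapsed since the first injection (time step 1 is time 0)."""
    return (step - 1) * HOURS_PER_STEP / 24


def steps_for_interval(days: float) -> int:
    """Number of time steps spanned by an interval given in days."""
    return round(days * 24 / HOURS_PER_STEP)


injection_steps = [1, 42, 84]
elapsed_days = [step_to_days(s) for s in injection_steps]
for step, days in zip(injection_steps, elapsed_days):
    print(f"step {step:>3}: day {days:.1f}")

# A 4-week (28-day) gap expressed in 8 h time steps.
print(steps_for_interval(28))
```

With these settings, time step 42 falls at about day 13.7 and time step 84 at about day 27.7, while a 28-day gap spans 84 time steps.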

Molecular dynamics simulation

Molecular dynamics (MD) simulation was applied to refine the TLR-vaccine complex structures using GROMACS 2018 [93, 94]. The structures were centered in a dodecahedron box and solvated using the TIP3P water model. To neutralize the systems, some water molecules were randomly replaced by Cl- or Na+ ions. After neutralization, energy minimization was performed with the steepest descent algorithm. The systems were equilibrated for 100 ps in the NVT ensemble at a temperature of 298 K, followed by 100 ps in the NPT ensemble at a pressure of 1 bar. Electrostatic interactions were calculated by PME (Flores-Canales and Kurnikova, 2015), and the LINCS procedure was applied to constrain all bonds involving hydrogen atoms. The final MD simulation was run for 100 ns without restraints.

In silico cloning and codon optimization of the final vaccine protein

The online Java Codon Adaptation Tool (JCat) web server (http://www.jcat.de/) was used for codon optimization and reverse translation of the final vaccine protein [95]. The E. coli K12 strain was used as the expression host. The server reports parameters such as GC content and the codon adaptation index (CAI), which were used to assess expected protein expression levels. After introducing BamHI and XhoI restriction sites at the 3′ and 5′ ends of the designed vaccine sequence, respectively, the sequence was inserted into the pET30(+) vector using SnapGene software.
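Two of the checks implied by this step are easy to sketch: the GC content of the optimized sequence, and whether the insert itself contains the recognition sites used for cloning (which would cause it to be cut internally). The sketch below assumes the standard recognition sequences GGATCC (BamHI) and CTCGAG (XhoI); the example sequences are hypothetical and are not the vaccine gene, and the CAI calculation performed by JCat is not reproduced.

```python
# Standard recognition sequences for the two enzymes named in the text.
RESTRICTION_SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}


def gc_content(dna: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def internal_site_counts(dna: str) -> dict:
    """Count occurrences of each cloning site inside the insert.

    Both sites are palindromic, so scanning one strand is sufficient.
    """
    dna = dna.upper()
    return {name: dna.count(site) for name, site in RESTRICTION_SITES.items()}


insert = "ATGGCGCTGAAATAA"  # hypothetical 15-nt insert for illustration
print(f"GC content: {gc_content(insert):.1f}%")
print(internal_site_counts(insert))
```

An insert is suitable for this cloning strategy only if both counts are zero before the flanking sites are added.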

Results

Vaccine design

Retrieval and evaluation of protein sequences

The reference sequences of the HPyV6 and HPyV7 large T antigen (LTAg) and viral proteins 1 and 2 (VP1, VP2) were obtained from the UniProt Proteome database. The VaxiJen v2.0 tool was used to evaluate the antigenicity of these protein sequences, and TMHMM v2.0 was used to predict the number of TM helices. The antigenicity scores of the candidate proteins ranged from 0.4281 to 0.4996, so the proteins of interest have sufficient predicted antigenic properties. In addition, the AllergenFP server predicted that the proteins are non-allergenic, and no transmembrane helices were predicted. The large T antigen, VP1, and VP2 were therefore included in the multi-epitope vaccine. Table 3 lists the sequences of these proteins, their UniProt entries, and their allergenicity, antigenicity, and TM helices.

Table 3 Details of the selected proteins and Their Selection Criteria

Identification and validation of CTL epitopes

Based on the specified selection criteria, 52 potential CTL epitopes of the candidate proteins were identified as non-allergenic, non-toxic, antigenic, and immunogenic (Table S1). The epitopes were predicted with NetCTL 1.2 using a combinatorial approach. Of these, ten predicted CTL epitopes were selected for the peptide-based vaccine design. The chosen CTL epitopes and their characteristics are listed in Table 4.

Table 4 A brief list of CTL Epitopes to form the final structure of the vaccine

Identification and validation of HTL epitopes

Overall, 55 potential HTL epitopes that met the selection criteria (non-allergenic, non-toxic, and antigenic) were identified (Table S2). The predicted cytokine induction capability of these HTL epitopes was evaluated, and based on those results 7 peptides were selected for inclusion in the final vaccine. The chosen HTL peptides and their characteristics are listed in Table 5.

Table 5 A brief list of HTL Epitopes to form the final structure of the vaccine

Identification and validation of linear B‑cell epitope (LBL)

Overall, 46 LBL epitopes from the target proteins were identified by evaluating potential toxicity, immunogenicity, and antigenic characteristics (Table S3). One LBL epitope was chosen from each of the six proteins for use in the final vaccine. The six LBL epitopes and their characteristics are listed in Table 6.

Table 6 A brief list of LBL Epitopes to form the final structure of the vaccine

Vaccine evaluation

Evaluation of human homology and epitope conservancy

Each shortlisted epitope in both MHC classes was checked for homology to normal human proteins. No homologs were identified in the human proteome, suggesting that responses against these peptides are unlikely to target normal human proteins. The conservancy and human homology analyses of the selected epitopes are included in Tables 4 and 5.

Molecular docking analyses of the CTL epitopes and HLA alleles

Using molecular docking simulations, we assessed the binding of CTL epitopes to HLA alleles. We selected the HLA-A*03:01 allele for the LSHATLGNK epitope, HLA-B*18:01 for FERWVSFGM, HLA-A*02:01 for WLLFVLEEL, HLA-A*24:02 for IYKVEAILL, HLA-B*08:01 for TPKRRNLLF, HLA-B*07:02 for GPRIGSTTM, HLA-B*58:01 for LWLPQAWPW, HLA-A*01:01 for DTMIVWEAY, HLA-B*44:03 for MELTDVLLI, and HLA-B*40:02 for TELLFAPQM. A more negative z-score indicates a more reliable cluster. Based on the docking parameters, the selected CTL epitopes exhibited excellent binding interactions with the active site of the HLA alleles (Fig. 1). The docking statistics are shown in Table 7.

Fig. 1

Molecular docking of the selected CTL epitopes with their respective HLA alleles as indicated in Table 5

Table 7 Data of the molecular docking between CTL Epitopes and HLA Alleles

Population coverage analysis

The analysis showed that MHC class I epitopes cover 97.74% of the global population, while MHC class II epitopes cover 26.99%. Since the multi-epitope vaccine protein includes epitopes of both MHC classes, their combined population coverage was estimated; overall, 98.35% of the world's population was covered. The combined MHC class I and class II epitope coverage was highest in Europe (99.55%), followed by West Indies (98.25%), North America (98.19%), East Asia (96.82%), Oceania (95.27%), Southeast Asia (94.27%), North Africa (92.84%), West Africa (92.69%), Northeast Asia (91.63%), South Africa (91.08%), Southwest Asia (89.37%), East Africa (88.85%), South Asia (88.69%), Central Africa (84.65%), South America (80.56%), and Central America (9.07%). A comparison of population coverage for MHC class I, MHC class II, and combined epitopes is shown in Fig. 2, Table 8, and Figures S1-S3.
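The relationship between the class-specific and combined figures can be illustrated with a short calculation. The sketch below assumes that class I and class II coverage are statistically independent, so that a person is uncovered only if uncovered by both classes; this is a simplifying assumption for illustration and is not a description of how the IEDB server derives its estimate from allele frequencies.

```python
def combined_coverage(class_i: float, class_ii: float) -> float:
    """Combined coverage under an independence assumption.

    A person is missed only if missed by both epitope classes, so
    coverage = 1 - (1 - class_i) * (1 - class_ii).
    """
    for value in (class_i, class_ii):
        if not 0.0 <= value <= 1.0:
            raise ValueError("coverage must be a fraction between 0 and 1")
    return 1.0 - (1.0 - class_i) * (1.0 - class_ii)


# World coverage values reported in the text, as fractions.
world = combined_coverage(0.9774, 0.2699)
print(f"{world:.2%}")
```

With the reported world values (97.74% and 26.99%), this approximation gives about 98.35%, in line with the combined figure reported above.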

Fig. 2

Analysis of population coverage of alleles worldwide

Table 8 Analysis of MHC restriction data for worldwide population coverage

MHC cluster analysis

The MHCcluster v2.0 server was used to cluster the MHC class I and class II alleles that interact with the selected structural protein epitopes. In this study, 25 MHC class I alleles and 22 MHC class II alleles were analyzed. The cluster analyses of the MHC I and MHC II molecules are shown in Fig. 3A and C, respectively, and the corresponding tree maps are shown in Fig. 3B and D. In the heat maps, red zones indicate stronger interactions, whereas yellow zones indicate weaker interactions among the clusters of both MHC molecules.

Fig. 3

Results of the Cluster analysis for MHC I and II molecules. A Heat map showing the MHC-I cluster, B Heat map showing the MHC-II cluster, C detailed tree map of the MHC-I clustering analysis, D detailed tree map of the MHC-II clustering analysis

Formulation of the vaccine construct

To formulate the vaccine construct, we assembled the most favorable CTL, HTL, and LBL epitopes using AAY, GPGPG, and KK linkers, respectively. Furthermore, the 50S ribosomal protein L7/L12 adjuvant (UniProt accession: P9WHE3) was attached to the N-terminal region of the vaccine using an EAAAK linker. The construct contains ten CTL, seven HTL, and six LBL epitopes from the target protein sequences of polyomaviruses 6 and 7. After evaluating and comparing different arrangements, we settled on a final vaccine structure of 501 amino acids, which was used for all subsequent evaluations (Fig. 4).
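The assembly scheme described above can be sketched in a few lines. This is an illustration only: the CTL list assumes the ten epitopes named in the docking section are the ones used, the adjuvant, HTL, and LBL entries are placeholders because their sequences are not given in this section, and the linkers placed between the epitope blocks (GPGPG before the HTL block, KK before the LBL block) are an assumption rather than the authors' stated layout.

```python
# Ten CTL epitopes listed in the CTL/HLA docking section
CTL = ["LSHATLGNK", "FERWVSFGM", "WLLFVLEEL", "IYKVEAILL", "TPKRRNLLF",
       "GPRIGSTTM", "LWLPQAWPW", "DTMIVWEAY", "MELTDVLLI", "TELLFAPQM"]

# Placeholders (hypothetical): real sequences are not given in this section
ADJUVANT = "<ADJUVANT>"
HTL = [f"<HTL{i}>" for i in range(1, 8)]   # seven HTL epitopes
LBL = [f"<LBL{i}>" for i in range(1, 7)]   # six LBL epitopes


def assemble(adjuvant, ctl, htl, lbl):
    """Join adjuvant and epitope blocks with the linkers named in the text.

    The block-junction linkers (GPGPG, KK) are an assumption.
    """
    return (adjuvant + "EAAAK"
            + "AAY".join(ctl)
            + "GPGPG" + "GPGPG".join(htl)
            + "KK" + "KK".join(lbl))


construct = assemble(ADJUVANT, CTL, HTL, LBL)
print(construct[:40])
```

Replacing the placeholders with real sequences would give a candidate construct whose length and properties can then be checked against the reported values.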

Fig. 4

Schematic presentation of the formulated multi-epitope vaccine construct. The construct includes (left to right) an adjuvant and the CTL, HTL, and LBL epitopes, indicated by the brown, navy, light blue, and violet rectangular boxes, respectively

Vaccine assessment

Assessment of the antigenicity, allergenicity, and physicochemical properties of the final vaccine protein

The physicochemical characteristics of the formulated construct were evaluated. The vaccine's chemical formula is C2455H3814N662O694S11, and its molecular weight was estimated to be 54059.97 Da. The theoretical pI was calculated as 9.41, indicating that the protein is basic. The grand average of hydropathicity (GRAVY) was -0.317, demonstrating the hydrophilic nature of the protein. Furthermore, the instability index was 34.33 and the aliphatic index was 72.95. The estimated half-life of the protein was 30 h in mammalian reticulocytes (in vitro), more than 20 h in yeast (in vivo), and approximately 10 h in E. coli (in vivo). Vaccine allergenicity and antigenicity were also assessed using multiple servers. According to the ANTIGENPro and VaxiJen 2.0 tools, the antigenicity scores were 0.897722 and 0.6224, respectively. We also used several tools to evaluate the solubility of the vaccine sequence: the SOLpro and Protein-Sol solubility scores were 0.777639 and 0.504, respectively. Finally, no signal peptides or transmembrane helices were predicted in the final vaccine (Table 9 and Fig. 5A).
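As a consistency check, the reported molecular weight can be recomputed from the reported chemical formula. The sketch below uses rounded standard atomic weights, so it is expected to agree with the server value only to within a fraction of a dalton; the function name and the mass table are illustrative, not part of the original workflow.

```python
import re

# Rounded standard atomic weights (g/mol); small deviations from the
# server's values are expected
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def formula_mass(formula):
    """Sum atomic masses for a simple formula such as 'C2H6O'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


mass = formula_mass("C2455H3814N662O694S11")
print(f"{mass:.1f} Da")
```

The result lies within about 1 Da of the reported 54059.97 Da, which suggests the formula and the molecular weight in the text are mutually consistent.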

Table 9 Allergenicity, antigenicity, and physicochemical properties of the final structure of the vaccine

Fig. 5

Protein-Sol server prediction of vaccine protein solubility (A), Prabi server prediction of vaccine secondary structure (B), PSIPRED server prediction of vaccine secondary structure (C)

BLAST homology assessment

A BLAST search of the designed vaccine sequence against the Homo sapiens proteome (taxid:9606) found 24% homology with human proteins. This low similarity suggests that the chimeric vaccine construct is unlikely to trigger autoimmune responses in the host.

Secondary structure extrapolation

The proportions of the secondary structural features of the multi-epitope vaccine were estimated using the PSIPRED and Prabi servers.

The Prabi tool estimated 40.92% α-helix, 14.17% β-strand, and 44.91% random coil (Fig. 5B), whereas the PSIPRED server estimated 37.72% α-helix, 10.97% β-strand, and 51.31% random coil in the multi-epitope vaccine construct (Fig. 5C).

Tertiary structure modeling, refinement, and validation of the multi‑epitope vaccine

The I-TASSER online tool was used to build the tertiary structure of the final vaccine protein. The server generated five models, with estimated C-scores of -0.97, -1.63, -2.97, -4.44, and -3.40 for models 1 to 5, respectively. Model 1, which had the highest C-score (-0.97), was selected for further analysis. Before refinement, the PROCHECK and ProSA tools were used to assess this model: it showed a Z-score of -2.12, with 73.4% of its residues in the most favored regions. The GalaxyRefine tool was then used to refine the 3D structure, generating five refined structures from the raw model (Table 10). After refinement, all structures had a higher proportion of Rama-favored residues than the raw model. Model 3 was determined to be the best refined structure, with the following scores: Rama favored, 90.6; poor rotamers, 0.8; MolProbity, 2.195; clash score, 14.4; RMSD, 0.517; and GDT-HA, 0.9037. The ProSA and SAVES v6.0 online tools were employed to validate the refined structure. The Ramachandran plot of the selected model showed 87.7% of amino acids in favored regions, 9.6% in additional allowed regions, 1.0% in generously allowed regions, and 1.7% in disallowed regions. The Z-score of the refined model was -2.61 (Fig. 6). Refined model 3 was therefore used for all further analyses in this study.

Table 10 Models of vaccines refined by GalaxyRefine

Fig. 6

The evaluation of the ultimate 3D structure. 3D structure refined by the GalaxyRefine server (A), a Z-score calculated using the PROSA server for the vaccine construct (B), and an analysis of the vaccine construct using Ramachandran plot (C)

Screening for conformational B-Cell epitopes

The ElliPro server identified thirteen conformational B-cell epitopes in the vaccine construct (Fig. 7). A total of 243 residues were found in these epitopes, which ranged in size from 3 to 98 residues (Table 11). The scores of the predicted conformational B-cell epitopes ranged from 0.975 to 0.53.

Fig. 7

3D structures of the conformational B-cell epitopes present in the protein vaccine (A-N). Green rods and yellow domains show the protein construct and the conformational B-cell epitopes, respectively

Table 11 Shortlist of Conformational Epitopes of the ultimate designed vaccine

Molecular docking of the vaccine protein and TLR complex

Immune cells and vaccine constructs must interact in order to produce a stable and efficient immune response. Molecular docking of the designed vaccine with TLR4 was carried out using the ClusPro 2.0 server, which produced 30 clusters ranked by energy. The eight top-ranked clusters had energies of -1414.0, -1406.2, -1372.0, -1350.9, -1341.0, -1339.5, -1327.2, and -1321.0 kcal/mol. The cluster with the lowest energy (-1414.0 kcal/mol) was selected. The docked complex was visualized with the Chimera 1.15rc program (Fig. 8). Using the LigPlot v1.4.5 software, we generated a map of the hydrophobic interactions and hydrogen bonds between the vaccine protein and TLR4 (Fig. 9). The vaccine and chain A of TLR4 formed 20 hydrogen bonds; the residues involved and the bond lengths are listed in Table 12.

Fig. 8

Three-dimensional representation of molecular docking of the vaccine construct and TLR complex

Fig. 9

Schematic of the interaction between TLR4 and the vaccine construct. A Amino acids involved in hydrogen bonding from chain A of TLR4, B amino acids of the vaccine construct

Table 12 Hydrogen bonding interactions between TLR4 and vaccine amino acids

Vaccine immune simulation

In silico immune simulation

The C-ImmSim immune simulator was used to simulate the immune responses elicited by the final chimeric vaccine construct. The secondary and tertiary responses were marked by elevated levels of IgM + IgG, IgM, IgG1 + IgG2, and IgG1 antibodies, followed by a reduction in antigen concentration (Fig. 10A). The results also indicated a variety of long-lasting B-cell isotypes, suggesting that isotype switching and memory formation may be involved (Fig. 10B). In addition, the TH (helper) and TC (cytotoxic) cell populations increased clearly, along with memory development (Fig. 10C and D). IFN-γ production and the dendritic cell population also increased after immunization (Fig. 10E and F). These data indicate that successive exposures to the target antigen produce a robust secondary immune response, enhanced antigen clearance, and strong immune memory.

Fig. 10

Results of the in silico immune simulation using the C-ImmSim server for the designed vaccine. A the generation of immune complex and immunoglobulin as a result of response to designed vaccine injections, B B lymphocyte total count after the three injections, C growth of CD4 T-helper lymphocytes after the three injections including active, duplicating, resting, anergic, D Increasing the number of cytotoxic CD8 lymphocytes after injection of the designed vaccine, E Proliferation of dendritic cells after immunization, F Stimulation of cytokines and interleukins after vaccine administration

Evaluation of MD simulations

The global structural stability of the proteins was evaluated using the root-mean-square deviation (RMSD) of the backbone atoms, which shows how much the protein conformation changed from the initial structure during the MD simulation. The RMSD of TLR4 ranged from 0.25 to 0.35 nm and stabilized after 20 ns. The RMSD of the vaccine ranged from 0.45 to 1. The root-mean-square fluctuation (RMSF) indicates how much each residue fluctuates around a reference position during the simulation; no unusual fluctuations were observed in either protein structure. The compactness of TLR4 and the vaccine was evaluated using the radius of gyration (Rg). The Rg of TLR4 ranged from 3.25 to 3.35, and that of the vaccine from 0.3 to 0.35; the compactness of each protein remained stable. The hydrogen bonds between the receptor and the vaccine were also calculated: about 15 hydrogen bonds were formed between TLR4 and the vaccine (Fig. 11).
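For readers unfamiliar with these quantities, the following sketch shows how RMSD and Rg are defined for a set of atomic coordinates. It is a minimal illustration on toy coordinates, not the analysis pipeline used in this study: the function names are hypothetical, and the RMSD here omits the least-squares superposition onto the reference structure that MD analysis tools normally perform first.

```python
import math


def rmsd(coords, ref):
    """RMSD between two equally sized coordinate lists (no superposition)."""
    if len(coords) != len(ref):
        raise ValueError("coordinate sets must have the same length")
    squared = sum((a - b) ** 2
                  for p, q in zip(coords, ref)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(coords))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted RMS distance of atoms from their center of mass."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    center = [sum(m * p[i] for m, p in zip(masses, coords)) / total
              for i in range(3)]
    squared = sum(m * sum((p[i] - center[i]) ** 2 for i in range(3))
                  for m, p in zip(masses, coords))
    return math.sqrt(squared / total)


# Toy example: every atom displaced by 0.3 along x relative to the reference
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(x + 0.3, y, z) for x, y, z in reference]
print(rmsd(frame, reference))
print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

A flat RMSD or Rg trace over time, as reported above, corresponds to these values changing little from frame to frame.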

Fig. 11

Molecular dynamics simulation of the TLR4 complex and vaccine construct. A RMSD plot of the vaccine construct, B RMSD plot of the TLR4, C RMSF plot of the vaccine, D RMSF plot of the TLR4, E and F radius of gyration of the vaccine-TLR4 complex, and G hydrogen bond analysis from the simulation system

Codon optimization and in silico cloning of the designed vaccine

Codon optimization improves the expression and translational accuracy of a recombinant protein by adapting the coding sequence to the codon bias of the target host. The vaccine sequence was reverse-translated with the JCat server to maximize expression in Escherichia coli strain K12. After codon optimization, the CAI and GC content of the optimized sequence were 1.0 and 52.96%, respectively. These data suggest that the optimized sequence could be expressed efficiently in the E. coli system. Finally, the optimized sequence was inserted in silico into the pET30a(+) vector using the SnapGene program (Fig. 12).
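The two metrics reported here can be sketched as follows. CAI is the geometric mean of the relative adaptiveness of each codon, and GC content is the percentage of G and C bases. The weight table below is hypothetical and covers only a few codons for illustration; it does not contain real E. coli K12 values, and the function names are not taken from any server. This simplified CAI also skips codons missing from the table, whereas full implementations handle single-codon amino acids and stop codons explicitly.

```python
import math

# Hypothetical relative adaptiveness values (1.0 = most used synonymous codon)
TOY_WEIGHTS = {"CTG": 1.0, "CTA": 0.1, "AAA": 1.0, "AAG": 0.3}


def gc_content(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def cai(dna, weights):
    """Geometric mean of codon weights; codons absent from the table are skipped."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs))


optimized = "CTGAAACTG"   # only top-weight codons
unoptimized = "CTAAAGCTA"  # same peptide (Leu-Lys-Leu), rarer codons
print(cai(optimized, TOY_WEIGHTS), cai(unoptimized, TOY_WEIGHTS))
print(round(gc_content(optimized), 2))
```

A CAI of 1.0, as reported for the optimized vaccine sequence, means every codon scored is the host's most-used synonymous codon under the reference table in use.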

Fig. 12

The map of the in silico cloning of the vaccine construct into the pET30a(+) vector using SnapGene software. The black segment indicates the backbone of the vector and the red segment shows the vaccine construct. This vaccine construct contains restriction sites for XhoI and BamHI restriction enzymes at the 5′ and 3′ ends, respectively

Discussion

Conventional techniques for vaccine development involve the use of whole organisms, which can lead to undesired exposure to antigens and may trigger allergic responses. To prevent allergic responses, peptide-based vaccines, which consist of short peptide fragments derived from immunogenic proteins, have been used to produce strong and targeted immune reactions. Vaccines are highly effective against infectious diseases such as rabies, rubella, yellow fever, smallpox, hepatitis A/B, chickenpox, polio, influenza, human papillomavirus, and Japanese encephalitis [96, 97]. The development of vaccines involves complex, time-consuming, and expensive in vitro and in vivo assays to ensure vaccine effectiveness [98]. Current advances in immunoinformatics and computational biology allow effective vaccines to be designed in silico and reduce the number of in vitro experiments required [99]. For example, a multi-epitope vaccine designed against Clostridium perfringens was validated experimentally in an in vitro study [100]. With this method, a wide range of vaccine candidates can be identified without the need to cultivate pathogenic organisms [98].

Human polyomavirus 6 (HPyV6) and HPyV7 are polyomavirus species initially discovered in the skin of healthy people [101]. HPyV6 and HPyV7 proteins have been reported to bind and inactivate p53, suggesting an oncogenic role for these viruses [102]. The incidence of malignant skin tumors has increased over recent decades, chiefly as a result of modifiable exposures. The World Health Organization reported that about 8500 new cases of skin tumors are diagnosed every day in the U.S. [7, 8]. Several studies have shown the prevalence of HPyV6 and HPyV7 in primary cutaneous malignancies, including actinic keratosis, basal cell carcinoma, bone marrow transplantation, neuroendocrine, and lymphoid skin cancers [7].

Therefore, in the present study, the POLY capsid protein VP1, POLY minor capsid protein VP2, and POLY large T antigen from HPyV6 and HPyV7 were examined as candidate antigens for epitope identification. The allergenicity, toxicity, and antigenicity of the identified epitopes were assessed. A number of factors must be considered when designing peptide-based vaccines, including the intrinsic properties of the selected epitopes, adjuvant, and linkers, and their arrangement and location within the protein. Based on the findings of Olugbenga et al. [103], Mahnoor Majid et al. [104], and Sami et al. [99], we used KK, GPGPG, and AAY linkers to fuse the LBL, HTL, and CTL epitopes, respectively. The AAY and GPGPG linkers promote epitope presentation while reducing junctional epitopes [105, 106]. The KK linker, a bi-lysine basic linker, preserves the immunogenic properties of B-cell epitopes while keeping the pH near physiological levels [107, 108].

Compared with live attenuated vaccines, computationally designed vaccines have relatively low immunogenicity. To address this problem, adjuvants are widely used to increase vaccine effectiveness. Adjuvants generally function by activating innate immune cells through pathogen-associated molecular pattern receptors. They can also improve vaccine performance by stabilizing the epitope structure of the vaccine antigen, providing a depot for the gradual release of the antigen, improving presentation of the antigen to antigen-presenting cells (APCs), increasing the uptake molecules of these cells at the vaccination site, and promoting proper binding of the antigen to these cells [109]. The 50S ribosomal protein L7/L12 (Locus RL7_MYCTU) from Mycobacterium tuberculosis is a TLR4 agonist [110]. Thus, we used it as an adjuvant to enhance the immunogenicity of the vaccine. EAAAK, an empirical α-helical linker, reduces interference with other protein regions while providing rigidity and improving the durability of the chimeric protein [111]. Multiple servers predicted that the vaccine construct was non-allergenic and highly antigenic, suggesting that it can trigger robust immune responses without inducing unwanted allergic reactions.

The final vaccine had a theoretical pI of 8.3, indicating its alkaline nature. Furthermore, the vaccine construct had a molecular weight of 54.06 kDa, indicating favorable antigenic characteristics [112]. Proteins with a molecular weight below 110 kDa are generally deemed appropriate vaccine candidates [113]. The instability index of the vaccine was 34.63; values below 40 indicate a protein that is stable in biological environments [114]. The constructed vaccine has an estimated half-life of above 20, 10, and 30 h in yeast cells (in vivo), E. coli (in vivo), and mammalian reticulocytes (in vitro), respectively. On the basis of previous findings, these half-life results are acceptable [99, 115]. The aliphatic index was 72.95, indicating that the constructed vaccine would be thermostable at normal human body temperature [116]. The GRAVY value of the protein was -0.317, indicating the hydrophilic nature of the vaccine. This strong affinity for water molecules makes vaccine formulation and purification easier [117, 118].
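The GRAVY value discussed above is simply the mean Kyte-Doolittle hydropathy of all residues in a sequence, with negative values indicating a hydrophilic protein. The sketch below applies that definition to one of the CTL epitopes named in the docking section; the full 501-residue vaccine sequence is not reproduced in this section, so the reported value of -0.317 cannot be recomputed here. The function name is illustrative.

```python
# Kyte-Doolittle hydropathy scale
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


epitope_gravy = gravy("LSHATLGNK")  # CTL epitope paired with HLA-A*03:01
print(round(epitope_gravy, 3))
```

Individual epitopes and linkers can differ considerably in hydropathy; the whole-construct GRAVY averages over all of them.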

After the 3D model of the vaccine was built, refinement was used to enhance the quality of both its global and local structure. Validation of the model is necessary to compare the unrefined and refined structures accurately. The Ramachandran plot indicated that 73.4% of the amino acid residues in the unrefined structure were in the favored region, compared with 87.7% in the refined structure, demonstrating the improvement achieved by refinement. Assessment of the immune response induced by an antigen is one of the primary criteria in the validation of a proposed vaccine [119]. Molecular docking analyses were used to investigate the interaction between the formulated vaccine and TLR4, and suitable interactions were detected with a strong affinity score of -1414.0 kcal/mol. This interaction with TLR4 suggests that the recombinant protein vaccine has the capacity to stimulate innate and adaptive immune responses. To investigate the stability and dynamic behavior of the vaccine/TLR4 complex, an MD simulation was performed, and the RMSD plot confirmed the stable binding of this complex.

An appropriate host is required for the expression of a recombinant protein. E. coli expression systems are the most common hosts for expressing recombinant proteins [120, 121]. Codon optimization was performed to enable the recombinant vaccine to be expressed at high levels in E. coli (K12 strain). Analysis of the designed vaccine indicated a CAI of 1.0 and a GC content of 52.96%. CAI values above 0.8 and GC contents of 30–70% have been reported to favor high expression in the E. coli host [122, 123].

Conclusion

Human polyomaviruses (HPyVs) infect a wide range of tissues, such as the skin, kidney, and respiratory tract. These infections are often persistent and asymptomatic, but they can lead to cancer. Currently, no effective vaccine is available for HPyV. In this study, immunoinformatics techniques were applied to identify and refine candidate vaccines against HPyV. Highly immunogenic T- and B-cell epitopes were identified and used for vaccine design. The proposed vaccine is projected to produce robust immune reactions, including cytokine and interferon production. The binding analysis indicated that the vaccine binds to the immune receptor TLR4 and that the complex is dynamically stable. Although experimental trials in appropriate animal models are necessary to test the potency of the engineered vaccine, analysis using different bioinformatics tools indicated the high immunogenicity and preventive potential of the developed vaccine.

Availability of data and materials

The databases and tools used in the experiments are listed as follows: (1) NCBI: https://www.ncbi.nlm.nih.gov/ (2) UniProt database: https://www.uniprot.org/ (3) VaxiJen server: http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html (4) AllergenFP v1.0 server: https://ddg-pharmfac.net/AllergenFP/ (5) TMHMM v2.0 tool: https://services.healthtech.dtu.dk/services/TMHMM-2.0/ (6) NetCTL 1.2 server: https://services.healthtech.dtu.dk/services/NetCTL-1.2/ (7) IEDB server: http://tools.iedb.org/mhci/ (8) AllerTOP v2.0 server: https://www.ddg-pharmfac.net/AllerTOP/ (9) ToxinPred server: http://crdd.osdd.net/raghava/toxinpred/ (10) Class I Immunogenicity: http://tools.iedb.org/immunogenicity/ (11) IEDB resource: http://tools.iedb.org/mhcii/ (12) IFNepitope tool: http://crdd.osdd.net/raghava/ifnepitope/design.php (13) IL4pred tool: https://webs.iiitd.edu.in/raghava/il4pred/ (14) ABCpred tool: http://crdd.osdd.net/raghava/abcpred/ (15) epitope conservancy analysis: http://tools.iedb.org/conservancy/ (16) BLAST server: https://blast.ncbi.nlm.nih.gov/Blast.cgi (17) PEP-FOLD v3.5 tool: https://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD3/ (18) HADDOCK tool: https://wenmr.science.uu.nl/haddock2.4/ (19) RCSB Protein Data Bank: https://www.rcsb.org/ (20) IEDB population coverage server: http://tools.iedb.org/population/ (21) MHCcluster 2.0 tool: https://services.healthtech.dtu.dk/services/MHCcluster-2.0/ (22) ExPASy ProtParam tool: https://web.expasy.org/protparam/ (23) ANTIGENPro server: https://scratch.proteomics.ics.uci.edu/ (24) AlgPred: http://crdd.osdd.net/raghava/algpred/ (25) SOLpro tool: https://scratch.proteomics.ics.uci.edu/ (26) SignalP 4.1 tool: https://services.healthtech.dtu.dk/services/SignalP-4.1/ (27) PSIPRED v4.0: http://bioinf.cs.ucl.ac.uk/psipred/ (28) Prabi: https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_gor4.html (29) I-TASSER tool: https://seq2fun.dcmb.med.umich.edu//I-TASSER/ (30) GalaxyRefine tool: https://galaxy.seoklab.org/cgi-bin/submit.cgi?type=REFINE (31) ProSA-web: https://prosa.services.came.sbg.ac.at/prosa.php (32) Procheck server: https://saves.mbi.ucla.edu/ (33) ElliPro server: http://tools.iedb.org/ellipro/ (34) ClusPro: https://cluspro.bu.edu/login.php (35) JCat: http://www.jcat.de/ (36) C-ImmSim server: https://kraken.iac.rm.cnr.it/C-IMMSIM/index.php?page=1

All data generated or analyzed during this study are included in this published article and its supplementary information files.

Abbreviations

HPyV6:

Human polyomavirus 6

HPyV7:

Human polyomavirus 7

CTLs:

Cytotoxic T-lymphocytes

HTLs:

Helper T-lymphocytes

LBL:

Linear B-Lymphocyte

LTAg:

Large T antigen

VP1:

POLY capsid protein

VP2:

POLY minor capsid protein

MHC I:

Major histocompatibility complex (MHC) class I

MHC II:

Major histocompatibility complex (MHC) class II

RMSD:

Root Mean Square Deviation

RMSF:

Root Mean Square Fluctuation

Rg:

Radius of gyration

TLR4:

Toll-like receptor 4

CAI:

Codon Adaptation Index

GRAVY:

Grand average of hydropathicity

pI:

Isoelectric point

MD:

Molecular dynamics

References

  1. Johne R, Buck CB, Allander T, Atwood WJ, Garcea RL, Imperiale MJ, et al. Taxonomical developments in the family Polyomaviridae. Arch Virol. 2011;156(9):1627–34.

  2. Condez AC, Nunes M, Filipa-Silva A, Leonardo I, Parreira R. Human Polyomaviruses (HPyV) in Wastewater and Environmental Samples from the Lisbon Metropolitan Area: Detection and Genetic Characterization of Viral Structural Protein-Coding Sequences. Pathogens. 2021;10(10).

  3. van der Meijden E, Feltkamp M. The human polyomavirus middle and alternative T-Antigens; thoughts on roles and relevance to cancer. Front Microbiol. 2018;9:398.

  4. Prado JCM, Monezi TA, Amorim AT, Lino V, Paladino A, Boccardo E. Human polyomaviruses and cancer: an overview. Clinics (Sao Paulo). 2018;73(suppl 1):e558s.

  5. Torres C. Evolution and molecular epidemiology of polyomaviruses. Infect Genet Evol. 2020;79:104150.

  6. Buck CB, Van Doorslaer K, Peretti A, Geoghegan EM, Tisza MJ, An P, et al. The ancient evolutionary history of polyomaviruses. PLoS Pathog. 2016;12(4):e1005574.

  7. Klufah F, Mobaraki G, Liu D, Alharbi RA, Kurz AK, Speel EJM, et al. Emerging role of human polyomaviruses 6 and 7 in human cancers. Infect Agent Cancer. 2021;16(1):35.

  8. Higgins S, Nazemi A, Chow M, Wysong A. Review of nonmelanoma skin cancer in African Americans, Hispanics, and Asians. Dermatol Surg. 2018;44(7):903–10.

  9. Schrama D, Groesser L, Ugurel S, Hafner C, Pastrana DV, Buck CB, et al. Presence of human polyomavirus 6 in mutation-specific BRAF inhibitor-induced epithelial proliferations. JAMA Dermatol. 2014;150(11):1180–6.

  10. Beckervordersandforth J, Pujari S, Rennspiess D, Speel EJ, Winnepenninckx V, Diaz C, et al. Frequent detection of human polyomavirus 6 in keratoacanthomas. Diagn Pathol. 2016;11(1):58.

  11. Ehlers B, Wieland U. The novel human polyomaviruses HPyV6, 7, 9 and beyond. APMIS. 2013;121(8):783–95.

  12. Nguyen KD, Lee EE, Yue Y, Stork J, Pock L, North JP, et al. Human polyomavirus 6 and 7 are associated with pruritic and dyskeratotic dermatoses. J Am Acad Dermatol. 2017;76(5):932-40.e3.

  13. Hashida Y, Higuchi T, Matsuzaki S, Nakajima K, Sano S, Daibata M. Prevalence and genetic variability of human polyomaviruses 6 and 7 in healthy skin among asymptomatic individuals. J Infect Dis. 2018;217(3):483–93.

  14. Purdie KJ, Proby CM, Rizvi H, Griffin H, Doorbar J, Sommerlad M, et al. The role of human papillomaviruses and polyomaviruses in BRAF-inhibitor induced cutaneous squamous cell carcinoma and benign squamoproliferative lesions. Front Microbiol. 2018;9:1806.

  15. Hampras SS, Locke FL, Chavez JC, Patel NS, Giuliano AR, Miller K, et al. Prevalence of cutaneous viral infections in incident cutaneous squamous cell carcinoma detected among chronic lymphocytic leukemia and hematopoietic stem cell transplant patients. Leuk Lymphoma. 2018;59(4):911–7.

  16. Du-Thanh A, Foulongne V, Guillot B, Dereure O. Recently discovered human polyomaviruses in lesional and non-lesional skin of patients with primary cutaneous T-cell lymphomas. J Dermatol Sci. 2013;71(2):140–2.

  17. Poluschkin L, Rautava J, Turunen A, Wang Y, Hedman K, Syrjänen K, et al. Polyomaviruses detectable in head and neck carcinomas. Oncotarget. 2018;9(32):22642–52.

  18. Rennspiess D, Pujari S, Keijzers M, Abdul-Hamid MA, Hochstenbag M, Dingemans AM, et al. Detection of human polyomavirus 7 in human thymic epithelial tumors. J Thorac Oncol. 2015;10(2):360–6.

  19. Klufah F, Mobaraki G, Chteinberg E, Alharbi RA, Winnepenninckx V, Speel EJM, et al. High Prevalence of Human Polyomavirus 7 in Cholangiocarcinomas and Adjacent Peritumoral Hepatocytes: Preliminary Findings. Microorganisms. 2020;8(8).

  20. Muller PA, Vousden KH. p53 mutations in cancer. Nat Cell Biol. 2013;15(1):2–8.

  21. Moens U, Prezioso C, Pietropaolo V. Functional domains of the early proteins and experimental and epidemiological studies suggest a role for the novel human polyomaviruses in cancer. Front Microbiol. 2022;13:834368.

  22. Borchert S, Czech-Sioli M, Neumann F, Schmidt C, Wimmer P, Dobner T, et al. High-affinity Rb binding, p53 inhibition, subcellular localization, and transformation by wild-type or tumor-derived shortened Merkel cell polyomavirus large T antigens. J Virol. 2014;88(6):3144–60.

  23. White MK, Gordon J, Khalili K. The rapidly expanding family of human polyomaviruses: recent developments in understanding their life cycle and role in human pathology. PLoS Pathog. 2013;9(3):e1003206.

  24. Hirsch HH, Babel N, Comoli P, Friman V, Ginevri F, Jardine A, et al. European perspective on human polyomavirus infection, replication and disease in solid organ transplantation. Clin Microbiol Infect. 2014;20(Suppl 7):74–88.

  25. Nabel GJ. HIV vaccine strategies. Vaccine. 2002;20(15):1945–7.

  26. Oli AN, Obialor WO, Ifeanyichukwu MO, Odimegwu DC, Okoyeh JN, Emechebe GO, et al. Immunoinformatics and vaccine development: an overview. Immunotargets Ther. 2020;9:13–30.

  27. Ahmad B, Ashfaq UA, Rahman MU, Masoud MS, Yousaf MZ. Conserved B and T cell epitopes prediction of ebola virus glycoprotein for vaccine development: an immuno-informatics approach. Microb Pathog. 2019;132:243–53.

  28. Huang S, Zhang C, Li J, Dai Z, Huang J, Deng F, et al. Designing a multi-epitope vaccine against coxsackievirus B based on immunoinformatics approaches. Front Immunol. 2022;13:933594.

  29. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL. NCBI BLAST: a better web interface. Nucleic Acids Res. 2008;36(Web Server issue):W5-9.

  30. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2017;45(D1):D158-d69.

  31. Doytchinova IA, Flower DR. VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines. BMC Bioinformatics. 2007;8:4.

  32. Magnan CN, Zeller M, Kayala MA, Vigil A, Randall A, Felgner PL, et al. High-throughput prediction of protein antigenicity using protein microarray data. Bioinformatics. 2010;26(23):2936–43.

  33. Dimitrov I, Naneva L, Doytchinova I, Bangov I. AllergenFP: allergenicity prediction by descriptor fingerprints. Bioinformatics. 2014;30(6):846–51.

  34. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001;305(3):567–80.

  35. Xu X, Chen P, Wang J, Feng J, Zhou H, Li X, et al. Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission. Sci China Life Sci. 2020;63(3):457–60.

  36. Raskov H, Orhan A, Christensen JP, Gögenur I. Cytotoxic CD8(+) T cells in cancer and cancer immunotherapy. Br J Cancer. 2021;124(2):359–67.

  37. Larsen MV, Lundegaard C, Lamberth K, Buus S, Lund O, Nielsen M. Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction. BMC Bioinformatics. 2007;8:424.

  38. Moutaftsi M, Peters B, Pasquetto V, Tscharke DC, Sidney J, Bui HH, et al. A consensus epitope prediction approach identifies the breadth of murine T(CD8+)-cell responses to vaccinia virus. Nat Biotechnol. 2006;24(7):817–9.

  39. Dimitrov I, Flower DR, Doytchinova I. AllerTOP–a server for in silico prediction of allergens. BMC Bioinformatics. 2013;14 Suppl 6(Suppl 6):S4.

  40. Gupta S, Kapoor P, Chaudhary K, Gautam A, Kumar R, Raghava GP. In silico approach for predicting toxicity of peptides and proteins. PLoS ONE. 2013;8(9):e73957.

  41. Calis JJ, Maybeno M, Greenbaum JA, Weiskopf D, De Silva AD, Sette A, et al. Properties of MHC class I presented peptides that enhance immunogenicity. PLoS Comput Biol. 2013;9(10):e1003266.

  42. Wang P, Sidney J, Kim Y, Sette A, Lund O, Nielsen M, et al. Peptide binding predictions for HLA DR, DP and DQ molecules. BMC Bioinformatics. 2010;11:568.

  43. Wang P, Sidney J, Dow C, Mothé B, Sette A, Peters B. A systematic assessment of MHC class II peptide binding predictions and evaluation of a consensus approach. PLoS Comput Biol. 2008;4(4):e1000048.

  44. Tahir Ul Qamar M, Rehman A, Tusleem K, Ashfaq UA, Qasim M, Zhu X, et al. Designing of a next generation multiepitope based vaccine (MEV) against SARS-CoV-2: immunoinformatics and in silico approaches. PLoS ONE. 2020;15(12):e0244176.

  45. Dhanda SK, Gupta S, Vir P, Raghava GP. Prediction of IL4 inducing peptides. Clin Dev Immunol. 2013;2013:263952.

  46. Bhuiyan MA, Quayum ST, Ahammad F, Alam R, Samad A, Nain Z. Discovery of potential immune epitopes and peptide vaccine design–a prophylactic strategy against Rift Valley fever virus. F1000Research. 2020;9:999.

  47. Saha S, Raghava GP. Prediction methods for B-cell epitopes. Methods Mol Biol. 2007;409:387–94.

  48. Saha S, Raghava GP. Prediction of continuous B-cell epitopes in an antigen using recurrent neural network. Proteins. 2006;65(1):40–8.

  49. Dimitrov I, Bangov I, Flower DR, Doytchinova I. AllerTOP v.2–a server for in silico prediction of allergens. J Mol Model. 2014;20(6):2278.

  50. Bui HH, Sidney J, Li W, Fusseder N, Sette A. Development of an epitope conservancy analysis tool to facilitate the design of epitope-based diagnostics and vaccines. BMC Bioinformatics. 2007;8:361.

  51. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.

  52. Mehla K, Ramana J. Identification of epitope-based peptide vaccine candidates against enterotoxigenic Escherichia coli: a comparative genomics and immunoinformatics approach. Mol Biosyst. 2016;12(3):890–901.

  53. Lamiable A, Thévenet P, Rey J, Vavrusa M, Derreumaux P, Tufféry P. PEP-FOLD3: faster de novo structure prediction for linear peptides in solution and in complex. Nucleic Acids Res. 2016;44(W1):W449–54.

  54. Guex N, Peitsch MC. SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis. 1997;18(15):2714–23.

  55. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The protein data bank. Nucleic Acids Res. 2000;28(1):235–42.

  56. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. UCSF Chimera–a visualization system for exploratory research and analysis. J Comput Chem. 2004;25(13):1605–12.

  57. Honorato RV, Koukos PI, Jiménez-García B, Tsaregorodtsev A, Verlato M, Giachetti A, et al. Structural biology in the clouds: the WeNMR-EOSC ecosystem. Front Mol Biosci. 2021;8:729513.

  58. van Zundert GCP, Rodrigues J, Trellet M, Schmitz C, Kastritis PL, Karaca E, et al. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J Mol Biol. 2016;428(4):720–5.

  59. Wallace AC, Laskowski RA, Thornton JM. LIGPLOT: a program to generate schematic diagrams of protein-ligand interactions. Protein Eng. 1995;8(2):127–34.

  60. Adhikari UK, Tayebi M, Rahman MM. Immunoinformatics approach for epitope-based peptide vaccine design and active site prediction against polyprotein of emerging oropouche virus. J Immunol Res. 2018;2018:6718083.

  61. Bui HH, Sidney J, Dinh K, Southwood S, Newman MJ, Sette A. Predicting population coverage of T-cell epitope-based diagnostics and vaccines. BMC Bioinformatics. 2006;7:153.

  62. Thomsen M, Lundegaard C, Buus S, Lund O, Nielsen M. MHCcluster, a method for functional clustering of MHC molecules. Immunogenetics. 2013;65(9):655–65.

  63. Hasan M, Ghosh PP, Azim KF, Mukta S, Abir RA, Nahar J, et al. Reverse vaccinology approach to design a novel multi-epitope subunit vaccine against avian influenza A (H7N9) virus. Microb Pathog. 2019;130:19–37.

  64. Dorosti H, Eslami M, Negahdaripour M, Ghoshoon MB, Gholami A, Heidari R, et al. Vaccinomics approach for developing multi-epitope peptide pneumococcal vaccine. J Biomol Struct Dyn. 2019;37(13):3524–35.

  65. Nain Z, Abdulla F, Rahman MM, Karim MM, Khan MSA, Sayed SB, et al. Proteome-wide screening for designing a multi-epitope vaccine against emerging pathogen Elizabethkingia anophelis using immunoinformatic approaches. J Biomol Struct Dyn. 2020;38(16):4850–67.

  66. Pandey RK, Bhatt TK, Prajapati VK. Novel immunoinformatics approaches to design multi-epitope subunit vaccine for malaria by investigating Anopheles salivary protein. Sci Rep. 2018;8(1):1125.

  67. Olejnik J, Hume AJ, Mühlberger E. Toll-like receptor 4 in acute viral infection: too much of a good thing. PLoS Pathog. 2018;14(12):e1007390.

  68. Abdellrazeq GS, Fry LM, Elnaggar MM, Bannantine JP, Schneider DA, Chamberlin WM, et al. Simultaneous cognate epitope recognition by bovine CD4 and CD8 T cells is essential for primary expansion of antigen-specific cytotoxic T-cells following ex vivo stimulation with a candidate Mycobacterium avium subsp. paratuberculosis peptide vaccine. Vaccine. 2020;38(8):2016–25.

  69. Borthwick N, Silva-Arrieta S, Llano A, Takiguchi M, Brander C, Hanke T. Novel Nested Peptide Epitopes Recognized by CD4(+) T Cells Induced by HIV-1 Conserved-Region Vaccines. Vaccines (Basel). 2020;8(1).

  70. Gasteiger E, Hoogland C, Gattiker A, Wilkins MR, Appel RD, Bairoch A. Protein identification and analysis tools on the ExPASy server. The proteomics protocols handbook. 2005:571–607.

  71. Saha S, Raghava GP. AlgPred: prediction of allergenic proteins and mapping of IgE epitopes. Nucleic Acids Res. 2006;34(Web Server issue):W202–9.

  72. Magnan CN, Randall A, Baldi P. SOLpro: accurate sequence-based prediction of protein solubility. Bioinformatics. 2009;25(17):2200–7.

  73. Hebditch M, Carballo-Amador MA, Charonis S, Curtis R, Warwicker J. Protein-Sol: a web tool for predicting protein solubility from sequence. Bioinformatics. 2017;33(19):3098–100.

  74. Niwa T, Ying BW, Saito K, Jin W, Takada S, Ueda T, et al. Bimodal protein solubility distribution revealed by an aggregation analysis of the entire ensemble of Escherichia coli proteins. Proc Natl Acad Sci U S A. 2009;106(11):4201–6.

  75. Nielsen H. Predicting Secretory Proteins with SignalP. Methods Mol Biol. 2017;1611:59–73.

  76. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–402.

  77. Altschul SF, Wootton JC, Gertz EM, Agarwala R, Morgulis A, Schäffer AA, et al. Protein database searches using compositionally adjusted substitution matrices. FEBS J. 2005;272(20):5101–9.

  78. Rojas M, Restrepo-Jiménez P, Monsalve DM, Pacheco Y, Acosta-Ampudia Y, Ramírez-Santana C, et al. Molecular mimicry and autoimmunity. J Autoimmun. 2018;95:100–23.

  79. Buchan DW, Minneci F, Nugent TC, Bryson K, Jones DT. Scalable web services for the PSIPRED Protein Analysis Workbench. Nucleic Acids Res. 2013;41(Web Server issue):W349–57.

  80. Garnier J, Gibrat JF, Robson B. GOR method for predicting protein secondary structure from amino acid sequence. Methods Enzymol. 1996;266:540–53.

  81. Montgomerie S, Sundararaj S, Gallin WJ, Wishart DS. Improving the accuracy of protein secondary structure prediction using structural alignment. BMC Bioinformatics. 2006;7:301.

  82. Roy A, Kucukural A, Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 2010;5(4):725–38.

  83. Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. The I-TASSER Suite: protein structure and function prediction. Nat Methods. 2015;12(1):7–8.

  84. Zhang Y. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008;9:40.

  85. Ko J, Park H, Heo L, Seok C. GalaxyWEB server for protein structure prediction and refinement. Nucleic Acids Res. 2012;40(Web Server issue):W294–7.

  86. Wiederstein M, Sippl MJ. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res. 2007;35(Web Server issue):W407–10.

  87. Lovell SC, Davis IW, Arendall WB 3rd, de Bakker PI, Word JM, Prisant MG, et al. Structure validation by Calpha geometry: phi, psi and Cbeta deviation. Proteins. 2003;50(3):437–50.

  88. Ponomarenko JV, Bourne PE. Antibody-protein interactions: benchmark datasets and prediction tools evaluation. BMC Struct Biol. 2007;7(1):1–19.

  89. Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, et al. The ClusPro web server for protein-protein docking. Nat Protoc. 2017;12(2):255–78.

  90. Desta IT, Porter KA, Xia B, Kozakov D, Vajda S. Performance and its limits in rigid body protein-protein docking. Structure. 2020;28(9):1071-81.e3.

  91. Del Tordello E, Rappuoli R, Delany I. Reverse vaccinology: exploiting genomes for vaccine design. 2017:65–86.

  92. Gill SC, von Hippel PH. Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem. 1989;182(2):319–26.

  93. Abraham MJ, Murtola T, Schulz R, Páll S, Smith JC, Hess B, et al. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1:19–25.

  94. Pronk S, Páll S, Schulz R, Larsson P, Bjelkmar P, Apostolov R, et al. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics. 2013;29(7):845–54.

  95. Grote A, Hiller K, Scheer M, Münch R, Nörtemann B, Hempel DC, et al. JCat: a novel tool to adapt codon usage of a target gene to its potential expression host. Nucleic Acids Res. 2005;33(Web Server issue):W526–31.

  96. Plotkin SA. Vaccines: past, present and future. Nat Med. 2005;11(4 Suppl):S5-11.

  97. Kash N, Lee MA, Kollipara R, Downing C, Guidry J, Tyring SK. Safety and efficacy data on vaccines and immunization to human papillomavirus. J Clin Med. 2015;4(4):614–33.

  98. Mugunthan SP, Harish MC. Multi-epitope-based vaccine designed by targeting cytoadherence proteins of Mycoplasma gallisepticum. ACS Omega. 2021;6(21):13742–55.

  99. Sami SA, Marma KKS, Mahmud S, Khan MAN, Albogami S, El-Shehawi AM, et al. Designing of a multi-epitope vaccine against the structural proteins of Marburg virus exploiting the immunoinformatics approach. ACS Omega. 2021;6(47):32043–71.

  100. Katalani C, Nematzadeh G, Ahmadian G, Amani J, Kiani G, Ehsani P. In silico design and in vitro analysis of a recombinant trivalent fusion protein candidate vaccine targeting virulence factor of Clostridium perfringens. Int J Biol Macromol. 2020;146:1015–23.

  101. Schowalter RM, Pastrana DV, Pumphrey KA, Moyer AL, Buck CB. Merkel cell polyomavirus and two previously unknown polyomaviruses are chronically shed from human skin. Cell Host Microbe. 2010;7(6):509–15.

  102. Prezioso C, Van Ghelue M, Moens U, Pietropaolo V. HPyV6 and HPyV7 in urine from immunocompromised patients. Virol J. 2021;18(1):24.

  103. Onile OS, Musaigwa F, Ayawei N, Omoboyede V, Onile TA, Oghenevovwero E, et al. Immunoinformatics Studies and Design of a Potential Multi-Epitope Peptide Vaccine to Combat the Fatal Visceral Leishmaniasis. Vaccines (Basel). 2022;10(10).

  104. Majid M, Andleeb S. Designing a multi-epitopic vaccine against the enterotoxigenic Bacteroides fragilis based on immunoinformatics approach. Sci Rep. 2019;9(1):19780.

  105. Livingston B, Crimi C, Newman M, Higashimoto Y, Appella E, Sidney J, et al. A rational strategy to design multiepitope immunogens based on multiple Th lymphocyte epitopes. J Immunol. 2002;168(11):5499–506.

  106. Nezafat N, Karimi Z, Eslami M, Mohkam M, Zandian S, Ghasemi Y. Designing an efficient multi-epitope peptide vaccine against Vibrio cholerae via combined immunoinformatics and protein interaction based approaches. Comput Biol Chem. 2016;62:82–95.

  107. Negahdaripour M, Nezafat N, Eslami M, Ghoshoon MB, Shoolian E, Najafipour S, et al. Structural vaccinology considerations for in silico designing of a multi-epitope vaccine. Infect Genet Evol. 2018;58:96–109.

  108. Gu Y, Sun X, Li B, Huang J, Zhan B, Zhu X. Vaccination with a paramyosin-based multi-epitope vaccine elicits significant protective immunity against Trichinella spiralis infection in mice. Front Microbiol. 2017;8:1475.

  109. Awate S, Babiuk LA, Mutwiri G. Mechanisms of action of adjuvants. Front Immunol. 2013;4:114.

  110. Lee SJ, Shin SJ, Lee MH, Lee MG, Kang TH, Park WS, et al. A potential protein adjuvant derived from Mycobacterium tuberculosis Rv0652 enhances dendritic cells-based tumor immunotherapy. PLoS ONE. 2014;9(8):e104351.

  111. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Adv Drug Deliv Rev. 2013;65(10):1357–69.

  112. Khatoon N, Pandey RK, Prajapati VK. Exploring Leishmania secretory proteins to design B and T cell multi-epitope subunit vaccine using immunoinformatics approach. Sci Rep. 2017;7(1):8285.

  113. Naz A, Awan FM, Obaid A, Muhammad SA, Paracha RZ, Ahmad J, et al. Identification of putative vaccine candidates against Helicobacter pylori exploiting exoproteome and secretome: a reverse vaccinology based approach. Infect Genet Evol. 2015;32:280–91.

  114. Guruprasad K, Reddy BV, Pandit MW. Correlation between stability of a protein and its dipeptide composition: a novel approach for predicting in vivo stability of a protein from its primary sequence. Protein Eng. 1990;4(2):155–61.

  115. Shah SZ, Jabbar B, Mirza MU, Waqas M, Aziz S, Halim SA, et al. An Immunoinformatics Approach to Design a Potent Multi-Epitope Vaccine against Asia-1 Genotype of Crimean-Congo Haemorrhagic Fever Virus Using the Structural Glycoproteins as a Target. Vaccines (Basel). 2022;11(1).

  116. Chaudhri G, Quah BJ, Wang Y, Tan AH, Zhou J, Karupiah G, et al. T cell receptor sharing by cytotoxic T lymphocytes facilitates efficient virus control. Proc Natl Acad Sci U S A. 2009;106(35):14984–9.

  117. Chang KY, Yang JR. Analysis and prediction of highly effective antiviral peptides based on random forests. PLoS ONE. 2013;8(8):e70166.

  118. Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982;157(1):105–32.

  119. Gori A, Longhi R, Peri C, Colombo G. Peptides for immunological purposes: design, strategies and applications. Amino Acids. 2013;45(2):257–68.

  120. Chen R. Bacterial expression systems for recombinant protein production: E. coli and beyond. Biotechnol Adv. 2012;30(5):1102–7.

  121. Rosano GL, Ceccarelli EA. Recombinant protein expression in Escherichia coli: advances and challenges. Front Microbiol. 2014;5:172.

  122. Morla S, Makhija A, Kumar S. Synonymous codon usage pattern in glycoprotein gene of rabies virus. Gene. 2016;584(1):1–6.

  123. Ali M, Pandey RK, Khatoon N, Narula A, Mishra A, Prajapati VK. Exploring dengue genome to construct a multi-epitope based subunit vaccine by utilizing immunoinformatics approach to battle against dengue infection. Sci Rep. 2017;7(1):9232.

Acknowledgements

The authors acknowledge the support provided by Tabriz University of Medical Sciences, Tabriz, Iran.

Funding

This work was supported by a grant from the Biotechnology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran (Grant no. 68980).

Author information

Contributions

R.S.: data modeling, data interpretation, data curation, and manuscript writing; S.F.: designed and supervised the work; N.B.: reviewed and edited the manuscript; N.B.: wrote the manuscript; F.E.: wrote the manuscript; M.S.: analyzed the data; S.F. and S.V.: edited the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Safar Farajnia.

Ethics declarations

Ethics approval and consent to participate

This work was approved by the Ethics Committee of Tabriz University of Medical Sciences, Tabriz, Iran (IR.TBZMED.AEC.1400.016).

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Salahlou, R., Farajnia, S., Bargahi, N. et al. Development of a novel multi‑epitope vaccine against the pathogenic human polyomavirus V6/7 using reverse vaccinology. BMC Infect Dis 24, 177 (2024). https://doi.org/10.1186/s12879-024-09046-0

  • DOI: https://doi.org/10.1186/s12879-024-09046-0

Keywords