Materials & Methods

Data compilation

Amino acid sequences were compiled into separate text files for three viral proteins: the Envelope glycoprotein from 12 strains of Ebola virus, the Hemagglutinin protein from 32 independently isolated samples of Influenza virus A (strain A, subtype H1N1), and the Glycoprotein NB from 11 independently isolated samples of Influenza virus B (strain B). The Ebola virus strains used in this study (Cote d’Ivoire, Zaire, Sudan and Reston) were isolated from outbreaks during the 20th and 21st centuries. For Influenza virus A, only data for strain A, subtype H1N1, were used; for Influenza virus B, only strain B samples were used (see the table in Reference for the accession numbers of the protein sequences used in the analysis).

Hierarchical clustering

The compiled amino acid sequence text file for each viral protein (Envelope glycoprotein, Hemagglutinin, and Glycoprotein NB) was run through a custom Python program to calculate the amino acid frequencies of each sequence. Hierarchical clustering was performed with Gene Cluster 3.0 software (Eisen), using an uncentered correlation metric and the average linkage clustering method. Java Treeview software was used to view the resulting clusters.
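
The frequency calculation and the clustering criteria described above can be illustrated with a minimal Python sketch. This is not the custom program used in the study, and it does not reproduce the Gene Cluster 3.0 implementation; the function names (`aa_frequencies`, `uncentered_correlation_distance`, `average_linkage`) and the toy sequences are hypothetical and chosen only for illustration. The sketch assumes that each sequence is represented by the relative frequencies of the 20 standard amino acids, that the distance between two sequences is one minus their uncentered correlation, and that clusters are merged by the mean of all pairwise distances between their members.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)


def aa_frequencies(sequence):
    """Return the relative frequency of each standard amino acid in a sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def uncentered_correlation_distance(x, y):
    """Return 1 minus the uncentered correlation of two vectors.

    The uncentered correlation is the Pearson formula without mean
    subtraction, i.e. the cosine of the angle between the vectors.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return 1.0 - dot / norm


def mean_pairwise_distance(cluster_a, cluster_b, profiles):
    """Average linkage: mean distance over all cross-cluster member pairs."""
    total = sum(
        uncentered_correlation_distance(profiles[i], profiles[j])
        for i in cluster_a
        for j in cluster_b
    )
    return total / (len(cluster_a) * len(cluster_b))


def average_linkage(profiles):
    """Naive agglomerative clustering of labelled frequency vectors.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of labels.
    """
    clusters = [(label,) for label in profiles]
    merges = []
    while len(clusters) > 1:
        # Pick the two clusters with the smallest average distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: mean_pairwise_distance(pair[0], pair[1], profiles),
        )
        merges.append((a, b, mean_pairwise_distance(a, b, profiles)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not data from the study.
    toy = {"seq1": "AAAAKK", "seq2": "AAAKKK", "seq3": "WWWWYY"}
    profiles = {name: aa_frequencies(seq) for name, seq in toy.items()}
    for a, b, d in average_linkage(profiles):
        print(a, b, round(d, 3))
```

Because the uncentered correlation ignores vector length, two sequences with the same amino acid proportions are at distance zero regardless of how long they are, which is one reason such a metric suits composition profiles.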

Table 1. Ebola virus Envelope glycoproteins.

Table 2. Influenza virus A (strain A, H1N1 subtype) Hemagglutinin proteins.

Table 3. Influenza virus B (strain B) Glycoproteins NB.

Protein Sequences
