Kenyon College



Chris Gillen
Animal Biology
Spring 1999


Today we will use molecular sequence data to construct phylogenetic trees of animal species. We will download sequences from public genetic databases and perform sequence alignments and comparisons using DNAstar software.

Step 1: Downloading sequences.

Open Netscape and go to the Biology 36 syllabus. Click on the link to the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/), then go to the Entrez search system. There are many ways to find relevant sequences. Suppose we want to find cytochrome oxidase subunit 1 proteins from a particular species. From the Entrez page, select taxonomy, then use the search engine to find the taxon you are interested in (type in Homo sapiens and hit enter). You will see a taxonomic tree for Homo sapiens. You can move upward in the taxonomy by clicking PARENT. Try it, then go back to Homo sapiens by clicking on it. On the next screen, click on it again to get information on this taxon.

To view human protein sequences, click protein, and refine query in Entrez, then hit GET SEQUENCES. Refine the query by typing cytochrome oxidase in the add field box (make sure the field menu says all fields), then hit RETRIEVE SEQUENCES. Look through the list until you find cytochrome oxidase subunit 1, which is the sequence we want. Click on GENPEPT report under it to display the report for the sequence. You can save the report to a text file by using the menu at the bottom of the page: select PC text and hit save. Name the file human.pro and save it to an appropriate folder. Feel free to download other sequences that interest you.
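If you want to check what you saved, the sketch below pulls the amino acid sequence out of a GenPept-style text report. It is only an illustration, and it assumes the report uses the usual flat-file layout in which the sequence follows a line beginning with ORIGIN and ends at a line beginning with //. The function name read_genpept_sequence and the SAMPLE record are made up for this example; SAMPLE is a toy record, not a real database entry.

```python
def read_genpept_sequence(text):
    """Return the sequence found in the ORIGIN block of a GenPept-style report."""
    residues = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True          # sequence lines start after this line
            continue
        if line.startswith("//"):
            break                     # end-of-record marker
        if in_origin:
            # keep letters only; this drops the position numbers and spaces
            residues.extend(ch for ch in line if ch.isalpha())
    return "".join(residues).upper()


# Toy record for illustration only (assumed layout, invented sequence).
SAMPLE = """LOCUS       TOY00001      25 aa
DEFINITION  toy protein record (not real data).
ORIGIN
        1 mktayiakqr qisfvkshfs rqlee
//
"""

if __name__ == "__main__":
    print(read_genpept_sequence(SAMPLE))
```

To try it on your own download, read human.pro into a string with open("human.pro").read() and pass that string to the function.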

Step 2: Convert sequences to DNAstar format.

Open EditSeq under the Start / programs / DNAstar menu. Choose File/import. Select the human file and open it. Click save. The file is now in DNAstar format.

Step 3: Perform a sequence alignment.

Open MegAlign under the Start / programs / DNAstar menu. Choose new, then choose enter sequences. Enter the sequences you wish to compare (e.g., the human cytochrome oxidase). I will have many cytochrome oxidase sequences available. Once a few sequences are entered, align them by choosing align - clustal method. You can check the alignment using view - alignment report, and you can view the phylogenetic tree using view - phylogenetic tree. You may wish to construct trees with different combinations of animal cytochrome oxidases. Do you always get the same results? Notice that you can choose a tree that reflects sequence distances or one that does not. Also note that the trees may not be "rooted" correctly. That is, the sequences of the most "primitive" animals might not be placed at the base, because the program is evaluating sequence differences only and is not making judgments about primitive vs. advanced.
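To see how a tree can come from sequence differences alone, here is a minimal sketch. It computes the fraction of aligned positions at which two sequences differ, then repeatedly joins the two closest groups (a simple average-linkage method often called UPGMA). This is not necessarily the method MegAlign uses; it was chosen here only because it is short. The function names and the four toy sequences A to D are invented for the example.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions (gap positions skipped) where a and b differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no overlapping ungapped positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(aligned):
    """Join the closest groups until one remains; returns nested tuples of names."""
    # each key is a subtree; each value lists the sequence names inside it
    clusters = {name: [name] for name in aligned}
    dist = {
        frozenset(pair): p_distance(aligned[pair[0]], aligned[pair[1]])
        for pair in combinations(aligned, 2)
    }

    def average(c1, c2):
        # mean distance over all pairs of members drawn from the two groups
        total = sum(dist[frozenset((m, n))]
                    for m in clusters[c1] for n in clusters[c2])
        return total / (len(clusters[c1]) * len(clusters[c2]))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: average(*p))
        clusters[(a, b)] = clusters.pop(a) + clusters.pop(b)
    return next(iter(clusters))


# Invented, already-aligned toy sequences (not real data).
ALIGNED = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGAACCTAT",
    "D": "TCGAAGCTTT",
}

if __name__ == "__main__":
    print(upgma(ALIGNED))
```

With these toy sequences, A and B are joined first, then C, then D. Nothing in the calculation knows which animal is "primitive"; the grouping depends only on how many positions differ, which is why the base of a tree built this way should be interpreted with care.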

Because protein sequences tend to be highly conserved, DNA or mRNA sequences can make for better sequence comparisons. I have a set of cDNAs for actin and a set of 18S ribosomal sequences available. Perform a sequence alignment with these DNA sequences. Notice that they may be different lengths. This is because some may be partial or contain non-coding sequences. The comparison is better if you crop all of the sequences so that you are comparing the same region for each species. To do this, go to view - alignment report and note which bases of each sequence align with the others. Then "crop" the region that MegAlign considers by going to the main window and clicking on the name of each sequence, which allows you to set the start and end for that sequence. Do this for every sequence and then redo the phylogenetic tree. Is the answer the same?
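The cropping idea can be sketched in a few lines. Assuming each sequence has already been aligned and padded with "-" where it has no data, the shared region runs from the latest start to the earliest end among the sequences. The function names and the three toy rows below are invented for illustration; MegAlign itself is operated through its menus as described above.

```python
def shared_region(aligned):
    """Return (start, end) alignment columns covered by every sequence.

    Columns are 0-based and end is exclusive, as in Python slicing.
    """
    starts, ends = [], []
    for seq in aligned.values():
        starts.append(len(seq) - len(seq.lstrip("-")))  # first column with data
        ends.append(len(seq.rstrip("-")))                # one past the last column with data
    start, end = max(starts), min(ends)
    if start >= end:
        raise ValueError("sequences share no common region")
    return start, end


def crop(aligned):
    """Trim every aligned sequence to the region they all cover."""
    start, end = shared_region(aligned)
    return {name: seq[start:end] for name, seq in aligned.items()}


# Invented toy alignment: one full-length row, one partial row, one with an internal gap.
ROWS = {
    "full":    "ATGGCCATTGTAATG",
    "partial": "---GCCATTGTA---",
    "gapped":  "ATGGCTATTG-AATG",
}

if __name__ == "__main__":
    print(shared_region(ROWS))
    for name, seq in crop(ROWS).items():
        print(name, seq)
```

Note that these are columns of the alignment, not positions within each original sequence. When you set start and end values for an individual sequence, read its own base numbers from the alignment report.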


Step 4: Try to use this program to ask your own question about phylogenetic relationships among animals.