Read through the assignment carefully before starting!
Add your name and e-mail address on the front page!!!
This week we'll work with a gene from a family of interferon-responsive genes, IFI27 and its relatives. These genes were found in a lab that is interested in the response to infection. As the lab bioinformatics expert, you are given the mRNA sequence, and it's your job to find out what you can about the genomic structure and sequence.
The point of this assignment is to see some of the various functions of genome browsers.
The gene is mouse Ifi27l1 (IFI27 like 1). In mouse it's not clear which gene is the direct homolog of human IFI27, as the number of family members differs between species. We'll do this search at the UC Santa Cruz Genome Browser.
First we have to go to NCBI and get the mRNA in FASTA format. The accession number of the mRNA is BC128276. Open the record and copy the sequence. Keep this window open in the background; you'll need it at the end.
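If you would rather fetch the sequence from a script than copy it by hand, here is a minimal sketch that uses NCBI's E-utilities efetch service. The function names (fasta_url, parse_fasta, fetch_fasta) are made up for this handout, and the parser assumes a single-record FASTA file. Only fetch_fasta needs network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities endpoint for downloading records
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def fasta_url(accession):
    """Build an efetch URL that returns a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # e.g. "BC128276"
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Split single-record FASTA text into (header, sequence)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not FASTA: first line must start with '>'")
    # Drop the leading '>' and join the wrapped sequence lines
    return lines[0][1:], "".join(lines[1:])


def fetch_fasta(accession):
    """Download and parse the record (needs network access)."""
    with urlopen(fasta_url(accession)) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Calling fetch_fasta("BC128276") should give you the same header and sequence you see on the NCBI page, with the line breaks removed from the sequence.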
Now we'll open a page at UC Santa Cruz.
Because you may be working on computers where either you or others have worked on the genome browser before, we'll reset the browser to make sure everything comes out the way we want it to. Mouse over the "Genome Browser" link in the blue tool bar on top of the page. Then click on the "Reset all user settings" link at the bottom of the pull down menu. It should reset to the latest version (hg38) of the human genome.
Click on Blat (from "Tools" in the top menu bar) and run a search against the mouse genome. Make sure you have the most recent version! Paste in the sequence you just copied (from NCBI), and click on the "submit" button.
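The same Blat search can also be requested through a URL instead of the web form. The sketch below only builds that URL; it assumes the hgBlat parameters userSeq, type, db and output=json described in the UCSC Blat FAQ. The default assembly name "mm39" is an assumption on our part, so check the Blat page for the current mouse assembly. The function name blat_url is made up for this handout.

```python
from urllib.parse import urlencode

# UCSC Blat CGI endpoint
HGBLAT = "https://genome.ucsc.edu/cgi-bin/hgBlat"


def blat_url(sequence, db="mm39", seq_type="DNA"):
    """Build a Blat query URL for a bare sequence (no FASTA header line).

    db is the UCSC assembly name; "mm39" is an assumed default, not
    something this assignment specifies.
    """
    # Remove line breaks and spaces left over from copying the sequence
    cleaned = "".join(sequence.split())
    params = {
        "userSeq": cleaned,
        "type": seq_type,
        "db": db,
        "output": "json",
    }
    return HGBLAT + "?" + urlencode(params)
```

Pasting the resulting URL into a browser should return the hits as JSON rather than the usual results page. For this assignment, still use the web form so you can follow the links in the next steps.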
Now let's figure out our gene structure:
The chart should look something like this:
exon/intron | location in mRNA | location in genomic DNA |
exon 1 | 1-100 | 15,354,876-15,354,976 |
intron 1 | | 15,354,976-15,400,987 |
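If you want to check your chart, the intron rows can be worked out from the exon rows alone. Here is a minimal sketch, assuming the alignment is on the + strand, the exons are listed in order, and all coordinates are 1-based and inclusive. In this sketch an intron runs from the base after one exon to the base before the next, so adjust it if you write the shared boundary instead. The function names and the toy coordinates in the usage note are made up and are not the answer for Ifi27l1.

```python
def gene_structure(exons):
    """Turn exon coordinates into chart rows, adding the introns between them.

    exons: list of (mrna_start, mrna_end, genomic_start, genomic_end),
    1-based inclusive, ordered along the + strand.
    Returns a list of (label, mRNA range, genomic range) tuples.
    """
    rows = []
    for i, (m_start, m_end, g_start, g_end) in enumerate(exons, start=1):
        if i > 1:
            # The intron fills the genomic gap between the previous exon and this one
            prev_g_end = exons[i - 2][3]
            rows.append((f"intron {i - 1}", "", f"{prev_g_end + 1:,}-{g_start - 1:,}"))
        rows.append((f"exon {i}", f"{m_start}-{m_end}", f"{g_start:,}-{g_end:,}"))
    return rows


def format_chart(rows):
    """Render the rows in the same pipe-separated layout as the chart."""
    header = ("exon/intron", "location in mRNA", "location in genomic DNA")
    return "\n".join(" | ".join(row) for row in [header, *rows])
```

For example, print(format_chart(gene_structure([(1, 100, 1001, 1100), (101, 250, 2001, 2150)]))) lists exon 1, then intron 1 at 1,101-2,000, then exon 2.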
Go back to the previous window and click on the browser link.
Now we have to make sure we have the right tracks open. First we'll concentrate on the genomic questions. Scroll down to the bottom of the page.
Go to the section labeled "Mapping and Sequencing tracks." Set the "Contigs", "Assembly", and "Gap" tracks to "full". Make sure that "Blat sequence" is also on full; that is our input sequence. Click on refresh (either at the top of the control panels, or via the link on the right-hand side of any of the blue bars dividing the track types). If you want more information on any of the maps, click on the title of the map in the control panels, and it will take you to an explanation page.
Click "refresh"
Now we are going to compare what's available in the database to our gene. To answer these questions, compare the various tracks to "your sequence from blat". Look at the mouse mRNA track. Zoom out once (1.5x) to make sure you have the full length of the mRNAs (by default the browser opens to the full length of the sequence we input into Blat).
Let's see how the gene prediction programs fared in this region. Go to the Genes and Gene Prediction Tracks section, and open the following tracks to full: SGP, Geneid, and Genscan.
This gene family is different in different organisms, so let's see what the conservation of this member looks like. Go down to the track controls again. Go to "mRNA and EST tracks" and hide the mouse EST tracks if they're open. Go to the Comparative Genomics section and put Conservation on full. Click on the track title (the blue link) and enter the track controls. In the boxed section, in the top row, click the + next to Species selection, and make sure all the species are selected. Click the submit button above the box.
Now we want to see what is going on in human, so take the mouse protein (from the first NCBI page that you opened) and run a Blat search against the human genome. (When you change the species to human on the Blat page, wait for it to reload, and make sure that you see the Human Blat header at the top.)