Instructions and Requirements for the Final Project
DO NOT include the outputs in your discussion (refer to the results, do NOT copy/paste them). The text should include all the information necessary to understand your project, and the outputs should be in separately uploaded files.
Please upload a Final project file, and one file per program output.
The project is due NO LATER THAN Friday, August 5, 2022, by 12 noon. The project should be submitted to the Feinberg Moodle.
If you have a problem with the deadline (for example: miluim (army reserve duty) or being out of the country, not "I have another exam"), speak to Shifra BY July 31. (Email shifra.ben-dor@weizmann.ac.il, Telephone x2741).
The computer classroom will be reserved for us after the end of the semester, through the due date at the regularly scheduled hours.
For the Project
Please report on which chromosome your sequence is located, on which strand the gene is located, whether the sequence is draft or finished (if your genome has draft sequence), the exon/intron structure of your gene (as far as you can tell from your results), and any splice variants. Don't forget to explain ALL the hits (even those that are unexpected).
Run a similarity search of your Protein sequence against a Protein database using BLAST.
Please describe in your report:
Which program, database and scoring matrix you used, and whether you used the filter.
Look at the hits list: the distribution of the hits across the various E-values.
Look at the Alignments and report:
For the top 10 hits: discuss their length and % identity (and similarity), in addition to E-value, organism, related proteins, etc.
Summary of the rest of the results (which organisms they come from, whether they are from the same family, etc.).
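One way to look at the distribution of E-values is to count the hits that fall in each E-value range. The sketch below assumes you also have the hits in BLAST's tab-separated tabular format (12 standard columns, with the E-value in the 11th). The function name evalue_bins and the bin edges are illustrative choices, not part of the assignment.

```python
def evalue_bins(tabular_text, edges=(1e-100, 1e-50, 1e-10, 1e-3)):
    """Count hits per E-value bin from BLAST tabular (12-column) output.

    Each hit is counted once, in the first bin whose upper edge it does
    not exceed; hits above the last edge go into a final "> edge" bin.
    The default edges are arbitrary and should be adapted to your results.
    """
    labels = [f"<= {edge:g}" for edge in edges] + [f"> {edges[-1]:g}"]
    counts = {label: 0 for label in labels}
    for line in tabular_text.splitlines():
        # Skip blank lines and the comment lines some tabular formats include.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        evalue = float(fields[10])  # 11th column of the standard layout
        for edge, label in zip(edges, labels):
            if evalue <= edge:
                counts[label] += 1
                break
        else:
            counts[labels[-1]] += 1
    return [(label, counts[label]) for label in labels]
```

A table like this is only a summary to help you write the discussion; the report should still describe what the distribution means for your protein.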
You need to test the validity of the last hit on the hits list from the database similarity search.
Please describe in your report:
Which sequences you compared, which program you used, % similarity and identity.
What the alignment looks like and how it compares to the database search that found it.
Don't forget that the algorithms have to match! If the database search was local, the pairwise alignment has to be local too (and global with global).
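To make clear what % identity and % similarity measure, here is a minimal sketch that computes both from two already-aligned sequences (gaps written as "-"). Two things here are assumptions: the denominator is the number of alignment columns (programs differ on this, so check what yours reports), and the SIMILAR_GROUPS table is a simplified stand-in. Real alignment programs decide similarity from the scoring matrix (for example, pairs with a positive BLOSUM62 score), so the numbers will not match your program's output exactly.

```python
# Illustrative groups of chemically similar amino acids (an assumption for
# this sketch; alignment programs use the scoring matrix instead).
SIMILAR_GROUPS = ("ILVM", "FWY", "KRH", "DE", "ST", "NQ", "AG")


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over the alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = similar = columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column with gaps in both rows carries no information
        columns += 1
        if a == "-" or b == "-":
            continue  # gap against a residue: counted in the length only
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if columns == 0:
        raise ValueError("alignment has no usable columns")
    return 100.0 * identical / columns, 100.0 * similar / columns
```

Because similarity always includes the identical positions, % similarity can never be lower than % identity.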
When choosing the sequences for multiple alignment, choose sequences that are 80% similar or less (if possible - if you have a particular reason why you want to use more similar sequences, discuss it with us first). (If you don't get less similar sequences from your database search, redo it with different parameters!)
Use at least 5 sequences in addition to yours.
According to the similarities to your query sequence, you'll choose the method for the multiple alignment. You only have to use ONE method (you can use either clustalw, clustalo or muscle).
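The selection rule above (80% similar or less, at least 5 sequences besides your own) can be written down as a small check. This is only a sketch: the function name pick_for_msa and the input dictionary, which maps a sequence ID to its % similarity to your query, are hypothetical, and you would fill in the values from your own database search.

```python
def pick_for_msa(similarity_by_id, max_similarity=80.0, min_count=5):
    """Pick candidate sequences for the multiple alignment.

    similarity_by_id maps a sequence ID to its % similarity to the query.
    Returns (chosen_ids, enough): the IDs at or below max_similarity, most
    similar first, and whether there are at least min_count of them.
    """
    chosen = sorted(
        (seq_id for seq_id, sim in similarity_by_id.items() if sim <= max_similarity),
        key=lambda seq_id: similarity_by_id[seq_id],
        reverse=True,
    )
    return chosen, len(chosen) >= min_count
```

If the second value comes back False, that is the situation in which the database search should be redone with different parameters.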
Please describe in your report:
Which sequences were used for the alignment, how similar they are to your sequence, and which program was used.
Describe the results, namely pointing out regions which are conserved and regions which are variable.
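As an aid for spotting conserved and variable regions, the sketch below scores each column of a multiple alignment by how many sequences share its most common residue, then reports runs of well-conserved columns. The function names, the 0.8 threshold and the minimum run length of 3 are illustrative assumptions. The input is a list of aligned sequences of equal length, with gaps written as "-".

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Fraction of sequences sharing the most common residue, per column."""
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    scores = []
    for column in zip(*aligned_seqs):
        residues = [char for char in column if char != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        # Divide by the number of sequences so that gaps lower the score.
        scores.append(top_count / len(column))
    return scores


def conserved_regions(scores, threshold=0.8, min_len=3):
    """Return 1-based inclusive (start, end) runs of columns >= threshold."""
    regions = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = position
        else:
            if start is not None and position - start >= min_len:
                regions.append((start, position - 1))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(scores) - start + 1 >= min_len:
        regions.append((start, len(scores)))
    return regions
```

The positions returned are alignment columns, not positions in your own sequence, so they need to be translated back (skipping gaps) before you compare them with other results.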
Describe your results. (Don't forget to list which databases, what the hits are, where they are in your sequence, and the sequence signatures, if they are available.)
Can you say that the "motifs" found in your sequence are represented in the multiple alignment? (If so, how well are they conserved? If not, why not?)
Summarize your findings.
1) your sequence
2) your BLAT results: the BLAT output itself (in other words, the full list of hits and the alignment), and a printout of the genome browser with your hits visible on it for the best match to your gene
3) the output of your translation program
4) the full hits list and at least the top ten alignments of your database search (you should also include the alignments of any other sequences you use later on - particularly the sequences you use for pairwise and multiple analysis). Please save as PDF at least twice: once from the main window with the hits list, and once with the alignment tab on top (make sure to scroll down far enough to see all of the top 10, so that they show up in the PDF). You may cut and paste for the sequences used in the pairwise and multiple alignments.
5) your pairwise alignment output
6) your multiple alignment
7) a printout of your InterPro results (the graphical view is enough, not all of the internal pages!)
Shifra shifra.ben-dor@weizmann.ac.il