Genome 540 Homework Assignment 7

Due Sunday, March 2

  • You must turn in your results and your computer program, using this template file. Please put everything into ONE plain-text file; do not send an archive of files, a tar file, or a word-processing document. Compress it (using either Unix compress or gzip; if you don't have access to either of these programs, let us know), and send it as an attachment to both Phil and Benjamin.
  • Your output should provide:
    1. Transition Probabilities: Give all possible transitions, even if the probability is fixed at zero. Output transitions ordered by first state, then second (1,1; 1,2; 1,3; 1,4; etc).
    2. Emission Probabilities: For each symbol emitted by the state indicated in this field's attributes, give the probability of emitting that symbol, for example:
      1,TTT=.15000
      1,TCT=.15000
      1,TAT=.15000
      1,TGT=.15000
    3. Gene Annotation: List the first three genes found (starting from the beginning of the chromosome), giving the strand, the start and end coordinates, and the GenBank annotation for each.
    4. Underflow Issues: Slightly different calculations can produce different answers, because floating-point addition is not associative. For example (assuming log-space probabilities):
      maxPath = prevNodeMax + transProb + emitProb
      maxPath = prevNodeMax + (transProb + emitProb)
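      prevNodeMax is generally much "larger" (in magnitude) than the other two terms. If transProb is close to zero, then prevNodeMax + transProb ~= prevNodeMax, so the contribution of transProb is partly or wholly rounded away. Adding the two small terms first preserves more precision. You should use the second formulation for this assignment.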
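
The difference between the two formulations in item 4 can be seen in a minimal Python sketch. The function names and the numeric values are hypothetical, chosen only to make the rounding visible with double-precision floats; they are not part of the assignment template, and real log probabilities will be far less extreme.

```python
def max_path_left_to_right(prev_node_max, trans_prob, emit_prob):
    # First formulation: a + b + c is evaluated as (a + b) + c,
    # so each small term is added to the large running value separately.
    return prev_node_max + trans_prob + emit_prob


def max_path_grouped(prev_node_max, trans_prob, emit_prob):
    # Second formulation: sum the two small terms first,
    # then add their total to the large running value once.
    return prev_node_max + (trans_prob + emit_prob)


if __name__ == "__main__":
    # Hypothetical log-space values (assumption, for illustration only).
    prev, trans, emit = -1e16, -1.0, -1.0
    print(repr(max_path_left_to_right(prev, trans, emit)))
    print(repr(max_path_grouped(prev, trans, emit)))
```

With these values the first function loses both small terms to rounding, while the second keeps their sum, so the two printed results differ. In a Viterbi recursion such differences can change which path wins a near-tie, which is why everyone should use the same grouping.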