Genome 540 Homework Assignment 6
Due Sunday Feb 23
- Using the same HMM and dataset as in homework 5, write a program that implements EM (Baum-Welch) training instead.
Use the same starting parameter values, but in contrast to homework 5, do not hold any parameter values fixed: allow all of them to change with each iteration. Compute the log-likelihood (base 2) of the sequence at each iteration, and run the program until the increase in log-likelihood between successive iterations falls below 0.1. Check that the log-likelihood increases with each iteration; if it doesn't, something is wrong with your program.
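As a rough illustration of the training loop described above, here is a minimal pure-Python sketch of Baum-Welch with a scaled forward-backward pass. It is a sketch under stated assumptions, not the required solution: the function names (`forward_backward`, `em_step`, `baum_welch`), the integer encoding of symbols, the explicit `init` vector, and the toy 2-state starting values are all hypothetical. Substitute the HMM and starting parameters from homework 5. The way iterations are counted here is also only one possible convention.

```python
import math

def forward_backward(obs, init, trans, emit):
    """Scaled forward-backward pass; returns (alpha, beta, scales, log2-likelihood)."""
    n, k = len(obs), len(init)
    alpha = [[0.0] * k for _ in range(n)]
    beta = [[0.0] * k for _ in range(n)]
    scales = [0.0] * n
    for t in range(n):
        for j in range(k):
            if t == 0:
                prev = init[j]
            else:
                prev = sum(alpha[t - 1][i] * trans[i][j] for i in range(k))
            alpha[t][j] = prev * emit[j][obs[t]]
        scales[t] = sum(alpha[t])            # P(o_t | o_1..o_{t-1})
        alpha[t] = [a / scales[t] for a in alpha[t]]
    beta[n - 1] = [1.0] * k
    for t in range(n - 2, -1, -1):
        for i in range(k):
            beta[t][i] = sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(k)) / scales[t + 1]
    # The product of the scale factors is P(sequence); sum logs to avoid underflow.
    loglik = sum(math.log2(c) for c in scales)
    return alpha, beta, scales, loglik

def em_step(obs, init, trans, emit):
    """One EM update; also returns the log2-likelihood under the OLD parameters."""
    alpha, beta, scales, loglik = forward_backward(obs, init, trans, emit)
    n, k, m = len(obs), len(init), len(emit[0])
    new_init = [alpha[0][i] * beta[0][i] for i in range(k)]
    a_cnt = [[0.0] * k for _ in range(k)]    # expected transition counts
    e_cnt = [[0.0] * m for _ in range(k)]    # expected emission counts
    for t in range(n):
        for i in range(k):
            e_cnt[i][obs[t]] += alpha[t][i] * beta[t][i]
            if t + 1 < n:
                for j in range(k):
                    a_cnt[i][j] += (alpha[t][i] * trans[i][j] * emit[j][obs[t + 1]]
                                    * beta[t + 1][j] / scales[t + 1])
    new_trans = [[c / sum(row) for c in row] for row in a_cnt]
    new_emit = [[c / sum(row) for c in row] for row in e_cnt]
    return new_init, new_trans, new_emit, loglik

def baum_welch(obs, init, trans, emit, tol=0.1):
    """Iterate EM until the log2-likelihood gain between iterations is below tol."""
    history = []
    while True:
        new_init, new_trans, new_emit, loglik = em_step(obs, init, trans, emit)
        if history and loglik < history[-1] - 1e-9:
            raise RuntimeError("log-likelihood decreased: check the implementation")
        history.append(loglik)
        if len(history) > 1 and history[-1] - history[-2] < tol:
            return init, trans, emit, history  # parameters matching history[-1]
        init, trans, emit = new_init, new_trans, new_emit

# Toy usage with hypothetical values (symbols encoded as 0/1, two states).
toy = [0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
init, trans, emit, history = baum_welch(
    toy, [0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.7, 0.3], [0.3, 0.7]])
print("parameter updates:", len(history) - 1)
print("final log2-likelihood:", history[-1])
```

The scale factors keep the forward and backward values in a safe numeric range, which matters for a genome-length sequence. This sketch stores full per-position tables in Python lists, so a real run on the homework dataset would likely need a more memory-efficient layout.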
- Your output should provide
- the name and first line of the .fna file
- the number of iterations until convergence
- the final log-likelihood
- the final emission and transition probabilities
-- please output these in scientific notation, to four significant digits (e.g., 9.000e-1)
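One way to produce four significant digits in scientific notation is Python's `e` format with three digits after the decimal point. The helper names `sci` and `format_row` below are hypothetical. Note that Python writes a two-digit exponent (`9.000e-01` rather than `9.000e-1`); whether that variant is acceptable is not stated here, so treat it as an assumption.

```python
def sci(x):
    """Format x in scientific notation with four significant digits."""
    # One digit before the decimal point plus three after gives four significant digits.
    return f"{x:.3e}"

def format_row(label, probs):
    """Format a labelled row of probabilities, e.g. one row of a transition matrix."""
    return label + "\t" + "\t".join(sci(p) for p in probs)

print(sci(0.9))
print(format_row("state1", [0.9, 0.1]))
```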
- You must turn in your results and your computer program, using this template file.
Please put everything into ONE plain text file. Do not send an archive of files, a tar file, or a word-processing document. Compress the file using either Unix compress or gzip (if you don't have access to either of these programs, let us know), and send it as an attachment to both Phil and Benjamin.