estim_mtd
Language: en
Version: 111135 (mandriva - 01/05/08)
Section: 1 (User commands)
NAME
estim_mtd - Mixture Transition Distribution Markov model estimation tool.
SYNOPSIS
estim_mtd arguments [options]
DESCRIPTION
estim_mtd estimates a Mixture Transition Distribution (MTD) Markov model from the input sequence(s) and computes related statistics. The stationary law is also computed. The resulting model can then be used to simulate sequences with the simul_m program.
ARGUMENTS
- sequence_file
- Either the name of a file containing a set of sequences in FASTA format, or the name of a file containing a list of filenames, each of which contains a set of sequences in FASTA format.
- -d --mtd_order=INTEGER
- Order of the MTD Markov model.
- -k --mkv_order=INTEGER
- Order of the Markov model of the matrices in the MTD.
OPTIONS
- -p --phase=INTEGER
- Number of phases (default = 1).
- -a --alphabet=FILENAME
- A file describing the alphabet to use (default: DNA alphabet).
- -A --Alphabet=EXPRESSION
- An expression describing the alphabet to use, of the form [number (< 10) of characters per pattern]:[list of alphabet patterns], for example 1:AGCT (default: DNA alphabet).
- --dna
- Use DNA alphabet (1:AGCT, default setting).
- --protein
- Use amino acid alphabet (1:IVLFCMAGTWSYPHEQDNKR).
- -o --output=FILENAME
- Result file containing the parameters of the estimated MTD Markov model.
- --identical
- Constrain the matrices to be identical.
- --seed=INTEGER
- Number of seeds for the EM algorithm (default: NBSEED).
- --iter=INTEGER
- Maximum number of iterations of the EM algorithm (default: NBITERMAX).
- --eps=FLOAT
- Epsilon value of the EM algorithm (default: EPS).
- --log
- Log the successive likelihood values and save them in the file "em.log".
- -l --likelihood=FILENAME
- Compute the likelihood under the selected model on the sequences contained in FILENAME or on the sequences whose filenames are listed in FILENAME.
- -L --Likelihood
- Compute the likelihood under the selected model on the sequences specified by the sequence_file argument.
- -b --bic=FILENAME
- Compute the BIC under the selected model on the sequences contained in FILENAME or on the sequences whose filenames are listed in FILENAME.
- -B --Bic=FILENAME
- Compute the BIC under the selected model on the sequences specified by the sequence_file argument.
- --all
- Compute the total BIC/likelihood for all the given sequences.
- -v --version
- Display the version number and exit.
- -h --help
- Print this help and exit.
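The EXPRESSION form accepted by -A can be illustrated with a small shell sketch. It is not part of estim_mtd: the helper name split_alphabet is hypothetical, and the code assumes that the digit before the colon is the width of each pattern and that the patterns follow the colon concatenated without separators.

```shell
# Hypothetical helper: split an alphabet expression such as "1:AGCT"
# into its patterns. Assumes "<width>:<concatenated patterns>".
split_alphabet() {
    expr_in=$1
    width=${expr_in%%:*}      # text before the first ':'
    patterns=${expr_in#*:}    # text after the first ':'
    case $width in
        [1-9]) ;;             # the man page requires a number < 10
        *) echo "invalid pattern width: $width" >&2; return 1 ;;
    esac
    out=""
    while [ -n "$patterns" ]; do
        # Take the next $width characters as one pattern.
        tok=$(printf '%s' "$patterns" | cut -c1-"$width")
        out="$out${out:+ }$tok"
        # Drop the characters that were just consumed.
        patterns=$(printf '%s' "$patterns" | cut -c"$((width + 1))"-)
    done
    printf '%s\n' "$out"
}

split_alphabet "1:AGCT"        # prints: A G C T
split_alphabet "2:AACCGGTT"    # prints: AA CC GG TT
```

Under this reading, --dna is equivalent to -A 1:AGCT, as stated in the option list.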
EXAMPLES
Estimate an MTD Markov model of order 5 with matrices of order 2 on the sequence files listed in the file seq.list. The sequences contain tokens of an alphabet described in the file sample.alpha. Write the estimated model to the file model.desc. Log the successive likelihood values of the EM algorithm.
- estim_mtd seq.list -d 5 -k 2 -a sample.alpha -o model.desc --log
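A list file of this kind might be prepared as sketched below. The layout of the list file is not specified on this page, so one filename per line is an assumption; the FASTA file names and their contents are hypothetical placeholders.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# Two tiny placeholder FASTA files (hypothetical names and content).
printf '>seq1\nACGTACGT\n' > "$workdir/chr1.fa"
printf '>seq2\nGGCCTTAA\n' > "$workdir/chr2.fa"

# Assumed list-file layout: one FASTA filename per line.
printf '%s\n' "$workdir/chr1.fa" "$workdir/chr2.fa" > "$workdir/seq.list"

# Check that every listed file is readable before starting the estimation.
while IFS= read -r f; do
    [ -r "$f" ] || echo "missing: $f" >&2
done < "$workdir/seq.list"

# Run the estimation only if estim_mtd is installed.
if command -v estim_mtd >/dev/null 2>&1; then
    estim_mtd "$workdir/seq.list" -d 5 -k 2 -o "$workdir/model.desc" --log
fi
```

The -a option is left out here, so the default DNA alphabet applies.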
Estimate an MTD Markov model of order 3 with matrices of order 1 on the sequences contained in seq.faa. The sequences contain tokens of the amino acid alphabet. The number of seeds, the maximum number of iterations and the epsilon are given explicitly.
- estim_mtd seq.faa -d 3 -k 1 --seed 20 --iter 100 --eps 0.001 --protein
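Several MTD orders could be compared by BIC with a loop such as the one sketched below. The function name bic_scan and the order range 2 to 4 are illustrative choices, not part of seq++; only options documented above are used. The output format of estim_mtd and the sign convention of its BIC are not described on this page, so the results have to be read from the program's own output.

```shell
# Hypothetical helper: estimate MTD models of order 2, 3 and 4 (matrices of
# order 1) on a protein FASTA file and request the BIC on the same file.
bic_scan() {
    seqs=$1
    for d in 2 3 4; do
        # Build the argument list from documented options only.
        set -- "$seqs" -d "$d" -k 1 --protein -b "$seqs"
        if command -v estim_mtd >/dev/null 2>&1 && [ -r "$seqs" ]; then
            echo "== MTD order $d =="
            estim_mtd "$@"
        else
            # Dry run: estim_mtd is not installed or the file is unreadable.
            echo "would run: estim_mtd $*"
        fi
    done
}

bic_scan seq.faa
```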
AUTHORS
estim_mtd is part of the seq++ package, developed by Vincent Miele <miele@genopole.cnrs.fr>, David Robelin <robelin@genopole.cnrs.fr>, Pierre-Yves Bourguignon <bourguignon@genopole.cnrs.fr>, Gregory Nuel <nuel@genopole.cnrs.fr> and Hugues Richard <richard@genopole.cnrs.fr>. Sophie Lebre <lebre@genopole.cnrs.fr> has inspired this work on MTD models.
SEE ALSO
estim_m(1), estim_pm(1), estim_vlm(1), simul_m(1), dist_m(1)
More information on seq++ is available at <http://stat.genopole.cnrs.fr/seqpp>.