estim_m

Language: en

Version: 111134 (mandriva - 01/05/08)

Section: 1 (User commands)

NAME

estim_m - Markov model estimation tool.

SYNOPSIS

estim_m arguments [options]

DESCRIPTION

estim_m estimates a Markov model and/or computes statistics on it. The model is either estimated from the input sequence(s) or loaded from a description file previously generated by estim_m. In both cases the stationary distribution is computed. The resulting model can then be used to simulate sequences with the simul_m program.
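
To illustrate what estimating a model and computing its stationary distribution involves, here is a minimal Python sketch. It is not the seq++ implementation: the function names estimate_markov and stationary are hypothetical, estimation is plain maximum likelihood by counting (no phases, no smoothing), and the stationary distribution is approximated by power iteration, which assumes an aperiodic chain with order >= 1 in which every reachable context was observed with at least one successor.

```python
from collections import Counter, defaultdict


def estimate_markov(sequences, order, alphabet):
    """Maximum-likelihood transition probabilities P(next symbol | context).

    Hypothetical helper: counts every (context, next symbol) pair, where the
    context is the `order` symbols preceding the current position.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1
    model = {}
    for context, successors in counts.items():
        total = sum(successors.values())
        model[context] = {a: successors[a] / total for a in alphabet}
    return model


def stationary(model, order, iterations=1000):
    """Approximate the stationary distribution over contexts by power iteration.

    Assumes order >= 1 and that every context reachable with non-zero
    probability is itself a key of `model`.
    """
    contexts = list(model)
    pi = {c: 1.0 / len(contexts) for c in contexts}
    for _ in range(iterations):
        updated = dict.fromkeys(contexts, 0.0)
        for context, weight in pi.items():
            for symbol, prob in model[context].items():
                if prob > 0.0:
                    # Emitting `symbol` shifts the context window by one.
                    updated[(context + symbol)[-order:]] += weight * prob
        pi = updated
    return pi


# Toy data on a two-letter alphabet (illustrative only).
toy_model = estimate_markov(["AAAB", "ABAB"], order=1, alphabet="AB")
toy_pi = stationary(toy_model, order=1)
```

On this toy input the estimated transitions are P(A|A) = 0.4, P(B|A) = 0.6 and P(A|B) = 1, and the iteration converges to approximately 0.625 for A and 0.375 for B.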

ARGUMENTS

sequence_file
Either the name of a file containing a set of sequences in FASTA format, or the name of a file containing a list of filenames, each of which contains a set of sequences in FASTA format.
-d --order=INTEGER
Order of the Markov model.

OPTIONS

-p --phase=INTEGER
Number of phases (default = 1).
-a --alphabet=FILENAME
A file describing the alphabet to use (default: DNA alphabet).
-A --Alphabet=EXPRESSION
An expression describing the alphabet to use, of the form N:PATTERNS, where N (less than 10) is the number of characters in each pattern and PATTERNS is the list of alphabet patterns, for example 1:AGCT (default: DNA alphabet).
--dna
Use the DNA alphabet (1:AGCT). This is the default setting.
--protein
Use the amino acid alphabet (1:IVLFCMAGTWSYPHEQDNKR).
-m --model=FILENAME
File containing the Markov model parameters (generated by one of the estim_* programs or written by hand in the supported format). Use this option only to load an existing model in order to compute statistics.
-o --output=FILENAME
Result file containing the parameters of the estimated Markov model.
-l --likelihood=FILENAME
Compute the likelihood under the selected model on the sequences contained in FILENAME or on the sequences whose filenames are listed in FILENAME.
-L --Likelihood
Compute the likelihood under the selected model on the sequences specified by the sequence_file argument.
-b --bic=FILENAME
Compute the BIC under the selected model on the sequences contained in FILENAME or on the sequences whose filenames are listed in FILENAME.
-B --Bic=FILENAME
Compute the BIC under the selected model on the sequences specified by the sequence_file argument.
--all
Compute the total BIC/likelihood for all the given sequences.
-v --version
Display the version number and exit.
-h --help
Print this help and exit.
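
For readers unfamiliar with the quantities reported by the likelihood and BIC options, the following Python sketch shows one common way to compute them for a Markov model of a given order. It is an illustration under stated assumptions, not the seq++ code: the function names are hypothetical, the likelihood is conditional on the first `order` symbols, and the BIC uses the textbook form -2 log L + k log n with k = |alphabet|^order * (|alphabet| - 1) free parameters. The exact convention used by estim_m may differ.

```python
import math


def log_likelihood(seq, model, order):
    """Log-likelihood of `seq`, conditional on its first `order` symbols.

    `model` maps each context (a string of length `order`) to a dict of
    next-symbol probabilities. Returns -inf if a transition has probability 0.
    """
    total = 0.0
    for i in range(order, len(seq)):
        prob = model[seq[i - order:i]][seq[i]]
        if prob == 0.0:
            return float("-inf")
        total += math.log(prob)
    return total


def bic(seq, model, order, alphabet_size):
    """Textbook BIC: -2 log L + k log n (assumed convention, lower is better)."""
    n_params = alphabet_size ** order * (alphabet_size - 1)
    n_obs = len(seq) - order
    return -2.0 * log_likelihood(seq, model, order) + n_params * math.log(n_obs)


# Hand-written order-1 model on a two-letter alphabet (illustrative only).
example_model = {
    "A": {"A": 0.4, "B": 0.6},
    "B": {"A": 1.0, "B": 0.0},
}
example_ll = log_likelihood("ABAAB", example_model, order=1)
example_bic = bic("ABAAB", example_model, order=1, alphabet_size=2)
```

Here the sequence ABAAB uses the transitions A to B, B to A, A to A and A to B, so its likelihood is the product 0.6 * 1.0 * 0.4 * 0.6.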

EXAMPLES

Estimate a Markov model of order 2 on the sequence files listed in file seq.list. The sequences are made of tokens from the alphabet described in file custom.alpha. Write the estimated model to file model.desc.

estim_m seq.list -d 2 -a custom.alpha -o model.desc

Same as above, but also compute the total likelihood on the sequences from seq.list. The alphabet is given as an expression.

estim_m seq.list -d 2 -A 1:ABCDEF -L --all
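
The alphabet expression 1:ABCDEF passed to -A in the second example can be read as "patterns of one character each: A, B, C, D, E, F". The Python sketch below splits such an expression into its patterns. The function parse_alphabet is hypothetical, and the sketch assumes that patterns of width greater than 1 are simply concatenated after the colon, which this page does not state explicitly.

```python
def parse_alphabet(expression):
    """Split an expression such as '1:AGCT' into its list of patterns.

    Assumption: the part after ':' is a concatenation of fixed-width
    patterns, the width being the single digit (1 to 9) before ':'.
    """
    width_text, separator, patterns = expression.partition(":")
    if not separator or len(width_text) != 1 or width_text not in "123456789":
        raise ValueError("expected '<digit 1-9>:<patterns>', got %r" % expression)
    width = int(width_text)
    if not patterns or len(patterns) % width != 0:
        raise ValueError("pattern list length must be a positive multiple of %d" % width)
    return [patterns[i:i + width] for i in range(0, len(patterns), width)]


custom_alphabet = parse_alphabet("1:ABCDEF")
dna_alphabet = parse_alphabet("1:AGCT")
```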

AUTHORS

estim_m is part of the seq++ package, developed by Vincent Miele <miele@genopole.cnrs.fr>, David Robelin <robelin@genopole.cnrs.fr>, Pierre-Yves Bourguignon <bourguignon@genopole.cnrs.fr>, Gregory Nuel <nuel@genopole.cnrs.fr> and Hugues Richard <richard@genopole.cnrs.fr>.

SEE ALSO

estim_pm(1), estim_mtd(1), estim_vlm(1), simul_m(1), dist_m(1)

More information on seq++ is available at <http://stat.genopole.cnrs.fr/seqpp>.