R'MES: Finding Exceptional Motifs in Sequences




For a complete description of everything R'MES can do, please refer to the user guide. It starts with the most basic use case (computing the exceptionality scores of all the words of a given length in a given sequence under a given Markov model) and then describes the other cases with their associated options: using degenerate words, analyzing coding DNA sequences, using customized alphabets, finding exceptionally skewed motifs, and studying clumps of motifs.

R'MES is run from the command line, with a command of the following form:

rmes [options] -s <filename> -o <string>

where

     -s <filename>, --seq <filename>
     sets the sequence file, in FASTA or GenBank format,
     -o <string>, --out <string>
     specifies the prefix for the output files.

The full list of options can be displayed by typing:

rmes --help

The options are also described below.

One option is required: it selects the approximation of the word count distribution used to evaluate the p-value, and must be one of the following:

     --gauss
     Use the Gaussian approximation,
     --poisson
     Use the Poisson approximation for the number of clumps,
     --compoundpoisson
     Use the compound Poisson approximation,
     --skew
     Use the Gaussian method and compute the additional scores for the skew.
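
As an illustration, the following shell sketch assembles a command line from the sequence file, the output prefix and one of the four approximation options listed above. The file name sequence.fasta, the prefix results and the variable names are assumptions made for the example, not part of R'MES. Other options, such as the word length, are omitted here. The script only prints the command unless rmes and the input file are both available.

```shell
#!/bin/sh
# Hypothetical inputs: replace with your own sequence file and output prefix.
seq_file="sequence.fasta"
out_prefix="results"
method="gauss"   # one of: gauss, poisson, compoundpoisson, skew

# Accept only the four approximation options listed above.
case "$method" in
  gauss|poisson|compoundpoisson|skew) ;;
  *) echo "unknown approximation: $method" >&2; exit 1 ;;
esac

# Build the command line and print it for inspection (dry run).
cmd="rmes --$method -s $seq_file -o $out_prefix"
echo "$cmd"

# Run it only if rmes is installed and the input file exists.
if command -v rmes >/dev/null 2>&1 && [ -f "$seq_file" ]; then
  $cmd
fi
```

Changing the value of method to poisson, compoundpoisson or skew selects the corresponding approximation.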

The remaining options are optional:

     -l <int> or --length <int>
     (value required) length of the analyzed words,
     -i <int> or --lmin <int>
     (value required) length of the smallest analyzed words,
     -a <int> or --lmax <int>
     (value required) length of the largest analyzed words,
     -m <int> or --markov_order <int>
     (value required) order of the Markov model,
     --max
     Use the maximal Markov order with respect to the word length,
     -f <filename> or --fam <filename>
     (value required) sets the family file (a dedicated file format is required),
     --phases <int>
     (value required) number of phases,
     --dna
     Use the nucleotide alphabet,
     --aa
     Use the amino acid alphabet,
     --alphabet <character string>
     (value required) Specify a string to be used as the alphabet for the sequences,
     -z or --compress
     Compress output files,
     -v or --version
     Displays version information and exits.
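
To show how the optional options combine with the required ones, here is a minimal sketch that analyzes words of length 6 under a Markov model of order 3 with the Gaussian approximation. The file name, the output prefix and the numeric values are illustrative assumptions, and only options described above are used. As written, the script prints the command and executes it only when rmes and the input file are present.

```shell
#!/bin/sh
# Hypothetical inputs: adjust the file name, the prefix and the numeric values.
seq_file="sequence.fasta"
out_prefix="results"
word_length=6
markov_order=3

# Argument list: approximation, word length, Markov order, input file, output prefix.
set -- --gauss -l "$word_length" -m "$markov_order" -s "$seq_file" -o "$out_prefix"

# Print the full command for inspection (dry run).
full_cmd="rmes $*"
echo "$full_cmd"

# Execute only when rmes is on the PATH and the input file exists.
if command -v rmes >/dev/null 2>&1 && [ -f "$seq_file" ]; then
  rmes "$@"
fi
```

Keeping the arguments in the positional parameters lets the same list be printed and executed without quoting problems if a file name contains spaces.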