R'MES: Finding Exceptional Motifs in Sequences

Three utilities are provided in the R'MES package:

rmes.format displays the results contained in an output file generated by the rmes command. It produces a table with the motifs sorted according to their exceptionality scores (see below).
rmes.gfam generates family files for families of degenerate DNA motifs that can be written with the bases a, c, g, t and n (see below).
rmes.composition reports the length of a sequence and its composition (see below).

rmes.format

The rmes.format program displays the results contained in an output file generated by the rmes command. It produces a table with the motifs sorted according to their exceptionality scores; quantities such as observed counts, expected counts and scores are also shown.

It is run from the command line as follows:

rmes.format [options] < <rmesfilename> > <tablefilename>

where

<rmesfilename>: indicates the filename of the output file generated by the rmes command,
<tablefilename>: specifies the name of the output file.

The full list of options can be displayed by typing:

rmes.format --help

and are described below:

-l <int> or --length <int>: (value required) sets the length of the analyzed words,
-i <int> or --lmin <int>: (value required) sets the length of the smallest analyzed words,
-a <int> or --lmax <int>: (value required) sets the length of the largest analyzed words,
--tmax <float value>: (value required) motifs with a score greater than or equal to this threshold will be displayed (3 by default),
--tmin <float value>: (value required) motifs with a score less than or equal to this threshold will be displayed (-3 by default),
-v or --version: displays version information and exits.
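
The effect of the two thresholds can be sketched in a few lines of Python. This is only an illustration of the selection rule described above, not the rmes.format implementation: the function name select_motifs, the example scores and the descending sort order are all assumptions made for the example.

```python
def select_motifs(scores, tmin=-3.0, tmax=3.0):
    """Keep motifs whose score is >= tmax or <= tmin, highest score first."""
    kept = [(motif, s) for motif, s in scores.items() if s >= tmax or s <= tmin]
    # The sort direction is an assumption; the documentation only says
    # that motifs are sorted according to their scores.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up scores, for illustration only.
example = {"gctggtgg": 12.4, "aaaa": 0.7, "ctag": -8.1, "acgt": 3.0}
for motif, score in select_motifs(example):
    print(motif, score)
```

With the default thresholds, the motif scoring 0.7 is dropped, while the one scoring exactly 3.0 is kept because the comparison is inclusive.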

For more details, please refer to the user guide.

rmes.gfam

The rmes.gfam program generates family files that can be used with the -f option of the rmes command. Only families corresponding to degenerate oligonucleotides on the {a,c,g,t,n} alphabet can be generated.

The basic command is:

rmes.gfam [-t <label>] -p <string>

where

-t <label>: specifies the title of the family file (it will be printed as the first line),
-p <string>: specifies the template pattern of the degenerate oligonucleotides. The pattern is composed of characters from {a,c,g,t,n,#}. The letters a, c, g, t and n are kept as written, whereas each # symbol is replaced successively by a, c, g and t to produce the different families.

For instance, the -p #n## option will produce the 64 degenerate tetranucleotides having an n in the second position, whereas the -p ####aa template will produce the 256 hexanucleotides ending with aa.
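
The expansion rule for the template can be checked with a short Python sketch. It only reproduces the substitution of # by a, c, g and t as described above; it does not reproduce the family file format written by rmes.gfam, and the function name expand_template is hypothetical.

```python
from itertools import product

BASES = "acgt"


def expand_template(template):
    """Return every pattern obtained by replacing each '#' with a, c, g or t."""
    allowed = set("acgtn#")
    if not set(template) <= allowed:
        raise ValueError("template may only contain a, c, g, t, n and #")
    # A '#' position can take any of the four bases; other positions are fixed.
    slots = [BASES if ch == "#" else ch for ch in template]
    return ["".join(combo) for combo in product(*slots)]


print(len(expand_template("#n##")))    # 64
print(len(expand_template("####aa")))  # 256
print(expand_template("#n"))           # ['an', 'cn', 'gn', 'tn']
```

Each # multiplies the number of patterns by four, which is why three # symbols give 64 patterns and four give 256.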

For more details, please refer to the user guide.

rmes.composition

The rmes.composition program reports the length and the word composition of a given sequence. The length is defined as the number of valid characters plus the number of separators.

The basic command is:

rmes.composition [options] -s <filename>

where

-s <filename>: sets the sequence file in FASTA or GenBank format,

and the options are:

-l <int> or --length <int>: (value required) specifies the word length,
-i <int> or --lmin <int>: (value required) specifies the minimal word length,
-a <int> or --lmax <int>: (value required) specifies the maximal word length,
-aa: uses the amino acid alphabet.

If no word length is mentioned then the letter composition will be provided.
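
What a word composition looks like can be sketched in Python as follows. This is a minimal illustration, not the rmes.composition implementation: it assumes that words are counted with overlap and that any window containing a character outside the alphabet is skipped, and it does not model separators or file parsing. The function name word_composition is hypothetical.

```python
from collections import Counter


def word_composition(sequence, length=1, alphabet="acgt"):
    """Count overlapping words of the given length made only of valid letters."""
    seq = sequence.lower()
    valid = set(alphabet)
    counts = Counter()
    for i in range(len(seq) - length + 1):
        word = seq[i:i + length]
        if set(word) <= valid:  # skip windows containing any other character
            counts[word] += 1
    return counts


# With the default length of 1 this gives the letter composition.
print(word_composition("acgtacgn"))
print(word_composition("acgtacgn", length=2))
```

The default length of 1 mirrors the behaviour described above, where the letter composition is given when no word length is specified.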

For more details, please refer to the user guide.