R'MES: Finding Exceptional Motifs in Sequences
Three utilities are provided in the R'MES package:
- rmes.format displays the results contained in an output file generated by the rmes command. It produces a table with the motifs sorted according to their exceptionality scores (see below).
- rmes.gfam generates family files for families of degenerate DNA motifs that can be written with the bases a, c, g, t and n (see below).
- rmes.composition reports the length of a sequence and its composition (see below).
rmes.format
The rmes.format program displays the results contained in an output file generated by the rmes command. It produces a table with the motifs sorted according to their exceptionality scores (quantities such as observed counts, expected counts and scores are also presented).
It is run from the command line as follows:
rmes.format [options] < <rmesfilename> > <tablefilename>
where
- <rmesfilename>
- indicates the filename of the output file generated by the rmes command,
- <tablefilename>
- specifies the name of the output file.
All the options can be obtained by typing :
rmes.format --help
and are described below:
- -l <int> or --length <int>
- (value required) sets the length of the analyzed words,
- -i <int> or --lmin <int>
- (value required) sets the length of the smallest analyzed words,
- -a <int> or --lmax <int>
- (value required) sets the length of the largest analyzed words,
- --tmax <float value>
- (value required) motifs with a score greater than or equal to this threshold will be displayed (3 by default),
- --tmin <float value>
- (value required) motifs with a score less than or equal to this threshold will be displayed (-3 by default),
- -v or --version
- displays version information and exits.
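The effect of the --tmin and --tmax thresholds can be illustrated with a minimal Python sketch. It is not the rmes.format implementation: the function name select_exceptional and the (motif, score) pairs are hypothetical, and the descending sort order is an assumption, since the text above only says that motifs are sorted by score.

```python
def select_exceptional(scored, tmin=-3.0, tmax=3.0):
    """Keep motifs whose score is >= tmax or <= tmin, highest score first.

    `scored` is a list of (motif, score) pairs. Defaults mirror the
    documented defaults of --tmin (-3) and --tmax (3). The descending
    order is an assumption made for this illustration.
    """
    kept = [(motif, score) for motif, score in scored
            if score >= tmax or score <= tmin]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores, invented for illustration (not real R'MES output).
scores = [("acgt", 4.2), ("ttta", -3.0), ("ggcc", 0.5),
          ("caca", 3.0), ("atat", -7.1)]

for motif, score in select_exceptional(scores):
    print(motif, score)
```

With the default thresholds, ggcc (score 0.5) is dropped, while caca and ttta are kept because both bounds are inclusive.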
For more details, please refer to the user guide.
rmes.gfam
The rmes.gfam program generates family files that can be used with the -f option of the rmes command. However, only families corresponding to degenerate oligonucleotides over the {a,c,g,t,n} alphabet can be generated.
The basic command is :
rmes.gfam [-t <label>] -p <string>
where
- -t <label>
- specifies the title of the family file (it will be printed as the first line),
- -p <string>
- specifies the template pattern of the degenerate oligonucleotides. The pattern is composed of characters from {a,c,g,t,n,#}. The letters a, c, g, t and n are kept fixed, whereas each # symbol is successively replaced by a, c, g and then t to produce the different families.
For instance, the -p #n## option will produce the 64 degenerate tetranucleotides having an n in the second position (three # symbols, hence 4^3 = 64), whereas the -p ####aa template will produce the 256 hexanucleotides ending with aa (4^4 = 256).
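The expansion of a template can be sketched in Python as follows. This is only an illustration of the substitution rule described above, not the rmes.gfam code: the function name expand_template is hypothetical, and the layout of the generated family file is not reproduced.

```python
from itertools import product


def expand_template(pattern):
    """Return every motif obtained by replacing each '#' with a, c, g, t.

    Characters a, c, g, t and n are left unchanged. Hypothetical helper
    for illustration only; it does not write a family file.
    """
    allowed = set("acgtn#")
    if not set(pattern) <= allowed:
        raise ValueError("pattern must only use characters from {a,c,g,t,n,#}")

    # Positions of the '#' symbols that will be substituted.
    slots = [i for i, ch in enumerate(pattern) if ch == "#"]

    motifs = []
    for bases in product("acgt", repeat=len(slots)):
        chars = list(pattern)
        for position, base in zip(slots, bases):
            chars[position] = base
        motifs.append("".join(chars))
    return motifs


print(len(expand_template("#n##")))    # three '#' symbols: 4**3 motifs
print(len(expand_template("####aa")))  # four '#' symbols: 4**4 motifs
```

The two calls reproduce the counts quoted above (64 and 256), since a template with k # symbols yields 4^k motifs.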
For more details, please refer to the user guide.
rmes.composition
The rmes.composition program reports the length and the word composition of a given sequence. The length is defined as the number of valid characters plus the number of separators.
The basic command is :
rmes.composition [options] -s <filename>
where
- -s <filename>
- sets the sequence file in FASTA or GenBank format,
and the options are :
- -l <int> or --length <int>
- (value required) specifies the word length,
- -i <int> or --lmin <int>
- (value required) specifies the minimal word length,
- -a <int> or --lmax <int>
- (value required) specifies the maximal word length,
- -aa
- uses the amino acid alphabet.
If no word length is mentioned then the letter composition will be provided.
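What a word composition looks like can be sketched in Python. This is an illustration under stated assumptions, not the rmes.composition implementation: the function name word_composition is hypothetical, occurrences are assumed to be counted with overlaps, windows containing a character outside the alphabet are assumed to be skipped, and sequence-file parsing and the length definition given above are not modelled.

```python
from collections import Counter


def word_composition(sequence, lmin=1, lmax=None, alphabet="acgt"):
    """Count words of each length from lmin to lmax in `sequence`.

    Assumptions of this sketch: occurrences are counted with overlaps,
    and any window containing a character outside `alphabet` is ignored.
    With the default lmin=1 and no lmax, the letter composition is returned.
    """
    if lmax is None:
        lmax = lmin
    seq = sequence.lower()
    valid = set(alphabet)

    composition = {}
    for length in range(lmin, lmax + 1):
        counts = Counter()
        for start in range(len(seq) - length + 1):
            word = seq[start:start + length]
            if set(word) <= valid:
                counts[word] += 1
        composition[length] = counts
    return composition


# Hypothetical toy sequence, invented for illustration.
result = word_composition("acgtacgn", lmin=1, lmax=2)
for length, counts in result.items():
    print(length, dict(counts))
```

Passing a different alphabet string would cover the amino acid case in the same way, again as an assumption about how such a count could be organised.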
For more details, please refer to the user guide.