Commit e99789e7 authored by Carsten Kemena

improving options

parent 1e45b525
......@@ -10,7 +10,7 @@
\begin{document}
\title{RADS manual}
\subtitle{2.1.2 (beta)}
\subtitle{2.2.0}
\author{Carsten Kemena}
\maketitle
......@@ -33,9 +33,8 @@ The general option influence the general behaviour of RADS:
\begin{tabular}{llp{9.5cm}}
\hline
parameter & default & description\\\hline
-h, --help & - & Produces this help message \\
-h, --help & - & Produces this help message \\
-d, --db & - & Prefix to the database. Can be either one of the precomputed ones downloaded from the website or self-computed. \\
-a, --all & - & All domain types need to occur\\
-o, --out & - & The output file.\\
-n, --threads & 1 & The number of threads to use\\
\hline
......@@ -48,7 +47,7 @@ The query options define the different ways a query can be provided.
\begin{tabular}{llp{9cm}}
\hline
parameter & default & description\\\hline
-q, --query-dom &- & The domain annotation file to be used as query. This is a simple domain annotion file in one of the supported formats.\\
-q, --query-dom &- & The domain annotation file to be used as query. This is a simple domain annotation file in one of the supported formats.\\
-Q, --query-seq &- & File containing sequences to be used as queries. The file has to be in FASTA format.\\
--domaindb & - & The domain database to use for automated annotation. \\
-D, --domains & - & Provide a domain arrangement manually in form of space separated domain accession numbers (e.g. PF00001 PF00002)\\
......@@ -67,12 +66,24 @@ parameter & default & description\\\hline
-M, --matrix &- & The domain similarity matrix. This one needs to fit the data in the database (e.g. If you work with a database that contain Pfam domains, use the corresponding Pfam similarity matrix.\\
--gop & -50 & Gap opening costs\\
--gep & -10 & Gap extension costs\\
-c, --collapse & false & Collapse consecutive identical domains\\
\hline
\end{tabular}
\end{table}
Gap opening costs are only taken into account when the gap occurs in the middle of a domain arrangement (DA). Gaps at either end of a DA are penalized using only the 'gap extension' costs.
\subsubsection{Result filtering options}
\begin{table}[H]
\begin{tabular}{llp{9cm}}
\hline
parameter & default & description\\\hline
-a, --all & false & All domain types need to occur\\
-M, --min-score & =0 & The minimum alignment score to list\\
\end{tabular}
\end{table}
\subsection{Data bases}
We provide a range of precomputed databases on our website. We currently provide databases based on the InterPro annotations. If you want to compute a database based on your own data you can do that very easily using the makeRadsDB program included (see Section \ref{section:makeRadsDB}).
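The gap-cost rule added to the manual in this hunk (opening costs only for gaps inside a domain arrangement, extension costs only at either end) can be illustrated with a small sketch. The helper `gapCost` is hypothetical and not taken from the RADS sources. It assumes an affine scheme in which an internal gap pays `gop` once plus `gep` per gapped domain; the exact formula RADS uses is not shown in this commit.

```cpp
#include <cassert>

// Hypothetical helper, not part of RADS: cost of one gap spanning `length` domains.
// Assumption: an internal gap costs gop once plus gep per gapped domain,
// while a gap at either end of the arrangement costs only gep per gapped domain.
int gapCost(int length, bool terminal, int gop = -50, int gep = -10)
{
    assert(length >= 0);
    if (length == 0)
        return 0;                    // no gap, no penalty
    const int extension = length * gep;
    return terminal ? extension : gop + extension;
}
```

With the documented defaults (`--gop -50`, `--gep -10`), a terminal gap of one domain would cost -10 under this assumption, and an internal gap of two domains would cost -70.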
......@@ -81,27 +92,34 @@ We provide a range of precomputed databases on our website. We currently provide
\subsection{Output format}
The output is in a very simple textfile format. The hits are listed in a table of five \emph{tab} separated columns. The first column contains the alignment score and the second the normalized version. The third column contains the the target id followed by the sequence length in the fourth column.
The output is in a simple text file format and contains two parts. The first part is a summary of the process containing the date of execution, the version of RADS and the parameters used. The second part of the file contains the result. The hits are listed in a table of five \emph{tab} separated columns. The first column contains the alignment score and the second the normalized version. The third column contains the target id followed by the sequence length in the fourth column. The fifth column contains the domain arrangement.
The table is sorted according to the first column.
\begin{verbatim}
# RADS Output v1
# RADS version 2.0.0
# ********************************
# RADS version 2.2.0
# RADS Output v1
# run at Fri Apr 20 14:19:09 2018
#
# query file: -
# database: interPro-test
# gap open penalty -50
# gap extension penalty -10
# matrix: pfam-31.dsm
# all: false
# collapse: true
# ******************************************************************
# -------------------------------------------------------------------
Results for: manual entered query
Domain arrangement: PF00001
Domain arrangement: PF00001 PF00002 PF00003
# score | normalized | SeqID | sequence length | domain arrangement
# -------------------------------------------------------------------
100 1.00 10020:000030 611 PF00001 44 293
100 1.00 10020:000054 276 PF00001 2 215
100 1.00 10020:0001c3 337 PF00001 42 293
100 1.00 10020:000327 402 PF00001 75 353
100 1.00 10020:000359 410 PF00001 52 305
100 1.00 10020:000393 372 PF00001 67 321
300 1.00 test-seq1 530 PF00001 10 63 PF00002 104 312 PF00003 362 524
300 1.00 test-seq2 530 PF00001 10 63 PF00002 104 312 PF00003 362 524
190 0.69 test-seq3 530 PF00002 104 312 PF00003 362 524
190 0.69 test-seq5 530 PF00001 10 63 PF00002 104 312 PF00002 362 524
\end{verbatim}
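As a rough illustration of how the result table in the sample above could be read back, the following sketch parses one hit line into its columns. The types `DomainHit` and `ResultLine` and the function `parseResultLine` are hypothetical names, not part of RADS. The sketch assumes the fifth column is a sequence of accession/start/end triples, as in the sample, and it splits on any whitespace, which also covers tab separation.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration; they do not exist in the RADS sources.
struct DomainHit {
    std::string accession;
    int start = 0;
    int end = 0;
};

struct ResultLine {
    int score = 0;
    double normalized = 0.0;
    std::string seqId;
    int seqLength = 0;
    std::vector<DomainHit> domains;
};

// Parses one hit line of the result table. Returns false for header/comment
// lines (starting with '#') and for any line that does not begin with the
// four leading columns: score, normalized score, SeqID, sequence length.
bool parseResultLine(const std::string &line, ResultLine &out)
{
    if (line.empty() || line[0] == '#')
        return false;
    std::istringstream in(line);
    ResultLine r;
    if (!(in >> r.score >> r.normalized >> r.seqId >> r.seqLength))
        return false;
    // Assumption: the domain arrangement is a list of "accession start end" triples.
    DomainHit d;
    while (in >> d.accession >> d.start >> d.end)
        r.domains.push_back(d);
    out = r;
    return true;
}
```

Lines such as `Results for: ...` and `Domain arrangement: ...` are rejected because they do not start with an integer score.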
......@@ -122,7 +140,7 @@ rads --db InterPro60-pfam -M pfam30.dsm -q seq.dom
\section{makeRadsDB}\label{section:makeRadsDB}
A program to compute a data base that can be used by RADS. A database consists of two files: an index file
(SQLite database) and an arrangement file (simple textfile) (e.g. if the name of the data base is MyDB the
(SQLite database) and an arrangement file (simple text file) (e.g. if the name of the data base is MyDB the
files needed are MyDB.db and MyDB.da).
\subsection{Program options}
......@@ -157,9 +175,7 @@ Some options to influence the data base construction.
\begin{tabular}{llp{9cm}}
\hline
parameter & default & description\\\hline
-d, --databases & - & The database to use\\
-f, --filter & - & Remove overlapping domains\\
-t, --threshold & 10 & Maximal number of allowed overlap\\
-d, --database & - & The database to use\\
\hline
\end{tabular}
......
......@@ -80,20 +80,20 @@ main(int argc, char *argv[])
int gop, gep;
fs::path matrixName;
bool collapse;
po::options_description scoreOpts("Scoring options");
scoreOpts.add_options()
("matrix,m", po::value<fs::path>(&matrixName)->value_name("FILE"), "The domain similarity matrix")
("gop", po::value<int>(&gop)->default_value(-50)->value_name("INT"), "Gap opening costs")
("gep", po::value<int>(&gep)->default_value(-10)->value_name("INT"), "Gap extension costs")
("collapse,c", po::value<bool>(&collapse)->default_value(false)->zero_tokens(), "Collapse consecutive identical domains")
;
bool all;
bool collapse;
int minScore;
po::options_description filterOpts("Result filtering options");
filterOpts.add_options()
("all,a", po::value<bool>(&all)->default_value(false)->zero_tokens(), "All domain types need to occur")
("collapse,c", po::value<bool>(&collapse)->default_value(false)->zero_tokens(), "Collapse consecutive identical domains")
("min-score,M", po::value<int>(&minScore)->default_value(0), "The minimum alignment score to list")
;
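The two result filtering options registered in this hunk, `--all` and `--min-score`, could act on a hit roughly as sketched below. The functions `containsAllTypes` and `keepHit` are hypothetical and not taken from the RADS sources. The sketch assumes that the score threshold is inclusive and that "all domain types need to occur" means every distinct accession of the query appears at least once in the target.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: true if every domain accession of the query
// occurs at least once in the target arrangement.
bool containsAllTypes(const std::vector<std::string> &query,
                      const std::vector<std::string> &target)
{
    const std::set<std::string> present(target.begin(), target.end());
    for (const auto &accession : query)
        if (present.count(accession) == 0)
            return false;
    return true;
}

// Hypothetical filter combining both options.
// Assumption: hits with score >= minScore are listed (inclusive threshold).
bool keepHit(int score,
             const std::vector<std::string> &query,
             const std::vector<std::string> &target,
             int minScore, bool all)
{
    if (score < minScore)
        return false;
    return !all || containsAllTypes(query, target);
}
```

Under these assumptions, a hit like `test-seq3` from the sample output (PF00002 PF00003 against the query PF00001 PF00002 PF00003) would be listed by default but dropped once `--all` is set.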
......@@ -177,9 +177,7 @@ main(int argc, char *argv[])
querySet.emplace("manual entered query", da);
}
AP::Output outS(outFile);
// calculate results
fs::path daFile = prefix;
daFile.replace_extension(".da");
vector<BSDL::AlignmentMatrix<int, BSDL::DSM> > matrices;
......@@ -190,11 +188,12 @@ main(int argc, char *argv[])
std::time_t tt;
tt = system_clock::to_time_t(today);
// Write results
string qPath = querySeqFile.empty() ? "-" : fs::canonical(querySeqFile).string();
string print_matrix_name = (print_filename_only) ? matrixName.stem().string() : fs::canonical(matrixName).string();
string print_db_name = (print_filename_only) ? prefix.stem().string() : fs::canonical(daFile).replace_extension("").string();
// print output header
AP::Output outS(outFile);
outS << "# RADS version " + version + "\n"
<< "# RADS Output v1\n"
<< "# run at " << ctime (&tt) << "#\n"
......