Commit 2e3169e5 authored by T-B-F

add info

parent d993275c
# Scripts to generate the domain similarity matrix
## Requirements
Python 3.x
## usage
### Download Pfam models for hhsearch
Replace `31.0` in the commands below with the current Pfam release available on the download server.
mkdir -p hhsuite_dbs/pfamA
cd hhsuite_dbs/pfamA
wget http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/pfamA_31.0.tgz
tar xzf pfamA_31.0.tgz
### Execute the make_domat.sh script
chmod u+x make_domat.sh
./make_domat.sh hhsuite_dbs/pfamA/pfam tmpdir/
This may take some time.
Experienced users can run many of these steps in parallel on a compute grid.
@@ -34,7 +34,7 @@ def main():
else:
# storing lines before knowing name of the models, ie name of output
lines.append(line)
-if line.startswith("ACC "):
+if line.startswith("NAME "):
# get name of the model and create the corresponding output file
tmp = line.split()
name = tmp[1].split(".")[0]
@@ -4,7 +4,12 @@
# Pfam-A.hhm models can be downloaded at
# http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/
-hmmdb=$1 # path to the file with all the hmmdata, example Pfam-A.hmm
+# WARNING: do not use Pfam models from the Pfam db; use the Pfam models from the HHsuite db
+# models will be extracted by "python cut_seed.py"
+# individual models will be compared to the full database of models (from HHsuite db)
+hmmdb=$1 # path to the file with all the hmmdata, example hhsuite_dbs/pfam
workout=$2 # path to the working directory
matrix=domsim_matrix.dat # output name
simcutoff=1 # similarity cutoff
@@ -28,7 +33,7 @@ fi
## important stuff starts here ##
# cut the initial pfam file into smaller pieces
-python cut_seed.py -i $hmmdb -o $tmphmm
+python cut_seed.py -i ${hmmdb}_hhm_db -o $tmphmm
# for each pfam piece, run hhsearch against the hmm database
for hmm in `ls $tmphmm`