MycoAlign Installation Procedure

From GSAF

Revision as of 00:51, 25 January 2008 by Jhannah (Talk | contribs)

How to install MycoAlign.

The goal is to be able to hand a vanilla Linux machine and this wiki page to an administrator so that they can install MycoAlign from scratch without knowing anything about what MycoAlign is or what it does.

Prerequisites

  • Linux
  • MySQL
  • Perl, C compiler, Python TODO: details?
  • Apache, FastCGI? TODO: details?
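
A quick way to confirm the tools above are present before starting is sketched below. The command names in the example call (mysql, perl, cc, python, svn, make) are assumptions about how these prerequisites appear on your PATH; this page does not specify versions or package names.

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Example (command names are assumptions; adjust for your distribution):
# check_prereqs mysql perl cc python svn make
```

This only checks that a command exists, not that it is a suitable version.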

Installation

adduser i3bio       # Production
adduser i3biodev    # Developers / QA
cd /home
svn checkout https://hive.gds.unomaha.edu/svn/i3bio/tags/current i3bio
svn checkout https://hive.gds.unomaha.edu/svn/i3bio/trunk        i3biodev

# Compile "compare", a C program:
cd i3bio/public_html/db/compare
make

# Build the database
cd /home/i3bio/
# TODO: create the MySQL database "i3bio"       (Procedure?)
# TODO: create the MySQL database "i3biodev"    (Procedure?)
# TODO: Security? Usernames and passwords?      (Procedure?)
mysql i3bio    < public_html/db/scripts/schema.sql
mysql i3biodev < public_html/db/scripts/schema.sql

# Configure Apache
# TODO: Are the Apache configs in SVN? Documentation? FastCGI? How?

vi public_html/db/config.pl
   # Change settings to the production database (username, password, etc.). TODO: details.
   # (It's the i3biodev config that is committed to SVN, not the production config.)
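
The database-creation steps marked TODO above are not documented on this page. As a placeholder, here is a minimal sketch of what they could look like, assuming a MySQL admin login and one application user per database. The user names, the `localhost` host, and the `CHANGE_ME` password are hypothetical, not the project's actual settings.

```shell
#!/bin/sh
# Print SQL that creates a database and grants one user full access to it.
# Arguments: database name, user name, password (all hypothetical values).
db_setup_sql() {
    db=$1; user=$2; pass=$3
    cat <<EOF
CREATE DATABASE IF NOT EXISTS \`$db\`;
CREATE USER '$user'@'localhost' IDENTIFIED BY '$pass';
GRANT ALL PRIVILEGES ON \`$db\`.* TO '$user'@'localhost';
EOF
}

# Example: review the SQL, then feed it to the server as an admin user.
# db_setup_sql i3bio    i3bio    CHANGE_ME | mysql -u root -p
# db_setup_sql i3biodev i3biodev CHANGE_ME | mysql -u root -p
```

Printing the SQL rather than running it directly lets the administrator inspect the statements first. Whatever credentials are chosen must match what goes into config.pl.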

Create /tmp/Actinobacteria database for subsequent searching

The CLAB server has most of the current GenBank archive on it. So let's pop over there and grab a current dump of all Actinobacteria sequences. We'll convert those sequences to FASTA, compress the file, move it to the i3bio server, and prepare for BLAST queries.

ssh clab.ist.unomaha.edu
cd /sequences/genbank/
seqget.pl --organism Actinobacteria 1>/tmp/A.genbank
genbank2fasta.pl /tmp/A.genbank /tmp/A.fasta
bzip2 /tmp/A.fasta
scp /tmp/A.fasta.bz2 i3bio.gds.unomaha.edu:/tmp/
ssh i3bio.gds.unomaha.edu
mkdir /tmp/Actinobacteria
cd /tmp/Actinobacteria
mv ../A.fasta.bz2 ./
bzip2 -d A.fasta.bz2
formatdb -i A.fasta -o T -p F

Now your directory should look like this:

jhannah@i3bio:/tmp/Actinobacteria$ ls -al
total 332144
drwxr-xr-x  2 jhannah jhannah      4096 2008-01-24 18:45 .
drwxrwxrwt 47 root    root         8192 2008-01-24 18:46 ..
-rw-r--r--  1 jhannah jhannah 268569337 2008-01-24 18:41 A.fasta
-rw-r--r--  1 jhannah jhannah   4122667 2008-01-24 18:45 A.fasta.nhr
-rw-r--r--  1 jhannah jhannah    408580 2008-01-24 18:45 A.fasta.nin
-rw-r--r--  1 jhannah jhannah   1123780 2008-01-24 18:45 A.fasta.nsd
-rw-r--r--  1 jhannah jhannah     26104 2008-01-24 18:45 A.fasta.nsi
-rw-r--r--  1 jhannah jhannah  65485509 2008-01-24 18:45 A.fasta.nsq
-rw-r--r--  1 jhannah jhannah       600 2008-01-24 18:45 formatdb.log

As long as the five index files (.nhr, .nin, .nsd, .nsi, .nsq) are all present and non-empty, you're done with A.fasta. Remove it, since it's huge and we don't need it any more.

rm A.fasta
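
The eyeball check above can also be scripted. The sketch below only confirms that each of the five index files exists and is non-empty. The function name is made up for this example, and it does not validate the BLAST database contents.

```shell
#!/bin/sh
# Succeed only if every formatdb index file for the given base name
# exists and is non-empty. Prints the first file that fails the check.
blastdb_files_ok() {
    base=$1
    for ext in nhr nin nsd nsi nsq; do
        if [ ! -s "$base.$ext" ]; then
            echo "missing or empty: $base.$ext"
            return 1
        fi
    done
    return 0
}

# Example: delete the FASTA source only when the indexes look sane.
# cd /tmp/Actinobacteria && blastdb_files_ok A.fasta && rm A.fasta
```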

You can also read formatdb.log, if you're in the mood.

That's it. You should be all set.