Jay's random SeqLab notes
From GSAF
My random SeqLab meanderings. --Jhannah 19:23, 31 January 2008 (CST)
Typedata / Fetch
typedata genembl:drogp* # writes to STDOUT
typedata genembl:drogpdh*
fetch genembl:drogpdh* # writes to files
Different databases
Nucleic
genbank GenBank
embl EMBL
genembl GenBank + EMBL
nucleic PIR-Nucleic
Protein
pir PIR-Protein
sw Swiss-Prot
GenEMBL has 14 different subclassifications:
ba Bacteria
in Invertebrate
...
To search ONLY invertebrate for drogpdh*:
typedata in:drogpdh*
fetch genembl:drogpdh
tofasta -check
tofasta -INfile=drogpdh.gb_in -Default
tofasta -INfile=drogpdh.gb_in -BEGin=20 -END=60
Reference Searching: lookup, stringsearch, names
Sequence Searching: blast, netblast
Sequence Retrieval: fetch, netfetch
lookup -check
lookup Kit -ORG="Rattus norvegicus"
stringsearch genEMBL:* catalase
stringsearch genEMBL:* catalase -MEN=A -OUT=myfile.list
lookup mouse catalase -IN=@myfile.list
reformat -RSF @myfile.list
X Windows
From Mac OS X:
- Launch X11
- ssh -X gsaf.unmc.edu
- xwindows
- dotplot nm_022264.seq.pnt
From Windows:
- Launch Cygwin
- In PuTTY, click the "Enable X11 forwarding" box
- ssh to gsaf.unmc.edu
- xwindows
- dotplot nm_022264.seq.pnt
netblastn
netfetch -check
netfetch drogpdh
reformat drogpdh.rsf{*}
netblast drogpdh.seq -DBNucleotideonly -LIStsize=20 # creates drogpdh.netblastn
netfetch drogpdh.netblastn -TOP=10 -TYPe=n -OUT=drogpdh.hits.rsf
netblastx, compare, dotplot
netfetch -check
netfetch drogpdh
reformat drogpdh.rsf{*}
netblast drogpdh.seq -LIStsize=20 # creates drogpdh.netblastx
netfetch drogpdh.netblastx -TOP=10 -TYPe=n -OUT=drogpdh.hits.rsf
reformat drogpdh.hits.rsf{*}
compare abb29518.seq abc96874.seq
dotplot abb29518.pnt
Homework assigned Feb 21
netfetch nm_022264
reformat nm_022264.rsf{*}
translate nm_022264.seq -BEG=45 -END=2981 -OUT=nm_022264.seq.aa
netblast nm_022264.seq.aa -LIStsize=20
netfetch nm_022264.seq.netblastp -TOP=10 -TYPe=p -OUT=nm_022264.hits.rsf
reformat nm_022264.hits.rsf{*}
compare nm_022264.seq.aa caa44354.seq
dotplot nm_022264.seq.pnt
That dotplot is pretty boring, so let's also do a dotplot from the worst hit:
netfetch ABP97102.1
reformat abp97102.rsf{*}
compare nm_022264.seq.aa abp97102.seq -WIN=5 -OUT=worst_hit.pnt
dotplot worst_hit.pnt
Wow. That's just noise. Try these:
compare nm_022264.seq.aa abp97102.seq -WIN=10 -OUT=worst_hit.pnt
compare nm_022264.seq.aa abp97102.seq -WIN=15 -OUT=worst_hit.pnt
compare nm_022264.seq.aa abp97102.seq -WIN=20 -OUT=worst_hit.pnt
And watch the line re-emerge... :)
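Why the line comes back as -WIN grows can be seen in a toy version of a windowed dot plot. This is only a sketch: the function names dotplot_points and off_diagonal and the stringency parameter are made up for illustration, and it counts plain identities, whereas the real compare program scores windows with a scoring matrix and its own stringency defaults.

```python
def dotplot_points(seq_a, seq_b, window, stringency):
    """Return (i, j) pairs where the windows starting at seq_a[i] and
    seq_b[j] share at least `stringency` identical positions.

    Hypothetical helper: identity counting stands in for the scoring
    matrix that a real dot-plot program would use.
    """
    points = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= stringency:
                points.append((i, j))
    return points


def off_diagonal(points):
    """Count points that are not on the main diagonal (the 'noise')."""
    return sum(1 for i, j in points if i != j)


if __name__ == "__main__":
    seq = "AACA"  # toy sequence compared against itself
    for window in (1, 2):
        pts = dotplot_points(seq, seq, window, stringency=window)
        print(window, len(pts), off_diagonal(pts))
```

With a window of 1, every chance identity becomes a dot. A longer window only keeps positions where several neighbouring residues agree, so random dots drop out and the true diagonal stands out.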
pileup, figure, pretty
ls *seq > seq.list
pileup @seq.list # Creates .msf file, pileup.figure
figure -POR pileup.figure # See the dendrogram
pretty -IN=@seq.list -CON -OUT=pretty.pretty # vi pretty.pretty to see consensus
map, mapplot, mapsort, plasmidmap
map gives text output, mapplot is the pretty version of the whole seq, and mapsort shows us all the fragments we'd end up with.
map -INfile=x02513.seq -Default
map -INfile=x02513.seq -BEGin=1 -END=7249 -ENZ=* -TAB
mapplot -INfile=gb_sy:synpbr322 -OUTfile=synpbr322.mapplot
mapplot -INfile=gb_sy:synpbr322 -OUTfile=synpbr322.mapplot -EXCL=500,2000 -DAT=pUC18_cutters.dat
These two options are very important to us. Together they find only the enzymes that cut once when we ignore some ranges:
-EXC=500,2000 -ONCe
mapsort -INfile=genbank:synpbr322 -CIR -ENZ=* -ONC -EXCL=10,200 # creates text file synpbr322.mapsort
mapsort -INfile=genbank:synpbr322 -CIR -ENZ=* -PLA -ONC -EXCL=10,200
plasmidmap synpbr322.tick # pretty version!
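What "all the fragments we'd end up with" means, and why -CIR matters, can be sketched in a few lines. This is a rough illustration, not how mapsort itself works: fragment_lengths is a made-up name, the cut positions below are invented, a cut position p is assumed to mean "cut after base p", and listing the largest fragment first is my own choice.

```python
def fragment_lengths(cut_positions, seq_length, circular=False):
    """Return fragment lengths, largest first, after cutting a sequence.

    Hypothetical helper: a cut position p means the cut falls after
    base p. For a circular molecule the piece that wraps around the
    origin is counted as a single fragment.
    """
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [seq_length]  # nothing cut, one piece
    if circular:
        lengths = [b - a for a, b in zip(cuts, cuts[1:])]
        # The last cut joins back to the first cut across the origin.
        lengths.append(seq_length - cuts[-1] + cuts[0])
    else:
        bounds = [0] + cuts + [seq_length]
        lengths = [b - a for a, b in zip(bounds, bounds[1:])]
        lengths = [n for n in lengths if n > 0]  # drop empty end pieces
    return sorted(lengths, reverse=True)


if __name__ == "__main__":
    made_up_cuts = [100, 400]  # invented positions, not real enzyme sites
    print(fragment_lengths(made_up_cuts, 1000))
    print(fragment_lengths(made_up_cuts, 1000, circular=True))
```

The same two cuts give three pieces on a linear sequence but only two on a plasmid, which is why -CIR is passed for synpbr322.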

