The query sequence for this search has been filtered. Filtering
masks low-complexity regions, which commonly give spuriously high
scores that reflect compositional bias rather than significant
position-by-position alignment. Masking removes these potentially
confounding matches (e.g., hits against proline-rich regions or poly-A
tails) from the BLAST reports, leaving regions whose BLAST statistics
reflect the specificity of their pairwise alignment.
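
To illustrate the idea of low-complexity masking described above, here is a minimal sketch. It is NOT the filter used for this search: the report does not name the filter program, and the function names, window size, and entropy threshold below are all illustrative assumptions. The sketch masks every residue that falls inside a sliding window whose Shannon entropy is below a cutoff.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (bits per residue) of one window of sequence."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 2.2,
                        mask_char: str = "X") -> str:
    """Replace residues covered by any low-entropy window with mask_char.

    window and threshold are hypothetical defaults chosen for illustration,
    not parameters taken from BLAST.
    """
    masked = list(seq)
    for start in range(0, len(seq) - window + 1):
        if window_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = mask_char
    return "".join(masked)
```

A homopolymer run such as sixteen prolines is masked completely, while a stretch of twenty distinct residues is left untouched. Sequences shorter than the window are returned unchanged.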
BLASTX 2.1.1 [Aug-8-2000]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= Contig432.seq Contig432
(1253 letters)
Database: nr
565,281 sequences; 177,575,912 total letters
                                                                Score  E
Sequences producing significant alignments:                     (bits) Value
dbj|BAA03574.1| (D14846) endo alpha-1,4 polygalactosaminida...    111  2e-43
pir||C70471 hypothetical protein aq_1993 - Aquifex aeolicus...     98  3e-36
pir||T35294 probable endo alpha-1,4 polygalactosaminidase -...    102  1e-20
pir||D75551 probable endo alpha-1,4 polygalactosaminidase -...     58  4e-15
dbj|BAA14013.1| (D89734) unnamed protein product [Streptomy...     55  1e-06
emb|CAB59594.1| (AL132662) hypothetical secreted protein [S...     42  0.012
>dbj|BAA03574.1| (D14846) endo alpha-1,4 polygalactosaminidase precusor [Pseudomonas
sp.]
Length = 294
Score = 111 bits (275), Expect(2) = 2e-43
Identities = 56/130 (43%), Positives = 79/130 (60%), Gaps = 1/130 (0%)
Frame = +2
Query: 614 NLRSKNVRNIMKKRIKYAAKKGCDAIDPDNVDGYQNDNGLGLTQKDSIDYVKFLATEAAK 793
++RS NVR+IM R+ A KGCD ++PDNVDGY ND G L D + F+A EA K
Sbjct: 140 DIRSSNVRDIMTARLDRAVAKGCDGVEPDNVDGYANDTGFPLQDTDQYAFNVFIANEAHK 199
Query: 794 YNMSTGLKNAGDIIKSVLPYVQFSVNEQCVEYSECETFAAFIKAKKPVFNIEYPKGAPKV 973
N++ GLKN D + ++ P F+VNE+C E EC+ + F KPV N EY A K
Sbjct: 200 RNLAVGLKNDVDQLVALEPSFDFAVNEECNEQKECDGYTVFTSKNKPVLNAEY---AGKY 256
Query: 974 K-EADRKTICS 1003
+ + ++T+C+
Sbjct: 257 RTPSGQRTLCN 267
Score = 87.4 bits (213), Expect(2) = 2e-43
Identities = 43/94 (45%), Positives = 54/94 (56%)
Frame = +3
Query: 333 WHPKVRATCQIILLKPLKLSNDGTAKNLKPNVCVYDLDLYDNDAETFAALHNAGKNVICY 512
W P V T Q L L N NV +YD+DL+D D T AAL AG+ V+CY
Sbjct: 55 WTPTVADTWQWQLKGKL---------NTSYNVAIYDIDLFDTDPATIAALKQAGRKVVCY 105
Query: 513 FSAGSWENWRDDKNQFKKADLGKTMDGWPDEKWI 614
FSAGS ENWR D ++FK +D G +D W E+W+
Sbjct: 106 FSAGSSENWRPDFSKFKASDQGNKLDDWEGERWL 139
>pir||C70471 hypothetical protein aq_1993 - Aquifex aeolicus
gb|AAC07769.1| (AE000767) putative protein [Aquifex aeolicus]
Length = 215
Score = 97.9 bits (240), Expect(2) = 3e-36
Identities = 44/95 (46%), Positives = 64/95 (67%)
Frame = +2
Query: 614 NLRSKNVRNIMKKRIKYAAKKGCDAIDPDNVDGYQNDNGLGLTQKDSIDYVKFLATEAAK 793
++R++ VR +M KR+K A +KGCD +DPDN+D Y D G LT++D DY FL+ EA K
Sbjct: 107 DVRNEKVRELMVKRLKLAKQKGCDGVDPDNLDIYLYDTGFNLTKEDLKDYAVFLSREAKK 166
Query: 794 YNMSTGLKNAGDIIKSVLPYVQFSVNEQCVEYSEC 898
+ GLKN G +++ +L Y FSV E+C ++ EC
Sbjct: 167 IGLKIGLKNNGVLVEELLNYFDFSVVEECHKFKEC 201
Score = 77.3 bits (187), Expect(2) = 3e-36
Identities = 32/64 (50%), Positives = 42/64 (65%)
Frame = +3
Query: 423 NVCVYDLDLYDNDAETFAALHNAGKNVICYFSAGSWENWRDDKNQFKKADLGKTMDGWPD 602
NV +YD+DL+DN + L GK VICYFSAG+WE WR D N+F K +GK +GW
Sbjct: 43 NVELYDIDLFDNSVQVINELKAKGKTVICYFSAGTWEEWRPDANEFPKEAIGKPYEGWEG 102
Query: 603 EKWI 614
E ++
Sbjct: 103 EYFL 106
>pir||T35294 probable endo alpha-1,4 polygalactosaminidase - Streptomyces
coelicolor
emb|CAB51262.1| (AL096872) putative endo alpha-1,4 polygalactosaminidase
[Streptomyces coelicolor A3(2)]
Length = 282
Score = 102 bits (251), Expect = 1e-20
Identities = 57/170 (33%), Positives = 88/170 (51%)
Frame = +2
Query: 575 GQDDGRVA*REMDNLRSKNVRNIMKKRIKYAAKKGCDAIDPDNVDGYQNDNGLGLTQKDS 754
G+ +G R +D + + +M +R+ KG DA++PDN+DGY+ND G LT D
Sbjct: 125 GKGNGWEGERWLDIRATDVLEPLMAERLDMCRDKGFDAVEPDNMDGYKNDTGFPLTGDDQ 184
Query: 755 IDYVKFLATEAAKYNMSTGLKNAGDIIKSVLPYVQFSVNEQCVEYSECETFAAFIKAKKP 934
+ Y + +A A M+ GLKN D I ++ F+VNEQC +Y EC F+ A K
Sbjct: 185 LRYNRLIAKLAHDRGMAVGLKNDLDQIPDLVDDFDFAVNEQCAQYGECADNRPFVDADKA 244
Query: 935 VFNIEYPKGAPKVKEADRKTICSKKGKAKGTDGFSTVIKKMNLDGWVQYC 1084
VF++EY E + C+ + + S+++KK LD W + C
Sbjct: 245 VFHVEY--------ELPTERFCADSRELR----LSSMLKKYELDAWREAC 282
Score = 69.9 bits (168), Expect = 6e-11
Identities = 31/64 (48%), Positives = 44/64 (68%)
Frame = +3
Query: 423 NVCVYDLDLYDNDAETFAALHNAGKNVICYFSAGSWENWRDDKNQFKKADLGKTMDGWPD 602
+V VYD+D +D+D T A LH+ G+ VICY S G+WE++R D + F K LGK +GW
Sbjct: 74 DVPVYDIDGFDHDEATVAGLHDDGRKVICYVSTGAWEDFRPDADAFPKKVLGKG-NGWEG 132
Query: 603 EKWI 614
E+W+
Sbjct: 133 ERWL 136
>pir||D75551 probable endo alpha-1,4 polygalactosaminidase - Deinococcus
radiodurans (strain R1)
gb|AAF09753.1|AE001879_1 (AE001879) endo alpha-1,4 polygalactosaminidase, putative
[Deinococcus radiodurans]
Length = 305
Score = 58.2 bits (138), Expect(2) = 4e-15
Identities = 30/104 (28%), Positives = 55/104 (52%), Gaps = 7/104 (6%)
Frame = +2
Query: 641 IMKKRIKYAAKKGCDAIDPDNVDGYQNDNGLGLTQKDSIDYVKFLATEAAKYNMSTGLKN 820
I+ +R+ A KG DA++PDN+ QN ++++D +D+ +LA A + ++ KN
Sbjct: 156 ILDRRLALCAAKGFDAVEPDNLQNDQNVTSGVISRQDQLDFNGWLADRAHAHGLAILQKN 215
Query: 821 AGDII-------KSVLPYVQFSVNEQCVEYSECETFAAFIKAKKPVFNIEY 952
D + + ++ +NE C Y EC +++ K N+EY
Sbjct: 216 GPDYVLQADRQGRLMVDLFDGVLNESCQRYKECGPLTEYVRRGKLALNVEY 266
Score = 45.7 bits (106), Expect(2) = 4e-15
Identities = 22/67 (32%), Positives = 35/67 (51%)
Frame = +3
Query: 414 LKPNVCVYDLDLYDNDAETFAALHNAGKNVICYFSAGSWENWRDDKNQFKKADLGKTMDG 593
L V + DLD ++ A A L G +CY + GS+E++R D Q+ + +T
Sbjct: 75 LPAGVSLLDLDGFETSAAKVADLKAQGVYTVCYLNVGSYESYRPDAAQYPDSLKIQTDPN 134
Query: 594 WPDEKWI 614
WPDE ++
Sbjct: 135 WPDESFV 141
>dbj|BAA14013.1| (D89734) unnamed protein product [Streptomyces griseus]
Length = 258
Score = 55.4 bits (131), Expect = 1e-06
Identities = 37/114 (32%), Positives = 58/114 (50%), Gaps = 2/114 (1%)
Frame = +2
Query: 620 RSKNVRNIMKKRIKYAAKKGCDAIDPDNVDGYQNDNGLGLTQKDSIDYVKFLATEAAKYN 799
R + +I+ I AK G A++PDN+D Y+ GL LT+ + K LA A
Sbjct: 112 RRSRLADIVGGWIDGCAKAGFQAVEPDNLDSYERSKGL-LTRAHNAASAKLLADRAHAAG 170
Query: 800 MSTGLKNAGDII--KSVLPYVQFSVNEQCVEYSECETFAAFIKAKKPVFNIEYPKG 961
++ G KN D++ + + + F+V E+C Y EC +A + VF +EY G
Sbjct: 171 LAIGQKNTTDLLGQRDTIGF-DFAVAEECGRYDECADYADAYGDR--VFVVEYTDG 223
>emb|CAB59594.1| (AL132662) hypothetical secreted protein [Streptomyces coelicolor
A3(2)]
Length = 275
Score = 42.2 bits (97), Expect = 0.012
Identities = 26/77 (33%), Positives = 39/77 (49%), Gaps = 1/77 (1%)
Frame = +2
Query: 668 AKKGCDAIDPDNVDGYQNDNGLGLTQKDSIDYVKFLATEAAKYNMSTGLKNAGDIIKS-V 844
A KG A++PDN D Y L L D+ +K LA A ++ G KN ++ +
Sbjct: 145 ADKGFQAVEPDNYDSYTRAGDL-LDAADAQGLIKLLAERAHADGLAIGQKNTVELAPNRK 203
Query: 845 LPYVQFSVNEQCVEYSEC 898
+ F+V E+C E+ EC
Sbjct: 204 ANGLDFAVAEECGEWDEC 221
Database: nr
Posted date: Sep 29, 2000 9:53 PM
Number of letters in database: 177,575,912
Number of sequences in database: 565,281
Lambda     K        H
   0.318   0.135    0.00
Gapped
Lambda     K        H
   0.270   0.0470   4.94e-324
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 378694507
Number of Sequences: 565281
Number of extensions: 7793700
Number of successful extensions: 21318
Number of sequences better than 10.0: 12
Number of HSP's better than 10.0 without gapping: 4
Number of HSP's successfully gapped in prelim test: 2
Number of HSP's that attempted gapping in prelim test: 21306
Number of HSP's gapped (non-prelim): 10
length of query: 417
length of database: 177,575,912
effective HSP length: 54
effective length of query: 363
effective length of database: 147,050,738
effective search space: 53379417894
effective search space used: 53379417894
frameshift window, decay const: 50, 0.1
T: 12
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.8 bits)
X3: 64 (24.9 bits)
S1: 41 (21.7 bits)
S2: 72 (32.5 bits)
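
The length and score figures in the statistics block above can be cross-checked with a little arithmetic. The sketch below is an interpretation, not part of the BLAST output: it assumes the 1253-letter nucleotide query is counted as 1253 // 3 translated residues, that the effective HSP length is subtracted once from the query and once per database sequence, and that bit scores follow the standard Karlin-Altschul conversion S' = (Lambda * S - ln K) / ln 2. The function names are hypothetical. Because the report prints Lambda and K rounded, the bit scores agree with the printed ones only approximately.

```python
import math


def effective_search_space(query_nt: int, db_letters: int,
                           db_seqs: int, hsp_len: int):
    """Return (query length in residues, effective query length,
    effective database length, effective search space)."""
    query_aa = query_nt // 3                    # translated query length
    eff_query = query_aa - hsp_len              # trim one HSP length
    eff_db = db_letters - db_seqs * hsp_len     # trim one per sequence
    return query_aa, eff_query, eff_db, eff_query * eff_db


def bit_score(raw: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits."""
    return (lam * raw - math.log(k)) / math.log(2)


# Values copied from the report above.
print(effective_search_space(1253, 177_575_912, 565_281, 54))
print(round(bit_score(213, 0.270, 0.0470), 1))  # gapped Lambda, K
print(round(bit_score(41, 0.318, 0.135), 1))    # ungapped Lambda, K
```

Under these assumptions the first call reproduces the reported 417, 363, 147,050,738 and 53379417894. The gapped parameters give about 87.4 bits for raw score 213, and the ungapped parameters give about 21.7 bits for S1 = 41, matching the printed values.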