GenBank accession: YP_009857207.1 [GenBank]
Protein name: tail fiber protein
RBP type  Evidence        Probability
TSP       DepoScope       1,00
TF        GenBank         1,00
TF        Phold           1,00
TF        RBPdetect       0,66
TF        RBPdetect2      0,95
TF        UniProt/TrEMBL  1,00
Protein sequence
MAITKIILQQMVTMDQNSITASKYPKYTIVLGNTISSITAGELTSAIESSKASAAAAKQSEINAKQSELNAKDSENEAEISAASSQQSATQSASSATASANSAKAAKTSETNAKASETAAKTSETKAKASETAAKTSETNANSSKTAAAASASAAKTSETNASASAAAAKTSETNANNSKTAAANSANAAKASEINAKTSETNAAASATKAENVASGMKASIGLGDSPRDCPDISGNPSNFLGFLRIYETATGFPSIAVGETVLTGFISGTDGVPSFAGLFVGSSTRTIYSYRWRPESGPLWTKNARIDDVNRLVQLSSETQLLNPGNNAKIIITSGKLWGAYDIENRAYIPLAVGQGGTGGRSASEARTNLGINRLEQIANDQTRLYAGNNTTYLEIGNNRAWGVYNRSSNSWQPLGIAQGGTGAMTAEDARTNLGIGRSSSPTFGHLNLLVENDSAQASSGILNQYLRDTSGVQRARSRIYSEIRGDNKAWLTLHIQSDANTNKYAGLSIDGNFQINGNFIGNAISLSDVATSKVNLQVNRFLQNTGETAVVNHAGTAQIFITDNKNWGAYDNELKRRIALPISQGGTGSLSISEAKNNLQIPSIGGGEWLTLNAPAGVEDGKYYPVIIDLAYSSLYASGAFIDIKTRSSTGSDPMNCCSFNGFIRCGGWSDRKDGGYGYFNNYARNEIAMKCILSSSKDAERYVAIYIEARGFPVQLRVPAFCEVIVPTSNFTYKNTTYAWGTANPATDSTDVLTMFDFSLNRIGFYQATTEGNYYIGNGERIVLSNGMNVGDELALYAPKITFSGTIAAGNGVIADGISVSNATFYSRYRVGDKIYGAEFRASENAGQVVVRDPAGTNHQFFNFNKEGTFSAPSGILSSTGIDWNTQHNTVNKFYGIAGQVNTPENNVVYGGIHVGFSGNYATQFAGRGSKFWARSIEGGTIGAWNRIITDQYANFGVPVTITKTGEALSIRLTGTDTSQVGYLACRTDSARLWYVGKGGSAKDVIIHNDMTNSNITVSSDITLRTPNYGGGVFADGSAMVIRRSNNRLFRIENTSYSAKDAIIQLWGNTTGRPTVIECKLPDGFLWYAQENTDGSRYFEVNGPIKARAFNQVSDRDLKDNIAEIPKATESLRKMKGYTYTLKENGMPYAGVIAQEVMEALPEAVSGFTKYTDLEGPTLTGEQLVGEERFYSVDYAAITGLLVQAGRESDSRITALESEVSDLKKQIADLTLVVNSLLANKAQ
Physico-chemical properties
protein length: 1247 AA
molecular weight: 132952,81710 Da
isoelectric point: 6,51303
aromaticity:0,08340
hydropathy:-0,32358
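
The page does not say how aromaticity and hydropathy are computed. A minimal sketch, assuming the common definitions (aromaticity as the fraction of F, W and Y residues; hydropathy as the mean Kyte-Doolittle value, often called GRAVY), is shown below. The function names are illustrative, and the sketch is not confirmed to reproduce the values listed above.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aromaticity(seq: str) -> float:
    """Fraction of aromatic residues (F, W, Y) in the sequence (assumed definition)."""
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy per residue (assumed definition)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First ten residues of the protein sequence above, used only as a short demo.
demo = "MAITKIILQQ"
demo_aromaticity = aromaticity(demo)
demo_gravy = gravy(demo)
```

Passing the full 1247-residue sequence to the same functions gives values that can be compared with the listed aromaticity and hydropathy.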

Domains

Domains [InterPro]
DC_0608 ATT 2–181
IPR030392 CHP 1118–1224
Architecture
ATT 2-181 | STR 182-603 | RBD 661-969 | STR 970-1028 | RBD 1029-1247
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
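
The architecture segments do not cover every residue of the 1247 AA protein. A small sketch, assuming the ranges are 1-based and inclusive, lists the stretches left without a label. The helper name `unmapped_gaps` is hypothetical and not part of any tool named on this page.

```python
def unmapped_gaps(segments, protein_length):
    """Return 1-based inclusive (start, end) ranges not covered by any segment."""
    gaps = []
    next_free = 1  # first residue not yet covered
    for _label, start, end in sorted(segments, key=lambda s: s[1]):
        if start > next_free:
            gaps.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= protein_length:
        gaps.append((next_free, protein_length))
    return gaps


# Segments copied from the Architecture line above.
ARCHITECTURE = [
    ("ATT", 2, 181),
    ("STR", 182, 603),
    ("RBD", 661, 969),
    ("STR", 970, 1028),
    ("RBD", 1029, 1247),
]

gaps = unmapped_gaps(ARCHITECTURE, 1247)
```

Under that assumption the unlabeled stretches are residue 1 and residues 604-660, which presumably correspond to the "Unmapped" legend entry.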

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Layout
Domain          Start  End   Length (AA)  Confidence
N-terminal      1      228   228          0,6675
Central domain  229    428   201          0,5918
C-terminal      429    1247  818          0,6074
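
The listed lengths can be cross-checked against the start and end coordinates. The sketch below assumes 1-based inclusive coordinates, so that length = end - start + 1; the page does not state its convention, and the names used are illustrative.

```python
# Rows copied from the table above: (domain, start, end, listed length).
SEGMENTATION = [
    ("N-terminal", 1, 228, 228),
    ("Central domain", 229, 428, 201),
    ("C-terminal", 429, 1247, 818),
]


def inclusive_length(start, end):
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def length_mismatches(rows):
    """Return (domain, listed, computed) for rows whose listed length differs."""
    return [
        (name, listed, inclusive_length(start, end))
        for name, start, end, listed in rows
        if listed != inclusive_length(start, end)
    ]


mismatches = length_mismatches(SEGMENTATION)
listed_total = sum(row[3] for row in SEGMENTATION)
computed_total = sum(inclusive_length(s, e) for _, s, e, _ in SEGMENTATION)
```

Under the inclusive convention the Central domain and C-terminal lengths each differ by one residue from the listed values (200 and 819 computed, 201 and 818 listed), while both sets of lengths sum to 1247. This suggests the boundary residue at position 428/429 is counted differently in the two columns, but the page gives no way to tell which column is intended.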
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Domain Coloring
N-terminal 1-228
Central 229-428
C-terminal 429-1247

Taxonomy

Phage: Phage NBSal005 [NCBI]
Taxonomy ID: 2991865
Lineage: Uroviricota > Caudoviricetes > Demerecviridae > Tequintavirus > Tequintavirus NBSal005
Host: No host information

Coding sequence (CDS)

GenBank protein accession: YP_009857207.1 [NCBI]
GenBank nucleotide accession: NC_048857.1 [NCBI]
CDS location: range 35794 -> 39537, strand +
CDS
ATGGCTATAACTAAAATAATTCTACAGCAAATGGTCACTATGGACCAGAATAGCATAACTGCAAGTAAATATCCTAAGTATACCATAGTTCTGGGAAATACAATTAGCTCTATTACTGCTGGAGAGCTAACTTCTGCTATAGAGTCCTCTAAAGCATCCGCTGCAGCAGCTAAACAATCAGAGATTAATGCTAAACAGTCGGAGCTAAATGCTAAAGATTCTGAGAATGAAGCAGAAATTTCTGCGGCATCTTCTCAGCAATCTGCAACTCAGTCTGCTTCTTCTGCTACTGCTTCTGCTAATAGCGCTAAAGCTGCAAAAACTTCAGAAACTAATGCAAAAGCTAGTGAAACAGCAGCTAAAACTTCAGAAACTAAGGCAAAAGCTAGTGAAACAGCAGCTAAAACCTCAGAAACCAATGCTAATAGTAGCAAAACTGCTGCGGCTGCATCTGCTTCCGCAGCTAAAACTTCAGAAACTAATGCCAGTGCTTCAGCTGCTGCGGCTAAAACTTCTGAAACCAATGCTAATAACAGCAAAACTGCTGCTGCTAATAGTGCTAATGCTGCTAAAGCATCAGAAATCAATGCTAAAACATCAGAAACTAATGCGGCGGCATCTGCTACTAAAGCTGAAAATGTAGCTTCTGGTATGAAAGCTTCTATAGGTCTTGGGGATTCTCCAAGAGATTGTCCGGATATTTCTGGTAATCCTTCCAATTTCCTGGGATTCCTAAGAATATATGAAACGGCTACAGGTTTTCCTAGTATAGCAGTAGGTGAAACAGTTCTTACCGGATTTATCTCAGGTACTGATGGTGTACCTTCCTTCGCTGGTCTCTTTGTAGGTTCATCTACTAGGACAATATACTCATATCGCTGGAGACCAGAGAGTGGACCCTTATGGACTAAAAATGCTAGAATAGATGATGTAAACAGACTTGTACAGCTCAGCTCCGAAACCCAGCTGTTAAACCCAGGTAATAATGCTAAAATCATTATTACTAGTGGTAAACTTTGGGGAGCTTATGATATAGAGAATAGAGCATATATACCTCTTGCAGTAGGACAAGGAGGTACAGGTGGTAGATCAGCTTCTGAAGCTAGAACTAATCTGGGAATAAATAGACTTGAACAGATAGCTAATGATCAAACAAGATTATACGCTGGTAATAATACAACCTATCTAGAGATTGGTAATAACAGAGCATGGGGGGTTTATAACAGGTCTTCAAATAGCTGGCAACCCTTAGGAATAGCACAGGGTGGTACCGGAGCTATGACAGCTGAAGATGCTCGTACAAACCTAGGAATAGGACGCAGCAGTTCTCCAACTTTTGGTCACTTAAACCTTCTTGTAGAAAATGACTCTGCACAAGCTTCTTCCGGAATACTTAATCAGTACTTACGAGATACTTCTGGAGTACAGAGAGCTCGTAGTAGGATTTACTCGGAAATACGTGGGGATAATAAAGCTTGGTTGACACTACACATACAGTCCGATGCTAATACTAATAAATATGCTGGTTTGAGTATTGATGGCAATTTCCAGATAAATGGTAACTTTATTGGTAATGCTATAAGCTTGTCGGATGTAGCCACTTCTAAAGTAAATCTACAGGTAAATAGGTTCTTACAGAATACAGGTGAAACAGCTGTAGTAAACCACGCTGGTACAGCCCAGATTTTTATTACAGATAATAAAAATTGGGGGGCTTATGATAACGAATTAAAAAGACGCATAGCCCTACCCATATCTCAAGGAGGTACAGGAAGCTTATCTATATCAGAAGCAAAAAATAACCTCCAGATACCTTCTATAGGTGGAGGTGAGTGGTTAACACTTAACGCACCTGCTGGTGTTGAAGACGGAAAATATTATCCTGTCATTATTGATCTTGCTTACAGTTCCCTGTATGCCTCTGGTGCTTTTATTGATATAAAAACACGATCCTCTACAGGCAGTGATCCTATGAACTGTTGTAGCTTTAATGGTTTTAT
TAGGTGTGGAGGTTGGAGTGATCGTAAAGATGGTGGGTACGGATACTTCAATAACTATGCGAGAAATGAGATTGCAATGAAATGTATTCTTTCTTCTTCCAAAGATGCTGAAAGATACGTTGCAATTTATATTGAGGCCCGCGGTTTTCCTGTCCAGTTACGTGTTCCTGCATTCTGTGAGGTAATTGTACCAACGTCAAACTTTACGTATAAAAATACAACTTACGCATGGGGGACCGCTAATCCTGCAACAGATTCAACCGACGTTCTTACTATGTTTGATTTTTCTCTAAACCGTATAGGTTTCTATCAGGCAACAACGGAAGGAAACTATTACATAGGGAACGGTGAGCGTATAGTTCTTTCTAATGGTATGAACGTTGGTGATGAACTAGCTTTATATGCCCCTAAAATTACTTTTAGTGGTACTATAGCAGCTGGTAATGGTGTTATTGCTGACGGTATTTCTGTATCTAATGCCACCTTCTATTCTAGATATAGAGTAGGGGATAAGATATATGGTGCGGAGTTTAGAGCTAGTGAAAATGCTGGTCAGGTTGTTGTTAGGGATCCAGCAGGAACTAACCATCAATTCTTTAACTTCAATAAAGAAGGAACTTTTTCAGCTCCTTCTGGTATTTTATCTTCTACTGGTATAGACTGGAATACACAACATAACACTGTCAATAAGTTTTATGGTATTGCTGGTCAAGTTAATACTCCGGAAAACAATGTTGTATATGGTGGTATCCATGTAGGGTTTAGCGGTAATTATGCTACCCAGTTTGCAGGTCGTGGATCTAAATTCTGGGCTAGGAGTATTGAGGGTGGTACTATTGGGGCATGGAATCGTATAATTACAGATCAGTATGCTAATTTTGGTGTGCCTGTCACAATAACCAAAACTGGTGAGGCCTTATCTATAAGACTTACTGGAACTGATACGTCTCAAGTTGGTTACTTAGCATGTAGAACAGATTCTGCTAGGCTGTGGTATGTTGGGAAAGGTGGATCTGCTAAGGATGTTATCATACATAATGATATGACTAACAGTAATATAACTGTAAGTAGTGACATTACATTAAGAACTCCTAACTATGGGGGAGGAGTTTTTGCTGATGGATCTGCGATGGTTATACGGCGTTCAAATAATCGACTGTTTAGGATTGAAAATACATCATATTCAGCTAAAGATGCTATAATTCAATTATGGGGTAATACTACTGGAAGGCCTACTGTTATTGAATGTAAGTTACCCGATGGTTTCCTATGGTATGCACAAGAGAATACTGATGGTTCCCGTTATTTTGAGGTAAATGGGCCTATTAAAGCGCGTGCATTTAACCAAGTTTCTGATAGGGATCTAAAAGATAACATAGCAGAAATACCTAAGGCTACAGAATCTCTCCGTAAAATGAAAGGATATACATATACCCTTAAAGAAAACGGTATGCCGTACGCGGGGGTTATAGCCCAGGAAGTAATGGAGGCTCTACCTGAAGCTGTAAGTGGATTTACAAAATATACAGATCTTGAAGGGCCTACACTTACTGGAGAGCAACTAGTTGGCGAGGAACGTTTCTATTCTGTTGACTATGCGGCTATAACAGGTCTGTTAGTACAAGCAGGAAGAGAATCTGATAGCAGAATCACAGCCCTAGAATCAGAAGTATCTGATCTTAAAAAGCAAATTGCAGACCTAACGTTAGTAGTTAATTCTCTACTAGCAAATAAGGCACAGTAA
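
The CDS coordinates can be checked against the protein length. The sketch below assumes the range is 1-based and inclusive and that the CDS ends with a stop codon that is not translated (the sequence above ends in TAA). The function names are illustrative.

```python
def cds_length(start, end):
    """Nucleotide length of a 1-based inclusive coordinate range."""
    return end - start + 1


def expected_protein_length(start, end, has_stop=True):
    """Protein length implied by the CDS range, excluding the stop codon."""
    n = cds_length(start, end)
    if n % 3 != 0:
        raise ValueError(f"CDS length {n} is not a multiple of 3")
    codons = n // 3
    return codons - 1 if has_stop else codons


nt_length = cds_length(35794, 39537)
aa_length = expected_protein_length(35794, 39537)
```

With these assumptions the range spans 3744 nucleotides, or 1248 codons including the stop, which matches the 1247 AA protein length listed under the physico-chemical properties.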

Tertiary structure

PDB ID 460fe98f5aeee9ea3d72491eaeedb790800686e671b98543efe1fa09efc270a8
Source ESMFold
Method ESMFold
Resolution 0,5489
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
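
The confidence bands above can be applied to per-residue pLDDT scores with a short helper. This is a sketch only: the legend uses strict inequalities, so the handling of scores that are exactly 90, 70 or 50 is an assumption made here, and the function name is hypothetical.

```python
def plddt_band(score):
    """Map a pLDDT score (0-100) to the confidence band names used in the legend.

    Scores exactly on a boundary (90, 70, 50) fall into the lower band here;
    the legend does not specify this, so treat it as an assumption.
    """
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"


# Example scores chosen for illustration, not taken from the model.
bands = [plddt_band(s) for s in (95.0, 80.0, 60.0, 30.0)]
```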