GenBank accession: AGR48337.1 [GenBank]
Protein name: tail fiber protein
RBP type:
TF (Evidence: GenBank, Probability: 1.00)
TSP (Evidence: RBPdetect, Probability: 0.80)
TF (Evidence: RBPdetect2, Probability: 0.96)
Protein sequence
MGKGGGKGHTPREAKDNLKSTQMMSVIDAIGEGPVEGPVKGLQSILVNKTPLTDTDGNPVIHGVTAVWRAGEQEQTPPEGFESSGAETALGVEVTKAKPVTRTITSANIDRLRVTFGVQSLVETTSKGDRNPSSVRLLIQLERNGNWVTEKDVTINGKTTSQYLTSVILNNLPERPFNIRMVRVTADSTTDQLQNRTLWSSYTEIIDVKQCYPNTAIVGLQVDAEQFGGQQMVVNYHIRGRIIQVPSNYDPEKRTYSGIWDGSLKPAYSNNPAWCLWDMLTHPRYGMGKRLGASDVDKWALYAIGQYCDQTVPDGFGGTEPRMTFNAYLSQQRKVWDVLGDFCSAMRCMPVWNGQTLTFVQDRPSDVVWPYTNSDVVVDDNGVGFRYSFSALKDRHTAVEVNYTDPQNGWQTSTELVEDPDAILRYGRNLLKVDAFGCTSRGQAHRAGLWVIKTELLETQTVDFTLGSQGLRHTPGDIIEICDNDYAGTLTGGRILSIDAASRTLTLDREVTLPEAGTSTVNLINGSGKPVRVDITAHPAPDRIQVSVLPDGMETYGVWGLSLPSLRRRLFRCVSIRENTDGTFAITAVQHVPEKEAIVDNGARFEPMSGSLNSVIPPAVQHLTVEVSASDGQYLALAKWDTPRVVKGVRFSLRLTSGSGENSRLVTSAITADTEHRFSGLPLGEYTLTVRAINSYGQQGEPATTTFRINAPAAPARIELTPGYFQITAVPVLAVYDPTVQYEFWFSEKRITDTAQVETSARYLGTGSQWSVSGPHIRPGKDFWFYVRSVNLVGKSAFVEASGRASNDAEGYLDFFRGEIGKTHLAQGMWELIDNSQLDDEMAEMKTTITETRNEITQTVSKTLEDQSATIQQIQRVQTDTNNDLAALYMLKVQKTKNGIPYVAGIGAGIEDADGQPLSNILLQADRIAMINPENGNTTPLFVAQGNQLFMNDVFLKRLFAVSITSSGNPPTFSLTPEGRLTARNADISGHISANSGTLNNVVIAENCTINGTLKAENIIGDLVKCAGVAFPVDGSYLANGTRTLTVYDDHSFDRQIIIPPIIYVGSKQESRTSNDIWTECFLHVDQNGRRIYSGRSVTEPGIFSGIIDMPAGGGHITLSFTVSSRRQNNSWGSSRISNLQAIVVKKNSAGISIR
Physico-chemical properties
protein length: 1155 AA
molecular weight: 126490.66280 Da
isoelectric point: 5.67733
aromaticity: 0.07619
hydropathy: -0.30173
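
The record does not state how the aromaticity and hydropathy values were computed. A minimal sketch, assuming aromaticity is the fraction of F, W and Y residues and hydropathy is the mean Kyte-Doolittle value (GRAVY); the function names and the 20-residue prefix used as input are illustrative only:

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")


def aromaticity(seq: str) -> float:
    """Fraction of aromatic residues (F, W, Y) in the sequence."""
    return sum(aa in AROMATIC for aa in seq) / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First 20 residues of AGR48337.1, used only to exercise the functions.
# Assumption: the record's values come from the full 1155-residue sequence.
prefix = "MGKGGGKGHTPREAKDNLKS"
print(len(prefix), aromaticity(prefix), round(gravy(prefix), 5))
```

Passing the full protein sequence instead of the prefix should give values comparable to those listed above if the record uses the same definitions.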

Domains

Domains [InterPro]
IPR053171 Unmapped 1–858
DC_0023 STR 1–1153
IPR055383 STR 609–712
IPR036116 STR 616–718
IPR003961 STR 617–706
IPR003961 STR 619–714
Architecture
STR 1-85 | ATT 86-207 | STR 208-329 | ATT 330-496 | STR 497-713 | ATT 714-816 | STR 817-1154 |
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
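
The architecture string can be turned into structured segments for downstream checks. A minimal sketch, assuming the `LABEL start-end | ...` layout shown above with 1-based inclusive coordinates; `parse_architecture` and `is_contiguous` are hypothetical helper names:

```python
import re

ARCHITECTURE = (
    "STR 1-85 | ATT 86-207 | STR 208-329 | ATT 330-496 | "
    "STR 497-713 | ATT 714-816 | STR 817-1154 |"
)


def parse_architecture(text):
    """Return a list of (label, start, end) tuples from the pipe-separated string."""
    segments = []
    for part in text.split("|"):
        part = part.strip()
        if not part:  # skip the empty piece after the trailing pipe
            continue
        match = re.fullmatch(r"(\w+)\s+(\d+)-(\d+)", part)
        if match is None:
            raise ValueError(f"unrecognized segment: {part!r}")
        segments.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return segments


def is_contiguous(segments):
    """True if every segment starts one residue after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(segments, segments[1:]))


segments = parse_architecture(ARCHITECTURE)
covered = sum(end - start + 1 for _, start, end in segments)
print(len(segments), is_contiguous(segments), covered)
```

The segments tile residues 1 to 1154 without gaps, one residue short of the 1155 AA protein length.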

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central, and C-terminal.

Domain Layout
Domain Start End Length (AA) Confidence
N-terminal 1 1013 1013 0.9083
Central domain 1014 1144 132 0.2028
C-terminal 1145 1155 10 0.9862
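
A quick consistency check on the segmentation table. This is a sketch that assumes Start and End are 1-based inclusive residue positions; the record does not state its convention. Under that assumption the lengths come out as 1013, 131 and 11, which still sum to 1155 but differ by one residue from the listed 132 and 10, so the central/C-terminal boundary may be defined differently in the source.

```python
# (name, start, end) copied from the table; coordinates assumed 1-based inclusive.
DOMAINS = [
    ("N-terminal", 1, 1013),
    ("Central domain", 1014, 1144),
    ("C-terminal", 1145, 1155),
]
PROTEIN_LENGTH = 1155


def inclusive_length(start, end):
    """Number of residues in a closed interval [start, end]."""
    return end - start + 1


def covers_protein(domains, length):
    """True if the domains tile residues 1..length with no gaps or overlaps."""
    expected_start = 1
    for _, start, end in domains:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


lengths = {name: inclusive_length(start, end) for name, start, end in DOMAINS}
print(lengths, covers_protein(DOMAINS, PROTEIN_LENGTH))
```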
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Domain Coloring: N-terminal 1-1013, Central 1014-1144, C-terminal 1145-1155

Taxonomy

  Name Taxonomy ID Lineage
Phage Escherichia phage 1720a-02 [NCBI] 1115653 Viruses > Duplodnaviria > Heunggongvirae > Uroviricota > Caudoviricetes
Host Escherichia coli [NCBI] 562 cellular organisms > Bacteria > Pseudomonadati > Pseudomonadota > Gammaproteobacteria > Enterobacterales

Coding sequence (CDS)

GenBank protein accession: AGR48337.1 [NCBI]
GenBank nucleotide accession: KF030445.1 [NCBI]
CDS location: range 4793 -> 8260, strand -
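
The CDS location can be checked against the protein length, and used to pull the coding strand out of the nucleotide record. A minimal sketch, assuming 1-based inclusive coordinates on the forward strand (the GenBank convention); `extract_cds` is a hypothetical helper and the short genome string in the last line is a toy input, not KF030445.1:

```python
CDS_START, CDS_END, STRAND = 4793, 8260, "-"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def span_length(start, end):
    """Nucleotides covered by a closed interval [start, end]."""
    return end - start + 1


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_cds(genome, start, end, strand):
    """Slice [start, end] from the genome; reverse-complement on the minus strand."""
    region = genome[start - 1:end]
    return reverse_complement(region) if strand == "-" else region


nt_length = span_length(CDS_START, CDS_END)
codons = nt_length // 3
# 3468 nt = 1156 codons: 1155 amino acids plus one stop codon.
print(nt_length, codons, codons - 1)
print(extract_cds("AAACCCGGG", 4, 6, "-"))  # toy genome, minus strand
```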
CDS
GTGGGTAAAGGGGGCGGCAAGGGGCACACACCGCGTGAGGCGAAGGACAATCTCAAGTCAACGCAGATGATGAGCGTGATTGATGCCATTGGTGAGGGACCGGTGGAAGGCCCGGTGAAGGGACTGCAGAGTATTCTGGTGAACAAAACCCCGCTGACGGACACGGACGGTAATCCCGTGATACACGGTGTGACCGCCGTCTGGCGTGCCGGGGAGCAGGAGCAGACACCGCCGGAAGGCTTTGAGTCCTCCGGGGCGGAAACCGCACTGGGCGTGGAGGTGACGAAGGCAAAGCCGGTGACGCGCACCATCACGTCAGCGAACATTGACCGTCTGCGGGTCACCTTCGGGGTGCAGTCACTGGTGGAGACCACCTCAAAGGGTGACCGTAATCCCTCTTCTGTCCGCCTGCTGATTCAGCTTGAGCGTAACGGTAACTGGGTGACGGAGAAGGATGTCACCATTAACGGCAAGACCACCTCGCAGTACCTGACGTCGGTGATTCTGAATAATCTCCCTGAGCGCCCCTTTAACATCCGGATGGTCAGGGTGACGGCGGACAGTACCACGGACCAGCTGCAGAACAGAACGCTGTGGTCGTCATACACCGAAATCATCGATGTGAAACAGTGCTACCCGAACACGGCCATTGTGGGGCTGCAGGTGGATGCGGAGCAGTTCGGTGGCCAGCAGATGGTGGTGAACTACCATATCCGCGGCCGCATCATTCAGGTGCCGTCAAACTATGACCCGGAAAAACGCACCTACAGCGGTATCTGGGACGGGAGTCTGAAACCGGCATACAGCAATAACCCGGCCTGGTGCCTGTGGGACATGCTGACCCACCCGCGCTACGGGATGGGAAAACGCCTGGGGGCCTCGGACGTGGACAAGTGGGCGCTGTATGCCATCGGGCAGTACTGTGACCAGACGGTCCCGGATGGTTTCGGGGGCACAGAGCCGCGGATGACCTTTAATGCGTACCTGTCACAGCAGCGTAAGGTGTGGGATGTCCTGGGGGATTTCTGCTCGGCGATGCGCTGTATGCCGGTATGGAACGGCCAGACGCTGACGTTCGTTCAGGACCGCCCGTCGGATGTGGTGTGGCCGTACACCAACAGCGATGTGGTGGTGGATGATAACGGCGTGGGGTTCCGCTACAGCTTCAGTGCCCTGAAGGACCGGCACACGGCGGTGGAGGTGAATTACACCGACCCGCAGAACGGCTGGCAGACTTCCACGGAACTGGTGGAAGACCCGGACGCCATCCTGCGCTACGGGCGCAACCTGCTGAAGGTGGATGCGTTCGGCTGTACCAGCCGCGGTCAGGCCCACCGTGCCGGACTGTGGGTGATAAAGACCGAACTGCTGGAAACGCAGACGGTGGATTTCACGCTCGGGTCTCAGGGGCTGCGTCACACACCCGGTGACATCATTGAAATCTGTGATAACGACTATGCCGGGACCCTGACCGGCGGACGTATCCTGTCCATTGATGCCGCCAGCCGTACCCTGACGCTGGACCGTGAGGTGACACTGCCGGAAGCAGGGACATCGACGGTGAACCTGATTAACGGCAGCGGTAAGCCGGTGCGCGTGGACATCACTGCACACCCCGCCCCTGACCGGATACAGGTCAGCGTCCTGCCTGATGGCATGGAGACATACGGTGTGTGGGGACTCTCCCTGCCGTCACTGCGTCGTCGCCTGTTCCGCTGTGTTTCCATCCGGGAAAACACGGACGGCACCTTTGCCATCACGGCGGTGCAGCATGTGCCGGAAAAAGAAGCCATTGTGGATAACGGGGCCCGCTTTGAGCCGATGTCCGGCTCACTGAACAGCGTCATCCCGCCGGCAGTGCAGCACCTCACGGTGGAGGTGAGTGCCTCAGACGGCCAGTATCTGGCGCTGGCGAAATGGGACACGCCGCGGGTGGTGAAGGGCGTGCGCTTCAGTCTGCGCCTGACCAGTGGCAGTGGTGAAAACAGCCGCCTGGTGAC
CAGCGCCATCACTGCCGACACGGAGCACCGTTTCAGTGGCCTGCCGCTCGGGGAATACACCCTGACGGTCAGGGCGATAAACAGCTACGGCCAGCAGGGCGAACCTGCCACCACCACATTCCGGATTAATGCACCGGCGGCACCGGCCAGAATTGAACTGACGCCGGGGTATTTTCAGATAACAGCAGTACCGGTGCTGGCGGTGTATGACCCGACGGTACAGTATGAATTCTGGTTCTCAGAAAAACGCATCACGGACACGGCACAGGTGGAAACCTCAGCCCGTTATCTGGGTACCGGCAGCCAGTGGAGCGTCTCCGGCCCGCACATCAGGCCGGGGAAGGATTTCTGGTTTTATGTGCGCAGCGTCAACCTGGTGGGGAAATCTGCGTTTGTGGAAGCCAGTGGCCGGGCGAGCAATGATGCGGAAGGGTATCTGGACTTTTTCAGAGGAGAAATCGGGAAGACACATCTGGCACAGGGGATGTGGGAGCTGATTGATAACAGCCAGCTTGATGATGAGATGGCAGAGATGAAGACCACCATCACAGAAACCCGCAATGAAATCACACAGACGGTCAGTAAAACCCTGGAAGACCAGAGTGCCACCATACAGCAGATACAGCGGGTGCAGACAGACACAAATAACGACCTGGCTGCGCTGTACATGCTGAAGGTGCAGAAAACGAAAAACGGCATTCCGTATGTTGCCGGTATAGGTGCGGGGATTGAGGATGCTGATGGCCAGCCCCTGAGCAATATTCTGCTGCAGGCGGACCGTATCGCGATGATTAACCCGGAGAACGGCAACACCACGCCGCTGTTTGTGGCGCAGGGGAATCAGCTGTTTATGAACGACGTGTTCCTGAAGCGACTGTTTGCAGTGAGCATCACCTCGTCCGGCAATCCCCCGACGTTCTCCCTGACGCCGGAGGGCAGGCTGACGGCCCGCAATGCGGACATCAGCGGACATATCAGTGCGAACTCGGGCACGCTCAATAATGTCGTGATAGCGGAGAACTGCACGATAAATGGCACGCTGAAAGCGGAGAACATTATTGGTGATCTTGTGAAATGTGCAGGGGTGGCTTTTCCGGTGGATGGTAGTTACCTTGCGAACGGTACACGAACGCTGACGGTGTATGACGATCACAGCTTTGACCGGCAGATTATAATCCCGCCGATAATCTATGTCGGGTCAAAACAGGAATCCCGCACCAGTAATGACATCTGGACAGAGTGCTTCCTGCATGTTGATCAGAATGGACGCCGGATTTATTCAGGCAGGTCAGTGACAGAGCCGGGAATTTTCAGCGGGATCATCGATATGCCAGCTGGCGGTGGTCATATCACCCTGAGTTTTACCGTGAGCTCACGGCGTCAGAATAACAGTTGGGGCAGTTCACGAATCAGTAACCTTCAGGCGATAGTGGTGAAGAAAAACAGCGCGGGGATCAGCATCCGCTGA
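
The CDS begins with GTG rather than ATG, while the protein sequence begins with M. A minimal translation sketch that reproduces this, assuming the bacterial genetic code (NCBI table 11), in which an alternative start codon in the first position is read as methionine; only ATG, GTG and TTG are treated as starts here, which is a simplification:

```python
from itertools import product

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (same assignments in NCBI tables 1 and 11).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumption: subset of table 11 start codons


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if i == 0 and codon in START_CODONS:
            protein.append("M")  # initiator codon is read as methionine
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First ten codons and last four codons of the CDS shown above.
print(translate("GTGGGTAAAGGGGGCGGCAAGGGGCACACA"))
print(translate("AGCATCCGCTGA"))
```

The two fragments correspond to the start (MGKGGGKGHT) and end (SIR) of the protein sequence in this record.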

Genome Context

Tertiary structure

PDB ID: f3393bb5a28538091de00940037043c5682d54850d08bc7abeb7bf0e458535d5
Source: ESMFold
Method: ESMFold
Resolution: 0.8113
Oligomeric State: monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
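
The confidence bands can be applied to per-residue pLDDT scores with a small helper. This is a sketch: the legend uses strict inequalities and leaves the exact values 90, 70 and 50 unassigned, so placing them in the lower band is an assumption, and `plddt_band` is a hypothetical name.

```python
def plddt_band(score):
    """Map a pLDDT score (0-100) to the confidence band used in the legend."""
    if not 0 <= score <= 100:
        raise ValueError(f"pLDDT must be between 0 and 100, got {score}")
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"


# Illustrative scores only, not taken from the model for this protein.
for value in (95.2, 81.1, 60.0, 32.5):
    print(value, plddt_band(value))
```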

Literature

Title | Authors | Date | PMID | Source
Prophage sequence of genetically marked E. coli phage Phi1720a-02 containing a deletion in the stx2dact Shiga toxin gene | Osburne, M.S., Tai, A. and Leong, J. | 2021 | | GenBank