UniProt accession: S4UT56 [UniProt]
Protein name: Putative tail fiber protein
RBP type Evidence Probability
TF GenBank 1.00
TF Phold 1.00
TSP DepoScope 1.00
TSP RBPdetect 0.91
Protein sequence
MAQDMTSFEQAVDQVIVDSERLHLIVNGNAVDEVVVEDGTTIPTVRKAMLDTLYFKTPPIPWAYGASTTVFNQLYEFKGDTGTQWWYAPAASKSNPVRMPADPSESPNWRLYTDSAVMAKYYAKLNSPRFEGDPRVPTPPMDDKSESIANTEFVVDYVDSIFKAMEGMKVTVGSLVVKGLTELANTIVGGTLTLHGPVNGADSTARFRNLILTANTSTLTFAWSDPKHADWRSTELQPHEVSTHRVIADTITSGKPVANNNDVHFDGLGNNFFDYVYIRGNAMKAATEPTLQVDGTTRVKNLEVTGTVTGITYSVDGTMIYPSYIESTGDALINGGLEVGGSVVIRGTASIQNIALNTLRVNERATFEGEGLTANKGVITELTTTTLTATTANSENCNVTRNLQVNGDVSLNAAGTGTTSVHNLEISGTVTGWLPDFSNVNFVCNGINSSGKITTSQEIEAGKTITAPTFHAGKVDFDLEEVDASSGTWTPNGQASMYVVHAKGDFTIGQWPGTSAEDKPYPFTAVIYVIQDAVGHNVTLHDKYAILSATPVINNKANSVTLLQLTYCGVGDIVDVVIAQR
Physico-chemical properties
protein length: 581 AA
molecular weight: 62510.10350 Da
isoelectric point: 4.81150
aromaticity: 0.07917
hydropathy: -0.15714
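The record does not say which tool produced these values. As a minimal sketch, assuming "aromaticity" means the fraction of F, W and Y residues and "hydropathy" means the mean Kyte-Doolittle score (GRAVY), both quantities can be computed from a sequence as follows. The function names are illustrative, and the short test string is just the first ten residues of the sequence above.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")


def aromaticity(seq: str) -> float:
    """Fraction of aromatic residues (F, W, Y) in the sequence."""
    return sum(aa in AROMATIC for aa in seq) / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues (GRAVY)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    prefix = "MAQDMTSFEQ"  # first ten residues of S4UT56, for illustration only
    print(len(prefix), round(aromaticity(prefix), 5), round(gravy(prefix), 5))
```

Running the same functions on the full 581-residue sequence would be the way to compare against the tabulated aromaticity and hydropathy, under the assumptions stated above.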

Domains

Domains [InterPro]
ID Type Residues
DC_0194 STR 1–579
IPR058969 ATT 6–112
Architecture
ATT 1-112 | STR 113-579

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Start End Length (AA) Confidence
N-terminal 1 203 203 0.9924
Central domain 204 472 270 0.8232
C-terminal 473 581 108 0.9064
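As a consistency check on the segmentation table, the sketch below recomputes each domain length from its Start and End columns, assuming 1-based inclusive coordinates (an assumption; the record does not state the convention). It also checks that the three segments are contiguous and span the full 581-residue protein. Under that assumption the central and C-terminal lengths come out as 269 and 109, one off from the tabulated 270 and 108, while the total still equals 581.

```python
PROTEIN_LENGTH = 581

# (name, start, end) copied from the segmentation table.
DOMAINS = [
    ("N-terminal", 1, 203),
    ("Central domain", 204, 472),
    ("C-terminal", 473, 581),
]


def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive residue range."""
    return end - start + 1


lengths = {name: inclusive_length(start, end) for name, start, end in DOMAINS}

# Each segment should begin right after the previous one ends.
contiguous = all(
    DOMAINS[i][2] + 1 == DOMAINS[i + 1][1] for i in range(len(DOMAINS) - 1)
)

# The first segment starts at residue 1 and the last ends at the final residue.
covers_protein = DOMAINS[0][1] == 1 and DOMAINS[-1][2] == PROTEIN_LENGTH

if __name__ == "__main__":
    for name, length in lengths.items():
        print(f"{name}: {length} AA")
    print("contiguous:", contiguous, "covers protein:", covers_protein)
```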
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

N-terminal 1-203 | Central 204-472 | C-terminal 473-581

Taxonomy

Role Name Taxonomy ID Lineage
Phage Escherichia phage JES2013 [NCBI] 1327956 Uroviricota > Caudoviricetes > Vequintavirinae > Vequintavirus JES2013
Host Escherichia coli O157:H7 [NCBI] 83334 Bacteria > Proteobacteria > Gammaproteobacteria > Enterobacteriales > Enterobacteriaceae > Escherichia
Host Escherichia coli str. K-12 substr. MG1655 [NCBI] 511145 Bacteria > Proteobacteria > Gammaproteobacteria > Enterobacteriales > Enterobacteriaceae > Escherichia

Coding sequence (CDS)

GenBank protein accession: AGM12416.1 [NCBI]
GenBank nucleotide accession: KC690136 [NCBI]
CDS location: range 19172 -> 20917, strand -
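The CDS coordinates can be cross-checked against the protein length. The sketch below assumes the range is 1-based and inclusive and that the CDS ends with a single stop codon (both assumptions, though the listed sequence does start with ATG and end with TAA). The helper name is illustrative.

```python
CDS_START, CDS_END = 19172, 20917  # from the CDS location field
PROTEIN_LENGTH = 581               # from the physico-chemical properties


def expected_protein_length(start: int, end: int) -> int:
    """Residue count implied by an inclusive CDS range with one stop codon."""
    nucleotides = end - start + 1
    codons, remainder = divmod(nucleotides, 3)
    if remainder != 0:
        raise ValueError("CDS length is not a multiple of three")
    return codons - 1  # drop the terminal stop codon


cds_nt = CDS_END - CDS_START + 1
implied_length = expected_protein_length(CDS_START, CDS_END)

if __name__ == "__main__":
    print(f"CDS length: {cds_nt} nt, implied protein length: {implied_length} AA")
```

With these coordinates the CDS is 1746 nt, which is 582 codons, so 581 residues plus the stop codon, matching the reported protein length.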
CDS
ATGGCGCAAGACATGACAAGCTTCGAGCAGGCGGTAGATCAAGTAATTGTTGATTCTGAACGTTTGCACTTGATTGTCAACGGTAACGCTGTGGATGAAGTTGTCGTAGAGGATGGGACCACTATCCCTACGGTACGAAAAGCCATGCTTGACACCCTTTATTTTAAAACGCCGCCGATCCCTTGGGCGTATGGTGCATCCACAACAGTTTTTAACCAGTTGTATGAGTTTAAAGGAGATACAGGCACTCAGTGGTGGTACGCCCCAGCAGCATCAAAATCTAACCCAGTTAGGATGCCAGCAGATCCTTCAGAATCTCCGAACTGGAGATTATATACCGATTCTGCAGTAATGGCAAAATACTACGCAAAACTTAACAGTCCAAGGTTCGAAGGTGACCCGAGAGTACCTACACCTCCAATGGACGATAAGTCCGAGTCTATAGCAAACACAGAGTTTGTTGTTGACTATGTGGACAGTATATTCAAAGCCATGGAGGGGATGAAAGTAACTGTAGGGTCTTTGGTGGTAAAAGGTCTTACAGAACTCGCTAACACTATAGTTGGTGGCACCCTTACCTTACATGGACCTGTTAATGGGGCAGATTCTACTGCACGCTTTAGGAATCTAATCCTCACGGCAAATACTTCTACACTTACTTTTGCGTGGAGCGATCCTAAGCATGCAGACTGGAGAAGTACAGAGCTGCAACCTCATGAAGTGTCCACCCACAGGGTTATAGCTGACACTATAACTTCTGGGAAACCAGTGGCTAATAATAACGATGTGCATTTTGATGGTCTGGGTAATAACTTTTTTGACTACGTGTACATTCGTGGTAATGCCATGAAGGCAGCAACCGAACCGACATTGCAGGTCGATGGGACCACCAGGGTTAAGAACCTTGAGGTGACAGGTACCGTTACAGGGATAACATACTCTGTCGATGGCACCATGATCTACCCTAGCTATATTGAAAGCACTGGTGATGCATTGATAAATGGCGGTTTGGAGGTTGGCGGGTCTGTAGTTATTCGAGGTACGGCCTCTATCCAAAACATAGCTCTGAATACCTTAAGAGTCAATGAGCGTGCAACTTTTGAAGGAGAAGGGCTTACAGCCAACAAAGGTGTAATTACCGAGCTGACAACAACCACTTTAACTGCAACAACTGCAAACTCTGAAAACTGCAACGTTACCCGCAACTTGCAGGTAAATGGAGATGTCAGTTTAAACGCAGCAGGGACTGGCACCACCTCTGTTCACAACCTTGAAATATCCGGCACGGTGACTGGGTGGTTGCCTGACTTCTCTAATGTCAACTTTGTCTGTAATGGCATTAACTCCAGTGGCAAGATAACCACCTCTCAAGAAATTGAGGCGGGTAAAACCATTACTGCCCCTACTTTCCACGCAGGAAAGGTAGATTTTGATTTAGAGGAGGTTGACGCGTCCAGTGGGACATGGACACCTAACGGGCAGGCCAGTATGTATGTTGTTCACGCGAAAGGGGATTTTACAATAGGACAGTGGCCTGGGACATCAGCGGAAGATAAACCTTATCCATTCACTGCGGTTATCTATGTAATTCAGGATGCTGTAGGTCACAACGTGACTTTGCACGATAAGTATGCTATCCTGTCGGCAACACCTGTTATTAACAACAAGGCTAACAGTGTTACCTTGCTGCAATTAACGTATTGTGGTGTTGGTGATATCGTAGACGTAGTAATTGCACAACGTTAA

Tertiary structure

PDB ID: 17610cf72202e24c71a71fe6f97003b2596447a52437a19229cf24ef36b2db9b
Source: ESMFold
Method: ESMFold
Resolution: 0.6441
Oligomeric State: monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
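The confidence bands above can be applied to per-residue pLDDT scores with a small helper. This is a sketch with a hypothetical function name; the legend leaves the exact boundary values (90, 70, 50) unassigned, so the code assumes a score sitting exactly on a boundary falls into the lower band.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to the confidence band used in the legend.

    Assumption: scores exactly equal to 90, 70 or 50 go to the lower band.
    """
    if not 0 <= score <= 100:
        raise ValueError("pLDDT is expected to lie between 0 and 100")
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"


if __name__ == "__main__":
    # Illustrative scores only; these are not taken from the model of S4UT56.
    for value in (95.2, 81.0, 64.4, 42.7):
        print(value, plddt_band(value))
```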