UniProt accession
S4USV0 [UniProt]
Protein name
Putative phage host specificity protein
RBP type Evidence Probability
TF GenBank 1.00
TF Phold 1.00
TSP RBPdetect 0.80
TF RBPdetect2 0.94
Protein sequence
MPIAMIDNIVASARSSSYDWNASKAALADSLNTVTTRTLDKIFEGNKSYRSQQDRQKLLRSSAAPCSVVYGKTRTSGLLAFLEQDRDRTLHCAIVLANHPLEGVEDILIDGNPISSYGDLVSWELHNDRKTSDPFMGTHCPSWSPDMIGRGISWLRASFRFDPNKFPFGLPNVTLVKVGKKCYDPRISKEVYTNNAALVILDYLRTYLKCPDETINWESFKEAANICDEAVKNADGTSERRYTINGEFDMDEAPASIMAEMLKACGADLSYVAGKYGLLVGAYYGPATMTLSEDCVCGEVKIYPEASFDKRSNTITGRFTSPTKGYSETDFPSVFVPEWIEKDGERKIIDIDYRFVTSPYQAQRVSAIFLRRARAGRIIEVTCNMRGFKFKPGRYVTMDLPSIGIVGQEMRVLEWEFTKKGGVKVKLRQDAKEWNDATGQLPDSGDVDIPISPSGVAQPQNFRYSVLQAGEVTHGVLAWDNVGTYAQNIVQVRKNGEIVWTAQTVEQFVRVEGLTKGSYTATVVATSYKGGVSPEAYCEFNIQAPEAPVSVEVKQGYFAITLIPHSRDLASVSTQYDFWTSGMTRLPDTSDATVTSKATRMGVGSTWTSEGLQNDKIYYWYIRTTKAFGSSAFVECAAHCYTSIEDLMPQIDAEFKKTETYKELTANLDNAVKEMDKNVGAVNDRVTSIFEELGDTIEGVVRETTEKFTGVNGEISALNSKLVAAQQSFDDKLAAESGRLGSLIETTNRSTVDLVNRKTQALSEQITATQGTLREELKNTESKLNSSIQETNKATTDLLNKTTEAIKQDLVAATGKITKLEDDVQKEVANLQASIQETNSTTVDLVNNTATAIRQELTSAKQEIVDEMGNIDELRATVSNTSKAVTTLEGKIDAQWGAKIQVDSHGNKYVAGIQLGMEGSGGQVQSYFMVSANNFAVYNPGNGTATLAFAIKNNQAFLKDAFIENGTISSAKIAQEISSNNYDSNGYHNYGWYINKNGHAQFMDVWVKGNINASSGNFTGAVNATSGTFRGDVYANNGSFRGTIDATGGTFRGRVEASVIRANQFEGAIVAHRTYGDCAPVYNSQQRVCRWRWRYVDNVSGQGKNVTFFFKLNGTLASSQLNVWIAGHQILAGKKYFNDNNGMCAVGITGLGEQTIDIVIEIYTPWSTSSVTGVTISCPTVIVSRSNSSFQGPWNESHD
Physico-chemical properties
protein length: 1199 AA
molecular weight: 131805.03450 Da
isoelectric point: 5.49863
aromaticity: 0.09091
hydropathy: -0.35004
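
As a rough illustration of how length, aromaticity and hydropathy values like those listed here can be derived from a sequence, the following is a minimal sketch. It assumes aromaticity is the fraction of F, W and Y residues and hydropathy is the mean Kyte-Doolittle value (GRAVY); the page does not state which definitions it uses, and the function name `sequence_properties` is hypothetical.

```python
# Kyte-Doolittle hydropathy scale; assumed here, the page does not name its scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")


def sequence_properties(seq: str) -> dict:
    """Return length, aromatic fraction and mean hydropathy of a protein sequence."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    n = len(seq)
    return {
        "length": n,
        "aromaticity": sum(aa in AROMATIC for aa in seq) / n,
        "hydropathy": sum(KYTE_DOOLITTLE[aa] for aa in seq) / n,
    }


if __name__ == "__main__":
    # First 20 residues of the protein sequence, used only as a short example.
    print(sequence_properties("MPIAMIDNIVASARSSSYDW"))
```

Passing the full 1199-residue sequence instead of the 20-residue prefix gives values that can be compared with the listed ones, provided the same definitions were used.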

Domains

Domains [InterPro]
DC_0023 STR 1–1151
IPR053171 Unmapped 470–1039
G3DSA:1.20.120.20 STR 641–791
Coil Unmapped 774–794
Architecture
STR 1-546 | ATT 547-651 | STR 652-1151
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
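
The architecture line can be read programmatically. The sketch below parses a string of the form `LABEL start-end | LABEL start-end` and reports which residues of the protein fall outside every segment. It assumes 1-based inclusive coordinates; `Segment`, `parse_architecture` and `uncovered` are hypothetical names introduced for illustration.

```python
import re
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    label: str
    start: int  # 1-based, inclusive (assumed)
    end: int    # 1-based, inclusive (assumed)


def parse_architecture(text: str) -> List[Segment]:
    """Parse 'STR 1-546 | ATT 547-651 | ...' into Segment records."""
    segments = []
    for part in text.split("|"):
        part = part.strip()
        if not part:  # tolerate a trailing separator
            continue
        match = re.fullmatch(r"(\w+)\s+(\d+)\s*[-–]\s*(\d+)", part)
        if match is None:
            raise ValueError(f"cannot parse segment: {part!r}")
        segments.append(Segment(match.group(1), int(match.group(2)), int(match.group(3))))
    return segments


def uncovered(segments: List[Segment], protein_length: int) -> List[Tuple[int, int]]:
    """Return the 1-based inclusive ranges not covered by any segment."""
    gaps = []
    position = 1
    for seg in sorted(segments, key=lambda s: s.start):
        if seg.start > position:
            gaps.append((position, seg.start - 1))
        position = max(position, seg.end + 1)
    if position <= protein_length:
        gaps.append((position, protein_length))
    return gaps


if __name__ == "__main__":
    arch = parse_architecture("STR 1-546 | ATT 547-651 | STR 652-1151 |")
    print(arch)
    print(uncovered(arch, 1199))
```

For this record the three segments are contiguous from residue 1 to 1151, so the only range without an architecture label is 1152 to 1199.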

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Start End Length (AA) Confidence
N-terminal 1 1020 1020 0.8093
Central domain 1021 1188 169 0.3633
C-terminal 1189 1199 10 0.9812
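
The relationship between the Start, End and Length columns can be checked with a few lines of code. This sketch assumes Start and End are 1-based and inclusive, which the page does not state; under that assumption a segment's length is `end - start + 1`. The helper names are hypothetical.

```python
from typing import List, Tuple

# (domain, start, end, listed length) copied from the table above.
ROWS = [
    ("N-terminal", 1, 1020, 1020),
    ("Central domain", 1021, 1188, 169),
    ("C-terminal", 1189, 1199, 10),
]


def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive range (assumed convention)."""
    return end - start + 1


def length_mismatches(rows) -> List[Tuple[str, int, int]]:
    """Return (domain, listed, computed) for rows where the two lengths differ."""
    return [
        (name, listed, inclusive_length(start, end))
        for name, start, end, listed in rows
        if inclusive_length(start, end) != listed
    ]


if __name__ == "__main__":
    for name, listed, computed in length_mismatches(ROWS):
        print(f"{name}: listed {listed}, computed {computed}")
```

Under the inclusive convention the central and C-terminal rows each differ from the listed length by one residue (168 and 11 computed), while the listed lengths still sum to 1199. This suggests the boundary between those two domains may be reported under a different convention, but the page does not say which value is authoritative.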
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).


Taxonomy

Name Taxonomy ID Lineage
Phage Escherichia phage JES2013 [NCBI] 1327956 Uroviricota > Caudoviricetes > Vequintavirinae > Vequintavirus JES2013
Host Escherichia coli O157:H7 [NCBI] 83334 Bacteria > Proteobacteria > Gammaproteobacteria > Enterobacteriales > Enterobacteriaceae > Escherichia
Host Escherichia coli str. K-12 substr. MG1655 [NCBI] 511145 Bacteria > Proteobacteria > Gammaproteobacteria > Enterobacteriales > Enterobacteriaceae > Escherichia

Coding sequence (CDS)

GenBank protein accession
AGM12425.1 [NCBI]
GenBank nucleotide accession
KC690136 [NCBI]
CDS location
range 28680 -> 32279
strand -
CDS
ATGCCAATTGCAATGATTGATAACATCGTTGCTTCTGCAAGATCTTCAAGTTATGACTGGAACGCATCCAAGGCAGCTCTGGCTGACAGCCTGAACACTGTTACCACCAGAACTCTGGATAAGATTTTTGAGGGCAACAAATCTTACAGAAGTCAGCAAGATAGACAAAAGCTTCTCAGATCTTCTGCTGCACCTTGCTCTGTAGTGTATGGTAAGACACGTACGTCTGGATTGCTTGCGTTTTTGGAGCAGGACAGAGACAGAACCCTCCATTGTGCTATTGTTCTTGCCAATCACCCTCTGGAAGGTGTAGAAGATATACTTATCGACGGTAATCCTATTTCCTCGTATGGAGATCTGGTATCGTGGGAGCTACATAACGACAGAAAAACATCTGATCCTTTCATGGGTACACACTGCCCATCATGGTCACCCGACATGATAGGTAGAGGGATCAGTTGGCTACGTGCCAGCTTCAGGTTTGACCCTAACAAGTTTCCTTTTGGGTTGCCAAACGTTACACTTGTCAAGGTTGGTAAAAAATGCTATGATCCTCGTATCAGTAAAGAGGTGTATACCAATAACGCGGCCTTGGTGATTCTAGACTATTTAAGAACGTACCTTAAATGTCCTGACGAAACCATCAACTGGGAGTCCTTCAAGGAGGCTGCTAACATATGCGACGAGGCGGTAAAAAACGCAGACGGAACCAGTGAGCGCCGATACACCATTAATGGCGAGTTTGACATGGACGAGGCTCCGGCAAGCATTATGGCAGAAATGCTGAAAGCTTGTGGCGCAGATCTCAGCTATGTAGCTGGTAAGTATGGTTTGCTGGTTGGTGCATACTATGGCCCGGCAACAATGACGCTGAGTGAGGATTGTGTCTGTGGTGAGGTCAAGATCTATCCGGAAGCCTCATTTGACAAAAGATCCAACACAATAACTGGCAGATTCACTAGTCCGACCAAAGGGTATTCTGAAACAGATTTTCCATCAGTGTTTGTCCCAGAGTGGATAGAGAAGGATGGAGAAAGAAAGATAATCGATATAGACTATCGCTTTGTTACCAGTCCTTATCAAGCTCAACGTGTTTCTGCAATCTTCCTAAGACGTGCCAGAGCTGGCAGGATTATTGAAGTCACCTGCAATATGCGAGGTTTCAAATTTAAACCTGGTCGTTACGTTACAATGGACCTCCCAAGTATTGGTATAGTGGGCCAAGAAATGAGAGTTCTTGAATGGGAGTTTACCAAAAAGGGTGGTGTCAAGGTAAAACTTCGTCAAGATGCTAAAGAGTGGAATGATGCCACAGGGCAACTTCCGGATTCTGGCGATGTAGATATTCCGATATCTCCATCTGGAGTAGCTCAACCGCAAAACTTCAGATATTCCGTTCTCCAAGCTGGAGAAGTAACTCACGGTGTTTTGGCTTGGGACAACGTTGGGACTTATGCTCAAAACATTGTGCAGGTCAGGAAAAACGGAGAGATTGTCTGGACAGCGCAGACAGTTGAGCAGTTTGTCCGTGTTGAAGGTCTGACTAAAGGTTCCTACACAGCCACTGTTGTTGCAACATCTTATAAAGGTGGTGTATCTCCAGAAGCATACTGTGAGTTTAACATTCAGGCACCTGAGGCTCCGGTTTCTGTAGAAGTTAAACAGGGATACTTTGCTATTACCCTGATCCCGCACAGCAGAGACTTGGCCAGTGTAAGCACCCAATATGACTTTTGGACATCTGGGATGACAAGATTGCCAGATACCAGTGATGCAACCGTTACTTCAAAAGCCACTCGTATGGGTGTTGGTTCAACTTGGACATCTGAAGGTCTGCAGAACGATAAGATTTATTATTGGTATATCAGGACAACAAAGGCTTTTGGTTCATCTGCTTTTGTGGAGTGTGCAGCCCACTGTTACACATCCATAGAAGATCTTATGCCGCAGATTGATGCTGAGTTCAAAAAGACAGAGACTTATAAAGAGCTTACCGCCAA
TCTCGATAATGCAGTCAAGGAAATGGATAAAAATGTTGGTGCAGTCAATGACCGTGTTACTAGCATTTTTGAAGAGCTTGGAGATACAATAGAGGGTGTTGTCAGAGAGACGACAGAAAAATTTACAGGGGTTAATGGAGAAATTTCTGCCCTTAACAGTAAGCTGGTAGCAGCTCAGCAATCCTTTGACGATAAACTTGCAGCAGAGTCTGGTAGGCTTGGGTCACTAATCGAGACAACAAATAGATCAACCGTTGATCTTGTTAATCGTAAAACTCAAGCATTGAGTGAGCAAATAACTGCAACTCAAGGTACTCTGCGGGAAGAGTTGAAGAACACTGAGAGCAAACTTAACTCCTCTATTCAAGAAACCAACAAGGCAACAACCGACCTGCTAAATAAGACAACGGAAGCTATTAAGCAGGATCTGGTGGCAGCTACAGGGAAGATCACCAAGCTGGAAGATGATGTGCAGAAAGAGGTTGCCAATCTGCAAGCATCTATTCAAGAAACTAATAGTACAACAGTTGATCTTGTTAATAATACAGCCACAGCGATCCGTCAAGAACTTACCTCTGCAAAACAAGAAATTGTTGACGAAATGGGTAATATTGATGAGCTTAGGGCTACAGTATCTAATACCTCTAAAGCTGTAACAACACTTGAGGGTAAAATAGATGCTCAGTGGGGGGCTAAGATTCAGGTTGACAGCCATGGTAATAAATATGTGGCAGGTATCCAGCTGGGCATGGAGGGGTCTGGAGGACAAGTTCAATCATACTTTATGGTTAGTGCAAACAACTTTGCGGTGTACAATCCTGGTAACGGCACAGCAACCCTTGCTTTCGCAATCAAGAATAACCAAGCGTTCTTGAAAGACGCTTTTATAGAGAATGGCACGATCTCCTCTGCCAAGATCGCACAAGAAATTTCGTCAAACAACTACGATAGCAACGGGTACCATAACTACGGTTGGTATATTAACAAGAACGGGCACGCCCAATTTATGGATGTGTGGGTGAAAGGTAACATCAACGCCAGTTCTGGTAACTTCACAGGGGCAGTTAACGCCACCAGTGGTACCTTCCGTGGTGATGTTTATGCCAATAATGGTAGCTTTAGAGGCACCATAGATGCAACCGGAGGTACCTTCCGTGGTCGTGTAGAAGCTTCTGTTATCCGTGCTAACCAGTTCGAAGGTGCGATTGTTGCACACAGGACTTACGGAGATTGTGCTCCAGTATATAACTCCCAGCAAAGGGTTTGCCGTTGGAGGTGGAGATACGTAGATAACGTTTCAGGTCAGGGTAAGAACGTAACATTCTTCTTTAAACTGAATGGTACTCTTGCCAGCTCCCAACTGAATGTATGGATAGCTGGTCATCAGATCCTTGCTGGCAAGAAGTATTTTAACGACAACAACGGCATGTGTGCGGTAGGGATAACAGGTCTGGGTGAACAAACTATAGACATTGTTATCGAGATTTACACACCGTGGTCAACGTCGAGCGTGACAGGTGTCACAATCTCCTGTCCTACTGTGATCGTTAGTCGTTCTAACTCAAGCTTCCAGGGACCTTGGAACGAGTCTCACGACTAA
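
The CDS location and strand can be related to the protein length with simple arithmetic. The sketch below assumes the range is 1-based and inclusive on the forward strand of the nucleotide record, and that a minus-strand CDS is obtained by reverse-complementing that slice. `cds_summary` and `extract_cds` are hypothetical helpers, not part of any library.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def cds_summary(start: int, end: int) -> dict:
    """Nucleotide length, codon count and residue count (excluding the stop codon)."""
    length_nt = end - start + 1  # 1-based inclusive range (assumed)
    if length_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    codons = length_nt // 3
    return {"length_nt": length_nt, "codons": codons, "residues": codons - 1}


def extract_cds(genome: str, start: int, end: int, strand: str) -> str:
    """Slice a CDS out of a forward-strand genome string, honouring the strand."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    region = genome[start - 1:end]
    return reverse_complement(region) if strand == "-" else region


if __name__ == "__main__":
    print(cds_summary(28680, 32279))
    # Toy genome: the minus-strand CDS ATGTAA sits at positions 3-8.
    print(extract_cds("GGTTACATCC", 3, 8, "-"))
```

For the range 28680 -> 32279 this gives 3600 nt, or 1200 codons including the stop codon, which agrees with the 1199 AA protein length listed in the record.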

Genome Context

Tertiary structure

PDB ID
59fb11d3d10b2aca52df49452ba65394a99b7d0a4671aa4faefc80f4bf585e32
ESMFold
Source ESMFold
Method ESMFold
Resolution 0.6807
Oligomeric State monomer
Model Confidence
Very high: pLDDT > 90
High: 90 > pLDDT > 70
Low: 70 > pLDDT > 50
Very low: pLDDT < 50
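
The confidence bands in the legend can be applied to per-residue pLDDT scores as in the sketch below. The legend uses strict inequalities and so leaves the exact values 90, 70 and 50 unassigned; placing them in the lower band is an assumption made here, and `plddt_band` is a hypothetical helper name.

```python
def plddt_band(score: float) -> str:
    """Map a pLDDT score on the 0-100 scale to the legend's confidence band."""
    if not 0 <= score <= 100:
        raise ValueError("pLDDT is expected on a 0-100 scale")
    if score > 90:
        return "Very high"
    if score > 70:
        return "High"
    if score > 50:
        return "Low"
    return "Very low"  # boundary values fall into the lower band (assumption)


if __name__ == "__main__":
    for value in (95.2, 81.0, 63.5, 42.0):
        print(value, plddt_band(value))
```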