GenBank accession
YP_010657804.1 [GenBank]
Protein name
tail spike protein
RBP type Evidence Probability
TF UniProt/TrEMBL 1,00
TF GenBank 1,00
TSP RBPdetect 0,88
Protein sequence
MGMNSHIPFDADNDWTLDPYHCNRSNDPLVDKVIGNAYAVVRAVYCNLGNLKLLYDFLNTYGMVLGVKSEAELKKLNKLAKYARVYGFADTGDRQVTDYLYVPDDTSGIRPDDQTATGSWIKVSTSGSGSGGTGGGSGSYIPYVYANGSALGGETSFKVPAEALGVPFIIINGSVQYIGYGFSFNPANSTVTLSNPLVQGDEVIALTSAAPASPDNPNVSNWVQVNWLYNNGAAVGGEQVITVPYNFKDVPAVYKNGERYYKNLQTKSYVYDPSTRTVTMTELLAQGDRVIITLGGESASLEITDRTTQEVARANNVKDTDVVLSSSTNVVITDKKVLYDVNAQKYWDLPNLPPNVYIVKVEGNKLIYNPGAVVIDLLEPANPLVIVEPVLSRLGAETGNPMAGTFEKGATVDSAAKSVGSTMEGKLYRWEGALPKTVRAGDTPSSSGGIGSGKWVEITNATLRSQLASTGGAAMVKASDGRTVEQWLVQSDSASFRAKNMAKLAWCDYQVHNRGSLKCCFLGDSMTAGFDRTSSDTIPAQDGDWATRASMNYPYRFASYLPEQSGCSVYITMRAISGYTAKQAYEEALWQSNPNCDIVFIMYAINDSGGVAGATLDLYMEYMEKLIRRYIDWGCAVVVQRPSGGGQGAGNPAWLHWAKRMQMVARVYGCPVFDAHEVMLNRHYAAVQSDGTHYNSMGYAIHGEKLASMLMAGGLLDTYKPVVNETTVWTGMMSDHIGWCDARGNIGTGRSDGAYTRDKVTGVLQAGKATICTFSFYLDAEAAHIYGKLDGLINTIYTNGYWWNNGNKPYYQYAVDIDNSFGASLQRVNKSANNYEGMPGSRKFVGRLIGRGWHTITLFTNLQGEALKDAFVNSITVQPIPIGLSTEQMWGQDEERRYRVVHTRRMPSPSGQGGTLPVAVALTGFQMRAPQSFLGTGPGTNAVPAPYFYNTVPGKLKVYNEKGDYIEWLVYKDGSSGLKWKGKVLTHSFADVASVPTLTAYMGTAKQNVIVAAGSSGANQPLENIYDYNAGLQEQTGNPSTDLSWKGGIYLVFTLAWPSTAPTGYWTIELEGSDWFGNSESAVGCF
Physico‐chemical properties
protein length: 1086 AA
molecular weight: 117901,16480 Da
isoelectric point: 5,76668
aromaticity: 0,10681
hydropathy: -0,23923
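
The page does not say how aromaticity and hydropathy were computed. A minimal sketch, assuming aromaticity is the fraction of F, W and Y residues and hydropathy is the mean Kyte-Doolittle value (GRAVY); both definitions, and the function names, are assumptions rather than the database's documented method:

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aromaticity(seq: str) -> float:
    """Fraction of residues that are Phe, Trp or Tyr (assumed definition)."""
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues (assumed definition)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First ten residues of the protein sequence above, used only as a demo input.
peptide = "MGMNSHIPFD"
print(len(peptide), aromaticity(peptide), gravy(peptide))
```

Running the same two functions on the full 1086-residue sequence allows a comparison with the listed values, keeping in mind that the page uses decimal commas.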

Domains

Domains [InterPro]
Accession Type Range
DC_0041 STR 27–223
IPR040775 RBD 405–458
IPR013830 ENZ 521–701
cd00229 ENZ 521–708
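
Two of the hits above carry the same type (ENZ) and overlap almost completely. A minimal sketch of collapsing same-type hits into non-redundant ranges, assuming 1-based inclusive coordinates; `merge_by_type` is a hypothetical helper, not part of any database API:

```python
from collections import defaultdict

# (accession, type, start, end) as listed in the domain table.
hits = [
    ("DC_0041", "STR", 27, 223),
    ("IPR040775", "RBD", 405, 458),
    ("IPR013830", "ENZ", 521, 701),
    ("cd00229", "ENZ", 521, 708),
]


def merge_by_type(hits):
    """Merge overlapping or adjacent ranges that share the same type label."""
    by_type = defaultdict(list)
    for _accession, kind, start, end in hits:
        by_type[kind].append((start, end))

    merged = []
    for kind, spans in by_type.items():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + 1:  # overlaps or directly follows
                cur_end = max(cur_end, end)
            else:
                merged.append((kind, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((kind, cur_start, cur_end))
    return sorted(merged, key=lambda m: m[1])


merged = merge_by_type(hits)
for kind, start, end in merged:
    print(kind, start, end)
```

Merging on the type label alone discards which database supported each range, so the original accessions should be kept alongside if provenance matters.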
Architecture
STR 27-223 | STR 271-294 | ATT 295-380 | STR 381-399 | ATT 400-470 | STR 471-711 | RBD 712-1083
Legend: ATT STR RBD CBM LEC ENZ CHP LNK TAS TTP UNK Unmapped
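
The architecture string can be parsed to find which residues are left unmapped. A minimal sketch, assuming three-letter labels followed by 1-based inclusive `start-end` ranges; the function names are illustrative:

```python
import re

ARCHITECTURE = (
    "STR 27-223 | STR 271-294 | ATT 295-380 | STR 381-399 | "
    "ATT 400-470 | STR 471-711 | RBD 712-1083"
)
PROTEIN_LENGTH = 1086


def parse_architecture(text):
    """Return (label, start, end) tuples from a 'LBL a-b | LBL c-d' string."""
    pattern = r"([A-Z]{3})\s+(\d+)-(\d+)"
    return [(lbl, int(s), int(e)) for lbl, s, e in re.findall(pattern, text)]


def unmapped_regions(segments, length):
    """Return inclusive (start, end) ranges not covered by any segment."""
    gaps, cursor = [], 1
    for _label, start, end in sorted(segments, key=lambda seg: seg[1]):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= length:
        gaps.append((cursor, length))
    return gaps


segments = parse_architecture(ARCHITECTURE)
gaps = unmapped_regions(segments, PROTEIN_LENGTH)
print(gaps)
```

For this entry the uncovered stretches are residues 1-26, 224-270 and 1084-1086, which would correspond to the "Unmapped" class in the legend.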

Tail Spike Domain Segmentation

This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.

Domain Layout
Domain Start End Length (AA) Confidence
N-terminal 1 493 493 0,9561
Central domain 494 1011 519 0,9297
C-terminal 1012 1086 74 0,3378
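
The segment lengths can be recomputed from the Start and End columns. A minimal sketch, assuming 1-based inclusive coordinates (the page does not state its convention). Under that assumption the three segments come to 493, 518 and 75 residues, so the Central and C-terminal values differ by one from the listed 519 and 74, and the boundary at 1011/1012 is worth checking against the source:

```python
# (name, start, end) from the segmentation table.
SEGMENTS = [
    ("N-terminal", 1, 493),
    ("Central domain", 494, 1011),
    ("C-terminal", 1012, 1086),
]
PROTEIN_LENGTH = 1086


def inclusive_length(start, end):
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def is_complete_tiling(segments, length):
    """True if segments cover 1..length with no gaps and no overlaps."""
    expected_start = 1
    for _name, start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


lengths = {name: inclusive_length(s, e) for name, s, e in SEGMENTS}
print(lengths, is_complete_tiling(SEGMENTS, PROTEIN_LENGTH))
```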
3D Structure with Domain Coloring

The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).

Taxonomy

  Name Taxonomy ID Lineage
Phage Shigella virus Moo19 [NCBI] 2886042 Uroviricota > Caudoviricetes > Schitoviridae > Moovirus > Moovirus moo
Host Shigella flexneri [NCBI] 623 cellular organisms > Bacteria > Pseudomonadati > Pseudomonadota > Gammaproteobacteria > Enterobacterales

Coding sequence (CDS)

GenBank protein accession
YP_010657804.1 [NCBI]
GenBank nucleotide accession
NC_070850.1 [NCBI]
CDS location
range 65719 -> 68979
strand -
CDS
ATGGGTATGAACTCTCACATTCCATTTGATGCAGACAATGACTGGACTCTCGACCCATATCATTGCAATCGCTCCAACGACCCACTTGTCGATAAGGTGATTGGTAATGCCTATGCCGTGGTTCGTGCAGTCTACTGTAACCTGGGTAACCTGAAGTTACTGTATGACTTCCTGAATACCTACGGTATGGTTCTGGGTGTTAAGTCAGAAGCTGAATTAAAGAAGCTGAACAAGTTGGCTAAGTATGCCCGTGTCTATGGCTTCGCTGATACAGGTGACCGTCAGGTAACTGATTACCTGTACGTACCTGATGATACTTCTGGTATTCGCCCAGATGACCAAACTGCTACAGGCTCCTGGATTAAAGTATCTACCTCTGGTTCCGGCAGTGGTGGTACAGGTGGTGGTTCCGGTTCCTACATTCCATACGTTTATGCTAATGGCTCTGCATTGGGTGGGGAGACTTCATTTAAAGTTCCTGCTGAAGCACTGGGTGTACCCTTCATTATCATTAACGGCTCGGTTCAGTACATTGGTTATGGCTTCAGCTTTAACCCAGCTAACTCCACCGTTACCCTTAGTAACCCACTGGTTCAGGGTGATGAAGTTATTGCACTGACTTCTGCTGCTCCAGCAAGCCCGGATAACCCTAATGTATCCAACTGGGTTCAGGTTAACTGGCTCTATAACAACGGTGCTGCTGTAGGTGGCGAACAGGTAATTACTGTTCCTTATAACTTTAAAGATGTTCCTGCCGTTTATAAAAATGGTGAACGTTACTATAAGAACCTGCAAACCAAGTCTTACGTTTATGACCCAAGTACTCGTACTGTTACTATGACTGAGCTTCTTGCTCAGGGTGACCGTGTGATTATTACTCTGGGTGGTGAATCTGCCAGTCTTGAGATTACTGACCGAACCACGCAAGAAGTAGCTCGTGCAAACAACGTAAAGGATACTGATGTTGTTCTGTCTTCCTCTACTAATGTAGTGATTACTGACAAGAAAGTTCTTTACGATGTGAATGCTCAGAAGTACTGGGATTTGCCTAACCTGCCACCTAACGTTTATATCGTTAAGGTAGAAGGTAACAAGCTGATTTATAACCCAGGTGCTGTTGTTATTGACCTGTTAGAACCAGCAAACCCACTGGTAATCGTTGAGCCTGTTCTTTCCCGTCTGGGGGCTGAGACTGGTAATCCAATGGCTGGTACTTTCGAGAAAGGTGCAACTGTTGACTCTGCTGCTAAGTCAGTTGGTTCTACTATGGAAGGTAAGCTATATCGTTGGGAAGGTGCATTACCTAAGACTGTACGTGCCGGGGATACTCCATCTTCTTCTGGTGGAATTGGTTCTGGTAAGTGGGTAGAGATTACTAATGCAACTTTACGCAGCCAGCTTGCAAGTACTGGTGGTGCTGCAATGGTTAAAGCCTCTGACGGGCGTACTGTTGAGCAATGGCTTGTTCAGTCTGACTCTGCATCTTTCCGTGCAAAGAACATGGCTAAGCTGGCATGGTGCGATTATCAAGTTCATAACCGTGGTTCCCTGAAGTGCTGCTTCCTGGGTGACTCCATGACTGCTGGGTTTGACCGTACTTCCTCAGATACTATTCCTGCTCAGGATGGTGACTGGGCTACCCGTGCTTCTATGAACTACCCATACCGTTTTGCTAGTTACTTGCCTGAACAGTCTGGTTGTTCCGTCTATATTACTATGAGAGCAATTTCAGGTTACACGGCGAAACAGGCTTATGAAGAGGCACTGTGGCAGTCTAATCCAAACTGCGACATTGTGTTCATCATGTATGCAATTAACGATTCCGGTGGTGTAGCTGGTGCAACTCTTGACCTCTACATGGAATACATGGAGAAGCTCATTCGTCGTTATATCGACTGGGGCTGTGCTGTAGTTGTTCAACGTCCATCTGGTGGTGGTCAGGGTGCTGGTAACCCAGCGTGGCTGCATTGGGCTAAACGTATGCAGATGGTAGCTCGTGT
TTACGGTTGCCCTGTATTTGATGCTCATGAAGTAATGCTTAACCGTCACTATGCTGCTGTTCAGTCTGATGGTACTCACTACAACTCTATGGGTTATGCAATTCACGGTGAGAAGCTGGCATCCATGCTTATGGCAGGTGGACTCCTGGATACCTATAAACCAGTGGTTAATGAAACTACTGTGTGGACGGGTATGATGTCTGACCATATTGGTTGGTGTGATGCACGAGGTAATATTGGTACTGGTCGTTCTGACGGAGCGTACACTCGTGACAAAGTTACTGGTGTACTCCAGGCGGGTAAAGCAACCATCTGTACCTTTAGCTTCTATCTGGATGCAGAAGCTGCACATATCTACGGTAAGCTGGATGGTTTAATTAATACCATTTACACCAATGGATACTGGTGGAATAACGGCAACAAGCCTTATTACCAGTATGCAGTAGATATTGATAACTCTTTTGGTGCATCTTTACAGCGTGTTAATAAGTCAGCTAACAACTACGAGGGAATGCCAGGCTCACGTAAGTTTGTGGGTCGTTTGATTGGTCGTGGTTGGCACACTATTACTCTGTTTACCAATTTGCAAGGTGAAGCACTGAAAGATGCTTTTGTTAACAGCATTACTGTTCAGCCAATCCCAATTGGTCTGTCAACTGAGCAGATGTGGGGACAGGATGAAGAACGCCGTTATCGTGTAGTACACACTCGTCGTATGCCTTCTCCATCTGGGCAGGGTGGTACGCTACCTGTGGCAGTAGCATTGACTGGCTTCCAAATGAGAGCACCTCAGAGCTTCTTGGGTACTGGTCCGGGTACGAATGCAGTTCCTGCTCCTTATTTCTACAATACGGTTCCAGGCAAACTGAAGGTATATAACGAGAAAGGCGACTACATCGAGTGGCTTGTCTACAAGGATGGTTCGTCTGGTCTTAAGTGGAAAGGCAAAGTTCTTACCCACAGTTTTGCTGATGTTGCCTCTGTACCAACCTTAACTGCATACATGGGTACAGCTAAACAAAACGTTATTGTGGCAGCGGGTAGTTCTGGTGCAAACCAACCATTGGAAAATATTTATGACTACAATGCAGGTTTGCAAGAACAAACAGGTAACCCATCTACTGACCTTTCCTGGAAAGGTGGTATTTATCTGGTATTTACTCTGGCATGGCCTAGCACTGCTCCAACTGGTTATTGGACCATTGAGCTTGAGGGTAGTGATTGGTTTGGTAACTCTGAGTCTGCTGTAGGTTGCTTCTAA
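
The CDS coordinates can be checked against the protein length. A minimal sketch, assuming the range is 1-based and inclusive on NC_070850.1 and that the standard codon assignments apply (shared by NCBI translation tables 1 and 11). Because the strand is "-", the sequence listed above would be the reverse complement of the genomic slice. All helper names are illustrative:

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cds_length(start, end):
    """Nucleotides in a 1-based inclusive range."""
    return end - start + 1


def reverse_complement(dna):
    """Reverse complement, as needed for a minus-strand feature."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(cds):
    """Translate complete codons; '*' marks a stop codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


n = cds_length(65719, 68979)
print(n, n % 3, n // 3 - 1)            # nucleotides, remainder, residues without stop
print(translate("ATGGGTATGAACTCTCAC"))  # first six codons of the CDS above
print(translate("GGTTGCTTCTAA"))        # last four codons, ending in the stop
```

The range spans 3261 nt, that is 1087 codons, or 1086 residues plus the stop codon, which agrees with the listed protein length.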

Genome Context

Tertiary structure

PDB ID
Source
Method
Resolution
Oligomeric State