Protein
- Genbank accession
- XNS39590.1 [GenBank]
- Protein name
- tail fiber protein
- RBP type
-
TFTSPTF
- Protein sequence
-
MGKGSSKGHTPREAKDNLKSTQLLSVIDAISEGPVEGPVDGLKSVLLNSTPVLDSEGNTNISGVTVVFRAGEQEQTPPEGFESSGSETVLGTEVKYDTPITRTITSANIDRLRFTFGVQALVETTSKGDRNPSEVRLLVQIQRNGGWVTEKDITIKGKTTSQYLASVVVGNLPPRPFSIRMRRMTPDSTTDQLQNKTLWSSYTEIIDVKQGYPNTALVGVQVDSEQFGSQQVSRNYHLRGRILQVPSNYNPQTRQYSGIWDGTFKPAYSNNMAWCLWDMLTHPRYGMGKRLGAADVDKWALYVIGQYCDQSVPDGFGGTEPRITCNAYLTTQRKAWDVLSDFCSAMRCMPVWNGQTLTFVQDRPSDKVWTYNRSNVVMPDDGAPFRYSFSALKDRHNAVEVNWIDPNNGWETATELVEDTQAIARYGRNVTKMDAFGCTSRGQAHRAGLWLIKTELLETQTVDFSVGAEGLRHVPGDVIEICDDDYAGISIGGRVLAVNSQTRTLTLDREITLPSSGTTLISLVDGSGNPVSVEVQSVTDGVKVKVSRVPDGVAEYSVWGLKLPTLRQRLFRCVSIRENDDGTYAITAVQHVPEKEAIVDNGAHFDGDQSGTVNGVTPPAVQHLTAEVTADSGEYQVLARWDTPKVVKGVSFLLRLTVTADDGSEWLVSTARTTETTYRFTQLALGNYRLTVRAVNARGQQGEPASVSFRIAAPAAPSRIELTPGYFQITATPHLAFYDPTVQFEFWFSEKRIADIRQVETTARYLGTALYWIAASINIKPGHDYYFYIRSVNTVGKSAFVEAVGQPSDDASGYLDFFKGEIGKTHLAQELWTQIDNGQLAPDLAEIRTSITDVSNEITQTVNKKLEDQSAAIQQIQKVQVDTNNNLNSMWAVKLQQMQDGRLYIAGIGAGIENTPDGMQSQVLLAADRIAMVNPANGNTKPMFVGQGDQIFMNDVFLKRLTAPTITSGGNPPAFSLTPDGKLTAKNADISGNVNANSGTLNNVTINENCQIKGKLSANQIEGDIVKTVSKSFPRTNSYASGTITVRISDDQKFDRQVMIPPVLFRGGKHENFNSNNQQSYWYSTCRLRVTRNGQEIFNQSTTDAQGVFSSVIDMPAGQGTLTLTFTVSSSGANNWTPTTSISDLLVVVMKKSTAGISIS
- Physico-chemical properties
  - protein length: 1160 AA
  - molecular weight: 127184.11280 Da
  - isoelectric point: 5.74360
  - aromaticity: 0.08017
  - hydropathy: -0.34086
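The page does not say how aromaticity and hydropathy were computed. A minimal sketch, assuming the common definitions (aromaticity as the fraction of Phe, Trp and Tyr residues; hydropathy as the mean Kyte-Doolittle value, often called GRAVY), is shown below. The function names are hypothetical, and whether the database used exactly these definitions is an assumption.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")


def aromaticity(seq: str) -> float:
    """Fraction of residues that are Phe, Trp or Tyr (assumed definition)."""
    return sum(aa in AROMATIC for aa in seq) / len(seq)


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over the sequence (assumed definition)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# First 20 residues of the protein sequence above, used only as a short input.
prefix = "MGKGSSKGHTPREAKDNLKS"
print(round(aromaticity(prefix), 5), round(gravy(prefix), 5))
```

Passing the full 1160-residue sequence instead of the prefix would give values to compare against the listed aromaticity and hydropathy.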
Domains
Domains [InterPro]
| Accession | Type | Range |
|---|---|---|
| IPR053171 | Unmapped | 1–837 |
| DC_0014 | STR | 1–1158 |
| IPR055385 | ATT | 86–207 |
| IPR013783 | STR | 617–726 |
| IPR036116 | STR | 617–719 |
| IPR003961 | STR | 618–701 |
| IPR003961 | STR | 618–710 |
| IPR003961 | STR | 620–715 |
Architecture
STR 1-85 | ATT 86-207 | STR 208-330 | ATT 331-498 | STR 499-715 | ATT 716-818 | STR 819-1160
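As a sanity check on the architecture string, the sketch below parses it into labelled segments and verifies that they tile the sequence from residue 1 to 1160 with no gaps or overlaps. The function names are hypothetical, and the parser assumes the `LABEL start-end | ...` layout shown above.

```python
ARCHITECTURE = (
    "STR 1-85 | ATT 86-207 | STR 208-330 | ATT 331-498 | "
    "STR 499-715 | ATT 716-818 | STR 819-1160"
)


def parse_architecture(text):
    """Return a list of (label, start, end) tuples with 1-based inclusive ends."""
    segments = []
    for part in text.split("|"):
        label, span = part.split()
        start, end = (int(x) for x in span.split("-"))
        segments.append((label, start, end))
    return segments


def tiles_sequence(segments, length):
    """True if the segments cover 1..length contiguously, in order."""
    expected_start = 1
    for _, start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


segments = parse_architecture(ARCHITECTURE)
print(tiles_sequence(segments, 1160))
```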
Legend: ATT, STR, RBD, CBM, LEC, ENZ, CHP, LNK, TAS, TTP, UNK, Unmapped
Tail Spike Domain Segmentation
This protein has been segmented into three structural domains: N-terminal, central domain, and C-terminal.
Domain Layout
| Domain | Start | End | Length (AA) | Confidence |
|---|---|---|---|---|
| N-terminal | 1 | 1018 | 1018 | 0.9098 |
| Central domain | 1019 | 1149 | 132 | 0.2105 |
| C-terminal | 1150 | 1160 | 10 | 0.9875 |
Note: Constraints were applied during segmentation.
- Fixed 60 C-terminal predictions appearing before Central domain
- Sequence started with non-N-terminal domain
- C-terminal too short, adjusted boundary
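Under the usual 1-based inclusive convention, a domain's length is `End - Start + 1`. By that rule, two rows of the table list a length that differs by one from their boundaries. The page does not make clear whether the boundaries or the lengths reflect the tool's intent; the boundary adjustment noted above may be related. The sketch below recomputes the lengths and reports such rows; the row tuples are copied from the table and the function names are hypothetical.

```python
# (domain, start, end, listed_length) as given in the segmentation table.
ROWS = [
    ("N-terminal", 1, 1018, 1018),
    ("Central domain", 1019, 1149, 132),
    ("C-terminal", 1150, 1160, 10),
]


def inclusive_length(start, end):
    """Length of a 1-based, inclusive residue range."""
    return end - start + 1


def length_mismatches(rows):
    """Return (domain, listed, recomputed) for rows whose lengths disagree."""
    return [
        (name, listed, inclusive_length(start, end))
        for name, start, end, listed in rows
        if listed != inclusive_length(start, end)
    ]


for name, listed, recomputed in length_mismatches(ROWS):
    print(f"{name}: listed {listed}, inclusive span {recomputed}")
```

Both the listed and the recomputed lengths sum to 1160, so the check cannot tell which column is off.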
Legend: N-terminal, Central domain, C-terminal
3D Structure with Domain Coloring
The structure is colored according to the domain segmentation: N-terminal (blue), Central (green), C-terminal (pink).
Domain Coloring
- N-terminal: 1-1018
- Central: 1019-1149
- C-terminal: 1150-1160
Taxonomy
| | Name | Taxonomy ID | Lineage |
|---|---|---|---|
| Phage | Escherichia phage Chlobot [NCBI] | 3233691 | Viruses > |
| Host | Escherichia coli [NCBI] | 562 | cellular organisms > Bacteria > Pseudomonadati > Pseudomonadota > Gammaproteobacteria > Enterobacterales |
Coding sequence (CDS)
- Genbank protein accession
- XNS39590.1 [NCBI]
- Genbank nucleotide accession
- PP925838.1 [NCBI]
- CDS location
- range 25603 -> 29085, strand -
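The CDS coordinates can be cross-checked against the protein length. The sketch below assumes the range is 1-based and inclusive and that it includes the stop codon (the CDS in this record ends in TGA); the function names are hypothetical.

```python
def cds_length(start, end):
    """Nucleotide length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def expected_residues(start, end, includes_stop=True):
    """Number of amino acids encoded by the range, optionally minus the stop codon."""
    n = cds_length(start, end)
    if n % 3 != 0:
        raise ValueError(f"CDS length {n} is not a multiple of 3")
    codons = n // 3
    return codons - 1 if includes_stop else codons


print(cds_length(25603, 29085), expected_residues(25603, 29085))
```

With these assumptions the range spans 3483 nucleotides, which is 1161 codons, or 1160 residues plus a stop codon. This matches the protein length reported in the physico-chemical properties.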
CDS
ATGGGTAAAGGCAGCAGTAAGGGGCATACTCCGCGCGAAGCGAAGGACAACCTGAAGTCCACGCAGTTGCTGAGTGTGATCGATGCCATCAGCGAAGGGCCGGTTGAAGGTCCGGTGGATGGATTAAAAAGCGTGCTGCTGAACAGTACGCCAGTGCTGGACAGTGAGGGGAATACCAATATCTCCGGCGTCACGGTGGTGTTCCGGGCCGGTGAGCAGGAGCAGACACCGCCGGAGGGATTTGAATCCTCCGGCTCCGAGACGGTGCTGGGTACGGAAGTGAAATACGACACGCCGATCACCCGGACCATCACGTCGGCAAACATTGACCGACTGCGTTTTACCTTCGGCGTGCAGGCACTGGTGGAAACCACCTCAAAGGGGGACAGGAATCCGTCGGAAGTCCGCCTGCTGGTTCAGATCCAGCGTAATGGTGGCTGGGTGACGGAAAAAGACATCACCATTAAGGGCAAAACCACCTCGCAGTATCTGGCCTCGGTGGTGGTGGGTAACCTGCCGCCGCGCCCGTTCAGTATCCGGATGCGCAGGATGACGCCGGACAGCACCACAGACCAGCTGCAGAACAAAACGCTCTGGTCGTCATACACCGAAATCATCGATGTGAAACAGGGCTACCCGAACACGGCACTGGTCGGCGTGCAGGTGGATTCGGAGCAGTTCGGCAGCCAGCAGGTGAGCCGTAATTATCATCTGCGCGGGCGCATTCTGCAGGTGCCGTCGAACTATAACCCGCAGACGCGGCAATACAGCGGTATCTGGGACGGAACGTTTAAACCGGCATACAGCAACAACATGGCCTGGTGTCTGTGGGATATGCTGACCCATCCGCGCTACGGCATGGGGAAACGTCTTGGTGCGGCGGATGTGGATAAATGGGCGCTGTATGTCATCGGCCAGTACTGCGACCAGTCAGTGCCGGACGGCTTTGGCGGCACGGAGCCGCGCATCACCTGTAATGCGTACCTGACCACACAGCGCAAGGCGTGGGATGTGCTCAGTGATTTCTGCTCGGCGATGCGCTGTATGCCGGTATGGAACGGGCAGACGCTGACGTTCGTGCAGGACCGACCATCAGATAAGGTGTGGACCTATAACCGCAGTAATGTGGTGATGCCGGATGATGGCGCGCCGTTCCGCTACAGCTTCAGCGCCCTGAAGGACCGCCATAATGCCGTTGAGGTGAACTGGATTGACCCGAATAACGGCTGGGAGACGGCGACAGAGCTTGTGGAGGACACGCAGGCCATTGCCCGTTACGGTCGTAATGTCACGAAGATGGATGCCTTTGGCTGTACCAGCCGGGGGCAGGCACATCGCGCCGGGCTGTGGCTGATTAAAACAGAACTGCTGGAAACGCAGACCGTGGACTTCAGCGTGGGCGCAGAAGGGCTTCGCCATGTGCCGGGCGATGTCATTGAAATCTGTGATGATGACTATGCCGGTATCAGCATCGGCGGGCGCGTGCTGGCGGTAAACAGCCAGACCCGGACGCTGACGCTCGACCGTGAAATCACGCTGCCATCCTCCGGTACCACGCTGATAAGCCTGGTTGACGGAAGTGGCAATCCGGTCAGCGTGGAGGTTCAGTCCGTCACCGACGGCGTGAAGGTGAAAGTGAGCCGTGTTCCTGACGGCGTTGCTGAATACAGCGTGTGGGGGCTGAAGCTGCCGACGCTGCGCCAGCGCCTGTTCCGCTGCGTGAGTATCCGTGAGAACGACGACGGCACGTATGCCATCACCGCCGTGCAGCATGTACCGGAAAAAGAAGCCATCGTGGATAACGGGGCGCACTTTGACGGCGACCAGAGCGGCACGGTGAATGGTGTCACGCCGCCAGCGGTGCAGCACCTGACTGCCGAAGTCACCGCAGACAGCGGGGAATATCAGGTGCTGGCGCGCTGGGACACGCCGAAGGTGGTGAAGGGCGTGAGCTTCCTGCTCCGTCTGACCGTAACAGCGGATGACGGCAGTGAGTGGCT
GGTCAGCACGGCCCGGACGACGGAAACCACTTACCGCTTCACACAACTGGCTCTGGGGAACTACAGGCTGACAGTCCGGGCAGTAAATGCCCGGGGGCAGCAGGGCGAGCCGGCATCGGTATCGTTCCGGATCGCCGCACCGGCAGCGCCGTCGCGGATTGAGCTGACGCCGGGCTATTTTCAGATAACTGCAACGCCGCATCTTGCGTTTTATGACCCGACGGTACAGTTTGAGTTCTGGTTCTCGGAAAAGCGGATTGCGGATATCAGGCAGGTTGAAACCACAGCACGCTATCTTGGCACGGCGCTGTACTGGATAGCCGCCAGTATCAATATCAAACCGGGCCATGATTATTACTTTTATATCCGCAGTGTGAACACCGTTGGCAAATCGGCATTCGTGGAGGCTGTTGGTCAGCCGAGTGATGATGCATCCGGTTATCTGGATTTTTTCAAAGGCGAGATAGGGAAAACCCATCTGGCTCAGGAGCTGTGGACGCAGATTGATAACGGTCAGCTTGCGCCTGACCTGGCTGAAATCAGGACGTCCATTACGGATGTCAGCAATGAAATCACGCAGACCGTCAATAAGAAACTGGAAGACCAGAGTGCTGCAATTCAGCAGATACAGAAGGTTCAGGTTGATACAAATAATAACCTGAACAGCATGTGGGCTGTGAAGCTGCAGCAGATGCAGGACGGACGCCTTTATATCGCGGGTATTGGTGCCGGTATTGAGAACACCCCTGACGGCATGCAGAGTCAGGTGCTGCTGGCGGCGGACAGGATTGCGATGGTTAATCCTGCGAATGGCAACACAAAACCGATGTTTGTTGGTCAGGGCGATCAGATATTCATGAACGACGTGTTCCTGAAACGCCTGACGGCTCCCACCATTACCAGCGGTGGAAATCCGCCGGCATTTTCCCTGACACCAGACGGAAAGCTGACCGCTAAAAATGCGGATATCAGCGGTAATGTGAATGCAAATTCAGGGACGCTCAACAATGTCACGATTAATGAAAACTGTCAGATTAAGGGGAAACTGTCAGCCAATCAGATTGAAGGCGATATTGTCAAAACGGTCAGCAAGTCTTTCCCCCGCACGAACAGTTATGCCAGTGGCACCATCACGGTAAGAATCAGTGATGATCAGAAATTTGACCGGCAGGTCATGATACCGCCAGTGTTATTCCGCGGTGGTAAGCATGAGAATTTCAACAGTAATAACCAACAGTCATACTGGTATTCAACCTGCCGGTTAAGAGTGACCCGCAATGGTCAGGAGATTTTTAATCAGTCCACGACGGATGCTCAGGGCGTATTTTCCTCAGTTATAGATATGCCTGCCGGACAGGGGACGCTGACACTGACATTCACCGTATCTTCATCAGGAGCGAATAACTGGACACCAACAACCAGTATCAGCGATCTGCTGGTTGTGGTGATGAAAAAATCCACAGCAGGTATCAGTATCAGCTGA
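The CDS above is given in coding orientation (it starts with ATG), so it can be translated directly even though the gene is on the minus strand of the genome. A minimal sketch of such a translation follows, using the standard codon assignments, which NCBI translation table 11 (bacterial and phage) shares with table 1. The helper name is hypothetical, and the two inputs are the first 21 and last 15 nucleotides of the CDS.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding-strand DNA sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


print(translate("ATGGGTAAAGGCAGCAGTAAG"))  # start of the CDS
print(translate("ATCAGTATCAGCTGA"))        # end of the CDS, including the stop
```

The two fragments translate to MGKGSSK and ISIS followed by a stop, matching the first and last residues of the protein sequence in this record.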
Genome Context
Tertiary structure
PDB ID
a0beffde0efd1dd0c76d0f0de3887adbd17b01d539dc1eec47a200b9f8722d91
Model Confidence
- Very high: pLDDT > 90
- High: 90 > pLDDT > 70
- Low: 70 > pLDDT > 50
- Very low: pLDDT < 50
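The confidence bands can be expressed as a small classifier. The legend uses strict inequalities and so does not assign the exact values 90, 70 and 50 to a band. As an assumption, the sketch below places each of them in the lower band. The function name is hypothetical.

```python
def confidence_band(plddt: float) -> str:
    """Map a per-residue pLDDT score (0-100) to the legend's confidence band."""
    if not 0 <= plddt <= 100:
        raise ValueError("pLDDT is expected to lie between 0 and 100")
    if plddt > 90:
        return "Very high"
    if plddt > 70:
        return "High"
    if plddt > 50:
        return "Low"
    return "Very low"


for score in (95.2, 81.0, 63.5, 42.0):
    print(score, confidence_band(score))
```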