MGnifams

A metagenomics-derived protein families resource


Protein Family: MGYF0000000009

Overview

Family representative sequence: MGYP002509752315/5-923
The top-scoring MGnify protein (with its specific region, if not the whole sequence) that was recruited into the family through hmmsearch.

Representative length: 919 amino acids (AA)

Total number of sequences in the family: 5157
The number of MGnify sequences iteratively recruited into the family by creating a seed alignment from the family's initial cluster, building an HMM, and then recruiting and aligning sequences from MGnify Proteins with the family HMM.

Sequence-HMM FunFam matches: none found
Whether FunFam functional annotation hits were identified with hmmer/hmmsearch.

Sequence-HMM Pfam matches: none found
Whether Pfam functional annotation hits were identified with hmmer/hmmsearch.

Profile-profile Pfam matches: none found
Whether Pfam domain annotation hits were identified through model searching with hhsuite/hhblits.

Structure-structure hits: found
Whether structural homologs of the family's representative sequence were identified in the AlphaFoldDB or PDB databases with foldseek.
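
The representative is written as an accession followed by a region, and the stated length of 919 follows from the region 5-923 if the coordinates are 1-based and inclusive. A minimal sketch of that arithmetic, assuming the `accession/start-end` form holds in general; `parse_region` and `region_length` are hypothetical helpers, not part of any MGnifams tooling.

```python
def parse_region(identifier: str) -> tuple[str, int, int]:
    """Split 'ACCESSION/start-end' into accession, start and end.

    Assumes 1-based, inclusive coordinates (an assumption of this sketch).
    """
    accession, _, region = identifier.partition("/")
    start_text, _, end_text = region.partition("-")
    return accession, int(start_text), int(end_text)


def region_length(start: int, end: int) -> int:
    """Number of residues in an inclusive region."""
    return end - start + 1


accession, start, end = parse_region("MGYP002509752315/5-923")
length = region_length(start, end)
print(accession, start, end, length)
```

With the values on this page the computed length is 919, matching the reported representative length.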

ESMFold structure

3D protein structure predicted with the Meta AI ESMFold model. ESMFold uses the representations from a large language model (ESM2) to generate an accurate structure prediction from the sequence of a protein.

  Very high (pLDDT ≥ 90)   High (90 > pLDDT ≥ 70)   Low (70 > pLDDT ≥ 50)   Very low (pLDDT < 50)

pLDDT is the model's prediction of its own score on the per-residue Local Distance Difference Test, a measure of local accuracy. The confidence bands above are used to colour-code the residues in the 3D viewer.
Average structure pLDDT score: 68.0

The pTM score (predicted Template Modeling score) is a confidence metric that estimates how accurate the global topology of a predicted protein structure is likely to be.
pTM score: 0.553
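
The confidence bands listed above translate directly into a threshold lookup. The sketch below uses the four cut-offs exactly as given in the legend; the function name `plddt_band` is hypothetical.

```python
def plddt_band(plddt: float) -> str:
    """Map a pLDDT value to the confidence band used in the legend."""
    if plddt >= 90:
        return "Very high"
    if plddt >= 70:
        return "High"
    if plddt >= 50:
        return "Low"
    return "Very low"


# The average score reported for this structure.
print(plddt_band(68.0))
```

Applied to the average score of 68.0, this places the structure as a whole in the "Low" band, although individual residues may fall in other bands.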

Predicted secondary structure

The secondary structure prediction was carried out with the s4pred software.

α-helices:  64.2%
β-strands:  0.0%
coils:      35.8%

The protein appears to be helix-rich, suggesting it may have a compact or globular structure.
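
Percentages like those above can be derived from a per-residue state string. This is a minimal sketch, assuming a three-state encoding of H (helix), E (strand) and C (coil); the input string is a made-up toy example, not the prediction for this family, and `state_percentages` is a hypothetical helper.

```python
from collections import Counter


def state_percentages(states: str) -> dict[str, float]:
    """Return the percentage of residues in each of the states H, E and C."""
    if not states:
        raise ValueError("empty state string")
    counts = Counter(states)
    total = len(states)
    return {s: round(100.0 * counts.get(s, 0) / total, 1) for s in "HEC"}


# Toy input (hypothetical): 6 helix residues flanked by 4 coil residues.
toy_states = "CCHHHHHHCC"
toy_result = state_percentages(toy_states)
print(toy_result)
```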

Predicted transmembrane regions

The transmembrane region prediction was carried out with the DeepTMHMM software.

inside:     100.0%
membrane-α: 0.0%
outside:    0.0%
signal:     0.0%
membrane-β: 0.0%
periplasm:  0.0%

This does not seem to be a transmembrane protein.

Multiple Sequence Alignment (Seed)

This is the seed alignment that was used to build the family HMM. It differs from the full alignment, which incorporates all MGnify sequences recruited into the family after searching the sequence pool with the HMM. The full alignment is usually much larger than the seed alignment and can be downloaded via the FTP.

HMM viewer

The family HMM is visualized via the Skylign API.

The height of each stack represents the information content (also known as relative entropy) at that position, while the size of each letter within the stack reflects its estimated probability.
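
The relationship between stack height and letter size can be illustrated numerically. This is a minimal sketch, assuming a uniform background over the 20 amino acids and letter heights equal to probability times stack height; the actual background distribution and scaling used by Skylign may differ, and both function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption of this sketch: uniform background frequencies.
UNIFORM_BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


def information_content(probs: dict[str, float],
                        background: dict[str, float]) -> float:
    """Relative entropy (in bits) of a column against the background."""
    return sum(p * math.log2(p / background[aa])
               for aa, p in probs.items() if p > 0)


def letter_heights(probs: dict[str, float],
                   background: dict[str, float]) -> dict[str, float]:
    """Split the stack height among letters in proportion to probability."""
    stack_height = information_content(probs, background)
    return {aa: p * stack_height for aa, p in probs.items()}


# A fully conserved column versus a column that matches the background.
conserved_ic = information_content({"R": 1.0}, UNIFORM_BACKGROUND)
flat_ic = information_content(dict(UNIFORM_BACKGROUND), UNIFORM_BACKGROUND)
print(round(conserved_ic, 3), round(flat_ic, 3))
```

Under these assumptions a fully conserved position reaches log2(20) bits, about 4.32, while a position that matches the background carries no information and has no visible stack.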

Biomes distribution

An interactive sunburst plot showing the biomes where the family's underlying MGnify proteins were detected.

Domain architecture

The top 15 most prevalent domain architectures (including MGnifams and Pfams) found in the full alignment sequences of the family. The numbers on the left indicate how many MGnify sequences share each domain architecture.
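
A ranking of this kind amounts to counting identical architectures and keeping the most frequent ones. The sketch below assumes each sequence's architecture has already been reduced to a single string; the `~` separator, the domain names `domA` and `domB`, and the counts are invented for illustration and do not describe this family.

```python
from collections import Counter


def top_architectures(architectures: list[str], n: int = 15) -> list[tuple[str, int]]:
    """Return the n most prevalent architectures with their sequence counts."""
    return Counter(architectures).most_common(n)


# Hypothetical toy data: one architecture string per sequence.
toy_architectures = (
    ["MGYF0000000009"] * 3
    + ["MGYF0000000009~domA"] * 2
    + ["domB~MGYF0000000009"]
)
ranking = top_architectures(toy_architectures, n=2)
print(ranking)
```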

Functional annotation through FunFam matches

The family representative sequence was searched against the FunFam database (ver. 4.3.0) with hmmer/hmmsearch.

No FunFam hits found

Functional annotation through Pfam matches

The family representative sequence was searched against the Pfam database (ver. 38.0) with hmmer/hmmsearch.

No Pfam hits found

Profile-profile Pfam matches

This MGnifam HMM profile was searched against the HH-suite profile Pfam database (ver. 35.0) with HHsearch.

No MGnifam model Pfam hits found

Structure-structure hits

This MGnifam 3D structure was searched against the AlphaFold/UniProt and PDB databases with foldseek.

Rank Target Structure Target DB Aligned Length Query Start Query End Target Start Target End E-value
1 U2BNG1 AlphaFold 895 1 895 897 9.737e-64
2 A0A3D4XDN3 AlphaFold 548 1 548 550 2.642e-38
3 A0A349BZW0 AlphaFold 777 155 899 805 9.743e-36
4 A0A641W8W0 AlphaFold 875 3 870 881 1.275e-30
5 A0A496NI07 AlphaFold 890 18 901 948 2.761e-30
6 A0A641RRB2 AlphaFold 716 177 891 719 2.697e-29
7 R6RNC9 AlphaFold 502 1 502 504 4.354e-28
8 K1GNU2 AlphaFold 918 2 902 934 4.354e-28
9 A0A1F0DTN6 AlphaFold 919 3 902 928 1.964e-27
10 D4CVX1 AlphaFold 922 3 902 931 2.476e-27
11 A0A095WG19 AlphaFold 864 2 856 880 3.004e-27
12 A0A095WFM8 AlphaFold 917 2 893 925 1.919e-26
13 A0A354HEQ0 AlphaFold 648 3 650 667 2.613e-26
14 K1GY41 AlphaFold 916 3 893 925 2.823e-26
15 A0A3C2CW22 AlphaFold 799 19 811 799 8.134e-25
16 A0A0U1P3F8 AlphaFold 891 1 891 865 1.903e-24
17 A0A3D2LZ46 AlphaFold 714 2 715 709 4.809e-24
18 R6S1J4 AlphaFold 395 503 895 397 1.593e-23
19 A0A3B8TZY4 AlphaFold 508 5 512 505 6.056e-19
20 A0A496VVX8 AlphaFold 892 5 896 840 7.936e-19
21 U2C1U0 AlphaFold 866 10 875 869 7.936e-19
22 A0A660P7V5 AlphaFold 756 132 887 855 2.732e-18
23 R6FPA0 AlphaFold 487 112 598 462 5.068e-18
24 A0A7V3JJL3 AlphaFold 757 131 887 870 8.705e-18
25 A0A660P4B3 AlphaFold 762 130 891 861 1.745e-17
26 G6C481 AlphaFold 547 332 859 549 3.778e-17
27 G6C4I7 AlphaFold 590 333 902 593 9.548e-17
28 A0A2D8USZ1 AlphaFold 662 151 812 801 2.149e-16
29 A0A7C1U8K6 AlphaFold 735 157 891 833 1.799e-15
30 A0A133P3X0 AlphaFold 557 370 898 558 3.747e-15
31 A0A5C6LKA7 AlphaFold 723 3 725 736 4.048e-15
32 A0A7X7F3B6 AlphaFold 490 129 618 593 6.534e-14
33 S7X611 AlphaFold 566 130 695 680 7.625e-14
34 A0A316ADB4 AlphaFold 778 113 890 726 1.039e-13
35 A0A5M5BZB5 AlphaFold 484 2 485 478 1.26e-13
36 A0A6B0Q096 AlphaFold 557 2 558 537 2.197e-12
37 A0A351MGJ7 AlphaFold 737 157 893 886 8.492e-12
38 A0A3D4XDL0 AlphaFold 175 569 743 177 1.728e-10
39 A0A0J1IKW6 AlphaFold 343 131 473 458 2.179e-10
40 W6P0X4 AlphaFold 364 8 371 351 3.384e-09
41 A0A1S2M6M7 AlphaFold 426 2 427 405 4.266e-09
42 R6JA19 AlphaFold 376 3 371 382 1.413e-08
43 A0A381I5L2 AlphaFold 270 60 329 280 2.281e-07
44 A0A519L4G4 AlphaFold 368 2 361 383 7.267e-07
45 G6C4I8 AlphaFold 332 2 329 340 1.636e-06
46 F9EQX2 AlphaFold 339 7 345 351 2.062e-06
47 G6C480 AlphaFold 334 3 333 343 2.703e-06
48 A0A6J4PT32 AlphaFold 355 98 452 341 2.809e-06
49 A0A3D4XDJ0 AlphaFold 154 749 902 155 3.202e-05
50 A0A015YFH2 AlphaFold 229 668 896 235 9.088e-05
51 R5RI63 AlphaFold 241 656 896 235 0.0001061
52 A0A2M9UQC2 AlphaFold 229 668 896 235 0.0001287
53 A0A016CRA8 AlphaFold 229 668 896 235 0.0001337
54 A0A015XC12 AlphaFold 229 668 896 235 0.0001445
55 A0A016EGS9 AlphaFold 229 668 896 235 0.0001821
56 A0A829STM3 AlphaFold 229 668 896 235 0.0001968
57 R6ZBC6 AlphaFold 229 668 896 235 0.0002387
58 A0A015T5Z9 AlphaFold 217 680 896 235 0.0002481
59 A0A0E2AT28 AlphaFold 229 668 896 235 0.0002481
60 Q64W02 AlphaFold 224 668 891 228 0.0003128
61 A0A3E5IH75 AlphaFold 217 680 896 235 0.0003651
62 A0A1Q5PJ67 AlphaFold 170 175 344 171 0.0004261
63 A0A015XIX7 AlphaFold 229 668 896 235 0.0004603
64 A0A812IFX7 AlphaFold 532 129 654 539 0.000627
65 7sqc PDB 771 132 902 702 0.0009077
66 7pks PDB 693 174 816 695 0.0009806
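
Hit rows in this whitespace-separated form can be summarised with a few lines of code. The sketch below copies five rows from the table above and reads only the rank, target, database and E-value fields, leaving the coordinate fields in between unparsed. The E-value cut-off of 1e-10 is an arbitrary illustrative threshold, not one used by MGnifams, and `parse_hit` is a hypothetical helper.

```python
from collections import Counter

# Five rows copied verbatim from the hit table above.
ROWS = """\
1 U2BNG1 AlphaFold 895 1 895 897 9.737e-64
2 A0A3D4XDN3 AlphaFold 548 1 548 550 2.642e-38
51 R5RI63 AlphaFold 241 656 896 235 0.0001061
65 7sqc PDB 771 132 902 702 0.0009077
66 7pks PDB 693 174 816 695 0.0009806
"""


def parse_hit(line: str) -> dict:
    """Read rank, target, database (first three fields) and E-value (last field)."""
    fields = line.split()
    return {
        "rank": int(fields[0]),
        "target": fields[1],
        "db": fields[2],
        "evalue": float(fields[-1]),
    }


hits = [parse_hit(line) for line in ROWS.splitlines() if line.strip()]
hits_per_db = Counter(hit["db"] for hit in hits)
# Arbitrary illustrative cut-off (an assumption of this sketch).
strong_targets = [hit["target"] for hit in hits if hit["evalue"] <= 1e-10]
print(dict(hits_per_db), strong_targets)
```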

Family Representative Sequence Viewer

TYQTRRKLTEAWVNSTADKGRYLSRREKEMLPDVYGYTIPQDACNEMRKLLTENHYGTLSELYQRRFSSLVDVCVPEESREEFYYALDQMNQYQMTAGWYRRSLRSDSYAPFAEQSVRVLRGYSRLGFYGVTLADLLTGNTQPEFYDHARNERFSYAEILAAQIDRGNEKAVQAVKDILLGEGNTAMLSHELIRGIVMSRNKELYDVLGKFLLAARLQEGARQAVCETMDAGRPEAFLHLFAVIEENDLIRYSSVKRAVSTWIGIFNEKSVDRITDKLLRLMGRCLRDPDFCEEQLSSEDSVAISCALWAKGFYDAKAAVRAVEKLIADGTKHQKMTASYFNQSIQDERLRMQASKDVIIKCSDDLELVACFLPGFMESTGSHFYRLVKDEGSNAYSLRGGKIVKPKKMAPEEMFADRAEALRCYRILKEILKKVPKKGITLSPCIFPWHQVTMNQSDLAARLCLIAWMLQDEEVLDEAAGFIPLIGQGAGYSYYGASRAAAARLLLYRPKSAARKKILFELLHNPEEYTNKEAHLLAEDMELTSEDYIQIEKNLRYKKGRKGTLSLLRRQDKSSLVSSIARLLEEKSEECRMGALDLALELKKEDAGYFENVAPYLRTLSEPTGKEQVLLKELLGEESAAQDILNTPGYGLYDGKKDWILPPVEVDRNQAFGLFTYGEGECIRVYKELDELIGEHASRSYKTAWDQEELLGNDLKTSRYIHNDPDAKPLDAYPFPELWKKFYETKIGTPQLLLEVELYRQCCLQRGLYEQNRKLYKQVFGSGILKRPPFQNLLPSVAYGRQVHTLISVLFAQYVPDALKARFALCGTAKFLSVLDTSNDIFTVNEKRWNGELVTYTKRAAELPIFADMIHWLSCAEEKDWGSAFTLRFRLGQHYLGQEKRERQQYSYQSSSHFYLGLG

HMM Consensus ekekkkeklkkkkkklskeekklleellkesdkdyeyysykeelenelkelleekkikklselfekelkkllellvgkelaedflyildklnkypystgyyRRsvRsknyepylekiisllrallklafygldledllkgeldeeeldyireslsfsyiiAaeidrgneeviealkdillsenntallsrelirgIlmSsneelhellgkLLlAArLQEGLRQaIcEtmDeGtleaflyllkvIeendLiRFSSVkRAvatWtGlgdeeskdriskklleligkcLrdpeereealkseDnveiylALWakGfydvedaveaveeLlksgtkhqklvasyflrslqdeklkrelakkvleeysdDlellaailpnylsdlyynsyeeekkkpkleeyfedkeeaeelfeilkellerlkkketfspciFpWysvtlsksdvaeklalialllqdeelidelaellpeidsysRaallrlllkkpkteaqrefllelLadrseytretAleilkkleLteeeyreledlLrlKssdlRknvislLlkqddealeasierLlsdkkeekrlAaLdlllqlkkdekraelfeelkellkeiekptekEkilleellgeekseaeeytkengfglydpdkevelpeikkdkkldlkklfsliseeelkeilkkLdalieehkdyeYksaygetvlLgnsfrikyseeekldnyPlaelWeefyekeikdpekllqlylllrlrkneeeykkklkkllekvfgklkkkklkkelkklkysnqvrdiisalfeeysdeekkqkfalallsallsllpeknllkkykekkkeseseeytasseelaefkralekleeaetdeefakafeaalkles