The names/IDs of the PKSs are encoded as BGC_n, where n is a positive integer.
Download PKS sequences in:
The docking domain (DD) names are encoded as "BGC_n_LINKER_k_HC" for Heads (C-terminal DDs) and "BGC_n_LINKER_k_TN" for Tails (N-terminal DDs), where BGC_n is the BGC ID (equivalently, the PKS ID) and LINKER_k is the docking domain ID.
Heads: 100 AA at the C-terminus of a PKS peptide.
Tails: 50 AA at the N-terminus of a PKS peptide.
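As an illustration of the Head and Tail definitions, the following minimal Python sketch slices both regions out of a peptide sequence given as a plain string. The function names and constants are hypothetical and not part of the dataset or of DDAP. The handling of peptides shorter than the window (the whole sequence is returned) is an assumption, since the behavior for short peptides is not stated here.

```python
HEAD_LEN = 100  # residues taken from the C-terminus (Head)
TAIL_LEN = 50   # residues taken from the N-terminus (Tail)


def extract_head(peptide: str) -> str:
    """Return the last HEAD_LEN residues of a PKS peptide.

    Assumption: a peptide shorter than HEAD_LEN is returned whole.
    """
    return peptide[-HEAD_LEN:]


def extract_tail(peptide: str) -> str:
    """Return the first TAIL_LEN residues of a PKS peptide.

    Assumption: a peptide shorter than TAIL_LEN is returned whole.
    """
    return peptide[:TAIL_LEN]


if __name__ == "__main__":
    # Toy sequence of 200 residues, used only to show the slicing.
    toy = "M" + "A" * 198 + "K"
    print(len(extract_head(toy)), len(extract_tail(toy)))
```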
Heads or Tails with the same n are in the same pathway.
Heads and Tails with the same n and k are interacting DDs.
Heads and Tails with the same n but different k are non-interacting DDs.
For example, "BGC_2_LINKER_3_HC" interacts with "BGC_2_LINKER_3_TN" but not "BGC_2_LINKER_1_TN".
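The naming rules above can be checked programmatically. The sketch below parses a docking domain ID and tests whether a Head and a Tail are labeled as interacting. The helper names (parse_dd_id, same_pathway, interacts) are hypothetical and are not part of the dataset. The regular expression assumes that both n and k are written as plain decimal integers.

```python
import re
from typing import NamedTuple

# Assumption: n and k are plain decimal integers.
DD_ID = re.compile(r"^BGC_(\d+)_LINKER_(\d+)_(HC|TN)$")


class DockingDomain(NamedTuple):
    bgc: int     # n, the BGC (PKS) ID
    linker: int  # k, the docking domain ID
    kind: str    # "HC" for Heads, "TN" for Tails


def parse_dd_id(dd_id: str) -> DockingDomain:
    """Split an ID such as 'BGC_2_LINKER_3_HC' into its parts."""
    match = DD_ID.match(dd_id)
    if match is None:
        raise ValueError(f"not a docking domain ID: {dd_id!r}")
    return DockingDomain(int(match.group(1)), int(match.group(2)), match.group(3))


def same_pathway(id_a: str, id_b: str) -> bool:
    """Two docking domains with the same n belong to the same pathway."""
    return parse_dd_id(id_a).bgc == parse_dd_id(id_b).bgc


def interacts(head_id: str, tail_id: str) -> bool:
    """A Head and a Tail interact when both n and k are the same."""
    head, tail = parse_dd_id(head_id), parse_dd_id(tail_id)
    return (
        head.kind == "HC"
        and tail.kind == "TN"
        and head.bgc == tail.bgc
        and head.linker == tail.linker
    )


if __name__ == "__main__":
    print(interacts("BGC_2_LINKER_3_HC", "BGC_2_LINKER_3_TN"))
    print(interacts("BGC_2_LINKER_3_HC", "BGC_2_LINKER_1_TN"))
```

With these helpers, interacting pairs can serve as positive examples and same-pathway pairs with different k as negative examples.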
The "basic information" file provides a mapping scheme from the above IDs to the PKS gene names.
MIBiG v.1.4 has 252 type I modular PKS records. After excluding PKSs with published pathways, PKSs that do not have available sequences, and PKSs with only one PKS gene, we used antiSMASH + DDAP to predict pathways for the remaining 80 PKSs. A maximum of 50,000 putative pathways is kept for each PKS.