The formation of insoluble inclusion bodies remains a major obstacle to the widespread application of prokaryotic expression systems for recombinant protein production [2, 24]. To address this challenge, we developed a versatile fusion tag system that enables rapid screening for optimal solubility enhancers. In this study, the pullulanase PulA could not be expressed in E. coli BL21(DE3) (Figure S2), likely due to its origin from the hyperthermophilic anaerobe Thermotoga neapolitana. Although the PulA gene was codon-optimized for E. coli, it still failed to express, suggesting that factors beyond codon usage, such as protein folding kinetics or compatibility with the host proteostasis network, may have contributed to this failure [2, 25]. Similarly, proteins such as Mals and NodE, which may contain rare codons or have complex folding requirements, achieved soluble expression only when fused with specific tags. For instance, MsyB markedly enhanced the solubility of NodE (Fig. 5a), while ArsC, SlyD, DsbA, MsyB, Crr, and Snut all significantly improved the solubility of Mals (Fig. 4).
Other target proteins, including eGFP, FabGXL, FabF1XL, FabF2XL, and FabZXL, were predominantly expressed as inclusion bodies in the absence of fusion tags. While soluble expression was achievable with tag assistance, the efficacy of each tag varied considerably depending on the target protein. For example, all eight tags improved FabZXL solubility (Fig. 5e), whereas only YjgD enabled soluble expression of FabGXL (Fig. 5b). Similarly, MsyB and YjgD were the most effective tags for FabF1XL and FabF2XL (Fig. 5c–d), and the ArsC, Ecotin, MsyB, SlyD, Snut, and YjgD tags promoted the solubilization of eGFP (Fig. 2). These observations highlight the target-specific nature of fusion tag efficacy and underscore the importance of employing a multi-tag screening approach [15, 26].
The solubilization mechanism of fusion tags remains incompletely understood, though it is often suggested to relate to their own folding properties and biophysical characteristics, such as surface charge or hydrophilicity [12, 27]. To gain preliminary insight into the mechanisms underlying solubility enhancement, we analyzed the core biophysical properties of our eight fusion tags (Supplementary Table S3). We observed a general trend: tags with a high negative net charge and a hydrophilic nature (indicated by a negative GRAVY index), and particularly highly acidic tags (pI < 5.0) such as MsyB and YjgD, tended to be the most effective and versatile solubility enhancers. This finding is consistent with prior studies on acidic fusion partners [12, 13]. However, notable exceptions highlight the complexity of the mechanism. For instance, the superior performance of SlyD with Mals likely stems from its intrinsic chaperone activity rather than its charge properties alone [14]. Interestingly, the less acidic Snut (pI 6.324, GRAVY −1.106, net charge −1.79) still demonstrated considerable solubilization efficacy, performing well with target proteins such as eGFP, Mals, NodE, and FabZXL. Furthermore, the ability of certain tags (e.g., YjgD) to solubilize particularly recalcitrant proteins such as FabGXL, for which the other tags, including MsyB, failed, underscores that optimal tag selection results from a complex, individualized match between the tag's biophysical properties and the target protein's specific folding pathway and structural needs.
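The three descriptors discussed above (GRAVY index, net charge, and pI) can be estimated directly from an amino acid sequence. The following is a minimal sketch rather than the procedure used to generate Supplementary Table S3, which is not specified here. The GRAVY calculation uses the standard Kyte-Doolittle hydropathy scale; the pKa set, the function names, and the example peptide are illustrative assumptions, and different pKa sets will give slightly different net charge and pI values.

```python
# Kyte-Doolittle hydropathy values (standard published scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# ASSUMED pKa set (one commonly used choice); not taken from the study.
PKA_POS = {"K": 10.8, "R": 12.5, "H": 6.5}            # groups that carry +1 when protonated
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}   # groups that carry -1 when deprotonated
PKA_NTERM, PKA_CTERM = 8.6, 3.6


def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)


def net_charge(seq: str, ph: float = 7.0) -> float:
    """Approximate net charge at a given pH via the Henderson-Hasselbalch equation."""
    seq = seq.upper()
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_NTERM))   # free N-terminus
    negative = 1.0 / (1.0 + 10 ** (PKA_CTERM - ph))   # free C-terminus
    for aa, pka in PKA_POS.items():
        positive += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    for aa, pka in PKA_NEG.items():
        negative += seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return positive - negative


def isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """Estimate pI by bisection: net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid   # still positively charged, so pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical acidic peptide for illustration only; not one of the eight tags.
    example = "MDEEDLEASDDE"
    print(f"GRAVY: {gravy(example):.3f}")
    print(f"Net charge at pH 7: {net_charge(example):.2f}")
    print(f"Estimated pI: {isoelectric_point(example):.2f}")
```

Applied to the actual tag sequences, a calculation of this kind would be expected to reproduce the qualitative ranking (acidic, hydrophilic tags versus near-neutral ones), although the exact numbers depend on the pKa set chosen.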
However, the molecular dimensions and structural properties of fusion tags can also interfere with the folding and functionality of target proteins, as evidenced by our experimental data. A notable example is the DsbA tag, which substantially quenched eGFP fluorescence despite maintaining reasonable solubility (Fig. 2C). To examine this effect further, the soluble protein EcFabG was intentionally selected to analyze the impact of fusion tags on solubility and function. As anticipated, the addition of tags reduced its solubility ratio to varying degrees, with the largest decrease reaching 58%. Furthermore, tags including Crr, Ecotin, DsbA, and SlyD were found to impair EcFabG's enzymatic activity (Fig. 3C). These results emphasize that while fusion tags can enhance solubility, they may also interfere with protein function. It is therefore common practice to remove fusion tags after purification [28]. Interestingly, however, several tags in this study, including SlyD, MsyB, and Crr, enhanced the enzymatic activity of Mals even before cleavage (Fig. 4C). This suggests that, in some cases, fusion partners may do more than improve solubility; they may also assist in folding or stabilize the active conformation of certain target proteins [14, 15]. The underlying mechanisms warrant further investigation.
In summary, our results confirm that no single fusion tag is effective for all target proteins; the optimal tag must be determined empirically through parallel screening [15]. The pX vector system developed in this study provides a convenient and efficient platform for such screening, enabling rapid identification of the most suitable fusion tag to enhance both the solubility and functional yield of diverse recombinant proteins.