
Development of a β-glucosidase improved for glucose retroinhibition for cellulosic ethanol production: an integrated bioinformatics and genetic engineering approach

Abstract

Background

The global energy crisis, driven by economic growth and the increasing demand for energy, highlights the urgency of searching for alternative energy sources to mitigate environmental pollution and climate change. β-Glucosidases act in the final step of the enzymatic hydrolysis of cellulose, cleaving the β-1,4-glycosidic bonds in cellobiose to produce second-generation ethanol. However, these enzymes are easily inhibited by glucose, their final product, which limits the production of this biofuel. Genetic engineering combined with bioinformatics tools can improve key enzymatic characteristics, such as catalytic activity and glucose tolerance, in a more precise, faster, and cost-effective manner compared to traditional methods. In this work, a variant of a β-glucosidase from the GH1 family, isolated from the microbial community of Amazonian soil (Brazil), with enhanced catalytic activity and improved for glucose retroinhibition, was developed.

Results

Bioinformatics analyses suggested the substitution of tryptophan at position 404 with leucine. The produced variant (W404L) was expressed in Escherichia coli and showed activity 3.2 times higher in the presence of glucose than the non-mutated control. Moreover, the partially purified mutated variant of β-glucosidase exhibited a 26-fold increase in catalytic activity compared to the original form of the enzyme. The results confirmed that the mutation proposed by computational analyses had a significant impact on enzyme catalytic activity and glucose retroinhibition.

Conclusions

This new variant may become a promising alternative to reduce the costs of enzyme cocktails used in the hydrolysis of lignocellulosic biomass used as a raw material in the production of second-generation ethanol.

Background

Economic growth and the consequent increase in energy demand have triggered a global energy crisis, directly linked to rising environmental pollution and climate change [1]. These issues are urgent and require immediate governmental attention. Fossil fuels dominate the energy matrix, accounting for approximately 81% of the total, and their use releases high concentrations of CO2 into the atmosphere, along with other greenhouse gases (GHGs) [2]. The search for alternative energy sources, which minimize the impact of fossil fuels, has become a priority for sustainable development. Bioenergy, derived directly or indirectly from biomass, has emerged as the world’s fourth-largest energy source, gaining increasing significance [3].

Ethanol is the most widely produced and used biofuel, commonly obtained through the fermentation of sugarcane juice, beet, or corn. Cellulosic ethanol, also known as second-generation ethanol (2G), is produced through the fermentation and distillation of cellulose-rich plant residues, such as sugarcane bagasse and corn straw [4]. It is estimated that 2G ethanol production can reduce GHG emissions by approximately 52–55% compared to gasoline [5].

Cellulose is the most abundant renewable polysaccharide on the planet. It consists of long chains of linked glucose units and is the main constituent of the plant cell wall. The enzymatic hydrolysis of cellulose occurs through the synergistic action of three classes of enzymes: (i) endoglucanases (EC 3.2.1.4), (ii) exoglucanases (EC 3.2.1.91), and (iii) β-glucosidases (EC 3.2.1.21) [6]. Endoglucanases hydrolyze the cellulose chains internally, while exoglucanases act on the reducing and non-reducing ends, generating cello-oligosaccharides, mainly cellobiose. These oligosaccharides are subsequently hydrolyzed by β-glucosidases to produce glucose, which is then fermented to produce biofuels [7].

β-Glucosidases, also known as β-D-glucoside hydrolases, act in the final stage of cellulose saccharification, cleaving the β-1,4-glycosidic bonds of cellobiose, releasing glucose as the final product, which microorganisms then ferment into ethanol [7]. This class of cellulases plays an essential role in the production of 2G ethanol, as they not only act in the final hydrolysis step to obtain glucose but also mitigate the feedback inhibition exerted by cellobiose on both endoglucanase and exoglucanase [8]. Besides biofuels, β-glucosidases have various applications in the food and beverage, pharmaceutical, and chemical industries, among other areas, highlighting their economic potential [9].

Most β-glucosidases described in the literature show low catalytic activity and stability and are susceptible to inhibition by glucose, limiting their industrial application. As the concentration of glucose increases over time, the catalytic activity of these enzymes is reduced, resulting in decreased glucose production and accumulation of cellobiose. This accumulation, in turn, interrupts the entire cellulolytic system, leading to a halt in production [7, 10]. Most β-glucosidases used in industry are predominantly of fungal origin, such as those from Trichoderma reesei and Aspergillus niger. Some of these enzymes, like those from Aspergillus niger ASKU28, exhibit glucose tolerance [11]. However, despite their tolerance, these β-glucosidases demonstrate relatively low hydrolytic activity with cellobiose.

β-Glucosidases belonging to the GH1 family are widely studied and known for their glucose tolerance, making them interesting targets for research in biofuel production applications [7]. GH1 enzymes are generally characterized as non-specific, vary in size depending on auxiliary domains, and are commonly found in Archaea, plants, and animals [12]. Enzymes belonging to this family exhibit a barrel-like (β/α)8 structure, where the conserved catalytic amino acids are located at the C-terminal portion of β strands 4 and 7 [12]. The catalytic mechanism occurs through the protonation of the anomeric carbon of cellobiose by a catalytic acid/base, followed by a nucleophilic attack by the catalytic nucleophile, involving glutamate residues present in conserved regions of its active site for both processes. Subsequently, a hydrolysis reaction occurs, resulting in the release of glucose and the restoration of the enzyme's conformation [12, 13].

The mechanism of inhibition of β-glucosidase catalytic activity by glucose is not yet fully understood. Although it has been correlated with structural characteristics of the active site entrance, the most plausible mechanism is competitive inhibition, where both the substrate and the inhibitor compete for binding to the enzyme's active site [14, 15]. In recent years, several studies have aimed to understand the inhibition mechanism of β-glucosidases and to isolate new enzymes that exhibit less feedback inhibition by glucose through bioprospecting and screening of new organisms in nature [16, 17]. However, enzymes obtained directly from nature often lack the desired level of activity.

The most commonly used method in protein genetic engineering is error-prone polymerase chain reaction (Ep-PCR), which creates a library of variants of a particular enzyme through random mutations in its gene [18]. Although this technique provides numerous protein mutants, it generally produces more deleterious than advantageous mutations and requires screening to identify the desired variants, which makes the process entirely random, potentially expensive and time-consuming, and not always successful [19]. An alternative to Ep-PCR is the application of bioinformatics tools, including molecular modeling of the whole structure and/or specific domains, to select important protein regions/residues as mutation targets, minimizing the size of the mutagenesis library and enhancing result precision [19]. This computational modeling approach enables the creation of a protein structure database, facilitating predictions of interactions between target molecules and their ligands (substrate and/or inhibitor). The structure-model-driven strategy supports specific changes in amino acid sites within the protein sequence, with the goal of enhancing catalytic efficiency, thermostability, or other desired properties [20].

This study aimed to develop a β-glucosidase variant with enhanced catalytic activity and improved glucose feedback inhibition compared to the wild type, using a β-glucosidase from the GH1 family, isolated by Bergmann et al. [21] from the microbial community of the Amazon soil, Brazil.

Methods

β-Glucosidase origin

The β-glucosidase, named AMBGL18, was discovered by Bergmann et al. [21] through a metagenomic analysis of microorganisms from Amazonian soil collected in a forest area in the region of the city of Moju (Pará, Brazil). This enzyme is classified in family 1 of glycosyl hydrolases (GH1) and was subjected to the enzyme engineering process described below.

In silico analyses

The main aim of the study was to find a single point mutation in AMBGL18 capable of reducing enzyme inhibition by glucose while maintaining enzyme stability and activity with cellobiose. In silico analyses were divided into two main stages presented in Fig. 1. The first stage focused on determining the specific position within the AMBGL18 sequence that could be mutated. The second stage involved identifying which of the 19 amino acids should replace the original residue at the selected position.

Fig. 1 Flowchart showing the detailed stages for discovering the amino acid to be mutated

Because β-glucosidase (AMBGL18) converts cellobiose into glucose and glucose in turn inhibits the enzyme, the method used to identify the mutation site was based on analyzing the differential interactions of glucose and cellobiose with the amino acids of β-glucosidases. By compiling a list of amino acids and their interaction frequencies with glucose or cellobiose, it was possible to determine which amino acids interacted more frequently with glucose and less with cellobiose. Since inhibition occurs due to the binding affinity of amino acids for glucose [22], the rationale behind this method was that an amino acid with a higher affinity for the inhibitor (glucose) and a lower affinity for the substrate (cellobiose) would likely decrease enzyme activity, making it a strong candidate for mutation.

To explore the most frequent interactions between glucose, cellobiose, and β-glucosidases (not limited to AMBGL18), a set of proteins was retrieved from the Protein Data Bank (PDB) [23] by searching for the term “beta-glucosidase”. FASTA sequences were downloaded from NCBI Protein [24], and the PDB files for all structures were stored in a local database. These structures belong to different β-glucosidase families, such as GH1, GH3, GH5 and others. The sequences were pairwise aligned to verify their similarity using Biopython’s “pairwise2” module with the BLOSUM62 scoring matrix [25]. Based on the alignment identity results, a similarity matrix was constructed and used as input to group proteins sharing an identity score of 90% or higher.
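The grouping step can be sketched as follows. This is a minimal illustration and not the authors' script: percent identity is computed with a small hand-written global alignment whose match, mismatch and gap scores are assumptions standing in for pairwise2 with BLOSUM62, and groups are formed by single linkage at the 90% threshold, since the exact clustering rule is not stated in the text. The sequence names in the usage note are hypothetical.

```python
from itertools import combinations


def global_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Percent identity from a Needleman-Wunsch global alignment.

    Toy scoring scheme (assumption); the study used BLOSUM62.
    Identity = identical aligned columns / alignment length * 100.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback: count identical columns and total alignment length.
    i, j, same, length = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            same += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        length += 1
    return 100.0 * same / length if length else 0.0


def group_by_identity(seqs, threshold=90.0):
    """Single-linkage groups: sequences joined by any chain of pairs
    whose identity is >= threshold end up in the same group."""
    parent = {name: name for name in seqs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in combinations(seqs, 2):
        if global_identity(seqs[p], seqs[q]) >= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # Largest group first, mirroring the selection of the largest group.
    return sorted(groups.values(), key=len, reverse=True)
```

Calling `group_by_identity({"a": seq_a, "b": seq_b, "c": seq_c})` returns a list of sets of names, largest first, so the first element plays the role of the group chosen for docking.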

Since the protein structure of AMBGL18 was not available in the PDB, its sequence was aligned with the β-glucosidase sequences of the local data set. The sequence with the highest identity (49%) belonged to the protein with PDB ID 1OD0. This value was obtained with Biopython’s pairwise2 module and the BLOSUM62 scoring matrix [25]. Consequently, the group of β-glucosidases containing 1OD0, which was also the largest group, was selected to provide the receptors for the docking simulations. These protein structures were then prepared for docking by performing a structural alignment to ensure a consistent search box across all proteins. Protein structures were aligned using Biopython’s Superimposer class, guided by a similarity matrix to prioritize conserved regions. This step ensured spatial consistency across protein conformations and provided a consistent reference frame for docking. Both receptor and ligand were converted to PDBQT files containing only the polar hydrogen atoms and partial charges.

In the next step, molecular docking simulations were conducted for all β-glucosidase proteins of the selected group, with both glucose and cellobiose. Ligand structures (glucose and cellobiose) were obtained from the ZINC Database [26]. The molecular docking simulations were performed using AutoDock Vina 1.1.2 [27] with a grid box of 20 × 20 × 20 Å and exhaustiveness set to 8, using a Python script configured and generated from an in-house framework for virtual screening [28]. AutoDock Vina is a popular docking program that predicts the best position and conformation of a ligand in the binding site of a protein. It uses an empirical scoring function to estimate the protein–ligand interaction in terms of Free Energy of Binding (FEB), where more negative values indicate stronger and more favorable interactions.
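As an illustration of these docking settings, the sketch below writes an AutoDock Vina configuration with the 20 × 20 × 20 Å box and exhaustiveness of 8 given in the text. It is not the in-house framework: the file names and the box center are hypothetical placeholders, because the center coordinates used for the aligned receptors are not reported.

```python
def vina_config(receptor, ligand, center, size=(20.0, 20.0, 20.0),
                exhaustiveness=8, out="docked.pdbqt"):
    """Return the text of an AutoDock Vina configuration file.

    `center` is the (x, y, z) box center in Å. It is a placeholder here:
    the study derived a common box from structurally aligned receptors.
    """
    axes = ("x", "y", "z")
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    lines += [f"center_{a} = {c}" for a, c in zip(axes, center)]
    lines += [f"size_{a} = {s}" for a, s in zip(axes, size)]
    lines += [f"exhaustiveness = {exhaustiveness}", f"out = {out}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names and box center, for illustration only.
    print(vina_config("1od0.pdbqt", "glucose.pdbqt", (0.0, 0.0, 0.0)))
```

The resulting text can be saved to a file and passed to the program with `vina --config <file>`, once per receptor and ligand pair.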

Following the docking simulations, the number of interactions between glucose, cellobiose, and each protein was quantified using LigPlot+ [29] and nAPOLI [30]. These tools were used to calculate and visualize the interactions between the proteins of the selected group and the ligands, considering the best docking result for each complex. Interactions were counted if any interaction type computed by nAPOLI was identified between any atom of either glucose or cellobiose and the corresponding atoms of the protein. The interactions computed by nAPOLI and the corresponding atom distances were: (i) aromatic stacking (between 2 and 4 Å), (ii) hydrogen bond/hydrogen bond mediated by water (between 2.8 and 3.9 Å), (iii) hydrophobic (between 2 and 4.5 Å), (iv) attractive electrostatic (between 2 and 6 Å); and (v) repulsive electrostatic interactions (between 2 and 6 Å).
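The distance windows listed above can be written down as a small lookup. The sketch below only checks which interaction types are compatible with a given atom-atom distance. It is an assumption-laden simplification: nAPOLI also considers atom types and chemistry, which are not modeled here, and treating the bounds as inclusive is a choice made for this example.

```python
import math

# Distance windows in Å, as quoted in the text for each interaction type.
INTERACTION_WINDOWS = {
    "aromatic_stacking": (2.0, 4.0),
    "hydrogen_bond": (2.8, 3.9),  # direct or water-mediated
    "hydrophobic": (2.0, 4.5),
    "attractive_electrostatic": (2.0, 6.0),
    "repulsive_electrostatic": (2.0, 6.0),
}


def distance_compatible_types(distance):
    """Interaction types whose window contains `distance` (bounds inclusive,
    an assumption). Atom typing is deliberately ignored in this sketch."""
    return [name for name, (low, high) in INTERACTION_WINDOWS.items()
            if low <= distance <= high]


def atom_distance(p, q):
    """Euclidean distance between two (x, y, z) coordinates in Å."""
    return math.dist(p, q)
```

For example, a pair of atoms 5.0 Å apart is compatible only with the two electrostatic windows, while a pair 7.0 Å apart matches none.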

Finally, based on the nAPOLI interaction results, the frequency of interactions between the amino acids of each protein and the two ligands was assessed. By inspecting these interaction frequencies, it was possible to identify the amino acid with the greatest disparity in interaction numbers between cellobiose and glucose, indicating it as the optimal candidate for mutation. Knowing which amino acid is most likely to play a role in the inhibition of AMBGL18, as described in the previous subsection, we now present the methodology employed to select the optimal replacement amino acid that would result in a stable and active mutated AMBGL18.
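The frequency comparison described in this paragraph can be sketched as below. The input format (one set of contacting residue labels per docked structure) and the residue labels in the usage note are hypothetical; only the idea of ranking residues by the gap between glucose and cellobiose contact frequencies comes from the text.

```python
def rank_candidates(glucose_contacts, cellobiose_contacts):
    """Rank residues by (glucose frequency - cellobiose frequency).

    Each argument maps a structure ID to the set of residue labels that
    contact the ligand in the best docking pose (hypothetical format).
    Returns (residue, glucose_freq, cellobiose_freq) tuples, with the
    largest glucose-over-cellobiose gap first.
    """
    def frequencies(contacts):
        total = len(contacts)
        counts = {}
        for residues in contacts.values():
            for res in residues:
                counts[res] = counts.get(res, 0) + 1
        return {res: n / total for res, n in counts.items()}

    glc = frequencies(glucose_contacts)
    cel = frequencies(cellobiose_contacts)
    residues = set(glc) | set(cel)
    rows = [(r, glc.get(r, 0.0), cel.get(r, 0.0)) for r in residues]
    # Sort by descending gap, then by label for a deterministic order.
    return sorted(rows, key=lambda row: (-(row[1] - row[2]), row[0]))
```

With toy input in which a residue contacts glucose in three of four structures and cellobiose in none, that residue is returned first with frequencies 0.75 and 0.0.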

Since the resolved three-dimensional structure of AMBGL18 was not available in the Protein Data Bank, homology modeling and validation were performed. The modeling was conducted using Phyre2 [31], a web-based tool for predicting protein structures through homology modeling based on a comprehensive database of known structures, utilizing evolutionary information and sequence profiles to predict protein folding. After obtaining the model, its quality was evaluated using PROCHECK [32], Verify3D [33], ModFOLD [34], and MolProbity [35]. PROCHECK evaluates the stereochemical quality of a model by analyzing the overall geometry and each residue’s backbone angles, focusing on the percentage of residues in the favored regions of the Ramachandran plot. Verify3D evaluates the compatibility between the three-dimensional model and its physicochemical properties, generating an average 3D–1D score for each residue, where protein structures are considered to have high quality when more than 80% of the residues score 0.2 or higher. ModFOLD provides a single score (Global Model Quality) and a p value relating to the predicted quality of a 3D model of a protein structure. MolProbity offers an analysis of the geometry, including the detection of steric clashes and rotamer outliers, where lower MolProbity scores reflect better structural accuracy.
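The Verify3D acceptance rule quoted above reduces to a simple fraction check. The sketch below assumes a plain list of per-residue averaged 3D–1D scores as input, which is a hypothetical format rather than the tool's actual output file, and it treats the 0.2 cutoff as inclusive.

```python
def verify3d_pass(scores, cutoff=0.2, min_fraction=0.80):
    """Return (fraction, passed) for per-residue averaged 3D-1D scores.

    A model passes when more than `min_fraction` of residues score at
    least `cutoff`. The inclusive cutoff is an assumption of this sketch.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    fraction = sum(s >= cutoff for s in scores) / len(scores)
    return fraction, fraction > min_fraction
```

A model in which nine of ten residues reach the cutoff gives a fraction of 0.9 and passes, while a model at exactly 80% does not pass under the "more than 80%" wording.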

Next, it was crucial to accurately identify the amino acid position in AMBGL18 to mutate, given that this protein sequence shares 49% identity with proteins in the selected group. Thus, the sequences and structures of these proteins were aligned, allowing the identification of the position defined by Stage 1 in AMBGL18 through the PROMALS3D tool [36].

With the AMBGL18 model validated and the mutation site identified, it was necessary to select the optimal replacement amino acid that would result in a thermostable mutated protein. Thermostability is related to the resistance of a protein to denaturation at elevated temperatures, which is an important characteristic to consider when evaluating suggested mutations, as it can significantly impact the protein’s functionality and stability. In this context, a single point mutation can either stabilize or destabilize a protein structure according to the change in Gibbs free energy (ΔΔG) between the mutated and wild-type proteins.

Several tools exist for evaluating the thermostability of proteins upon point mutation. These tools can be broadly divided into those that require only the amino acid sequence and position, and those that also utilize the protein's three-dimensional structure. Since predictions obtained with these tools usually do not converge on the same answer, a combination of sequence-based tools (MuPro [37], iStable [38], MuStab [39]) that predict whether a mutation causes thermostabilization or destabilization, and structure-based tools (iMutant 3.0 [40], SDM [41], mCSM [42], DUET [43], MAESTRO [44], iRDP [45], INPS [46], NeEMO [47], SAAFEC [48], EASE–MM [49]) that predict the ΔΔG, were employed. Combining the results of all the tools, it was possible to establish criteria for selecting the optimal substitute amino acid.

Genetic construct and enzymatic activity

The genes encoding wild-type β-glucosidase (AMBGL18) and the mutant (named W404L) were synthesized by Epoch Life Science, Inc. (USA) using pCDF-1b as the expression vector. The main characteristics of the plasmids are described in Figure S1. The recombinant plasmids pCDF-1b_AMBGL18 and pCDF-1b_W404L were transformed into the Escherichia coli BL21 (DE3) strain (Biolabs) for protein overexpression. Positive clones were cultured in Luria–Bertani (LB) medium [50] supplemented with 50 µg/mL ampicillin and stirred at 250 rpm at 37 °C until the optical density (OD600) reached 0.8. After this process, 5 µL of the cultures were plated on LB medium containing 2% agar, 50 µg/mL ampicillin, 0.8 mM IPTG, 0.1% ferric ammonium citrate (Sigma-Aldrich), and 0.5% esculin (Sigma-Aldrich). To observe glucose tolerance, plates with and without 0.5% glucose were incubated at 37 °C for 24 h [51, 52].

Mattéotti et al. [53] developed a method to quantify β-glucosidase activity in a liquid medium using 4-nitrophenyl-β-D-glucopyranoside (pNPGlc) as a substrate. Colonies of transformed BL21 bacteria, both those expressing the AMBGL18 enzyme and those expressing its mutant (W404L), were grown in LB medium supplemented with 50 μg/mL ampicillin at 37 °C with stirring at 250 rpm until OD600 reached a value of 0.8. Aliquots of 100 µL from each of the cultures were collected and centrifuged at 13,000 rpm for 3 min. The supernatant was discarded, and the pellet was resuspended in 100 µL of a solution containing 4 mM pNPGlc in 100 mM sodium phosphate buffer, pH 8.0. The mixture was incubated at 37 °C for 30 min, and the enzymatic reaction was stopped by adding 100 µL of 1 M Na2CO3. The cells were precipitated by centrifugation at 10,000 × g for 5 min, and the supernatant (200 µL) was transferred to a transparent 96-well plate. To evaluate the effect of the mutation under conditions of strong inhibition, we performed reactions using a high glucose concentration (500 mM). Cell density was normalized prior to measuring enzyme activity to account for variations in cell growth. The amount of p-nitrophenol (pNP) released was measured at a wavelength of 405 nm and compared with a standard curve prepared with various concentrations of pNP. One unit (U) of enzymatic activity is defined as the amount of β-glucosidase required to release 1 µmol of pNP per minute, under the conditions of the assay.
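The conversion from absorbance at 405 nm to enzyme units can be sketched as follows, assuming a linear pNP standard curve. All numeric values in the usage note are invented for illustration, and the function names are hypothetical. The cell-density normalization mentioned in the text is not included.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def activity_units(a405, slope, intercept, minutes=30.0):
    """Enzyme units (µmol pNP released per minute).

    `slope` and `intercept` come from a standard curve of A405 against
    µmol pNP; `minutes` is the incubation time (30 min in the assay).
    """
    umol_pnp = (a405 - intercept) / slope
    return umol_pnp / minutes


if __name__ == "__main__":
    # Invented standard curve: µmol pNP per well against A405.
    std_umol = [0.0, 0.01, 0.02, 0.04]
    std_a405 = [0.05, 0.25, 0.45, 0.85]
    slope, intercept = fit_line(std_umol, std_a405)
    print(activity_units(0.65, slope, intercept))
```

Dividing the result by the sample volume or protein amount would give volumetric or specific activity, if those quantities are available.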

Expression and enzymatic partial purification

Induction of transformed BL21 bacteria, both those expressing the AMBGL18 enzyme and those expressing its mutant (W404L), was performed using the autoinduction method described by Studier [54], which occurs through the regulation of the lac operon by metabolization of lactose in the cell. Selected colonies were cultured in 5 mL of MDAG/ampicillin 50 µg/mL medium and maintained on an orbital shaker at 200 rpm at 37 °C for 20 h. After this period, the cultures were transferred to 2 L Erlenmeyer flasks containing 500 mL of ZYM-5052/ampicillin 50 µg/mL medium and grown until reaching an OD600 of 1.0. Following this period, an incubation at 18 °C, 200 rpm for 72 h was conducted. At the end of the autoinduction, the cells were centrifuged at 5,000 × g for 20 min at 4 °C. The supernatant was discarded, and the bacterial pellet was resuspended in lysis buffer (NaCl 500 mM, NaH2PO4 20 mM, and Urea 8 M; pH 7.4) and incubated for 1 h at room temperature. Cellular debris was removed by centrifugation, and purification was performed using the HisPur Ni–NTA Purification Kit (Thermo Fisher Scientific) according to the manufacturer’s protocol. The purified β-glucosidases were eluted in buffer (NaCl 500 mM, Imidazole 250 mM, NaH2PO4 20 mM, and Urea 8 M; pH 7.4), concentrated using the Vivaspin 500 filter (MWCO 10,000, GE Healthcare), resuspended in buffer (NaH2PO4 20 mM, NaCl 500 mM, pH 7.4), and stored at 4 °C. The concentrated enzymes were analyzed by SDS–PAGE electrophoresis (4–12%) using Precision Plus Protein Dual Xtra Standards (BioRad) as molecular markers. The protein concentration was determined and standardized by spectrophotometry (Nanodrop 2000, Thermo Fisher Scientific) at 280 nm.

The activities of the partially purified enzymes AMBGL18 and W404L were determined through the rate of p-nitrophenol (pNP) release, following the method described by Deshpande et al. [55]. For the assays, 10 µL of the enzymes were added to 1.5 mL Eppendorf tubes with 80 µL of buffer (20 mM HEPES, 50 mM NaCl, pH 6.0) and 10 µL of 50 mM pNPGlc. The samples were incubated at 50 °C for 30 min, and the reaction was stopped by adding 100 µL of 1% Na2CO3. Absorbance was measured at a wavelength of 405 nm and compared with a standard curve prepared with various concentrations of pNP [55]. One unit of enzymatic activity (U) is defined as the amount of enzyme required to release 1 µmol of pNP per minute, under the assay conditions.

Comparison between cellobiose and pNPGlc as substrate for AMBGL18

To assess whether the cellobiose/pNPGlc and glucose/pNP pairs behave in a manner consistent with the experimental data, docking simulations were performed with AutoDock Vina 1.2.3 [27]. The objective was to compare the interactions between AMBGL18 and glucose/pNP as well as between AMBGL18 and cellobiose/pNPGlc. The receptor file of AMBGL18 was the generated model with the mutation at position 404. The ligands were obtained from the PubChem [56] database. The receptor was prepared using AutoDockTools [57], while the ligands were processed using Open Babel [58] and Avogadro [59] to generate the 3D format, add charges, and incorporate hydrogens.

For the docking simulations, a grid box was defined considering the entire protein structure (blind docking), with the box center set at x = 80.3, y = 30.5, and z = −261.49, and the size set to 60 Å in x, y, and z. The exhaustiveness was defined as 128 for all simulations. Based on the best docking pose, a file was prepared for each complex to submit to PLIP [60] to analyze the receptor–ligand interactions.

Statistical analyses

All assays were performed in both biological and technical triplicates. Data were tested for normality and homoscedasticity using the Shapiro–Wilk and Bartlett tests, respectively. Two-way analysis of variance (ANOVA) was used to assess the enzymatic activity of β-glucosidases in cultures, and Student's t test was used for the activity of the partially purified enzymes. A p value of < 0.05 was considered significant, and all statistical analyses were performed with R software [61].

Results and discussion

Bioinformatics analysis

With the increasing availability of gene sequences and protein sequences and structures deposited in online databases, it has become more accessible to predict specific mutations for a desired trait using bioinformatic tools. These advancements allow the generation of enzymatic variants with less time consumption and reduced material requirements [19]. In this work, an in silico procedure was carried out to suggest the in vitro implementation of a single-point mutation in a β-glucosidase from the GH1 family (AMBGL18), aiming to maintain its structural stability while enhancing its enzymatic function in the hydrolysis of cellobiose to glucose [62].

A total of 205 β-glucosidase structures were obtained from PDB, each accompanied by its corresponding FASTA file (obtained from NCBI Protein) representing the primary structure of the protein. By pairwise alignment of all these sequences, the β-glucosidases were categorized into groups whose members share 90% or more sequence identity. One group contained the protein with the highest identity to AMBGL18 (PDB ID 1OD0, around 49%) and was also the largest group, with 26 protein structures.

Molecular docking experiments were conducted to investigate the active site of β-glucosidases and to better understand the interactions between the enzyme, cellobiose, and glucose. For each of the 26 receptor structures in the selected group, protein–ligand interactions were evaluated, specifically between β-glucosidases and glucose as well as between β-glucosidases and cellobiose. All molecular docking simulations resulted in a negative FEB, indicating good interaction between the proteins and glucose and cellobiose molecules, as shown in Fig. 2. In total, 26 molecular docking experiments were performed with the “glucose” ligand, and 26 experiments with the “cellobiose” ligand. The script to automate the use of LigPlot+ and nAPOLI tools for all docking experiments generated output logs and graphs detailing the interactions between the target molecule and the ligand, exemplified in Fig. 3.

Fig. 2 Results of molecular docking experiments using glucose and cellobiose as ligands and 26 β-glucosidase structures as target molecules. FEB: free energy of binding

Fig. 3 Example of molecular interaction between the residues of β-glucosidases and the ligands. A Residues of the protein with PDB code: 2CBV and the atoms of the glucose ligand (Lig 447). B Residues of the protein with PDB code: 2WBG and the atoms of the cellobiose ligand (Lig 446). Each result of molecular interaction highlights the key amino acids that made contact with the atoms of the ligands. Hydrogen bonds are represented by dashed green lines and hydrophobic contacts are highlighted in red (Source: LigPlot)

The frequency of interactions between the residues of the target protein and the atoms of the ligands was determined using nAPOLI, which uses as input the results of the molecular docking experiments of β-glucosidases with their ligands (glucose and cellobiose). This analysis allowed the identification of which β-glucosidase residues interacted with only one or both ligands across the 26 structures of the selected group. The larger the difference between an amino acid's interaction frequency with glucose and its interaction frequency with cellobiose, the better a candidate for engineering it was considered to be. Table 1 presents key amino acids and their interaction frequencies with the ligands, as obtained from all 26 molecular docking simulations for each ligand and extracted using LigPlot+ and nAPOLI.

Table 1 Main amino acids (AA’s) in contact with glucose and cellobiose during the experiments

The main objective of the in silico study was to find a point mutation in AMBGL18 capable of reducing the inhibition caused by glucose, thus increasing the catalytic activity of the enzyme. As shown in Table 1, it is possible to verify that the amino acid tryptophan (Trp), located at position 398 (Trp398) in the protein sequences of the selected group, was identified as the primary candidate for mutation due to its contact with glucose in 65% of the docking simulations and 0% with cellobiose. Thus, the outcome of Stage 1 of the proposed methodology was the selection of Trp398 as the position for mutation.

To choose the amino acid to replace Trp398, the methodology of Stage 2 was applied, which involved selecting the optimal amino acid for mutation. For this purpose, it was important to evaluate the thermostability of the protein structure upon mutation using computational tools. However, most protein stability analysis tools require a three-dimensional structure as input, and since the β-glucosidase AMBGL18 did not have a structure available in the PDB, molecular modeling by homology was employed to obtain a structural model. The model was generated with Phyre2, and its quality was evaluated with PROCHECK, MolProbity, Verify3D, and ModFOLD.

The PROCHECK results (Table 2) indicate that 88.2% of the residues in the AMBGL18 model are located in the most favored regions of the Ramachandran plot, with an additional 8.8% in allowed regions. Only 1.0% of the residues are in disallowed regions, which is within acceptable limits for a reliable model. A high percentage of residues in favored and allowed regions suggests that the overall backbone conformation of the model is likely accurate. Verify3D shows that 90.79% of the residues have a 3D–1D profile score of 0.2 or higher. This score reflects how well the model’s amino acid sequence fits the generated 3D structure. A high percentage of residues meeting this threshold indicates that the local environments of most residues are consistent with known protein structures, further supporting the model’s accuracy. ModFOLD provided a p value of 8.446E-5 and a global model quality score of 0.8924. The low p value indicates that the model is statistically unlikely to have been generated by chance, implying a high level of confidence in its overall fold. The global quality score ranges from 0 to 1, and a score close to 1 indicates that the predicted structure exhibits characteristics typical of high-quality protein models, based on ModFOLD's training on known protein structures. This suggests the model is likely to be structurally sound and suitable for further analysis, even in the absence of an experimental structure.

Table 2 Evaluation of the structural model of AMBGL18, obtained with Phyre2, according to the PROCHECK tool—Ramachandran plot

Following the good results of the other tools, MolProbity analysis (Table 3) reveals that only 1.89% of the side chains are in unfavorable rotamer conformations, while 95.14% are in favorable conformations. In addition, the Ramachandran plot shows that 93.55% of the residues are in favored regions, with only 1.94% as outliers. These results indicate that the model has a well-optimized geometry, with minimal steric clashes and conformational errors. Analysis of the results from these four tools suggests that the AMBGL18 model is of high quality. The favorable backbone conformations, consistent residue environments, statistically significant global quality score, and optimized side-chain geometries all indicate that this model is a reliable representation of the protein's 3D structure. Consequently, this model is suitable for use as the input in the thermostability computational tools.

Table 3 Evaluation of the structural model of AMBGL18, obtained with Phyre2, according to the MolProbity tool

Because the sequences of the 26 proteins in the selected group differ from that of AMBGL18, it was necessary to identify the equivalent position of the amino acid to be mutated in AMBGL18. The residue in AMBGL18 equivalent to Trp398 was located by aligning one of the group's structures (2WBG) with the structural model of AMBGL18. The alignment revealed that the tryptophan located at position 404 (Trp404) in AMBGL18 corresponds to Trp398 in the structures of the protein group selected in the previous step.
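The idea of transferring a residue number from a reference protein to a target through an alignment can be illustrated with a minimal sketch. It assumes the alignment is available as two gapped strings of equal length. The function name `map_position` and the toy sequences are hypothetical and are not the actual 2WBG or AMBGL18 data, and the alignment in this work was a structural one rather than this simplified sequence view.

```python
from typing import Optional


def map_position(aligned_ref: str, aligned_target: str, ref_pos: int) -> Optional[int]:
    """Map a 1-based residue number in the reference onto the target.

    Both strings are rows of the same pairwise alignment, with '-' for gaps.
    Returns None if the reference residue is aligned to a gap in the target.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    target_count = 0
    for ref_char, target_char in zip(aligned_ref, aligned_target):
        if ref_char != "-":
            ref_count += 1
        if target_char != "-":
            target_count += 1
        # Stop at the alignment column that holds the requested reference residue.
        if ref_char != "-" and ref_count == ref_pos:
            return target_count if target_char != "-" else None
    raise ValueError("ref_pos is beyond the end of the reference sequence")


# Toy alignment (hypothetical sequences, NOT the real 2WBG/AMBGL18 data).
toy_ref = "MK--TWAEL"
toy_target = "MKGSTWAEL"
print(map_position(toy_ref, toy_target, 4))  # position of the toy W in the target
```

Because insertions and deletions shift the numbering, the same tryptophan can carry different residue numbers in two homologous proteins, which is why Trp398 in the reference group corresponds to Trp404 in AMBGL18.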

With the structural model of AMBGL18 and the exact position of the amino acid to be substituted (Trp404) (Figs. 4A, B and 5), computational thermostability analyses were conducted using various tools, where all possible amino acid substitutions were tested. The results are presented in Tables 4 and 5.
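As a small illustration of what testing all possible substitutions at one position means, the sketch below enumerates the 19 single-residue replacements of tryptophan at position 404 in the usual one-letter notation. The helper name `point_mutations` is hypothetical, and the input format each prediction server actually expects is not reproduced here.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def point_mutations(wild_type: str, position: int) -> List[str]:
    """List every single substitution at one position, e.g. 'W404L'."""
    if len(wild_type) != 1 or wild_type not in AMINO_ACIDS:
        raise ValueError("wild_type must be a standard one-letter residue code")
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


candidates = point_mutations("W", 404)
print(len(candidates))
print(candidates[:3])
```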

Fig. 4

A Protein sequence of β-glucosidase AMBGL18 indicating, in a red circle, the tryptophan (W) located at position 404 (Source: Protein Data Bank). B Model of the AMBGL18 structure highlighting in green the tryptophan located at position 404 (W404) (Source: Phyre 2)

Fig. 5

Structure of AMBGL18, highlighting in green the location of tryptophan at position 404. A Full view of the enzyme structure. B Closer view emphasizing the spatial position of tryptophan 404 within the enzyme. C Detailed close-up showing the positioning of tryptophan 404 inside the enzyme's structural cavity

Table 4 Result of the thermostability analysis for tools that have as input the AMBGL18 sequence and, as output, only the information whether the mutation is stabilizing (S) or destabilizing (D)
Table 5 Result of the thermostability analysis for the tools that take as input the modeled structure of AMBGL18 or its sequence, and output the predicted ΔΔG value in kcal/mol

Considering the results from the thermostability analysis tools, the following criteria for selecting the substitute amino acid were established: (i) it must be hydrophobic, to preserve the active site's hydrophobic environment and minimize disruptions to enzymatic activity, (ii) it should be classified as “stabilizing” by the majority of point mutation prediction tools that use sequence data as input (Table 4), and (iii) it must have a ΔΔG greater than −0.8 kcal/mol in most tools that assess stability using structural data (average values were not used to select candidate amino acids for mutation) (Table 5). Following these criteria, two amino acids were identified as potential substitutes for Trp404: methionine and leucine. The values that guided this decision are in bold in Tables 4 and 5. Since leucine presented higher ΔΔG values than methionine (underlined and in bold in the tables), it was chosen as the optimal amino acid for the synthesis of the genetic construct.
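The three selection criteria can be expressed as a simple filter. The sketch below is only illustrative: the hydrophobic residue set is one common convention chosen here as an assumption, and the prediction values are made-up placeholders, not the values reported in Tables 4 and 5.

```python
from typing import Dict, List

# Assumption: one common definition of hydrophobic side chains (W itself excluded).
HYDROPHOBIC = set("AVLIMFP")
DDG_CUTOFF = -0.8  # kcal/mol, threshold from criterion (iii)


def majority(flags: List[bool]) -> bool:
    """True when more than half of the flags are True."""
    return sum(flags) > len(flags) / 2


def passes_criteria(residue: str, calls: List[str], ddgs: List[float]) -> bool:
    """Apply criteria (i)-(iii) to one candidate substitution."""
    is_hydrophobic = residue in HYDROPHOBIC                       # (i)
    mostly_stabilizing = majority([c == "S" for c in calls])      # (ii)
    mostly_above_cutoff = majority([d > DDG_CUTOFF for d in ddgs])  # (iii)
    return is_hydrophobic and mostly_stabilizing and mostly_above_cutoff


# Made-up predictions for illustration only (NOT the data in Tables 4 and 5).
toy_calls: Dict[str, List[str]] = {
    "L": ["S", "S", "D"],
    "M": ["S", "S", "S"],
    "D": ["D", "D", "S"],
    "A": ["S", "S", "D"],
}
toy_ddgs: Dict[str, List[float]] = {
    "L": [-0.3, -0.5, -1.1],
    "M": [-0.6, -0.7, -0.9],
    "D": [-2.1, -1.8, -0.5],
    "A": [-1.5, -1.2, -0.4],
}

selected = [aa for aa in toy_calls if passes_criteria(aa, toy_calls[aa], toy_ddgs[aa])]
print(selected)
```

Using a per-tool majority rather than an average mirrors the statement that average ΔΔG values were not used to select candidates.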

In vivo enzymatic assay of AMBGL18 and W404L

The bacteria transformed with the genetic constructs containing the original β-glucosidase (AMBGL18) and its mutant (W404L) were selected on agar plates containing the reagent esculin. In the presence of β-glucosidase, esculin is hydrolyzed to esculetin, which reacts with iron to form a dark brown/black complex in the agar. The hydrolysis of esculin and its detection through binding with iron was first described in 1909 and is still the most widely used screening method for selecting β-glucosidase-producing microorganisms [63, 64]. As observed in Fig. 6A, the substitution of tryptophan at position 404 with leucine had a visually positive effect on catalytic efficiency compared with the wild-type enzyme, even in the presence of glucose, the main inhibitor of β-glucosidases. This suggests that the point mutation generated a variant that retains more activity under glucose feedback inhibition.

Fig. 6

A Bacteria cultured on agar + esculin medium, with 0.5% glucose (1) and without glucose (2). B β-Glucosidase activity in non-transgenic (BL21) and transgenic (AMBGL18 and W404L) bacterial cultures using pNPGlc as a substrate. Different letters indicate significant differences (p < 0.05)

Measurement of the pNP concentration in bacterial cultures revealed a significant difference in activity between AMBGL18 and W404L, with the mutant showing 3.33-fold higher activity than the original enzyme (Fig. 6B). When 500 mM glucose was added to the cultures, the enzymatic activity of W404L decreased 5.5-fold, but remained 3.2 times higher than the activity of AMBGL18. A similar result was reported by Cao et al. [65], who produced glucose-tolerant β-glucosidase variants through error-prone PCR (Ep-PCR) that showed a decline in activity starting from a glucose concentration of 500 mM.

The degree of inhibition by glucose was calculated by normalizing the enzymatic activity in the presence of 500 mM glucose to the activity without glucose. No statistically significant difference was observed between the mutant and wild-type enzymes, with inhibition rates of approximately 81% for both. Although the degree of inhibition is similar, W404L presents a residual activity three times higher than AMBGL18, indicating that it remains functional even at high glucose concentrations. Furthermore, its activity in the presence of 500 mM glucose is equivalent to that of the wild-type enzyme without glucose, reinforcing its resistance to the inhibitor. This suggests that, despite the similar inhibition between the variants, W404L maintains a significantly greater residual activity after inhibition.
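The relationship between the reported fold changes and the inhibition rate of approximately 81% can be checked with simple arithmetic. The sketch below works in relative units, with the activity of AMBGL18 without glucose set to 1.0 as an assumption for illustration. It uses only the fold changes quoted in the text, not the raw activity measurements, and the variable names are hypothetical.

```python
# Relative activities, normalised so that AMBGL18 without glucose = 1.0 (assumption).
WT_NO_GLUCOSE = 1.0
MUTANT_NO_GLUCOSE = 3.33 * WT_NO_GLUCOSE   # W404L has 3.33-fold higher activity
MUTANT_GLUCOSE = MUTANT_NO_GLUCOSE / 5.5   # 5.5-fold decrease at 500 mM glucose
WT_GLUCOSE = MUTANT_GLUCOSE / 3.2          # W404L stays 3.2 times above AMBGL18


def inhibition_percent(activity_without: float, activity_with: float) -> float:
    """Degree of inhibition: activity lost, as a percentage of the uninhibited value."""
    return 100.0 * (1.0 - activity_with / activity_without)


mutant_inhibition = inhibition_percent(MUTANT_NO_GLUCOSE, MUTANT_GLUCOSE)
wild_type_inhibition = inhibition_percent(WT_NO_GLUCOSE, WT_GLUCOSE)
print(round(mutant_inhibition, 1))     # about 81.8
print(round(wild_type_inhibition, 1))  # about 81.1
```

A 5.5-fold decrease leaves 1/5.5 (about 18%) of the initial activity, which corresponds to roughly 82% inhibition, and the back-calculated wild-type value is close to 81%. Both figures agree with the similar inhibition rates reported for the two enzymes.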

Since the W404L variant exhibits higher catalytic efficiency, if a small amount of the enzyme is inhibited, the decline in activity will be more pronounced than for AMBGL18, which naturally has lower activity. β-Glucosidases reported in the literature are generally inhibited at glucose concentrations around 100 to 200 mM [12]. Based on the mutation suggested by bioinformatics analyses, a variant of β-glucosidase that retained catalytic activity at a glucose concentration of 500 mM was produced, which is consistent with the mutation having changed the structure of the active site of AMBGL18.

Partial purification of AMBGL18/W404L and docking simulations

AMBGL18 and W404L were purified through nickel (Ni2+) affinity chromatography, with the addition of a histidine tag (6HisTag), a sequence of six histidine residues, to the N-terminal portion of the enzymes. This tail acts as an electron donor, allowing the protein to interact with the column pre-loaded with nickel (Ni2+). This strategy is widely used for the purification of recombinant proteins because its high affinity and selectivity ensure a higher degree of purity of the purified protein [66]. As Escherichia coli BL21(DE3) exhibited low expression of both β-glucosidases, only partial purification of the proteins was carried out. The crude protein extract obtained after cell lysis contained a low amount of the recombinant enzymes and, when added to the column, allowed nonspecific proteins to bind, resulting in a lower degree of purity.

SDS–PAGE revealed that the molecular mass of the recombinant proteins is consistent with the predicted size (53 kDa) (Fig. 7A). The difference in activity between AMBGL18 and W404L is even more pronounced than that found in the bacterial culture assays, with W404L showing an activity 26 times greater than AMBGL18 (Fig. 7B). This is likely because, although the enzyme is not fully purified, fewer interfering substances are present in the reaction than in bacterial cultures, which increases the binding affinity between the enzyme and its substrate (pNPGlc).

Fig. 7

A SDS–PAGE analysis of partially purified AMBGL18 and its mutant W404L. B Activity of partially purified AMBGL18 and W404L using pNPGlc as a substrate. Different letters represent significant differences (p < 0.05)

Although in silico analyses were performed using glucose and cellobiose as ligands, laboratory experiments were conducted with the chromogenic substrate p-nitrophenyl β-D-glucopyranoside (pNPGlc). This substrate is widely used in the characterization of β-glucosidases due to its easy detection, sensitivity, and relatively low cost. To evaluate whether pNPGlc can be used as a substitute for cellobiose in experimental tests, a molecular docking study was performed with AMBGL18 and two pairs of ligands (glucose versus pNP, and cellobiose versus pNPGlc).

The molecular docking data for glucose and pNP revealed that the best free energy of binding (FEB) was −5.984 kcal/mol and −5.665 kcal/mol, respectively, indicating similar estimated binding free energies. Figure 8A shows the final position of the molecules in the binding site, demonstrating that both interact with similar regions of the enzyme. This similarity is corroborated by the protein–ligand interaction profiler (PLIP) analyses, which identified hydrogen bonds, hydrophobic interactions, and π-stacking. Glucose interacts with GLU 164, ASN 220, LEU 221, TYR 299, and SER 300, while pNP interacts with GLU 164, ASN 220, TYR 299, SER 300, and TRP 330, so the two ligands share four residues.

Fig. 8

Structure of AMBGL18 highlighting the tryptophan at position 404. A Residue W404 of AMBGL18 is colored in magenta, while glucose and pNP are colored in orange and yellow, respectively. B Residue W404 of AMBGL18 is colored in magenta, while cellobiose and pNPGlc are colored in orange and yellow, respectively

In the docking simulations involving cellobiose and pNPGlc, the FEB values were −6.886 kcal/mol and −8.303 kcal/mol, respectively. Figure 8B illustrates the final position of these ligands, showing that both molecules occupy similar regions in the AMBGL18 structure. This is confirmed by the PLIP analyses, which indicate that cellobiose interacts with residues GLU 164, ASN 220, SER 300 and GLU 414, while pNPGlc interacts with ASN 220, TYR 299, TRP 330, GLU 411, TRP 412 and GLU 414. These molecules share two residues (ASN 220 and GLU 414) and also contact a pair of adjacent residues (SER 300 and TYR 299), reinforcing the similarity between their binding modes. These results confirm that pNPGlc can be used as a reliable substitute for cellobiose in experimental tests, providing a practical and easily detectable alternative for enzymatic assays in the laboratory. To our knowledge, this is the first computational study performed to validate pNPGlc as a substrate analog to cellobiose.
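The shared residues reported for each ligand pair can be reproduced as a set intersection over the PLIP contact lists given in the text. This is a minimal bookkeeping sketch and does not run PLIP or any docking. The dictionary layout and the function name `shared_residues` are hypothetical.

```python
from typing import Dict, List, Set

# Interacting residues as listed in the text for each docked ligand.
contacts: Dict[str, Set[str]] = {
    "glucose": {"GLU164", "ASN220", "LEU221", "TYR299", "SER300"},
    "pNP": {"GLU164", "ASN220", "TYR299", "SER300", "TRP330"},
    "cellobiose": {"GLU164", "ASN220", "SER300", "GLU414"},
    "pNPGlc": {"ASN220", "TYR299", "TRP330", "GLU411", "TRP412", "GLU414"},
}


def shared_residues(ligand_a: str, ligand_b: str) -> List[str]:
    """Residues contacted by both ligands, sorted for stable output."""
    return sorted(contacts[ligand_a] & contacts[ligand_b])


print(shared_residues("glucose", "pNP"))
print(shared_residues("cellobiose", "pNPGlc"))
```

The first pair shares four residues and the second pair shares two, matching the counts stated in the text.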

Glucose tolerance in β-glucosidases is essential for industrial applications, such as biofuel production. Many glucose-sensitive β-glucosidases have amino acids in the active site with high affinity for glucose, which facilitates competitive inhibition [67]. Non-competitive inhibition, on the other hand, involves glucose binding to an allosteric site, altering the enzyme's conformation and reducing its catalytic efficiency [68]. Some β-glucosidases have cavities near the active site that allow glucose binding without fully blocking substrate hydrolysis [67], while others possess allosteric regions that promote enzymatic activity even in the presence of glucose [69]. β-Glucosidases from the GH1 family typically have a deep and narrow channel in which catalysis occurs. Some studies suggest that the structure of the active site entrance is directly correlated with glucose tolerance [14, 70]. The active site has several subsites that are numbered according to the distance from the cleavage point towards the reducing (positive numbers) and non-reducing (negative numbers) ends of the substrate. The +1 subsite of β-glucosidases is known to have a high affinity for glucose, and the +2 subsite restricts access to the +1 subsite, limiting glucose access to this subsite [14]. One hypothesis for the improved catalytic activity of the mutant enzyme (W404L) is that replacing tryptophan with leucine in the terminal region of one of the β-strands and at the entrance of the active site promoted a reconfiguration of the shape and electrostatic properties of the +1 subsite, reducing its affinity for glucose. Another hypothesis is that the mutation modified the structure of the +2 subsite, further blocking glucose access to the +1 subsite (Fig. 5). Our results confirmed that the mutation suggested by in silico analyses had an effect on the catalytic activity of AMBGL18, which could lead to a new pathway for enzyme engineering.

Conclusions

In this study, a variant of the AMBGL18 β-glucosidase, named β-glucosidase W404L, was developed, which demonstrated increased catalytic activity and improved performance under glucose retroinhibition compared with the wild-type enzyme. The use of advanced computational modeling tools along with genetic engineering techniques can play a crucial role in enzyme engineering for enhancing not only existing β-glucosidases but also other enzymes, representing a promising alternative to solve one of the bottlenecks in the production of industrial enzymes, including those used in second-generation ethanol production. The search for new organisms that produce β-glucosidases with high industrial yield has been a challenge, and the use of heterologously expressed “improved” β-glucosidases could solve the problem of feedback inhibition of endo- and exoglucanases. This could allow for the continuous production of glucose, making 2G ethanol an economically viable and sustainable energy source compared to currently available energy sources. Moreover, the engineered enzyme generated for 2G ethanol production could also represent a good input for the food and beverage industry, as well as other industrial segments.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

References

  1. Singh S. Energy crisis and climate change: global concerns and their solutions. Energ Cris Chall Solut. 2021. https://doiorg.publicaciones.saludcastillayleon.es/10.1002/9781119741503.ch1.

  2. IEA. World energy balances: overview. Paris. https://www.iea.org/data-and-statistics/data-product/world-energy-balances. Accessed 22 Oct 2024.

  3. Kang Y, Yang Q, Bartocci P, Wei H, Liu SS, Wu Z, et al. Bioenergy in China: evaluation of domestic biomass resources and the associated greenhouse gas mitigation potentials. Renew Sustain Energ Rev. 2020;127: 109842. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.rser.2020.109842.

  4. Jayakumar M, Gindaba GT, Gebeyehu KB, Periyasamy S, Jabesa A, Baskar G, et al. Bioethanol production from agricultural residues as lignocellulosic biomass feedstock’s waste valorization approach: a comprehensive review. Sci Total Environ. 2023;879: 163158. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.scitotenv.2023.163158.

  5. Zhao L, Ou X, Chang S. Life-cycle greenhouse gas emission and energy use of bioethanol produced from corn stover in China: current perspectives and future prospectives. Energy. 2016;115:303–13. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.energy.2016.08.046.

  6. Margeot A, Hahn-Hagerdal B, Edlund M, Slade R, Monot F. New improvements for lignocellulosic ethanol. Curr Opin Biotechnol. 2009;20:372–80. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.copbio.2009.05.009.

  7. Escuder-Rodríguez JJ, DeCastro ME, Cerdán ME, Rodríguez-Belmonte E, Becerra M, González-Siso MI. Cellulases from thermophiles found by metagenomics. Microorganisms. 2018;6:66. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/microorganisms6030066.

  8. Rani V, Mohanram S, Tiwari R, Nain L, Arora A. Beta-glucosidase: key enzyme in determining efficiency of cellulase and biomass hydrolysis. J Bioprocess Biotech. 2015;5:1. https://doiorg.publicaciones.saludcastillayleon.es/10.4172/2155-9821.1000197.

  9. Kannan P, Shafreen MM, Achudhan AB, Gupta A, Saleena LM. A review on applications of β-glucosidase in food, brewery, pharmaceutical and cosmetic industries. Carbohyd Res. 2023;530: 108855. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.carres.2023.108855.

  10. Ouyang B, Wang G, Zhang N, Zuo J, Huang Y, Zhao X. Recent advances in β-glucosidase sequence and structure engineering: a brief review. Molecules. 2023;28:4990. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/molecules28134990.

  11. Thongpoo P, Srisomsap C, Chokchaichamnankit D, Kitpreechavanich V, Svasti J, Kongsaeree PT. Purification and characterization of three β-glycosidases exhibiting high glucose tolerance from Aspergillus niger ASKU28. Biosci Biotechnol Biochem. 2014;78:1167–76. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/09168451.2014.915727.

  12. Cairns JRK, Esen A. β-glucosidases. Cell Mol Life Sci. 2010;67:3389–405. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s00018-010-0399-2.

  13. Frutuoso MA, Marana SR. A single amino acid residue determines the ratio of hydrolysis to transglycosylation catalyzed by β-glucosidases. Protein Pept Lett. 2013;20:102–6.

  14. De Giuseppe PO, Souza TDACB, Souza FHM, Zanphorlin LM, Machado CB, Ward RJ, et al. Structural basis for glucose tolerance in GH1 β-glucosidases. Acta Crystallogr D Biol Crystallogr. 2014;70:1631–9. https://doiorg.publicaciones.saludcastillayleon.es/10.1107/S1399004714006920.

  15. Yang Y, Zhang X, Yin Q, Fang W, Fang Z, Wang X, et al. A mechanism of glucose tolerance and stimulation of GH1 β-glucosidases. Sci Rep. 2015;5:17296. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/srep17296.

  16. Monteiro LMO, Vici AC, Pinheiro MP, Heinen PR, De Oliveira AHC, Ward RJ, et al. A highly glucose tolerant ß-glucosidase from Malbranchea pulchella (MpBg3) enables cellulose Saccharification. Sci Rep. 2020;10:6998. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41598-020-63972-y.

  17. Zhu Q, Huang Y, Yang Z, Wu X, Zhu Q, Zheng H, et al. A recombinant thermophilic and glucose-tolerant GH1 β-glucosidase derived from Hehua hot spring. Molecules. 2024;29:1017. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/molecules29051017.

  18. McCullum EO, Williams BAR, Zhang J, Chaput JC. Random mutagenesis by error-prone PCR. In: Braman J, editor. In vitro mutagenesis protocols. 3rd ed. Totowa: Humana Press; 2010. p. 103–9.

  19. Kim JY, Yoo HW, Lee PG, Lee SG, Seo JH, Kim BG. In vivo protein evolution, next generation protein engineering strategy: from random approach to target-specific approach. Biotechnol Bioproc Eng. 2019;24:85–94. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s12257-018-0394-2.

  20. Mariano DCB, Santos LH, Machado KDS, Werhli AV, De Lima LHF, De Melo-Minardi RC. A computational method to propose mutations in enzymes based on structural signature variation (SSV). IJMS. 2019;20:333. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/ijms20020333.

  21. Bergmann JC, Costa OYA, Gladden JM, Singer S, Heins R, Dhaeseleer P, et al. Discovery of two novel β-glucosidases from an Amazon soil metagenomic library. FEMS Microbiol Lett. 2014;351:147–55. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/1574-6968.12332.

  22. Mariano D, Pantuza N, Santos LH, Rocha REO, De Lima LHF, Bleicher L, et al. Glutantβase: a database for improving the rational design of glucose-tolerant β-glucosidases. BMC Mol Cell Biol. 2020;21:50. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12860-020-00293-y.

  23. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The protein data bank. Nucl Acid Res. 2000;28:235–42. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/28.1.235.

  24. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; 2024. https://www.ncbi.nlm.nih.gov/protein/. Accessed 4 Sep 2024.

  25. Cock PJA, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009;25:1422–3. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/bioinformatics/btp163.

  26. Irwin JJ, Sterling T, Mysinger MM, Bolstad ES, Coleman RG. ZINC: a free tool to discover chemistry for biology. J Chem Inf Model. 2012;52:1757–68. https://doiorg.publicaciones.saludcastillayleon.es/10.1021/ci3001277.

  27. Trott O, Olson AJ. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem. 2010;31:455–61. https://doiorg.publicaciones.saludcastillayleon.es/10.1002/jcc.21334.

  28. Seus VR, Silva L, Gomes J, Da Silva PEA, Werhli AV, Machado KS. A framework for virtual screening. Proceedings of the 31st annual ACM symposium on applied computing. 2016. p. 31–36. https://doiorg.publicaciones.saludcastillayleon.es/10.1145/2851613.2851618.

  29. Wallace AC, Laskowski RA, Thornton JM. LIGPLOT: a program to generate schematic diagrams of protein-ligand interactions. Protein Eng Des Sel. 1995;8:127–34. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/protein/8.2.127.

  30. Fassio AV, Santos LH, Silveira SA, Ferreira RS, de Melo-Minardi RC. nAPOLI: a graph-based strategy to detect and visualize conserved protein-ligand interactions in large-scale. IEEE/ACM Trans Comput Biol Bioinform. 2020;17:1317–28. https://doiorg.publicaciones.saludcastillayleon.es/10.1109/TCBB.2019.2892099.

  31. Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJE. The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc. 2015;10:845–58. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/nprot.2015.053.

  32. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr. 1993;26:283–91. https://doiorg.publicaciones.saludcastillayleon.es/10.1107/S0021889892009944.

  33. Eisenberg D, Lüthy R, Bowie JU. VERIFY3D: assessment of protein models with three-dimensional profiles. Method Enzymol. 1997;277:396–404. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/s0076-6879(97)77022-8.

  34. McGuffin LJ, Aldowsari FMF, Alharbi SMA, Adiyaman R. ModFOLD8: accurate global and local quality estimates for 3D protein models. Nucl Acid Res. 2021;49:425–30. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/gkab321.

  35. Williams CJ, Headd JJ, Moriarty NW, Prisant MG, Videau LL, Deis LN, et al. MolProbity: more and better reference data for improved all-atom structure validation. Protein Sci. 2018;27:293–315. https://doiorg.publicaciones.saludcastillayleon.es/10.1002/pro.3330.

  36. Pei J, Kim BH, Grishin NV. PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucl Acid Res. 2008;36:2295–300. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/gkn072.

  37. Cheng J, Randall A, Baldi P. Prediction of protein stability changes for single-site mutations using support vector machines. Proteins. 2006;62:1125–32. https://doiorg.publicaciones.saludcastillayleon.es/10.1002/prot.20810.

  38. Chen CW, Lin J, Chu YW. iStable: off-the-shelf predictor integration for predicting protein stability changes. BMC Bioinform. 2013;14:S5. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1471-2105-14-S2-S5.

  39. Teng S, Srivastava AK, Wang L. Sequence feature-based prediction of protein stability changes upon amino acid substitutions. BMC Genom. 2010;11:S5. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1471-2164-11-S2-S5.

  40. Capriotti E, Fariselli P, Rossi I, Casadio R. A three-state prediction of single point mutations on protein stability changes. BMC Bioinform. 2008;9:S6. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1471-2105-9-S2-S6.

  41. Worth CL, Preissner R, Blundell TL. SDM—a server for predicting effects of mutations on protein stability and malfunction. Nucl Acid Res. 2011;39:W215–22. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/gkr363.

  42. Pires DEV, Ascher DB, Blundell TL. mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics. 2014;30:335–42. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/bioinformatics/btt691.

  43. Pires DEV, Ascher DB, Blundell TL. DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucl Acid Res. 2014;42:W314–9. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/gku411.

  44. Laimer J, Hofer H, Fritz M, Wegenkittl S, Lackner P. MAESTRO—multi agent stability prediction upon point mutations. BMC Bioinform. 2015;16:116. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12859-015-0548-6.

  45. Panigrahi P, Sule M, Ghanate A, Ramasamy S, Suresh CG. Engineering proteins for thermostability with iRDP web server. PLoS ONE. 2015;10: e0139486. https://doiorg.publicaciones.saludcastillayleon.es/10.1371/journal.pone.0139486.

  46. Savojardo C, Fariselli P, Martelli PL, Casadio R. INPS-MD: a web server to predict stability of protein variants from sequence and structure. Bioinformatics. 2016;32:2542–4. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/bioinformatics/btw192.

  47. Giollo M, Martin AJ, Walsh I, Ferrari C, Tosatto SC. NeEMO: a method using residue interaction networks to improve prediction of protein stability upon mutation. BMC Genom. 2014;15:S7. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1471-2164-15-S4-S7.

  48. Li G, Panday SK, Alexov E. SAAFEC-SEQ: a sequence-based method for predicting the effect of single point mutations on protein thermodynamic stability. IJMS. 2021;22:606. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/ijms22020606.

  49. Folkman L, Stantic B, Sattar A, Zhou Y. EASE-MM: sequence-based prediction of mutation-induced stability changes with feature-based multiple models. J Mol Biol. 2016;428:1394–405. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.jmb.2016.01.012.

  50. Sambrook J, Fritsch EF, Maniatis T. Molecular cloning: a laboratory manual. 2nd ed. New York: Cold spring harbor laboratory press; 1989.

  51. Fia G, Giovani G, Rosi I. Study of beta-glucosidase production by wine-related yeasts during alcoholic fermentation. A new rapid fluorimetric method to determine enzymatic activity. J Appl Microbiol. 2005;99:509–17. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/j.1365-2672.2005.02657.x.

  52. Rosi I, Vinella M, Domizio P. Characterization of β-glucosidase activity in yeasts of oenological origin. J Appl Bacteriol. 1994;77:519–27. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/j.1365-2672.1994.tb04396.x.

  53. Mattéotti C, Thonart P, Francis F, Haubruge E, Destain J, Brasseur C, et al. New glucosidase activities identified by functional screening of a genomic DNA library from the gut microbiota of the termite Reticulitermes santonensis. Microbiol Res. 2011;166:629–42. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.micres.2011.01.001.

  54. Studier FW. Protein production by auto-induction in high-density shaking cultures. Protein Expr Purif. 2005;41:207–34. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.pep.2005.01.016.

  55. Deshpande MV, Eriksson KE, Pettersson LG. An assay for selective determination of exo-1,4,-beta-glucanases in a mixture of cellulolytic enzymes. Anal Biochem. 1984;138:481–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/0003-2697(84)90843-1.

  56. Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, Li Q, Shoemaker BA, Thiessen PA, Yu B, Zaslavsky L, Zhang J, Bolton EE. PubChem 2025 update. Nucl Acid Res. 2025;53(D1):D1516–25. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/gkae1059.

  57. Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, Olson AJ. Autodock4 and AutoDockTools4: automated docking with selective receptor flexibility. J Comput Chem. 2009;16:2785–91. https://doiorg.publicaciones.saludcastillayleon.es/10.1002/jcc.21256.

  58. O’Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. Open babel: an open chemical toolbox. J Cheminform. 2011;3:1–14. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1758-2946-3-33.

  59. Hanwell MD, Curtis DE, Lonie DC, Vandermeersch T, Zurek E, Hutchison GR. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. J Cheminform. 2012;4:1–17. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1758-2946-4-17.

  60. Salentin S, Schreiber S, Haupt VJ, Adasme MF, Schroeder M. PLIP: fully automated protein–ligand interaction profiler. Nucl Acid Res. 2015;43(W1):W443–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/nar/gkv315.

  61. R Core Team. R: a language and environment for statistical computing. R Found Stat Comput. Vienna, Austria. 2019. https://www.r-project.org/.

  62. Gottschalk LMF, Oliveira RA, Bon EPDS. Cellulases, xylanases, β-glucosidase and ferulic acid esterase produced by Trichoderma and Aspergillus act synergistically in the hydrolysis of sugarcane bagasse. Biochem Eng J. 2010;51:72–8. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.bej.2010.05.003.

  63. Harrison FC, Van Der Leck J. Aesculin bile salt media for water analysis. Am J Public Hyg. 1909;19:557–63.

  64. Li Y, Liu N, Yang H, Zhao F, Yu Y, Tian Y, et al. Cloning and characterization of a new β-glucosidase from a metagenomic library of Rumen of cattle feeding with Miscanthus sinensis. BMC Biotechnol. 2014;14:85. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1472-6750-14-85.

  65. Cao L, Wang Z, Ren G, Kong W, Li L, Xie W, et al. Engineering a novel glucose-tolerant β-glucosidase as supplementation to enhance the hydrolysis of sugarcane bagasse at high glucose concentration. Biotechnol Biofuels. 2015;8:202. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13068-015-0383-z.

  66. Schmitt J, Hess H, Stunnenberg HG. Affinity purification of histidine-tagged proteins. Mol Biol Rep. 1993;18:223–30. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/BF01674434.

  67. Sengupta S, Datta M, Datta S. Chapter 5—β-glucosidase: structure, function and industrial applications. In: Sengupta S, Datta M, Datta S, editors. Glycoside hydrolases. Cambridge: Academic Press; 2023. p. 97–120.

  68. Bhatia Y, Mishra S, Bisaria VS. Microbial β-glucosidases: cloning, properties, and applications. Crit Rev Biotechnol. 2002;22:375–407. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/07388550290789568.

  69. Souza FHM, Nascimento CV, Rosa JC, Masui DC, Leone FA, Jorge JA, et al. Purification and biochemical characterization of a mycelial glucose- and xylose-stimulated β-glucosidase from the thermophilic fungus Humicola insolens. Process Biochem. 2010;45:272–8. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.procbio.2009.09.018.

  70. Santos CA, Morais MAB, Terrett OM, Lyczakowski JJ, Zanphorlin LM, Ferreira-Filho JA, et al. An engineered GH1 β-glucosidase displays enhanced glucose tolerance and increased sugar release from lignocellulosic materials. Sci Rep. 2019;9:4903. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41598-019-41300-3.

Acknowledgements

The authors are grateful to all members of the LEGENE—Research Group in Genetic Engineering and Biotechnology (Institute of Biological Sciences, Federal University of Rio Grande—FURG, Brazil) who helped during the experiments. We also thank Embrapa-Agroenergy for providing the AMBGL18 coding sequence and the infrastructure to perform part of the experiments.

Funding

This research was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq, Proc. 440336/2022-8) and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES, Finance Code 001). L. F. Marins and B. F. Quirino are research fellows of CNPq (Proc. 307304/2022-1 and Proc. 305773/2023-2, respectively).

Author information

Contributions

Conceptualization: RA, AW, KM and LM; In silico analyses: VS, AC, AW and KM; Genetic construct: RA and LM; Purification and enzymatic activity: RA, HS, BQ, LC and LM; Results analysis: RA, HS, VS, AC, AW, KM, BQ, LC and LM; Writing—Original Draft: RA and LM; Writing—Review & Editing: RA, HS, AW, KM, BQ, LC and LM; Funding acquisition: LM. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Raíza dos Santos Azevedo.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Azevedo, R.S., Santana, H., Seus, V.R. et al. Development of a β-glucosidase improved for glucose retroinhibition for cellulosic ethanol production: an integrated bioinformatics and genetic engineering approach. Biotechnol. Biofuels Bioprod. 18, 44 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13068-025-02643-4

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13068-025-02643-4

Keywords