Sample Collection
A total of 60 fecal samples were collected in this study, 30 from DHB pigs and 30 from Duroc pigs. The pigs were selected from a commercial pig farm in Shaoguan, Guangdong, China. The fecal samples were rapidly frozen in liquid nitrogen and then transferred to a -80°C freezer for long-term storage.
Isolation, Cultivation, and Identification of Bacteria
Fecal samples from DHB were cultured under both anaerobic and aerobic conditions. First, the fecal samples were thoroughly mixed with PBS to prepare a uniform bacterial suspension. The supernatant of the suspension was then serially diluted to 10⁻³, 10⁻⁴, and 10⁻⁵. Next, 100 µL of each dilution was spread evenly onto 15 different types of culture media (Supplementary Table S1).
For anaerobic conditions, the inoculated culture media were placed in an anaerobic chamber with a gas mixture of 85% N₂, 5% CO₂, and 10% H₂, and incubated at 37°C. After culturing for 24 to 48 hours, individual colonies that appeared on the agar plates were selected. These single colonies were then inoculated onto corresponding agar plates for further cultivation until the colonies fully developed. Finally, the mature colonies were identified using Matrix-assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) and full-length 16S rRNA gene Sanger sequencing.
The purified strains were inoculated into the corresponding liquid culture medium and cultured at 37°C. After 12–24 hours, the cultures were removed and centrifuged at 4°C and 12,000 r/min for 5 minutes. The supernatant was discarded, and the bacterial pellet was retained for high-quality DNA extraction. Whole-genome sequencing was performed by Novogene Bioinformatics Technology Co., Ltd. (Beijing, China).
Library Construction, Quality Control and Sequencing
A total of 0.2 µg of DNA per sample was used as input material for DNA library preparation. The sequencing library was generated using the NEBNext® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA, Catalog #: E7370L) according to the manufacturer's instructions, with index codes added to each sample. In brief, the genomic DNA samples were fragmented to a size of 350 bp by sonication. The DNA fragments were then end-polished, A-tailed, and ligated to full-length adapters for Illumina sequencing, followed by PCR amplification. The PCR products were purified using the AMPure XP system (Beverly, USA). Subsequently, library quality was assessed on the Agilent 5400 system (Agilent, USA), and the library was quantified by qPCR (1.5 nM). Based on the effective library concentration and the required data volume, the qualified libraries were pooled and sequenced on the Illumina platform at Beijing Biomarker Technologies Co., Ltd. using the PE150 strategy.
Metagenomic Assembly and Binning
We conducted metagenomic sequencing on 60 fecal samples from the two pig breeds. Three low-quality samples from DHB were excluded, leaving 57 samples for subsequent analysis. For quality control, raw reads were trimmed with Trimmomatic (v0.39)17 using the following parameters: -threads 30 LEADING:3 TRAILING:3 SLIDINGWINDOW:5:20 MINLEN:60. Bowtie2 (v2.5.1)18 was then used to map all trimmed reads to the pig reference genome (Sscrofa11.1, GCF_000003025.6_Sscrofa11.1) so that reads of likely host origin could be removed. Seqkit (v2.6.1)18 was used to generate quality reports for the metagenomic reads of each sample. After these quality control steps, the clean paired-end reads were used for subsequent analyses. Kraken2 (v2.1.3)19 was used to classify the clean paired-end reads against the standard database, and Bracken (v2.9)20 was used to estimate the relative abundance of microbial communities at different taxonomic levels using a Bayesian model.
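The read-level quality control described above can be sketched as two command lines assembled in Python. This is a minimal illustration, not the authors' actual pipeline: the `trimmomatic` wrapper name, all file names, and the Bowtie2 index prefix are hypothetical, and the use of `--un-conc-gz` to keep read pairs that do not align concordantly to the host genome is an assumption, since the text does not say how host reads were removed. The trimming steps and thread count are taken from the text.

```python
from typing import List

# Trimming steps exactly as listed in the text.
TRIM_STEPS = ["LEADING:3", "TRAILING:3", "SLIDINGWINDOW:5:20", "MINLEN:60"]


def trimmomatic_cmd(r1: str, r2: str, prefix: str, threads: int = 30) -> List[str]:
    """Build a paired-end Trimmomatic command (output names are hypothetical)."""
    outputs = [
        f"{prefix}_R1.paired.fq.gz", f"{prefix}_R1.unpaired.fq.gz",
        f"{prefix}_R2.paired.fq.gz", f"{prefix}_R2.unpaired.fq.gz",
    ]
    return ["trimmomatic", "PE", "-threads", str(threads), r1, r2, *outputs, *TRIM_STEPS]


def host_removal_cmd(index: str, r1: str, r2: str, prefix: str, threads: int = 30) -> List[str]:
    """Build a Bowtie2 command that writes non-host pairs to gzipped FASTQ.

    Assumption: pairs that fail to align concordantly to the pig genome
    are treated as clean reads; '%' is expanded by Bowtie2 to 1 and 2.
    """
    return [
        "bowtie2", "-p", str(threads), "-x", index, "-1", r1, "-2", r2,
        "--un-conc-gz", f"{prefix}_clean_R%.fq.gz", "-S", "/dev/null",
    ]


if __name__ == "__main__":
    # Print the commands for one hypothetical sample instead of running them.
    print(" ".join(trimmomatic_cmd("S01_R1.fq.gz", "S01_R2.fq.gz", "S01")))
    print(" ".join(host_removal_cmd("Sscrofa11.1", "S01_R1.paired.fq.gz",
                                    "S01_R2.paired.fq.gz", "S01")))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools and the host index are available.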
The clean reads after quality control were assembled using the SPAdes (v3.13.0)21 tool. ONT third-generation clean reads were assembled using Flye (v2.9.2)22 in "--nano-raw" mode. The MetaBAT2 (v2.12.1)23 tool was then used to bin the contigs of each sample. Subsequently, the completeness and contamination of the MAGs were assessed using the CheckM (v1.2.2)24 tool. The MAGs with completeness ≥ 50% and contamination ≤ 10% were retained. Finally, dRep (v3.4.5)25 was used to dereplicate the retained MAGs with the following parameters: -comp 50 -con 10 -sa 0.99 -p 24, resulting in the final set of MAGs available for subsequent analyses.
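The retention rule for MAGs (completeness ≥ 50% and contamination ≤ 10%) can be written as a small filter. The sketch below assumes the CheckM estimates have already been read into memory; the `MagQuality` record and the example bin names and values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class MagQuality:
    """Quality estimates for one bin (percent values, as reported by CheckM)."""
    name: str
    completeness: float
    contamination: float


def passes_quality(mag: MagQuality,
                   min_completeness: float = 50.0,
                   max_contamination: float = 10.0) -> bool:
    """Apply the thresholds stated in the text; both bounds are inclusive."""
    return (mag.completeness >= min_completeness
            and mag.contamination <= max_contamination)


def filter_mags(mags: Iterable[MagQuality]) -> List[MagQuality]:
    """Keep only the bins that meet both thresholds."""
    return [m for m in mags if passes_quality(m)]


if __name__ == "__main__":
    # Hypothetical bins, for illustration only.
    bins = [
        MagQuality("bin.1", 92.4, 1.3),
        MagQuality("bin.2", 48.9, 0.5),   # fails completeness
        MagQuality("bin.3", 75.0, 12.1),  # fails contamination
    ]
    print([m.name for m in filter_mags(bins)])
```

The same thresholds are what the `-comp 50 -con 10` options pass to dRep in the dereplication step.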
Identification, Classification, and Screening of Cellulose- and Hemicellulose-Degrading Microorganisms
GTDB-Tk (v2.4.1)26 was used to assign taxonomy to the MAGs and isolate genomes against the latest GTDB database (R226). In addition, all dereplicated genomes were merged, and Prodigal (v2.6.3)27 was used to predict open reading frames (ORFs) and construct a complete gene set. CD-HIT (v4.8.1)28 was then used to dereplicate the predicted genes with the following parameters: -c 0.95 -G 0 -aS 0.9 -g 1. DIAMOND (v2.0.4)29 was used to align these genes against the latest CAZyme database (CAZyDB.07142024) to determine their carbohydrate-active enzyme composition. From the glycoside hydrolase (GH) families, we selected the enzymes targeting cellulose degradation (GH1, GH5, GH6, GH7, GH9, GH12, GH45, GH48, and GH74) and the enzymes targeting hemicellulose degradation (GH10, GH11, GH26, GH39, GH42, GH43, and GH51).
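The family selection above amounts to a lookup of each annotated CAZy family against two fixed lists. The following is a minimal sketch under the assumption that subfamily labels use an underscore suffix (for example `GH5_2`) and should be collapsed to their parent family; the function names are hypothetical and the parsing of DIAMOND output is left out.

```python
from typing import Optional

# GH families named in the text.
CELLULOSE_FAMILIES = {"GH1", "GH5", "GH6", "GH7", "GH9", "GH12", "GH45", "GH48", "GH74"}
HEMICELLULOSE_FAMILIES = {"GH10", "GH11", "GH26", "GH39", "GH42", "GH43", "GH51"}


def base_family(label: str) -> str:
    """Collapse a subfamily label such as 'GH5_2' to its family, 'GH5'."""
    return label.strip().split("_", 1)[0]


def classify_family(label: str) -> Optional[str]:
    """Return 'cellulose', 'hemicellulose', or None for any other family.

    Exact set membership is used, so 'GH10' is never mistaken for 'GH1'.
    """
    family = base_family(label)
    if family in CELLULOSE_FAMILIES:
        return "cellulose"
    if family in HEMICELLULOSE_FAMILIES:
        return "hemicellulose"
    return None


if __name__ == "__main__":
    for hit in ["GH5_2", "GH43_10", "GH10", "GH13", "CBM2"]:
        print(hit, classify_family(hit))
```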
Genomes with fiber-degrading potential were classified according to the typical steps of cellulose degradation. For cellulose degradation30, candidate genomes were required to encode: (1) ≥ 1 endoglucanase (GH5/GH9), (2) ≥ 1 anaerobic exoglucanase (GH48) or ≥ 1 aerobic exoglucanase (GH6/GH7), and (3) ≥ 1 β-glucosidase (GH1/GH3).
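The three base criteria for a cellulose-degrading candidate can be checked from per-genome family counts. This sketch assumes the counts are available as a mapping from CAZy family name to gene count; the mapping layout and function names are hypothetical. Because criterion (2) accepts either an anaerobic or an aerobic exoglucanase, it reduces to requiring at least one gene across GH48, GH6, and GH7.

```python
from typing import Iterable, Mapping

ENDOGLUCANASES = ("GH5", "GH9")
ANAEROBIC_EXOGLUCANASES = ("GH48",)
AEROBIC_EXOGLUCANASES = ("GH6", "GH7")
BETA_GLUCOSIDASES = ("GH1", "GH3")


def total(counts: Mapping[str, int], families: Iterable[str]) -> int:
    """Sum gene counts over a group of families; missing families count as zero."""
    return sum(counts.get(family, 0) for family in families)


def is_cellulose_candidate(counts: Mapping[str, int]) -> bool:
    """Check the three base criteria listed in the text."""
    has_endo = total(counts, ENDOGLUCANASES) >= 1
    has_exo = (total(counts, ANAEROBIC_EXOGLUCANASES) >= 1
               or total(counts, AEROBIC_EXOGLUCANASES) >= 1)
    has_beta = total(counts, BETA_GLUCOSIDASES) >= 1
    return has_endo and has_exo and has_beta


if __name__ == "__main__":
    # Hypothetical genome profiles, for illustration only.
    print(is_cellulose_candidate({"GH5": 2, "GH48": 1, "GH3": 4}))  # all three steps present
    print(is_cellulose_candidate({"GH9": 1, "GH1": 2}))             # no exoglucanase
```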
A scoring system was established to assess the functional potential of candidate genomes in fiber degradation. Genomes that met the base criteria above received one point. For cellulose-degrading genomes, three additional criteria were used: (1) ≥ 2 endoglucanases, (2) ≥ 2 exoglucanases, and (3) presence of at least one lytic polysaccharide monooxygenase (LPMO; AA9/AA10). Each fulfilled additional criterion contributed one point (maximum of three).
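One possible reading of the scoring scheme is sketched below. It assumes that a genome failing the base criteria scores zero, that the base point and the additional points are summed (so the total ranges from 0 to 4), and that exoglucanases are counted across GH48, GH6, and GH7. These are interpretations of the text, and the function and mapping names are hypothetical.

```python
from typing import Iterable, Mapping

ENDOGLUCANASES = ("GH5", "GH9")
EXOGLUCANASES = ("GH48", "GH6", "GH7")
BETA_GLUCOSIDASES = ("GH1", "GH3")
LPMOS = ("AA9", "AA10")


def total(counts: Mapping[str, int], families: Iterable[str]) -> int:
    """Sum gene counts over a group of families; missing families count as zero."""
    return sum(counts.get(family, 0) for family in families)


def cellulose_score(counts: Mapping[str, int]) -> int:
    """Score a genome: 1 base point plus up to 3 additional points."""
    endo = total(counts, ENDOGLUCANASES)
    exo = total(counts, EXOGLUCANASES)
    beta = total(counts, BETA_GLUCOSIDASES)
    lpmo = total(counts, LPMOS)

    # Assumption: genomes that miss any base criterion are not scored further.
    if endo < 1 or exo < 1 or beta < 1:
        return 0

    score = 1  # base criteria met
    if endo >= 2:
        score += 1
    if exo >= 2:
        score += 1
    if lpmo >= 1:
        score += 1
    return score


if __name__ == "__main__":
    # Hypothetical profile, for illustration only.
    print(cellulose_score({"GH5": 2, "GH48": 2, "GH1": 1, "AA10": 1}))
```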
Screening of Intestinal Cellulose-Degrading Bacteria
For the preliminary screening, the strains were revived, and 2–5 µL of the bacterial suspension was inoculated onto CMC-Na agar plates containing an inorganic nitrogen source (three replicates per group). The plates were incubated at 37°C for 48 hours. Subsequently, 1 mg/mL Congo red staining solution was added and left to stain for 30 minutes. The staining solution was then removed, and the plates were decolorized with 1 mol/L NaCl solution for 30 minutes. Cellulose-degrading ability was determined by the formation of a hydrolysis zone, and the degradation intensity was preliminarily assessed by the ratio of the hydrolysis zone diameter (D) to the colony diameter (d) (D/d).
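The D/d ratio and its average over the three replicate plates can be computed as in the sketch below. Averaging the per-plate ratios is an assumption, as the text does not say how replicates were combined; the measurements shown are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def hydrolysis_ratio(zone_diameter_mm: float, colony_diameter_mm: float) -> float:
    """Return D/d for one plate; both diameters must use the same unit."""
    if colony_diameter_mm <= 0:
        raise ValueError("colony diameter must be positive")
    return zone_diameter_mm / colony_diameter_mm


def mean_hydrolysis_ratio(replicates: Sequence[Tuple[float, float]]) -> float:
    """Average D/d over replicate plates given as (D, d) pairs."""
    return mean([hydrolysis_ratio(D, d) for D, d in replicates])


if __name__ == "__main__":
    # Hypothetical (D, d) measurements in mm for three replicate plates.
    plates = [(21.0, 7.0), (18.0, 6.0), (24.5, 7.0)]
    print(round(mean_hydrolysis_ratio(plates), 2))
```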
Before enzyme activity determination, a standard curve was prepared using a 1 mg/mL glucose standard solution. Aliquots of 0–1.2 mL were diluted to 2 mL, and 1.5 mL of DNS reagent was added. The mixture was boiled for 10 minutes and then diluted to 25 mL. The absorbance at 540 nm was measured. The standard curve equation was established as y = 0.2962x (R² = 0.991).
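A standard curve of the form y = kx has no intercept, so its slope is the least-squares estimate k = Σxy / Σx². The sketch below shows that fit and its inversion, assuming y is the absorbance at 540 nm and x is the glucose quantity in whatever units the standards were expressed in. The data points are hypothetical and are not the measurements behind the reported slope of 0.2962.

```python
from typing import Sequence


def fit_through_origin(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares slope k for the model y = k * x (no intercept)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    sum_xx = sum(xi * xi for xi in x)
    if sum_xx == 0:
        raise ValueError("at least one x value must be non-zero")
    return sum(xi * yi for xi, yi in zip(x, y)) / sum_xx


def glucose_from_absorbance(absorbance: float, slope: float = 0.2962) -> float:
    """Invert the standard curve: x = y / k (default k is the slope in the text)."""
    if slope <= 0:
        raise ValueError("slope must be positive")
    return absorbance / slope


if __name__ == "__main__":
    # Hypothetical standards, for illustration only.
    glucose = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
    a540 = [0.00, 0.06, 0.12, 0.17, 0.24, 0.30, 0.35]
    print(round(fit_through_origin(glucose, a540), 4))
```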
The target strains from the preliminary screening were inoculated into LB liquid medium and cultured at 37°C with shaking at 160 r/min for 12 hours. Then, 10 mL of the culture was transferred to 200 mL of enzyme production fermentation medium and fermented under the same conditions for 48 hours. The supernatant was obtained by centrifugation at 8,000 r/min for 20 minutes and used as the crude enzyme solution.
CMCase activity detection: The reaction system contained 1 mL of crude enzyme solution and 1 mL of 1% CMC-Na solution (with an inorganic nitrogen source). The mixture was hydrolyzed at 50°C for 40 minutes, after which 2 mL of DNS reagent was added. The mixture was boiled for 15 minutes and then diluted to 10 mL before the absorbance (A₁) was measured. The control group used inactivated crude enzyme solution (boiled for 2 hours) and was measured in the same manner (A₂). The difference in absorbance was calculated as ΔA = A₁ − A₂, and the reducing sugar concentration x (mg/mL) was obtained by substituting ΔA into the standard curve. The CMCase activity (U/mL) was calculated using the following formula:
CL (U/mL) = 1000 × x × V₁ / (V₂ × T)
where V₁ is the total volume of the reaction system (10 mL), V₂ is the volume of crude enzyme solution added (1 mL), and T is the enzyme hydrolysis time (40 minutes). One unit (U) of CMCase activity corresponds to 1 µg of glucose produced per minute.
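The activity calculation chains the absorbance difference, the standard curve, and the formula above. A minimal sketch follows, using the slope (0.2962), V₁ (10 mL), V₂ (1 mL), and T (40 min) from the text as defaults; the function name and the example absorbance readings are hypothetical.

```python
def cmcase_activity(a1: float, a2: float,
                    slope: float = 0.2962,
                    total_volume_ml: float = 10.0,
                    enzyme_volume_ml: float = 1.0,
                    hydrolysis_min: float = 40.0) -> float:
    """Return CMCase activity in U/mL (1 U = 1 microgram glucose per minute).

    a1: absorbance of the sample; a2: absorbance of the inactivated control.
    """
    if slope <= 0 or enzyme_volume_ml <= 0 or hydrolysis_min <= 0:
        raise ValueError("slope, enzyme volume and time must be positive")
    delta_a = a1 - a2
    # Reducing sugar concentration x (mg/mL) from the curve y = slope * x.
    x_mg_per_ml = delta_a / slope
    # The factor 1000 converts mg to micrograms.
    return 1000.0 * x_mg_per_ml * total_volume_ml / (enzyme_volume_ml * hydrolysis_min)


if __name__ == "__main__":
    # Hypothetical absorbance readings, for illustration only.
    print(round(cmcase_activity(a1=0.45, a2=0.15), 1))
```

With the default volumes and time, the formula reduces to 250 × x, so a reducing sugar concentration of 1 mg/mL corresponds to 250 U/mL.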