1. Oceanographic context
Seawater in the Gerlache-Bismark Strait (Fig. 1) originates mainly from the Bellingshausen Sea (entering mainly through the Gerlache and Bismarck straits, and Schollaert Channel) and the Weddell Sea (entering through the Bransfield Strait). The dominant current is the Gerlache Strait Current, which flows NE transporting most of the Gerlache Strait waters (including surface and deep waters) towards the Bransfield Strait (Zhou et al., 2002; Savidge et al., 2009; Su et al., 2022). The opposite current is the Bransfield Strait Current, which moves waters from the Bransfield Strait towards the Gerlache Strait (Kerr et al., 2018a; Moffat et al., 2018). Water currents in the Gerlache Strait are linked to the ACC/APCC (Antarctic Circumpolar Current/Antarctic Peninsula Coastal Current) transport system (Kerr et al., 2018a; Moffat et al., 2018).
During summer, there is a permanent isopycnal at approximately 100 m which separates surface and intermediate water layers (Parra et al., 2020). Two main fronts have been described in the Gerlache Strait: the surface water thermal front (STF) and the sub-pycnocline front (SPF). The STF is situated above the pycnocline, at approximately 64.5 °S, close to the Schollaert Channel. This surface front is permanent during the austral summer (Parra et al., 2020). The SPF is situated between 100 and 400 m depth (Parra et al., 2020) and has a variable location along the strait where it separates the relatively warmer intermediate waters entering from the Bismarck Strait from colder waters present in the northern part of the Gerlache Strait (Kerr et al., 2018a; Parra et al., 2020). In this region of Antarctica, water depths are traditionally categorized as surface waters (0 m-100 m), intermediate waters (100 m-300 m), and deep waters (below 300 m to the seafloor) (Parra et al., 2020). The bathymetry of the north and south ends of the Gerlache Strait is shallower (around 200–300 m deep) than that of the central basin with average bottom depths of around 800 m (Fig. 1).
Surface waters in the Gerlache Strait (from 0 to 100 m) have variable characteristics and are affected by multiple processes throughout the year, such as sea ice formation and melting, water mixing, wind forcing, and solar heating. In summer, surface water is continuously mixed (Parra et al., 2020). Surface waters include cold waters, such as Transitional Zonal Waters with Bellingshausen Sea influence (TBW) and Antarctic Surface Waters (AASW) (García et al., 2002). Surface waters also evidence Glacially Modified Waters (GMW), which originates from melting glaciers and typically forms a thin (20–30 m) layer at the surface extending from inner fjords and bays into the strait (Dierssen et al., 2001; Cape et al., 2019, Osińska and Herman, 2024).
Intermediate waters in the Gerlache-Bismarck Strait (from 100 to 300 m) are characterized by a mixing of distinct waters coming from the Bellingshausen and Weddell seas (Kerr et al., 2018b), which include Circumpolar Deep Waters (CDW), Transitional Zonal Waters with Weddell Sea Influence (TWW), and High Salinity Shelf Water (HSSW). Circumpolar Deep Water (CDW) is a relatively warm, salty, and low-oxygenated intermediate water mass (Smith et al., 1999; Kerr et al., 2018b), which transforms into modified mCDW (modified-CDW) when it enters the Gerlache-Bismarck Strait (García et al., 2002; Kerr et al., 2018b). CDW enters the area through the Bismarck Strait (Palmer Deep Canyon), Schollaert Channel, and north off Liége Island. CDW dominates the Bismarck Strait, which is the most important source path into the Gerlache Strait (Parra et al., 2020). CDW can be divided into Upper (UCDW) and Lower (LCDW, normally between 800 and 1000 m) CDW (Smith et al., 1999), and CDW flowing from the Bellingshausen Sea into the Gerlache-Bismark Strait has been described either as LCDW (García et al., 2002) or UCDW (Kerr et al., 2018b). TWW water is generated in the Weddell Sea and crosses the Bransfield Strait towards the northern end of the Gerlache Strait. TWW is found between depths of 300 and 700 m (García et al., 2002; Sangrà et al., 2011). The intermediate and deep-water masses in the Gerlache Strait from the Weddell Sea have been reported as a modified version of the HSSW (mHSSW in Parra et al., 2020; HSSW-derived in Kerr et al., 2020b). The mHSSW enters the Gerlache Strait as bottom water and is characterized by low temperature, high salinity, and high oxygen content (Kerr et al., 2018b). This water mass extends to the southernmost point of the WAP in the central area of the Gerlache Strait (around the Schollaert Channel), becoming scarce in the southern Gerlache Strait, and negligible in the Bismarck Strait. Generally, in the northern part of the Gerlache Strait, mCDW intrudes the TWW (García et al., 2002) and HSSW (Kerr et al., 2018b; Parra et al., 2020), forming the sub-pycnocline front (Parra et al., 2020), which separates the older waters coming from the Bellingshausen Sea from the younger ones coming from the Weddell Sea (Kerr et al., 2018b). Information about the thermohaline indices of the different water masses in the Gerlache-Bismark strait can be found in Supplementary Table 1.
2. Sampling
Sampling was carried out in February 2020 (summer in the Southern Hemisphere), onboard the R/V Karpuj. A transect was performed along the full Gerlache-Bismarck Strait (Fig. 2), from northeast off Brabant Island (64,32º S, 61,80º W) to southwest off Anvers Island (64,97º S, 64,52º W), including 13 oceanographic stations (Fig. 2). At each oceanographic station, a CTD (SBE 25 plus by Seabird®) was used to assess the oxygen concentration (mg L− 1) and saturation percentage (% sat.), temperature (°C), potential temperature (°C), and salinity (practical salinity) (see details in Höfer et al., 2019). Additionally, seawater from the oceanographic stations was sampled at 0, 10, 100, 200, 240, and 400 m using 5 L Niskin and 10 L Go-Flo bottles in order to study the biogeochemistry and microbial communities (Fig. 2). The measured biogeochemical variables included chlorophyll concentration (mg m− 3), in situ pH, dissolved silicate (µmol L− 1), dissolved nitrate (µmol L− 1), dissolved phosphate (µmol L− 1), partial pressure of CO2 (µatm), concentration of suspended inorganic matter (ISM, mg L− 1), and concentration of suspended organic matter (OSM, mg L− 1). The fraction of organic matter present in suspended matter was defined as %OSM. Biogeochemical variables were sampled and analyzed as per Höfer et al. (2019) except for ISM and OSM that were processed following Giesecke et al. (2019).
Microbial parameters included microbial abundance and diversity. Microbial abundance was sampled by fixing seawater (1,350 µL) with glutaraldehyde (150 µL of 1% glutaraldehyde), and samples were collected in triplicate. Prokaryotic and eukaryotic abundances were determined by flow cytometry using a FACSCalibur flow cytometer (Department of Oceanography, University of Concepcion, Chile). The microbial groups detected by flow cytometry included unicellular eukaryotes (composed by phototrophs < 6 µm, including picoeukaryotes and nanoeukaryotes; cells mL− 1) and prokaryotes (including bacteria and archaea; cells mL− 1). To determine microbial diversity (including microscopic prokaryotes and eukaryotes), 9 L of seawater were filtered using a peristaltic pump. To diminish the impact of clogging and dislodging particles from the filters, we filtered at a very low speed and pressure, and replaced the filters when needed. The filters were submerged in a nucleic acid preserving solution (RNAlater, Invitrogen), rapidly frozen at -20ºC onboard and stored at -80 ºC until DNA extraction. The filtration system included a 200 µm mesh followed by 3 polycarbonate filters (pore sizes: 20 µm, 3 µm, and 0.2 µm) arranged sequentially. Using this procedure, we obtained three size fractions corresponding to three sizes of plankton and marine particles: picoplankton and picoparticles (0.2–3 µm), which include free-living prokaryotes and eukaryotes; nanoplankton and nanoparticles (3–20 µm), which consist mainly of larger eukaryotic microorganisms and prokaryotes associated with particles or protists; and microplankton and microparticles (20–200 µm), which include larger eukaryotic microorganisms and prokaryotes associated with larger particles or protists.
3. Characterization of water masses mixing
The principal characteristics of the different water masses present in the Gerlache-Bismarck Strait (i.e., AASW, GMW, TBW, TWW and CDW) were obtained from: Smith et al., 1999; Garcia et al., 2002; Zhou et al., 2002; Sangrà et al., 2011; Kerr et al., 2018b; Moffat et al., 2018; Parra et al., 2020. The contribution of each water mass to the seawater properties at each station and depth was determined by applying the mixing triangle method (Mamayev, 1975) as described in Silva et al., (2009). The method involves analyzing the distribution of water masses using T-S (Temperature-Salinity) diagrams for the considered oceanographic stations. According to this approach, to calculate the mixing ratio percentages, it is essential to define the distinct thermohaline indices characteristic of each water mass within a given triangle (Supplementary Table 1). Finally, for each station and depth, the contribution (in %) of each water mass was obtained, and the dominant water mass at each station and depth was determined as the water mass contributing with a higher % to that particular mixing.
4. DNA extraction, Sequencing and Sequence processing
This work includes the study of prokaryotic (bacteria and archaea) and eukaryotic microbes through DNA analysis, using primers described by Parada et al., (2015). DNA extraction, library preparation, and sequencing were performed at the IMR (Integrated Microbiome Resource, Halifax, Canada), and protocols were carried out as originally described by Comeau et al., (2017) and updated as in Comeau and Kwawukume (2023). DNA extraction was performed using the QIAGEN PowerSoil Pro kit on QIAcubeHT, and sequencing was performed on an Illumina MiSeq platform with a v3 600 cycle kit (2x300bp). Computational analyses of demultiplexed reads were performed based on QIIME2 (Bolyen et al., 2019) and DADA2 (Callahan et al., 2016) along with a custom bioinformatic pipeline incorporating elements from McNichol et al., (2021), and Yeh et al., (2021). Briefly, primer sequences were removed with Cutadapt (Martin 2011), discarding any sequence pairs that did not contain the forward and reverse primers. Mixed amplicon sequences were then split into 16S and 18S rRNA genes pools using bbsplit.sh from the bbtools package (https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/) against curated 16S/18S databases (https://osf.io/e65rs/) derived from SILVA 132 (Quast et al., 2012) and PR2 (Guillou et al., 2012). Subsequently, parallel denoising of 16S and 18S sequences were conducted to generate ASV tables for subsequent analysis. For both 16S and 18S fractions, the forward read was truncated to 220bp and the reverse read to 180bp. 16S reads were merged prior to denoising, whereas for 18S the forward and reverse read were concatenated since sequence lengths are not sufficient for full overlap with 18S rRNA for this primer region. In this study, we used the terms ASVs and OTUs interchangeably.
Data analysis was performed using R software (v4.3.2) within RStudio interface (R Core Team, 2020), and environmental variables together with the geography were visualized using ODV software (Ocean Data View v5.6.3). To analyze microbial communities, the ASV/OTU table was first randomly subsampled (with the rrarefy function, vegan package) to the number of reads present in the sample with the lowest number of reads (n = 4041), reducing from 111 samples to 107 and a total of 13,548 ASVs.
The diversity of each sample was calculated with different diversity indices, those were: the total number of species per sample or richness (index S.Obs), the Shannon index (H), the Simpson index (D), and the Pielou index (J’) (evenness). The formulas used for calculating the indices were implemented in R, following Borcard (2011) and Kers and Saccenti (2022). The specific formulas used are shown in Supplementary Table 2. Subsequently, correlation analysis (Supplementary Fig. 1) was performed among the different indices using the cor and cor.test functions (base package), where we applied the Spearman method. Results were visualized using the ggpairs function (GGally package), revealing that all indices were generally correlated (R2 ≥ 0.4; R > 0.5; p-value < 0.001). Among all indices, we chose the richness index (S.Obs) for downstream diversity analyses because of its conceptual simplicity. To determine whether diversity in terms of richness was statistically similar or different among the levels within each factor (fraction, depth, stations), a Kruskal-Wallis test was conducted using the kruskal.test function from the R base package.
Low abundant ASVs (also referred to as “rare”), were defined as those representing < 1% of the total abundance of all sequences (considering all samples). Taxonomic composition of microbial communities was visualized with barplots (tidyverse and ggplot2 packages), by selecting the most abundant taxa at the Phylum level. The other taxa, together with the rare, were represented as “Others”.
To assess the influence of different factors (such as water fronts, water masses, stations, depth, size fractions and zones) on microbial communities, a series of non-parametric permutational multivariate analysis of variance (PERMANOVA) tests with 999 permutations were conducted (adonis2 function, vegan package). The effects of the environment (explanatory variables) on microbial communities (response variables) were assessed by considering two types of explanatory variables: environmental variables (including physical and biogeochemical variables, as well as microbial abundances) and water masses mixing (i.e., the contribution, in percentage (%), of each water mass). A simple set of the explanatory variables explaining microbial data was obtained by first discarding redundancies (we applied Spearman correlation method and the thresholds: R > 0.9; p-value < 0.001) (Supplementary Fig. 2). Then the Bioenv procedure was applied (bioenv function, vegan package), which identifies the smallest subset of explanatory variables that correlates maximally with microbial community data. For this, distance matrices were constructed (for explanatory variables using Euclidean distances and for response variables using Bray-Curtis distances) and the Spearman-rank-based correlation method was applied. Finally, the effect of selected environmental variables on microbial communities were visualized in a distance-based redundancy analysis (dbRDA) (capscale function, vegan package). To assess the significance of the selected variables on microbial communities, a permutational ANOVA test was conducted (anova.cca function, package vegan), with 999 permutations.
To gain a deeper understanding of the structure of the microbial communities present in different water masses, we explored: co-occurrence networks, indicator (tracer) microbes and the core microbiome. Co-occurrence networks were explored with the igraph R package for network analysis and visualizations. Only those ASVs that appeared in at least three samples were considered. Hypothetical interactions among ASVs were detected through correlations. For this, a correlation matrix between the ASVs was generated by applying the Spearman correlation method and those correlations with a threshold of R > 0.65 were selected. Different co-ocurrence network parameters were calculated (igraph package) including: number of nodes and edges (gorder and gsize functions, respectively), average path length mean_distance function), modularity (cluster_walktrap and modularity function), average degree (mean function, base package), clustering coefficient (transitivity function), and density (edge_density function). Networks were visualized using the functions graph_from_adjacency_matrix, delete.vertices, and graph_from_data_frame. To characterize the indicator species (usually defined as those microbes that are abundant in a particular group of samples, and nearly absent in other groups of samples), we were restrictive in terms of their distribution (we contemplated only those microbes exclusively present in a group of samples, and not present in others) and we included low-abundant taxa, so we renamed the indicator species as tracer microbes. Tracer microbes (i.e., those microbes only present in a particular group of samples, and absent in other groups of samples) of each water mass were obtained with a presence-absence (binomial) ASVs table and applying the multipatt function (indicspecies R package) with 9999 permutations, and selecting r.g as species-site group association function. p-values were adjusted post hoc for multiple comparisons using the p.adjust function with the False Discovery Rate (FDR) method, to reduce the false positive rate (Benjamini and Hochberg, 1995). Tracer microbes were calculated at the ASV level and represented at the Phylum level in dotplots (option geom_point, ggplot function, ggplot2 package). Finally, the core microbiome (i.e., those microorganisms that appear consistently in samples from a given group, regardless of what microorganisms are in other groups) of the water masses was determined using the function core_members from the phyloseq package and visualized with ggplot2. The core microbiome was calculated at the ASV level, considering a minimum abundance of 1%, and a prevalence of 75% among samples of the same water mass. These thresholds align within the range commonly used in similar studies (Custer et al., 2023).