All data collection was conducted following ethical approval by University of Roehampton and the receipt of research permits from Haut-Commissariat aux Eaux et Forêts et à la Lutte Contre la Désertification, Royaume du Maroc.
Study site and subjects
We collected data from two groups of wild Barbary macaques located in Ifrane National Park, Morocco (33˚ 24’N, 05˚ 12’W; elevation 1,500-2,000m above sea level). Both groups were habituated to human researchers and were observed from approximately 7 meters, with all subjects individually recognized. The subjects of this study were the adult males and females of each group (n = 12 [5 males] in the Blue group; n = 15 [6 males] in the Green group). Adults were defined as sexually mature individuals, which was based on body size in both sexes, the presence of anogenital swellings during mating season for females, and descended testicles and large canines for males (Fooden, 2007).
Behavioral data collection
Behavioral data were collected to quantify behavioral syndromes, determine dominance hierarchies, analyze variation in behavioral time allocation and construct social networks. We used a combination of focal, scan, and ad libitum sampling (Altmann, 1974), collected between October 2013 and March 2014, and between February 2015 and April 2015. Behavioral data were collected from near-dawn to near-dusk. Focal observational samples lasted 30 minutes and the order of subjects was pseudorandomized, with no individual sampled more than once on the same day.
During focal observations, the duration of state behaviors (foraging, feeding, resting, traveling, vigilance, play and grooming) were recorded continuously (P. J. Tkaczynski, 2016), while the frequency of contact, proximity, agonistic, dominance and solitary behaviors, as well as facial displays (open mouth, bearing teeth) and vocalizations, (grunts, barks, screams, teeth chatters) were recorded as point events (Fischer & Hammerschmidt, 2002; P. J. Tkaczynski, 2016). For feeding behavior, we also noted what the individuals were eating, first at a broad categorical level (leaves/grass; insects; mushrooms; fruit/nuts; human-produced), then specific species noted if they were possible to identify (e.g., acorns from Quercus ilex and Quercus faginea oaks). At the start and end of the focal samples, proximity scans were conducted to record the number of conspecific group members, as well as humans (specifically tourists and vendors at eco-tourism sites) within 0-1m, 1-5m and 5-10m of the subject of the focal sample. Observers also visually assessed whether subjects were central or peripheral within the group in terms of spatial position; individuals were considered peripheral if they were the outermost individual at the front, rear, or side of the group, and otherwise were considered to be central. The proximity and spatial data were incorporated into the behavioral syndrome quantification; the numbers of humans in proximity were used as a predictor variable in our models (see below). Additional aggression and dominance interactions were collected ad libitum; these aggression and dominance data were used in the dominance interactions and to assess whether focal individuals had received aggression on a given day prior to focal sampling.
All observation data was collected by a team including PJT, LADC and six research assistants. The data were collected using a Psion handheld computer and The Observer XT software version 8.0 (Noldus Information Technology, 2008). Inter-observer reliability tests using intra-class coefficients (ICC; (Shrout & Fleiss, 1979)) were conducted for all researchers: researchers collected behavioral data only once they had recorded two consecutive tests where the frequency and duration of variables recorded were significantly reliable compared to those recorded by PJT (ICC > 0.95; p < 0.05).
Socioecological variables
To account for seasonality, we differentiated between mating and non-mating seasons. The dates of the mating seasons were determined by the first and last observed copulation with ejaculation within each group (Young et al., 2013): mating season 1 (Blue group: 11-Oct-2013:20-Jan-2014; Green group: 9-Oct-2013:9-Jan-2014), non-mating season 1 (Blue group: 21-Jan-2014:5-Mar-2014; Green group: 10-Jan-2014:6-Mar-2014) and non-mating season 2 (4-Feb-2015:18-Apr-2015 for both groups).
To quantify anthropogenic presence, at the start of each focal observation we estimated the number of humans (excluding researchers doing the observations) present within the radius of the group spread of the study group. We recorded the number of humans using a logarithmic index: 0 (no humans), 1 (between 1 and 10 humans), 2 (between 11 and 100 humans), and 3 (above 101 humans) humans present (Price et al., 2014). Using this index allowed for rapid and thus practical assessments of human presence concurrent to behavioral data collection (Price et al., 2014). In addition, during the proximity scans at the start and end of focal samples, we recorded the number of humans within 10 meters of the focal subject.
To quantify high quality food in diet, we calculated the proportion of high quality food items in the diet of subjects (Thompson & Wrangham, 2008). Barbary macaques are dietary generalists that occupy a typically nutrient poor habitat and rely predominantly on herbaceous plants, shrubs, and grasses (Ménard, 2002). However, during autumn and winter seasons, they preferentially feed on more nutrient-rich items, namely the acorns of oak trees (Quercus ilex and Quercus faginea), and the berries of hawthorn bushes (Crataegus spp) (Ménard, 2002; Waterman et al., 2019). Therefore, for each day, we recorded the proportion of the total time spent feeding that was spent feeding on either acorns or hawthorn berries.
As daily minimum temperature increases physiological stress (Young et al., 2014) and the likelihood of the macaques to form larger thermoregulatory huddles (Campbell et al., 2018b), we used this measure as our temperature variable in our analyses. During 2013 and 2014, air temperature was recorded hourly when in the presence of subjects using a 3500 Kestrel Pocket Weather Station. In 2015, due to a change in protocol relating to associated projects, air temperature was recorded only when the subjects left or entered sleeping trees in the morning and evening, respectively. In all time periods, temperatures were recorded in the shade and with the device 1.5 m above the ground. In total, temperature recordings were made on 111 days during the entire study period, an average of 13.75 (± 6.43) daily temperature recordings per study month.
Statistical analysis
All data preparation, analysis, and visualization were performed in R 3.6.3 (R Core Team, 2020) using the RStudio interface (RStudio Team, 2021). For data cleaning, preparation, and the generation of figures, we used a combination of base R functions and the tidyverse suite of packages (Wickham et al., 2019).
Dominance rank calculations
Dyadic interactions of aggression (contact and non-contact) and submission were used for rank calculations. We used a likelihood-based adaptation of the Elo rating approach (Foerster et al., 2016; Mielke et al., 2018; Neumann et al., 2011). Elo ratings reflect the probability of an individual to ‘win’ in a dominance interaction with another individual, with traditional approaches applying a fixed starting value, k, to all individuals (Neumann et al., 2011). The likelihood modification uses a maximum-likelihood estimation to optimize the starting k for all individuals, such that all individuals enter the hierarchy with individual k values (Foerster et al., 2016; Mielke et al., 2018). We assigned continuous Elo ratings to subjects for each day of observation; each score was standardized between 0 (lowest rank) and 1 (highest rank) within each group.
Behavioral syndrome quantification
In a previous study (P. J. Tkaczynski, 2016; P. J. Tkaczynski et al., 2019), three independent, non-correlated and consistently expressed behavioral syndromes have been identified in our study subjects: Excitability, Sociability and Tactility. In brief, behavioral syndrome quantification involved two stages: (i) testing the repeatability of behavioral variables across the three seasons; (ii) identifying correlations among the repeatable behavioral variables. The repeatability of behavioral variables was estimated using the rptr package (Stoffel et al., 2018); factor analyses using the psych package (Revelle, 2018) identified correlations among repeatable behaviors. The sums of the mean values of the behavioral variables that loaded onto a particular factor were used to create individual (subject-specific) scores for a particular behavioral syndrome. One score per behavioral syndrome per season was allocated to each subject, with higher scores indicating greater expression of the syndrome (P. J. Tkaczynski et al., 2019). For the reason outlined above, for the present study, only scores for the Excitability syndrome were considered
Social plasticity in decision making in relation to current social environment
In total, we collected 1,236 hours of behavioral data, a mean ± SD of 46 ± 2 hours per subject.
To determine individual choices about social grooming in relation to the current social environment, we first merged the activities in our focal data into two categories: grooming (included the former variables of giving grooming and of simultaneous grooming) and other activities (co-feed, feed, foraging, foraging-feeding, rest, self-groom, receiving grooming, travel, vigilance). Note, we included “receiving grooming” as separate from our “grooming” category, as we wanted to characterize variation in proactive social decisions about social time allocation from the perspective of the focal animal, i.e., an individual can be resting, approached by another individual and groomed, without making effort to participate in that social interaction.
We then converted the first 15 minutes of each focal sample into a single datapoint, recording whether or not the subject groomed an adult partner during this interval. We focused on adult grooming partners as relationships among adults have established fitness outcomes in this population (Campbell et al., 2018b; Lehmann et al., 2016; Young et al., 2014). The first 15 minutes was chosen as the duration of the interval, as this allowed us to use the proximity scan at the start of the focal to calculate, and thus include in our models the potential social partners and bystanders within 10m of the subject, while allowing sufficient time for the subject to make a social decision to groom another individual or not. We excluded all intervals where the subject was out-of-sight for more than 1 minute and any focal sample in which the focal subject was already engaged in grooming at the start of the focal. We further excluded all datapoints for which we were missing data on our main predictors of variation in social time allocation.
To test variation in social plasticity to the current social environment, we used a generalized linear mixed effect model (model 1) with a Bernoulli error structure and a logit link function; the dependent variable was each interval/choice and whether grooming with an adult partner occurred during that interval.
Our main predictors were the Excitability/Boldness score for each individual in interaction with the following socioecological variables:
-
anthropogenic presence (number of humans within 10m),
-
number of conspecific bystanders within 10m of subject, and
-
maximum rank difference (Elo score difference on day of observation) between the focal and the highest-ranking conspecific within 10m of the subject.
We included control variables of the dominance rank (Elo score of subject on day of focal sample) and sex of the focal subject, the time of day of the interval (calculated as proportion of day, e.g. 12:00 = 0.50; including as a linear and a quadratic term to account for potential peaks in activity around dawn and dusk), the minimum temperature (°C) recorded that day, and group membership (Blue or Green) of the individual.
The random effect of individual identity was included in the model, as well as random slopes for anthropogenic presence, number of conspecifics within 10m, maximum rank difference, dominance rank, time of day (liner and quadratic terms) and the minimum temperature within individual identity. These time allocation models also included a random effect (random intercept) of day of sampling to account for potential autocorrelation or non-independence of response, predictor, or control variables.
Plasticity in social connectivity in relation to fluctuating socioecology
Network construction and metrics
For the network analyses, for each group, directed and weighted social networks based on dyadic grooming data were constructed, here focusing on all grooming bouts between adult partners. We aimed to generate grooming networks spanning short enough periods to determine potential short-term adjustments in social connectivity to fluctuations in socioecology. However, we also wanted to generate networks with sufficient observation time that individuals had the opportunity to interact with social partners. To determine the most appropriate temporal aggregation of our grooming data, we used the netTS R package (Bonnell & Vilette, 2021). This package uses a “moving window” and bootstrapping approach to assess the robustness of networks aggregated at different length timespans.
On average, each individual was observed at least once every 2.29 (± 0.07) days; the maximum interval between two focal samples for an individual within each field season was 17 days. Therefore, we specified our grooming data to be aggregated at 15-, 25-, and 40-day timespans, with a 5-day moving window. Using the ‘check.windowsize’ function of netTS, we performed 1,000 bootstrap comparisons between node degree (the number edges connected to the node) calculated in observed and bootstrapped networks for each of the three timespans. This function calculates the cosine similarity between the observed and bootstrapped metrics. The sensitivity to missing edges was assessed by bootstrapping from either 100, 80 or 70% of the original data.
The results of this procedure were visually assessed and are presented in figures S1 and S2 of the supplementary materials. The 25-day aggregation was clearly preferable to the 15-day aggregations, with no marked improvement with a 40-day aggregation. Therefore, for the network analyses, we split the observational data into nine, 25-day time windows. This equated to roughly one month of data collection and guaranteed each individual was observed at least once, even in time windows with fewer observation days due to field conditions (extreme weather, difficulty finding groups). Subjects were observed for mean ± SD of 5.69 ± 1.05 hours per time window (Table S3).
For each group and for each time window, we used Bayesian inference to quantify edge weights (grooming given and received) between dyads, and then construct social networks (Hart et al., 2021; Redhead et al., 2021). This approach allowed us to incorporate and measure uncertainty in the edge weights, which may be susceptible to variable sampling effort or measurement errors in observing each individual within the dyad, or interactions between particular dyads (Hart et al., 2021).
For each group and each time window, we fitted a mixed-effect binomial model (i.e., 9 models per group). In each model, the response variable was how many focal observations individuals within a dyad groomed one another given the opportunity (i.e., the total number of focal observations each individual was observed in) within a given time window. We included actor and recipient identities as a random effect, with this identity incorporating the direction of the grooming interaction. For example, if individual A was grooming individual B, the dyad identity for that interaction was “A-B”; if B was grooming A, the identity would be “B-A".
From these models (one for each group), we used the posterior draws (n = 3,000; see Model Fitting and Validation below) for the dyad identities to calculate the median edge weights for each dyad in each time window, i.e., the proportion of time individuals within a particular dyad either gave or received grooming given the opportunity to groom one another within a particular time window. Using the median edge weights of the posterior distribution, which are equivalent to the “simple ratio index” (Hart et al., 2021; Whitehead, 2008), we constructed directed and weighted social networks. Figures S4-S21 in the supplementary materials illustrate each network within each time window (n = 9) for the Blue and Green groups, as well as the uncertainty of edge estimates.
For each individual in each of the networks generated from the posterior draws, we calculated two metrics: (a) out-degree, the number of edges connected to the node (scaled between 0 and 1 to account for differences in group sizes), i.e. the number of partners to which the individual directed grooming; and (b) out-strength, the sum of all edge weights connected to the node, i.e. the total grooming rate per individual to all partners (Farine & Whitehead, 2015). We chose these particular metrics as they have been related to fitness outcomes in this species and population previously (Campbell, Tkaczynski, Lehmann, et al., 2018; Lehmann et al., 2016). As these metrics were derived from the median posterior edge weights between each node, they represent the most probable out-degree or out-strength value for each individual based on sampling effort and potential measurement error for each individual in the study (Hart et al., 2021).
Social networks were constructed and illustrated, as well metrics calculated, using the igraph R package (Csardi & Nepusz, 2006).
Social connectivity models
For each time window of each group, we calculated the mean anthropogenic presence index, mean daily proportion of high-quality food in the diet, and mean daily minimum temperature (°C) during this observation period. Additionally, each network of each group was demarked as occurring during either in a mating or non-mating season.
During time window 4, the Blue group transitioned from a mating season to a non-mating season (55.85% of observations occurred during the non-mating period), therefore, we did not include the metrics derived from this network in the analysis.
For both social connectivity models ((a) out-degree, (b) out-strength), individuals social network metrics (out-degree or out-strength) were the dependent variables and our main predictors were the interactions between Excitability/Boldness score and mean anthropogenic presence index, mean daily proportion of high-quality food in the diet, and mean daily minimum temperature (°C), and seasonality (mating or non-mating) during each time window. The interactions were fit to measure plasticity (rate of change) in social connectivity given the socioecological gradients for different behavioral syndrome phenotypes. We included the mean rank (Elo score) of each individual for each network as a control variable. The random effect of individual identity was included in the model, as well as random slopes for each of our socioecological predictors within individual identity.
Both the degree (model 2) and strength (model 3) models were fitted with Beta error structure and a logit link function.
Social network metrics are inherently non-independent variables and, therefore, do not meet the assumptions of responses in most frequentist statistical approaches, especially using null hypothesis testing (Croft et al., 2011; Franks et al., 2021). Using Bayesian inference rather than null hypothesis testing, the probability of a value occurring is determined not by its observed frequency alone, and we can calculate the uncertainty about any such probability (Brent et al., 2017).
Model fitting and validation
For our analyses we employed a Bayesian approach with all models fitted and estimated using Hamiltonian Monte Carlo methods and Stan software (Stan Development Team, 2021) with the brms package (Bürkner, 2018) within R.
Prior to model fitting, any potential issues of covariation or collinearity between our fixed effects were inspected via pairwise plots, pairwise correlations, and variance inflation factors (VIFs). Pairwise plots and correlation coefficients were generated using the ‘covees’ function of the GGally package (Schloerke et al., 2020); VIFs were generated using the ‘VIF’ function of the car package in R (Fox & Weisberg, 2011). In the final presented models, there were no issues with collinearity or covariance among the variables (maximum VIF < 4.5; maximum rho < 0.7; Table S22 and figures S23-24 in the supplementary materials). In the social connectivity models, mean anthropogenic presence was collinear with group identity (VIF > 10.0); therefore, we included only the mean anthropogenic presence variable in those models.
For all models, numeric variables were standardized as z-scores. We fit models with weakly regularizing priors for the fixed effects (β ~ Normal(0,1)). For the priors for the components of the random effects, we used the default priors provided by the get_prior function of brms, namely a weakly regularizing half student-t prior (df = 3, scale parameter = 10) for the random intercepts, and a uniform LKJ Cholesky prior (η = 1) for covariance matrices of the random slopes.
For all models, we specified three chains of 5000 iterations, 2000 of which were devoted to the warm-up. Sampling diagnostics (Rhat < 1.01) and trace plots confirmed chain convergence for all models. Effective sample sizes confirmed no issues with autocorrelation of sampling for all models (Tables S25-27 list all effective samples sizes and Rhat values for the main models presented).
To interpret the strength and uncertainty of the associations between our predictor variables and outcomes, we report the model estimate, 90% credible intervals and the proportion of posterior (p + or p−) supporting the direction (positive or negative) of the model estimate of the associations (Martin et al., 2020; McElreath, 2020; McShane et al., 2019). We considered effects whose direction was supported by more than 95% of the posterior distribution to be well supported, and those whose direction was supported by more than 90% of the posterior distribution to be weakly supported.
All models were validated using posterior predictive checks (see figures S28-32 in the supplementary materials for plots of these checks).