Study species and housing
We conducted this experiment daily between May and August 2024. Bees came from 5 colonies (3 colonies and 2 pocket hives) purchased from commercial suppliers (BIOBEST from Biobest Belgium N.V., Westerlo, Belgium, and Koppert, UK). Two colonies were housed in a natural environment at Toyota Manufacturing Deeside and were later brought to the lab for the study, when few individuals were left in the colony. The other colony and the pocket hives were housed in the lab. Bees were kept in a black box that mimicked their underground nest. The nest was attached to a transparent circular tunnel (diameter 4 cm), which had three shutters and an open end where bees were captured with a plunger for the experiment. The lab (room temperature 21–23 °C) was lit with natural daylight and artificial lights. The bees had ad libitum access to syrup and pollen during the exploration task but received reduced amounts during the learning task to increase their motivation to participate in the experiment. Bees (N = 69) were individually marked with a numbered tag (Toppa et al. 2021).
Ethics
While there is no general animal welfare and husbandry guideline for invertebrates, this study strictly followed ASAB and ABS animal welfare guidelines and was approved by the Division of Psychology Ethics Committees, University of Chester (no: SDPC130624). Animal welfare and husbandry were carried out daily, whereby bees were fed syrup and pollen. The bees had ad libitum access to syrup and pollen before the exploration task, but we decreased the amount of syrup and pollen given to them during the learning task to increase their motivation. Enrichments (e.g., paper sticks and tissue tiles) were provided for the bees.
Apparatus
The experiment used two apparatuses that shared similar features (Fig. 1). Both apparatuses were rectangular boxes divided into 10 equal compartments. For the exploration apparatus (Height x Width x Length: 3.3 cm x 21.8 cm x 50.9 cm, Fig. 1A), the floor of each compartment (W x H: each 4.5 x 1.2 cm) was covered with a random mosaic pattern of white and red squares (1 cm x 1 cm). Each compartment had a hole in the middle of the wall separating it from the next one. A shutter was used to block this entrance between compartments, and another shutter was placed at one end of the apparatus. This allowed the experimenters (PC and SD) to lift a shutter and let a bee enter or exit the apparatus, or move to the next compartment through the holes.
The learning apparatus was used to conduct the discrimination-reversal learning task (H x W x L: 3.1 x 15.6 x 53.6 cm, Fig. 1B). Each compartment had a pair of colour stimuli (each 4 cm x 4 cm). The stimuli were yellow (Perspex® Yellow 250) and blue (Perspex® Blue 727); these two colours are preferred by bees, and bees can clearly distinguish them from each other (e.g., Raine et al. 2006; Raine and Chittka 2008, 2012; Ings et al. 2009; Strang and Sherry 2014; Evans et al. 2017). The stimuli were horizontally positioned, one on the left side and one on the right side of the compartment, equidistant from the entrance. The design of the box and the horizontal presentation of the stimuli aimed to enhance visual information processing during the task (Rother et al. 2023). In addition to the boxes, a video camera attached to a tripod was used to record the behaviour and performance of the bees. The tripod was placed around 30 cm away from the apparatus, and the video camera was 1.5 m above the box.
Procedure
All bees went through the same standardised procedure; each bee participated alone in every task, daily, during its active time. When a bee emerged from its colony and reached the end of the tunnel connected to its nest, s/he was brought to the task with a plunger. Bees first participated in the activity task, followed by the discrimination learning phase (DLP) and finally a reversal learning phase (RLP). Testing of a bee was discontinued if s/he did not emerge from the colony on a test day, due to a loss of motivation or death, which led to a decrease in the sample size across tasks; this procedure controlled for or avoided confounding variables (e.g., fluctuation in motivation, memory decay during the learning task).
Sixty-nine bees (33 females, 36 males) left their nest for the first time to move freely in a novel environment (‘the activity box’). This task was designed to measure bees’ active time in a novel environment when they first experienced it. The task started when a bee’s full body entered the box and ended when the bee’s full body exited the box. During the task, no shutter was used; the bee visited any compartment as s/he wished. If a bee was inactive for 15 minutes in the box, the session was terminated, and the bee was brought home using the same plunger. A bee could reattempt a session if s/he emerged from the colony 40 minutes or more after the last session. If the bee completed the session (i.e., left the last compartment), s/he was fed with 50% (w/w) sucrose ad libitum (a food source preferred by both sexes) (Bailes et al. 2018; Pamminger et al. 2019; Brown and Brown 2020); this aimed to assess whether the bee was responsive to sucrose, and the amount drunk was measured.
The bees that emerged from the colony after the activity task had an additional ‘training phase’. This training phase included two sessions that served two purposes: 1) to allow the bees to familiarise themselves with the procedure of the learning task, and 2) to ensure the bees had high motivation for the learning task. In both training sessions, a shutter was added to block the hole so that the bees had to explore both the right and left sides of the box. The shutter between a compartment and the next one was lifted every 30 seconds, and a bee had to go through all 10 compartments to be considered as having passed the training phase. After each training session, the bee was rewarded with 50% (w/w) sucrose ad libitum (Bailes et al. 2018; Pamminger et al. 2019; Brown and Brown 2020) when s/he left the last compartment. The box was thoroughly cleaned using water and alcohol wipes before the next bee was trained, with the following steps: 1) we used tissue papers to absorb bees’ dejections and remaining solutions on the flowers; 2) we used alcohol wipes to clean each compartment and all sides of the box as well as the flowers; 3) we used hot water to clean the compartments and the flowers; and finally, 4) we used tissue papers to dry all compartments and flowers and reset the box for the next bee. These cleaning steps aimed to remove any scent left by the previous bee.
The bees that had passed the training phase went to the learning task (full completion as an indicator of motivation). The learning task was used to measure bees’ colour-reward associative learning ability in the DLP and behavioural flexibility in the RLP. A bee participated in 1–2 sessions daily (as a motivation measure and to avoid over-feeding). Each session included 10 flower pairs (i.e., 1 pair per compartment; bees made 10 choices in total per session). Each flower pair consisted of one blue and one yellow flower. Each bee was randomly assigned to associate one flower colour with a reward (0.01 ml of 50% (w/w) sucrose) and the other flower colour with a control (0.01 ml of water). The drop was placed at the centre of each flower. We decreased the size of the drops to < 1 mm in diameter, which required the bees to walk close to the stimuli and use their antennae to detect whether the drop was water or sucrose. The presentation of the flower pairs was pseudo-randomised within and across sessions. Within sessions, the same flower colour was presented on the same side for no more than two consecutive compartments to prevent bees from developing a side bias. The same flower colour was shown on the left side five times (out of 10) and on the right side the other five times. Across sessions, the same colour pair was not presented in the same compartment more than two consecutive times. All the bees experienced the same colour sequence to control for order effects.
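The within-session constraints described above (five presentations per side, and no more than two consecutive compartments with a colour on the same side) can be illustrated with a short sketch. This is not the procedure used in the study, only one possible way to generate such a sequence; the function names, the rejection-sampling approach, and the seed are assumptions made for illustration, and the across-session constraint is not implemented here.

```python
import random


def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = current = 1
    for prev, nxt in zip(seq, seq[1:]):
        current = current + 1 if prev == nxt else 1
        longest = max(longest, current)
    return longest


def side_sequence(n_compartments=10, max_run=2, seed=None):
    """Side ('L' or 'R') of one colour in each compartment.

    The other colour takes the opposite side, so constraining one colour
    constrains both. Hypothetical helper: shuffles a balanced list until
    no side repeats more than `max_run` times in a row.
    """
    rng = random.Random(seed)
    half = n_compartments // 2
    sides = ["L"] * half + ["R"] * half
    while True:
        rng.shuffle(sides)
        if longest_run(sides) <= max_run:
            return list(sides)


# Example: one candidate sequence for a 10-compartment session
print(side_sequence(seed=1))
```

Because all bees experienced the same colour sequence, a sequence like this would be generated once per session and then fixed, rather than drawn afresh for each bee.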
Bees could explore both flowers, and their first choice (indicated by at least half of the bee’s body being on a flower) was marked as either ‘correct’ (reward) or ‘incorrect/error’ (control). When a bee chose the rewarded flower, the shutter in the middle of the divider was lifted, and the bee passed to the next compartment. When the bee made an error, s/he had to visit the other flower (i.e., correctional choice) before going to the next compartment. Choices were recorded both by direct observation and by the video camera. The criterion for completing a learning phase was that a bee had participated in the task daily and had made ≥ 8 out of 10 rewarded first choices (i.e., ≥ 80% correct) in two consecutive sessions. The bee was brought back home when s/he completed a session and went to a second session if s/he emerged from the colony 40 minutes or more after the last session. After each session, the box was cleaned using the four steps described above for the training phase.
Bees went to the RLP the day after they had completed the DLP. The RLP had the same protocol and learning criterion as the DLP. In the RLP, the bees had to unlearn the previous colour-reward association (e.g., B+ Y−) and learn that the previously unrewarded colour was now rewarded (e.g., B− Y+) until they reached the same learning criterion set in the DLP (80% of choices on the rewarded colour).
Behavioural Measurement
We analysed the active time in the compartments when a bee first experienced the novel apparatus (the situation that best reflects males leaving their nest for the first time). Active time in the compartments was measured in seconds; recording started when a bee’s full body entered the apparatus (the first compartment) and ended when its full body left the apparatus (the last compartment). From these recordings, we obtained the active time of each bee in each compartment, the total active time across compartments (i.e., the sum of active time in each compartment), the frequency of visits to each compartment, and the mean active time in each compartment (i.e., total active time in each compartment divided by the number of visits to that compartment).
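The derived measures listed above follow directly from a log of visits. The sketch below assumes a hypothetical record format, a list of (compartment number, seconds) pairs with one entry per visit, and is meant only to make the definitions concrete, not to reproduce the scoring software used in the study.

```python
from collections import defaultdict


def summarise_visits(visits):
    """Summarise one bee's visits.

    `visits` is an assumed format: (compartment, seconds) per visit.
    Returns per-compartment totals, visit counts and mean time per visit,
    plus the total active time across all compartments.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for compartment, seconds in visits:
        total[compartment] += seconds
        count[compartment] += 1
    per_compartment = {
        c: {
            "total_s": total[c],             # total active time in compartment
            "visits": count[c],              # frequency of visits
            "mean_s": total[c] / count[c],   # mean active time per visit
        }
        for c in total
    }
    overall_s = sum(total.values())          # total active time across compartments
    return per_compartment, overall_s


# Example with made-up numbers: compartment 1 visited twice, compartment 2 once
per_compartment, overall_s = summarise_visits([(1, 10.0), (2, 5.0), (1, 20.0)])
print(per_compartment[1], overall_s)
```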
Associative learning and behavioural flexibility performances were measured as the number of errors made before reaching the learning criterion in the DLP and RLP, respectively (e.g., Tapp et al. 2003; Izquierdo and Jentsch 2012; Wascher et al. 2021). We also adjusted the learning criterion to include bees that had not returned to the task but showed significant learning, defined as a bee making ≥ 80% correct responses in a single session with at least 5 consecutive correct responses in that session.
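The two criteria can be expressed as simple checks on a bee's session records. In the sketch below, each session is assumed to be a list of booleans (True for a correct first choice), and errors are counted over all sessions up to and including the one in which the criterion is met; that counting rule, like the function names, is an assumption for illustration, since the text does not state whether errors in the criterion sessions are included.

```python
def longest_correct_run(session):
    """Longest run of consecutive correct first choices in one session."""
    best = run = 0
    for correct in session:
        run = run + 1 if correct else 0
        best = max(best, run)
    return best


def meets_adjusted_criterion(session, threshold=8, min_run=5):
    """Adjusted criterion: >= 8/10 correct in a single session,
    with at least 5 consecutive correct responses in that session."""
    return sum(session) >= threshold and longest_correct_run(session) >= min_run


def errors_to_criterion(sessions, threshold=8, needed=2):
    """Errors made until >= `threshold` correct in `needed` consecutive sessions.

    Assumption: errors in the criterion sessions themselves are counted.
    Returns None if the criterion is never reached.
    """
    streak = 0
    errors = 0
    for session in sessions:
        errors += len(session) - sum(session)
        streak = streak + 1 if sum(session) >= threshold else 0
        if streak >= needed:
            return errors
    return None


# Example with made-up data: 5/10, then 8/10, then 9/10 correct
sessions = [[True] * 5 + [False] * 5, [True] * 8 + [False] * 2, [True] * 9 + [False]]
print(errors_to_criterion(sessions))
```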
We additionally measured the size of each bee, which was estimated using the inter-tegular span (ITS) (Cane 1987; Hagen and Dupont 2013). We used a digital calliper to obtain the distance (mm) between the tegulae. Sucrose consumption was measured as the amount (ml) consumed after the activity task, using a syringe with a 0.02 ml graduation.
Statistical analysis
All analyses were conducted using R (version 4.4.1), and the significance level was set at two-tailed p ≤ 0.05. Data were analysed using Generalised Linear Models and Generalised Linear Mixed Models (GLMM) from the ‘glmmTMB’ package (Magnusson et al. 2017), pairwise contrasts with Tukey correction from the ‘emmeans’ package (Lenth et al.), a binomial test and individual analysis (Sokal and Rohlf 1995). All GLMMs included bee ID as a random variable to maximise model convergence. Model fits were checked with the package ‘DHARMa’ (Hartig 2025). Multicollinearity was checked after running each model using the Variance Inflation Factor (VIF) (< 5) and tolerance (0.25). These packages generated estimates, standard errors, z and p values for the results.
Activity task. A GLMM with gamma log link distribution was used to model positively skewed, non-negative, continuous variables (e.g., active time) (Ng and Cribbie 2017). The main between-group analyses examined: 1) whether sex predicted the total active time in all the compartments (Table 1a); 2) whether sex, compartment number, and their interaction influenced the total active time in each compartment (Table 1b); 3) whether sex, compartment number, and their interaction influenced the mean active time in each compartment (Table 1c). Additional analyses for the activity task included: 1) a GLMM gamma log link distribution test on sex differences in body size (Table 1e); 2) a Poisson log link distribution test to examine the effects of sex, compartment number and their interaction on the frequency of visits to each compartment (Table 1d), with post hoc analyses using pairwise contrasts with Tukey corrections to compare sex differences in the frequency of visits to each compartment (Table 2); 3) a GLMM gamma log link distribution test to analyse sex differences in drinking amount (Table 1f); 4) within-sex analyses using a GLMM to examine the effects of body size (ITS) and the total active time in all compartments on sucrose consumption (Table 1g-h).
Discrimination-reversal learning task. For each learning phase, two models were run to examine learning performance using GLMM Poisson log link distribution. In the first model, we only included bees that had met the learning criterion (80% correct responses for two consecutive sessions). This model included a fixed factor, sex, and the response variable was the number of errors made before reaching the learning criterion (Table 3a, 5b). We then included bees that had met the adjusted learning criterion (i.e., met the 80% learning criterion with 5 or more consecutive correct responses in a session) and reran the model (Table 3b).
Between-group analyses of each learning phase. In the DLP, we conducted between-group analyses using pairwise contrasts with Tukey corrections for multiple comparisons to examine whether females and males who had and had not completed the learning phase differed in body size (inter-tegular span) (Table 4a), sucrose consumption (Table 4b), and total active time in the activity task (Table 4c). For RLP, all bees met the (adjusted) learning criterion (i.e., completed the task), and thus, between-group analyses reflected sex differences. We ran GLMM with gamma log link distribution to examine body size in one model (Table 5c), sucrose consumption (Table 5d) in another model, and total active time in all compartments in the final model (Table 5e).
Within-sex analyses of each learning phase. For bees that had completed each learning phase, we carried out within-sex analyses to examine predictors for learning performance. We used GLMM Poisson log link distribution to examine whether their body size (inter-tegular span), sucrose consumption, and total active time in all compartments predicted their DLP learning performance in one model (Table 3c-d) and RLP performance (behavioural flexibility) in another model (Table 5f-g).
Additional analysis. For bees that had met the unadjusted DLP learning criterion (n = 15), we further used GLMM Poisson log link distribution to examine whether their DLP performance was affected by their initial choice (Y/B) in one model (Table S4e), and by their assigned initial reward colour (Y/B) in another model (Table S4f). A model with a binomial distribution was used to examine whether inter-tegular span (ITS), active time in the activity task, and sucrose consumption predicted whether a bee completed the DLP or not (Table S4g).
Performance differences between learning phases. A GLMM Poisson log link distribution was used to compare the number of errors in the last session of the DLP and the first session of the RLP (Table 5a), and the total number of errors made before reaching the learning criterion of the DLP and RLP (Table 5b). A binomial test was used to examine whether bees’ first choices of the previously rewarded colour were significantly above chance. Individual analyses were also conducted to understand whether bees had learned the DLP colour-reward association.
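As an illustration of the binomial test against chance, the sketch below computes an exact two-sided p value for k choices of the previously rewarded colour out of n, with a chance level of 0.5. It is a plain-Python approximation of the usual exact test (summing the probabilities of all outcomes no more likely than the observed one), not the R call used in the analyses, and the counts in the example are made up.

```python
from math import comb


def binom_test_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k under the null probability p. The small relative
    tolerance guards against floating-point ties.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-7)))


# Example with hypothetical counts: 9 of 10 first choices on one colour
print(binom_test_two_sided(9, 10))
```

With a chance level of 0.5 the null distribution is symmetric, so the two-sided p value is twice the one-sided tail probability, which matches the two-tailed significance level stated for the analyses.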