Study participants
The participants of this secondary analysis were drawn from two study populations: Jarnig et al. 2021 (17) (study population 1 (SP1); N=821) and Jarnig et al. 2025 (18) (study population 2 (SP2); N=354). Both studies were approved by the Research Ethics Committee of the University of Graz, Styria, Austria. For all participants included in the analysis, active consent to participate in the study was given by their legal guardians or, in the case of adolescents older than 14 years, by the adolescents themselves, and information on age, sex, and type of school was available (see Figure 1).
Analyses were performed with data from the primary school children who participated in SP1. SP2 data from 178 secondary-school and 176 high-school children were used to test the generalizability of the findings to older children.
Procedure
Measurements of anthropometric parameters and fitness tasks were carried out by trained members of the research team and took place at schools during physical education classes. All tests were carried out in sports halls with participants wearing standard sports clothing but no shoes, except for the 6-minute run (6MR), which was performed in sneakers.
Anthropometrics
Body height (cm) was measured to the nearest 0.1 cm using a stadiometer (SECA 213, Hamburg, Germany), and body weight (kg) was measured to the nearest 0.1 kg using an electronic scale (BOSCH PPW4202/01, Nuremberg, Germany). BMI was calculated by dividing body weight (kg) by the square of body height (m).
Fitness assessments in SP1
Cardiorespiratory endurance was assessed with the six-minute run (6MR), lower body strength with the standing long jump (SLJ), upper body strength with the standing 1-kg medicine ball chest throw (MBTSCT-1kg), body coordination with the jumping sideways test (JS), and action speed with the 4×10-meter shuttle run (4×10SHR). Action speed (m/s) was scored by dividing the total distance run (i.e., 40 meters) by the 4×10SHR running time in seconds. All fitness tests were performed according to the methodology described by Jarnig et al. in 2021 (17).
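The action speed score described above can be illustrated with a minimal Python sketch. The function name and the example running time are hypothetical and serve only to show the arithmetic; they are not taken from the study data.

```python
# Total distance of the 4x10-meter shuttle run, as stated in the text.
SHUTTLE_DISTANCE_M = 40.0


def action_speed(running_time_s: float) -> float:
    """Return the mean speed (m/s) over the 4x10 m shuttle run."""
    if running_time_s <= 0:
        raise ValueError("running time must be positive")
    return SHUTTLE_DISTANCE_M / running_time_s


# Hypothetical example: a child who needs 12.5 s for the 40 m.
print(action_speed(12.5))  # 3.2
```

Scoring speed rather than time means that higher values indicate better performance, in line with the other fitness tasks.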
Fitness assessment and age groups in SP2
In SP2, all participants, from secondary school children (fifth to eighth grade) to adolescents in high school (ninth to twelfth grade; maximum age: 18 years), completed the medicine ball overhead throw (MBTSOT-2kg). Children and adolescents stood at the starting line, using both hands to hold a 2 kg medicine ball behind their heads. They then used both hands to throw the ball over their heads as far forward as possible. The shortest distance between the starting line and the point where the ball hit the ground was measured to the nearest centimeter using a tape measure. The participants had two attempts to throw the ball, and the longer throw was counted.
All children in SP2 also completed the SLJ according to SP1 instructions. In addition, all children attending secondary school performed a push-up test (PU) based on instructions of the German motor test (19). In the starting position, participants lay on their stomachs on the floor and touched one hand with the other (above the glutes near the spine). To perform a push-up correctly, participants had to place their hands next to their shoulders and push their bodies into a fully extended push-up position. In this position, they had to lift one hand off the floor and touch the back of the other hand. Then, they had to place their hand back on the floor and return to the starting position in a controlled manner. Within 40 seconds, participants had to complete as many push-ups as possible, and the number of correctly performed push-ups was included in the analysis.
Standardization and classification
Anthropometrics
Standard deviation scores for BMI (BMI_IOTF SDS) were calculated with the LMS method (21), using the international age- and sex-specific reference values from the International Obesity Taskforce (IOTF) (20). Age- and sex-specific IOTF BMI cut-off values (20) for thinness, normal weight, overweight, obesity, and morbid obesity were used to categorize raw BMI values into a 5-level weight classification.
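For orientation, the LMS transformation can be sketched in a few lines of Python. The formula is the standard LMS z-score; the function name and the L, M, and S values in the example are purely illustrative assumptions and are not IOTF reference values, which have to be looked up by age and sex in the published tables.

```python
import math


def lms_z_score(value: float, L: float, M: float, S: float) -> float:
    """Standard LMS z-score.

    L is the Box-Cox power (skewness), M the median, and S the coefficient
    of variation of the reference distribution for a given age and sex.
    """
    if value <= 0 or M <= 0 or S <= 0:
        raise ValueError("value, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting case L -> 0: log-normal reference distribution.
        return math.log(value / M) / S
    return ((value / M) ** L - 1.0) / (L * S)


# Illustrative (hypothetical) parameters, NOT taken from the IOTF tables.
print(lms_z_score(18.0, L=1.0, M=16.0, S=0.125))  # 1.0
```

A value equal to the reference median M always yields a z-score of 0, regardless of L and S.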
Fitness tests: 6MR, 4×10SHR, JS, SLJ, PU
Standard deviation scores (SDS; z-scores) were calculated based on age- and gender-specific reference values to compare results (raw scores) of the fitness tests with established reference values. Since no national reference values were available for these age groups, international reference values were used. The most recent German percentile tables from the Düsseldorf model (DüMo) (22) (collected 2011–2018) were used for 6MR and SLJ, and the Macedonian standard values from the 2018 “Macedonian fitness meter” (MAKFIT) (23) for the 4×10SHR. Z-scores were derived with the LMS method (21) based on the German (DüMo) and Macedonian (MAKFIT) reference tables. Z-scores for JS and PU were computed based on the German Motor Test (2016 norms (19)) using the usual z-score standardization (24).
Tests of upper body strength: MBTSCT-1kg and MBTSOT-2kg
Because no comparable reference values existed for the MBT_A variants, and to ensure objective comparability, age- and gender-adjusted z-scores for MBT_T and the alternative assessments (MBT_A1, MBT_A2, and MBT_A3) were calculated from the raw values of our own study population using traditional z-score standardization (24).
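One way to picture a sample-based standardization of this kind is the following Python sketch. It rests on assumptions the text does not spell out: that adjustment is done by standardizing within groups defined by whole-year age and sex, and that the sample standard deviation (n − 1) is used. The field names (`age_years`, `sex`, `mbt`) and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def groupwise_z_scores(records, value_key="mbt", group_keys=("age_years", "sex")):
    """Return one z-score per record, standardized within its age/sex group.

    Records in groups with fewer than two members or zero spread get None,
    because no standard deviation can be estimated for them.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in group_keys)].append(rec[value_key])

    # Group mean and sample SD (assumption: n - 1 denominator).
    stats = {g: (mean(v), stdev(v)) for g, v in groups.items() if len(v) >= 2}

    z_scores = []
    for rec in records:
        g = tuple(rec[k] for k in group_keys)
        if g not in stats or stats[g][1] == 0:
            z_scores.append(None)
        else:
            m, sd = stats[g]
            z_scores.append((rec[value_key] - m) / sd)
    return z_scores


# Hypothetical throwing distances (cm) for three 8-year-old girls.
sample = [
    {"age_years": 8, "sex": "f", "mbt": 300},
    {"age_years": 8, "sex": "f", "mbt": 400},
    {"age_years": 8, "sex": "f", "mbt": 500},
]
print(groupwise_z_scores(sample))
```

The same function could be applied to MBT_T and to each MBT_A index by changing `value_key`, so that all four measures end up on a common scale.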
Anthropometrically enriched measures of MBT (MBT_A)
Traditionally, MBT_T performance is based on the distance thrown with a medicine ball, usually adjusted for gender and age (MBT_Tz). These raw scores or z-scores ignore that body height (HB) and body mass (MB) have been shown (16, 25) to significantly influence MBT_T. The alternative MBT_A assessments proposed here attempt to correct for this by taking HB and MB into account. We designed three indices (MBT_A1, MBT_A2, and MBT_A3) and compared them with the traditional assessment MBT_T.
MBT_A1. MBT_A1 normalizes the throwing distance by HB and then by MB or, equivalently, by the product of HB and MB:
- MBT_A1 = MBT_T / HB / MB = MBT_T / (HB × MB)
Because the length units (cm) cancel in the fraction, the unit of MBT_A1 is 1/kg (or kg⁻¹).
MBT_A2. MBT_A2 normalizes the throwing distance by the ratio of MB to HB, which is equivalent to multiplying it by HMR = HB / MB:
- MBT_A2 = MBT_T / (MB / HB) = MBT_T × (HB / MB) = MBT_T × HMR
The unit of MBT_A2 is square centimeters per kilogram: cm²/kg (or cm²·kg⁻¹).
MBT_A3. MBT_A3 is a variation of MBT_A2 that relates the throwing distance to BMI (throwing distance in m per unit of BMI):
- MBT_A3 = MBT_T / (MB / HB²) = MBT_T × (HB² / MB) = MBT_T / BMI
The unit of MBT_A3 is cubic meters per kilogram: m³/kg (or m³·kg⁻¹).
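The three indices can be computed side by side as in the following minimal Python sketch. The function name, argument names, and example values are hypothetical; the unit handling (cm for MBT_A1 and MBT_A2, m for MBT_A3) follows the units stated above.

```python
def mbt_indices(mbt_cm: float, height_cm: float, mass_kg: float) -> dict:
    """Compute the three anthropometrically enriched MBT indices."""
    if min(mbt_cm, height_cm, mass_kg) <= 0:
        raise ValueError("all inputs must be positive")

    # MBT_A1: distance divided by height and by mass; cm cancels -> 1/kg.
    a1 = mbt_cm / (height_cm * mass_kg)

    # MBT_A2: distance times the height-to-mass ratio -> cm^2/kg.
    a2 = mbt_cm * (height_cm / mass_kg)

    # MBT_A3: distance (m) divided by BMI (kg/m^2) -> m^3/kg.
    height_m = height_cm / 100.0
    bmi = mass_kg / height_m**2
    a3 = (mbt_cm / 100.0) / bmi

    return {"MBT_A1": a1, "MBT_A2": a2, "MBT_A3": a3, "BMI": bmi}


# Hypothetical child: 400 cm throw, 140 cm tall, 35 kg.
for name, value in mbt_indices(400.0, 140.0, 35.0).items():
    print(f"{name}: {value:.4f}")
```

Because the three indices live on very different numerical scales, they are only comparable with each other and with MBT_T after standardization.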
Statistical analysis
Descriptives
Continuous variables are reported as means (M) and standard deviations (SD), and categorical variables as absolute frequencies (n) and percentages (%). No data imputation was performed.
Statistical inference with linear mixed models (LMMs)
With a first set of LMMs, we tested the alignment of the three anthropometrically enriched MBT_A variants (MBT_A1, MBT_A2, and MBT_A3) with the four other PF tasks and the traditional MBT_T. For statistical inference, we specified four LMMs, each including the four PF tasks plus either MBT_T or one of the three MBT_A measures as a five-level repeated-measures factor task (i.e., as five dependent variables). Fixed effects comprised quadratic BMI, sex, quadratic age, grade, and region. Child and school were included as random factors with variance components (VCs) and correlation parameters (CPs) for the five repeated measures. Model comparison considered LMMs with more complex and simpler fixed-effect structures. Selection was based on likelihood ratio tests (LRTs) and the Akaike Information Criterion (AIC); a more complex model was considered a significant improvement only if it lowered the AIC by more than five units (26).
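The AIC part of this selection rule can be sketched in Python as follows. This is an illustration only, not the analysis code: the function names, log-likelihoods, and parameter counts are hypothetical, and the sketch reports the LRT statistic without a p-value because the text does not specify how the LRT and AIC criteria were combined.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def compare_nested_models(ll_simple, k_simple, ll_complex, k_complex,
                          min_aic_drop=5.0):
    """Compare a simpler and a more complex nested model.

    Returns the LRT chi-square statistic with its degrees of freedom and
    whether the complex model lowers the AIC by more than `min_aic_drop`.
    """
    lrt_chi2 = 2.0 * (ll_complex - ll_simple)
    df = k_complex - k_simple
    aic_drop = aic(ll_simple, k_simple) - aic(ll_complex, k_complex)
    return {
        "lrt_chi2": lrt_chi2,
        "df": df,
        "aic_drop": aic_drop,
        "prefer_complex": aic_drop > min_aic_drop,
    }


# Hypothetical fits: two extra parameters raise the log-likelihood by 10.
print(compare_nested_models(-1000.0, 10, -990.0, 12))
# Hypothetical fits: two extra parameters raise it by only 3.
print(compare_nested_models(-1000.0, 10, -997.0, 12))
```

In the first case the AIC drops by 16 units and the complex model is retained; in the second it drops by only 2 units, so the simpler model is kept.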
The second set of LMMs was a modification of the first set. Specifically, rather than including one of the four MBTs as a level of the task factor (i.e., as a dependent variable), we used it as a predictor of the four other fitness tasks (i.e., as an independent variable). This afforded a comparison of the efficiency of traditional and alternative MBTs in the prediction of standard PF tasks. The other fixed effects and the random-effect structure were the same as in the first set of LMMs. Model comparison and selection followed the procedure described for the first set of LMMs and led to the same fixed-effect and random-effect structures for both sets.
The same analysis strategy was employed for SP2, but with only the standing long jump and push-up as comparative physical fitness tasks. As the children were all from the same school, the fixed effect of region and the random factor school were not included in the models.
Software
For data analyses and graphics, we mainly used the tidyverse (version 2.0.3) (27) and easystats (version 0.7.5) (28) packages in the R language (version 4.5.2) (29). LMMs were estimated with MixedModels.jl (version 5.1.0) (30) in the Julia programming language (version 1.12.2) (31).