2.1. Study Area
The study was conducted in the Lake Bogoria watershed, situated within the Marigat and Mogotio Sub-counties of Baringo County in north-western Kenya (Fig. 1). The watershed lies within the axial depression of the Gregory Rift Valley, forming an asymmetric half-graben (Renaut & Owen, 2023). The central feature is Lake Bogoria itself, a narrow, saline-alkaline lake located at approximately 36°06′ E and 0°15′ N (De Cort et al., 2018). The lake occupies a tectonic trough bounded by the Lake Bogoria Escarpment to the east and the fault-fragmented, eastward-sloping Kipngatip plateau, composed of phonolite lava, to the west (Obando et al., 2016). The area is drained primarily by the Waseges-Sandai River, the Emsos River, and the Kipsirian River, alongside numerous ephemeral streams (Renaut & Owen, 2023). The lake is a topographically closed system with no surface outflow, and its water balance is supplemented by hot springs and geysers along its shoreline (Renaut et al., 2017; Salano et al., 2018). The climate is semi-arid, characterized by bimodal rainfall: long rains occur from March to May and short rains from October to November, with occasional precipitation during the typically dry June-September period (De Cort et al., 2018). Mean annual temperatures range from 14°C to 35°C (Renaut et al., 2017). The dominant soils are clays, particularly in the eastern and northern parts of the watershed within the Waseges-Sandai River catchment (De Cort et al., 2018). The region is predominantly inhabited by the Tugen-speaking Endorois community, whose livelihoods are primarily agro-pastoral.
(Insert Fig. 1 here: Map of the study area)
2.2. Data Collection and Analysis
Data for this study were obtained from both primary and secondary sources. Primary data were collected through focus group discussions (FGDs) and key informant interviews to gather historical context and local ecological knowledge. Participants were drawn from the Endorois community, the Lake Bogoria Basin Water Resource User Association (WRUA), the Ministry of Water and Irrigation, and the Ministry of Agriculture, Livestock and Fisheries. Secondary data formed the core inputs for the Soil and Water Assessment Tool (SWAT) hydrological model. The model operates on the principle of the water balance equation (Neitsch et al., 2009):
$$SW_t = SW_0 + \sum_{i=1}^{t} \left( R_{day} - Q_{surf} - E_a - w_{seep} - Q_{gw} \right)$$
where $SW_t$ is the final soil water content, $SW_0$ is the initial soil water content, and $t$ is time in days; all other terms are in millimeters. For each day $i$, the equation subtracts every form of water loss, namely surface runoff ($Q_{surf}$), evapotranspiration ($E_a$), seepage to the vadose zone ($w_{seep}$), and return flow ($Q_{gw}$), from that day's precipitation ($R_{day}$) and accumulates the result over the simulation period (Neitsch et al., 2009).
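The bookkeeping behind the water balance can be illustrated with a minimal Python sketch. It is not SWAT's implementation: the `DailyFluxes` container, the function name, and the sample values are hypothetical, and in SWAT each flux is itself computed by a dedicated sub-model rather than supplied as input.

```python
from dataclasses import dataclass


@dataclass
class DailyFluxes:
    """Water fluxes for a single day, all in mm (hypothetical container)."""
    r_day: float   # precipitation
    q_surf: float  # surface runoff
    e_a: float     # evapotranspiration
    w_seep: float  # seepage to the vadose zone
    q_gw: float    # return flow


def final_soil_water(sw0: float, days: list[DailyFluxes]) -> float:
    """Return SW_t: the initial storage plus the sum of daily net inputs."""
    sw = sw0
    for d in days:
        # Precipitation minus every loss term for this day.
        sw += d.r_day - d.q_surf - d.e_a - d.w_seep - d.q_gw
    return sw


if __name__ == "__main__":
    # Illustrative numbers only: one wet day followed by one dry day.
    fluxes = [
        DailyFluxes(r_day=10.0, q_surf=2.0, e_a=3.0, w_seep=1.0, q_gw=0.5),
        DailyFluxes(r_day=0.0, q_surf=0.0, e_a=2.5, w_seep=0.5, q_gw=0.5),
    ]
    print(final_soil_water(50.0, fluxes))
```

In this made-up case the gain on the wet day is cancelled by the losses on the dry day, so storage returns to its initial value.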
The study relied on several key spatial datasets. Topographic information was derived from a 30-meter resolution Digital Elevation Model (DEM) from the Shuttle Radar Topography Mission (SRTM), downloaded from USGS Earth Explorer. This DEM, originally in WGS 1984 geographic coordinates, was projected to WGS 1984 UTM Zone 37N for analysis and established the watershed's elevation range from 988 m to 2324 m. For climate inputs, gridded daily precipitation and maximum/minimum temperature data spanning from January 1, 1979, to December 31, 2020, were sourced from the Copernicus Climate Change Service and extracted for three specific grid points within the watershed (Table 1). Due to the absence of station records for other variables, relative humidity, solar radiation, and wind speed were subsequently simulated by the model's internal weather generator using the available precipitation, temperature, and geographical location data.
Table 1
Climate parameters, coordinates and altitude
| Parameter | Name | Lat (°N) | Long (°E) | Elevation (m) |
|---|---|---|---|---|
| Precipitation and temperature (maximum and minimum) | pcp36.07_0.28, tmp36.07_0.28 | 0.28 | 36.07 | 998 |
| Precipitation and temperature (maximum and minimum) | pcp36.12_0.02, tmp36.12_0.02 | 0.02 | 36.12 | 1549 |
| Precipitation and temperature (maximum and minimum) | pcp36.12_0.49, tmp36.12_0.49 | 0.49 | 36.12 | 1436 |
Land Use/Land Cover (LULC) data for the years 1981, 1991, 2001, 2011, and 2020 were generated from Landsat satellite imagery (Landsat 5, 7, and 8 scenes acquired from USGS Earth Explorer) and classified into nine categories: water, grassland, tree cover, shrubland, rain-fed agriculture, irrigated agriculture, bare land, peri-urban areas, and Prosopis juliflora thickets. Finally, a soil map and its associated physical properties were obtained from the FAO soils database on the SWAT website. This dataset included parameters critical for hydrological modeling, namely saturated hydraulic conductivity, bulk density, available water capacity, and soil texture. These parameters were used to define the Hydrologic Soil Group (HSG) for each soil unit; the dominant groups in the watershed are C (slow infiltration) and D (very slow infiltration, primarily clay soils), as detailed in Table 2.
Table 2
FAO soil data for Lake Bogoria watershed
| S. No. | FAO soil code | HSG | Area (km²) |
|---|---|---|---|
| 1 | Re63-2c-248 | C | 11.04 |
| 2 | I-R-74 | D | 81.67 |
| 3 | Jc6-2a-118 | D | 3.54 |
| 4 | Ne12-2c-155 | D | 3.11 |
| 5 | Lf17-2ab-737 | C | 0.64 |
All spatial data processing, including projection, watershed delineation, and map algebra, was performed using ArcGIS 10.5, with the ArcSWAT 2012 interface used for model setup and operation.
2.3. SWAT Model Setup and Simulation
The SWAT model was constructed following a standardized procedure within the ArcSWAT interface. The first step was watershed delineation: the DEM was loaded to define the basin outlet, generate the stream network, and subdivide the watershed into 1705 sub-basins. Three key outlets were defined, corresponding to the Waseges, Kipsirian, and Emsos/Ngiriki rivers (Fig. 2).
(Insert Fig. 2 here: Drainage network within Lake Bogoria watershed)
The next step defined and generated Hydrologic Response Units (HRUs). HRUs are unique combinations of land use, soil type, and slope that allow the model to account for spatial heterogeneity within a sub-basin. The land use and soil maps were reclassified into SWAT model codes using predefined lookup tables. The slope was discretized into three classes: 0–12%, 12–30%, and > 30%. The ‘multiple HRUs’ option was selected with a threshold of 5% for land use, soil, and slope, meaning any combination covering less than 5% of a sub-basin area was excluded to maintain model efficiency without significant loss of information.
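The slope classes and the 5% rule can be sketched as follows. This is a simplified illustration that follows the single-level description above, not ArcSWAT's own routine; ArcSWAT is understood to apply its land use, soil, and slope thresholds hierarchically. The function names, the handling of values exactly on a class boundary, and the sample areas are all assumptions.

```python
def slope_class(slope_pct: float) -> str:
    """Map a slope in percent to one of the three classes used in the study.

    Assumption: a value exactly on a boundary falls in the lower class.
    """
    if slope_pct <= 12:
        return "0-12"
    if slope_pct <= 30:
        return "12-30"
    return ">30"


def apply_threshold(areas_km2: dict, threshold_pct: float = 5.0) -> dict:
    """Drop combinations below the threshold, then rescale the survivors
    so that the retained HRUs still cover the whole sub-basin area."""
    total = sum(areas_km2.values())
    kept = {k: a for k, a in areas_km2.items()
            if 100.0 * a / total >= threshold_pct}
    if not kept:
        # Fallback if every combination is tiny: keep only the largest.
        largest = max(areas_km2, key=areas_km2.get)
        kept = {largest: areas_km2[largest]}
    scale = total / sum(kept.values())
    return {k: a * scale for k, a in kept.items()}


if __name__ == "__main__":
    # Hypothetical sub-basin: (land use, soil, slope class) -> area in km2.
    sub_basin = {
        ("shrubland", "I-R-74", slope_class(8.0)): 6.0,
        ("grassland", "I-R-74", slope_class(25.0)): 3.7,
        ("bare land", "Jc6-2a-118", slope_class(45.0)): 0.3,
    }
    for combo, area in apply_threshold(sub_basin).items():
        print(combo, round(area, 3))
```

Rescaling the retained areas keeps the sub-basin total unchanged, so the water balance is still computed over the full area even though minor combinations are dropped.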
Weather data were then entered. The prepared gridded precipitation and temperature data were loaded into the model by linking the three climate points to their respective sub-basins using the ‘WGEN_user’ option. The model's built-in weather generator was used to synthesize the remaining required climate variables.
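The text does not state the rule used to link climate points to sub-basins. One plausible rule, shown here purely as an assumption, is to give each sub-basin the grid point nearest to its centroid. The coordinates come from Table 1; the centroid in the usage example and the function names are hypothetical.

```python
import math

# Grid points from Table 1: name -> (latitude, longitude) in decimal degrees.
CLIMATE_POINTS = {
    "36.07_0.28": (0.28, 36.07),
    "36.12_0.02": (0.02, 36.12),
    "36.12_0.49": (0.49, 36.12),
}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_point(lat: float, lon: float) -> str:
    """Return the name of the climate grid point closest to (lat, lon)."""
    return min(
        CLIMATE_POINTS,
        key=lambda name: haversine_km(lat, lon, *CLIMATE_POINTS[name]),
    )


if __name__ == "__main__":
    # Hypothetical sub-basin centroid near the lake.
    print(nearest_point(0.25, 36.10))
```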
A separate SWAT model was built and executed for each of the five LULC maps (1981, 1991, 2001, 2011, 2020). Each simulation was run for a three-year period, with the first year serving as a warm-up period to initialize the model's water balance and minimize the influence of arbitrary initial conditions. The subsequent two years were used as the effective simulation period for analysis (Table 3). River discharge was simulated at the outlet of each of the three main rivers.
Table 3
SWAT model simulation periods
| LULC year | Climate period | Warm-up period | Simulation period |
|---|---|---|---|
| 1981 | 1st Jan 1979–31st Dec 1981 | 1st Jan–31st Dec 1979 | 1st Jan 1980–31st Dec 1981 |
| 1991 | 1st Jan 1989–31st Dec 1991 | 1st Jan–31st Dec 1989 | 1st Jan 1990–31st Dec 1991 |
| 2001 | 1st Jan 1999–31st Dec 2001 | 1st Jan–31st Dec 1999 | 1st Jan 2000–31st Dec 2001 |
| 2011 | 1st Jan 2009–31st Dec 2011 | 1st Jan–31st Dec 2009 | 1st Jan 2010–31st Dec 2011 |
| 2020 | 1st Jan 2018–31st Dec 2020 | 1st Jan–31st Dec 2018 | 1st Jan 2019–31st Dec 2020 |
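The windows in Table 3 follow a fixed pattern: each three-year run ends in the LULC map year, and its first year is discarded as warm-up. A short sketch reproduces them; the function name and the returned dictionary keys are hypothetical.

```python
from datetime import date

LULC_YEARS = [1981, 1991, 2001, 2011, 2020]


def simulation_windows(lulc_year: int, warmup_years: int = 1,
                       total_years: int = 3) -> dict:
    """Return (start, end) date pairs for the climate, warm-up,
    and analysis periods of the run that ends in `lulc_year`."""
    first = lulc_year - total_years + 1
    return {
        "climate": (date(first, 1, 1), date(lulc_year, 12, 31)),
        "warmup": (date(first, 1, 1),
                   date(first + warmup_years - 1, 12, 31)),
        "analysis": (date(first + warmup_years, 1, 1),
                     date(lulc_year, 12, 31)),
    }


if __name__ == "__main__":
    for year in LULC_YEARS:
        w = simulation_windows(year)
        print(year, w["warmup"], w["analysis"])
```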
2.4. Model Calibration, Uncertainty, and Performance Evaluation
A significant challenge for this study was the absence of long-term, continuous observed streamflow data within the Lake Bogoria watershed for direct model calibration and validation. To address this, a regionalization approach was adopted. This technique is based on the premise that catchments with similar physical characteristics exhibit similar hydrological behavior, allowing for the transfer of parameters from a gauged (donor) catchment to an ungauged (recipient) one.
The donor catchment selected was the River Malewa watershed, part of the Lake Naivasha basin, whose parameters had previously been calibrated and validated by Muthuwatta (2004). Although not ideal, this catchment was chosen as the best available proxy because it also lies within the Kenyan Rift Valley system. Key physical characteristics of the donor and recipient watersheds are compared in Table 4. Critical calibrated parameters from the Malewa study, including the curve number (CN2), groundwater revap coefficient (GW_REVAP), threshold depth for return flow (GWQMN), and saturated hydraulic conductivity (SOL_K), were transferred to initialize the Lake Bogoria SWAT model (Table 5).
Table 4
Properties of the donor and recipient watersheds
| Watershed | Annual rainfall (mm) | Mean slope (%) | Elevation (m) | HSG | LULC (major) |
|---|---|---|---|---|---|
| River Waseges | 1006 | 12.7 | 988–2324 | C and D | Shrub and grass |
| River Kipsirian | 1006 | 5.78 | 988–1740 | C and D | Shrub and grass |
| River Ngiriki | 1006 | 9.45 | 988–1610 | C | Shrub and grass |
| River Malewa | 1100 | 49.21 | 1887–2800 | D | Agriculture and shrub |
Table 5
Transferred watershed parameters
| Parameter | Value |
|---|---|
| CN2 (runoff curve number) | 72 (shrub/grass/trees); 65 (crop farming) |
| GW_REVAP (groundwater "revap" coefficient) | 0.116 |
| GWQMN (threshold depth of water in the shallow aquifer required for return flow to occur, mm) | 1009 |
| SOL_K (saturated hydraulic conductivity, mm/hr) | 23 (shrub/grass/trees); 37.5 (crop farming) |
To refine these parameters and account for uncertainty, the Sequential Uncertainty Fitting algorithm (SUFI-2) within the SWAT-CUP software was used. SUFI-2 performs inverse modeling through a series of iterations, each involving numerous simulations, to identify the set of parameter values that gives the best fit between simulations and observations while quantifying parameter uncertainty. Given the lack of local streamflow data, this process focused on ensuring the model's internal consistency and realism based on the regionalized parameters and the physical constraints of the watershed.
Model performance for the final simulated discharge was evaluated using two statistical metrics, namely the Coefficient of Determination (R²) and the Nash-Sutcliffe Efficiency (NSE). R² measures the proportion of the variance in the observed data that is explained by the model. Values range from 0 to 1, with higher values indicating a better fit. It is given by the formula:
$$R^2 = \frac{\left[ \sum_{i} \left( Q_{m,i} - \bar{Q}_m \right) \left( Q_{s,i} - \bar{Q}_s \right) \right]^2}{\sum_{i} \left( Q_{m,i} - \bar{Q}_m \right)^2 \sum_{i} \left( Q_{s,i} - \bar{Q}_s \right)^2}$$
where $Q_{m,i}$ is the observed (measured) streamflow on day $i$ (m³/s), $Q_{s,i}$ is the simulated streamflow on day $i$ (m³/s), and bars indicate averages.
The NSE assesses the predictive power of the model relative to the mean of the observations. Values can range from -∞ to 1, where 1 indicates a perfect match. Performance is generally considered satisfactory if NSE > 0.5 and good to very good if NSE > 0.65 (Moriasi et al., 2007). It is described by the formula:
$$NSE = 1 - \frac{\sum_{i} \left( Q_{m,i} - Q_{s,i} \right)^2}{\sum_{i} \left( Q_{m,i} - \bar{Q}_m \right)^2}$$
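For reference, the two performance metrics described in this section (R² and NSE) can be computed with a few lines of standard-library Python. This is a minimal sketch rather than the routine used by SWAT-CUP; the function names and the sample series are hypothetical, and the rating bands simply restate the thresholds quoted above from Moriasi et al. (2007).

```python
from statistics import fmean


def r_squared(obs: list[float], sim: list[float]) -> float:
    """Coefficient of determination between observed and simulated flows."""
    mo, ms = fmean(obs), fmean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def nse(obs: list[float], sim: list[float]) -> float:
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    mo = fmean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den


def rating(nse_value: float) -> str:
    """Qualitative band using the thresholds quoted in the text."""
    if nse_value > 0.65:
        return "good to very good"
    if nse_value > 0.5:
        return "satisfactory"
    return "unsatisfactory"


if __name__ == "__main__":
    # Made-up daily flows (m3/s): the simulation is biased high by 1 m3/s.
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [2.0, 3.0, 4.0, 5.0]
    print(r_squared(observed, simulated), nse(observed, simulated))
```

The example series shows why the two metrics are reported together: a constant bias leaves R² at 1 because the pattern is reproduced perfectly, whereas NSE is penalized by the systematic offset.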