In this study, seismic fragility assessment of RC elevated tanks is augmented by two complementary machine learning algorithms (SVR and MLP), selected for their documented efficacy in nonlinear seismic response prediction. SVR is widely adopted in structural engineering for its ability to model high-dimensional relationships between ground motion parameters and structural demands while maintaining computational efficiency, as demonstrated in hybrid frameworks for liquid-retaining systems (Al-Ayoubi et al., 2025). MLP, conversely, is prioritized for its capacity to capture intricate fluid–structure interactions and staging-system nonlinearities, a capability validated in prior studies on elevated tanks subjected to coupled hydrodynamic and seismic loading (Pourbagheri et al., 2017). The models are trained on a database of 738 response samples generated via elastic-plastic time-history analyses under IDA. Ground-motion intensity measures (Sa, PGV, HI, AI, TP, and SD), together with key structural attributes (tank height H, tank length L, and column width Col), constitute the input feature set.
3.1 Dataset Composition and Feature Engineering
The predictive database comprises 738 distinct IDA sample points, each representing a unique combination of a scaled far-field ground motion record and a tank geometry. Box–Cox transformations were applied to AI and SD to reduce skewness, as these features exhibited non-Gaussian distributions (Box & Cox, 1964); all features were subsequently standardized to zero mean and unit variance to expedite algorithm convergence and mitigate scale-driven bias. Table 4 summarizes the input and output feature definitions, facilitating reproducibility of the modeling exercise.
Table 4
Definitions of ML input features and model parameters
| Input: ground motion parameters | | Input: structural parameters | Output parameter |
|---|---|---|---|
| Sa | HI | L | IDR |
| PGV | TP | H | |
| AI | SD | Col | |
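For readers wishing to reproduce the preprocessing described above, the Box–Cox and standardization pipeline can be sketched in a few lines of Python. The column names, pandas DataFrame interface, and function signature below are illustrative assumptions, not code taken from the study.

```python
# Minimal preprocessing sketch: Box-Cox on the skewed intensity measures,
# then standardization of all features. Box-Cox requires strictly positive
# inputs, which holds for AI and SD as cumulative/spectral quantities.
from scipy.stats import boxcox
from sklearn.preprocessing import StandardScaler

def preprocess(X, skewed_cols=("AI", "SD")):
    """X: pandas DataFrame of the 9 input features (assumed layout)."""
    X = X.copy()
    lambdas = {}
    for col in skewed_cols:
        X[col], lambdas[col] = boxcox(X[col].values)  # fitted lambda per feature
    scaler = StandardScaler().fit(X)                  # zero mean, unit variance
    return scaler.transform(X), scaler, lambdas
```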
Hyperparameter selection critically influences both model accuracy and generalization. Rather than an exhaustive grid search, a Bayesian optimization framework (BayesSearchCV) was employed, which iteratively evaluates and updates a probabilistic surrogate model of the validation loss surface to efficiently identify optimal hyperparameter combinations (Bergstra et al., 2011; Kaveh, 2024). At each iteration, the acquisition function balances exploration of poorly sampled regions against exploitation of promising configurations, reducing the total number of evaluations required to converge on the global optimum. This approach was applied to both SVR and MLP, tuning parameters such as the RBF-kernel coefficient γ and regularization parameter C for SVR, and hidden-layer widths, learning rate, and L₂-regularization strength for MLP. The 5-fold cross-validation process used throughout the hyperparameter search is illustrated in Fig. 5, where data splits are shown along with the corresponding validation-set performance trajectories (Kaveh et al., 2021; Lundberg & Lee, 2017). All computations were performed on a Ryzen 7-5800H CPU with 32 GB RAM and an NVIDIA RTX 3050 Laptop GPU, with MLP training leveraging GPU acceleration.
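A minimal sketch of this BayesSearchCV workflow for the SVR branch follows. The search bounds, evaluation budget (n_iter), and random seed are assumptions, as the study reports only the selected optima.

```python
# Bayesian hyperparameter search for SVR with scikit-optimize's BayesSearchCV.
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from skopt import BayesSearchCV
from skopt.space import Real

# X, y: standardized feature matrix and collapse-IDR targets from Sec. 3.1.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)          # the 80/20 split of Sec. 3.2

search = BayesSearchCV(
    estimator=SVR(kernel="rbf"),
    search_spaces={
        "C": Real(1e-1, 1e3, prior="log-uniform"),         # regularization strength
        "epsilon": Real(1e-3, 1e-1, prior="log-uniform"),  # insensitive-tube width
        "gamma": Real(1e-4, 1e1, prior="log-uniform"),     # RBF kernel coefficient
    },
    n_iter=50,                              # surrogate-model evaluations (assumed)
    cv=5,                                   # the 5-fold scheme of Fig. 5
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
search.fit(X_train, y_train)
print(search.best_params_)                  # e.g. C = 100, epsilon = 0.0183 (Sec. 3.2)
```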
3.2 Structural Response Prediction Using SVR
The SVR model employs a radial basis function (RBF) kernel to nonlinearly project the input feature space, enabling the capture of complex IM–structural response relationships. Optimal hyperparameters determined via BayesSearchCV (C = 100, ε = 0.0183) balance the bias–variance trade-off and constrain margin violations (Tao et al., 2024). The database was randomly partitioned into 80% training and 20% testing subsets; a 5-fold cross-validation scheme was embedded within the training set to guard against overfitting. Model performance is evaluated by four metrics (Kaveh & Khavaninzadeh, 2023): the coefficient of determination (R²), which measures the proportion of variance in the observed collapse IDR explained by the predictions (Eq. 1); the root-mean-square error (RMSE), the square root of the average squared prediction deviation, which indicates overall predictive dispersion (Eq. 2); the mean absolute error (MAE), which quantifies the average absolute bias between predicted and actual values (Eq. 3); and the mean absolute percentage error (MAPE), which expresses the average relative error as a percentage of the observed values (Eq. 4). Together these offer a multifaceted view of predictive fidelity. Scatter plots comparing actual versus predicted collapse IDR for each tank configuration under SVR (Fig. 6) demonstrate good alignment along the 1:1 line for low-rise tanks, with modest dispersion emerging in the taller 1008 m³ configuration.
$$R^{2}=1-\frac{\sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}}{\sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}}\tag{1}$$

$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}}\tag{2}$$

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|y_{i}-\hat{y}_{i}\right|\tag{3}$$

$$\mathrm{MAPE}=\frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_{i}-\hat{y}_{i}}{y_{i}}\right|\tag{4}$$

where \(n\) is the total number of observations in the measured set, \(y_{i}\) is the actual collapse IDR for the \(i\)th sample, \(\hat{y}_{i}\) is the ML-predicted collapse IDR for the \(i\)th sample, and \(\bar{y}\) is the mean of the observed IDR values.
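The sketch below fits an SVR with the reported optima and evaluates Eqs. (1)–(4) with scikit-learn. The RBF γ is left at the library default as an assumption, since the text does not report it.

```python
# SVR surrogate with the reported optima, scored by the four metrics above.
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

# X_train/X_test, y_train/y_test: the 80/20 split from the earlier sketch.
svr = SVR(kernel="rbf", C=100, epsilon=0.0183).fit(X_train, y_train)
y_pred = svr.predict(X_test)

r2 = r2_score(y_test, y_pred)                              # Eq. (1)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))         # Eq. (2)
mae = mean_absolute_error(y_test, y_pred)                  # Eq. (3)
mape = 100 * np.mean(np.abs((y_test - y_pred) / y_test))   # Eq. (4)
print(f"R2={r2:.4f}  RMSE={rmse:.6f}  MAE={mae:.6f}  MAPE={mape:.2f}%")
```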
3.3 Structural Response Prediction Using MLP
The MLP framework developed for predicting IDR consists of an input layer aligned with the dimensionality of the selected ground motion and structural features, followed by two hidden layers of 256 neurons each. This configuration was selected through Bayesian hyperparameter optimization, which systematically explored candidate architectures by balancing accuracy and training efficiency (Yoo et al., 2021). The ReLU activation function was applied to all hidden layers to introduce nonlinearity while mitigating vanishing-gradient issues (Nair & Hinton, 2010), and He normal initialization was used to ensure stable gradient propagation (He et al., 2015). The model incorporated L₂ regularization with a coefficient of 1.05×10⁻⁴ to constrain weight magnitudes and mitigate overfitting. The optimizer adopted was Adam, with a learning rate of 1.69×10⁻³ identified through the Bayesian search; Adam dynamically adjusts per-parameter learning rates based on first- and second-moment estimates of the gradients (Kingma & Ba, 2014). Learning-rate decay was implemented by halving the rate upon stagnation of the validation loss over five consecutive epochs. Early stopping was triggered when no improvement in validation MSE was observed for ten epochs, using a 10% holdout from the training partition, and training was capped at 500 epochs; convergence typically occurred by epoch 120. The selected architecture reflects a trade-off achieved through the Bayesian search: it offers sufficient depth and capacity to capture nonlinear interactions among input features, particularly those arising from fluid–structure coupling, without incurring excessive computational cost or overfitting. Figure 7 illustrates the model structure, and Fig. 8 shows the resulting predictive performance on test data, confirming that MLP achieves closer alignment with actual IDR values across all tank configurations relative to SVR.
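An illustrative Keras reconstruction of the described network follows. The input width of nine features matches Table 4; the MSE loss mirrors the validation-MSE stopping criterion, while the omitted batch size is left at the framework default as an assumption.

```python
# MLP per Sec. 3.3: two 256-unit ReLU layers, He normal init, L2 = 1.05e-4,
# Adam at 1.69e-3, halve-on-plateau decay, early stopping on validation MSE.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

l2 = regularizers.l2(1.05e-4)
model = keras.Sequential([
    keras.Input(shape=(9,)),  # Sa, PGV, AI, HI, TP, SD, L, H, Col (Table 4)
    layers.Dense(256, activation="relu", kernel_initializer="he_normal",
                 kernel_regularizer=l2),
    layers.Dense(256, activation="relu", kernel_initializer="he_normal",
                 kernel_regularizer=l2),
    layers.Dense(1),          # predicted collapse IDR
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1.69e-3), loss="mse")

callbacks = [
    # halve the learning rate after 5 stagnant validation epochs
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5),
    # stop after 10 epochs without validation-MSE improvement
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                  restore_best_weights=True),
]
history = model.fit(X_train, y_train, validation_split=0.1,  # 10% holdout
                    epochs=500, callbacks=callbacks)
```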
3.4 Comparative Model Performance and Computational Efficiency
The reported test metrics (Table 5) reflect performance on a 20% held-out set not used in training or tuning, confirming model generalization. Both SVR and MLP exhibit strengths and limitations that must be weighed when selecting an algorithm for seismic fragility assessment. As shown in Table 5, SVR completed training in 0.894 s and inference in 0.019 s, whereas MLP required 21.203 s for training and 0.004 s for inference. This disparity reflects the underlying computational complexity: SVR solves a convex quadratic program whose cost scales between O(n²) and O(n³) with the number of samples, while MLP back-propagation over roughly 50,000 weights entails repeated gradient updates proportional to network size and epoch count (Smola & Schölkopf, 2004). For reference, a single IDA run for one tank configuration required approximately 2 hours on similar hardware, underscoring the computational efficiency of both surrogates.
Despite the longer training time, MLP yielded higher predictive fidelity: test R² improved from 0.953 (SVR) to 0.990 (MLP), and RMSE decreased from 0.0021 to 0.0009, with corresponding reductions in MAE (from 0.0010 to 0.0007) and MAPE (from 7.73% to 4.93%). These gains align with broader findings that neural networks often outperform SVR in capturing complex, nonlinear mappings when sufficient data are available, at the expense of an increased risk of overfitting and heavier computational demands. Conversely, SVR's reliance on support vectors (often a small subset of the training set) confers robust generalization under limited-data conditions. Moreover, the unique global optimum of the convex SVR problem ensures reproducible results across runs, while MLP training on a nonconvex loss surface may converge to different local minima unless multiple random restarts or advanced optimizers are employed. Inference speed also favors SVR when the number of support vectors remains modest, but MLP can leverage GPU acceleration and batch processing to mitigate latency in high-throughput applications (Rumelhart et al., 1986).
Both SVR and MLP serve as efficient surrogates for IDA, with distinct trade-offs. SVR delivers rapid, reliable estimates for high- to moderate-probability damage states (e.g., IO, LS) with errors under 5%, while MLP provides superior fidelity across all damage thresholds—including CP—with maximum error below 6%. These trade-offs underscore MLP’s suitability for scenarios demanding precise collapse-level predictions, provided computational resources and overfitting controls are prioritized.
Table 5
Comparative performance metrics and computational times for SVR and MLP models
| Model | Dataset | R² | MAE | RMSE | MAPE | Time (s) |
|---|---|---|---|---|---|---|
| SVR | Train | 0.9911 | 0.000345 | 0.000892 | 2.065% | 0.894 |
| SVR | Test | 0.9529 | 0.001019 | 0.002087 | 7.728% | 0.019 |
| MLP | Train | 0.9931 | 0.000460 | 0.000791 | 2.861% | 21.203 |
| MLP | Test | 0.9897 | 0.000653 | 0.000938 | 4.934% | 0.004 |
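A plausible way to reproduce the wall-clock figures in Table 5 is shown below; the paper does not list its timing procedure, so this measurement loop is an assumption (training time on the Train rows, inference time on the Test rows).

```python
# Assumed timing procedure for the Table 5 columns, continuing from the
# earlier SVR sketch; the same pattern applies to model.fit / model.predict.
import time

t0 = time.perf_counter()
svr.fit(X_train, y_train)
train_time = time.perf_counter() - t0   # cf. 0.894 s for SVR training

t0 = time.perf_counter()
svr.predict(X_test)
infer_time = time.perf_counter() - t0   # cf. 0.019 s for SVR inference
print(f"train {train_time:.3f} s, inference {infer_time:.3f} s")
```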
3.5 Model Interpretability and SHAP Analysis
SHapley Additive exPlanations (SHAP) was applied to interpret the predictions of both the SVR and MLP models by attributing each prediction to its input features. For the SVR model, the SHAP summary plot (Fig. 9) identifies PGV and HI as the most influential predictors. Their SHAP values are distributed around zero, indicating that their contributions to the predicted IDR vary depending on the specific input scenario. Sa and TP also contribute meaningfully, whereas AI, SD, and geometric features such as Length, Height, and Col show lesser but non-negligible influence. The color gradients in the SHAP plot suggest that higher PGV and HI values are typically associated with increased IDR, reflecting their role in amplifying structural demand (Lundberg & Lee, 2017).
For the MLP model, the SHAP summary (Fig. 10) shows a broader range of feature contributions. PGV and HI again dominate the input space, but Sa and AI also have significant positive associations with higher predicted IDR. Unlike SVR, the MLP model assigns greater importance to structural parameters, particularly Col, which is not prominent in the SVR interpretation. TP and SD also show increased relevance. This redistribution of feature importance is indicative of MLP's capacity to learn complex nonlinear interactions, in which both dynamic and geometric characteristics jointly influence seismic response. Length and Height show limited direct influence, implying that their effects may be indirectly captured through interactions with other parameters. The observed emphasis on Col in the MLP model likely stems from its ability to model coupled effects of geometry and dynamic response, suggesting improved sensitivity to fluid–structure interaction (FSI)-related behaviors compared to SVR.
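A compact sketch of the SHAP workflow behind Figs. 9 and 10 follows, continuing from the earlier sketches. The model-agnostic KernelExplainer and the background-sample size are assumptions, chosen because they apply uniformly to both the SVR and the Keras MLP.

```python
# Feature attribution via SHAP (Lundberg & Lee, 2017) for both surrogates.
import shap

features = ["Sa", "PGV", "AI", "HI", "TP", "SD", "L", "H", "Col"]
background = shap.sample(X_train, 100)   # background set for the explainer

for name, predict in [("SVR", svr.predict),
                      ("MLP", lambda x: model.predict(x).ravel())]:
    explainer = shap.KernelExplainer(predict, background)
    shap_values = explainer.shap_values(X_test)
    # Beeswarm summary: feature ranking and sign of each contribution,
    # analogous to Figs. 9 (SVR) and 10 (MLP).
    shap.summary_plot(shap_values, X_test, feature_names=features)
```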