Ensuring the safety of reinforcement learning (RL) policies in high-stakes environments requires more than formal verification: it also demands interpretability and targeted falsification, the deliberate search for counterexamples that expose potential failures before deployment. We present AEGIS-RL (Abstract, Explainable Graphs for Integrated Safety in RL), a hybrid framework that unifies (1) explainable RL, (2) probabilistic model checking, and (3) risk-guided falsification, and augments them with (4) a lightweight runtime safety shield that switches to a fallback policy when estimated risk exceeds a threshold. AEGIS-RL first builds a directed, semantically meaningful graph from offline trajectories, blending local and global explanations to make policy behavior both transparent and amenable to verification. This abstract graph is passed to a probabilistic model checker (e.g., Storm) to verify temporal safety specifications; when violations exist, the checker returns interpretable counterexample traces that pinpoint how the policy fails. When specifications appear satisfied, AEGIS-RL estimates residual risk during checking and uses it to steer falsification toward high-risk, under-explored states, broadening coverage beyond the offline data. Across safety-critical benchmarks, including two MuJoCo tasks and a medical insulin-dosing scenario, AEGIS-RL uncovers significantly more violations than uncertainty- and fuzzing-based baselines and yields a broader, more novel set of failure trajectories. The resulting explanations and counterexamples provide actionable guidance for understanding, debugging, and repairing unsafe policies, while the shield enables runtime mitigation without retraining.
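To make the runtime safety shield concrete, the following is a minimal sketch of the switch-to-fallback logic described above. The names `policy`, `fallback_policy`, `risk_fn`, and `threshold` are illustrative assumptions, not part of AEGIS-RL's actual interface; the sketch only shows the general pattern of deferring to a fallback policy whenever the estimated risk of the proposed action exceeds a threshold.

```python
from typing import Any, Callable

Observation = Any
Action = Any


class SafetyShield:
    """Illustrative runtime shield: defer to a fallback policy when estimated risk is too high.

    This is a hypothetical sketch, not the AEGIS-RL implementation.
    """

    def __init__(
        self,
        policy: Callable[[Observation], Action],
        fallback_policy: Callable[[Observation], Action],
        risk_fn: Callable[[Observation, Action], float],  # assumed risk estimator
        threshold: float,
    ) -> None:
        self.policy = policy
        self.fallback_policy = fallback_policy
        self.risk_fn = risk_fn
        self.threshold = threshold

    def act(self, obs: Observation) -> Action:
        # Query the learned policy for its proposed action.
        proposed = self.policy(obs)
        # If the estimated risk of executing that action exceeds the threshold,
        # switch to the fallback policy for this step (runtime mitigation, no retraining).
        if self.risk_fn(obs, proposed) > self.threshold:
            return self.fallback_policy(obs)
        return proposed
```

Because the shield wraps the learned policy at execution time, it can be applied to an already-trained agent, which is what allows mitigation without retraining.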