BRIDGING CLASSICAL STATISTICS AND MODERN AI TOWARD INTERPRETABLE DATA‑SCIENCE MODELS
Keywords:
Interpretable Machine Learning, Explainable AI, Classical Statistics, Generalized Additive Models, SHAP, LIME, Symbolic Regression, Transparency, Causal Interpretability

Abstract
The rise of artificial intelligence (AI) in real-world applications has intensified the long-standing tension between predictive accuracy and interpretability. Traditional statistical models, such as linear and generalized linear regression, are valued for their clarity, inferential power, and ability to quantify uncertainty. Yet these methods struggle when faced with highly complex, nonlinear, or high-dimensional problems. In contrast, contemporary machine learning models, particularly deep neural networks, achieve exceptional predictive accuracy but typically operate as black boxes, offering little transparency to practitioners or decision-makers. Bridging this divide has become one of the central challenges in modern data science. In recent years, researchers have sought to combine the strengths of statistical reasoning with the adaptability of machine learning. Approaches such as generalized additive models, explainable boosting machines, Shapley additive explanations (SHAP), local interpretable model-agnostic explanations (LIME), and symbolic regression reflect this trend. These tools aim to produce models that deliver strong performance while remaining understandable, a balance that is particularly vital in fields such as healthcare, finance, policy, and engineering, where openness and accountability are non-negotiable. This paper examines these methodological advances, situates them within the broader history of interpretable modeling, and outlines directions for future work that integrates the inferential depth of statistics with the expressive capacity of modern AI.

