Back to Blog

Is a GLM AI? Clarifying the Distinction in Actuarial Science

July 21, 2025

Executive Summary

The insurance analytics landscape is witnessing an explosion of "AI-powered" solutions. However, not all algorithms marketed as "AI" represent genuine artificial intelligence, creating confusion among insurance leaders about the true capabilities of various analytical solutions.

Our view, based on a review of the literature: while Generalised Linear Models (GLMs) with regularization techniques such as fused lasso and group lasso represent valuable advances in statistical modelling, they do not on their own deliver the representation-learning capabilities that characterize AI applications driven by modern deep learning (a view we share with Sam Altman, whose post "The Intelligence Age" distils recent progress as "deep learning worked"). True AI systems offer fundamentally different capabilities: automatic feature learning, unstructured data processing, and discovery of complex patterns that would be impossible to specify manually. Beyond well-known applications like ChatGPT, which are built on deep learning at massive scale, recent research indicates significant performance gains within the insurance space when these techniques are used.

Understanding these analytical distinctions matters greatly. Insurers who mistakenly adopt traditional statistical methods marketed as "AI" may miss significant competitive advantages, underperforming relative to peers who leverage true AI solutions. As the regulatory landscape evolves, with frameworks like the EU AI Act and NAIC Model Bulletins defining governance requirements, precision in terminology becomes not just academic but essential for compliance and strategic planning. True AI positions insurers to not just respond to market changes, but to proactively shape their competitive future.


The AI Label Epidemic in Insurance Technology

Walk through an insurtech conference in 2025 and you will encounter a striking phenomenon: many analytics vendors claim to offer "AI-powered" solutions. From simple regression enhancements to genuine deep learning systems, everything has been rebranded as "AI". This is not merely a matter of marketing semantics – it is creating real confusion in the market, with tangible consequences for technology adoption and strategic planning.

In our experience, many solutions marketed as "AI-powered" are essentially traditional rule-based systems with minor statistical enhancements. When vendors promote regularized GLMs as "AI solutions", for example, they may not merely be stretching the truth; they may be confusing actuaries about fundamental capabilities, infrastructure requirements, and achievable outcomes.

At insureAI, we believe in technological honesty. As practitioners who have published extensively on both traditional actuarial methods and genuine AI applications based on the prevailing paradigm in AI systems (deep learning), we are positioned to clarify these critical distinctions. Our mission extends beyond developing cutting-edge solutions – we are committed to elevating the entire profession's understanding of what truly constitutes AI in actuarial science.

Understanding the Analytical Spectrum: From Statistics to AI

To appreciate why, in our view, regularized GLMs are not AI in the modern sense, we need to understand the fundamental evolution of analytical approaches in insurance. This progression represents not just incremental improvements but paradigm shifts in how we approach risk quantification and prediction.

Traditional Statistical Methods: The Foundation

Statistical approaches like GLMs have formed the historical bedrock of actuarial science. These methods embody decades of statistical insight and actuarial know-how, providing interpretable, theoretically grounded approaches to risk assessment. At their core, they require explicit model specification where actuaries define the mathematical relationship between variables. Through parameter estimation using maximum likelihood techniques, we find optimal coefficients that quantify these relationships. The resulting models offer clear interpretability (we understand precisely how each factor influences the outcome) and enable statistical inference through hypothesis testing and confidence intervals.

For example, a traditional motor insurance GLM might specify that log claim frequency equals a base rate plus coefficients for age bands, vehicle groups, and territories. Note that for brevity, we have omitted link functions and exposure terms that would be included in practice. The crucial point is that the actuary explicitly defines these relationships based on domain expertise, drawing on years of experience and actuarial judgment.
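To make this concrete, here is a minimal sketch of such a frequency GLM in Python, assuming the statsmodels library and a synthetic policy-level dataset; the column names and simulated frequencies are purely illustrative and not drawn from any real portfolio.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Synthetic policy-level data: one row per policy-year (illustrative only).
rng = np.random.default_rng(0)
n = 5_000
policies = pd.DataFrame({
    "age_band":      rng.choice(["18-25", "26-40", "41-60", "61+"], size=n),
    "vehicle_group": rng.choice(["A", "B", "C"], size=n),
    "territory":     rng.choice(["north", "south", "east"], size=n),
    "exposure":      rng.uniform(0.25, 1.0, size=n),
})
# Simulate claim counts from an assumed "true" frequency driven by age band.
base_freq = policies["age_band"].map(
    {"18-25": 0.18, "26-40": 0.10, "41-60": 0.08, "61+": 0.12})
policies["claim_count"] = rng.poisson(base_freq * policies["exposure"])

# The actuary explicitly specifies the functional form: log claim frequency
# is a linear combination of the chosen rating factors, with exposure
# entering as an offset under the log link.
glm = smf.glm(
    "claim_count ~ C(age_band) + C(vehicle_group) + C(territory)",
    data=policies,
    family=sm.families.Poisson(),
    offset=np.log(policies["exposure"]),
).fit()
print(glm.summary())  # each coefficient is a directly interpretable effect
```

Every term on the right-hand side of that formula was chosen by the modeller; the fitting routine only estimates the coefficients.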

Machine Learning: The Prediction Revolution

Machine learning algorithms like gradient boosting machines (GBMs) and random forests marked a paradigm shift in analytical philosophy. Rather than starting with explicit relationships, these algorithms discover patterns through algorithmic exploration. While tree-based GBMs can automatically learn some non-linear interactions, they still often rely heavily on engineered tabular inputs - a key distinction from deep learning approaches (we will have more to say about this in a future publication).

The focus shifts dramatically from inference to prediction. Instead of asking "why does this factor matter?", we ask "how accurately can we predict outcomes?". These algorithms capture complex interactions automatically, finding patterns that would be difficult or impossible to specify manually. However, they still require significant feature engineering, where domain experts must transform raw data into meaningful inputs.

Consider how a GBM might approach the same motor insurance problem. It could discover that the interaction between driver age, time of day, and weather conditions significantly impacts claim frequency - a three-way interaction that would be challenging to specify and test in a traditional GLM framework. Yet the GBM may still require us to provide engineered features like "night driving percentage" or "harsh braking frequency" to attain optimal predictive performance, rather than learning these concepts from raw data.
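As an illustration of the shift in philosophy, here is a minimal sketch of a Poisson-loss gradient boosting model on hypothetical engineered telematics features, using scikit-learn; the feature names and simulated data are our assumptions for the example, not a recommended feature set.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.metrics import mean_poisson_deviance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 10_000
# Hypothetical *engineered* features - a human still decided that these
# particular summaries of the raw telematics data are the ones that matter.
X = pd.DataFrame({
    "driver_age":         rng.integers(18, 80, size=n),
    "night_driving_pct":  rng.uniform(0.0, 0.6, size=n),
    "harsh_braking_freq": rng.poisson(2.0, size=n),
    "annual_mileage_k":   rng.uniform(2.0, 40.0, size=n),
})
# Simulated frequency containing an interaction the GBM can find on its own.
rate = 0.05 + 0.15 * X["night_driving_pct"] * (X["driver_age"] < 25)
y = rng.poisson(rate)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Poisson-loss boosting: split points are found algorithmically, so an
# interaction like "young driver AND heavy night driving" emerges without
# ever being written into a formula.
gbm = HistGradientBoostingRegressor(loss="poisson", max_iter=300)
gbm.fit(X_train, y_train)
print("held-out Poisson deviance:",
      mean_poisson_deviance(y_test, gbm.predict(X_test)))
```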

Modern AI (Deep Learning): The Representation Learning Breakthrough

Contemporary AI introduces fundamentally different capabilities through deep neural networks, representing – in the insureAI view – the most significant leap in analytical capability since the introduction of computers to actuarial work. The key innovation is automatic feature learning (what researchers call representation learning), where models discover optimal representations directly from raw data without manual intervention.

Modern deep learning systems can process unstructured data directly, whether that is text from claim descriptions, images from property inspections, or streaming sensor data from telematics devices. Through hierarchical learning across multiple neural network layers, these systems build increasingly abstract concepts. They can leverage transfer learning to apply knowledge from one domain to another, and they optimize end-to-end, learning the entire analytical pipeline from raw input to final prediction.

The state of the art has evolved rapidly. As noted in Bhattacharya et al.'s 2025 systematic review in Frontiers in AI, transformer architectures with attention mechanisms are now starting to dominate performance benchmarks for both sequential and tabular data, not just in natural language processing and computer vision applications where they first gained prominence.

To illustrate the practical difference: our deep learning models at insureAI currently ingest raw telematics data - GPS coordinates, acceleration patterns, timestamps - and automatically learn that certain combinations indicate risky driving behaviours. The model might discover that rapid deceleration approaching intersections during rush hour in rainy conditions represents a distinct risk pattern, all without any human explicitly programming this relationship.
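For readers who want to see roughly what such a model looks like, below is a toy PyTorch sketch of a 1-D convolutional network that consumes raw telematics channels directly. It is emphatically not our production architecture; the channel choices, dimensions, and layer sizes are illustrative assumptions only.

```python
import torch
import torch.nn as nn

class TripEncoder(nn.Module):
    """Toy network: raw trip signals in, risk score out - no hand-crafted features."""

    def __init__(self, n_channels: int = 3):
        super().__init__()
        # Three raw channels, e.g. longitudinal accel, lateral accel, speed.
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # summarise the whole trip
        )
        self.head = nn.Linear(64, 1)   # e.g. log expected claim frequency

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, time_steps)
        z = self.features(x).squeeze(-1)   # learned trip representation
        return self.head(z)

model = TripEncoder()
batch = torch.randn(8, 3, 600)   # 8 trips, 3 raw channels, 600 time steps
print(model(batch).shape)        # torch.Size([8, 1])
```

The convolution filters play the role that hand-crafted inputs like "harsh braking count" play in the GBM sketch above, except that they are learned from the raw signal during training.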

Beyond actuarial science, deep learning has become the dominant paradigm in the quest to create AI systems. Sam Altman – co-founder and CEO of OpenAI – distills a decade of AI progress into three simple words: deep learning worked. In "The Intelligence Age," he explains that once researchers saw how reliably larger neural networks improved with more data and compute, they realized they had found an algorithm that can learn almost any data distribution. Scaling, not exotic new theory, became the engine of advance – each jump in the resources used to train deep neural networks unlocks fresh capabilities, from natural language reasoning to autonomous problem solving, convincing Altman that superintelligence could be only a few thousand days away. At insureAI, we believe that actuarial intelligence will be unlocked in a similar way.

Regularized GLMs: Why Advanced Statistics ≠ AI

Now let us examine the specific case of marketing that prompted this discussion: GLMs enhanced with regularization techniques. These methods have gained significant popularity and are marketed by some as "AI solutions." But do they truly represent artificial intelligence in the modern sense? Not in our opinion!

What Regularization Adds to GLMs

Regularization techniques - including lasso, ridge, elastic net, fused lasso, and group lasso - enhance traditional GLMs in meaningful ways. They enable automated variable selection, with lasso famously setting unimportant coefficients exactly to zero. They handle high-dimensional data gracefully, working effectively even when predictors outnumber observations. Through penalty terms that constrain model complexity, they reduce overfitting and improve out-of-sample performance. Advanced variants like fused lasso can even group similar categories, automatically discovering that certain geographical regions share similar risk profiles.

These are genuinely valuable improvements that address real limitations of traditional GLMs. A regularized GLM might automatically discover that among 200 potential rating factors, only 30 truly influence claims experience, saving actuaries significant time in model development and reducing the risk of spurious relationships.
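The variable-selection behaviour is easy to demonstrate. The following sketch uses an ordinary lasso (scikit-learn's LassoCV on simulated data with 200 candidate factors, 30 of them informative) purely for illustration; the same mechanism carries over to penalised GLMs in packages such as glmnet or H2O.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Simulated design: 200 candidate rating factors, only 30 truly informative.
X, y = make_regression(n_samples=5_000, n_features=200, n_informative=30,
                       noise=10.0, random_state=42)

# The L1 penalty (strength chosen by cross-validation) drives unimportant
# coefficients exactly to zero - automated variable selection.
lasso = LassoCV(cv=5).fit(X, y)
n_selected = int(np.sum(lasso.coef_ != 0))
print(f"{n_selected} of {X.shape[1]} candidate factors retained")
```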

Why Regularized GLMs Remain Statistical Models

Despite these enhancements, regularized GLMs fundamentally remain within the statistical modelling paradigm rather than representing true AI. Beware attempts to give fancy new names to old models and methods from the statistical literature; we cite some of the original papers below. The distinction lies not in their sophistication but in their fundamental architecture and capabilities.

First, the model structure remains predefined. The actuary must still specify the link function, error distribution, and basic functional form. While regularization helps select variables, it does not change the fundamental GLM framework.

Second, these models remain linear in their parameters. While a GLM can incorporate non-linear transformations of inputs through basis expansions like splines or polynomial terms, the learning algorithm does not autonomously discover these expansions - they must be specified by the modeler.
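To make that concrete, here is a small sketch (synthetic data, statsmodels with patsy's bs() spline basis) in which the non-linearity in driver age appears only because the modeller wrote the spline expansion into the formula by hand; the data and degrees of freedom are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
df = pd.DataFrame({"driver_age": rng.integers(18, 80, size=2_000)})
df["claim_count"] = rng.poisson(0.05 + 0.10 * (df["driver_age"] < 25))

# The fitting algorithm never decides to use a spline: the modeller chooses
# a cubic B-spline basis with 4 degrees of freedom and writes it into the
# formula explicitly.
spline_glm = smf.glm("claim_count ~ bs(driver_age, df=4)",
                     data=df, family=sm.families.Poisson()).fit()
print(spline_glm.params)
```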

Third, and perhaps most critically, regularized GLMs cannot learn features from raw data. They operate only on pre-engineered features, requiring human experts to transform raw information into model inputs. While text can be vectorized and fed into GLMs using techniques like Term Frequency-Inverse Document Frequency (TF-IDF), the resulting models typically show inferior performance compared to deep learning encoders that can understand semantic meaning.
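A brief sketch of that text-to-GLM workflow, using scikit-learn's TfidfVectorizer with a linear classifier; the claim descriptions and fraud labels below are entirely invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical claim descriptions with an invented fraud flag.
texts = [
    "rear-end collision at traffic lights, minor bumper damage",
    "vehicle stolen overnight from driveway, no witnesses",
    "water damage to kitchen after pipe burst",
    "phone lost on holiday, receipt unavailable",
]
labels = [0, 1, 0, 1]

# TF-IDF turns each description into a sparse bag-of-words vector; the
# linear model then sees only surface-level word weights, with no notion
# that "stolen" and "theft" mean roughly the same thing.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["car taken from outside the house during the night"]))
```

A deep learning encoder, by contrast, maps related phrasings to nearby points in a semantic space, which is where the performance gap typically opens up.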

Fourth, these models cannot build hierarchical abstractions. Unlike deep networks that construct increasingly complex representations through multiple layers, regularized GLMs work with a single layer of hand-crafted features. They cannot discover that certain low-level patterns combine to form higher-level risk indicators.

Consider this practical comparison: A regularized GLM approach requires pre-engineered features like "average speed," "hard braking events," and "night driving percentage". The model applies penalties to select important features and estimate their coefficients, ultimately producing a linear combination of the selected features. In contrast, a true AI approach ingests raw GPS traces, accelerometer data, and timestamps directly. The neural network learns that certain acceleration patterns combined with location and time indicate risky behaviours, outputting a complex non-linear function automatically discovered from the data.

The Real-World Performance Gap

The distinction between regularized statistics and AI is not merely an academic one - it translates to measurable business impact that directly affects competitive positioning and profitability. Based on published research and our implementation experience at insureAI, the performance improvements are substantial and consistent across different insurance applications.

Predictive Accuracy Improvements

Modern systematic reviews provide compelling evidence for the superiority of deep learning approaches. For instance, mortality forecasting studies have shown that deep learning methods can substantially outperform industry-standard benchmarks, markedly improving predictive accuracy. Similarly, neural network approaches applied to claims reserving and motor liability pricing have consistently demonstrated reductions in prediction errors compared to traditional statistical methods, particularly where complex data patterns exist.

This lift is particularly striking in telematics, where features automatically extracted by auto-encoders have proven more predictive than traditional variables. Studies in similar applications have shown that hybrid deep learning models like Convolutional Neural Networks and Long Short-Term Memory networks can significantly outperform traditional GLM baselines, though published gains of 15-30% should be treated as illustrative unless directly verified in telematics pricing.

Beyond accuracy, AI can deliver massive efficiency gains, with one case showing a neural network surrogate for calculating variable-annuity Greeks delivering a 100-fold speed-up over Monte Carlo simulations.

However, it is crucial to apply these techniques appropriately, as they do not necessarily outperform simpler models when applied to highly random or noise-driven data series like equity and bond returns.

Taken together, the available evidence demonstrates that deep learning can deliver notable performance enhancements in actuarial tasks involving complex, non-linear data relationships, positioning these techniques as valuable complements to traditional statistical approaches.

Capability Expansions

Beyond improving existing processes, true AI enables entirely new applications that would be impossible with statistical methods alone. Modern computer vision models can automatically assess vehicle damage from photographs, estimating repair costs with accuracy approaching that of experienced adjusters. Natural language processing using transformer architectures can identify fraudulent patterns in claim descriptions that no rule-based system could detect. Real-time risk scoring becomes feasible when neural networks process streaming telematics data, enabling truly dynamic usage-based insurance products. Perhaps most importantly for our climate-challenged future, deep learning can integrate satellite imagery, weather data, and historical claims patterns to model climate risks with unprecedented accuracy.

A Case Study: Telematics-Based Pricing

Let us share a concrete example from our work at insureAI with telematics data that illustrates the transformative potential of genuine AI. Using traditional approaches - including sophisticated regularized GLMs - actuaries would manually engineer features from the raw data. They might calculate harsh braking counts, determine night driving percentages, and compute highway versus city driving ratios. Building a GLM with these engineered features typically achieves a lift ratio of about 2.5 between the best and worst risk deciles.

Our deep learning approach – applied in a novel way to a commercial line of insurance business – takes a fundamentally different path. We feed raw GPS and sensor data directly to a neural network without any feature engineering. The network automatically learns complex patterns that no human would think to specify - for instance, allowing us to identify vehicles behaving in a subtly different and more risky manner during business-as-usual operations.

Insurers using our AI-inspired approaches can price more accurately, attracting profitable risks while avoiding underpriced policies. They discover risk factors that human experts never considered, continuously improving their understanding of what drives claims.

Making Informed Decisions: A Practical Framework

Given these fundamental distinctions, how should actuaries and insurance leaders evaluate analytical solutions? We've developed a practical framework based on our experience implementing both traditional and AI-based solutions across the insurance industry.

Assessing Your Actual Needs

The choice between regularized GLMs and true AI should be driven by your specific business context and objectives, not by marketing claims or competitive pressure. Regularized GLMs remain excellent choices when interpretability is paramount - for instance, in regulatory filings where you must explain and defend every rating factor. They work well with smaller datasets or purely tabular data where the additional complexity of deep learning may not be justified. When traditional governance frameworks must apply, or when quick implementation is critical, regularized GLMs offer a proven path with established best practices.

Consider true AI when predictive accuracy is the primary goal and small improvements translate to significant business value. If you have access to unstructured data - text from claims descriptions, images from property inspections, or sensor data from IoT devices - deep learning may be the only way to extract value effectively. When you suspect complex patterns exist in your data that would be difficult to specify manually, or when competitive advantage justifies the additional investment in infrastructure and expertise, AI approaches become compelling.

Asking the Right Questions

When evaluating vendor claims of "AI-powered" solutions, cut through the marketing with specific technical questions:

  • Ask about data handling capabilities: "Can your system process raw telematics traces, claim photos, or medical reports directly, or does it require pre-processed features?".
  • Probe their approach to feature engineering: "Do we need to manually create features, or does the system learn them automatically from raw data?".
  • Investigate the technical architecture: "What specific neural network architectures do you employ? Are you using convolutional networks for image data, recurrent networks for sequences, or transformers for complex dependencies?".
  • Understand infrastructure requirements: "What computing resources are required for training and deployment? Do you support GPU acceleration? What are the latency characteristics for real-time scoring?".
  • Ask about explainability and governance: "What explainability tooling do you provide? Do you support SHAP values, Integrated Gradients, or other interpretation methods? How do you help us meet regulatory requirements for model transparency?" (see the brief sketch after this list).
  • Finally, demand validation: "Can you demonstrate superior performance on a benchmark dataset? Do you have published research or white papers on your methodology?".
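If a vendor claims SHAP support, one quick sanity check is to ask for something along the lines of the following minimal sketch, here using the open-source shap package on a toy random forest; your own models and data would take the place of the placeholders.

```python
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Toy stand-in for whatever model the vendor actually deploys.
X, y = make_regression(n_samples=500, n_features=8, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# SHAP attributes each prediction to the input features, giving a
# per-policy, per-factor explanation that can feed governance reviews.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:100])
print(shap_values.shape)   # (100 policies, 8 features)
```

Vendors with genuine deep learning offerings should be able to produce comparable attributions for their own architectures, for instance via Integrated Gradients or DeepSHAP for neural networks.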

Demanding Proof, Not Promises

Move beyond vendor assertions to concrete evidence. Request performance comparisons on standard benchmark datasets that allow apples-to-apples comparisons. Look for published research or detailed white papers that explain the methodology - genuine AI innovations are often accompanied by peer-reviewed publications. Examine case studies with quantified improvements, not just directional claims. Review technical architecture documentation to understand what is actually under the hood.

Be particularly wary of claims that seem too good to be true or vendors who cannot provide technical details. True AI capabilities come with trade-offs and limitations that vendors will discuss openly.

The Path Forward: Embracing Both Statistics and AI

At insureAI, we are not advocating for AI supremacy – we are calling for precision and clarity in how we describe and apply different analytical approaches. The future of insurance analytics is not about choosing sides but about building a sophisticated toolkit where each approach is used for its strengths.

Regularized GLMs remain excellent statistical tools with enduring value. They excel in regulatory pricing models where transparency is non-negotiable. For quick exploratory analysis where you need to understand relationships between variables, their interpretability is invaluable. In situations with limited data or where the additional complexity of AI is not justified, they provide robust, defensible results. They serve as excellent baselines against which to measure the incremental value of more complex approaches.

True AI excels in different scenarios. When maximizing predictive accuracy can deliver competitive advantage, the investment in deep learning often pays for itself quickly. For integrating diverse data types - combining structured policy data with unstructured text and images - AI approaches are often the only viable option. They excel at discovering unknown patterns that human experts would never think to investigate. Most importantly, they can automate complex analytical processes, freeing actuaries to focus on strategic questions rather than feature engineering.

As we look toward the future, emerging techniques promise to blur some of these distinctions while creating new opportunities. Foundation models and large language models are beginning to transform insurance applications. Recent work by Milliman shows that instruction-tuned language models outperform legacy rule-based natural language processing by 18-30% in claims triage and fraud flagging tasks. Even insurers without vast data reservoirs can leverage foundation models-as-a-service to unlock value from unstructured text.

Graph neural networks (GNNs) show particular promise for spatial and systemic risk modelling. A 2024 study from Los Alamos National Laboratory demonstrated that statistically-augmented GNNs reduced blackout severity prediction error by 42% compared to XGBoost baselines. For property and casualty insurers, these techniques could revolutionize catastrophe modelling and supply chain risk assessment.

Privacy-preserving techniques like federated learning are opening new possibilities for industry collaboration. The Society of Actuaries' 2024 report on federated learning demonstrates that carriers can achieve superior models by collaborating without sharing raw data - a federated Tweedie pricing model across three carriers achieved 9% lower mean absolute error than any single-carrier model.

Conclusion: Technological Honesty in the Age of AI

As the insurance industry accelerates its digital transformation, the pressure to claim "AI capabilities" intensifies. But imprecise terminology does not just confuse - it actively harms the industry. It creates unrealistic expectations that lead to failed projects and wasted resources. It obscures genuine innovation, making it harder to identify truly transformative technologies. It delays adoption of technologies that could deliver real competitive advantage. Most perniciously, it leads to misallocation of resources to incremental improvements marketed as breakthroughs.

The regulatory dimension adds another layer of importance to this discussion. As frameworks like the EU AI Act and emerging NAIC model bulletins define governance requirements for AI systems, the distinction between statistical models and true AI becomes a compliance issue, not just a technical one. Organizations must understand what they are actually implementing to ensure appropriate governance structures are in place.

At insureAI, we have spent years developing and implementing genuine AI-inspired solutions for actuarial applications. We have seen firsthand the transformative power of true deep learning - improvements in risk differentiation that fundamentally change the economics of insurance. We have also seen the continued value of traditional statistical methods in appropriate contexts. Both have their place but conflating them serves no one.

As you evaluate analytical solutions for your organization, we encourage you to look beyond marketing labels to understand actual capabilities. Match technology to need rather than choosing based on buzzwords. Demand technical clarity from vendors about their methods - those with genuine innovations will be happy to explain. Most importantly, invest in understanding these distinctions within your organization. The future of insurance analytics is bright, with both sophisticated statistical methods and genuine AI playing important roles. But realizing this future requires that we speak precisely about our tools and honestly about their capabilities.

Ready to explore what AI-inspired solutions can do for your actuarial processes? Contact our team at insureAI to see how our deep learning solutions can transform your insurance analytics - no marketing hype, just proven technology and measurable results. We'll show you exactly how our approaches differ from traditional methods and help you determine which tools are right for your specific challenges.


References

Altman, S. (2024, September 23). The intelligence age. https://ia.samaltman.com/

Bengio, Y., Courville, A., & Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798-1828.

Bhattacharya, S., et al. (2025). AI revolution in insurance: bridging research and reality. Frontiers in Artificial Intelligence. Available at: https://pmc.ncbi.nlm.nih.gov/articles/PMC12014612/

Breiman, L. (2001). Statistical modeling: the two cultures. Statistical Science, 16(3), 199-231.

Global Market Insights. (2025). Insurance Telematics Market Size Report. Available at: https://www.gminsights.com/industry-analysis/insurance-telematics-market

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.

Gorka, J., et al. (2024). Real-time Risk Prediction of Cascading Blackouts with Graph Neural Networks. arXiv:2403.15363. Available at: https://arxiv.org/html/2403.15363v1

Hastie, T., Tibshirani, R., & Wainwright, M. (2015). Statistical learning with sparsity: the lasso and generalizations. CRC press.

International Journal of Finance and Management Research. (2024). Real-Time Risk Assessment in Insurance: A Deep Learning Approach. Available at: https://www.ijfmr.com/papers/2024/6/31710.pdf

Milliman. (2024). The potential of large language models in the insurance sector. Available at: https://www.milliman.com/en/insight/potential-of-large-language-models-insurance-sector

Richman, R. (2020). AI in actuarial science – a review of recent advances – part 1. Annals of Actuarial Science, 1-23.

Richman, R., & Wüthrich, M. V. (2019). A neural network extension of the Lee-Carter model to multiple populations. Annals of Actuarial Science, 13(2), 268-281.

Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3), 289-310.

Society of Actuaries. (2024). Federated Learning for Insurance Companies. Available at: https://www.soa.org/globalassets/assets/files/resources/research-report/2024/federated-learning-insurance-companies.pdf

Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B, 67(1), 91-108.