Google Cloud Big Data and Machine Learning Blog

Innovation in data processing and machine learning technology

Using machine learning for insurance pricing optimization

Wednesday, March 29, 2017

By Kaz Sato, Staff Developer Advocate, Google Cloud

AXA, a large global insurance company, has used machine learning in a POC to optimize pricing by predicting “large-loss” traffic accidents with 78% accuracy.

The TensorFlow machine-learning framework has been open source since just 2015, but in that relatively short time, its ecosystem has exploded in size, with more than 8,000 open source projects using its libraries to date. This increasing interest is also reflected by its growing role in all kinds of image-processing applications (with examples including skin cancer detection, diagnosis of diabetic eye disease and even sorting cucumbers), as well as natural-language processing ones such as language translation.

We're also starting to see TensorFlow used to improve predictive data analytics for mainstream business use cases, such as price optimization. For example, in this post, I’ll describe why AXA, a large, global insurance company, built a POC using TensorFlow as a managed service on Google Cloud Machine Learning Engine for predicting "large-loss" car accidents involving its clients.

Understanding the use case

Approximately 7-10% of AXA’s customers cause a car accident every year. Most of them are small accidents involving insurance payments in the hundreds or thousands of dollars, but about 1% are so-called large-loss cases that require payouts over $10,000. As you might expect, it’s important for AXA adjusters to understand which clients are at higher risk for such cases in order to optimize the pricing of its policies.

Toward that goal, AXA’s R&D team in Japan has been researching the use of machine learning to predict whether a driver may cause a large-loss case during the insurance period. Initially, the team had been focusing on a traditional machine-learning technique called Random Forest. Random Forest is a popular algorithm that combines many decision trees (each splitting on factors such as possible reasons why a driver might cause a large-loss accident) for predictive modeling. Although Random Forest can be effective for certain applications, in AXA's case, its prediction accuracy of less than 40% was inadequate.
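To make the baseline concrete, here's a minimal sketch of a Random Forest classifier using scikit-learn. The data, feature count and class balance are invented stand-ins for illustration; AXA's actual training data and pipeline are not public.

```python
# Hypothetical Random Forest baseline sketch; data is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy stand-in for ~70 policyholder features (driver age, region, premium, ...).
X = rng.normal(size=(1000, 70))
# Binary label: 1 = large-loss case, 0 = otherwise (rare, like the real ~1%).
y = (rng.random(1000) < 0.05).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# An ensemble of decision trees; each tree votes on the final prediction.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

Note that on heavily imbalanced data like this, raw accuracy can look deceptively high, which is one reason comparing models on the rare large-loss class is the harder problem.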

In contrast, after developing an experimental deep learning (neural-network) model using TensorFlow via Cloud Machine Learning Engine, the team achieved 78% accuracy in its predictions. This improvement could give AXA a significant advantage for optimizing insurance cost and pricing, in addition to the possibility of creating new insurance services such as real-time pricing at point of sale. AXA is still at the early stages with this approach; architecting neural nets to make them transparent and easy to debug will take further development. But it's a great demonstration of the promise of these breakthroughs.

How does it work?

AXA created a cool demo UI for the test bed. Let's look at the details of its neural-network model that achieved the improvement.

AXA's deep learning model demo UI

On the left side, you can see there are about 70 values used as input features, including the following:

  • Age range of the driver
  • Region of the driver's address
  • Annual insurance premium range
  • Age range of the car
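Categorical features like these are typically one-hot encoded into a fixed-length numeric vector before being fed to a model. Here's a hypothetical sketch using scikit-learn; the category values are invented, and AXA's actual feature engineering is not public.

```python
# Hypothetical one-hot encoding of categorical policy features.
from sklearn.preprocessing import OneHotEncoder

records = [
    # age range, region, premium range, car age range (invented values)
    ["30-39", "Tokyo", "high", "0-3yr"],
    ["18-29", "Osaka", "low", "4-7yr"],
]
encoder = OneHotEncoder()
# One row per policyholder, one column per distinct category value.
vectors = encoder.fit_transform(records).toarray()
print(vectors.shape)  # (2, 8): 4 features x 2 distinct values each
```

With AXA's roughly 70 input values, encoding all categories this way yields the single 70-dimensional feature vector described below.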

AXA encoded these features as a single 70-dimensional vector and fed it into the deep learning model shown in the middle. The model is designed as a fully connected neural network with three hidden layers, with ReLU as the activation function. AXA used data in Google Compute Engine to train the TensorFlow model, and Cloud Machine Learning Engine’s HyperTune feature to tune hyperparameters.
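A network like the one described could be sketched in TensorFlow's Keras API as follows: a 70-dimensional input, three fully connected hidden layers with ReLU activations, and a sigmoid output for the binary large-loss prediction. The hidden-layer sizes here are assumptions; AXA has not published the exact architecture.

```python
# Minimal sketch of a fully connected network matching the description.
# Hidden-layer widths (128/64/32) are illustrative assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(70,)),               # ~70 input features
    tf.keras.layers.Dense(128, activation="relu"),    # hidden layer 1
    tf.keras.layers.Dense(64, activation="relu"),     # hidden layer 2
    tf.keras.layers.Dense(32, activation="relu"),     # hidden layer 3
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P(large-loss case)
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

In a managed setup like Cloud Machine Learning Engine, choices such as these layer widths and the learning rate are exactly the kind of hyperparameters a tuning service like HyperTune would search over.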

The following is the end result. The red line shows the accuracy rate with the deep learning model (78%).

Test results for POC

TensorFlow on business data

AXA's case is one example of using machine learning for predictive analytics on business data. As another example, recently DeepMind used a machine-learning model to reduce the cost of Google data-center cooling by 40%. The team entered numerical values acquired from IoT sensors in Google data centers (temperatures, power, pump speeds, setpoints and so on) into a deep learning model and got better results than the existing approach.

To learn more about business applications of machine learning (including the AXA example), watch this session video from Google Cloud NEXT ‘17:

