Using scikit-learn on Kaggle and AI Platform Prediction

You can deploy scikit-learn models trained on Kaggle to AI Platform Prediction to serve predictions at scale.

This AI Adventures episode explains the basic workflow for taking a model trained anywhere, including Kaggle, and serving online predictions from AI Platform Prediction.

Overview

  1. Train your scikit-learn model on Kaggle. You can see an example in this introduction to scikit-learn. See how to create a notebook kernel on Kaggle.
  2. Save your model using the joblib library, making sure to name the file model.joblib. Older scikit-learn releases expose the library as sklearn.externals.joblib; that alias was removed in scikit-learn 0.23. Select the Commit & Run button to execute all of your kernel code cells in order. This saves your kernel and runs your model training code.

  3. Download model.joblib from your kernel outputs.

  4. Upload your model.joblib file to Cloud Storage.

  5. Create model and version resources on AI Platform Prediction using the Google Cloud console, providing information about how you trained your model and where you stored it in Cloud Storage.

  6. Send a prediction request.
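
As a rough illustration of step 2, the following sketch trains a small model and saves it as model.joblib. The dataset (load_iris) and the estimator (RandomForestClassifier) are assumptions chosen only to keep the example short; they are not part of this guide. The sketch also assumes the standalone joblib package is available in the kernel.

```python
# Minimal sketch: train a small scikit-learn model and save it as model.joblib.
# The dataset and estimator below are illustrative assumptions, not from this guide.
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load a small built-in dataset (assumption: your own data is prepared similarly).
X, y = load_iris(return_X_y=True)

# Fit the estimator; random_state keeps the run reproducible.
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X, y)

# Save the trained model under the file name this guide asks for.
joblib.dump(model, "model.joblib")

# Reload the file to confirm that the saved model round-trips.
restored = joblib.load("model.joblib")
print(restored.predict(X[:2]))
```

In a Kaggle kernel, a file written to the working directory this way should show up among the kernel outputs after the run completes.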

Find your model files in Kaggle

You can download your model files from the Output tab in your kernel.

At the main link to your kernel, https://www.kaggle.com/[YOUR-USER-NAME]/[YOUR-KERNEL-NAME]/:

  1. Select the Output tab at the top of the page.
  2. Your model.joblib file appears in a list of Data Sources. To download the file, select the Download All button. Alternatively, hover your mouse over the name of the model, and then select the download icon that appears by the model name.
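
Before uploading the downloaded file, it can be worth checking locally that it loads and predicts. The sketch below is one possible way to do that. The helper name check_downloaded_model and the sample rows are hypothetical. The JSON body uses the "instances" key expected by online prediction requests, with the same rows you test locally.

```python
# Minimal sketch: sanity-check a downloaded model.joblib before uploading it.
# check_downloaded_model is a hypothetical helper, not part of any library.
import json

import joblib


def check_downloaded_model(path, rows):
    """Load a saved model, predict on rows, and build a JSON request body.

    path: location of the downloaded model file, for example "model.joblib".
    rows: list of feature rows (lists of numbers) matching the training features.
    """
    model = joblib.load(path)

    # Run a local prediction to confirm the file deserializes into a usable model.
    predictions = model.predict(rows).tolist()

    # Wrap the same rows under "instances" for a later online prediction request.
    body = json.dumps({"instances": rows})
    return predictions, body
```

For example, check_downloaded_model("model.joblib", [[5.1, 3.5, 1.4, 0.2]]) would return the local prediction for that single row together with the matching request body, assuming the model was trained on four numeric features.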

What's next