Google Cloud and YouTube-8M Challenge: Predict YouTube video tags for a chance to win up to $30K

February 15, 2017
Philippe Poutonnet

Product Marketing Lead

Mike Styer

Strategic Partner Development Manager

In partnership with YouTube, Google Research and Kaggle, Google Cloud Platform (GCP) invites you to participate in a large-scale video classification and representation learning task:

Using Google Cloud Machine Learning, TensorFlow, or your favorite machine learning framework, the competition challenges you to develop classification algorithms that accurately assign video-level labels using the YouTube-8M dataset. The dataset was created from over 7 million YouTube videos (450,000 hours of video) and includes video labels from a vocabulary of 4,716 classes (3.4 labels/video on average). It also comes with pre-extracted audio and visual features from every second of video (3.2B feature vectors in total). By taking part, Kagglers can not only play a pivotal role in setting state-of-the-art benchmarks, but can also improve search and organization of video archives.
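To make the task concrete, here is a minimal sketch of what "assigning video-level labels" can look like as a multi-label problem: each of the 4,716 classes gets its own independent score, and a video may receive several labels at once. This is not the competition's starter code. The function names, the tiny `FEATURE_DIM`, and the random untrained weights are all hypothetical, chosen only so the example runs with the Python standard library.

```python
import math
import random

NUM_CLASSES = 4716  # vocabulary size stated in the post
FEATURE_DIM = 8     # hypothetical; the real video-level features are much larger


def sigmoid(x):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_scores(features, weights, biases):
    """Return one independent probability per class.

    Each class is scored separately (multi-label), so the scores
    do not have to sum to 1 as they would with a softmax.
    """
    return [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(weights, biases)
    ]


def top_k_labels(scores, k):
    """Return the k (class_index, score) pairs with the highest scores."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Random, untrained parameters and a made-up feature vector (assumptions).
rng = random.Random(0)
weights = [
    [rng.gauss(0.0, 0.1) for _ in range(FEATURE_DIM)]
    for _ in range(NUM_CLASSES)
]
biases = [0.0] * NUM_CLASSES
video_features = [rng.random() for _ in range(FEATURE_DIM)]

scores = predict_scores(video_features, weights, biases)
predictions = top_k_labels(scores, k=3)  # roughly the average labels per video
```

In a real entry, the weights would be learned from the labeled training videos, and the linear scorer would likely be replaced by a model built in TensorFlow or another framework of your choice.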

Are you up for the challenge?


Some of the biggest breakthroughs in machine learning and machine perception have come thanks to large labeled datasets such as ImageNet, which includes millions of images labeled with thousands of classes and has significantly accelerated research in image understanding. Google has released many such datasets for Cloud Machine Learning, from Word Vector Models to Deep Learning for Robots, and more recently a few vision-related datasets, including Open Images, YouTube-8M and YouTube-BoundingBoxes.

Video represents another great opportunity to detect and recognize objects and understand human actions and interactions with the world. Improving our understanding of video imagery can lead to better video search, organization and discovery for personal memories, enterprise video archives or public video collections.

Getting started

  1. Review the data page for special instructions on how to access the competition's data, which will be hosted on Google Cloud. Participants can either download the data to work locally or work within the Google Cloud Machine Learning platform (beta).
  2. Review the tutorial on Getting Started with Google Cloud, and try the starter code.
  3. We've also provided a subsample of the data to explore on Kernels. Take a look at this Python notebook and create your own.
  4. Don't forget to review the prize eligibility details, which include requirements for code open-sourcing and a paper submission.

Cloud Machine Learning is currently in beta, and we welcome your feedback on using the service. Please share your questions and thoughts on the Kaggle competition forum, and post Cloud Machine Learning questions to Stack Overflow.
