
Now updated: Our Data Engineering Learning Path

February 11, 2020
Ajay Hemnani

Cloud Training

With the market for artificial intelligence and machine learning-powered solutions projected to grow to $1.2B by 2023, it’s important to consider business needs both now and in the future. We've heard from our customers, and have seen internally, that the data engineering role has evolved and now requires a broader set of skills. In the past, data engineers worked with distributed systems and Java programming to run Hadoop MapReduce in the data center; now they also need AI, machine learning, and business intelligence skills to manage and analyze data efficiently. To address these new skill requirements, we updated our Data Engineering on Google Cloud learning path.

We’ve added new course content to this learning path, such as introductions to Data Fusion and Cloud Composer. We also added more labs on advanced BigQuery, BigQuery ML, and Bigtable streaming to give you more hands-on practice.

This learning path covers the primary responsibilities of data engineers and consists of five courses: 

  • Google Cloud Big Data and Machine Learning Fundamentals - Start off by learning the important GCP big data and machine learning concepts and terminology.

  • Modernizing Data Lakes and Data Warehouses with Google Cloud - Understand the responsibilities of data engineers, the business need for effective data pipelines, and the benefits of data engineering in the cloud. This course also digs deeper into the use cases and available GCP solutions for data lakes and data warehouses, the key components of data pipelines.

  • Building Batch Data Pipelines on Google Cloud - Discover which paradigm to use for different kinds of batch data as this course walks you through the main data pipeline paradigms: extract-load, extract-load-transform, and extract-transform-load. You’ll also learn more about data transformation technologies, including how to use BigQuery, run Spark on Dataproc, build pipeline graphs in Data Fusion, and do serverless data processing with Dataflow.

  • Building Resilient Streaming Analytics Systems on Google Cloud - Learn how to build streaming data pipelines on Google Cloud, apply aggregations and transformations to streaming data using Dataflow, and store processed records in BigQuery or Bigtable for analysis, so you can get real-time metrics on business operations.

  • Smart Analytics, Machine Learning, and AI on Google Cloud - Extract more insights from your data by learning how to customize machine learning in data pipelines on Google Cloud. You will learn how to use AutoML when you need little to no customization, and how to use AI Platform Notebooks and BigQuery ML for more tailored machine learning capabilities. You will also learn how to productionize machine learning solutions using Kubeflow Pipelines.
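To make the three batch paradigms named in the list above a little more concrete, here is a minimal sketch in plain Python. It is only an illustration, not course material: the `warehouse` dictionary, the sample rows, and every function name are hypothetical stand-ins and do not correspond to any Google Cloud API. The only difference between the paradigms shown is where the transformation happens relative to the load.

```python
# Hypothetical sample data; a real pipeline would read from files or a database.
raw_rows = [
    {"user": "a", "amount": "10.5"},
    {"user": "b", "amount": "3"},
    {"user": "a", "amount": "4.5"},
]


def transform(rows):
    """Cast string amounts to float and total them per user."""
    totals = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0.0) + float(row["amount"])
    return totals


def extract_load(rows, warehouse):
    """EL: the data is usable as-is, so it is loaded unchanged."""
    warehouse["raw"] = list(rows)


def extract_load_transform(rows, warehouse):
    """ELT: load the raw data first, then transform it inside the warehouse."""
    warehouse["raw"] = list(rows)
    warehouse["totals"] = transform(warehouse["raw"])


def extract_transform_load(rows, warehouse):
    """ETL: transform in the pipeline and load only the result."""
    warehouse["totals"] = transform(rows)
```

In this toy version, ELT keeps the raw rows alongside the derived totals, while ETL stores only the totals. That trade-off (keeping raw data available for later re-processing versus loading only cleaned results) is one common way to think about choosing between them.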

Want to learn more? Join us for a special webinar, Data Engineering, Big Data, and Machine Learning 2.0, on Feb 21 at 9:00 AM PST with Lak Lakshmanan, Head of Google Cloud Data Analytics and AI Solutions. We will go over what this learning path has to offer, demonstrate hands-on labs, and answer any questions you have. Just for attending the webinar, you will also receive special discounts on training. Register today!
