Scheduling and sampling arrive for Google Cloud Dataprep
By Eric Anderson, Product Manager
Google Cloud Dataprep, which has been available to the public in a beta release for just a month, had its first public update on Thursday. Included are a fresh UI, job scheduling, and richer sampling options. Let’s take a look at each of them.
Throughout our early releases, users’ most common request has been Flow scheduling. As of Thursday’s release, Flows can be scheduled with minute granularity at any frequency. When a Flow schedule executes, any designated Datasets are published. You can even specify a different publishing destination for scheduled runs than the one used for manual execution (development).
A fresh user interface
Cloud Dataprep is easy to explain because people understand its value almost immediately upon seeing it. In part, that’s because the pain of data preparation is almost universally known, but also because the visual experience of Dataprep is intuitive. That said, there is a world of functionality and expressiveness within Dataprep that may not have been immediately apparent, until today.
With this release, new users of Dataprep are greeted with a preloaded sample dataset, a step-by-step in-product walkthrough, and videos to guide the way. If you haven’t tried Dataprep yet, now’s a good time. If you have tried Dataprep, you’ll notice a reorganized and updated visual interface, as shown here:
Finally, power users shared that they wanted more expressive sampling options. Consider a dataset with lots of mistakes. A simple top-of-file sample is unlikely to include all of those errors. As such, they may go untreated and end up in your published datasets. In that case, you might use the new stratified sampling technique to ensure that every distinct value of a column is represented in the sample.
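To make the difference concrete, here is a minimal sketch in plain Python of the general idea behind stratified sampling. It is not Dataprep's implementation or API; the function names (`top_of_file_sample`, `stratified_sample`), the `per_stratum` parameter, and the toy `state` column are all assumptions made up for illustration.

```python
import random
from collections import defaultdict


def top_of_file_sample(rows, n):
    """Take the first n rows, as a simple top-of-file sample would."""
    return rows[:n]


def stratified_sample(rows, key, per_stratum, seed=0):
    """Take up to per_stratum rows for every distinct value of rows[key].

    Hypothetical helper for illustration only: it groups rows by the value
    of one column (the strata) and samples from each group, so rare values
    are still represented.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)

    sample = []
    for members in strata.values():
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Toy dataset: 95 clean rows followed by three rows with inconsistent spellings.
rows = [{"id": i, "state": "CA"} for i in range(95)]
rows += [
    {"id": 95, "state": "Calif."},
    {"id": 96, "state": "ca"},
    {"id": 97, "state": "C.A."},
]

head_values = {r["state"] for r in top_of_file_sample(rows, 10)}
strat_values = {r["state"] for r in stratified_sample(rows, "state", 3)}

print(sorted(head_values))
print(sorted(strat_values))
```

In this toy case, the first ten rows contain only the clean spelling, so the three inconsistent spellings would go unnoticed. The stratified version returns at least one row for each spelling, which is what lets you spot and fix them before publishing.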
To experience all of the above for yourself, head over to Cloud Dataprep right now and get started. Meanwhile, we are already hard at work on the next release and can’t wait to share what’s next.