Google Cloud Big Data and Machine Learning Blog

Innovation in data processing and machine learning technology

Queue your questions: common queries from Google Cloud customers

Thursday, May 3, 2018
Barrett Williams, Big Data and Machine Learning Editor, Google Cloud

In March, I attended Gartner’s Data and Analytics Summit in Grapevine, Texas to get a better sense of the questions our users are asking as they move their workloads to the cloud. Here are answers to the questions we heard most frequently on the show floor, gathered with help from my colleagues in the office as well as during and after the event. I hope you’ll find them useful! And if this set of Q-and-As doesn’t address a question that’s been on your mind, feel free to reach out on Twitter.

Storing data and keeping compliant

Question: I support a lot of teams who use their own discrete data silos, and I’d like to access and ultimately store their contents in a centralized location. What is your data warehouse solution? And can it scale?

BigQuery is a great choice because it’s a scalable, efficient, and exceptionally fast data warehouse that enables you to run complex queries on massive datasets in a matter of seconds. Ingest all of your data, whether batch or streaming, from your business applications or IoT sensors with Cloud IoT Core, Cloud Pub/Sub, and Cloud Dataflow. You can eliminate many data silos by centrally storing structured, semi-structured, and unstructured data with similar processes in Google Cloud Storage. You can then create advanced experiments with Cloud Datalab, Cloud Machine Learning Engine, and TensorFlow. If you are using BigQuery as a long-term storage service, feel free to check out our pricing rubric here.

Question: I work for a government agency, so compliance is a major consideration for me. What are your government certifications?

We have announced several international ISO security certifications, and for US federal and state agencies, we provide FedRAMP as our primary government regulatory compliance certification. As we announced in March, in multiple regions we now provide offerings that are certified to meet the security, monitoring, and audit needs of certain federal agencies.

Question: I work in the healthcare space, and our services need to safely and securely store sensitive patient information. Are your cloud services HIPAA compliant?

Yes. As we announced this past March at HIMSS, we’ve extended our HIPAA compliance to more GCP products, including machine learning APIs and IoT Core. This compliance enables hospital groups, insurers, and other users of health-related data to store sensitive patient information in the cloud without having to maintain legacy, on-premises systems.

Machine learning

Question: What are some ways to start experimenting with machine learning?

Over the past several months, I’ve discovered a variety of useful options for customers who want to get started with machine learning. My favorite would have to be Google’s Machine Learning Crash Course. It first introduces TensorFlow from an estimation or prediction perspective, rather than through image classification or object detection. You start by building a recommendation engine, and then explore the ways in which machine learning can be applied to image recognition.

The ML Engine sample code and TensorFlow tutorials are also a great place to start if you already understand what you want to achieve, as well as some of the relevant terminology. But do keep in mind that although TensorFlow has gained widespread industry attention, you can also train, estimate, and classify with other frameworks, such as scikit-learn and XGBoost.

Question: How do I architect data collection and ML-powered analytics from remote IoT devices?

You can build a solution from Cloud IoT and Cloud Dataflow. Here is an example of a solution built for the oil and gas industry to monitor remote wells. The same approach can be applied to a variety of remote embedded devices that typically generate and distribute sensor data, including wearable devices and cloud-connected cameras.
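Whatever services you choose, the processing step in a pipeline like this often comes down to grouping sensor readings per device into fixed time windows and flagging the windows that look abnormal. The sketch below illustrates that idea in plain Python. It is not the Cloud Dataflow API, and the device names, the 60-second window, and the pressure threshold are all hypothetical values chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

WINDOW_SECONDS = 60      # hypothetical window size
PRESSURE_LIMIT = 100.0   # hypothetical alert threshold


def window_averages(readings, window_seconds=WINDOW_SECONDS):
    """Average (device_id, timestamp_seconds, value) readings per fixed window."""
    buckets = defaultdict(list)
    for device_id, timestamp, value in readings:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        buckets[(device_id, window_start)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def alerts(averages, limit=PRESSURE_LIMIT):
    """Return sorted (device_id, window_start) keys whose average exceeds the limit."""
    return sorted(key for key, avg in averages.items() if avg > limit)


# Made-up readings from two hypothetical wells.
readings = [
    ("well-1", 5, 90.0),
    ("well-1", 42, 94.0),
    ("well-1", 61, 120.0),
    ("well-2", 10, 80.0),
]
averages = window_averages(readings)
print(alerts(averages))
```

In a managed pipeline, the windowing and aggregation would be handled by the streaming service rather than by an in-memory dictionary, but the shape of the computation is similar.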

Question: Can I use machine learning frameworks other than TensorFlow?

Yes, we recently announced support for scikit-learn and XGBoost in Cloud ML Engine, including an online prediction beta. Of course you can install whichever frameworks you’d like to use in a Compute Engine instance, so that workflow affords ultimate flexibility for training and classifying with the frameworks and libraries of your choice.

Enterprise search options

Question: How do I deploy intranet search within my org if I use Google Cloud Platform?

Google Cloud has announced a partnership with Elastic to provide internal search services. You can deploy your own Elasticsearch server on App Engine or Compute Engine, or set up managed services with Elastic. Here is a more detailed description of these offerings.

Bootstrap on a shoestring

Question: How much can I really get done with your free trial? How close will I get to a proof-of-concept?

Lots. You get 1 terabyte of BigQuery queries, with 10 GB of data storage per month, 1,000 units of detection on the Vision API, 60 minutes of recognition on the Speech API, and 5,000 API responses from Google Cloud Natural Language. With these tools alone, you might be able to combine textual and audio customer feedback and run sentiment analysis, for example. Furthermore, if you’re a first-time user, you can sign up for a $300 credit. The details are here.
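To judge how far a proof-of-concept will get, it can help to compare your planned usage against the quotas quoted above. The following is a minimal sketch of that arithmetic. The quota figures are copied from this post and may differ from current limits, and the resource names and the example workload are hypothetical.

```python
# Free-tier quotas as quoted in this post (current limits may differ).
FREE_QUOTAS = {
    "bigquery_query_tb": 1.0,
    "bigquery_storage_gb": 10.0,
    "vision_units": 1000,
    "speech_minutes": 60,
    "natural_language_responses": 5000,
}


def quota_overages(planned_usage):
    """Return {resource: amount over quota} for each resource that exceeds its quota."""
    unknown = set(planned_usage) - set(FREE_QUOTAS)
    if unknown:
        raise KeyError(f"unknown resources: {sorted(unknown)}")
    return {
        name: amount - FREE_QUOTAS[name]
        for name, amount in planned_usage.items()
        if amount > FREE_QUOTAS[name]
    }


# Hypothetical proof-of-concept: 90 minutes of recorded calls,
# 3,000 written reviews, and 0.2 TB of queries.
poc = {
    "speech_minutes": 90,
    "natural_language_responses": 3000,
    "bigquery_query_tb": 0.2,
}
print(quota_overages(poc))  # {'speech_minutes': 30}
```

In this made-up workload only the audio transcription exceeds the quoted allowance, which is the kind of gap the sign-up credit could cover.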

Summary

We know that every customer’s needs in the cloud are a little different, and we’ll continue to launch new products and improve existing ones to help meet those needs. We hope this helps clarify some common questions we've heard from our customers. Learn more here or consider signing up for a free trial.
