Many options for many scenarios
Lift and shift Hadoop clusters
Rapidly migrate your existing Hadoop and Spark deployment as is to Google Cloud without re-architecting. Take advantage of Google Cloud's fast and flexible infrastructure as a service, Compute Engine, to provision your ideal Hadoop cluster and keep using your existing distribution. Let your Hadoop administrators focus on making clusters useful, not on server procurement and hardware troubleshooting.
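A lift-and-shift migration starts with provisioning VMs sized like your on-premises nodes. As a minimal sketch (the instance names, zone, machine type, and disk size below are assumptions, not recommendations), worker nodes for a self-managed cluster could be created with the gcloud CLI:

```shell
# Hypothetical sketch: provision three worker VMs for a self-managed
# Hadoop cluster on Compute Engine. Names, zone, machine type, and
# disk size are placeholder assumptions; size them to match your nodes.
for i in 1 2 3; do
  gcloud compute instances create "hadoop-worker-${i}" \
    --zone=us-central1-a \
    --machine-type=n2-standard-8 \
    --image-family=debian-12 \
    --image-project=debian-cloud \
    --boot-disk-size=500GB
done
```

Your existing Hadoop distribution's installer can then be run on these VMs exactly as it would on physical servers.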
Optimize for cloud scale and efficiency
Drive down Hadoop costs by migrating to Google Cloud's managed Hadoop and Spark service, Cloud Dataproc. Explore new approaches to processing data in the Hadoop ecosystem: separate storage and compute by using Cloud Storage, and adopt the practice of on-demand ephemeral clusters.
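Because data lives in Cloud Storage rather than in HDFS, a cluster only needs to exist while a job runs. A hedged sketch of the ephemeral-cluster pattern (the cluster name, region, bucket, jar, and main class are illustrative assumptions):

```shell
# Hypothetical sketch of an ephemeral Dataproc cluster lifecycle:
# create, run a single Spark job against Cloud Storage, then delete.
# Cluster name, region, bucket paths, and job class are assumptions.
gcloud dataproc clusters create ephemeral-etl \
  --region=us-central1 \
  --num-workers=2

gcloud dataproc jobs submit spark \
  --cluster=ephemeral-etl \
  --region=us-central1 \
  --class=com.example.Etl \
  --jars=gs://my-bucket/jars/etl.jar \
  -- gs://my-bucket/input/ gs://my-bucket/output/

# Deleting the cluster stops all compute billing; the data persists
# in Cloud Storage independently of any cluster.
gcloud dataproc clusters delete ephemeral-etl \
  --region=us-central1 --quiet
```

Since nothing is stored on the cluster itself, deleting it loses no data, which is what makes the on-demand pattern safe.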
Modernize your data processing pipeline
Reduce your Hadoop operational overhead by considering cloud-managed services that remove complexity from how you process data. For streaming analytics, explore a serverless option like Cloud Dataflow to handle real-time streaming data. For analytics-focused Hadoop use cases built on SQL-compatible tools like Apache Hive, consider BigQuery, Google's enterprise-scale serverless cloud data warehouse.
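Queries that run as HiveQL on a Hadoop cluster can typically be expressed in standard SQL against BigQuery, with no cluster to manage. A minimal sketch using the bq command-line tool (the project, dataset, and table names are assumptions for illustration):

```shell
# Hypothetical sketch: a Hive-style daily aggregation expressed as a
# standard-SQL BigQuery query via the bq CLI. The project, dataset,
# and table names are placeholder assumptions.
bq query --use_legacy_sql=false '
  SELECT event_date, COUNT(*) AS events
  FROM `my_project.analytics.clickstream`
  GROUP BY event_date
  ORDER BY event_date'
```

The serverless model means capacity planning, tuning, and cluster upgrades disappear from the workflow entirely.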
"With Cloud Dataproc, we implemented autoscaling, which enabled us to increase or reduce the clusters easily depending on the size of the project. On top of that, we also used preemptible nodes for parts of the clusters, which helped us with cost efficiency."
Orit Yaron, VP of Cloud Platform, Outbrain
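The two techniques mentioned in the quote map directly onto Dataproc features: an autoscaling policy attached at cluster creation, and preemptible secondary workers. A hedged sketch of how that combination could look (the policy file, cluster name, region, and worker count are assumptions, not Outbrain's actual configuration):

```shell
# Hypothetical sketch: import an autoscaling policy and create a
# Dataproc cluster that uses it together with preemptible secondary
# workers. Policy file, names, region, and counts are assumptions.
gcloud dataproc autoscaling-policies import scale-policy \
  --region=us-central1 \
  --source=scale-policy.yaml

gcloud dataproc clusters create analytics-cluster \
  --region=us-central1 \
  --autoscaling-policy=scale-policy \
  --secondary-worker-type=preemptible \
  --num-secondary-workers=4
```

Preemptible workers cost significantly less than standard VMs but can be reclaimed at any time, so they suit stateless compute capacity rather than nodes holding data.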