Google Cloud Platform for Data Center Professionals: Management

This article discusses Google Cloud Platform's management and operations offerings and how they compare to traditional data center technologies. The article focuses on the following service types:

  • Identity and access management
  • Logging (including audit logging)
  • Monitoring
  • Resource provisioning

Identity and access management

When you move your workloads to Cloud Platform, you can continue to manage your end users by using common user management services and tools like LDAP, Active Directory, or G Suite's Admin service. However, the way you manage access to networked resources diverges slightly from what you might be used to.

On Cloud Platform, you configure access to your virtual resources by using Google Cloud IAM. Cloud IAM does not directly manage identities. Instead, it allows you to assign roles, which are collections of access permissions, to several Google identity types. Supported types include:

  • Google accounts
  • Service accounts
  • Google Groups
  • G Suite domains

You can assign roles to non-Google identity types as well. To do so, you bind the non-Google type to a Google identity type, and then assign the role to the Google identity type. See Best Practices for Enterprise Organizations for more information.
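
As a rough illustration of the role-to-identity model described above, the following Python sketch builds a policy as plain data. The `bindings`, `role`, and `members` layout and the member prefixes follow the Cloud IAM policy format; the email addresses, the project name, and the `add_binding` helper are hypothetical, and nothing here calls a Google API.

```python
import json


def add_binding(policy, role, member):
    """Add `member` to the binding for `role`, creating the binding if needed."""
    for binding in policy["bindings"]:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            return policy
    policy["bindings"].append({"role": role, "members": [member]})
    return policy


# Hypothetical identities; the prefix names the Google identity type.
policy = {"bindings": []}
add_binding(policy, "roles/viewer", "user:alice@example.com")
add_binding(policy, "roles/viewer", "group:ops-team@example.com")
add_binding(
    policy,
    "roles/logging.viewer",
    "serviceAccount:my-app@my-project.iam.gserviceaccount.com",
)

print(json.dumps(policy, indent=2))
```

Each binding pairs one role with a list of members, so granting an existing role to another identity only extends that list.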

Service accounts

When you run an application in a traditional data center, you typically create a separate identity that your application uses while running. Your application then performs operations requiring authentication, such as API calls, under that identity.

You can use the same model on Cloud Platform by using service accounts. Service accounts are special accounts that belong to either your application or a Google Compute Engine virtual machine (VM) instance rather than to an individual end user. Your application uses the service account to call the APIs of Google services. You can create custom service accounts for your Cloud Platform project by using the Google Cloud Platform Console, the Cloud IAM API, or the Cloud SDK.

For more information about service accounts, see Service Accounts.

Google Groups

You can use Google Groups to apply a role to a collection of users. You simply create a Google Group, add your Google accounts or service accounts to the group, and then apply your Cloud IAM role to the group.

For more information about Google Groups, see the Google Groups Help Center.
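
To make the effect of a group binding concrete, here is a small local model in Python. Cloud IAM resolves group membership on the server side; the `roles_for` helper, the group, and the addresses below are hypothetical and exist only to show that a role applied to a group reaches every member of that group.

```python
def roles_for(member, bindings, group_members):
    """Return the roles granted to `member` directly or through a group."""
    roles = set()
    for binding in bindings:
        for principal in binding["members"]:
            in_group = member in group_members.get(principal, ())
            if principal == member or in_group:
                roles.add(binding["role"])
    return sorted(roles)


# Hypothetical group with one Google account and one service account.
group_members = {
    "group:log-readers@example.com": {
        "user:alice@example.com",
        "serviceAccount:batch@my-project.iam.gserviceaccount.com",
    },
}

bindings = [
    {"role": "roles/logging.viewer", "members": ["group:log-readers@example.com"]},
    {"role": "roles/viewer", "members": ["user:bob@example.com"]},
]

print(roles_for("user:alice@example.com", bindings, group_members))
```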

Logging

Logging on Cloud Platform is nearly identical to logging in a traditional data center environment. To collect logs, you install a logging agent on each virtual machine. The logging agent logs information about the machine and its applications, and then sends the logs to a centralized location to be indexed. After the logs have been aggregated and indexed, they can be processed, analyzed, or visualized.

Cloud Platform offers a powerful integrated suite of logging-oriented services. In the Cloud Platform stack, Stackdriver Logging serves as the centralized collection and indexing service, aggregating logs from your Compute Engine VM instances and your other Cloud Platform resources. You can search and filter your logs in the Cloud Platform Console's built-in logs viewer, or you can stream your logs to other Cloud Platform endpoints such as Google Cloud Storage, Google BigQuery, and Google Cloud Pub/Sub for further processing and analysis.

Collecting logs

To collect logs from your Compute Engine VM instances, you install the Stackdriver Logging agent on each instance. The agents then automatically start sending their log data to Stackdriver.
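
For a sense of what a collected log record looks like, the sketch below assembles one entry as a Python dictionary. The field names follow the Stackdriver Logging API's LogEntry resource; the `make_log_entry` helper and all of the values passed to it are hypothetical, and the agent itself builds and sends these entries for you.

```python
import json
from datetime import datetime, timezone


def make_log_entry(project_id, log_id, instance_id, zone, message, severity="INFO"):
    """Build a LogEntry-shaped dictionary for a Compute Engine instance."""
    return {
        "logName": "projects/{}/logs/{}".format(project_id, log_id),
        "resource": {
            "type": "gce_instance",
            "labels": {"instance_id": instance_id, "zone": zone},
        },
        "severity": severity,
        "textPayload": message,
        # RFC 3339 timestamp in UTC.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


entry = make_log_entry(
    "my-project", "syslog", "1234567890", "us-central1-a",
    "disk usage above 90%", severity="WARNING",
)
print(json.dumps(entry, indent=2))
```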

If you plan to maintain a hybrid architecture, with some resources on Cloud Platform and others elsewhere, you can still take advantage of Cloud Platform's logging-related services. For example, the Fluentd community plugins page contains a set of plugins for streaming aggregated logs to the Stackdriver Logging API, BigQuery, Cloud Storage, and Cloud Pub/Sub. Similarly, the Logstash output plugins page contains plugins for streaming logs to BigQuery and Cloud Storage.

You can also use Stackdriver Logging if your hybrid architecture includes virtual machines on other clouds. In particular, the Stackdriver Logging agent can be installed directly on Amazon EC2 instances.

Indexing and storing logs

As Stackdriver Logging collects logs from your Compute Engine instances, it stores and indexes the logs. Stackdriver Logging retains logs for 30 days. To store your logs for later processing, analysis, or auditing, you can set up Stackdriver Logging to automatically export the logs to other Cloud Platform services, including:

  • Cloud Storage, Cloud Platform's distributed object storage service. Cloud Storage is a highly available, highly redundant, global storage service. Cloud Platform also provides additional long-term storage classes to optimize costs for infrequently accessed data. For more details, see the Storage article.
  • BigQuery, Cloud Platform's data warehouse service. BigQuery is a fully managed, append-only database that can perform complex queries on billions of rows of data in seconds.
  • Cloud Pub/Sub, Cloud Platform's publisher/subscriber-based messaging service. You can use Cloud Pub/Sub to stream your logs to third-party services or your own custom endpoints.
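
The three export targets listed above can be sketched as sink definitions. The `name`, `destination`, and `filter` fields and the destination prefixes follow the Stackdriver Logging API's LogSink resource; the helper functions and the bucket, dataset, topic, and project names are hypothetical.

```python
def sink_destination(service, **ids):
    """Return the destination string for a log export sink."""
    if service == "storage":
        return "storage.googleapis.com/{bucket}".format(**ids)
    if service == "bigquery":
        return "bigquery.googleapis.com/projects/{project}/datasets/{dataset}".format(**ids)
    if service == "pubsub":
        return "pubsub.googleapis.com/projects/{project}/topics/{topic}".format(**ids)
    raise ValueError("unsupported export service: {}".format(service))


def make_sink(name, destination, log_filter=""):
    """Build a LogSink-shaped dictionary; an empty filter exports every entry."""
    return {"name": name, "destination": destination, "filter": log_filter}


# Hypothetical sinks, one per export target.
sinks = [
    make_sink("archive", sink_destination("storage", bucket="my-log-archive")),
    make_sink(
        "analysis",
        sink_destination("bigquery", project="my-project", dataset="logs"),
        log_filter="severity>=WARNING",
    ),
    make_sink("stream", sink_destination("pubsub", project="my-project", topic="logs")),
]
for sink in sinks:
    print(sink["name"], "->", sink["destination"])
```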

Analyzing and processing logs

In traditional data centers, intensive analysis and processing tasks must compete with other tasks for resources, and are subject to the same upfront capital investment and capacity constraints. On Cloud Platform, these constraints largely go away: you provision and pay for what you need when you need it, so you don't have to worry about reserving time, cores, or storage resources.

Logs analysis and processing on Cloud Platform is designed to be flexible. You can use Cloud Platform's stack of logs analysis and processing tools, which includes:

  • The Stackdriver Logs Viewer
  • BigQuery
  • Google Cloud Dataproc
  • Google Cloud Dataflow

If you prefer to use your current analysis and processing tools, you can do so by setting them up on Compute Engine VM instances. You can also integrate the Cloud Platform logging stack into your current analysis and processing pipelines.

Logs analysis with the Stackdriver Logs Viewer and BigQuery

The Cloud Platform Console provides a built-in Stackdriver Logs Viewer that you can use to search and filter your logged data. For large-scale analysis across a massive dataset, you can stream or export your logs into BigQuery. Whereas a MapReduce job over the same data can take minutes or hours, BigQuery can in some cases query terabytes of logs in tens of seconds, allowing you to quickly identify application anomalies, analyze audit logs, spot trends in your logs, and more.
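
Searching and filtering in the Logs Viewer is driven by filter expressions. The sketch below composes one in Python; the comparison and `:` (substring match) operators follow the Stackdriver Logging filter syntax, while the `build_filter` helper and the example values are hypothetical.

```python
def build_filter(resource_type, min_severity=None, text=None):
    """Compose a Stackdriver Logging filter from a few optional clauses."""
    clauses = ['resource.type="{}"'.format(resource_type)]
    if min_severity:
        clauses.append("severity>={}".format(min_severity))
    if text:
        # ':' matches entries whose text payload contains the given substring.
        clauses.append('textPayload:"{}"'.format(text))
    return " AND ".join(clauses)


print(build_filter("gce_instance", min_severity="ERROR", text="timeout"))
```

The same expression can typically be reused as the filter of an export sink, so that only matching entries reach BigQuery.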

Logs processing with Cloud Dataproc and Cloud Dataflow

If you need to process your logs before analysis, you can export them to Cloud Pub/Sub or Cloud Storage, and then use Cloud Dataproc or Cloud Dataflow to process them. After you process your logs with Cloud Dataproc or Cloud Dataflow, you can send the results to BigQuery for interactive or batch analysis.
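
The kind of preprocessing described above often amounts to parsing raw lines into structured rows. The following is a plain-Python stand-in for that step, not a Cloud Dataproc or Cloud Dataflow job: it assumes web server logs in Common Log Format, and the `parse_line` and `to_rows` helpers and the sample lines are hypothetical. In practice the same logic would sit inside a Spark or Apache Beam transform.

```python
import re

# Common Log Format: host ident user [time] "method path protocol" status size
LINE_RE = re.compile(
    r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+|-)'
)


def parse_line(line):
    """Parse one access-log line into a row, or return None if malformed."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    ip, timestamp, method, path, status, size = match.groups()
    return {
        "ip": ip,
        "timestamp": timestamp,
        "method": method,
        "path": path,
        "status": int(status),
        "bytes": 0 if size == "-" else int(size),
    }


def to_rows(lines):
    """Keep only the lines that parse cleanly."""
    rows = (parse_line(line) for line in lines)
    return [row for row in rows if row is not None]


sample = [
    '203.0.113.7 - - [10/Oct/2016:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    "this line is not an access-log entry",
    '203.0.113.9 - - [10/Oct/2016:13:55:40 -0700] "POST /login HTTP/1.1" 500 -',
]
rows = to_rows(sample)
print(len(rows), "rows parsed")
```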

Cloud Dataproc is Cloud Platform’s managed Apache Hadoop and Apache Spark service. If you're already using Hadoop and Spark, you can reuse your existing jobs on Cloud Dataproc without worrying about acquiring hardware resources ahead of time. Because you can store both the original and the processed logs in Cloud Storage, you can also shut down your Cloud Dataproc cluster when not in use. Cloud Dataproc clusters charge you only for the virtual CPU resources you use, and only for the length of time you use them.

You might also consider Cloud Dataflow, a fully managed stream and batch processing service. Based on Apache Beam, Cloud Dataflow is a true serverless solution. Cloud Dataflow dynamically provisions and allocates resources on demand, minimizing latency while maintaining high utilization efficiency. For more information about how you can integrate Cloud Dataflow into your logs processing pipeline, see Processing Logs at Scale with Cloud Dataflow.

Visualizing logs

Google provides two managed services that you can use to visualize your logs data:

  • Cloud Datalab, built on the Jupyter notebook model, allows you to query and visualize data stored in BigQuery and Cloud Storage. As a bonus, because Cloud Datalab is built on Jupyter notebooks, there is a large ecosystem already in place to help get you started.
  • Google Data Studio 360 supports creating customizable and shareable reports, and is backed by BigQuery.

In addition, several Cloud Platform partner visualization services provide native connectors to BigQuery, including Tableau, BIME, and re:dash. See BigQuery Partners for more information.

Finally, for maximum customization, you can build out a dashboard on top of BigQuery using the BigQuery API and a JavaScript charting library of your choice. See Creating a BigQuery Dashboard for details.

Costs

Stackdriver Logging offers a Basic Tier, which is free up to 5 GB per month, and a Premium Tier. The Premium Tier is priced per monitored resource per month, prorated hourly. For more information, see the Announcing pricing for Google Stackdriver blog post.

If you export logs to Cloud Storage, BigQuery, or Cloud Pub/Sub, you will also be charged for the use of those services. If you export your logs to BigQuery, you will also be charged a small data-streaming fee. See BigQuery Pricing for more information.

Auditing

Cloud Platform provides built-in audit logging, and makes it easy to integrate your current audit-logging solutions as well. For information on which services currently generate audit logs, see Google Cloud Audit Logging.

VM instance logging

If you use standard OS-native audit-logging solutions, such as syslog on Linux or Windows Event Log on Windows, you can set them up by creating a Linux or Windows VM instance on Compute Engine. You can also deploy your preferred third-party solutions on Compute Engine instances.

Cloud Platform resource logging

Cloud Platform also provides built-in audit logging for its own resources. Two types of audit logs are available:

  • Admin Activity audit logs, which contain an entry for every API call or administrative action that modifies the configuration or metadata of a service or project. This log type is always enabled.
  • Data Access audit logs, which contain an entry for each instance of the following events:

    • An API call or administrative action that reads the configuration or metadata of a service or project.
    • An API call or administrative action that creates, modifies, or reads user-provided data managed by a service, such as data stored in a database service.

In most Cloud Platform services, Data Access audit logs are not enabled by default, as they can have a much higher volume than Admin Activity logs. Data Access logs are also more restricted than Admin Activity logs. By default, only project owners and users with the Private Logs Viewer role can access Data Access logs. Admin Activity logs are visible to all project members.
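
The two audit log types can be told apart by log name. The sketch below assumes the `cloudaudit.googleapis.com/activity` and `cloudaudit.googleapis.com/data_access` log IDs, which appear URL-encoded in a full log name; the `audit_log_type` helper and the project name are hypothetical.

```python
from urllib.parse import unquote

AUDIT_LOG_IDS = {
    "cloudaudit.googleapis.com/activity": "admin_activity",
    "cloudaudit.googleapis.com/data_access": "data_access",
}


def audit_log_type(log_name):
    """Classify a full log name as an audit log type, or None if it is neither."""
    # The log ID follows '/logs/' and has its slash encoded as %2F.
    log_id = unquote(log_name.rsplit("/logs/", 1)[-1])
    return AUDIT_LOG_IDS.get(log_id)


print(audit_log_type("projects/my-project/logs/cloudaudit.googleapis.com%2Factivity"))
```

A check like this is useful when exported logs share one destination and the more restricted Data Access entries need separate handling.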

Stackdriver Logging allows you to manage access to your audit logs through Cloud IAM, Cloud Platform's identity and access management layer. For more information, see the Identity and access management section.

As with other Stackdriver Logging logs, audit logs are retained for 30 days by default. You can export your audit logs to BigQuery, Cloud Storage, and Cloud Pub/Sub if you want to retain them for a longer period of time.

Integrating with third-party services

Several third-party logging solutions, such as Splunk, provide add-ons that integrate with Cloud Platform's audit logging service.

Monitoring

As with logging, monitoring in a cloud environment uses a model common to data center environments. You install a monitoring agent on the virtual machines you would like to monitor. This monitoring agent then sends metrics to a centralized location. From there, you can define alerting configurations and create dashboards to visualize your metrics in real time.

For monitoring tasks, Cloud Platform provides Stackdriver Monitoring, a full-featured monitoring framework. As with logging, you can also use your current monitoring tools and services, such as Splunk, DataDog, the Elastic/ELK stack, Sensu, or Nagios.

Metric collection

To collect metrics from your Compute Engine VM instances, you install the Stackdriver Monitoring agent on each instance. The agents then automatically send metrics to Stackdriver Monitoring.

By default, Stackdriver Monitoring monitors machine resources for each instance, such as CPU load and network I/O. With a small amount of additional configuration, Stackdriver Monitoring can also monitor a number of common third-party applications, including the Apache Web Server, MongoDB, NGINX, Redis, and Varnish. See Monitoring Third-party Applications for more information.

In addition to collecting metrics from Compute Engine VM instances, Stackdriver Monitoring automatically collects metrics from several other Cloud Platform services. The Stackdriver Monitoring agent can also be installed directly on Amazon EC2 instances, and Stackdriver Monitoring can be configured to collect metrics from many Amazon Web Services (AWS) services as well. For a complete list of metrics available in Stackdriver Monitoring, see Metrics List.

You can also create custom metrics, and then instrument your applications to send them to Stackdriver Monitoring through the Monitoring API.
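
As a sketch of what an application might send for a custom metric, the following Python builds a request body shaped like the Monitoring API's time series format. The `custom.googleapis.com/` metric prefix and the `metric`, `resource`, and `points` fields follow that API; the `make_time_series` helper, the metric name, and the resource values are hypothetical, and the code only constructs the payload without sending it.

```python
import json
from datetime import datetime, timezone


def make_time_series(metric_name, value, project_id, instance_id, zone):
    """Build a request body holding a single custom-metric data point."""
    now = datetime.now(timezone.utc).isoformat()
    return {
        "timeSeries": [{
            "metric": {"type": "custom.googleapis.com/" + metric_name},
            "resource": {
                "type": "gce_instance",
                "labels": {
                    "project_id": project_id,
                    "instance_id": instance_id,
                    "zone": zone,
                },
            },
            "points": [{
                "interval": {"endTime": now},
                "value": {"doubleValue": float(value)},
            }],
        }]
    }


body = make_time_series("queue_depth", 42, "my-project", "1234567890", "us-central1-a")
print(json.dumps(body, indent=2))
```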

Stackdriver Monitoring retains metrics for six weeks.

Alerting

Stackdriver Monitoring can be configured to send alerts to multiple endpoints, several of which are supported natively.

If you'd like to target an endpoint that isn't natively supported by Stackdriver Monitoring, you can also configure a webhook. See Configuring Webhooks for more information.
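
On the receiving side, a webhook endpoint mostly has to decode a JSON body and act on it. The sketch below shows only that decoding step. The field names it reads (`incident`, `state`, `policy_name`, `summary`) are assumptions about the payload shape and should be checked against Configuring Webhooks; the `summarize_incident` helper and the sample body are hypothetical.

```python
import json


def summarize_incident(body):
    """Turn a webhook request body (bytes or str) into a one-line summary."""
    payload = json.loads(body)
    # Assumed payload layout: {"incident": {...}}; missing fields get defaults.
    incident = payload.get("incident", {})
    state = incident.get("state", "unknown")
    policy = incident.get("policy_name", "unknown policy")
    summary = incident.get("summary", "")
    return "[{}] {}: {}".format(state.upper(), policy, summary)


sample_body = json.dumps({
    "incident": {
        "state": "open",
        "policy_name": "High CPU",
        "summary": "CPU usage above 90% for 5 minutes",
    }
})
print(summarize_incident(sample_body))
```

Reading fields with defaults keeps the handler from failing if the payload omits a field or a later version adds new ones.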

Visualization

Like most monitoring frameworks, Stackdriver Monitoring provides a customizable dashboard UI that allows you to visualize events in a meaningful and actionable way. You can create charts that display specific metrics for a given resource type, aggregated metrics, metrics by a given resource ID, and more. In addition, you can view indexed event logs and incident lists.

The Cloud Platform Console also provides visualizations on a per-service basis for common metrics such as CPU, disk usage, and network traffic. As with Stackdriver Logging, you can also use Cloud Datalab to visualize and manipulate your metrics data.

Costs

Stackdriver Monitoring offers a Basic Tier, which is free up to 5 GB per month, and a Premium Tier. The Premium Tier is priced per monitored resource per month, prorated hourly. For more information, see the Announcing pricing for Google Stackdriver blog post.

Resource provisioning

This section discusses ways in which you can provision your virtual resources on Cloud Platform, and the role of versioning in the provisioning process.

VM instance provisioning

Compute Engine includes some built-in features that streamline instance provisioning. You can create machine profiles called instance templates, and then assign them to instance groups to create tens or hundreds of identical instances as needed.

You can automatically scale the number of VM instances within these groups by using Compute Engine's built-in autoscaler. To use the autoscaler, you define the minimum and maximum number of instances that you want to be running at a given time, and then define the metrics against which the autoscaler will create or destroy instances. You can set the autoscaler to scale depending on CPU utilization, load balancer capacity, or your own custom metrics. For more information, see Autoscaling Groups of Instances.
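
To illustrate how a utilization target and the minimum and maximum bounds interact, here is a simplified proportional model in Python. It is not the algorithm that the Compute Engine autoscaler actually uses; the `recommended_size` function and its parameters are hypothetical.

```python
import math


def recommended_size(current_instances, avg_utilization, target_utilization,
                     min_instances, max_instances):
    """Scale the group so average utilization moves toward the target."""
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    # Total load stays the same, so the instance count scales with the ratio.
    desired = math.ceil(current_instances * avg_utilization / target_utilization)
    # Never leave the configured bounds.
    return max(min_instances, min(max_instances, desired))


# 4 instances at 75% CPU with a 50% target.
print(recommended_size(4, 0.75, 0.5, min_instances=2, max_instances=10))
```

The clamp in the last line is why the minimum and maximum matter: they bound cost when load spikes and preserve capacity when load drops.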

General resource provisioning

You can automate the deployment of all of your Cloud Platform resources with Google Cloud Deployment Manager. As with other configuration management tools, such as Puppet and Chef, you specify your resources in a deployment template, and then Deployment Manager uses this template to instantiate and manage your resources. You can specify your deployment template in several formats, including YAML, Python, and Jinja2.
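
A Python deployment template might look roughly like the following. The `GenerateConfig(context)` entry point, `context.env`, `context.properties`, and the `compute.v1.instance` type follow Deployment Manager's Python template conventions; the property names chosen here (`zone`, `machineType`, `sourceImage`), their values, and the `FakeContext` class used to exercise the template locally are hypothetical.

```python
COMPUTE_URL = "https://www.googleapis.com/compute/v1/"


def GenerateConfig(context):
    """Describe a single VM instance from the deployment's properties."""
    project = context.env["project"]
    zone = context.properties["zone"]
    zone_url = COMPUTE_URL + "projects/{}/zones/{}".format(project, zone)
    instance = {
        "name": context.env["name"] + "-vm",
        "type": "compute.v1.instance",
        "properties": {
            "zone": zone,
            "machineType": zone_url + "/machineTypes/" + context.properties["machineType"],
            "disks": [{
                "deviceName": "boot",
                "type": "PERSISTENT",
                "boot": True,
                "autoDelete": True,
                "initializeParams": {
                    "sourceImage": COMPUTE_URL + context.properties["sourceImage"],
                },
            }],
            "networkInterfaces": [{
                "network": COMPUTE_URL + "projects/{}/global/networks/default".format(project),
            }],
        },
    }
    return {"resources": [instance]}


class FakeContext(object):
    """Local stand-in for the context object Deployment Manager passes in."""

    def __init__(self, env, properties):
        self.env = env
        self.properties = properties


config = GenerateConfig(FakeContext(
    env={"project": "my-project", "name": "demo"},
    properties={
        "zone": "us-central1-a",
        "machineType": "n1-standard-1",
        "sourceImage": "projects/debian-cloud/global/images/family/debian-9",
    },
))
print(config["resources"][0]["name"])
```

Because the template is ordinary Python, you can exercise it locally with a stand-in context like this before handing it to Deployment Manager.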

Deployment Manager can also be used in conjunction with the tools you use to automate your machine configurations today, such as Chef, Puppet, Ansible, or Terraform. For an example configuration using Puppet, see Automating Configuration Management with Cloud Deployment Manager and Puppet.

Version control and source repositories

If you prefer to use your current version control solutions, you can do so on Cloud Platform by hosting and running them on Cloud Platform or by connecting to an externally hosted or managed service such as GitHub or Bitbucket.

For users familiar with Git, Google also provides Google Cloud Source Repositories, which are fully featured, private Git repositories hosted on Cloud Platform. You can add a Cloud Source Repository to a local Git repository as a remote, or connect it to a repository hosted on GitHub or Bitbucket. Cloud Source Repositories also provide a source editor that you can use to browse, view, edit, and commit changes to repository files from within the Cloud Platform Console.

Costs

Deployment Manager does not charge for use, but any billable resources you deploy using Deployment Manager will incur charges.

Cloud Source Repositories is free of charge during its beta release.

What's next?

Check out the other articles in the Google Cloud Platform for Data Center Professionals series.
