Google Cloud for Data Center Professionals: Management

This article discusses Google Cloud's management and operations offerings and how they compare to traditional data center technologies. The article focuses on the following service types:

  • Identity and access management
  • Logging (including audit logging)
  • Monitoring
  • Resource provisioning

Identity and access management

When you move your workloads to Google Cloud, you can continue to manage your end users by using common user management services and tools such as LDAP, Active Directory, or the Google Workspace Admin console. However, the way you manage access to networked resources diverges slightly from what you might be used to.

On Google Cloud, you configure access to your virtual resources by using Identity and Access Management (IAM). IAM does not directly manage identities. Instead, it allows you to assign roles, which are collections of access permissions, to several Google identity types. Supported types include:

  • Google accounts, which represent individual end users
  • Service accounts, which represent applications rather than end users
  • Google Groups, which represent named collections of Google accounts and service accounts
  • Google Workspace domains, which represent all of the Google accounts in an organization's Google Workspace account
  • Cloud Identity domains, which represent all of the Google accounts in an organization that does not use Google Workspace services

You can assign roles to non-Google identity types as well. To do so, you bind the non-Google type to a Google identity type, and then assign the role to the Google identity type. See Best Practices for Enterprise Organizations for more information.

Service accounts

When you run an application in a traditional data center, you typically create a separate identity that your application uses while running. Your application then performs operations requiring authentication, such as API calls, under that identity.

You can use the same model on Google Cloud by using service accounts. Service accounts are special accounts that belong to either your application or a Compute Engine virtual machine (VM) instance rather than to an individual end user. Your application uses the service account to call the APIs of Google services. You can create custom service accounts for your Google Cloud project by using the Cloud Console, the IAM API, or the Cloud SDK.
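For example, here is a minimal sketch of creating a service account through the IAM REST API with the Google API Python client. The project ID my-project and account ID my-app are placeholders, and the snippet assumes that Application Default Credentials are configured:

    # Create a custom service account through the IAM API.
    from googleapiclient import discovery

    iam = discovery.build('iam', 'v1')

    response = iam.projects().serviceAccounts().create(
        name='projects/my-project',
        body={
            # Becomes my-app@my-project.iam.gserviceaccount.com.
            'accountId': 'my-app',
            'serviceAccount': {'displayName': 'My application'},
        },
    ).execute()

    print(response['email'])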

For more information about service accounts, see Service Accounts.

Google Groups

You can use Google Groups to apply a role to a collection of users. You create a Google Group, add your Google accounts or service accounts to the group, and then grant your IAM role to the group.
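As a sketch, granting a role to a group through the Resource Manager API looks like the following read-modify-write sequence. The project ID, group address, and role are placeholders:

    # Grant a role to a Google Group by appending an IAM policy binding.
    from googleapiclient import discovery

    crm = discovery.build('cloudresourcemanager', 'v1')

    # Read the current policy, append a binding, and write it back.
    policy = crm.projects().getIamPolicy(
        resource='my-project', body={}).execute()
    policy.setdefault('bindings', []).append({
        'role': 'roles/logging.viewer',
        'members': ['group:team@example.com'],
    })
    crm.projects().setIamPolicy(
        resource='my-project', body={'policy': policy}).execute()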

For more information about Google Groups, see the Google Groups Help Center.

Logging

Logging on Google Cloud is nearly identical to logging in a traditional data center environment. To collect logs, you install a logging agent on each virtual machine. The logging agent logs information about the machine and its applications, and then sends the logs to a centralized location to be indexed. After the logs have been aggregated and indexed, they can be processed, analyzed, or visualized.

Google Cloud offers a powerful integrated suite of logging-oriented services. In the Google Cloud stack, Cloud Logging serves as the centralized collection and indexing service, aggregating logs from your Compute Engine VM instances and your other Google Cloud resources. You can search and filter your logs in the Cloud Console's built-in logs viewer, or you can stream your logs to other Google Cloud endpoints such as Cloud Storage, BigQuery, and Pub/Sub for further processing and analysis.

Collecting logs

To collect logs from your Compute Engine VM instances, you install the Cloud Logging agent on each instance. The agents then automatically start sending their log data to Cloud Logging.
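In addition to using the agent, applications can write and read log entries directly through the Cloud Logging API. The following minimal sketch uses the Cloud Logging client library for Python; the log name my-app-log is a placeholder, and the client is assumed to pick up the project and credentials from the environment:

    from google.cloud import logging

    client = logging.Client()

    # Write a free-text entry to a named log.
    logger = client.logger('my-app-log')
    logger.log_text('Payment service started', severity='INFO')

    # List recent entries at ERROR severity or above.
    for entry in client.list_entries(filter_='severity>=ERROR'):
        print(entry.timestamp, entry.payload)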

If you plan to maintain a hybrid architecture, in which you will maintain some resources on Google Cloud and other resources elsewhere, you can still take advantage of Google Cloud's logging-related services. For example, the Fluentd community plugins page contains a set of plugins for streaming aggregated logs to the Cloud Logging API, BigQuery, Cloud Storage, and Pub/Sub. Similarly, the Logstash output plugins page contains plugins for streaming logs to BigQuery and Cloud Storage.

You can also use Cloud Logging if your hybrid architecture includes virtual machines on other clouds. In particular, the Cloud Logging agent can be installed directly on Amazon EC2 instances.

Indexing and storing logs

As Cloud Logging collects logs from your Compute Engine instances, it stores and indexes the logs. Cloud Logging retains logs for 30 days. To store your logs for later processing, analysis, or auditing, you can set up Cloud Logging to automatically export the logs to other Google Cloud services, including the following (a sketch of a sink configuration appears after the list):

  • Cloud Storage, Google Cloud's distributed object storage service. Cloud Storage is a highly available, highly redundant, global storage service. Google Cloud also provides additional long-term storage classes to optimize costs for infrequently accessed data. For more details, see the Cloud Storage article.
  • BigQuery, Google Cloud's data warehouse service. BigQuery is a fully managed, append-only database that can perform complex queries on billions of rows of data in seconds.
  • Pub/Sub, Google Cloud's publisher/subscriber-based messaging service. You can use Pub/Sub to stream your logs to third-party services or your own custom endpoints.
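As a sketch of configuring an export programmatically, the following uses the Python client library to create a sink that sends entries at WARNING severity or above to a Cloud Storage bucket. The sink and bucket names are placeholders; the bucket must already exist, and the sink's writer identity needs write access to it:

    from google.cloud import logging

    client = logging.Client()

    # A sink pairs a log filter with an export destination.
    sink = client.sink(
        'my-sink',
        filter_='severity>=WARNING',
        destination='storage.googleapis.com/my-log-bucket',
    )
    sink.create()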

Analyzing and processing logs

In traditional data centers, intensive analysis and processing tasks must compete with other tasks for resources, and are subject to the same upfront capital investment and capacity constraints. On Google Cloud, these issues disappear. You provision and pay for what you need when you need it, without reserving time, cores, or storage resources in advance.

Logs analysis and processing on Google Cloud is designed to be flexible. You can use Google Cloud's stack of logs analysis and processing tools, which includes:

  • The Cloud Logging Logs Viewer
  • BigQuery
  • Dataproc
  • Dataflow

If you prefer to use your current analysis and processing tools, you can do so by setting them up on Compute Engine VM instances. You can also integrate the Cloud Logging stack into your current analysis and processing pipelines.

Logs analysis with the Cloud Logging Logs Viewer and BigQuery

The Cloud Console provides a built-in Cloud Logging Logs Viewer that you can use to search and filter your logged data. For large-scale analysis across a massive dataset, you can stream or export your logs into BigQuery. Where a bulky MapReduce job can take minutes or hours to complete, BigQuery can query terabytes of logs in tens of seconds in some cases, allowing you to quickly identify application anomalies, perform audit log analysis, run trend analysis on your logs, and more.
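For illustration, a query like the following counts the most frequent failing URLs in a table of exported request logs. This is a sketch: the dataset and table names, and the exact column layout, depend on how you configured the export:

    from google.cloud import bigquery

    client = bigquery.Client()
    sql = """
        SELECT httpRequest.requestUrl AS url, COUNT(*) AS errors
        FROM `my-project.my_logs.requests`
        WHERE httpRequest.status >= 500
        GROUP BY url
        ORDER BY errors DESC
        LIMIT 10
    """
    for row in client.query(sql).result():
        print(row.url, row.errors)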

Logs processing with Dataproc and Dataflow

If you need to process your logs before analysis, you can export them to Pub/Sub or Cloud Storage, and then use Dataproc or Dataflow to process them. After you process your logs with Dataproc or Dataflow, you can send the results to BigQuery for interactive or batch analysis.

Dataproc is Google Cloud's managed Apache Hadoop and Apache Spark service. If you're already using Hadoop and Spark, you can reuse your existing jobs on Dataproc without worrying about acquiring hardware resources ahead of time. Because you can store both the original and the processed logs in Cloud Storage, you can also shut down your Dataproc cluster when not in use. Dataproc clusters charge you only for the virtual CPU resources you use, and only for the length of time you use them.

You might also consider Dataflow, a fully managed stream and batch processing service. Based on Apache Beam, Dataflow is a true serverless solution. Dataflow dynamically provisions and allocates resources on demand, minimizing latency while maintaining high utilization efficiency. For more information about how you can integrate Dataflow into your logs processing pipeline, see Processing Logs at Scale with Dataflow.
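As an illustration of the shape of such a pipeline, the following Apache Beam sketch reads log lines from Pub/Sub, keeps only lines containing ERROR, and streams them to a BigQuery table. The topic and table names are placeholders; running it on Dataflow requires the DataflowRunner and the usual project and staging options:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | beam.io.ReadFromPubSub(
                topic='projects/my-project/topics/logs')
            | beam.Map(lambda message: message.decode('utf-8'))
            | beam.Filter(lambda line: 'ERROR' in line)
            | beam.Map(lambda line: {'line': line})
            # Assumes the table already exists with one STRING
            # column named "line".
            | beam.io.WriteToBigQuery('my-project:my_logs.errors')
        )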

Visualizing logs

Google provides two managed services that you can use to visualize your logs data:

  • Datalab, built on the Jupyter notebook model, allows you to query and visualize data stored in BigQuery and Cloud Storage. As a bonus, because Datalab is built on Jupyter notebooks, there is a large ecosystem already in place to help get you started.
  • Google Data Studio 360 supports creating customizable and shareable reports, and is backed by BigQuery.

In addition, several Google Cloud partner visualization services provide native connectors to BigQuery, including Tableau, BIME, and re:dash. See BigQuery Partners for more information.

Finally, for maximum customization, you can build out a dashboard on top of BigQuery using the BigQuery API and a JavaScript charting library of your choice. See Creating a BigQuery Dashboard for details.

Costs

Cloud Logging offers a Basic Tier, which is free up to 5 GB per month, and a Premium Tier. The Premium Tier is priced per monitored resource per month, prorated hourly.

If you export logs to Cloud Storage, BigQuery, or Pub/Sub, you are also charged for the use of those services. Exports to BigQuery additionally incur a small data-streaming fee. See BigQuery Pricing for more information.

Auditing

Google Cloud provides built-in audit logging, and makes it easy to integrate your current audit-logging solutions as well. For information on which services currently generate audit logs, see Cloud Audit Logs.

VM instance logging

Standard OS-native audit-logging solutions, such as syslog on Linux or the Windows Event Log, work on Compute Engine VM instances just as they do on machines in your data center. You can also deploy your preferred third-party solutions on Compute Engine instances.

Google Cloud resource logging

Google Cloud also provides built-in audit logging for its own resources. There are two types of audit logs:

  • Admin Activity audit logs, which contain an entry for every API call or administrative action that modifies the configuration or metadata of a service or project. This log type is always enabled.
  • Data Access audit logs, which contain an entry for each instance of the following events:

    • An API call or administrative action that reads the configuration or metadata of a service or project.
    • An API call or administrative action that creates, modifies, or reads user-provided data managed by a service, such as data stored in a database service.

In most Google Cloud services, Data Access audit logs are not enabled by default, as they can have a much higher volume than Admin Activity logs. Data Access logs are also more restricted than Admin Activity logs. By default, only project owners and users with the Private Logs Viewer role can access Data Access logs. Admin Activity logs are visible to all project members.
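As a sketch, you can read Admin Activity audit log entries with the same client library and a log-name filter. The project ID is a placeholder; %2F is the URL-encoded slash that audit log names require, and the payload layout shown assumes the AuditLog format:

    from google.cloud import logging

    client = logging.Client()

    audit_filter = (
        'logName="projects/my-project/logs/'
        'cloudaudit.googleapis.com%2Factivity"')

    for entry in client.list_entries(filter_=audit_filter):
        # Audit entries carry an AuditLog payload that includes,
        # among other fields, the API method that was called.
        print(entry.timestamp, entry.payload.get('methodName'))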

Cloud Logging allows you to manage access to your audit logs through IAM, Google Cloud's identity and access management layer. For more information, see the Identity and access management section.

As with other Cloud Logging logs, audit logs are retained for 30 days by default. You can export your audit logs to BigQuery, Cloud Storage, and Pub/Sub if you want to retain them for a longer period of time.

Integrating with third-party services

Several third-party logging solutions, such as Splunk, provide add-ons that integrate with Google Cloud's audit logging service.

Monitoring

As with logging, monitoring in a cloud environment uses a model common to data center environments. You install a monitoring agent on the virtual machines you would like to monitor. This monitoring agent then sends metrics to a centralized location. From there, you can define alerting configurations and create dashboards to visualize your metrics in real time.

For monitoring tasks, Google Cloud provides Cloud Monitoring, a full-featured monitoring framework. As with logging, you can also use your current monitoring tools and services, such as Splunk, Datadog, the Elastic (ELK) stack, Sensu, or Nagios.

Metric collection

To collect metrics from your Compute Engine VM instances, you install the Cloud Monitoring agent on each instance. The agents then automatically send metrics to Cloud Monitoring.

By default, Cloud Monitoring monitors machine resources for each instance, such as CPU load and network I/O. However, with a small amount of additional configuration, Cloud Monitoring can also monitor a number of common third-party applications as well, including the Apache Web Server, MongoDB, NGINX, Redis, and Varnish. See Monitoring Third-party Applications for more information.

In addition to collecting metrics from Compute Engine VM instances, Cloud Monitoring automatically collects metrics from several other Google Cloud services. The Cloud Monitoring agent can also be installed directly on Amazon EC2 instances, and Cloud Monitoring can be configured to collect metrics from many Amazon Web Services (AWS) services as well. For a complete list of metrics available in Cloud Monitoring, see Metrics List.

You can also create custom metrics, and then instrument your applications to send them to Cloud Monitoring through the Monitoring API.
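A minimal sketch of writing one point of a hypothetical custom metric with the Monitoring API v3 Python client follows. The project ID and metric type are placeholders; custom metric types must live under custom.googleapis.com/:

    import time

    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()

    # Describe the series: a custom metric attached to the
    # generic "global" monitored resource.
    series = monitoring_v3.TimeSeries()
    series.metric.type = 'custom.googleapis.com/queue_depth'
    series.resource.type = 'global'

    # Attach a single point stamped with the current time.
    interval = monitoring_v3.TimeInterval(
        {'end_time': {'seconds': int(time.time())}})
    point = monitoring_v3.Point(
        {'interval': interval, 'value': {'int64_value': 42}})
    series.points = [point]

    client.create_time_series(
        name='projects/my-project', time_series=[series])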

Cloud Monitoring retains metrics for six weeks.

Alerting

Cloud Monitoring can be configured to send alerts to multiple endpoints. Natively supported alerting endpoints include:

  • Email
  • SMS
  • Slack
  • HipChat
  • Campfire
  • PagerDuty

If you'd like to target an endpoint that isn't natively supported by Cloud Monitoring, you can also configure a webhook. See Configuring Webhooks for more information.

Visualization

Like most monitoring frameworks, Cloud Monitoring provides a customizable dashboard UI that allows you to visualize events in a meaningful and actionable way. You can create charts that display specific metrics for a given resource type, aggregated metrics, metrics by a given resource ID, and more. In addition, you can view indexed event logs and incident lists.

The Cloud Console also provides visualizations on a per-service basis for common metrics such as CPU, disk usage, and network traffic. As with Cloud Logging, you can also use Datalab to visualize and manipulate your metrics data.

Costs

Cloud Monitoring offers a Basic Tier, which is free up to 5 GB per month, and a Premium Tier. The Premium Tier is priced per monitored resource per month, prorated hourly.

Resource provisioning

This section discusses ways in which you can provision your virtual resources on Google Cloud, and discusses the role of versioning in the provisioning process.

VM instance provisioning

Compute Engine includes some built-in features that streamline instance provisioning. You can create machine profiles called instance templates, and then assign them to instance groups to create tens or hundreds of identical instances as needed.

You can automatically scale the number of VM instances within these groups by using Compute Engine's built-in autoscaler. To use the autoscaler, you define the minimum and maximum number of instances that you want running at a given time, and then choose the signals that the autoscaler uses to add or remove instances. You can set the autoscaler to scale based on CPU utilization, load balancer capacity, or your own custom metrics. For more information, see Autoscaling Groups of Instances.
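As a sketch, attaching a CPU-based autoscaler to an existing managed instance group through the Compute Engine API looks like the following. The project, zone, and group names are placeholders, and the snippet assumes Application Default Credentials:

    from googleapiclient import discovery

    compute = discovery.build('compute', 'v1')

    compute.autoscalers().insert(
        project='my-project',
        zone='us-central1-f',
        body={
            'name': 'web-autoscaler',
            # The managed instance group to scale.
            'target': ('https://www.googleapis.com/compute/v1/'
                       'projects/my-project/zones/us-central1-f/'
                       'instanceGroupManagers/web-group'),
            # Keep 2-10 instances, targeting 60% CPU utilization.
            'autoscalingPolicy': {
                'minNumReplicas': 2,
                'maxNumReplicas': 10,
                'cpuUtilization': {'utilizationTarget': 0.6},
            },
        },
    ).execute()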

General resource provisioning

You can automate the deployment of all of your Google Cloud resources with Cloud Deployment Manager. As with other infrastructure-as-code tools, you specify your resources in a deployment template, and then Cloud Deployment Manager uses this template to instantiate and manage your resources. You can write your deployment templates in several formats, including YAML, Python, and Jinja2.
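To give a feel for the format, here is a minimal sketch of a Deployment Manager template written in Python: a GenerateConfig function that returns the resources to create. The VM name, zone, machine type, and image are placeholders:

    def GenerateConfig(context):
        """Returns a single small VM for Deployment Manager to create."""
        project = context.env['project']
        base = 'https://www.googleapis.com/compute/v1/projects/'
        resources = [{
            'name': 'example-vm',
            'type': 'compute.v1.instance',
            'properties': {
                'zone': 'us-central1-f',
                'machineType': (base + project +
                                '/zones/us-central1-f/machineTypes/f1-micro'),
                'disks': [{
                    'boot': True,
                    'autoDelete': True,
                    'initializeParams': {
                        'sourceImage': (base + 'debian-cloud/global/'
                                        'images/family/debian-11'),
                    },
                }],
                'networkInterfaces': [{
                    'network': base + project + '/global/networks/default',
                }],
            },
        }]
        return {'resources': resources}

You would reference a template like this from a YAML configuration and deploy it with the gcloud deployment-manager command group.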

Cloud Deployment Manager can also be used in conjunction with the tools you use to automate your machine configurations today, such as Chef, Puppet, Ansible, or Terraform. For an example configuration using Puppet, see Automating Configuration Management with Cloud Deployment Manager and Puppet.

Version control and source repositories

If you prefer to use your current version control solutions, you can do so on Google Cloud by hosting and running them on Google Cloud or by connecting to an externally hosted or managed service such as GitHub or Bitbucket.

For users familiar with Git, Google also provides Cloud Source Repositories, which are fully featured, private Git repositories hosted on Google Cloud. You can add a Cloud Source Repository to a local Git repository as a remote, or connect it to a repository hosted on GitHub or Bitbucket. Cloud Source Repositories also provide a source editor that you can use to browse, view, edit, and commit changes to repository files from within the Cloud Console.

Costs

Cloud Deployment Manager does not charge for use, but any billable resources you deploy using Cloud Deployment Manager will incur charges.

During its beta release, Cloud Source Repositories is available at no cost.

What's next?

Check out the other Google Cloud for Data Center Professionals articles: