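Optimizing networking

Cloud Functions is simple to use, so you can quickly develop code and run it in a serverless environment. At moderate scale, the cost of running functions is low, and optimizing your code might not seem like a high priority. As your deployment scales up, however, code optimization becomes increasingly important.

This document describes how to optimize networking for your functions. Optimizing networking has the following benefits:

  • It reduces the CPU time spent establishing new connections on each function invocation.
  • It reduces the likelihood of running out of connection or DNS quota.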

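Maintaining persistent connections

This section gives examples of how to maintain persistent connections in a function. If a function does not maintain persistent connections, it can quickly exhaust its connection quota.

This section covers the following scenarios:

  • HTTP/S requests
  • Google APIs

HTTP/S requests

The optimized code snippets below show how to maintain persistent connections instead of creating a new connection on every function invocation: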

Node.js

const fetch = require('node-fetch');

const http = require('http');
const https = require('https');

const functions = require('@google-cloud/functions-framework');

const httpAgent = new http.Agent({keepAlive: true});
const httpsAgent = new https.Agent({keepAlive: true});

/**
 * HTTP Cloud Function that caches an HTTP agent to pool HTTP connections.
 *
 * @param {Object} req Cloud Function request context.
 * @param {Object} res Cloud Function response context.
 */
functions.http('connectionPooling', async (req, res) => {
  try {
    // TODO(optional): replace this with your own URL.
    const url = 'https://www.example.com/';

    // Select the appropriate agent based on the URL scheme. Checking the
    // prefix avoids matching 'https' elsewhere in the URL.
    const agent = url.startsWith('https:') ? httpsAgent : httpAgent;

    const fetchResponse = await fetch(url, {agent});
    const text = await fetchResponse.text();

    res.status(200).send(`Data: ${text}`);
  } catch (err) {
    res.status(500).send(`Error: ${err.message}`);
  }
});

Python

import functions_framework
import requests

# Create a global HTTP session (which provides connection pooling)
session = requests.Session()

@functions_framework.http
def connection_pooling(request):
    """
    HTTP Cloud Function that uses a connection pool to make HTTP requests.
    Args:
        request (flask.Request): The request object.
        <http://flask.pocoo.org/docs/1.0/api/#flask.Request>
    Returns:
        The response text, or any set of values that can be turned into a
        Response object using `make_response`
        <http://flask.pocoo.org/docs/1.0/api/#flask.Flask.make_response>.
    """

    # The URL to send the request to
    url = "http://example.com"

    # Process the request
    response = session.get(url)
    response.raise_for_status()
    return "Success!"
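
To make the effect of connection reuse concrete, the following is a minimal local sketch that uses only the Python standard library rather than `requests`. It is not part of the function itself: `CountingServer`, `Handler`, and `count_connections` are hypothetical names introduced for illustration, and the local test server stands in for whatever remote endpoint a function would call. The sketch counts how many TCP connections the server accepts when a client reuses one connection versus opening a new one per request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingServer(ThreadingHTTPServer):
    """Local test server that counts the TCP connections it accepts."""

    daemon_threads = True

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accepted = 0

    def get_request(self):
        # get_request() runs once for every accepted TCP connection.
        conn = super().get_request()
        self.accepted += 1
        return conn


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allows keep-alive on the server side

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_connections(requests_to_send, reuse):
    """Send GET requests to a local server and return connections accepted."""
    server = CountingServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    shared = http.client.HTTPConnection(host, port) if reuse else None
    try:
        for _ in range(requests_to_send):
            conn = shared if reuse else http.client.HTTPConnection(host, port)
            conn.request("GET", "/")
            # Drain the response so the socket can carry the next request.
            conn.getresponse().read()
            if not reuse:
                conn.close()
    finally:
        if shared is not None:
            shared.close()
        server.shutdown()
        server.server_close()
    return server.accepted


if __name__ == "__main__":
    print("reused connection:", count_connections(5, reuse=True))
    print("new connection per request:", count_connections(5, reuse=False))
```

A global `requests.Session` plays the role of the shared connection here: it keeps the underlying socket open between invocations that run on the same function instance.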

Go


// Package http provides a set of HTTP Cloud Functions samples.
package http

import (
	"fmt"
	"net/http"
	"time"

	"github.com/GoogleCloudPlatform/functions-framework-go/functions"
)

var urlString = "https://example.com"

// client is used to make HTTP requests with a 10 second timeout.
// http.Clients should be reused instead of created as needed.
var client = &http.Client{
	Timeout: 10 * time.Second,
}

func init() {
	functions.HTTP("MakeRequest", MakeRequest)
}

// MakeRequest is an example of making an HTTP request. MakeRequest uses a
// single http.Client for all requests to take advantage of connection
// pooling and caching. See https://godoc.org/net/http#Client.
func MakeRequest(w http.ResponseWriter, r *http.Request) {
	resp, err := client.Get(urlString)
	if err != nil {
		http.Error(w, "Error making request", http.StatusInternalServerError)
		return
	}
	// Always close the response body so the connection is not leaked.
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		msg := fmt.Sprintf("Bad StatusCode: %d", resp.StatusCode)
		http.Error(w, msg, http.StatusInternalServerError)
		return
	}
	fmt.Fprintf(w, "ok")
}

PHP

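We recommend using the Guzzle PHP HTTP framework to send HTTP requests, because it handles persistent connections automatically.

Accessing Google APIs

The examples below use Cloud Pub/Sub, but the same approach also works for other client libraries, for example Cloud Natural Language or Cloud Spanner. Note that the performance improvement can depend on the current implementation of the particular client library.

Each time a Pub/Sub client object is created, it opens one connection and makes two DNS queries, so creating the client inside the function incurs that cost on every invocation. To avoid unnecessary connections and DNS queries, create the Pub/Sub client object in global scope, as shown in the following examples: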

Node.js

const functions = require('@google-cloud/functions-framework');
const {PubSub} = require('@google-cloud/pubsub');
const pubsub = new PubSub();

/**
 * HTTP Cloud Function that uses a cached client library instance to
 * reduce the number of connections required per function invocation.
 *
 * @param {Object} req Cloud Function request context.
 * @param {Object} req.body Cloud Function request context body.
 * @param {String} req.body.topic The Cloud Pub/Sub topic to publish to.
 * @param {Object} res Cloud Function response context.
 */
functions.http('gcpApiCall', (req, res) => {
  const topic = pubsub.topic(req.body.topic);

  const data = Buffer.from('Test message');
  topic.publishMessage({data}, err => {
    if (err) {
      res.status(500).send(`Error publishing the message: ${err}`);
    } else {
      res.status(200).send('1 message published');
    }
  });
});

Python

import os

import functions_framework
from google.cloud import pubsub_v1

# Create a global Pub/Sub client to avoid unneeded network activity
pubsub = pubsub_v1.PublisherClient()

@functions_framework.http
def gcp_api_call(request):
    """
    HTTP Cloud Function that uses a cached client library instance to
    reduce the number of connections required per function invocation.
    Args:
        request (flask.Request): The request object.
    Returns:
        The response text, or any set of values that can be turned into a
        Response object using `make_response`
        <http://flask.pocoo.org/docs/1.0/api/#flask.Flask.make_response>.
    """

    """
    The `GCP_PROJECT` environment variable is set automatically in the Python 3.7 runtime.
    In later runtimes, it must be specified by the user upon function deployment.
    See this page for more information:
        https://cloud.google.com/functions/docs/configuring/env-var#python_37_and_go_111
    """
    project = os.getenv("GCP_PROJECT")
    request_json = request.get_json()

    topic_name = request_json["topic"]
    topic_path = pubsub.topic_path(project, topic_name)

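    # Process the request. publish() returns a future; wait on it so the
    # message is sent before the function returns its response.
    data = b"Test message"
    pubsub.publish(topic_path, data=data).result()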

    return "1 message published"
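
The Go sample in this section defers client creation to the first invocation by using `sync.Once`. A comparable lazy-initialization pattern in Python might look like the sketch below. It is an illustration only: `StubClient` is a hypothetical stand-in for an expensive client such as `pubsub_v1.PublisherClient`, and `get_client` and `handle_invocation` are names introduced here, not part of any Google library.

```python
import threading

# Shared client, created at most once per function instance.
_client = None
_client_lock = threading.Lock()


class StubClient:
    """Hypothetical stand-in for an expensive-to-create API client."""

    instances_created = 0

    def __init__(self):
        StubClient.instances_created += 1

    def publish(self, topic, data):
        return f"published {len(data)} bytes to {topic}"


def get_client():
    """Return the shared client, creating it on first use only."""
    global _client
    if _client is None:
        with _client_lock:
            # Re-check inside the lock so that concurrent first calls
            # still create exactly one client.
            if _client is None:
                _client = StubClient()
    return _client


def handle_invocation(topic):
    """Simulate one function invocation that reuses the shared client."""
    return get_client().publish(topic, b"Test message")


if __name__ == "__main__":
    for _ in range(3):
        print(handle_invocation("my-topic"))
    print("clients created:", StubClient.instances_created)
```

Compared with creating the client at import time, this variant only pays the connection cost on instances that actually use the client, at the price of a slower first invocation.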

Go


// Package contexttip is an example of how to use Pub/Sub and context.Context in
// a Cloud Function.
package contexttip

import (
	"context"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"os"
	"sync"

	"cloud.google.com/go/pubsub"
	"github.com/GoogleCloudPlatform/functions-framework-go/functions"
)

// client is a global Pub/Sub client, initialized once per instance.
var client *pubsub.Client
var once sync.Once

// createClient creates the global pubsub Client
func createClient() {
	// GOOGLE_CLOUD_PROJECT is a user-set environment variable.
	var projectID = os.Getenv("GOOGLE_CLOUD_PROJECT")
	// err is pre-declared to avoid shadowing client.
	var err error

	// client is initialized with context.Background() because it should
	// persist between function invocations.
	client, err = pubsub.NewClient(context.Background(), projectID)
	if err != nil {
		log.Fatalf("pubsub.NewClient: %v", err)
	}
}

func init() {
	// register http function
	functions.HTTP("PublishMessage", PublishMessage)
}

type publishRequest struct {
	Topic   string `json:"topic"`
	Message string `json:"message"`
}

// PublishMessage publishes a message to Pub/Sub. PublishMessage only works
// with topics that already exist.
func PublishMessage(w http.ResponseWriter, r *http.Request) {
	// use of sync.Once ensures client is only created once.
	once.Do(createClient)
	// Parse the request body to get the topic name and message.
	p := publishRequest{}

	if err := json.NewDecoder(r.Body).Decode(&p); err != nil {
		log.Printf("json.NewDecoder: %v", err)
		http.Error(w, "Error parsing request", http.StatusBadRequest)
		return
	}

	if p.Topic == "" || p.Message == "" {
		s := "missing 'topic' or 'message' parameter"
		log.Println(s)
		http.Error(w, s, http.StatusBadRequest)
		return
	}

	m := &pubsub.Message{
		Data: []byte(p.Message),
	}
	// Publish and Get use r.Context() because they are only needed for this
	// function invocation. If this were a background function, they would use
	// the ctx passed as an argument.
	id, err := client.Topic(p.Topic).Publish(r.Context(), m).Get(r.Context())
	if err != nil {
		log.Printf("topic(%s).Publish.Get: %v", p.Topic, err)
		http.Error(w, "Error publishing message", http.StatusInternalServerError)
		return
	}
	fmt.Fprintf(w, "Message published: %v", id)
}

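Load testing your function

To measure how many connections your function makes on average, deploy it as an HTTP function and use a performance-testing framework to invoke it at a fixed QPS. One option is Artillery, which you can invoke with a single command: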

$ artillery quick -d 300 -r 30 URL

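This command fetches the given URL at 30 QPS for 300 seconds.

After running the test, check your connection quota usage on the Cloud Functions API quota page in the Cloud console. If the usage is consistently around 30 (or a multiple of 30), the function is establishing one (or several) connections on every invocation. After you optimize your code, you should see only a few connections (10-30), all at the beginning of the test.

You can also compare the CPU cost before and after the optimization in the CPU quota chart on the same page.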
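
The arithmetic behind reading the quota chart can be sketched as follows. This is a rough illustration that assumes the observed connection figure is expressed per second, like the request rate; `total_requests` and `connections_per_invocation` are hypothetical helper names.

```python
def total_requests(duration_s, qps):
    """Number of invocations a fixed-rate load test sends in total."""
    return duration_s * qps


def connections_per_invocation(observed_connections_per_s, qps):
    """Estimate how many connections each invocation opens."""
    return observed_connections_per_s / qps


if __name__ == "__main__":
    # The Artillery command in this section: 300 seconds at 30 QPS.
    print(total_requests(300, 30))
    # Usage that stays near the request rate means one connection per call.
    print(connections_per_invocation(30, 30))
    # Usage near twice the request rate means two connections per call.
    print(connections_per_invocation(60, 30))
```

A result close to zero for most of the test, with connections appearing only at the start, is what a function with working connection reuse should produce.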