对 Cloud Storage 文件的内容进行分类

分析存储在 Google Cloud Storage 中的文件,并返回文档中找到的文本所适用的内容分类列表

代码示例

Go

如需了解如何安装和使用 Natural Language 的客户端库,请参阅 Natural Language 客户端库。 有关详情,请参阅 Natural Language Go API 参考文档

如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证


func classifyTextFromGCS(ctx context.Context, gcsURI string) (*languagepb.ClassifyTextResponse, error) {
	return client.ClassifyText(ctx, &languagepb.ClassifyTextRequest{
		Document: &languagepb.Document{
			Source: &languagepb.Document_GcsContentUri{
				GcsContentUri: gcsURI,
			},
			Type: languagepb.Document_PLAIN_TEXT,
		},
	})
}

Java

如需了解如何安装和使用 Natural Language 客户端库,请参阅 Natural Language 客户端库。 有关详情,请参阅 Natural Language Java API 参考文档

如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

// Instantiate the Language client com.google.cloud.language.v2.LanguageServiceClient
try (LanguageServiceClient language = LanguageServiceClient.create()) {
  // Set the GCS content URI path
  Document doc =
      Document.newBuilder().setGcsContentUri(gcsUri).setType(Type.PLAIN_TEXT).build();
  ClassifyTextRequest request = ClassifyTextRequest.newBuilder().setDocument(doc).build();
  // Detect categories in the given file
  ClassifyTextResponse response = language.classifyText(request);

  for (ClassificationCategory category : response.getCategoriesList()) {
    System.out.printf(
        "Category name : %s, Confidence : %.3f\n",
        category.getName(), category.getConfidence());
  }
}

Node.js

如需了解如何安装和使用 Natural Language 的客户端库,请参阅 Natural Language 客户端库。 有关详情,请参阅 Natural Language Node.js API 参考文档

如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

// Imports the Google Cloud client library.
const language = require('@google-cloud/language').v2;

// Creates a client.
const client = new language.LanguageServiceClient();

/**
 * TODO(developer): Uncomment the following lines to run this code
 */
// const bucketName = 'Your bucket name, e.g. my-bucket';
// const fileName = 'Your file name, e.g. my-file.txt';

// Prepares a document, representing a text file in Cloud Storage
const document = {
  gcsContentUri: `gs://${bucketName}/${fileName}`,
  type: 'PLAIN_TEXT',
};

// Classifies text in the document
const [classification] = await client.classifyText({document});

console.log('Categories:');
classification.categories.forEach(category => {
  console.log(`Name: ${category.name}, Confidence: ${category.confidence}`);
});

PHP

如需了解如何安装和使用 Natural Language 的客户端库,请参阅 Natural Language 客户端库

如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

use Google\Cloud\Language\V1\ClassifyTextRequest;
use Google\Cloud\Language\V1\Client\LanguageServiceClient;
use Google\Cloud\Language\V1\Document;
use Google\Cloud\Language\V1\Document\Type;

/**
 * @param string $uri The cloud storage object to analyze (gs://your-bucket-name/your-object-name)
 */
function classify_text_from_file(string $uri): void
{
    $languageServiceClient = new LanguageServiceClient();

    // Create a new Document, pass GCS URI and set type to PLAIN_TEXT
    $document = (new Document())
        ->setGcsContentUri($uri)
        ->setType(Type::PLAIN_TEXT);

    // Call the analyzeSentiment function
    $request = (new ClassifyTextRequest())
        ->setDocument($document);
    $response = $languageServiceClient->classifyText($request);
    $categories = $response->getCategories();
    // Print document information
    foreach ($categories as $category) {
        printf('Category Name: %s' . PHP_EOL, $category->getName());
        printf('Confidence: %s' . PHP_EOL, $category->getConfidence());
        print(PHP_EOL);
    }
}

Python

如需了解如何安装和使用 Natural Language 的客户端库,请参阅 Natural Language 客户端库。 有关详情,请参阅 Natural Language Python API 参考文档

如需向 Natural Language 进行身份验证,请设置应用默认凭据。如需了解详情,请参阅为本地开发环境设置身份验证

from google.cloud import language_v2


def sample_classify_text(
    gcs_content_uri: str = "gs://cloud-samples-data/language/classify-entertainment.txt",
) -> None:
    """
    Classifies Content in text file stored in Cloud Storage.

    Args:
      gcs_content_uri: Google Cloud Storage URI where the file content is located.
        e.g. gs://[Your Bucket]/[Path to File].
    """

    client = language_v2.LanguageServiceClient()

    # Available types: PLAIN_TEXT, HTML
    document_type_in_plain_text = language_v2.Document.Type.PLAIN_TEXT

    # Optional. If not specified, the language is automatically detected.
    # For list of supported languages:
    # https://cloud.google.com/natural-language/docs/languages
    language_code = "en"
    document = {
        "gcs_content_uri": gcs_content_uri,
        "type_": document_type_in_plain_text,
        "language_code": language_code,
    }

    response = client.classify_text(request={"document": document})
    # Loop through classified categories returned from the API
    for category in response.categories:
        # Get the name of the category representing the document.
        # See the predefined taxonomy of categories:
        # https://cloud.google.com/natural-language/docs/categories
        print(f"Category name: {category.name}")
        # Get the confidence. Number representing how certain the classifier
        # is that this category represents the provided text.
        print(f"Confidence: {category.confidence}")

后续步骤

如需搜索和过滤其他 Google Cloud 产品的代码示例,请参阅 Google Cloud 示例浏览器