Hadoop MapReduce job with Bigtable

Last updated 2025-09-04 UTC.

This example uses [Hadoop](https://hadoop.apache.org/) to perform a simple MapReduce job that
counts the number of times a word appears in a text file. The MapReduce job
uses Bigtable to store the final word counts. The code for
this example is in the GitHub repository
[GoogleCloudPlatform/cloud-bigtable-examples](https://github.com/GoogleCloudPlatform/cloud-bigtable-examples/), in the directory
`java/dataproc-wordcount`.

Set up authentication

To use the Java samples on this page in a local
development environment, install and initialize the gcloud CLI, and
then set up Application Default Credentials with your user credentials.

1. [Install](/sdk/docs/install) the Google Cloud CLI.
2. If you're using an external identity provider (IdP), you must first [sign in to the gcloud CLI with your federated identity](/iam/docs/workforce-log-in-gcloud).
3. If you're using a local shell, then create local authentication credentials for your user account:

   ```bash
   gcloud auth application-default login
   ```

   You don't need to do this if you're using Cloud Shell.

   If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have [signed in to the gcloud CLI with your federated identity](/iam/docs/workforce-log-in-gcloud).

For more information, see
[Set up authentication for a local development environment](/bigtable/docs/authentication#local-development).

Overview of the code sample

The code sample provides a simple command-line interface that takes one or more
text files and a table name as input, finds all of the words that appear in the
files, and counts how many times each word appears. The MapReduce logic appears
in the [`WordCountHBase` class](https://github.com/GoogleCloudPlatform/cloud-bigtable-examples/blob/master/java/dataproc-wordcount/src/main/java/com/example/bigtable/sample/WordCountHBase.java).

First, a mapper tokenizes the text file's contents and generates key-value
pairs, where the key is a word from the text file and the value is `1`:

    public static class TokenizerMapper extends
        Mapper<Object, Text, ImmutableBytesWritable, IntWritable> {

      // Reused for every emitted pair; the value is always 1.
      private final static IntWritable one = new IntWritable(1);

      @Override
      public void map(Object key, Text value, Context context) throws IOException,
          InterruptedException {
        // Split the input value on whitespace.
        StringTokenizer itr = new StringTokenizer(value.toString());
        ImmutableBytesWritable word = new ImmutableBytesWritable();
        while (itr.hasMoreTokens()) {
          // Emit (word, 1), with the word encoded as bytes.
          word.set(Bytes.toBytes(itr.nextToken()));
          context.write(word, one);
        }
      }
    }

A reducer then sums the values for each key and writes the results to the
Bigtable table that you specify. Each row key is a word from the
text file. Each row contains a `cf:count` column, which contains the number of
times the row key appears in the text file.

    public static class MyTableReducer extends
        TableReducer<ImmutableBytesWritable, IntWritable, ImmutableBytesWritable> {

      @Override
      public void reduce(ImmutableBytesWritable key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        int sum = sum(values);
        // The word is the row key; the count goes in the cf:count column.
        Put put = new Put(key.get());
        put.addColumn(COLUMN_FAMILY, COUNT_COLUMN_NAME, Bytes.toBytes(sum));
        // The Put already carries the row key, so the output key is null.
        context.write(null, put);
      }

      // Adds up the 1s that the mapper emitted for this word.
      public int sum(Iterable<IntWritable> values) {
        int i = 0;
        for (IntWritable val : values) {
          i += val.get();
        }
        return i;
      }
    }
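
To see what the mapper and reducer above compute without a Hadoop cluster or a Bigtable instance, the same logic can be approximated in plain Java. The following is a minimal sketch, not code from the sample repository: the class `WordCountSketch` and its methods `mapTokens`, `countWords`, `encodeCount`, and `decodeCount` are hypothetical names introduced for illustration. It assumes that the 4-byte big-endian layout written by `ByteBuffer.putInt` matches what `Bytes.toBytes(int)` stores in the `cf:count` column.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; not part of the sample repository.
final class WordCountSketch {

  // Map step: split a line on whitespace (StringTokenizer's default
  // delimiters). Each returned token stands for one (word, 1) pair.
  static List<String> mapTokens(String line) {
    List<String> tokens = new ArrayList<>();
    StringTokenizer itr = new StringTokenizer(line);
    while (itr.hasMoreTokens()) {
      tokens.add(itr.nextToken());
    }
    return tokens;
  }

  // Reduce step: add up the implicit 1s for each distinct word.
  static Map<String, Integer> countWords(List<String> lines) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String line : lines) {
      for (String word : mapTokens(line)) {
        counts.merge(word, 1, Integer::sum);
      }
    }
    return counts;
  }

  // Encode a count as 4 big-endian bytes (ByteBuffer's default order).
  // Assumption: this matches the bytes that Bytes.toBytes(int) produces.
  static byte[] encodeCount(int count) {
    return ByteBuffer.allocate(Integer.BYTES).putInt(count).array();
  }

  // Decode a 4-byte cell value back into an int.
  static int decodeCount(byte[] cell) {
    return ByteBuffer.wrap(cell).getInt();
  }

  public static void main(String[] args) {
    Map<String, Integer> counts =
        countWords(List.of("the cat sat", "The cat ran."));
    for (Map.Entry<String, Integer> entry : counts.entrySet()) {
      byte[] cell = encodeCount(entry.getValue());
      System.out.println(entry.getKey() + " -> " + decodeCount(cell)
          + " (" + cell.length + " bytes)");
    }
  }
}
```

Two consequences of the sample's design show up in this sketch. Because tokens are split only on whitespace, `The` and `the` become separate row keys, and `ran.` keeps its trailing period. And because the count is written as a binary integer rather than as text, a client that reads `cf:count` has to decode the cell value as an integer instead of treating it as a string.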