Format output from the cbt CLI
This document describes how to format specific types of data stored in
Bigtable rows when displayed by the cbt CLI.
Examples of formatting
Starting with version 0.12.0, the cbt CLI can format
certain complex types of data stored in table rows.
When you use the cbt read or cbt lookup command, the cbt CLI can
"pretty print" values stored in the rows.
The following example shows data output from the cbt CLI without
formatting.
----------------------------------------
r1
  fam1:col1                                 @ 2022/03/09-11:19:45.966000
    "\n\x05Brave\x10\x02"
  fam1:col2                                 @ 2022/03/14-11:17:20.014000
    "{\"name\": \"Brave\", \"age\": 2}"
The following example shows data output from the cbt CLI with
formatting.
r1
  fam1:col1                                 @ 2022/03/09-11:19:45.966000
    name: "Brave"
    age: 2
  fam1:col2                                 @ 2022/03/14-11:17:20.014000
    age:     2.00
    name:   "Brave"
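To see why the unformatted value "\n\x05Brave\x10\x02" is displayed as name: "Brave" and age: 2, the bytes can be decoded by hand. The following Python sketch is an illustration only, not how the cbt CLI is implemented. It handles just the two protocol buffer wire types that the Cat message (defined in cat.proto later in this document) uses, and the helper names decode_varint and decode_fields are hypothetical.

```python
def decode_varint(buf, pos):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def decode_fields(buf):
    """Decode a flat protobuf message into {field_number: raw_value}.

    Only wire type 0 (varint) and wire type 2 (length-delimited) are handled.
    """
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields[field_number] = value
    return fields


raw = b"\n\x05Brave\x10\x02"  # the unformatted fam1:col1 value
fields = decode_fields(raw)
# Field numbers come from cat.proto: name = 1, age = 2.
cat = {"name": fields[1].decode("utf-8"), "age": fields[2]}
print(cat)
```

Without the .proto definition the decoder only knows field numbers 1 and 2, which is why the formatting file has to name the message type for the column.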
Print rows with formatting
To format a column or column family, you must provide a YAML file that
specifies the formatting for that column. When you call cbt lookup or
cbt read, you pass in the path to the YAML file with the format-file
argument. The following snippet shows an example of calling cbt lookup with
the format-file argument supplied.
cbt lookup my-table r1 format-file=/path/to/formatting.yml
Define column data formats in YAML
The formatting YAML file must map column names or column family names to the data types stored within them. The following snippet shows an example of a YAML formatting file.
protocol_buffer_definitions:
  - cat.proto
protocol_buffer_paths:
  - testdata/
columns:
  col1:
    encoding: ProtocolBuffer
    type: Cat
  col2:
    encoding: json
The following snippet shows the contents of 'cat.proto'.
syntax = "proto3";
package cats;
option go_package = "github.com/protocolbuffers/protobuf/examples/go/tutorialpb";
message Cat {
  string name = 1;
  int32 age = 2;
}
Looking at the example:
- The protocol_buffer_definitions field provides a list of .proto files that can contain protocol buffer message types to use for decoding protobuf data.
- The protocol_buffer_paths field provides a list of local paths that can contain .proto files for decoding protocol buffer types. You do not need to specify the locations of standard protocol buffer imports, such as messages in the google/protobuf package.
- The columns field contains a list of column names with the corresponding data types for each column:
  - The col1 column has its encoding set to "ProtocolBuffer" and its type set to "Cat". The cbt CLI interprets and formats all values stored in this column as a Cat proto message type. The type must correspond to a message type defined in one of the .proto files provided for the protocol_buffer_definitions field.
  - The col2 column has its encoding field set to "json". The cbt CLI interprets and formats all values stored in this column as a JSON structure.
Other fields that you can provide:
- default_encoding: This field defines a default formatting for all columns in a table or all columns in a column family.
- default_type: This field defines a default data type for protocol buffer, big-endian, and little-endian encoded columns.
- families: This field defines encodings and types for all columns within a column family. You can provide a default_encoding and default_type for a column family. You can also override these encodings at the column level by providing a columns field that lists columns by name with the appropriate encoding and data types, as shown in the following snippet:
families:
  family1:
    default_encoding: BigEndian
    default_type: INT64
    columns:
      address:
        encoding: PROTO
        type: tutorial.Person
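The way column settings override family defaults can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the cbt CLI's actual lookup logic: the precedence order (column entry, then family defaults, then table-level defaults) is an assumed reading of the description above, the resolve function and the settings dictionary are hypothetical, the table-level default of "Hex" is made up for the example, and the top-level columns field is ignored for brevity.

```python
# Hypothetical in-memory form of a formatting file. The keys mirror the YAML
# fields described above; the table-level "Hex" default is illustrative only.
settings = {
    "default_encoding": "Hex",
    "families": {
        "family1": {
            "default_encoding": "BigEndian",
            "default_type": "INT64",
            "columns": {
                "address": {"encoding": "PROTO", "type": "tutorial.Person"},
            },
        },
    },
}


def resolve(settings, family, column):
    """Return (encoding, type) for a column, most specific setting first.

    Assumed order: column entry, then family defaults, then table defaults.
    """
    fam = settings.get("families", {}).get(family, {})
    col = fam.get("columns", {}).get(column, {})
    encoding = (col.get("encoding")
                or fam.get("default_encoding")
                or settings.get("default_encoding"))
    type_name = (col.get("type")
                 or fam.get("default_type")
                 or settings.get("default_type"))
    return encoding, type_name


print(resolve(settings, "family1", "address"))   # column-level override
print(resolve(settings, "family1", "zip"))       # family defaults
print(resolve(settings, "family2", "anything"))  # table-level default only
```

Under these assumptions, the address column is decoded as a tutorial.Person message while every other column in family1 falls back to big-endian INT64.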
Supported data types
The cbt CLI supports formatting for several complex data types. The following
table lists the supported data types and the strings to provide in the YAML file
for each of the listed types. String values are not case-sensitive.
| Data type | Formatting value for YAML |
|---|---|
| Hexadecimal | Hex, H |
| Big-endian | BigEndian, B |
| Little-endian | LittleEndian, L |
| Protocol buffer | ProtocolBuffer, P, PROTO |
| JSON | JSON, J |
Table 1. Data types supported for formatting in cbt output.
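Because each data type accepts several spellings and case is ignored, a tool that reads the same formatting file would need to normalize the value before using it. The following Python sketch shows one way to do that with the strings from Table 1. The ENCODING_ALIASES mapping and the canonical_encoding function are hypothetical names for illustration and are not part of the cbt CLI.

```python
# Accepted YAML strings from Table 1, keyed in lowercase so that lookups
# are case-insensitive. The canonical names on the right are the table's
# "Data type" column.
ENCODING_ALIASES = {
    "hex": "Hexadecimal", "h": "Hexadecimal",
    "bigendian": "Big-endian", "b": "Big-endian",
    "littleendian": "Little-endian", "l": "Little-endian",
    "protocolbuffer": "Protocol buffer", "p": "Protocol buffer",
    "proto": "Protocol buffer",
    "json": "JSON", "j": "JSON",
}


def canonical_encoding(value):
    """Map a YAML encoding string to its data type name, ignoring case."""
    key = value.strip().lower()
    if key not in ENCODING_ALIASES:
        raise ValueError(f"unknown encoding: {value!r}")
    return ENCODING_ALIASES[key]


print(canonical_encoding("PROTO"))      # Protocol buffer
print(canonical_encoding("json"))       # JSON
print(canonical_encoding("BigEndian"))  # Big-endian
```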
- The hexadecimal encoding is type agnostic. Data are displayed as a raw hexadecimal representation of the stored data.
- The available types for the big-endian and little-endian encodings are
int8, int16, int32, int64, uint8, uint16, uint32, uint64, float32, and float64. The stored data length must be a multiple of the type size, in bytes. Data are displayed as scalars if the stored length matches the type size, or as arrays otherwise. Type names are not case-sensitive.
- The types given for the protocol-buffer encoding must match message types defined in the provided protocol buffer definition files. The types are not case-sensitive. If no type is specified, the type defaults to the name of the column being displayed.
- The formatting values for YAML are not case-sensitive.
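The scalar-versus-array rule for the big-endian and little-endian encodings can be illustrated with Python's standard struct module. This is a sketch of the rule as described above, not the cbt CLI's implementation. The decode_numeric function and the TYPE_CODES mapping are hypothetical names, and the exact way the cbt CLI renders the decoded values is not shown here.

```python
import struct

# struct format characters for the type names listed above (lowercase keys,
# so lookups are case-insensitive).
TYPE_CODES = {
    "int8": "b", "int16": "h", "int32": "i", "int64": "q",
    "uint8": "B", "uint16": "H", "uint32": "I", "uint64": "Q",
    "float32": "f", "float64": "d",
}


def decode_numeric(data, type_name, big_endian=True):
    """Decode bytes as one scalar, or as a list when several values fit."""
    code = TYPE_CODES[type_name.lower()]
    prefix = ">" if big_endian else "<"
    size = struct.calcsize(prefix + code)
    if len(data) % size:
        raise ValueError(
            f"{len(data)} bytes is not a multiple of {size} ({type_name})")
    count = len(data) // size
    values = struct.unpack(f"{prefix}{count}{code}", data)
    # One value is returned as a scalar; anything else as an array.
    return values[0] if count == 1 else list(values)


# Eight bytes read as a single big-endian int64.
print(decode_numeric(b"\x00\x00\x00\x00\x00\x00\x00\x2a", "INT64"))
# Four bytes read as two little-endian uint16 values.
print(decode_numeric(b"\x01\x00\x02\x00", "uint16", big_endian=False))
```

A three-byte value read as int16 raises ValueError in this sketch, which mirrors the requirement that the stored length be a multiple of the type size.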