Format output from the cbt CLI
This document describes how to format specific types of data stored in
Bigtable rows when displayed by the cbt CLI.
Examples of formatting
Starting with version 0.12.0, the cbt CLI can format certain complex types
of data stored in table rows. When you use the cbt read or cbt lookup
command, the cbt CLI can "pretty print" values stored in the rows.
The following example shows data output from the cbt CLI without formatting.
----------------------------------------
r1
  fam1:col1 @ 2022/03/09-11:19:45.966000
    "\n\x05Brave\x10\x02"
  fam1:col2 @ 2022/03/14-11:17:20.014000
    "{\"name\": \"Brave\", \"age\": 2}"
The following example shows data output from the cbt CLI with formatting.
r1
  fam1:col1 @ 2022/03/09-11:19:45.966000
    name: "Brave"
    age: 2
  fam1:col2 @ 2022/03/14-11:17:20.014000
    age: 2.00
    name: "Brave"
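To see why the unformatted fam1:col1 value corresponds to name: "Brave" and age: 2, the bytes can be decoded by hand using the protocol buffer wire format. The following is a minimal sketch, not how the cbt CLI itself decodes values: the helper names read_varint and decode_fields are hypothetical, and the decoder handles only the two wire types that appear in this value (varint and length-delimited).

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last varint byte
            return value, pos
        shift += 7


def decode_fields(buf):
    """Map field numbers to raw values for wire types 0 and 2 only."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint, used here for int32 age = 2
            fields[field_number], pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited, used here for string name = 1
            length, pos = read_varint(buf, pos)
            fields[field_number] = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
    return fields


raw = b"\n\x05Brave\x10\x02"  # the unformatted fam1:col1 value
print(decode_fields(raw))  # {1: b'Brave', 2: 2}
```

The field numbers alone do not say that field 1 is a name and field 2 is an age. That mapping comes from a .proto definition, which is why the formatting file has to point at one.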
Print rows with formatting
To format a column or column family, you must provide a YAML file that
specifies the formatting for that column. When you call cbt lookup or
cbt read, you pass in the path to the YAML file with the format-file
argument. The following snippet shows an example of calling cbt lookup with
the format-file argument supplied.
cbt lookup my-table r1 format-file=/path/to/formatting.yml
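If the command is issued from a script, the same arguments can be assembled as a list. This is a small sketch that assumes cbt is installed and already configured for your project and instance; the helper name lookup_command is hypothetical, and the table, row key, and path are the placeholder values from the command above.

```python
import shlex


def lookup_command(table, row_key, format_file):
    """Build the argument list for `cbt lookup` with a format file."""
    return ["cbt", "lookup", table, row_key, f"format-file={format_file}"]


cmd = lookup_command("my-table", "r1", "/path/to/formatting.yml")
print(shlex.join(cmd))  # cbt lookup my-table r1 format-file=/path/to/formatting.yml
```

Passing cmd to subprocess.run(cmd, check=True) would run the lookup. Building the arguments as a list avoids shell quoting problems when a path contains spaces.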
Define column data formats in YAML
The formatting YAML file must connect the column names or column family names with the data types stored within them. The following snippet shows an example of a YAML formatting file.
protocol_buffer_definitions:
  - cat.proto
protocol_buffer_paths:
  - testdata/
columns:
  col1:
    encoding: ProtocolBuffer
    type: Cat
  col2:
    encoding: json
The following snippet shows the contents of 'cat.proto'.
syntax = "proto3";
package cats;
option go_package = "github.com/protocolbuffers/protobuf/examples/go/tutorialpb";
message Cat {
  string name = 1;
  int32 age = 2;
}
Looking at the example:
- The protocol_buffer_definitions field provides a list of .proto files that can contain protocol buffer message types to use for decoding protobuf data.
- The protocol_buffer_paths field provides a list of local paths that can contain .proto files for decoding protocol buffer types. You do not need to specify the locations of standard protocol buffer imports, such as messages in the google/protobuf package.
- The columns field contains a list of column names with the corresponding data types for each column:
  - The col1 column has its encoding set to "ProtocolBuffer" and its type set to "Cat". The cbt CLI interprets and formats all values stored in this column as a Cat proto message type. The type must correspond to a message type defined in one of the .proto files provided for the protocol_buffer_definitions field.
  - The col2 column has its encoding field set to "json". The cbt CLI interprets and formats all values stored in this column as a JSON structure.
Other fields that you can provide:
- default_encoding: This field defines a default formatting for all columns in a table or all columns in a column family.
- default_type: This field defines a default data type for protocol buffer, big-endian, and little-endian encoded columns.
- families: This field defines encodings and types for all columns within a column family. You can provide a default_encoding and default_type for a column family. You can also override these encodings at the column level by providing a columns field that lists columns by name with the appropriate encoding and data types, as shown in the following snippet:

families:
  family1:
    default_encoding: BigEndian
    default_type: INT64
    columns:
      address:
        encoding: PROTO
        type: tutorial.Person
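The layering of table-level defaults, family-level defaults, and column-level overrides can be sketched as a lookup over the parsed YAML. This is an illustration only: the function name resolve is hypothetical, the settings are written as a plain dictionary rather than loaded from a file, the top-level default_encoding of Hex and the count column are made up for the example, and the exact precedence used by the cbt CLI (in particular between a family's columns and the top-level columns field) is an assumption.

```python
# Settings as they might look after parsing a formatting file.
# The top-level default_encoding is a made-up value for illustration.
FORMAT = {
    "default_encoding": "Hex",
    "families": {
        "family1": {
            "default_encoding": "BigEndian",
            "default_type": "INT64",
            "columns": {
                "address": {"encoding": "PROTO", "type": "tutorial.Person"},
            },
        },
    },
}


def resolve(fmt, family, column):
    """Return (encoding, type) for family:column, most specific setting wins."""
    # Start from the table-level defaults.
    encoding = fmt.get("default_encoding")
    data_type = fmt.get("default_type")
    # Family-level defaults replace table-level defaults.
    fam = fmt.get("families", {}).get(family, {})
    encoding = fam.get("default_encoding", encoding)
    data_type = fam.get("default_type", data_type)
    # Column-level settings replace whatever defaults remain
    # (assumed order: the family's columns first, then top-level columns).
    col = fam.get("columns", {}).get(column)
    if col is None:
        col = fmt.get("columns", {}).get(column, {})
    return col.get("encoding", encoding), col.get("type", data_type)


print(resolve(FORMAT, "family1", "address"))  # ('PROTO', 'tutorial.Person')
print(resolve(FORMAT, "family1", "count"))    # ('BigEndian', 'INT64')
print(resolve(FORMAT, "family2", "anything")) # ('Hex', None)
```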
Supported data types
The cbt CLI supports formatting for several complex data types. The following
table lists the supported data types and the strings to provide in the YAML
file for each of the listed types. String values are not case-sensitive.
Data type | Formatting value for YAML |
---|---|
Hexadecimal | Hex, H |
Big-endian | BigEndian, B |
Little-endian | LittleEndian, L |
Protocol buffer | ProtocolBuffer, P, PROTO |
JSON | JSON, J |
Table 1. Data types supported for formatting in cbt output.
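Because each data type accepts several spellings and case is ignored, a script that generates or validates formatting files may want to normalize the value first. The following sketch encodes Table 1 as a lookup; the names ALIASES and normalize_encoding are hypothetical, and the canonical spelling returned for each type is just the first value from the table.

```python
# Accepted spellings from Table 1, lowercased for case-insensitive matching.
ALIASES = {
    "Hex": ("hex", "h"),
    "BigEndian": ("bigendian", "b"),
    "LittleEndian": ("littleendian", "l"),
    "ProtocolBuffer": ("protocolbuffer", "p", "proto"),
    "JSON": ("json", "j"),
}

# Invert the table: one entry per accepted spelling.
LOOKUP = {
    alias: canonical
    for canonical, aliases in ALIASES.items()
    for alias in aliases
}


def normalize_encoding(value):
    """Return the canonical encoding name for any accepted spelling."""
    try:
        return LOOKUP[value.strip().lower()]
    except KeyError:
        raise ValueError(f"unsupported encoding: {value!r}") from None


print(normalize_encoding("PROTO"))  # ProtocolBuffer
print(normalize_encoding("j"))      # JSON
```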
- The hexadecimal encoding is type agnostic. Data are displayed as a raw hexadecimal representation of the stored data.
- The available types for the big-endian and little-endian encodings are int8, int16, int32, int64, uint8, uint16, uint32, uint64, float32, and float64. Stored data length must be a multiple of the type size, in bytes. Data are displayed as scalars if the stored length matches the type size, or as arrays otherwise. Type names are not case-sensitive.
- The types given for the protocol-buffer encoding must match message types defined in provided protocol-buffer definition files. The types are not case-sensitive. If no type is specified, it defaults to the column name for the column data being displayed.
- The formatting values for YAML are not case-sensitive.
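The rules for big-endian and little-endian values (length must be a multiple of the type size, one element displays as a scalar, several as an array) can be sketched with Python's struct module. This mirrors the rules described above rather than the cbt CLI's own implementation; the names STRUCT_CODES and decode_numbers are hypothetical.

```python
import struct

# struct format codes for the types listed above.
STRUCT_CODES = {
    "int8": "b", "int16": "h", "int32": "i", "int64": "q",
    "uint8": "B", "uint16": "H", "uint32": "I", "uint64": "Q",
    "float32": "f", "float64": "d",
}


def decode_numbers(data, type_name, big_endian=True):
    """Decode bytes as one value or a list of values of the given type."""
    code = STRUCT_CODES[type_name.lower()]  # type names are not case-sensitive
    prefix = ">" if big_endian else "<"
    size = struct.calcsize(prefix + code)
    if len(data) % size:
        raise ValueError(
            f"length {len(data)} is not a multiple of {size} bytes for {type_name}"
        )
    count = len(data) // size
    values = struct.unpack(f"{prefix}{count}{code}", data)
    # A single element is shown as a scalar, anything else as an array.
    return values[0] if count == 1 else list(values)


print(decode_numbers(b"\x00\x00\x00\x00\x00\x00\x00\x2a", "INT64"))  # 42
print(decode_numbers(b"\x00\x01\x00\x02", "uint16"))                 # [1, 2]
print(decode_numbers(b"\x2a\x00\x00\x00", "int32", big_endian=False))  # 42
```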