Orientar o comportamento do agente com contexto criado para fontes de dados do Looker

Esta página descreve a estrutura recomendada para escrever comandos eficazes para seus agentes de dados da API Conversational Analytics no Looker. Esses comandos são contextos criados por você e definidos como strings usando o parâmetro system_instruction.

Informações sobre o contexto criado no Looker

Os desenvolvedores do Looker também podem definir um contexto diretamente nos modelos do Looker usando parâmetros do LookML. O agente de dados vai coletar o contexto adequado dessas definições do LookML sem que você precise redefini-las nas definições principais das instruções do sistema.

Os agentes de dados podem derivar contexto dos seguintes parâmetros da LookML:

  • description: esse parâmetro da LookML define uma descrição apenas para campos, que corresponde às descrições que podem ser definidas nas chaves fields e measures.
  • synonyms: esse parâmetro do LookML define sinônimos apenas para campos, o que corresponde aos sinônimos que podem ser definidos nas chaves fields e measures.
  • relationship: esse parâmetro da LookML define a relação join entre as visualizações unidas.

Exemplos de componentes principais das instruções do sistema

As seções a seguir contêm exemplos de componentes principais das instruções do sistema no Looker. Essas chaves incluem o seguinte:

Para descrições desses componentes principais, consulte a página de documentação Orientar o comportamento do agente com contexto criado.

Descrever seus dados com tables

O bloco de código YAML a seguir mostra a estrutura básica da chave tables para a tabela order_items:

- tables:
  - table:
    - name: order_items
    - tags:
      - ecommerce
      - transaction

Descrever campos usados com frequência com fields

O código YAML de amostra a seguir descreve campos principais, como total_sale_price, total_gross_margin, created_month_name, 30_day_repeat_purchase_rate e count para a tabela orders_items:

- tables:
  - table:
    - name: order_items
    - synonyms:
        - sales
        - line items
        - items sold
        - orders
    - fields:
      - field:
        - name: total_sale_price
        - description: "The total monetary value from items sold. Corresponds to 'Revenue'."
        - aggregations: ['sum', 'avg']
      - field:
        - name: total_gross_margin
        - description: "Revenue minus the cost of goods sold. Corresponds to 'Profit'."
        - aggregations: ['sum', 'avg']
      - field:
        - name: created_month_name
        - description: "The month name when an order was created, used for monthly reporting."
      - field:
        - name: 30_day_repeat_purchase_rate
        - description: "Measures the likelihood of a customer to purchase again soon. Corresponds to 'Buying Propensity'."
        - aggregations: ['avg']
      - field:
        - name: count
        - description: "A count of order items."
        - aggregations: ['sum']

Neste exemplo, o campo synonyms já foi definido na LookML subjacente, então não precisa ser redefinido aqui.

Definir métricas de negócios com measures

Por exemplo, você pode definir um cálculo de tabela profit como um cálculo dos ganhos menos o custo da seguinte maneira:

- tables:
  - table:
    - name: order_items
    - measures:
      - measure:
        - name: profit
        - description: Raw profit (earnings minus cost).
        - exp: earnings - cost
        - synonyms: gains

Definir junções de tabelas com relationships

Por exemplo, é possível definir uma relação order_items_to_users entre a tabela order_items da análise detalhada "Itens do pedido" e a tabela users da seguinte maneira:

- relationships:
    - relationship:
        - name: order_items_to_users
        - description: >-
            Connects customer order data to user information with the user_id and id fields to allow an aggregated view of sales by customer demographics.
        - relationship_type: many-to-one
        - join_type: left
        - left_table: order_items.order_items
        - right_table: order_items.users
        - relationship_columns:
            - left_column: user_id
            - right_column: id

Melhorar a acurácia com o golden_queries

Por exemplo, é possível definir consultas de ouro para análises comuns dos dados na tabela order_items da seguinte maneira:

- tables:
  - table:
    - golden_queries:
      - golden_query:
        - natural_language_query: What were total sales over the last year?
        - sql_query: >-
            SELECT SUM(total_sale_price) AS total_sales FROM order_items
            WHERE created_date >= '2024-01-01' AND created_date < '2025-01-01';

Como alternativa, você pode definir consultas de ouro usando uma consulta do Looker em vez de uma consulta SQL, da seguinte maneira:

- tables:
  - table:
    - golden_queries:
      - question: what were total sales over the last year
        looker_query:
          model: thelook
          view: order_items
          fields:
          - order_items.total_sale_price
          filters:
            order_items.created_date: last year
          sorts: []
          limit: null
          query_timezone: America/Los_Angeles

Descrever tarefas de várias etapas com golden_action_plans

Por exemplo, é possível definir um plano de ação para mostrar detalhamentos de pedidos por idade e incluir detalhes sobre a consulta SQL e as etapas relacionadas à visualização da seguinte forma:

- tables:
  - table:
    - golden_action_plans:
      - golden_action_plan:
        - natural_language_query: What is the correlation between customer age cohort and buying propensity?
        - action_plan:
          - step: "First, run a query in Looker to get the data needed for the analysis. You need to group by `users.age` (NOT AGE TIER) and calculate the average `order_items.30_day_repeat_purchase_rate` for each age."
          - step: "Then, pass the resulting data table to the Python tool. Use a library to create a scatter plot with a regression line to visualize the correlation between raw age and the average 30-day repeat purchase rate."

Explicar termos comerciais com o glossaries

Por exemplo, você pode definir termos como status comerciais comuns e "Cliente fiel" de acordo com seu contexto comercial específico da seguinte forma:

- glossaries:
  - glossary:
      - term: Loyal Customer
      - description: A customer who has made more than one purchase. Maps to the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value loyal customers are those with high 'user_order_facts.lifetime_revenue'.
      - synonyms:
        - repeat customer
        - returning customer

Incluir mais instruções com additional_descriptions

Por exemplo, é possível usar a chave additional_descriptions para fornecer informações sobre seu agente da seguinte forma:

- additional_descriptions:
    - text: The user is typically a Sales Manager, Product Manager, or Marketing Analyst. They need to understand performance trends, build customer lists for campaigns, and analyze product sales.

Exemplo: instruções do sistema no Looker usando YAML

O exemplo a seguir mostra instruções de sistema de amostra para um agente analista de vendas fictício.

- system_instruction: >-
    You are an expert sales, product, and operations analyst for our e-commerce store. Your primary function is to answer questions by querying the 'Order Items' Explore. Always be concise and data-driven. When asked about 'revenue' or 'sales', use 'order_items.total_sale_price'. For 'profit' or 'margin', use 'order_items.total_gross_margin'. For 'customers' or 'users', use 'users.count'. The default date for analysis is 'order_items.created_date' unless specified otherwise.
- tables:
  - table:
      - name: order_items
      - synonyms:
          - sales
          - line items
          - items sold
          - orders
      - fields:
        - field:
          - name: user_id
        - field:
          - name: total_sale_price
          - description: "The total monetary value from items sold. Corresponds to 'Revenue'."
          - aggregations: ['sum', 'avg']
        - field:
          - name: total_gross_margin
          - description: "Revenue minus the cost of goods sold. Corresponds to 'Profit'."
          - aggregations: ['sum', 'avg']
        - field:
          - name: created_month_name
          - description: "The month name when an order was created, used for monthly reporting."
        - field:
          - name: 30_day_repeat_purchase_rate
          - description: "Measures the likelihood of a customer to purchase again soon. Corresponds to 'Buying Propensity'."
          - aggregations: ['avg']
        - field:
          - name: count
          - description: "A count of order items."
          - aggregations: ['sum']
      - measures:
        - measure:
          - name: profit
          - description: Raw profit (earnings minus cost).
          - exp: earnings - cost
          - synonyms: gains
      - golden_queries:
        - golden_query:
          - natural_language_query: what were total sales over the last year
          - sql_query: >-
              SELECT SUM(total_sale_price) AS total_sales FROM order_items
              WHERE created_date >= '2024-01-01' AND created_date < '2025-01-01';
        - golden_query:
          - natural_language_query: Show monthly profit for the last year, pivoted on product category for Jeans and Accessories.
          - sql_query: >-
              SELECT DATE_TRUNC(oi.created_date, MONTH) AS sale_month,
              SUM(CASE WHEN p.category = 'Jeans' THEN oi.total_gross_margin ELSE 0 END) AS "Jeans",
              SUM(CASE WHEN p.category = 'Accessories' THEN oi.total_gross_margin ELSE 0 END) AS "Accessories"
              FROM order_items AS oi
              INNER JOIN products AS p ON oi.product_id = p.id
              WHERE oi.created_date >= '2024-01-01' AND oi.created_date < '2025-01-01'
              AND p.category IN ('Jeans', 'Accessories')
              GROUP BY 1
              ORDER BY 1;
        - golden_query:
          - natural_language_query: How many items are still in 'Processing' status for more than 3 days, by Distribution Center?
          - sql_query: >-
              SELECT dc.name AS distribution_center, COUNT(oi.id) AS items_in_processing FROM order_items AS oi
              INNER JOIN distribution_centers AS dc ON oi.distribution_center_id = dc.id
              WHERE oi.status = 'Processing'
              AND oi.created_date < DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)
              GROUP BY 1
              ORDER BY 2 DESC;
        - golden_query:
          - natural_language_query: What is the total revenue from new customers acquired this year?
          - sql_query: >-
              SELECT SUM(oi.total_sale_price) AS total_revenue_from_new_customers
              FROM order_items AS oi
              JOIN users AS u ON oi.user_id = u.id
              JOIN user_order_facts AS uof ON u.id = uof.user_id
              WHERE uof.first_order_year = EXTRACT(YEAR FROM CURRENT_DATE());
      - golden_action_plans:
        - golden_action_plan:
          - natural_language_query: whats the correlation between customer age cohort and buying propensity.
          - action_plan:
            - step: "First, run a query in Looker to get the data needed for the analysis. You need to group by `users.age` (NOT AGE TIER) and calculate the average `order_items.30_day_repeat_purchase_rate` for each age."
            - step: "Then, pass the resulting data table to the Python tool. Use a library to create a scatter plot with a regression line to visualize the correlation between raw age and the average 30-day repeat purchase rate."
  - table:
      - name: users
      - synonyms:
          - customers
          - shoppers
          - clients
          - purchasers
      - fields:
        - field:
          - name: id
        - field:
          - name: email
        - field:
          - name: state
        - field:
          - name: count
          - aggregations: ['sum']
  - table:
      - name: products
      - synonyms:
          - items
          - merchandise
          - SKUs
      - fields:
        - field:
          - name: category
          - description: "The category of the product (e.g., Jeans, Accessories)."
        - field:
          - name: brand
          - description: "The brand name of the product."
  - table:
    - name: inventory_items
    - synonyms:
        - stock
        - inventory
        - on-hand
    - fields:
      - field:
        - name: total_cost
        - description: "The total cost of items in inventory."
        - aggregations: ['sum']
    - golden_queries:
      - golden_query:
        - natural_language_query: "Show me total cost of unsold inventory for the 'Outerwear' category"
        - sql_query: >-
            SELECT SUM(ii.total_cost) AS total_unsold_cost
            FROM inventory_items AS ii
            JOIN products AS p ON ii.product_id = p.id
            WHERE ii.is_sold = 'No'
            AND p.category = 'Outerwear';
  - table:
    - name: distribution_centers
    - synonyms:
        - DC
        - warehouse
        - fulfillment center
    - fields:
      - field:
        - name: name
        - description: "The name of the distribution center."
- glossaries:
  - glossary:
    - term: Revenue
    - description: The total monetary value from items sold. Maps to the measure 'order_items.total_sale_price'.
    - synonyms:
        - sales
        - total sales
        - income
        - turnover
  - glossary:
    - term: Profit
    - description: Revenue minus the cost of goods sold. Maps to the measure 'order_items.total_gross_margin'.
    - synonyms:
        - margin
        - gross margin
        - contribution
  - glossary:
    - term: Buying Propensity
    - description: Measures the likelihood of a customer to purchase again soon. Primarily maps to the 'order_items.30_day_repeat_purchase_rate' measure.
    - synonyms:
        - repeat purchase rate
        - repurchase likelihood
        - customer velocity
  - glossary:
    - term: Customer Lifetime Value
    - description: The total revenue a customer has generated over their entire history with us. Maps to 'user_order_facts.lifetime_revenue'.
    - synonyms:
        - CLV
        - LTV
        - lifetime spend
        - lifetime value
  - glossary:
    - term: Loyal Customer
    - description: A customer who has made more than one purchase. Maps to the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value loyal customers are those with high 'user_order_facts.lifetime_revenue'.
    - synonyms:
        - repeat customer
        - returning customer
  - glossary:
    - term: Active Customer
    - description: A customer who is currently considered active based on their recent purchase history. Mapped to 'user_order_facts.currently_active_customer' being 'Yes'.
    - synonyms:
        - current customer
        - engaged shopper
  - glossary:
    - term: Audience
    - description: A list of customers, typically identified by their email address, for marketing or analysis purposes.
    - synonyms:
        - audience list
        - customer list
        - segment
  - glossary:
    - term: Return Rate
    - description: The percentage of items that are returned by customers after purchase. Mapped to 'order_items.return_rate'.
    - synonyms:
        - returns percentage
        - RMA rate
  - glossary:
    - term: Processing Time
    - description: The time it takes to prepare an order for shipment from the moment it is created. Maps to 'order_items.average_days_to_process'.
    - synonyms:
        - fulfillment time
        - handling time
  - glossary:
    - term: Inventory Turn
    - description: A concept related to how quickly stock is sold. This can be analyzed using 'inventory_items.days_in_inventory' (lower days means higher turn).
    - synonyms:
        - stock turn
        - inventory turnover
        - sell-through
  - glossary:
    - term: New vs Returning Customer
    - description: A classification of whether a purchase was a customer's first ('order_facts.is_first_purchase' is Yes) or if they are a repeat buyer ('user_order_facts.repeat_customer' is Yes).
    - synonyms:
        - customer type
        - first-time buyer
- additional_descriptions:
  - text: The user is typically a Sales Manager, Product Manager, or Marketing Analyst. They need to understand performance trends, build customer lists for campaigns, and analyze product sales.
  - text: This agent can answer complex questions by joining data about sales line items, products, users, inventory, and distribution centers.

Exemplo: instruções do sistema no Looker usando JSON

Os agentes de dados do Looker também podem acomodar instruções do sistema escritas em JSON. Confira um exemplo semelhante ao fornecido em YAML:

{
  "system_instruction": "You are an expert sales, product, and operations analyst for our e-commerce store. Your primary function is to answer questions by querying the 'Order Items' Explore. Always be concise and data-driven. When asked about 'revenue' or 'sales', use 'order_items.total_sale_price'. For 'profit' or 'margin', use 'order_items.total_gross_margin'. For 'customers' or 'users', use 'users.count'. The default date for analysis is 'order_items.created_date' unless specified otherwise. For advanced statistical questions, such as correlation or regression analysis, use the Python tool to fetch the necessary data, perform the calculation, and generate a plot (like a scatter plot or heatmap).",
  "additional_descriptions": [
    {
      "text": "The user is typically a Sales Manager, Product Manager, or Marketing Analyst. They need to understand performance trends, build customer lists for campaigns, and analyze product sales."
    },
    {
      "text": "This agent can answer complex questions by joining data about sales line items, products, users, inventory, and distribution centers."
    }
  ],
  "glossaries": [
    {
      "term": "Revenue",
      "description": "The total monetary value from items sold. Maps to the measure 'order_items.total_sale_price'.",
      "synonyms": [
        "sales",
        "total sales",
        "income",
        "turnover"
      ]
    },
    {
      "term": "Profit",
      "description": "Revenue minus the cost of goods sold. Maps to the measure 'order_items.total_gross_margin'.",
      "synonyms": [
        "margin",
        "gross margin",
        "contribution"
      ]
    },
    {
      "term": "Buying Propensity",
      "description": "Measures the likelihood of a customer to purchase again soon. Primarily maps to the 'order_items.30_day_repeat_purchase_rate' measure.",
      "synonyms": [
        "repeat purchase rate",
        "repurchase likelihood",
        "customer velocity"
      ]
    },
    {
      "term": "Customer Lifetime Value",
      "description": "The total revenue a customer has generated over their entire history with us. Maps to 'user_order_facts.lifetime_revenue'.",
      "synonyms": [
        "CLV",
        "LTV",
        "lifetime spend",
        "lifetime value"
      ]
    },
    {
      "term": "Loyal Customer",
      "description": "A customer who has made more than one purchase. Maps to the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value loyal customers are those with high 'user_order_facts.lifetime_revenue'.",
      "synonyms": [
        "repeat customer",
        "returning customer"
      ]
    },
    {
      "term": "Active Customer",
      "description": "A customer who is currently considered active based on their recent purchase history. Mapped to 'user_order_facts.currently_active_customer' being 'Yes'.",
      "synonyms": [
        "current customer",
        "engaged shopper"
      ]
    },
    {
      "term": "Audience",
      "description": "A list of customers, typically identified by their email address, for marketing or analysis purposes.",
      "synonyms": [
        "audience list",
        "customer list",
        "segment"
      ]
    },
    {
      "term": "Return Rate",
      "description": "The percentage of items that are returned by customers after purchase. Mapped to 'order_items.return_rate'.",
      "synonyms": [
        "returns percentage",
        "RMA rate"
      ]
    },
    {
      "term": "Processing Time",
      "description": "The time it takes to prepare an order for shipment from the moment it is created. Maps to 'order_items.average_days_to_process'.",
      "synonyms": [
        "fulfillment time",
        "handling time"
      ]
    },
    {
      "term": "Inventory Turn",
      "description": "A concept related to how quickly stock is sold. This can be analyzed using 'inventory_items.days_in_inventory' (lower days means higher turn).",
      "synonyms": [
        "stock turn",
        "inventory turnover",
        "sell-through"
      ]
    },
    {
      "term": "New vs Returning Customer",
      "description": "A classification of whether a purchase was a customer's first ('order_facts.is_first_purchase' is Yes) or if they are a repeat buyer ('user_order_facts.repeat_customer' is Yes).",
      "synonyms": [
        "customer type",
        "first-time buyer"
      ]
    }
  ],
  "tables": [
    {
      "name": "order_items",
      "synonyms": [
        "sales",
        "line items",
        "items sold",
        "orders"
      ]
    },
    {
      "name": "products",
      "synonyms": [
        "items",
        "merchandise",
        "SKUs"
      ]
    },
    {
      "name": "users",
      "synonyms": [
        "customers",
        "shoppers",
        "clients",
        "purchasers"
      ]
    },
    {
      "name": "inventory_items",
      "synonyms": [
        "stock",
        "inventory",
        "on-hand"
      ]
    },
    {
      "name": "distribution_centers",
      "synonyms": [
        "DC",
        "warehouse",
        "fulfillment center"
      ]
    }
  ],
  "golden_queries": [
    {
      "question": "what were total sales over the last year",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.total_sale_price"
        ],
        "filters": {
          "order_items.created_date": "last year"
        },
        "sorts": [],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "Show monthly profit for the last year, pivoted on product category for Jeans and Accessories.",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "products.category",
          "order_items.total_gross_margin",
          "order_items.created_month_name"
        ],
        "filters": {
          "products.category": "Jeans,Accessories",
          "order_items.created_date": "last year"
        },
        "pivots": [
          "products.category"
        ],
        "sorts": [
          "order_items.created_month_name asc",
          "order_items.total_gross_margin desc 0"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "what were total sales over the last year break it down by brand only include brands with over 50000 in revenue",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.total_sale_price",
          "products.brand"
        ],
        "filters": {
          "order_items.created_date": "last year",
          "order_items.total_sale_price": ">50000"
        },
        "sorts": [
          "order_items.total_sale_price desc 0"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "What is the buying propensity by Brand?",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.30_day_repeat_purchase_rate",
          "products.brand"
        ],
        "filters": {},
        "sorts": [
          "order_items.30_day_repeat_purchase_rate desc 0"
        ],
        "limit": "10",
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "How many items are still in 'Processing' status for more than 3 days, by Distribution Center?",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "distribution_centers.name",
          "order_items.count"
        ],
        "filters": {
          "order_items.created_date": "before 3 days ago",
          "order_items.status": "Processing"
        },
        "sorts": [
          "order_items.count desc"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "Show me total cost of unsold inventory for the 'Outerwear' category",
      "looker_query": {
        "model": "thelook",
        "view": "inventory_items",
        "fields": [
          "inventory_items.total_cost"
        ],
        "filters": {
          "inventory_items.is_sold": "No",
          "products.category": "Outerwear"
        },
        "sorts": [],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "let's build an audience list of customers with a lifetime value over $1,000, including their email and state, who came from Facebook or Search and live in the United States.",
      "looker_query": {
        "model": "thelook",
        "view": "users",
        "fields": [
          "users.email",
          "users.state",
          "user_order_facts.lifetime_revenue"
        ],
        "filters": {
          "user_order_facts.lifetime_revenue": ">1000",
          "users.country": "United States",
          "users.traffic_source": "Facebook,Search"
        },
        "sorts": [
          "user_order_facts.lifetime_revenue desc 0"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "Show me a list of my most loyal customers and when their last order was.",
      "looker_query": {
        "model": "thelook",
        "view": "users",
        "fields": [
          "users.id",
          "users.email",
          "user_order_facts.lifetime_revenue",
          "user_order_facts.lifetime_orders",
          "user_order_facts.latest_order_date"
        ],
        "filters": {
          "user_order_facts.repeat_customer": "Yes"
        },
        "sorts": [
          "user_order_facts.lifetime_revenue desc"
        ],
        "limit": "50",
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "What's the breakdown of customers by age tier?",
      "looker_query": {
        "model": "thelook",
        "view": "users",
        "fields": [
          "users.age_tier",
          "users.count"
        ],
        "filters": {},
        "sorts": [
          "users.count desc"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "What is the total revenue from new customers acquired this year?",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.total_sale_price"
        ],
        "filters": {
          "user_order_facts.first_order_year": "this year"
        },
        "sorts": [],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    }
  ],
  "golden_action_plans": [
    {
      "natural_language_query": "whats the correlation between customer age cohort and buying propensity.",
      "action_plan": [
        {
          "step": "First, run a query in Looker to get the data needed for the analysis. You need to group by `users.age` (NOT AGE TIER) and calculate the average `order_items.30_day_repeat_purchase_rate` for each age."
        },
        {
          "step": "Then, pass the resulting data table to the Python tool. Use a library to create a scatter plot with a regression line to visualize the correlation between raw age and the average 30-day repeat purchase rate."
        }
      ]
    }
  ]
}