Guiar el comportamiento de los agentes con contexto creado para fuentes de datos de Looker

En esta página se describe la estructura recomendada para escribir peticiones eficaces para tus agentes de datos de la API Conversational Analytics en Looker. Estas peticiones son contextos creados que se definen como cadenas mediante el parámetro system_instruction.

Aspectos importantes sobre el contexto creado en Looker

Los desarrolladores de Looker también pueden definir cierto contexto directamente en los modelos de Looker mediante parámetros de LookML. Tu agente de datos obtendrá el contexto adecuado de estas definiciones de LookML sin que tengas que volver a definirlas en las definiciones de clave de tus instrucciones del sistema.

Los agentes de datos pueden obtener contexto de los siguientes parámetros de LookML:

  • description: este parámetro de LookML define una descripción solo para los campos, que se corresponde con las descripciones que puede definir en las claves fields y measures.
  • synonyms: este parámetro de LookML define sinónimos solo para campos, lo que corresponde a los sinónimos que puede definir en las claves fields y measures.
  • relationship: este parámetro de LookML define la relación join entre las vistas combinadas.

Ejemplos de componentes clave de las instrucciones del sistema

En las siguientes secciones se incluyen ejemplos de componentes clave de las instrucciones del sistema en Looker. Estas claves incluyen las siguientes:

Para ver las descripciones de estos componentes clave, consulta la página de documentación Guía el comportamiento del agente con contexto creado.

Describir los datos con tables

El siguiente bloque de código YAML muestra la estructura básica de la clave tables de la tabla order_items:

- tables:
  - table:
    - name: order_items
    - tags:
      - ecommerce
      - transaction

Describir los campos que se usan con frecuencia con fields

El siguiente código YAML de ejemplo describe los campos clave, como total_sale_price, total_gross_margin, created_month_name, 30_day_repeat_purchase_rate y count, de la tabla orders_items:

- tables:
  - table:
    - name: order_items
    - synonyms:
        - sales
        - line items
        - items sold
        - orders
    - fields:
      - field:
        - name: total_sale_price
        - description: "The total monetary value from items sold. Corresponds to 'Revenue'."
        - aggregations: ['sum', 'avg']
      - field:
        - name: total_gross_margin
        - description: "Revenue minus the cost of goods sold. Corresponds to 'Profit'."
        - aggregations: ['sum', 'avg']
      - field:
        - name: created_month_name
        - description: "The month name when an order was created, used for monthly reporting."
      - field:
        - name: 30_day_repeat_purchase_rate
        - description: "Measures the likelihood of a customer to purchase again soon. Corresponds to 'Buying Propensity'."
        - aggregations: ['avg']
      - field:
        - name: count
        - description: "A count of order items."
        - aggregations: ['sum']

En este ejemplo, los campos synonyms ya se han definido en el archivo LookML subyacente, por lo que no es necesario volver a definirlos aquí.

Definir métricas de negocio con measures

Por ejemplo, puedes definir un cálculo de tabla profit como el cálculo de los ingresos menos el coste de la siguiente manera:

- tables:
  - table:
    - name: order_items
    - measures:
      - measure:
        - name: profit
        - description: Raw profit (earnings minus cost).
        - exp: earnings - cost
        - synonyms: gains

Definir combinaciones de tablas con relationships

Por ejemplo, puede definir una relación order_items_to_users entre la tabla order_items de la exploración "Elementos de pedido" y la tabla users de la siguiente manera:

- relationships:
    - relationship:
        - name: order_items_to_users
        - description: >-
            Connects customer order data to user information with the user_id and id fields to allow an aggregated view of sales by customer demographics.
        - relationship_type: many-to-one
        - join_type: left
        - left_table: order_items.order_items
        - right_table: order_items.users
        - relationship_columns:
            - left_column: user_id
            - right_column: id

Mejorar la precisión con golden_queries

Por ejemplo, puede definir consultas de referencia para análisis habituales de los datos de la tabla order_items de la siguiente manera:

- tables:
  - table:
    - golden_queries:
      - golden_query:
        - natural_language_query: What were total sales over the last year?
        - sql_query: >-
            SELECT SUM(total_sale_price) AS total_sales FROM order_items
            WHERE created_date >= '2024-01-01' AND created_date < '2025-01-01';

También puedes definir consultas de referencia con una consulta de Looker en lugar de con una consulta de SQL, como se indica a continuación:

- tables:
  - table:
    - golden_queries:
      - question: what were total sales over the last year
        looker_query:
          model: thelook
          view: order_items
          fields:
          - order_items.total_sale_price
          filters:
            order_items.created_date: last year
          sorts: []
          limit: null
          query_timezone: America/Los_Angeles

Describir tareas de varios pasos con golden_action_plans

Por ejemplo, puedes definir un plan de acción para mostrar los desgloses de pedidos por grupo de edad e incluir detalles sobre la consulta SQL y los pasos relacionados con la visualización de la siguiente manera:

- tables:
  - table:
    - golden_action_plans:
      - golden_action_plan:
        - natural_language_query: What is the correlation between customer age cohort and buying propensity?
        - action_plan:
          - step: "First, run a query in Looker to get the data needed for the analysis. You need to group by `users.age` (NOT AGE TIER) and calculate the average `order_items.30_day_repeat_purchase_rate` for each age."
          - step: "Then, pass the resulting data table to the Python tool. Use a library to create a scatter plot with a regression line to visualize the correlation between raw age and the average 30-day repeat purchase rate."

Explicar términos empresariales con glossaries

Por ejemplo, puedes definir términos como "Estado de la empresa habitual" y "Cliente fiel" según el contexto específico de tu empresa de la siguiente manera:

- glossaries:
  - glossary:
      - term: Loyal Customer
      - description: A customer who has made more than one purchase. Maps to the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value loyal customers are those with high 'user_order_facts.lifetime_revenue'.
      - synonyms:
        - repeat customer
        - returning customer

Incluye más instrucciones con additional_descriptions

Por ejemplo, puedes usar la tecla additional_descriptions para proporcionar información sobre tu agente de la siguiente manera:

- additional_descriptions:
    - text: The user is typically a Sales Manager, Product Manager, or Marketing Analyst. They need to understand performance trends, build customer lists for campaigns, and analyze product sales.

Ejemplo: instrucciones del sistema en Looker con YAML

En el siguiente ejemplo se muestran instrucciones del sistema de muestra para un agente analista de ventas ficticio.

- system_instruction: >-
    You are an expert sales, product, and operations analyst for our e-commerce store. Your primary function is to answer questions by querying the 'Order Items' Explore. Always be concise and data-driven. When asked about 'revenue' or 'sales', use 'order_items.total_sale_price'. For 'profit' or 'margin', use 'order_items.total_gross_margin'. For 'customers' or 'users', use 'users.count'. The default date for analysis is 'order_items.created_date' unless specified otherwise.
- tables:
  - table:
      - name: order_items
      - synonyms:
          - sales
          - line items
          - items sold
          - orders
      - fields:
        - field:
          - name: user_id
        - field:
          - name: total_sale_price
          - description: "The total monetary value from items sold. Corresponds to 'Revenue'."
          - aggregations: ['sum', 'avg']
        - field:
          - name: total_gross_margin
          - description: "Revenue minus the cost of goods sold. Corresponds to 'Profit'."
          - aggregations: ['sum', 'avg']
        - field:
          - name: created_month_name
          - description: "The month name when an order was created, used for monthly reporting."
        - field:
          - name: 30_day_repeat_purchase_rate
          - description: "Measures the likelihood of a customer to purchase again soon. Corresponds to 'Buying Propensity'."
          - aggregations: ['avg']
        - field:
          - name: count
          - description: "A count of order items."
          - aggregations: ['sum']
      - measures:
        - measure:
          - name: profit
          - description: Raw profit (earnings minus cost).
          - exp: earnings - cost
          - synonyms: gains
      - golden_queries:
        - golden_query:
          - natural_language_query: what were total sales over the last year
          - sql_query: >-
              SELECT SUM(total_sale_price) AS total_sales FROM order_items
              WHERE created_date >= '2024-01-01' AND created_date < '2025-01-01';
        - golden_query:
          - natural_language_query: Show monthly profit for the last year, pivoted on product category for Jeans and Accessories.
          - sql_query: >-
              SELECT DATE_TRUNC(oi.created_date, MONTH) AS sale_month,
              SUM(CASE WHEN p.category = 'Jeans' THEN oi.total_gross_margin ELSE 0 END) AS "Jeans",
              SUM(CASE WHEN p.category = 'Accessories' THEN oi.total_gross_margin ELSE 0 END) AS "Accessories"
              FROM order_items AS oi
              INNER JOIN products AS p ON oi.product_id = p.id
              WHERE oi.created_date >= '2024-01-01' AND oi.created_date < '2025-01-01'
              AND p.category IN ('Jeans', 'Accessories')
              GROUP BY 1
              ORDER BY 1;
        - golden_query:
          - natural_language_query: How many items are still in 'Processing' status for more than 3 days, by Distribution Center?
          - sql_query: >-
              SELECT dc.name AS distribution_center, COUNT(oi.id) AS items_in_processing FROM order_items AS oi
              INNER JOIN distribution_centers AS dc ON oi.distribution_center_id = dc.id
              WHERE oi.status = 'Processing'
              AND oi.created_date < DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)
              GROUP BY 1
              ORDER BY 2 DESC;
        - golden_query:
          - natural_language_query: What is the total revenue from new customers acquired this year?
          - sql_query: >-
              SELECT SUM(oi.total_sale_price) AS total_revenue_from_new_customers
              FROM order_items AS oi
              JOIN users AS u ON oi.user_id = u.id
              JOIN user_order_facts AS uof ON u.id = uof.user_id
              WHERE uof.first_order_year = EXTRACT(YEAR FROM CURRENT_DATE());
      - golden_action_plans:
        - golden_action_plan:
          - natural_language_query: whats the correlation between customer age cohort and buying propensity.
          - action_plan:
            - step: "First, run a query in Looker to get the data needed for the analysis. You need to group by `users.age` (NOT AGE TIER) and calculate the average `order_items.30_day_repeat_purchase_rate` for each age."
            - step: "Then, pass the resulting data table to the Python tool. Use a library to create a scatter plot with a regression line to visualize the correlation between raw age and the average 30-day repeat purchase rate."
  - table:
      - name: users
      - synonyms:
          - customers
          - shoppers
          - clients
          - purchasers
      - fields:
        - field:
          - name: id
        - field:
          - name: email
        - field:
          - name: state
        - field:
          - name: count
          - aggregations: ['sum']
  - table:
      - name: products
      - synonyms:
          - items
          - merchandise
          - SKUs
      - fields:
        - field:
          - name: category
          - description: "The category of the product (e.g., Jeans, Accessories)."
        - field:
          - name: brand
          - description: "The brand name of the product."
  - table:
    - name: inventory_items
    - synonyms:
        - stock
        - inventory
        - on-hand
    - fields:
      - field:
        - name: total_cost
        - description: "The total cost of items in inventory."
        - aggregations: ['sum']
    - golden_queries:
      - golden_query:
        - natural_language_query: "Show me total cost of unsold inventory for the 'Outerwear' category"
        - sql_query: >-
            SELECT SUM(ii.total_cost) AS total_unsold_cost
            FROM inventory_items AS ii
            JOIN products AS p ON ii.product_id = p.id
            WHERE ii.is_sold = 'No'
            AND p.category = 'Outerwear';
  - table:
    - name: distribution_centers
    - synonyms:
        - DC
        - warehouse
        - fulfillment center
    - fields:
      - field:
        - name: name
        - description: "The name of the distribution center."
- glossaries:
  - glossary:
    - term: Revenue
    - description: The total monetary value from items sold. Maps to the measure 'order_items.total_sale_price'.
    - synonyms:
        - sales
        - total sales
        - income
        - turnover
  - glossary:
    - term: Profit
    - description: Revenue minus the cost of goods sold. Maps to the measure 'order_items.total_gross_margin'.
    - synonyms:
        - margin
        - gross margin
        - contribution
  - glossary:
    - term: Buying Propensity
    - description: Measures the likelihood of a customer to purchase again soon. Primarily maps to the 'order_items.30_day_repeat_purchase_rate' measure.
    - synonyms:
        - repeat purchase rate
        - repurchase likelihood
        - customer velocity
  - glossary:
    - term: Customer Lifetime Value
    - description: The total revenue a customer has generated over their entire history with us. Maps to 'user_order_facts.lifetime_revenue'.
    - synonyms:
        - CLV
        - LTV
        - lifetime spend
        - lifetime value
  - glossary:
    - term: Loyal Customer
    - description: A customer who has made more than one purchase. Maps to the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value loyal customers are those with high 'user_order_facts.lifetime_revenue'.
    - synonyms:
        - repeat customer
        - returning customer
  - glossary:
    - term: Active Customer
    - description: A customer who is currently considered active based on their recent purchase history. Mapped to 'user_order_facts.currently_active_customer' being 'Yes'.
    - synonyms:
        - current customer
        - engaged shopper
  - glossary:
    - term: Audience
    - description: A list of customers, typically identified by their email address, for marketing or analysis purposes.
    - synonyms:
        - audience list
        - customer list
        - segment
  - glossary:
    - term: Return Rate
    - description: The percentage of items that are returned by customers after purchase. Mapped to 'order_items.return_rate'.
    - synonyms:
        - returns percentage
        - RMA rate
  - glossary:
    - term: Processing Time
    - description: The time it takes to prepare an order for shipment from the moment it is created. Maps to 'order_items.average_days_to_process'.
    - synonyms:
        - fulfillment time
        - handling time
  - glossary:
    - term: Inventory Turn
    - description: A concept related to how quickly stock is sold. This can be analyzed using 'inventory_items.days_in_inventory' (lower days means higher turn).
    - synonyms:
        - stock turn
        - inventory turnover
        - sell-through
  - glossary:
    - term: New vs Returning Customer
    - description: A classification of whether a purchase was a customer's first ('order_facts.is_first_purchase' is Yes) or if they are a repeat buyer ('user_order_facts.repeat_customer' is Yes).
    - synonyms:
        - customer type
        - first-time buyer
- additional_descriptions:
  - text: The user is typically a Sales Manager, Product Manager, or Marketing Analyst. They need to understand performance trends, build customer lists for campaigns, and analyze product sales.
  - text: This agent can answer complex questions by joining data about sales line items, products, users, inventory, and distribution centers.

Ejemplo: instrucciones del sistema en Looker con JSON

Los agentes de datos de Looker también pueden adaptarse a instrucciones del sistema escritas en JSON. Aquí tienes un ejemplo similar al que se proporcionó en YAML:

{
  "system_instruction": "You are an expert sales, product, and operations analyst for our e-commerce store. Your primary function is to answer questions by querying the 'Order Items' Explore. Always be concise and data-driven. When asked about 'revenue' or 'sales', use 'order_items.total_sale_price'. For 'profit' or 'margin', use 'order_items.total_gross_margin'. For 'customers' or 'users', use 'users.count'. The default date for analysis is 'order_items.created_date' unless specified otherwise. For advanced statistical questions, such as correlation or regression analysis, use the Python tool to fetch the necessary data, perform the calculation, and generate a plot (like a scatter plot or heatmap).",
  "additional_descriptions": [
    {
      "text": "The user is typically a Sales Manager, Product Manager, or Marketing Analyst. They need to understand performance trends, build customer lists for campaigns, and analyze product sales."
    },
    {
      "text": "This agent can answer complex questions by joining data about sales line items, products, users, inventory, and distribution centers."
    }
  ],
  "glossaries": [
    {
      "term": "Revenue",
      "description": "The total monetary value from items sold. Maps to the measure 'order_items.total_sale_price'.",
      "synonyms": [
        "sales",
        "total sales",
        "income",
        "turnover"
      ]
    },
    {
      "term": "Profit",
      "description": "Revenue minus the cost of goods sold. Maps to the measure 'order_items.total_gross_margin'.",
      "synonyms": [
        "margin",
        "gross margin",
        "contribution"
      ]
    },
    {
      "term": "Buying Propensity",
      "description": "Measures the likelihood of a customer to purchase again soon. Primarily maps to the 'order_items.30_day_repeat_purchase_rate' measure.",
      "synonyms": [
        "repeat purchase rate",
        "repurchase likelihood",
        "customer velocity"
      ]
    },
    {
      "term": "Customer Lifetime Value",
      "description": "The total revenue a customer has generated over their entire history with us. Maps to 'user_order_facts.lifetime_revenue'.",
      "synonyms": [
        "CLV",
        "LTV",
        "lifetime spend",
        "lifetime value"
      ]
    },
    {
      "term": "Loyal Customer",
      "description": "A customer who has made more than one purchase. Maps to the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value loyal customers are those with high 'user_order_facts.lifetime_revenue'.",
      "synonyms": [
        "repeat customer",
        "returning customer"
      ]
    },
    {
      "term": "Active Customer",
      "description": "A customer who is currently considered active based on their recent purchase history. Mapped to 'user_order_facts.currently_active_customer' being 'Yes'.",
      "synonyms": [
        "current customer",
        "engaged shopper"
      ]
    },
    {
      "term": "Audience",
      "description": "A list of customers, typically identified by their email address, for marketing or analysis purposes.",
      "synonyms": [
        "audience list",
        "customer list",
        "segment"
      ]
    },
    {
      "term": "Return Rate",
      "description": "The percentage of items that are returned by customers after purchase. Mapped to 'order_items.return_rate'.",
      "synonyms": [
        "returns percentage",
        "RMA rate"
      ]
    },
    {
      "term": "Processing Time",
      "description": "The time it takes to prepare an order for shipment from the moment it is created. Maps to 'order_items.average_days_to_process'.",
      "synonyms": [
        "fulfillment time",
        "handling time"
      ]
    },
    {
      "term": "Inventory Turn",
      "description": "A concept related to how quickly stock is sold. This can be analyzed using 'inventory_items.days_in_inventory' (lower days means higher turn).",
      "synonyms": [
        "stock turn",
        "inventory turnover",
        "sell-through"
      ]
    },
    {
      "term": "New vs Returning Customer",
      "description": "A classification of whether a purchase was a customer's first ('order_facts.is_first_purchase' is Yes) or if they are a repeat buyer ('user_order_facts.repeat_customer' is Yes).",
      "synonyms": [
        "customer type",
        "first-time buyer"
      ]
    }
  ],
  "tables": [
    {
      "name": "order_items",
      "synonyms": [
        "sales",
        "line items",
        "items sold",
        "orders"
      ]
    },
    {
      "name": "products",
      "synonyms": [
        "items",
        "merchandise",
        "SKUs"
      ]
    },
    {
      "name": "users",
      "synonyms": [
        "customers",
        "shoppers",
        "clients",
        "purchasers"
      ]
    },
    {
      "name": "inventory_items",
      "synonyms": [
        "stock",
        "inventory",
        "on-hand"
      ]
    },
    {
      "name": "distribution_centers",
      "synonyms": [
        "DC",
        "warehouse",
        "fulfillment center"
      ]
    }
  ],
  "golden_queries": [
    {
      "question": "what were total sales over the last year",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.total_sale_price"
        ],
        "filters": {
          "order_items.created_date": "last year"
        },
        "sorts": [],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "Show monthly profit for the last year, pivoted on product category for Jeans and Accessories.",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "products.category",
          "order_items.total_gross_margin",
          "order_items.created_month_name"
        ],
        "filters": {
          "products.category": "Jeans,Accessories",
          "order_items.created_date": "last year"
        },
        "pivots": [
          "products.category"
        ],
        "sorts": [
          "order_items.created_month_name asc",
          "order_items.total_gross_margin desc 0"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "what were total sales over the last year break it down by brand only include brands with over 50000 in revenue",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.total_sale_price",
          "products.brand"
        ],
        "filters": {
          "order_items.created_date": "last year",
          "order_items.total_sale_price": ">50000"
        },
        "sorts": [
          "order_items.total_sale_price desc 0"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "What is the buying propensity by Brand?",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.30_day_repeat_purchase_rate",
          "products.brand"
        ],
        "filters": {},
        "sorts": [
          "order_items.30_day_repeat_purchase_rate desc 0"
        ],
        "limit": "10",
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "How many items are still in 'Processing' status for more than 3 days, by Distribution Center?",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "distribution_centers.name",
          "order_items.count"
        ],
        "filters": {
          "order_items.created_date": "before 3 days ago",
          "order_items.status": "Processing"
        },
        "sorts": [
          "order_items.count desc"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "Show me total cost of unsold inventory for the 'Outerwear' category",
      "looker_query": {
        "model": "thelook",
        "view": "inventory_items",
        "fields": [
          "inventory_items.total_cost"
        ],
        "filters": {
          "inventory_items.is_sold": "No",
          "products.category": "Outerwear"
        },
        "sorts": [],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "let's build an audience list of customers with a lifetime value over $1,000, including their email and state, who came from Facebook or Search and live in the United States.",
      "looker_query": {
        "model": "thelook",
        "view": "users",
        "fields": [
          "users.email",
          "users.state",
          "user_order_facts.lifetime_revenue"
        ],
        "filters": {
          "user_order_facts.lifetime_revenue": ">1000",
          "users.country": "United States",
          "users.traffic_source": "Facebook,Search"
        },
        "sorts": [
          "user_order_facts.lifetime_revenue desc 0"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "Show me a list of my most loyal customers and when their last order was.",
      "looker_query": {
        "model": "thelook",
        "view": "users",
        "fields": [
          "users.id",
          "users.email",
          "user_order_facts.lifetime_revenue",
          "user_order_facts.lifetime_orders",
          "user_order_facts.latest_order_date"
        ],
        "filters": {
          "user_order_facts.repeat_customer": "Yes"
        },
        "sorts": [
          "user_order_facts.lifetime_revenue desc"
        ],
        "limit": "50",
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "What's the breakdown of customers by age tier?",
      "looker_query": {
        "model": "thelook",
        "view": "users",
        "fields": [
          "users.age_tier",
          "users.count"
        ],
        "filters": {},
        "sorts": [
          "users.count desc"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "What is the total revenue from new customers acquired this year?",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.total_sale_price"
        ],
        "filters": {
          "user_order_facts.first_order_year": "this year"
        },
        "sorts": [],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    }
  ],
  "golden_action_plans": [
    {
      "natural_language_query": "whats the correlation between customer age cohort and buying propensity.",
      "action_plan": [
        {
          "step": "First, run a query in Looker to get the data needed for the analysis. You need to group by `users.age` (NOT AGE TIER) and calculate the average `order_items.30_day_repeat_purchase_rate` for each age."
        },
        {
          "step": "Then, pass the resulting data table to the Python tool. Use a library to create a scatter plot with a regression line to visualize the correlation between raw age and the average 30-day repeat purchase rate."
        }
      ]
    }
  ]
}