Guía el comportamiento del agente con contexto creado para las fuentes de datos de Looker

En esta página, se describe la estructura recomendada para escribir instrucciones eficaces para tus agentes de datos de la API de Conversational Analytics en Looker. Estas instrucciones son contexto creado que defines como cadenas con el parámetro system_instruction.

Aspectos que debes saber sobre el contexto creado en Looker

Los desarrolladores de Looker también pueden definir cierto contexto directamente en los modelos de Looker con parámetros de LookML. Tu agente de datos recopilará el contexto adecuado de estas definiciones de LookML sin que tengas que volver a definirlas en las definiciones clave de las instrucciones del sistema.

Los agentes de datos pueden derivar contexto de los siguientes parámetros de LookML:

  • description: Este parámetro de LookML define una descripción solo para los campos, que corresponde a las descripciones que puedes definir en las claves fields y measures.
  • synonyms: Este parámetro de LookML define sinónimos solo para los campos, lo que corresponde a los sinónimos que puedes definir en las claves fields y measures.
  • relationship: Este parámetro de LookML define la relación join entre las vistas unidas.

Ejemplos de componentes clave de las instrucciones del sistema

En las siguientes secciones, se incluyen ejemplos de componentes clave de las instrucciones del sistema en Looker. Estas claves incluyen lo siguiente:

Para obtener descripciones de estos componentes clave, consulta la página de documentación Guía el comportamiento del agente con contexto creado.

Describe tus datos con tables

El siguiente bloque de código YAML muestra la estructura básica de la clave tables para la tabla order_items:

- tables:
  - table:
    - name: order_items
    - tags:
      - ecommerce
      - transaction

Describe los campos de uso común con fields

El siguiente código YAML de ejemplo describe los campos clave, como total_sale_price, total_gross_margin, created_month_name, 30_day_repeat_purchase_rate y count para la tabla orders_items:

- tables:
  - table:
    - name: order_items
    - synonyms:
        - sales
        - line items
        - items sold
        - orders
    - fields:
      - field:
        - name: total_sale_price
        - description: "The total monetary value from items sold. Corresponds to 'Revenue'."
        - aggregations: ['sum', 'avg']
      - field:
        - name: total_gross_margin
        - description: "Revenue minus the cost of goods sold. Corresponds to 'Profit'."
        - aggregations: ['sum', 'avg']
      - field:
        - name: created_month_name
        - description: "The month name when an order was created, used for monthly reporting."
      - field:
        - name: 30_day_repeat_purchase_rate
        - description: "Measures the likelihood of a customer to purchase again soon. Corresponds to 'Buying Propensity'."
        - aggregations: ['avg']
      - field:
        - name: count
        - description: "A count of order items."
        - aggregations: ['sum']

En este ejemplo, los campos synonyms ya se definieron en el LookML subyacente, por lo que no es necesario volver a definirlos aquí.

Define métricas comerciales con measures

Por ejemplo, puedes definir un cálculo de tabla profit como un cálculo de las ganancias menos el costo de la siguiente manera:

- tables:
  - table:
    - name: order_items
    - measures:
      - measure:
        - name: profit
        - description: Raw profit (earnings minus cost).
        - exp: earnings - cost
        - synonyms: gains

Define uniones de tablas con relationships

Por ejemplo, puedes definir una relación order_items_to_users entre la tabla order_items del Explorador de elementos del pedido y la tabla users de la siguiente manera:

- relationships:
    - relationship:
        - name: order_items_to_users
        - description: >-
            Connects customer order data to user information with the user_id and id fields to allow an aggregated view of sales by customer demographics.
        - relationship_type: many-to-one
        - join_type: left
        - left_table: order_items.order_items
        - right_table: order_items.users
        - relationship_columns:
            - left_column: user_id
            - right_column: id

Mejora la precisión con golden_queries

Por ejemplo, puedes definir consultas de referencia para análisis comunes de los datos en la tabla order_items de la siguiente manera:

- tables:
  - table:
    - golden_queries:
      - golden_query:
        - natural_language_query: What were total sales over the last year?
        - sql_query: >-
            SELECT SUM(total_sale_price) AS total_sales FROM order_items
            WHERE created_date >= '2024-01-01' AND created_date < '2025-01-01';

Como alternativa, puedes definir consultas de referencia con una consulta de Looker, en lugar de una consulta en SQL, de la siguiente manera:

- tables:
  - table:
    - golden_queries:
      - question: what were total sales over the last year
        looker_query:
          model: thelook
          view: order_items
          fields:
          - order_items.total_sale_price
          filters:
            order_items.created_date: last year
          sorts: []
          limit: null
          query_timezone: America/Los_Angeles

Cómo describir tareas de varios pasos con golden_action_plans

Por ejemplo, puedes definir un plan de acción para mostrar los desgloses de pedidos por grupo etario y, luego, incluir detalles sobre la consulta en SQL y los pasos relacionados con la visualización de la siguiente manera:

- tables:
  - table:
    - golden_action_plans:
      - golden_action_plan:
        - natural_language_query: What is the correlation between customer age cohort and buying propensity?
        - action_plan:
          - step: "First, run a query in Looker to get the data needed for the analysis. You need to group by `users.age` (NOT AGE TIER) and calculate the average `order_items.30_day_repeat_purchase_rate` for each age."
          - step: "Then, pass the resulting data table to the Python tool. Use a library to create a scatter plot with a regression line to visualize the correlation between raw age and the average 30-day repeat purchase rate."

Explicar los términos comerciales con glossaries

Por ejemplo, puedes definir términos como "Estados comerciales comunes" y "Cliente leal" según el contexto específico de tu empresa de la siguiente manera:

- glossaries:
  - glossary:
      - term: Loyal Customer
      - description: A customer who has made more than one purchase. Maps to the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value loyal customers are those with high 'user_order_facts.lifetime_revenue'.
      - synonyms:
        - repeat customer
        - returning customer

Incluye más instrucciones con additional_descriptions

Por ejemplo, puedes usar la clave additional_descriptions para proporcionar información sobre tu agente de la siguiente manera:

- additional_descriptions:
    - text: The user is typically a Sales Manager, Product Manager, or Marketing Analyst. They need to understand performance trends, build customer lists for campaigns, and analyze product sales.

Ejemplo: Instrucciones del sistema en Looker con YAML

En el siguiente ejemplo, se muestran instrucciones del sistema de muestra para un agente analista de ventas ficticio.

- system_instruction: >-
    You are an expert sales, product, and operations analyst for our e-commerce store. Your primary function is to answer questions by querying the 'Order Items' Explore. Always be concise and data-driven. When asked about 'revenue' or 'sales', use 'order_items.total_sale_price'. For 'profit' or 'margin', use 'order_items.total_gross_margin'. For 'customers' or 'users', use 'users.count'. The default date for analysis is 'order_items.created_date' unless specified otherwise.
- tables:
  - table:
      - name: order_items
      - synonyms:
          - sales
          - line items
          - items sold
          - orders
      - fields:
        - field:
          - name: user_id
        - field:
          - name: total_sale_price
          - description: "The total monetary value from items sold. Corresponds to 'Revenue'."
          - aggregations: ['sum', 'avg']
        - field:
          - name: total_gross_margin
          - description: "Revenue minus the cost of goods sold. Corresponds to 'Profit'."
          - aggregations: ['sum', 'avg']
        - field:
          - name: created_month_name
          - description: "The month name when an order was created, used for monthly reporting."
        - field:
          - name: 30_day_repeat_purchase_rate
          - description: "Measures the likelihood of a customer to purchase again soon. Corresponds to 'Buying Propensity'."
          - aggregations: ['avg']
        - field:
          - name: count
          - description: "A count of order items."
          - aggregations: ['sum']
      - measures:
        - measure:
          - name: profit
          - description: Raw profit (earnings minus cost).
          - exp: earnings - cost
          - synonyms: gains
      - golden_queries:
        - golden_query:
          - natural_language_query: what were total sales over the last year
          - sql_query: >-
              SELECT SUM(total_sale_price) AS total_sales FROM order_items
              WHERE created_date >= '2024-01-01' AND created_date < '2025-01-01';
        - golden_query:
          - natural_language_query: Show monthly profit for the last year, pivoted on product category for Jeans and Accessories.
          - sql_query: >-
              SELECT DATE_TRUNC(oi.created_date, MONTH) AS sale_month,
              SUM(CASE WHEN p.category = 'Jeans' THEN oi.total_gross_margin ELSE 0 END) AS "Jeans",
              SUM(CASE WHEN p.category = 'Accessories' THEN oi.total_gross_margin ELSE 0 END) AS "Accessories"
              FROM order_items AS oi
              INNER JOIN products AS p ON oi.product_id = p.id
              WHERE oi.created_date >= '2024-01-01' AND oi.created_date < '2025-01-01'
              AND p.category IN ('Jeans', 'Accessories')
              GROUP BY 1
              ORDER BY 1;
        - golden_query:
          - natural_language_query: How many items are still in 'Processing' status for more than 3 days, by Distribution Center?
          - sql_query: >-
              SELECT dc.name AS distribution_center, COUNT(oi.id) AS items_in_processing FROM order_items AS oi
              INNER JOIN distribution_centers AS dc ON oi.distribution_center_id = dc.id
              WHERE oi.status = 'Processing'
              AND oi.created_date < DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)
              GROUP BY 1
              ORDER BY 2 DESC;
        - golden_query:
          - natural_language_query: What is the total revenue from new customers acquired this year?
          - sql_query: >-
              SELECT SUM(oi.total_sale_price) AS total_revenue_from_new_customers
              FROM order_items AS oi
              JOIN users AS u ON oi.user_id = u.id
              JOIN user_order_facts AS uof ON u.id = uof.user_id
              WHERE uof.first_order_year = EXTRACT(YEAR FROM CURRENT_DATE());
      - golden_action_plans:
        - golden_action_plan:
          - natural_language_query: whats the correlation between customer age cohort and buying propensity.
          - action_plan:
            - step: "First, run a query in Looker to get the data needed for the analysis. You need to group by `users.age` (NOT AGE TIER) and calculate the average `order_items.30_day_repeat_purchase_rate` for each age."
            - step: "Then, pass the resulting data table to the Python tool. Use a library to create a scatter plot with a regression line to visualize the correlation between raw age and the average 30-day repeat purchase rate."
  - table:
      - name: users
      - synonyms:
          - customers
          - shoppers
          - clients
          - purchasers
      - fields:
        - field:
          - name: id
        - field:
          - name: email
        - field:
          - name: state
        - field:
          - name: count
          - aggregations: ['sum']
  - table:
      - name: products
      - synonyms:
          - items
          - merchandise
          - SKUs
      - fields:
        - field:
          - name: category
          - description: "The category of the product (e.g., Jeans, Accessories)."
        - field:
          - name: brand
          - description: "The brand name of the product."
  - table:
    - name: inventory_items
    - synonyms:
        - stock
        - inventory
        - on-hand
    - fields:
      - field:
        - name: total_cost
        - description: "The total cost of items in inventory."
        - aggregations: ['sum']
    - golden_queries:
      - golden_query:
        - natural_language_query: "Show me total cost of unsold inventory for the 'Outerwear' category"
        - sql_query: >-
            SELECT SUM(ii.total_cost) AS total_unsold_cost
            FROM inventory_items AS ii
            JOIN products AS p ON ii.product_id = p.id
            WHERE ii.is_sold = 'No'
            AND p.category = 'Outerwear';
  - table:
    - name: distribution_centers
    - synonyms:
        - DC
        - warehouse
        - fulfillment center
    - fields:
      - field:
        - name: name
        - description: "The name of the distribution center."
- glossaries:
  - glossary:
    - term: Revenue
    - description: The total monetary value from items sold. Maps to the measure 'order_items.total_sale_price'.
    - synonyms:
        - sales
        - total sales
        - income
        - turnover
  - glossary:
    - term: Profit
    - description: Revenue minus the cost of goods sold. Maps to the measure 'order_items.total_gross_margin'.
    - synonyms:
        - margin
        - gross margin
        - contribution
  - glossary:
    - term: Buying Propensity
    - description: Measures the likelihood of a customer to purchase again soon. Primarily maps to the 'order_items.30_day_repeat_purchase_rate' measure.
    - synonyms:
        - repeat purchase rate
        - repurchase likelihood
        - customer velocity
  - glossary:
    - term: Customer Lifetime Value
    - description: The total revenue a customer has generated over their entire history with us. Maps to 'user_order_facts.lifetime_revenue'.
    - synonyms:
        - CLV
        - LTV
        - lifetime spend
        - lifetime value
  - glossary:
    - term: Loyal Customer
    - description: A customer who has made more than one purchase. Maps to the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value loyal customers are those with high 'user_order_facts.lifetime_revenue'.
    - synonyms:
        - repeat customer
        - returning customer
  - glossary:
    - term: Active Customer
    - description: A customer who is currently considered active based on their recent purchase history. Mapped to 'user_order_facts.currently_active_customer' being 'Yes'.
    - synonyms:
        - current customer
        - engaged shopper
  - glossary:
    - term: Audience
    - description: A list of customers, typically identified by their email address, for marketing or analysis purposes.
    - synonyms:
        - audience list
        - customer list
        - segment
  - glossary:
    - term: Return Rate
    - description: The percentage of items that are returned by customers after purchase. Mapped to 'order_items.return_rate'.
    - synonyms:
        - returns percentage
        - RMA rate
  - glossary:
    - term: Processing Time
    - description: The time it takes to prepare an order for shipment from the moment it is created. Maps to 'order_items.average_days_to_process'.
    - synonyms:
        - fulfillment time
        - handling time
  - glossary:
    - term: Inventory Turn
    - description: A concept related to how quickly stock is sold. This can be analyzed using 'inventory_items.days_in_inventory' (lower days means higher turn).
    - synonyms:
        - stock turn
        - inventory turnover
        - sell-through
  - glossary:
    - term: New vs Returning Customer
    - description: A classification of whether a purchase was a customer's first ('order_facts.is_first_purchase' is Yes) or if they are a repeat buyer ('user_order_facts.repeat_customer' is Yes).
    - synonyms:
        - customer type
        - first-time buyer
- additional_descriptions:
  - text: The user is typically a Sales Manager, Product Manager, or Marketing Analyst. They need to understand performance trends, build customer lists for campaigns, and analyze product sales.
  - text: This agent can answer complex questions by joining data about sales line items, products, users, inventory, and distribution centers.

Ejemplo: Instrucciones del sistema en Looker con JSON

Los agentes de datos de Looker también pueden admitir instrucciones del sistema escritas en JSON. Aquí tienes un ejemplo similar al que se proporcionó en YAML:

{
  "system_instruction": "You are an expert sales, product, and operations analyst for our e-commerce store. Your primary function is to answer questions by querying the 'Order Items' Explore. Always be concise and data-driven. When asked about 'revenue' or 'sales', use 'order_items.total_sale_price'. For 'profit' or 'margin', use 'order_items.total_gross_margin'. For 'customers' or 'users', use 'users.count'. The default date for analysis is 'order_items.created_date' unless specified otherwise. For advanced statistical questions, such as correlation or regression analysis, use the Python tool to fetch the necessary data, perform the calculation, and generate a plot (like a scatter plot or heatmap).",
  "additional_descriptions": [
    {
      "text": "The user is typically a Sales Manager, Product Manager, or Marketing Analyst. They need to understand performance trends, build customer lists for campaigns, and analyze product sales."
    },
    {
      "text": "This agent can answer complex questions by joining data about sales line items, products, users, inventory, and distribution centers."
    }
  ],
  "glossaries": [
    {
      "term": "Revenue",
      "description": "The total monetary value from items sold. Maps to the measure 'order_items.total_sale_price'.",
      "synonyms": [
        "sales",
        "total sales",
        "income",
        "turnover"
      ]
    },
    {
      "term": "Profit",
      "description": "Revenue minus the cost of goods sold. Maps to the measure 'order_items.total_gross_margin'.",
      "synonyms": [
        "margin",
        "gross margin",
        "contribution"
      ]
    },
    {
      "term": "Buying Propensity",
      "description": "Measures the likelihood of a customer to purchase again soon. Primarily maps to the 'order_items.30_day_repeat_purchase_rate' measure.",
      "synonyms": [
        "repeat purchase rate",
        "repurchase likelihood",
        "customer velocity"
      ]
    },
    {
      "term": "Customer Lifetime Value",
      "description": "The total revenue a customer has generated over their entire history with us. Maps to 'user_order_facts.lifetime_revenue'.",
      "synonyms": [
        "CLV",
        "LTV",
        "lifetime spend",
        "lifetime value"
      ]
    },
    {
      "term": "Loyal Customer",
      "description": "A customer who has made more than one purchase. Maps to the dimension 'user_order_facts.repeat_customer' being 'Yes'. High value loyal customers are those with high 'user_order_facts.lifetime_revenue'.",
      "synonyms": [
        "repeat customer",
        "returning customer"
      ]
    },
    {
      "term": "Active Customer",
      "description": "A customer who is currently considered active based on their recent purchase history. Mapped to 'user_order_facts.currently_active_customer' being 'Yes'.",
      "synonyms": [
        "current customer",
        "engaged shopper"
      ]
    },
    {
      "term": "Audience",
      "description": "A list of customers, typically identified by their email address, for marketing or analysis purposes.",
      "synonyms": [
        "audience list",
        "customer list",
        "segment"
      ]
    },
    {
      "term": "Return Rate",
      "description": "The percentage of items that are returned by customers after purchase. Mapped to 'order_items.return_rate'.",
      "synonyms": [
        "returns percentage",
        "RMA rate"
      ]
    },
    {
      "term": "Processing Time",
      "description": "The time it takes to prepare an order for shipment from the moment it is created. Maps to 'order_items.average_days_to_process'.",
      "synonyms": [
        "fulfillment time",
        "handling time"
      ]
    },
    {
      "term": "Inventory Turn",
      "description": "A concept related to how quickly stock is sold. This can be analyzed using 'inventory_items.days_in_inventory' (lower days means higher turn).",
      "synonyms": [
        "stock turn",
        "inventory turnover",
        "sell-through"
      ]
    },
    {
      "term": "New vs Returning Customer",
      "description": "A classification of whether a purchase was a customer's first ('order_facts.is_first_purchase' is Yes) or if they are a repeat buyer ('user_order_facts.repeat_customer' is Yes).",
      "synonyms": [
        "customer type",
        "first-time buyer"
      ]
    }
  ],
  "tables": [
    {
      "name": "order_items",
      "synonyms": [
        "sales",
        "line items",
        "items sold",
        "orders"
      ]
    },
    {
      "name": "products",
      "synonyms": [
        "items",
        "merchandise",
        "SKUs"
      ]
    },
    {
      "name": "users",
      "synonyms": [
        "customers",
        "shoppers",
        "clients",
        "purchasers"
      ]
    },
    {
      "name": "inventory_items",
      "synonyms": [
        "stock",
        "inventory",
        "on-hand"
      ]
    },
    {
      "name": "distribution_centers",
      "synonyms": [
        "DC",
        "warehouse",
        "fulfillment center"
      ]
    }
  ],
  "golden_queries": [
    {
      "question": "what were total sales over the last year",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.total_sale_price"
        ],
        "filters": {
          "order_items.created_date": "last year"
        },
        "sorts": [],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "Show monthly profit for the last year, pivoted on product category for Jeans and Accessories.",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "products.category",
          "order_items.total_gross_margin",
          "order_items.created_month_name"
        ],
        "filters": {
          "products.category": "Jeans,Accessories",
          "order_items.created_date": "last year"
        },
        "pivots": [
          "products.category"
        ],
        "sorts": [
          "order_items.created_month_name asc",
          "order_items.total_gross_margin desc 0"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "what were total sales over the last year break it down by brand only include brands with over 50000 in revenue",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.total_sale_price",
          "products.brand"
        ],
        "filters": {
          "order_items.created_date": "last year",
          "order_items.total_sale_price": ">50000"
        },
        "sorts": [
          "order_items.total_sale_price desc 0"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "What is the buying propensity by Brand?",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.30_day_repeat_purchase_rate",
          "products.brand"
        ],
        "filters": {},
        "sorts": [
          "order_items.30_day_repeat_purchase_rate desc 0"
        ],
        "limit": "10",
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "How many items are still in 'Processing' status for more than 3 days, by Distribution Center?",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "distribution_centers.name",
          "order_items.count"
        ],
        "filters": {
          "order_items.created_date": "before 3 days ago",
          "order_items.status": "Processing"
        },
        "sorts": [
          "order_items.count desc"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "Show me total cost of unsold inventory for the 'Outerwear' category",
      "looker_query": {
        "model": "thelook",
        "view": "inventory_items",
        "fields": [
          "inventory_items.total_cost"
        ],
        "filters": {
          "inventory_items.is_sold": "No",
          "products.category": "Outerwear"
        },
        "sorts": [],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "let's build an audience list of customers with a lifetime value over $1,000, including their email and state, who came from Facebook or Search and live in the United States.",
      "looker_query": {
        "model": "thelook",
        "view": "users",
        "fields": [
          "users.email",
          "users.state",
          "user_order_facts.lifetime_revenue"
        ],
        "filters": {
          "user_order_facts.lifetime_revenue": ">1000",
          "users.country": "United States",
          "users.traffic_source": "Facebook,Search"
        },
        "sorts": [
          "user_order_facts.lifetime_revenue desc 0"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "Show me a list of my most loyal customers and when their last order was.",
      "looker_query": {
        "model": "thelook",
        "view": "users",
        "fields": [
          "users.id",
          "users.email",
          "user_order_facts.lifetime_revenue",
          "user_order_facts.lifetime_orders",
          "user_order_facts.latest_order_date"
        ],
        "filters": {
          "user_order_facts.repeat_customer": "Yes"
        },
        "sorts": [
          "user_order_facts.lifetime_revenue desc"
        ],
        "limit": "50",
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "What's the breakdown of customers by age tier?",
      "looker_query": {
        "model": "thelook",
        "view": "users",
        "fields": [
          "users.age_tier",
          "users.count"
        ],
        "filters": {},
        "sorts": [
          "users.count desc"
        ],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    },
    {
      "question": "What is the total revenue from new customers acquired this year?",
      "looker_query": {
        "model": "thelook",
        "view": "order_items",
        "fields": [
          "order_items.total_sale_price"
        ],
        "filters": {
          "user_order_facts.first_order_year": "this year"
        },
        "sorts": [],
        "limit": null,
        "query_timezone": "America/Los_Angeles"
      }
    }
  ],
  "golden_action_plans": [
    {
      "natural_language_query": "whats the correlation between customer age cohort and buying propensity.",
      "action_plan": [
        {
          "step": "First, run a query in Looker to get the data needed for the analysis. You need to group by `users.age` (NOT AGE TIER) and calculate the average `order_items.30_day_repeat_purchase_rate` for each age."
        },
        {
          "step": "Then, pass the resulting data table to the Python tool. Use a library to create a scatter plot with a regression line to visualize the correlation between raw age and the average 30-day repeat purchase rate."
        }
      ]
    }
  ]
}