This page applies to Apigee and Apigee hybrid.
Overview
The DataCapture policy captures data (such as payload, HTTP headers, and path or query parameters) from an API proxy for use in Analytics. You can use captured data in custom Analytics reports, as well as to implement monetization and monitoring rules.
This policy is an Extensible policy and use of this policy might have cost or utilization implications, depending on your Apigee license. For information on policy types and usage implications, see Policy types.
Data collector resource
To use the DataCapture policy, you must first create a data collector resource. For steps to create a data collector resource using either the Apigee UI or the Apigee API, see Creating a data collector.
<DataCapture>
The <DataCapture>
element defines a DataCapture
policy.
<DataCapture async="true" continueOnError="true" enabled="true" name="DC">
Here's an example of a DataCapture
policy:
<DataCapture name="DC-1"> <Capture> <DataCollector>dc_data_collector</DataCollector> <Collect ref="my_data_variable" /> </Capture> </DataCapture>
The main element of the DataCapture policy is the <Capture> element, which specifies the means of capturing the data. It has two required child elements:
- The <DataCollector> element, which specifies a data collector REST resource. In this case, the resource is named dc_data_collector.
- The <Collect> element, which specifies the means for capturing the data.
In this simple example, the data is extracted from a variable named my_data_variable, which has been created elsewhere in the proxy. The variable is specified by the ref attribute.
The <Collect>
element also provides several other ways
of capturing data from various sources through its
child elements.
See Examples for more examples of capturing data
with the DataCapture
policy.
The DataCapture
element has the following syntax.
<DataCapture name="capturepayment" continueOnError="false" enabled="true">
  <DisplayName>Data-Capture-Policy-1</DisplayName>
  <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>
  <ThrowExceptionOnLimit>false</ThrowExceptionOnLimit>
  <!-- Existing Variable -->
  <Capture>
    <Collect ref="existing-variable" default="0"></Collect>
    <DataCollector>dc_1</DataCollector>
  </Capture>
  <!-- JSONPayload -->
  <Capture>
    <DataCollector>dc_2</DataCollector>
    <Collect default="0">
      <Source>request</Source>
      <JSONPayload>
        <JSONPath>result.var</JSONPath>
      </JSONPayload>
    </Collect>
  </Capture>
  <!-- URIPath -->
  <Capture>
    <DataCollector>dc_3</DataCollector>
    <Collect default="0">
      <URIPath>
        <!-- All patterns must specify a single variable to extract named $ -->
        <Pattern ignoreCase="false">/foo/{$}</Pattern>
        <Pattern ignoreCase="false">/foo/bar/{$}</Pattern>
      </URIPath>
    </Collect>
  </Capture>
</DataCapture>
This element has the following attributes that are common to all policies:
Attribute | Default | Required? | Description |
---|---|---|---|
name | N/A | Required | The internal name of the policy. Optionally, use the <DisplayName> element to label the policy with a different, natural-language name. |
continueOnError | false | Optional | Set to false to return an error when a policy fails. This is expected behavior for most policies. Set to true to have flow execution continue even after a policy fails. |
enabled | true | Optional | Set to true to enforce the policy. Set to false to turn off the policy. The policy will not be enforced even if it remains attached to a flow. |
async | false | Deprecated | This attribute is deprecated. |
The following table provides a high-level description of the child elements of
<DataCapture>
.
Child Element | Required | Description |
---|---|---|
<Capture> |
Required | Captures the data for a specified variable. |
Examples
The following examples illustrate various ways to use the DataCapture
policy.
Capturing data for a built-in variable
The code sample below illustrates how to capture data
for a built-in variable, message.content
, which contains the
content of the request, response, or error message. See
Flow variables for
more information about built-in variables.
<DataCapture name="DC-FullMessage"> <Capture> <DataCollector>dc_data_collector</DataCollector> <Collect ref="message.content" /> </Capture> </DataCapture>
In the code above, the ref attribute of the <Collect> element specifies the variable to capture, which in this example is named "message.content".
The sample captures the data with a <Capture>
element,
which also contains a <DataCollector>
element specifying
the name of the data collector resource.
Capturing data inline
The next example shows how to capture data inline using
<JSONPayload>
, a child element of the
<Collect>
element.
<DataCapture name="DC-Currency"> <Capture> <DataCollector>dc_data_collector</DataCollector> <Collect> <JSONPayload> <JSONPath>$.results[0].currency</JSONPath> </JSONPayload> </Collect> </Capture> </DataCapture>
In the code above:
- The <JSONPayload> element specifies the JSON-formatted message from which the value of the variable is extracted.
- The <JSONPath> element specifies the JSON path used to extract the value from the message, which in this case is $.results[0].currency.
As an illustration, suppose the value extracted at the time
the message was received is 1120
. Then
the resulting entry sent to Analytics would be
{ "dc_data_collector": "1120" }
<Capture>
The <Capture>
element specifies the means of capturing the data.
<Capture />
The following table provides a high-level description of the child elements of
<Capture>
.
Child Element | Required? | Description |
---|---|---|
<DataCollector> |
Required | Specifies the data collector resource. |
<Collect> |
Required | Specifies the means for capturing data. |
<DataCollector>
The <DataCollector>
element specifies the
data collector resource.
<DataCollector>dc_data_collector</DataCollector>
The following table describes the attributes of the <DataCollector> element.
Attribute | Description | Default | Required? | Type |
---|---|---|---|---|
scope |
Specify this attribute and set the value to |
N/A | Optional | String |
The body of the <DataCollector>
element contains the name of the
data collector resource.
<Collect>
The <Collect>
element specifies the means for capturing data.
<Collect ref="existing-variable" default="0"/>
The following table describes the attributes of the <Collect>
element.
Attribute | Description | Default | Required? | Type |
---|---|---|---|---|
ref | The variable for which you are capturing data. | N/A | Optional. If ref is omitted, exactly one of the following must be specified: QueryParam, Header, FormParam, URIPath, JSONPayload, or XMLPayload. | String |
default | Specifies the value that is sent to Analytics if the value of the variable is not populated at runtime. For example, if you set default="0", the value sent to Analytics would be 0. | If you don't specify the value of default, and the value of the variable is not populated at runtime, the value sent to Analytics is null for a numeric variable or "Not set" for a string variable. | Required | String |
The data can be captured from an existing variable using the ref attribute, or by using the child elements of <Collect>.
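As a sketch of the two approaches side by side, the policy below uses one <Capture> that reads an existing variable through ref and another that extracts a value through a child element. The policy name, the data collector names (dc_client_id and dc_user_agent), and the choice of the client_id variable and User-Agent header are assumptions made for illustration; the data collectors would need to exist in your organization before deployment.

```xml
<DataCapture name="DC-RefAndChild">
  <!-- Approach 1: capture from an existing flow variable via the ref attribute.
       client_id is an illustrative variable assumed to be set earlier in the flow. -->
  <Capture>
    <DataCollector>dc_client_id</DataCollector>
    <Collect ref="client_id" default="unknown"/>
  </Capture>
  <!-- Approach 2: no ref, so exactly one extraction child element is used.
       Here the whole value of the User-Agent header is captured. -->
  <Capture>
    <DataCollector>dc_user_agent</DataCollector>
    <Collect default="unknown">
      <Header name="User-Agent">
        <Pattern ignoreCase="true">{$}</Pattern>
      </Header>
    </Collect>
  </Capture>
</DataCapture>
```

Each <Capture> pairs one data collector with one <Collect>, so a policy that captures several values repeats the <Capture> element.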
Child elements of <Collect>
The following table provides a high-level description of the child elements of
<Collect>
:
Child Element | Required? | Description |
---|---|---|
<Source> |
Optional | Specifies the variable to be parsed. |
<URIPath> |
Optional | Extracts a value from the proxy.pathsuffix
of a request source message. |
<QueryParam> |
Optional | Extracts a value from the specified query parameter of a request source message. |
<Header> |
Optional | Extracts a value from the specified HTTP header of the specified request or response message. |
<FormParam> |
Optional | Extracts a value from the specified form parameter of the specified request or response message. |
<JSONPayload> |
Optional | Specifies the JSON-formatted message from which the value of the variable will be extracted. |
<XMLPayload> |
Optional | Specifies the XML-formatted message from which the value of the variable will be extracted. |
<Source>
Specifies a variable naming the message to be parsed. The value of
<Source>
defaults to message
. The message
value
is context-sensitive. In a request flow, message
resolves to the request message. In
a response flow, message
resolves to the response message.
If the variable specified in <Source>
cannot be resolved, or resolves to a non-message type,
the policy will fail to respond.
Default Value | N/A |
Required? | Optional |
Type | String |
Parent Element |
<Collect> |
Child Elements | N/A |
<Source>request</Source>
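Because message is context-sensitive, it can help to name the message explicitly when the value must come from a specific side of the exchange. The following sketch pins extraction to the response message. The policy name, the data collector dc_order_status, and the JSON path are hypothetical, and the sketch assumes the policy is attached to a flow where the response is already available.

```xml
<DataCapture name="DC-ResponseStatus">
  <Capture>
    <DataCollector>dc_order_status</DataCollector>
    <Collect default="unknown">
      <!-- Parse the response message rather than the context-sensitive default -->
      <Source>response</Source>
      <JSONPayload>
        <!-- Hypothetical path into the response body -->
        <JSONPath>$.order.status</JSONPath>
      </JSONPayload>
    </Collect>
  </Capture>
</DataCapture>
```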
<URIPath>
Extracts a value
from the proxy.pathsuffix
of a request
source message. The path applied to
the pattern is the proxy.pathsuffix
, which does not include the basepath for the API proxy. If
the source message resolves to a message type of response
, the element does nothing.
Default Value | N/A |
Required? | Optional |
Type | Complex |
Parent Element |
<Collect> |
Child Elements | <Pattern> |
Attributes
Attribute | Description | Default | Required? | Type |
---|---|---|---|---|
ignoreCase | Specifies to ignore case when matching the pattern. |
false |
Optional | Boolean |
<Collect> <URIPath> <Pattern ignoreCase="false">/foo/{$}</Pattern> </URIPath> </Collect>
You can use multiple <Pattern>
elements:
<URIPath> <Pattern ignoreCase="false">/foo/{$}</Pattern> <Pattern ignoreCase="false">/foo/bar/{$}</Pattern> </URIPath>
<QueryParam>
Extracts a value
from the specified query parameter of a request
source message. If the
source message resolves to a message type of response
, the element does
nothing.
Default Value | N/A |
Required? | Optional |
Type | Complex |
Parent Element |
<Collect> |
Child Elements | <Pattern> |
Attributes
Attribute | Description | Default | Required? | Type |
---|---|---|---|---|
name | Specifies the name of the query parameter. If multiple query parameters have the same name, use indexed referencing, where the first instance of the query parameter has no index, the second is at index 2, the third at index 3, etc. |
N/A |
Required | String |
<Collect> <QueryParam name="code"> <Pattern ignoreCase="true">{$}</Pattern> </QueryParam> </Collect>
If multiple query parameters have the same name, use indices to reference the parameters:
<Collect> <QueryParam name="code.2"> <Pattern ignoreCase="true">{$}</Pattern> </QueryParam> </Collect>
Note: You must specify a single variable named {$}
.
There may be multiple unique Pattern
elements, but the first matching pattern will
resolve for a particular request.
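The first-match behavior can be used to try a specific pattern before a general one. The fragment below is a sketch that assumes patterns are tried in the order they are listed; the promo- prefix is a made-up convention for illustration.

```xml
<Collect>
  <QueryParam name="code">
    <!-- Listed first: captures only the part after the "promo-" prefix -->
    <Pattern ignoreCase="true">promo-{$}</Pattern>
    <!-- Fallback: captures the whole parameter value -->
    <Pattern ignoreCase="true">{$}</Pattern>
  </QueryParam>
</Collect>
```

Under that assumption, a request with code=promo-SUMMER would yield SUMMER from the first pattern, while code=WINTER would fall through to the second pattern and yield WINTER.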
<Header>
Extracts a value
from the specified HTTP header of the specified request
or
response
message. If multiple headers have
the same name, their values are stored in an array.
Default Value | N/A |
Required? | Optional |
Type | Complex |
Parent Element |
<Collect> |
Child Elements | <Pattern> |
Attributes
Attribute | Description | Default | Required? | Type |
---|---|---|---|---|
name | Specifies the name of the header from which you extract the value. If multiple
headers have the same name, use indexed referencing, where the first instance of the
header has no index, the second is at index 2, the third at index 3, etc. Use
.values to get all headers in the array. |
N/A |
Required | String |
<Collect> <Header name="my_header"> <Pattern ignoreCase="false">Bearer {$}</Pattern> </Header> </Collect>
If multiple headers have the same name, use indices to reference individual headers in the array:
<Collect> <Header name="my_header.2"> <Pattern ignoreCase="true">{$}</Pattern> </Header> </Collect>
Or the following to list all the headers in the array:
<Collect> <Header name="my_header.values"> <Pattern ignoreCase="true">{$}</Pattern> </Header> </Collect>
<FormParam>
Extracts a value
from the specified form parameter of the specified request
or response
message. Form parameters
can be extracted only when the Content-Type
header of the specified message is
application/x-www-form-urlencoded
.
Default Value | N/A |
Required? | Optional |
Type | Complex |
Parent Element |
<Collect> |
Child Elements | <Pattern> |
Attributes
Attribute | Description | Default | Required? | Type |
---|---|---|---|---|
name | The name of the form parameter from which you extract the value. |
N/A |
Optional | String |
<Collect> <FormParam name="greeting"> <Pattern>hello {$}</Pattern> </FormParam> </Collect>
<JSONPayload>
Specifies the JSON-formatted message from which the value of the variable will be extracted. JSON extraction is performed only when the message's Content-Type header is application/json.
Default Value | N/A |
Required? | Optional |
Type | Complex |
Parent Element |
<Collect> |
Child Elements | <JSONPath> |
<Collect> <JSONPayload> <JSONPath>$.results[0].currency</JSONPath> </JSONPayload> </Collect>
<JSONPath>
Required child element of the <JSONPayload>
element. Specifies the
JSON path used to extract a value from a JSON-formatted message.
Default Value | N/A |
Required? | Required |
Type | String |
Parent Element |
<JSONPayload> |
Child Elements | N/A |
<JSONPath>$.rss.channel.title</JSONPath>
<XMLPayload>
Specifies the
XML-formatted message from which the value of the variable will be extracted. XML payloads
are extracted only when the Content-Type
header of the message
is text/xml
, application/xml
,
or application/*+xml
.
Default Value | N/A |
Required? | Optional |
Type | Complex |
Parent Element |
<Collect> |
Child Elements |
<Namespaces> <XPath> |
The following table provides a high-level description of the child elements of
<XMLPayload>
.
Child Element | Required? | Description |
---|---|---|
<Namespaces> |
Optional | Specifies zero or more namespaces to be used in the XPath evaluation. |
<XPath> |
Required | Specifies the XPath defined for the variable. |
<Collect> <XMLPayload> <Namespaces> <Namespace prefix="soap">http://schemas.xmlsoap.org/soap/envelope/</Namespace> <Namespace prefix="ns1">http://ns1.example.com/operations</Namespace> </Namespaces> <!-- get the local name of the SOAP operation --> <XPath>local-name(/soap:Envelope/soap:Body/ns1:*[1])</XPath> </XMLPayload> </Collect>
<Namespaces>
Specifies the set of namespaces that can be used in the XPath expression. For example:
<Collect> <XMLPayload> <Namespaces> <Namespace prefix="maps">http://maps.example.com</Namespace> <Namespace prefix="places">http://places.example.com</Namespace> </Namespaces> <XPath>/maps:Directions/maps:route/maps:leg/maps:endpoint/places:name</XPath> </XMLPayload> </Collect>
If you are not using namespaces in your XPath expressions, you can omit or comment out the
<Namespaces>
element, as the following example shows:
<Collect> <XMLPayload> <!-- <Namespaces/> --> <XPath>/Directions/route/leg/name</XPath> </XMLPayload> </Collect>
<Namespace>
Specifies one namespace and a corresponding prefix for use within the XPath expression.
Default Value | N/A |
Required? | Optional |
Type | String |
Parent Element |
<Namespaces> |
Child Elements | N/A |
Attributes
Attribute | Description | Default | Required? | Type |
---|---|---|---|---|
prefix |
The prefix you use to refer to the namespace in the XPath expression. This need not be the same prefix as is used in the original XML document. |
N/A |
Required | String |
<Collect> <XMLPayload> <Namespaces> <Namespace prefix="maps">http://maps.example.com</Namespace> </Namespaces> <XPath>/maps:Directions/maps:route/maps:leg/maps:endpoint</XPath> </XMLPayload> </Collect>
<XPath>
Required child element of the XMLPayload element. Specifies the XPath defined for the variable. Only XPath 1.0 expressions are supported.
Default Value | N/A |
Required? | Required |
Type | String |
Parent Element |
<XMLPayload> |
Child Elements | N/A |
<XPath>/test/example</XPath>
Note: If you use namespaces in your XPath expressions, you must declare
the namespaces in the
<XMLPayload><Namespaces>
section
of the policy.
<ThrowExceptionOnLimit>
The <ThrowExceptionOnLimit> element specifies what happens when the capture limits on the number of variables or the maximum size of a variable are reached. See Enforcing capture limits.
The value of <ThrowExceptionOnLimit>
can be one of the following:
- false: The data for the variables is sent to Analytics.
- true: An error message is returned, and the data is not sent to Analytics.
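As a minimal sketch, the policy below sets <ThrowExceptionOnLimit> to true, placing it as a direct child of <DataCapture> in the same position as in the syntax listing earlier on this page. The policy name is hypothetical. The example captures message.content, which is used here only because a full message body could plausibly exceed the per-variable size limit.

```xml
<DataCapture name="DC-StrictLimits" continueOnError="false" enabled="true">
  <!-- Return an error instead of sending data when a capture limit is reached -->
  <ThrowExceptionOnLimit>true</ThrowExceptionOnLimit>
  <Capture>
    <DataCollector>dc_data_collector</DataCollector>
    <Collect ref="message.content"/>
  </Capture>
</DataCapture>
```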
Error Reference
Runtime errors
The table below describes runtime errors, which can occur when the policy executes.
Fault code | Cause |
---|---|
DataCollectorTypeMismatch |
The value to be captured did not match the |
ExtractionFailure |
The data extraction failed. |
UnresolvedVariable |
The variable does not exist. |
VariableCountLimitExceeded |
The number of captured variables exceeded the variable count limit of 100 variables. |
VariableValueLimitExceeded |
The size of a captured value exceeded the single variable limit of 400 bytes. |
MsgContentParsingFailed |
Message content failed to be parsed into XML or JSON. |
InvalidMsgContentType |
The message content type does not match the expected message content type in the policy's capture clause. |
NonMsgVariable |
The <Source> element value did not reference a message variable. |
JSONPathQueryFailed |
The JSONPath query failed to resolve to a value. |
PrivateVariableAccess |
Attempt to access a private variable failed. |
XPathEvaluationFailed |
XPath failed to resolve to a value. |
Runtime errors are returned in two ways:
- Error response back to client (continueOnError=false)
  When the policy's continueOnError attribute is set to false, errors that occur during the policy execution will abort the message processing and return a descriptive error message. The policy will attempt to capture all the relevant errors in the data capture policy before returning the message.
- DataCapture errors analytics field
  The dataCapturePolicyErrors field contains a list of all errors that have occurred. An example of how this would appear in the analytics data map is shown below:
  # Example payload
  [ { errorType: TypeMismatch, policyName: MyDataCapturePolicy-1, dataCollector: purchaseValue }, { errorType: MaxValueSizeLimitReached, policyName: MyDataCapturePolicy-1, dataCollector: purchasedItems }, ]
This field is subject to the 400 byte variable size limit.
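For contrast with the continueOnError=false case, the sketch below sets continueOnError to true. According to the attribute table earlier on this page, flow execution then continues even after the policy fails, so capture errors would be expected to surface through the dataCapturePolicyErrors field rather than in the client response. The policy name, data collector name, and JSON path are hypothetical.

```xml
<DataCapture name="DC-BestEffort" continueOnError="true" enabled="true">
  <Capture>
    <DataCollector>dc_purchase_value</DataCollector>
    <Collect>
      <JSONPayload>
        <!-- If this path does not resolve, the proxy flow is expected to continue -->
        <JSONPath>$.purchase.value</JSONPath>
      </JSONPayload>
    </Collect>
  </Capture>
</DataCapture>
```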
Deployment errors
Fault code | Cause |
---|---|
DeploymentAssertionError |
The DataCollector referenced in the policy couldn't be found in the organization during deployment. |
JsonPathCompilationFailed |
Compiling with the specified JSONPath failed. |
XPathCompilationFailed |
If the prefix or the value used in the XPath element is not part
of any of the
declared namespaces in the policy, then the deployment of the API proxy
fails. |
PatternCompilationFailed |
Pattern compilation failed. |
Finding DataCapture errors in the Debug tool
The dataCapturePolicyErrors
variable is available in the Debug tool.
This is an additional tool that you can use to catch errors without going to Analytics.
For example, you can catch an error that occurs if you upgrade your version of the hybrid runtime
and inadvertently break the analytics in an already deployed proxy.
Enforcing capture limits
Apigee enforces the following limits on variables in the captured data:
- The number of variables allowed is 100.
- The maximum size of each variable (including list values) is 400 bytes.
When the DataCapture policy executes, the following checks are applied before a value is added to the data capture map in the message context:
- If the limit on the number of variables has been reached, the new variable will be dropped.
- If the limit on the size of the variables has been reached, the value will be trimmed to fit within the desired limits.
In both cases:
- A debug message will be logged to the Message Processor log.
- A limit reached error message will be appended to dataCapturePolicyErrors, which will be available in both Analytics and Debug. Note: Only one error message for reaching the maximum number of allowed variables will be appended.
- If <ThrowExceptionOnLimit> is true, the data is not sent to Analytics and instead an error is returned to the client.