E-commerce sample application using streaming analytics and real-time AI

The e-commerce sample application illustrates common use cases and best practices for implementing streaming data analytics and real-time AI. Use it to learn how to respond dynamically to customer actions by analyzing events in real time, and how to store, analyze, and visualize that event data for longer-term insights.

The application is implemented in Java, and uses the following products:

  • BigQuery
  • Cloud Bigtable
  • Dataflow
  • Pub/Sub

The sample application is available on GitHub at retail-java-applications.

Goals

The application was designed to address the following requirements:

  • Validate incoming data and apply corrections to it where possible.
  • Analyze clickstream data to keep a count of the number of views per product in a given time period. Store this information in a low-latency store where the application can use it to provide 'number of people who viewed this product' messages to customers on the website.
  • Use transaction data to inform inventory ordering:

    • Analyze transaction data to calculate the total number of sales for each item, both by store and globally, for a given period.
    • Analyze inventory data to calculate the incoming inventory for each item.
    • Pass this data to inventory systems on a continuous basis so it can be used for inventory purchasing decisions.
  • Write any incoming data that can't be validated or corrected to a dead-letter queue for additional analysis and processing. Expose a metric for monitoring and alerting that represents the percentage of incoming data sent to the dead-letter queue.

  • Process all incoming data into a standard format and store it in a data warehouse to use for future analysis and visualization.

  • Denormalize transaction data for in-store sales so that it can include information like the latitude and longitude of the store location. Provide the store information through a slowly changing table in BigQuery, using the store ID as a key.
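
To make two of these requirements concrete, the following is a minimal plain-Java sketch of validating clickstream events, routing uncorrectable ones to a dead-letter list, and counting views per product in fixed time windows. It is not taken from the sample application, which runs on Dataflow. The class ClickstreamSketch, the Click record, the validation rules, and the 60-second window are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; every name below is hypothetical.
class ClickstreamSketch {

    // Assumed event shape: the product viewed and the event time in epoch millis.
    record Click(String productId, long timestampMillis) {}

    // Result of validation: usable events plus events routed to the dead-letter list.
    record Validated(List<Click> valid, List<Click> deadLetter) {
        // Percentage of input that could not be corrected; 0 for empty input.
        double deadLetterPercent() {
            int total = valid.size() + deadLetter.size();
            return total == 0 ? 0.0 : 100.0 * deadLetter.size() / total;
        }
    }

    // Corrects what it can (trims whitespace around the product ID) and
    // dead-letters what it cannot (missing ID or non-positive timestamp).
    static Validated validate(List<Click> raw) {
        List<Click> valid = new ArrayList<>();
        List<Click> dead = new ArrayList<>();
        for (Click c : raw) {
            String id = c.productId() == null ? "" : c.productId().trim();
            if (id.isEmpty() || c.timestampMillis() <= 0) {
                dead.add(c);
            } else {
                valid.add(new Click(id, c.timestampMillis()));
            }
        }
        return new Validated(valid, dead);
    }

    // Counts views per product in fixed windows.
    // Outer key: window start in epoch millis. Inner key: product ID.
    static Map<Long, Map<String, Long>> countViews(List<Click> clicks, long windowMillis) {
        Map<Long, Map<String, Long>> counts = new TreeMap<>();
        for (Click c : clicks) {
            long windowStart = c.timestampMillis() - (c.timestampMillis() % windowMillis);
            counts.computeIfAbsent(windowStart, k -> new TreeMap<>())
                  .merge(c.productId(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Click> raw = List.of(
            new Click(" p1 ", 1_000L),   // correctable: stray whitespace
            new Click("p1", 59_000L),
            new Click("p2", 61_000L),
            new Click(null, 62_000L));   // uncorrectable: no product ID
        Validated v = validate(raw);
        System.out.println(countViews(v.valid(), 60_000L));
        System.out.println(v.deadLetterPercent());
    }
}
```

In a streaming pipeline, the in-memory lists and maps would be replaced by the pipeline's own windowing, output, and metric mechanisms. The sketch only shows the logic each requirement implies.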

Data

The application processes the following types of data:

  • Clickstream data being sent by Newkick's web interface.
  • Transaction data being sent by on-premises or software-as-a-service (SaaS) systems.
  • Inventory data being sent by on-premises or SaaS systems.

Task patterns

Task patterns show recommended ways to accomplish Java programming tasks that are commonly needed when building this type of application.

The application contains the following task patterns: