Built with BigQuery: Aible's serverless journey to challenge the cost vs. performance paradigm

February 15, 2023
Dr. Ali Arsanjani

Director, AI/ML Partner Engineering, Head of AI Center of Excellence, Google Cloud

Arijit Sengupta

Founder and CEO, Aible

Aible is the leader in generating business impact from AI in less than 30 days by helping teams go from raw data to business value with solutions for customer acquisition, churn prevention, demand prediction, preventative maintenance, and more. These solutions help IT and data teams identify valuable data through automated data validation, enable collaborative open-world exploration of data, and deliver AI recommendations in enterprise applications, so that teams can achieve business goals while accounting for unique business circumstances such as marketing budgets and changing market conditions.

For example, if a sales optimization model would require a 10% increase in sales resources for the optimal revenue or profit outcome, the user can specify whether such a resource shift is possible, and Aible chooses the best predictive model and threshold level from the thousands of model-hyperparameter combinations it has auto-trained to satisfy the business needs. Aible thus combines business optimization and machine learning: it saves off the hyperparameter-model search space and then searches it for the optimal model settings given the user's business goals and constraints.

https://storage.googleapis.com/gweb-cloudblog-publish/images/1_Aible.max-700x700.jpg

As economic conditions change, many companies shift their data warehouse use cases away from standard subscription models that procure static, fixed-size infrastructure configurations regardless of actual utilization or demand. However, the paradigm breaks down for most organizations the moment they want to analyze or build predictive models on the data in the data warehouse. All of a sudden, data scientists start bringing up server clusters that they keep running for the six to nine months of the analytics or data science project, because most data science and analytics platforms today are not serverless and accrue expenses while they are “always on.”

Aible’s Value Proposition (Ease of use, automation and faster ROI) powered by BigQuery's serverless architecture 

Serverless architectures overcome unnecessary server uptime and allow for significant cost efficiencies. Instead of needing to keep servers running for the duration of analytics and data science & machine learning projects, serverless approaches let the users interact with the system in a highly responsive manner using metadata and browsers while ramping up compute resources for short lengths of time - when absolutely necessary. A serverless, fully managed enterprise data warehouse like BigQuery can save state until the next invocation or access is required and also provides beneficial security and scalability characteristics. 

Aible leverages Google Cloud to bring serverless architecture and a unique augmented approach to most analytics and data science use cases across user types while realizing significant cost efficiencies. Aible realized a simple fact - in the time a human can ask a handful of questions, an AI can ask millions of questions and save off the answers as metadata. Then, if you had a truly serverless end-to-end system, users could get their questions answered without hitting the server with the raw data again. 

For example, one user may create a dashboard focused on sales channels, while another user may analyze geographical patterns of sales, and a third user might benchmark different salespeople's performance; but all of these could be done based on manipulating the metadata. Aible’s end-to-end serverless user interface runs directly in the user’s browser and accesses saved off metadata in the customer’s cloud account.
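The idea of answering many different questions from saved metadata can be sketched in a few lines of Python. This is only an illustration of the general pattern, not Aible's implementation: the sample rows, the `precompute_metadata` and `answer` helpers, and the choice of summed `amount` as the saved statistic are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical raw sales rows; in practice these would stay in the data warehouse.
RAW_ROWS = [
    {"channel": "web", "region": "east", "rep": "ana", "amount": 120.0},
    {"channel": "web", "region": "west", "rep": "bo", "amount": 80.0},
    {"channel": "retail", "region": "east", "rep": "ana", "amount": 200.0},
    {"channel": "retail", "region": "west", "rep": "cy", "amount": 50.0},
]
DIMENSIONS = ("channel", "region", "rep")


def precompute_metadata(rows, dimensions):
    """Scan the raw rows up front and total `amount` for every combination of dimensions."""
    metadata = {}
    for size in range(1, len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row["amount"]
            metadata[dims] = dict(totals)
    return metadata


def answer(metadata, dims, key):
    """Answer a question from the saved metadata only; the raw rows are not read again."""
    return metadata[tuple(dims)].get(tuple(key), 0.0)


# The expensive step happens once; the result could be saved as a small metadata file.
METADATA = precompute_metadata(RAW_ROWS, DIMENSIONS)

# Three users, three different questions, zero additional scans of the raw data.
web_sales = answer(METADATA, ("channel",), ("web",))
east_sales = answer(METADATA, ("region",), ("east",))
ana_sales = answer(METADATA, ("rep",), ("ana",))
```

In this sketch the dimension tuple passed to `answer` must follow the order of `DIMENSIONS`, because that is how the keys were generated.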

https://storage.googleapis.com/gweb-cloudblog-publish/images/2_Aible.max-2000x2000.jpg

The big question was whether the cost was indeed lower if the AI asked a million questions all at once. In January 2023, Google and Aible worked with a joint Fortune 500 customer to test out this architecture. The test was run using Aible on BigQuery without any special optimizations. The customer had sole discretion over what datasets they used. The results were outstanding. Over two weeks, more than 75 datasets of various sizes were evaluated. The total number of rows exceeded 100 million, and the total number of questions answered and then saved off was over 150 million. The total cost across all that evaluation was just $80.

https://storage.googleapis.com/gweb-cloudblog-publish/images/3_Aible.max-1900x1900.jpg

At this customer, traditional analytics and data science projects typically take about four months to complete. Based on their typical completion time, they estimated that it would have cost more than $200,000 in server and associated costs to conduct these 75 projects. As shown in the table above, the AI-first end-to-end serverless approach was more than 1,000 times more efficient than traditional servers.
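The ratio behind that claim can be checked with a short calculation. The sketch below uses only the figures quoted in this post; the variable names are illustrative, and because the $200,000 figure is the customer's lower-bound estimate, the result is itself a lower bound.

```python
# Figures quoted in the post.
traditional_cost_usd = 200_000   # customer's estimate ("more than $200,000")
serverless_cost_usd = 80         # total cost of the two-week test
datasets_evaluated = 75          # "more than 75 datasets"

cost_ratio = traditional_cost_usd / serverless_cost_usd
cost_per_dataset_usd = serverless_cost_usd / datasets_evaluated

print(f"Cost ratio: {cost_ratio:,.0f}x")                          # Cost ratio: 2,500x
print(f"Serverless cost per dataset: ${cost_per_dataset_usd:.2f}")  # $1.07
```

A ratio of roughly 2,500 is consistent with the "more than 1,000 times" statement above.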

The following diagram shows exactly why the combined Aible and Google AI-first end-to-end serverless environment was so efficient. Because Aible could run the actual queries serverless on BigQuery, it was able to analyze data of any size in a truly end-to-end serverless environment. Aible supports AWS and Azure as well. The architecture works exactly the same way there, using Lambdas and Function Apps, for small and medium-sized datasets. For larger datasets on AWS and Azure, however, Aible today brings up Spark, and at that point the efficiency of the system drops significantly compared to the end-to-end serverless capabilities offered on Google Cloud.

As shown in the example below, a typical data analysis project may run for six months, requiring 4,320 hours of server time, whereas Aible may actively conduct ‘analysis’ activities for just six hours during the entire project. That translates to a 720-fold reduction in server time. In addition, Aible’s serverless analysis is three times more cost-effective than the same analysis on comparable servers, according to this benchmark by Intel and Aible.
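The server-time figures follow from simple arithmetic. The sketch below assumes 30-day months and a server that runs around the clock, which is the combination that reproduces the 4,320 hours quoted above; the function name is illustrative.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: 30-day months reproduce the quoted 4,320 hours


def always_on_server_hours(project_months: int) -> int:
    """Hours billed when a server stays up for the whole project."""
    return project_months * DAYS_PER_MONTH * HOURS_PER_DAY


always_on_hours = always_on_server_hours(6)   # six-month project
active_analysis_hours = 6                     # hours of actual analysis in the example

server_time_reduction = always_on_hours / active_analysis_hours
print(always_on_hours, server_time_reduction)  # 4320 720.0
```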

https://storage.googleapis.com/gweb-cloudblog-publish/images/4_Aible.max-2000x2000.jpg

When Aible needs to evaluate, transform, or analyze data, or to create predictive models, it pushes the relevant queries to the customer-owned BigQuery datasets or BigQuery ML models, as appropriate. It then saves the relevant metadata (including analysis results and models) in the customer’s own private Google Cloud project, in Cloud Storage or BigQuery as appropriate. Whenever a user interacts with the analysis results or models, all of the work is done in their browser, and the metadata is securely accessed as necessary. Aible never gets access to the customer’s data, which remains securely in the customer’s own private Google Cloud project.

Aible built on Google Cloud Platform services

1. Aible Sense

Aible Sense starts you on the data journey and helps you go from overwhelming data to valuable data. With no upfront effort, Aible Sense completely automates the data engineering and data science tasks to ensure a dataset is of sufficient quality (running tests like outlier detection, inclusion probabilities, SHAP values, etc.) to generate statistically valid insights, high-impact predictive models, and high-value data warehouses.

https://storage.googleapis.com/gweb-cloudblog-publish/images/5_Aible.max-2000x2000.jpg

The image below depicts the Aible Sense architecture deployed on Google Cloud. Aible pushes the analysis workload to BigQuery, BigQuery ML, and Vertex AI as appropriate to perform the feature creation and tests described above:

https://storage.googleapis.com/gweb-cloudblog-publish/images/8_Aible.max-1600x1600.jpg

2. Aible Explore

Aible Explore enables your team to brainstorm with their data. Open-world exploration reveals new paths for discovery and helps to identify patterns and relationships among variables. With guided data exploration and augmented analytics, Aible Explore helps business users visually understand business drivers, uncover root causes, and identify contextual insights in minutes. Aible exports dynamic Looker dashboards with a single click: it generates the necessary LookML, the language used to build the semantic model, and points it at the underlying data in BigQuery. Because no further user intervention is needed, this drastically reduces the cycle time for deploying Looker dashboards on BigQuery data.

https://storage.googleapis.com/gweb-cloudblog-publish/images/7_Aible.max-2000x2000.jpg

The image below depicts the Aible Explore architecture deployed on Google Cloud. Because BigQuery scales exceptionally well for large and complex data, pushing the queries to BigQuery finally enabled Aible to analyze data of any size without resorting to bringing up Spark clusters:

https://storage.googleapis.com/gweb-cloudblog-publish/images/10_Aible.max-1600x1600.jpg

3. Aible Optimize

Aible Optimize considers the benefit of correct predictions and the cost of incorrect predictions that are unique to your business, along with business constraints, such as marketing budget limits, that may prevent you from acting upon every AI recommendation. It then shows you exactly how the AI recommendations would impact your business given such business realities. The optimal predictive model is automatically deployed as a serverless (Cloud Run) RESTful endpoint that can be consumed from enterprise applications or systems such as Looker and Salesforce.
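To make the trade-off concrete, here is a minimal sketch of choosing a decision threshold by net business value under a capacity constraint. It is not Aible's algorithm: the `Economics` fields, the helper functions, the sample scores, and the dollar values are all hypothetical, and the search covers a single model's thresholds rather than thousands of model-hyperparameter combinations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Economics:
    benefit_true_positive: float  # value gained by acting on a correct prediction
    cost_false_positive: float    # value lost by acting on an incorrect prediction
    max_actions: int              # business constraint, e.g. an outreach budget


def net_benefit(scores, labels, threshold, econ):
    """Net value of acting on every case scoring >= threshold, or None if over budget."""
    acted = [label for score, label in zip(scores, labels) if score >= threshold]
    if len(acted) > econ.max_actions:
        return None
    true_positives = sum(acted)
    false_positives = len(acted) - true_positives
    return (true_positives * econ.benefit_true_positive
            - false_positives * econ.cost_false_positive)


def best_threshold(scores, labels, econ):
    """Return (threshold, net value) with the highest feasible net value, or None."""
    best = None
    for threshold in sorted(set(scores)):
        value = net_benefit(scores, labels, threshold, econ)
        if value is not None and (best is None or value > best[1]):
            best = (threshold, value)
    return best


# Hypothetical model scores and true outcomes (1 = the customer would convert).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

tight_budget = Economics(benefit_true_positive=100.0, cost_false_positive=30.0, max_actions=3)
loose_budget = Economics(benefit_true_positive=100.0, cost_false_positive=30.0, max_actions=6)

choice_tight = best_threshold(scores, labels, tight_budget)
choice_loose = best_threshold(scores, labels, loose_budget)
```

With these made-up numbers, relaxing the budget moves the best threshold from 0.8 to 0.6, which illustrates why the same model can warrant different settings under different business constraints.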

https://storage.googleapis.com/gweb-cloudblog-publish/images/9_Aible.max-2000x2000.jpg

The image below depicts the Aible Optimize architecture deployed on Google Cloud. Because BigQuery ML and Vertex AI scale exceptionally well for large and complex datasets, building on them finally enabled Aible to train predictive models on data of any size without resorting to bringing up Spark clusters, while adding extra levels of resilience beyond those provided by the Spark framework.

https://storage.googleapis.com/gweb-cloudblog-publish/images/6_Aible.max-1800x1800.jpg

The proof is in the pudding - Overstock’s customer journey: 

Overstock.com used Aible to cut the time needed for data-quality evaluation from weeks to minutes per dataset. The entire Aible project took just five days, from installation and integration with Overstock's BigQuery through executive review and acceptance of the results.

Joel Weight, Overstock.com's CTO, wrote, "We extensively use Google BigQuery. Aible's seamless integration with BigQuery allowed us to analyze datasets with a single click, and in a matter of minutes automatically get to a dynamic dashboard showing us the key insights we need to see. This would have taken weeks of work using our current best practices. When we can analyze data in minutes, we can get fresh insights instantly as market conditions and customer behavior changes."

Joel's comment points to a far more valuable reason to use Aible, beyond the massive reduction in analysis cost. In rapidly changing markets, the most actionable patterns will be the 'unknown unknowns.' Of course, dashboards can be quickly refreshed with new data, but they still ask the same questions of the data as they always have. What about new insights hidden in the data? The questions we have yet to think to ask? Traditional manual analysis would take weeks or months to detect such insights, and even then analysts cannot ask every possible question. Aible on BigQuery can ask millions of questions and present the key insights ordered by how much they affect business KPIs such as revenue and costs. And it can do so in minutes. This completely changes the art of the possible: who can conduct analysis, and how quickly that analysis can generate results.

Aible natively leverages Google BigQuery, part of Google’s data cloud, to parallelize these data evaluations, data transformations, explorations, and model training across virtually unlimited resources. Aible seamlessly analyzes data from various sources by securely replicating the data in the customer's own BigQuery dataset. Aible also seamlessly generates native Looker dashboards on top of data staged in BigQuery (including data from other sources that Aible automatically stages in BigQuery), automatically taking care of all necessary steps, including custom LookML generation.

Conclusion

Google’s data cloud provides a complete platform for building data-driven applications from simplified data ingestion, processing, and storage to powerful analytics, AI, ML, and data sharing capabilities — all integrated with the open, secure, and sustainable Google Cloud platform. With a diverse partner ecosystem, open-source tools, and APIs, Google Cloud can provide technology companies the portability and differentiators they need.

To learn more about Aible on Google Cloud, visit Aible.com

Click here to learn more about Google Cloud’s Built with BigQuery initiative. 


We thank the Google Cloud team member who contributed to the blog: Christian Williams, Principal Architect, Cloud Partner Engineering
