doc.ai: A mobile AI app blazes trails in medical research

About doc.ai

doc.ai is an AI platform that enables ML computations on real-world data to develop personal health insights and predictive models.

Industries: Healthcare, Life Sciences
Location: United States

Tell us your challenge. We're here to help.

Contact us

In less than a year, over 35,000 active consumers have interacted with the doc.ai mobile app to monitor their health or participate in research trials.

Google Cloud results

  • Runs on-device AI while training ML models in the cloud
  • Trains doctors and researchers in deep ML technologies using a cloud-hosted platform
  • Provides new channels for personal health monitoring and medical research using Google Cloud, including Google Kubernetes Engine
  • Is implementing a blockchain solution for data privacy and research trial transparency using Google Cloud

Implemented an AI training product on Google Cloud in 2 days

doc.ai has built an AI platform that enables machine learning (ML) computations on real-world and multiomics data to develop personal health insights and predictive models.

The mobile app has been available in the iOS App Store for less than a year and is gaining about 5,000 new active users per month, who use it to obtain a more precise picture of their health. Consumers can also help accelerate research by participating in data trials.

With user permission, doc.ai uses the mobile camera to take a "medical selfie," applying doc.ai's proprietary technology to infer age, height, weight, and other phenome data. The app also enables users to pull data from their medical records and lab tests, such as genetic, blood, and microbiome data, along with environmental, physical activity, and prescription data. It then uses deep learning models to perform a range of health-related predictions.

Consumers are also invited to participate in data trials by research sponsors, such as patient networks, academic researchers, pharmaceutical companies, and healthcare providers. "It's a new way of conducting research that uses AI to accelerate medical research with more and better engagement from participants," says Sophia Arakelyan, Communications Director at doc.ai.

Current trials are for those affected by allergies, Crohn's disease and colitis, and epilepsy. The allergy trial is being conducted in collaboration with Professors Chirag Patel and Arjun Manrai from Harvard Medical School with the goal of building a predictive model. The data trial for Crohn's disease and colitis has been initiated by the patient network Crohnology.com to indicate optimal supplements for this condition. This data trial is being conducted with the support of Ubiome, which provides free gut microbiome testing kits to participants. In addition, doc.ai has licensed its AI technology to Anthem, one of the largest health insurers in the United States.

"You can move much faster building products on Google Cloud than on AWS."

Neeraj Kashyap, Head of AI, doc.ai

By engaging individuals directly in AI-powered research and making that research available to medical professionals, doc.ai is already seeing higher retention, higher engagement, and faster progress in data trials.

Secure hosting and ML tuning in the cloud

doc.ai hosts its solution on Google Cloud and uses Google Kubernetes Engine (GKE) to deliver services to mobile app users and research communities that analyze the data.

The mobile doc.ai app acts as a portal that integrates a diverse set of ML services using TensorIO. These services run on the device and interact with personal, environmental, and health-related datasets that the app collects, encrypts, and dispatches to the cloud.

The backend for these ML services is hosted on a GKE cluster. As services are added and refined, and as new participants create profiles, the company uses Kubernetes to deploy mobile app builds. "Google Kubernetes Engine is our most heavily used product," says Neeraj Kashyap, Head of AI at doc.ai.

The doc.ai training environment uses Google Compute Engine to tune its ML models. The models are C++ binaries that process TensorFlow bundles — batches of multidimensional arrays — which store data extracted from the mobile doc.ai app. The solution relies on Google Kubernetes Engine clusters and Cloud GPUs to generate inferences. Research clients access inference results via an API implemented in Python.
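As a rough illustration of this pipeline (the model binaries and bundle format are proprietary, so the function names and logic below are hypothetical, not doc.ai's actual code), a batch of multidimensional arrays can be grouped into fixed-size bundles and run through a model to produce one inference per input array:

```python
# Hypothetical sketch of a bundle-based batch inference flow: group
# feature arrays into fixed-size "bundles", run a model over each
# bundle, and collect results for an API layer to serve.

def make_bundles(arrays, bundle_size):
    """Split a list of multidimensional arrays into fixed-size batches."""
    return [arrays[i:i + bundle_size] for i in range(0, len(arrays), bundle_size)]

def toy_model(array):
    """Stand-in for the real C++ model binary: here, just sum the array."""
    return sum(sum(row) for row in array)

def run_inference(arrays, bundle_size=2):
    """Process every bundle and return one inference per input array."""
    results = []
    for bundle in make_bundles(arrays, bundle_size):
        results.extend(toy_model(a) for a in bundle)
    return results

# Example: three 2x2 "arrays" standing in for data from the mobile app.
data = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[0, 1], [1, 0]]]
print(run_inference(data))  # [10, 26, 2]
```

In a real deployment the bundles would carry encrypted app data and the inference results would be exposed through the Python API mentioned above.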

Engineers at doc.ai rely exclusively on Compute Engine GPUs to tune the company's supervised ML models. "As a small company, we don't have the budget to just buy 100 GPUs and keep them on-premises," says Neeraj. "When we're figuring out which hyperparameters we want to train our models with, we use Google Cloud because it allows us to scale to an unlimited extent, whether we need 100 or 500 GPUs, which is great."

doc.ai exposes its ML models via APIs to licensed partners who then conduct research trials.

Provisioning a Google Cloud environment for doc.ai experiments

The snippet shows how doc.ai provisions a Google Cloud environment for doc.ai experiments. It uses Compute Engine startup scripts and a metadata server to set up and orchestrate hyperparameter tuning jobs that train ML models. See the full script.
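As an illustrative sketch only (the flag names and trainer command below are hypothetical, not taken from doc.ai's script), orchestration of this kind typically expands a hyperparameter grid into one training command per Compute Engine instance, which a startup script then reads from instance metadata and executes:

```python
# Hypothetical sketch: expand a hyperparameter grid into one training
# command per Compute Engine instance. Each command would be passed in
# via instance metadata and run by a startup script.
from itertools import product

grid = {
    "learning_rate": [0.001, 0.01],
    "batch_size": [32, 64],
}

def tuning_commands(grid):
    """Return one trainer invocation per point in the parameter grid."""
    keys = sorted(grid)
    commands = []
    for values in product(*(grid[k] for k in keys)):
        flags = " ".join(f"--{k}={v}" for k, v in zip(keys, values))
        commands.append(f"python train.py {flags}")  # hypothetical trainer
    return commands

for cmd in tuning_commands(grid):
    print(cmd)
```

Each generated command maps to one tuning job, which is what lets the cluster fan out to as many GPUs as the grid requires.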

AI training to accelerate research

To enable research of its AI-processed data, the company has taken on the role of directly training medical researchers and data scientists. "We created these prediction models to benefit people, but at the same time we're targeting doctors to help them better understand AI as well," says Neeraj.

doc.ai acquired Crestle.ai to help spread the word. Crestle is a GPU-enabled ML education platform. It provides researchers with a Python Jupyter notebook, pre-loaded datasets, cloud storage, and a managed Anaconda environment to support versioning and packaging.

"We found that the Google Cloud developer experience just beats the AWS and Azure developer experience hands down."

Neeraj Kashyap, Head of AI, doc.ai

Among the courses that doc.ai offers on the Crestle platform is fast.ai, an instructional set of code libraries designed to simplify writing ML tasks. When doc.ai acquired Crestle, the fast.ai product was implemented on AWS. It is now powered by Google Cloud.

"We had a very tight deadline, and after a week of working on the AWS version, we decided that it wasn't going to meet our requirements," recalls Neeraj. The firm re-implemented the fast.ai offering from scratch on Google Cloud in two days. "You can move much faster building products on Google Cloud than on AWS," says Neeraj.

Open source blockchain on Kubernetes

The hurdles doc.ai faced in developing such a diverse set of computational models were made more difficult by the highly personal nature of the data. HIPAA (Health Insurance Portability and Accountability Act of 1996) and other privacy protections restrict the availability of personal health and other medical data.

By design, all personal data is decentralized on each participant's mobile device so that only individuals can own their data. The data that they elect to share is encrypted and then stored in Cloud SQL where it can be dispatched for Google Cloud AI processing and for availability to researchers.

The firm is now looking to blockchain to support data privacy and to provide transparency to support the integrity of doc.ai research trials. The blockchain will be implemented as an Ethereum private net running on GKE.

"A batch of experiments that would take us two weeks on our infrastructure takes us a day on the Google Cloud infrastructure."

Neeraj Kashyap, Head of AI, doc.ai

"Part of running a research trial, especially an IRB-approved one (conforming to FDA guidelines), is making privacy guarantees to participants and being able to prove it," says Neeraj. "To meet these requirements, we have released two Ethereum smart contracts into open source."

The Stem contract is a modified ERC-20 token that doc.ai uses to track participant engagement in a data trial. The Stimulus contract provides transparency by tracking principal investigator approvals of participants for the data trial and validating that data submitted by participants is not fraudulent.
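The contracts themselves are Solidity code on Ethereum; as a hypothetical Python model only (the class and method names below are invented and mirror just the bookkeeping idea, not the on-chain implementation), the Stem contract's engagement tracking resembles an ERC-20-style balance ledger:

```python
# Hypothetical Python model of the engagement-tracking idea behind the
# Stem contract. The real contract is an ERC-20-style token on Ethereum;
# this sketch mirrors only the ledger logic, not the on-chain code.

class EngagementLedger:
    def __init__(self):
        self.balances = {}

    def record_engagement(self, participant, points):
        """Credit a participant with tokens for a data-trial action."""
        self.balances[participant] = self.balances.get(participant, 0) + points

    def balance_of(self, participant):
        """ERC-20-style balance query for a participant's address."""
        return self.balances.get(participant, 0)

ledger = EngagementLedger()
ledger.record_engagement("participant-1", 10)  # e.g. completed a survey
ledger.record_engagement("participant-1", 5)   # e.g. uploaded lab data
print(ledger.balance_of("participant-1"))  # 15
```

Because every credit is a token transfer on the chain, a sponsor or auditor can verify the engagement history without trusting any single party's database.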

"We intend to run these on a private Ethereum network, hosted on Google Cloud, and offer customers and/or auditors access to this network," says Neeraj.

A better experience accelerates development

For Neeraj, Google Cloud support for AI is better because it is designed around developers' needs. "In theory, what we've implemented here could be implemented on any cloud provider," says Neeraj. "We found that the Google Cloud developer experience just beats the AWS and Azure developer experience hands down."

"It means that we can move much faster," says Neeraj. "A batch of experiments that would take us two weeks on our infrastructure takes us a day on the Google Cloud infrastructure. That's the nicest thing that we get on the AI side from Google Cloud. It's a great experience."

Contributors to this story

Neeraj Kashyap: doc.ai Head of AI. Neeraj was a Developer Programs Engineer at Google, a Product Manager at Human API, and a Co-founder of Adaptix.

Sophia Arakelyan: doc.ai Communications Director. Sophia is also the founder of buZZrobot. She was an OpenAI scholar, and a journalist for BusinessWeek.
