Kakao Healthcare: Empowering medical research and development with hospital data collaboration

Google Cloud Results
  • Predicts recurrence in breast cancer patients in four months instead of two years

  • Enables collaboration of medical data from 16 universities for research while keeping personal data secure

  • Systematizes data across hospitals to extract the value of collective data

Kakao Healthcare's data platform uses Federated Learning to enable secure analysis of medical data from multiple hospitals, improving data usability and prediction accuracy.

Data in healthcare is very valuable, especially when it pertains to patient health. While South Korea is known for its advanced digitalization of medical information such as electronic medical records, the sensitive nature of medical and healthcare data makes it a challenge to use that data effectively to create new value, mainly because of obstacles to sharing and analyzing it.

“While working as a CIO at a university hospital, I was constantly thinking about what the large amount of data generated in the medical field would mean from the hospital’s perspective. The constantly accumulating data should be appropriately used to improve patient treatments, clinical trials, academic research, and further collaboration with external organizations,” says Hee Hwang, CEO of Kakao Healthcare.

With this in mind, Kakao Healthcare built a medical data platform on Google Cloud and introduced it to major university hospitals and large hospitals in Korea. Through its platform, medical data can be combined across hospitals for analysis and learning, while sensitive data such as personal and corporate information is stored and managed within each hospital's own environment.

The platform uses Federated Learning with Google Cloud. This distributed machine learning (ML) technique lets participants share their ML training results with one another so that models can learn at a larger scale, while sensitive data, such as personal and corporate information, stays stored and managed in each participant's own physical environment.

Transforming the medical field with data

Federated Learning is a concept introduced by Google in 2017. In general, training a model, which is the core of ML, requires collecting all the data in one place, along with computing resources that can train on it. Because ML produces more sophisticated results as it learns from more data, even scattered data sets that serve similar purposes are usually gathered and trained on together. However, in many cases organizations do not want their data sets released to the outside world, for reasons such as security or asset value.

The same goes for medical data. In this case, rather than collecting the data in one place, the artificial intelligence (AI) model is downloaded to and trained in each computing environment that physically holds the data, and only the training results are collected again, enabling large-scale learning without any data flowing out.
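The flow described above can be illustrated with a minimal federated averaging sketch. This is not Kakao Healthcare's implementation: the one-variable linear model, the function names `local_update` and `federated_round`, the learning rate, and the synthetic data are all assumptions chosen for illustration. Each simulated site trains on its own data and returns only model weights, which a coordinator averages, weighted by sample count.

```python
import random

def local_update(weights, data, lr=0.1, epochs=20):
    """Train a copy of the global model on one site's private data.

    weights: (w, b) for the model y = w * x + b
    data: list of (x, y) pairs; only the updated weights are returned
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Full-batch gradient of the mean squared error
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), n

def federated_round(global_weights, sites):
    """One round: every site trains locally, then weights are averaged."""
    updates = [local_update(global_weights, data) for data in sites]
    total = sum(n for _, n in updates)
    w = sum(wt[0] * n for wt, n in updates) / total
    b = sum(wt[1] * n for wt, n in updates) / total
    return (w, b)

def make_site(rng, n):
    """Synthetic stand-in for one site's data: y = 2x + 1 plus noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.05)) for x in xs]

rng = random.Random(0)
sites = [make_site(rng, n) for n in (50, 80, 120)]  # three hypothetical sites
weights = (0.0, 0.0)
for _ in range(30):
    weights = federated_round(weights, sites)
print(weights)  # should land close to the true (2, 1)
```

Only `(w, b)` and the sample count cross the site boundary in this sketch; the raw `(x, y)` pairs are never passed to the coordinator.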

“If the data of each hospital is well organized, we can obtain a huge amount of clinical data results by collecting and analyzing it in one place. Kakao Healthcare deployed a joint learning platform that can analyze this safely, easily, and systematically, building a large-scale data community involving 16 domestic hospitals participating as of July 2024,” says Hwang.

Collaborative learning protects hospital data and achieves large-scale data outcomes

By using Google Kubernetes Engine (GKE) to bring data from multiple participants under one ML model, Google Cloud created an environment in which each hospital can create and operate a large learning federation based on its individual data sets. On this foundation, Kakao Healthcare developed a healthcare data platform and Federated Learning environment that processes medical data into appropriate data sets and enables joint learning to take place securely and smoothly, and it is supplying those data sets to large hospitals in Korea.

However, there were some obstacles, the first being data consistency. Because each hospital manages data in its own way, formats, storage, and analysis methods differ, and not all data is optimized for storage and analysis.

Although some data, such as body temperature, blood pressure measurements, or blood test results, should be universally understood, MRI image interpretation, biopsy interpretation, and doctors' medical records are all managed differently at each hospital. Even if the data is standardized, the codes used for a given disease may differ from hospital to hospital.
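To make the code mismatch concrete, here is a small sketch of mapping hospital-specific diagnosis codes onto one shared vocabulary. Everything in it is hypothetical: the hospital names, the local codes, the shared labels, and the `standardize` helper are placeholders, not the mappings or code systems used on the actual platform.

```python
# Hypothetical per-hospital code tables; real mappings would be
# maintained by each hospital and are not described in this article.
LOCAL_TO_SHARED = {
    "hospital_a": {"BC01": "BREAST_CANCER", "HT": "HYPERTENSION"},
    "hospital_b": {"D-1107": "BREAST_CANCER", "D-0042": "HYPERTENSION"},
}

def standardize(hospital, records):
    """Replace local diagnosis codes with shared ones.

    Returns (mapped, unmapped). Records with unknown codes are set
    aside for review instead of being guessed at.
    """
    mapping = LOCAL_TO_SHARED[hospital]
    mapped, unmapped = [], []
    for rec in records:
        code = rec["diagnosis_code"]
        if code in mapping:
            # Copy the record so the hospital's original data is untouched
            mapped.append({**rec, "diagnosis_code": mapping[code]})
        else:
            unmapped.append(rec)
    return mapped, unmapped

a_mapped, a_unmapped = standardize("hospital_a", [
    {"patient": "p1", "diagnosis_code": "BC01"},
    {"patient": "p2", "diagnosis_code": "XX"},  # not in the table
])
b_mapped, _ = standardize("hospital_b", [
    {"patient": "p9", "diagnosis_code": "D-1107"},
])
print(a_mapped, a_unmapped, b_mapped)
```

After mapping, records from both hypothetical hospitals carry the same label for the same disease, which is the precondition for training one model across sites.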

To solve this issue, Hwang says, “Kakao Healthcare built a data platform where data is standardized in a consistent manner without affecting the existing treatment and work environment of each hospital. Crucially, this data cannot be accessed by Kakao Healthcare or other hospitals, and each hospital can use it in various ways as needed. This not only serves the goal of joint learning, but also serves as a foundation for digital transformation that systematizes the data of individual hospitals and extracts value from each data.”

However, from the hospitals’ perspective, there are still concerns that data for federated learning could be used by other institutions once it is registered in the cloud. This is where Google's security considerations for Federated Learning come in.

“The data required for analysis is stored in Kakao Healthcare’s data platform built on Google Cloud. Data is not uploaded indiscriminately, but is directly controlled in each individual's area, and no one else has access to its contents. And data is strictly managed only within each hospital's dedicated cloud environment. In the Federated Learning process, an ML model for research analysis that has been studied in advance is deployed, and the model analyzes data from each hospital and transmits only the results,” says Hwang.

Hospitals also have the right to decide whether they want to participate in a project. When a project is proposed, hospitals are recruited to participate, and only the data from participating hospitals is subject to analysis. Likewise, only participating hospitals can use the results generated.
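The opt-in rule can be sketched as a small access check. This is an illustrative assumption about how such a rule might be modeled, not the platform's actual access control: the `Project` class, its method names, and the hospital identifiers are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Project:
    """A proposed joint learning project that hospitals may opt into."""
    name: str
    participants: set = field(default_factory=set)
    results: dict = field(default_factory=dict)

    def join(self, hospital: str) -> None:
        # Participation is an explicit decision by each hospital
        self.participants.add(hospital)

    def training_sites(self, all_hospitals):
        # Only hospitals that opted in contribute data to the analysis
        return [h for h in all_hospitals if h in self.participants]

    def read_results(self, hospital: str) -> dict:
        # Only participating hospitals may use the results
        if hospital not in self.participants:
            raise PermissionError(
                f"{hospital} did not participate in {self.name}"
            )
        return self.results

project = Project("example-study")
project.join("hospital_a")
project.join("hospital_c")
sites = project.training_sites(["hospital_a", "hospital_b", "hospital_c"])
project.results = {"status": "complete"}
print(sites)
```

In this sketch `hospital_b` is neither included in training nor able to read the results, mirroring both halves of the rule.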

One reason Federated Learning can run securely and seamlessly, with strict access controls, is that all processes are managed within the GKE environment, from data collection and analysis to extracting specific results.

GKE also provides additional features to facilitate Federated Learning, including:

  • Hosting Federated Learning coordinators
  • Hosting Federated Learning participants
  • Providing a secure and scalable communication channel
  • Enabling lifecycle maintenance of federated deployments
  • Safely storing the Federated Learning model built from the data of participants in the federation

The Google Cloud Federated Learning Reference Architecture also lists security controls applicable to the use of GKE as part of the federated learning framework. These controls add a further layer of protection on top of GKE, which is secure by default.

As data can be loaded, safely managed, and conveniently utilized, hospitals can achieve the data-centered treatment, research, and operations they have long anticipated. The number of hospitals Kakao Healthcare works with is also rapidly increasing. By the end of 2024, 20 hospitals plan to participate in Kakao Healthcare's data platform for joint learning, with the data from those hospitals covering approximately 15,000 beds and 20 million people.

Confirming the value of Federated Learning and data

Kakao Healthcare and the group of hospitals it collaborates with conducted two joint learning projects. The first was to conduct a test to see whether it was possible to predict recurrence in breast cancer patients. A large-scale analysis was conducted on a total of 25,000 people, including about 15,000 people from four hospitals for the study and about 10,000 people from another hospital for external validation.

“Usually, this scale of data analysis requires extensive data pre-processing and compliance procedures to prepare. With the possibility of labor intensive complications during the actual analysis stage, the task could take more than two years. However, Kakao Healthcare's joint learning only took four months from preparation to the time the results were released. At the same time, it produced much better prediction results since it was not limited to the small amount of data from individual hospitals,” says Hwang.

He explains that Federated Learning has helped Kakao Healthcare achieve surprising results, even at the testing stage. As the amount of data increases, the accuracy and reliability of the data improve, and as a result, more meaningful results are being produced compared with previous studies that included only data from within individual hospitals.

The prediction performance for each participating hospital previously ranged from 0.6397 to 0.8362, and the Federated Learning result exceeded this range at 0.8482. On external validation data that was not used for training at all, the result was 0.7769, a satisfactory 92% of the Federated Learning result.
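The 92% figure follows directly from the scores quoted above, as this short check shows. The variable names are illustrative, and the article does not name the performance metric, so the sketch treats the values as plain scores.

```python
# Scores quoted in the article (metric not named in the source)
best_single_hospital = 0.8362
worst_single_hospital = 0.6397
federated = 0.8482
external_validation = 0.7769

# External validation score as a share of the federated result
ratio = external_validation / federated

# Margin of the federated model over the best single-hospital score
gain_over_best = federated - best_single_hospital

print(f"external / federated = {ratio:.1%}")
print(f"gain over best single hospital = {gain_over_best:.4f}")
```

The ratio comes out slightly under 0.92 and rounds to the 92% stated in the text.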

[Figure: Federated Learning result bar graph]

According to Professor Sungeun Song of Korea University Anam Hospital, a Kakao Healthcare customer, “Although the development of predictive models using medical data is being carried out at individual hospitals and research charities, there are limitations in that verification and performance evaluation at other institutions are not easy. At the beginning of the research, there were doubts about whether federated learning could create a collaborative model by effectively learning from medical data accumulated in multiple institutions while protecting the distributed data itself. However, through collaborative research with Kakao Healthcare, we have clearly learned that this is feasible, and we have become convinced that this is the path forward for future medical data research. I believe that more research and great results will be created through joint learning, and that Kakao Healthcare will play a leading role in this process.”

Another customer, Professor Dosang Cho from Ewha Womans University Medical Center, says, “Joint learning using the data platform built with Kakao Healthcare surprised us due to its ease of use. Thanks to the vast standardized medical data system that has already been systematically built within the framework of a data alliance created with the trust of multiple organizations, the experience of clutter-free administrative processing, rapid data processing, and an easy and intuitive federated learning process was very impressive. It is clear that without a data platform, not only would it have been difficult to experience, but it would also have been difficult to achieve the desired results in a short period of time. We have no doubt that by expanding the number of participating organizations and continuously accumulating high-quality data, we will be able to use it beyond our imagination and produce excellent results.”

Expanding the scope of research with Kakao Healthcare

Kakao Healthcare and the participating hospitals in this joint learning journey already have plans to expand the scope of research, continue in-depth research with more hospitals, and publish papers on each study based on joint learning. A project to interpret signs of colon cancer is also underway. Although the hospitals are still in the process of forming a union, the data from each hospital has been standardized for analysis, and various studies are being considered.

Kakao Healthcare's data platform is now advantageous not only for joint learning, but also for managing information within hospitals, where digital transformation of treatment systems, ward management, and operations is actively taking place. There are cases where new systems are built to process data for this purpose, and Kakao Healthcare provides about 100 types of data that hospitals can effectively use.

Data is also used for various research activities within the hospital. Recent research is always accompanied by analysis through ML, and the data sets organized by Kakao Healthcare's data platform are more effective for individual research because they were originally designed with ML in mind. Researchers simply have to upload the appropriate model to Vertex AI and start analyzing without worrying about additional computing resources or workflow.

Another area of interest in Federated Learning is drug development. “Pharmaceutical companies need detailed analysis of patients' treatment processes to create better drugs. Additionally, during the drug development process, clinical trials must be repeated over a long period of time to simultaneously verify effectiveness and safety,” says Hwang. “Federated Learning has the potential to support further by not only conducting research on the effectiveness of existing drugs, but also through its capability to extensively verify the results of final-stage clinical trials through participating hospitals.”

The number of subjects in clinical trials is usually limited. However, when enough data is collected from multiple hospitals, it can paint a better picture, and hospitals can contribute to the development of more effective drugs while identifying new drugs on their own.

The combination of data and AI is redefining the medical environment, helping to manage health more sensitively and create a better medical treatment environment. Hwang says that the collaboration with Google Cloud on the medical Federated Learning platform over the past two years was the driving force behind making a data-driven medical environment a reality.

“Google is Kakao Healthcare’s most important partner in driving joint learning. We first proposed technologies for Federated Learning, and close technical support continues to this date. We are not just providing a service, we are actively working together to make Federated Learning a success in healthcare.”

Hwang emphasizes the importance of agreeing on how meaningful work can be created, rather than simply assessing how much cloud service is needed and what GPU resources are available. By successfully building on domestic cases and expanding the scope to the global medical environment, Kakao Healthcare and Google Cloud are working together to create a better medical data environment, now and for years to come.

Kakao Healthcare seeks to solve the inconveniences experienced by users and partners of various healthcare services based on its core value of ‘companion, friend, and secretary for everyone.’ It provides digital healthcare services to contribute to the improvement of public health, targeting the global healthcare market through its mission to “keep people healthy through technology.”

Industry: Healthcare

Location: South Korea

Products: Generative AI on Google Cloud, Vertex AI, Gemini, Google Kubernetes Engine