Recruit: Building a next-generation security data warehouse to democratize log analysis
About Recruit
Under the vision of "Follow Your Heart," Recruit Co., Ltd. develops a wide range of services centered on business solutions in the areas of human resources and sales promotion. It is constantly innovating, with over 17,300 employees as of April 2022.
By building a next-generation security data warehouse with BigQuery, Recruit Co., Ltd. has made it possible to analyze security logs without relying on engineers with specialized knowledge.
Google Cloud results
- Achieves significant cost savings with BigQuery compared to existing solutions
- Significantly lowers the threshold for HR development and recruitment by configuring a technical stack centered on SQL
- Enables low-code ETL processing that facilitates development and maintenance with Cloud Data Fusion
Successfully democratized security log analysis
Security is of utmost importance for Recruit Co., Ltd. (Recruit), which has more than 17,000 employees and offers a wide range of services as a platform. The company's Security Operation Center (SOC) is a department that creates mechanisms and develops tools against risks such as external attacks and internal fraud.
For security departments to take the right measures, they need to accurately grasp what is happening through logs and other facts. Hisashi Hibino, Development Lead, Security Strategy Group, Security Division of Recruit, led the company's latest initiative: building a next-generation security data warehouse (DWH). Hibino, who has been involved in building the back-end framework for the SOC for about three years, talks about the concept behind the next-generation security DWH.
"Many people think of the real-time incident detection tool called Security Information and Event Management (SIEM) when they hear of a security monitoring tool that uses logs. What we developed this time, however, is a framework that effectively uses logs for purposes other than monitoring, such as finding signs of attacks, conducting forensics after incidents have occurred, and reporting," says Hibino. "The goal was to make it possible for people who are not specialized engineers to perform analysis work using logs."
Building a log analysis platform on BigQuery
In the past, such analysis was performed on request: a team member would ask for it, and an SOC engineer with log analysis skills would produce the report. Hibino shares that this method caused issues due to an overreliance on engineers with high technical skills, with other team members unable to obtain real-time analysis results. Some findings were also never addressed. When building a next-generation security DWH to solve these problems, Hibino chose to combine general-purpose data analysis solutions such as BigQuery instead of introducing existing security solutions, citing their many advantages.
"Cost was a concern. Log analysis solutions for security use a licensing system that charges based on the amount of logs captured and the operating time of running instances, which made storage expensive. The starting point of this initiative was the idea that running costs could be greatly reduced if we could use a DWH service with query-based billing. We evaluated both performance and storage costs and chose BigQuery," says Hibino.
He goes on to share that adopting a DWH service like BigQuery has significant benefits other than cost. "With a solution that is specialized in security, we are required to write queries in a specific language in order to search and extract the information from the log data that is necessary for analysis, so the learning cost was the major hurdle in training and securing human resources.
"Our next-generation security DWH aims to configure a technical stack centered on SQL, which many engineers have been familiar with for years, thus lowering the threshold for human resource development and recruitment. This formed the beginning of this project. It's an idea I've had for a long time, and as the use of SQL has greatly expanded from before, I believe we've got it right."
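To make the appeal of an SQL-centered stack concrete, the following is a minimal sketch of the kind of question an analyst could answer with ordinary SQL. It is not Recruit's actual query or schema: the `auth_logs` table, its columns, the sample rows, and the `suspicious_sources` helper are all hypothetical, and an in-memory SQLite database stands in for BigQuery so the example runs anywhere.

```python
import sqlite3

# Hypothetical authentication log rows: (timestamp, user, source IP, result).
ROWS = [
    ("2022-04-01T09:00:00", "alice", "203.0.113.5", "FAILURE"),
    ("2022-04-01T09:00:05", "alice", "203.0.113.5", "FAILURE"),
    ("2022-04-01T09:00:09", "alice", "203.0.113.5", "FAILURE"),
    ("2022-04-01T09:01:00", "bob", "198.51.100.7", "SUCCESS"),
    ("2022-04-01T09:02:00", "carol", "198.51.100.9", "FAILURE"),
]

# Plain SQL: count failed logins per source IP and keep the noisy sources.
QUERY = """
SELECT source_ip, COUNT(*) AS failures
FROM auth_logs
WHERE result = 'FAILURE'
GROUP BY source_ip
HAVING COUNT(*) >= ?
ORDER BY failures DESC
"""


def suspicious_sources(rows, threshold=3):
    """Return (source_ip, failure_count) pairs with at least `threshold` failures."""
    conn = sqlite3.connect(":memory:")  # SQLite is only a stand-in for a DWH
    try:
        conn.execute(
            "CREATE TABLE auth_logs "
            "(ts TEXT, user_name TEXT, source_ip TEXT, result TEXT)"
        )
        conn.executemany("INSERT INTO auth_logs VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY, (threshold,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for ip, failures in suspicious_sources(ROWS):
        print(ip, failures)
```

The point of the sketch is that the analysis itself is a short `GROUP BY` query that any SQL-literate team member could write or adapt, rather than a query in a product-specific search language.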
Lowering the entry barrier for analytics with low-code Cloud Data Fusion
The figure above is a system configuration diagram of the next-generation security DWH, which started operations in October 2021. With BigQuery at the center, Recruit links Cloud Storage, which accumulates logs, and Looker, which visualizes data, to provide the functions necessary for analysis work. Dataproc is used as the pipeline execution environment. By making full use of VPC Service Controls, the company devised ways to securely exchange data with BigQuery and Cloud Storage.
"We had decided on BigQuery and Looker before the project started, so there were no major difficulties in introducing them. However, there was considerable trial and error regarding ETL (Extract / Transform / Load) when accumulating log data in BigQuery. At first, mainly in terms of cost, we assumed that we would use Cloud Composer for Python-based processing," says Hibino.
"With that approach, however, advanced expertise and experience are required for data analysis and maintenance, and it becomes impossible to lower the threshold of analysis work, which was the original purpose. As a solution, we use Cloud Data Fusion, a fully managed data integration service that serves as a low-code ETL tool. On-premises environment logs uploaded to Cloud Storage are processed by Cloud Data Fusion for ETL. Logs from external services such as SaaS and IaaS are also crawled via API and sent to Cloud Data Fusion."
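For readers unfamiliar with ETL, the three stages described above can be sketched in a few lines of Python. This is only an illustration of the concept, not how Cloud Data Fusion pipelines are built (they are configured in a low-code interface) and not Recruit's pipeline: the `key=value` log format, the field names, and the function names are all assumptions made for the example. The output is newline-delimited JSON, a format BigQuery can load.

```python
import json
from datetime import datetime, timezone


def extract(raw_text):
    """Extract: split raw text into non-empty log lines."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


def transform(line):
    """Transform: parse 'key=value' pairs and normalize fields.

    Assumes a hypothetical format such as 'ts=1648803600 user=alice action=LOGIN',
    where ts is a Unix timestamp in seconds.
    """
    fields = dict(pair.split("=", 1) for pair in line.split())
    event_time = datetime.fromtimestamp(int(fields["ts"]), tz=timezone.utc)
    return {
        "event_time": event_time.isoformat(),
        "user_name": fields.get("user", ""),
        "action": fields.get("action", "").lower(),
    }


def load(records):
    """Load: serialize records as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


def run_pipeline(raw_text):
    """Run extract, transform, and load in order and return the NDJSON text."""
    return load(transform(line) for line in extract(raw_text))


if __name__ == "__main__":
    sample = "ts=0 user=alice action=LOGIN\n\nts=60 user=bob action=LOGOUT\n"
    print(run_pipeline(sample))
```

Even in this toy form, each log source needs its own parsing and normalization logic, which hints at why hand-written Python pipelines raise the maintenance bar that a low-code tool is meant to lower.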
Looking back, Hibino recalls that there were some unexpected difficulties when introducing Cloud Data Fusion, and it was necessary to make adjustments while stabilizing the system. "At the time, we were grateful for the support we received from everyone at Google Cloud. Thanks to that, we are now able to perform fairly stable data processing."
Realizing the benefits of a next-generation security DWH
When it comes to the current utilization of the next-generation security DWH, Hibino has this to say: "To encourage the use of the next-generation security DWH, I am participating in regular meetings with on-site departments that use it. In the past, analysis data could not be produced without relying on the SOC. With Looker, however, we can quickly produce analysis data, which has had a major impact overall. New insights can be obtained through such analysis and quickly put into practice on the spot. I really feel that the fact that we can shape the system is creating new value in the field. It has only been a year since the start of operations, but the response already exceeds our expectations."
Hibino also believes that the next-generation security DWH will continue to evolve. "To operate the ever-increasing number of pipelines more efficiently, we plan to make functional improvements with scale in mind, such as making it possible to manage as code the interfaces that are currently GUI-based.
"The appeal of Google Cloud is that it has a rich set of analysis platforms and tools for engineers and developers. As the IT environment of companies undergoes major changes, the question of what can be done to improve security without sacrificing convenience and productivity has become a major theme. I'm personally interested in BeyondCorp Enterprise, the zero trust solution that Google Cloud is working on, and would like to dig deeper into how we can use it in our business in the future. To change with the times, I feel that we need to consider whether we can achieve carbon neutrality by migrating the data we have on-premises to the cloud, which has better power efficiency. I have high hopes for it," concludes Hibino.