Posted by Kaz Sato, Developer Advocate, Google Cloud Platform
It’s not hyperbole to say that use cases for machine learning and deep learning are only limited by our imaginations. About one year ago, a former embedded systems designer from the Japanese automobile industry named Makoto Koike started helping out at his parents’ cucumber farm, and was amazed by the amount of work it takes to sort cucumbers by size, shape, color and other attributes.
Makoto's father is very proud of his thorny cucumber, for instance, having dedicated his life to delivering fresh and crispy cucumbers, with many prickles still on them. Straight and thick cucumbers with a vivid color and lots of prickles are considered premium grade and command much higher prices on the market.
But Makoto learned very quickly that sorting cucumbers is as hard and tricky as actually growing them. "Each cucumber has different color, shape, quality and freshness," Makoto says.
Cucumbers from Makoto's farm and ones from retail stores
In Japan, each farm has its own classification standard and there's no industry standard. At Makoto's farm, they sort them into nine different classes, and his mother sorts them all herself — spending up to eight hours per day at peak harvesting times.
"The sorting work is not an easy task to learn. You have to look at not only the size and thickness, but also the color, texture, small scratches, whether or not they are crooked and whether they have prickles. It takes months to learn the system and you can't just hire part-time workers during the busiest period. I myself only recently learned to sort cucumbers well,” Makoto said.
Distorted or crooked cucumbers are ranked as low-quality product
There are also some automatic sorters on the market, but they have limitations in terms of performance and cost, and small farms don't tend to use them.
Makoto doesn’t think sorting is an essential task for cucumber farmers. "Farmers want to focus and spend their time on growing delicious vegetables. I'd like to automate the sorting tasks before taking the farm business over from my parents."
Makoto Koike, center, with his parents at the family cucumber farm
The many uses of deep learning
Makoto first got the idea to explore machine learning for sorting cucumbers from a completely different use case: Google AlphaGo competing with the world's top professional Go player.
"When I saw the Google's AlphaGo, I realized something really serious is happening here,” said Makoto. “That was the trigger for me to start developing the cucumber sorter with deep learning technology."
Using deep learning for image recognition allows a computer to learn from a training data set what the important "features" of the images are. By using a hierarchy of numerous artificial neurons, deep learning can automatically classify images with a high degree of accuracy. Thus, neural networks can recognize different species of cats, or models of cars or airplanes from images. Sometimes neural networks can exceed the performance of the human eye for certain applications. (For more information, check out my previous blog post Understanding neural networks with TensorFlow Playground.)
TensorFlow democratizes the power of deep learning
But can computers really learn mom's art of cucumber sorting? Makoto set out to see whether he could use deep learning technology for sorting using Google's open source machine learning library, TensorFlow.
"Google had just open sourced TensorFlow, so I started trying it out with images of my cucumbers,” Makoto said. “This was the first time I tried out machine learning or deep learning technology, and right away got much higher accuracy than I expected. That gave me the confidence that it could solve my problem."
With TensorFlow, you don't need to be knowledgeable about the advanced math models and optimization algorithms needed to implement deep neural networks. Just download the sample code and read the tutorials and you can get started in no time. The library lowers the barrier to entry for machine learning significantly, and since Google open-sourced TensorFlow last November, many "non ML" engineers have started playing with the technology with their own datasets and applications.
Cucumber sorting system design
Here's a systems diagram of the cucumber sorter that Makoto built. The system uses Raspberry Pi 3 as the main controller to take images of the cucumbers with a camera, and in a first phase, runs a small-scale neural network on TensorFlow to detect whether or not the image is of a cucumber. It then forwards the image to a larger TensorFlow neural network running on a Linux server to perform a more detailed classification.
Makoto used the sample TensorFlow code Deep MNIST for Experts with minor modifications to the convolution, pooling and last layers, changing the network design to adapt to the pixel format of cucumber images and the number of cucumber classes.
Here's Makoto’s cucumber sorter, which went live in July:
Here's a close-up of the sorting arm, and the camera interface:
And here is the cucumber sorter in action:
Pushing the limits of deep learning
One of the current challenges with deep learning is that you need to have a large number of training datasets. To train the model, Makoto spent about three months taking 7,000 pictures of cucumbers sorted by his mother, but it’s probably not enough.
"When I did a validation with the test images, the recognition accuracy exceeded 95%. But if you apply the system with real use cases, the accuracy drops down to about 70%. I suspect the neural network model has the issue of "overfitting" (the phenomenon in neural network where the model is trained to fit only to the small training dataset) because of the insufficient number of training images."
The second challenge of deep learning is that it consumes a lot of computing power. The current sorter uses a typical Windows desktop PC to train the neural network model. Although it converts the cucumber image into 80 x 80 pixel low-resolution images, it still takes two to three days to complete training the model with 7,000 images.
"Even with this low-res image, the system can only classify a cucumber based on its shape, length and level of distortion. It can't recognize color, texture, scratches and prickles,” Makoto explained. Increasing image resolution by zooming into the cucumber would result in much higher accuracy, but would also increase the training time significantly.
To improve deep learning, some large enterprises have started doing large-scale distributed training, but those servers come at an enormous cost. Google offers Cloud Machine Learning (Cloud ML), a low-cost cloud platform for training and prediction that dedicates hundreds of cloud servers to training a network with TensorFlow. With Cloud ML, Google handles building a large-scale cluster for distributed training, and you just pay for what you use, making it easier for developers to try out deep learning without making a significant capital investment.
These specialized servers were used in the AlphaGo match
Makoto is eagerly awaiting Cloud ML. "I could use Cloud ML to try training the model with much higher resolution images and more training data. Also, I could try changing the various configurations, parameters and algorithms of the neural network to see how that improves accuracy. I can't wait to try it."