Descartes Labs: Advancing global food security
Feeding a growing world population in a changing climate requires highly accurate agricultural forecasting beyond what traditional survey data can provide. Descartes Labs, a research- and analytics-driven company in New Mexico, is helping the world address food security crises and even identify early signs of famine outbreak. By using machine learning to analyze years of scientifically calibrated satellite imagery, the company can successfully predict changes in crop health and yield.
Named after the French mathematician René Descartes—who discovered that the position of a point can be determined by coordinates—Descartes Labs provides instant, programmatic access to satellite images of any geographic location. Using environmental change analysis, the company gives customers information about the global food supply through deep learning, remote sensing, and large-scale, high-performance computing. Governments, academic researchers, and food producers can use its forecasts to ensure crop harvests are sufficient and that critical links in the food chain remain economically healthy.
Descartes Labs’ forecasting platform uses satellite imagery to reveal more than just crop health. It offers insights into human populations, natural resources, the growth of cities, the spread of forest fires, and the state of available drinking water across the globe. The company also makes its platform available to organizations that want to analyze their own data to optimize pricing and better understand their customers.
Building a living, breathing atlas of the world with deep historical coverage of the entire planet involves massive datasets—and the amount of data is growing all the time. When Descartes Labs learned that Google Earth Engine hosted all NASA Landsat satellite imagery since 1973 natively on Google Cloud Storage, the company jumped at the chance to use images—the longest continuous observations of the Earth ever captured—to back-test models over many years. However, it needed a practical way to process more than 1 petabyte of satellite imagery of U.S. corn production data (4 quadrillion pixels) without setting up a large physical infrastructure, which would have taken more than six months and cost millions of dollars.
As a startup company in a highly competitive space, Descartes Labs could not afford to wait months to prove its viability in the agricultural research market. Using Google Cloud Platform, Descartes Labs was able to scale compute, networking, and storage to process the entire Landsat image archive in just over 15 hours. By enabling historical back-testing, the company indicates it can now predict corn yields faster and more accurately than government organizations that use survey data. On the first day its forecasting models received major media coverage, the corn futures market moved by 3%.
“Just a few years ago, we would have needed the world’s largest supercomputers to do what we can now do with Google Cloud Platform. Compared with other cloud solutions, GCP offers unmatched scalability, which is a business requirement as we push toward exascale computing.” - Mark Johnson, CEO, Descartes Labs
“We are constantly getting petabytes of new data from hundreds of satellites. With Google Cloud Platform, we don’t worry about whether the compute, network, or storage can scale and instead can focus on improving models and analyzing larger datasets for better forecasts.” - Tim Kelton, Co-founder & Cloud Architect, Descartes Labs
Supercomputing in the cloud with Google Cloud Platform
Descartes Labs can now scale its proprietary machine learning tools on demand, using Google Cloud Platform to process even the largest datasets, including imagery from the European Space Agency’s Sentinel satellite constellation. It has developed the first-ever global composite views from some of these satellites, showing different frequency bands to monitor changes in vegetation and the Earth’s surface.
To help keep costs low, it uses preemptible VMs—Google Compute Engine instances that are extremely affordable because they are short-lived. To help keep performance high, it uses tens of thousands of CPUs to ingest the imagery and high-bandwidth links to Google Cloud Storage, where the compressed imagery is stored. Soon, Descartes Labs expects to have nearly 15 petabytes of processed imagery on Google Cloud Storage, where it can be analyzed at any time.
“We use Google Cloud Storage as our large distributed file system because we know it will scale as our datasets grow beyond multiple petabytes. We saw aggregate read bandwidth of 230 gigabytes per second using 512 compute nodes, which is comparable to the best HPC storage systems in existence,” says Tim.
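To put the quoted figures in perspective, the short calculation below divides the aggregate bandwidth across the nodes and estimates how long a single pass over one petabyte would take at that rate. It is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes) and an even split of reads across nodes, neither of which the quote specifies, and the variable names are illustrative.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# Assumptions (not stated in the source): decimal units and an even
# distribution of reads across all nodes.

AGGREGATE_GB_PER_S = 230      # aggregate read bandwidth from the quote
NODES = 512                   # compute nodes from the quote
PETABYTE_IN_GB = 1_000_000    # 1 PB = 10^6 GB in decimal units

# Average share of the aggregate bandwidth handled by each node.
per_node_gb_per_s = AGGREGATE_GB_PER_S / NODES
per_node_gbit_per_s = per_node_gb_per_s * 8

# Time for one full read of a petabyte at the aggregate rate.
seconds_per_petabyte = PETABYTE_IN_GB / AGGREGATE_GB_PER_S

print(f"Per node: {per_node_gb_per_s:.2f} GB/s ({per_node_gbit_per_s:.1f} Gbit/s)")
print(f"One pass over 1 PB: {seconds_per_petabyte / 60:.0f} minutes")
```

Under these assumptions each node reads about 0.45 GB/s (roughly 3.6 Gbit/s), and a single pass over one petabyte takes about 72 minutes, before any time spent decompressing or analyzing the imagery.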
“Google Container Engine lets us get code into production faster and provide better APIs to our customers. It’s like playing Tetris with our workloads—everything automatically goes in the best slot, and it’s constantly checking to make sure services are always available,” says Tim.
Providing more value to customers with Google Cloud SQL
As images are analyzed, Descartes Labs captures information about each scene as vector data, and uses Google Cloud Pub/Sub in conjunction with a microservice hosted on Google Container Engine to persist that information into a PostGIS database for geospatial queries. To provision databases quickly, Descartes Labs uses Google Cloud SQL, a fully managed database service that makes it easy to set up, maintain, and administer databases in the cloud.
“Google Cloud SQL supports PostGIS, so it’s easy to get data into a geospatial database. Our developers can quickly provision PostGIS, PostgreSQL, and MySQL databases without assistance. Thanks to Cloud SQL, we have more time to work on products that provide value to our customers,” adds Tim.
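The persistence step described in this section can be sketched as a small function that turns one message payload into a parameterized PostGIS insert. This is a minimal illustration, not Descartes Labs' actual service: the message schema (scene_id, acquired, footprint), the scenes table, and the function name are all hypothetical, and the %s placeholders assume a psycopg2-style database driver. The only PostGIS functions used are ST_GeomFromGeoJSON and ST_SetSRID.

```python
import json

# Hypothetical table and columns; %s placeholders assume a psycopg2-style driver.
INSERT_SCENE_SQL = (
    "INSERT INTO scenes (scene_id, acquired, footprint) "
    "VALUES (%s, %s, ST_SetSRID(ST_GeomFromGeoJSON(%s), 4326))"
)


def scene_insert_from_message(payload: bytes):
    """Decode one message payload (bytes) into an SQL statement and its parameters."""
    record = json.loads(payload.decode("utf-8"))
    footprint = record["footprint"]
    # Reject anything that is not an area geometry before it reaches the database.
    if footprint.get("type") not in ("Polygon", "MultiPolygon"):
        raise ValueError("footprint must be a GeoJSON Polygon or MultiPolygon")
    params = (record["scene_id"], record["acquired"], json.dumps(footprint))
    return INSERT_SCENE_SQL, params


# Example payload with made-up values, shaped like the hypothetical schema above.
example = json.dumps({
    "scene_id": "scene-0001",
    "acquired": "2016-07-08T17:30:00Z",
    "footprint": {
        "type": "Polygon",
        "coordinates": [[[-107.0, 35.0], [-106.0, 35.0], [-106.0, 36.0],
                         [-107.0, 36.0], [-107.0, 35.0]]],
    },
}).encode("utf-8")

sql, params = scene_insert_from_message(example)
print(sql)
print(params[0], params[1])
```

In a running service, the returned statement and parameters would be passed to a database cursor inside the message handler, and the message acknowledged only after the insert commits, so that a failed write is redelivered rather than lost.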
Building on Google innovation
Descartes Labs uses Google BigQuery to analyze logs from its applications and the APIs it provides to customers, gaining valuable insight to improve products. For monitoring and alerting, it uses Google Stackdriver Monitoring, which helps engineers resolve issues quickly. To store visual signatures that identify landmarks or structures such as wind turbines, solar farms, golf courses, or airport runways for the company’s GeoVisual Search tool, it uses Google Cloud Bigtable, which returns search results much faster than a standard relational database.
“A major advantage of Google Cloud Platform is that we get access to the same tools Google uses to power its own services, which helps us improve developer efficiency. Bigtable is a perfect example. Google has expertise in search, and we can benefit from the same technology,” says Tim.
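As a rough illustration of what searching by visual signature involves, the toy example below ranks stored signatures by similarity to a query. It is a sketch under loud assumptions: the source does not describe how signatures are encoded or how they are laid out in Bigtable, so this example treats each signature as a short bit string held in a Python dictionary and compares them by Hamming distance. All names and values are made up.

```python
from typing import Dict, List

# Toy similarity search over binary "visual signatures".
# Assumption: signatures are fixed-length bit strings compared by Hamming
# distance. The real encoding and storage schema are not given in the source.


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def nearest_signatures(query: int, index: Dict[str, int], k: int = 2) -> List[str]:
    """Return the k keys whose signatures are closest to the query (linear scan)."""
    return sorted(index, key=lambda key: hamming_distance(query, index[key]))[:k]


# Made-up 8-bit signatures standing in for features extracted from image tiles.
index = {
    "wind_turbine_a": 0b11110000,
    "wind_turbine_b": 0b11110001,
    "golf_course": 0b00001111,
    "runway": 0b10101010,
}

print(nearest_signatures(0b11110000, index))
# ['wind_turbine_a', 'wind_turbine_b']
```

A linear scan like this only works for a handful of entries. Serving results quickly over a planet-scale set of tiles is where a low-latency key-value store becomes relevant.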
Just a few years ago, Descartes Labs was a new company needing to maximize productivity on a limited budget. G Suite was the natural choice to help the company grow, and today it supports ten times the original number of employees. Descartes Labs depends on Gmail, Google Drive, Google Sheets, Google Docs, and Google Calendar to keep business moving, and regularly uses Hangouts to meet with customers and connect employees across offices.
“G Suite offers excellent security with multi-factor authentication, which is important to us. We started our company using G Suite apps and never looked back. It’s an effective and affordable way to communicate and collaborate on the devices our teams rely on,” says Tim.
Scaling for the future of forecasting
As the cost of satellites and satellite launches comes down dramatically, the volume of Earth observation imagery is increasing rapidly. With Google Cloud Platform, Descartes Labs is enabling an understanding of the world at a scale and at a level of granularity that would have been impossible just a few years ago. The company expects to have more than 30 petabytes on Google Cloud Platform within the next few years. As datasets grow, it is exploring new ways to make model training even more efficient, including using Tensor Processing Units (TPUs) from Google to accelerate machine learning.
Descartes Labs collaborates with Google to learn about the latest innovations that can help maximize the value of its forecasting platform. The results of this collaboration could have enormous and lasting impact, helping people worldwide avoid food shortages and other potentially catastrophic events.
“For a company with dreams to change the world, our collaboration with Google is invaluable. The Google Cloud team has been eager to partner with us, helping us scale and get maximum benefit from Google Cloud Platform. We look forward to continue working with Google to push the limits of cloud computing,” adds Mark.