How AlloyDB transformed Bayer’s data operations
Aaron Joyce
Engineering Lead, Bayer Crop Science
Editor’s note: Bayer built its modern data solution, Field Answers, to store and analyze vast amounts of observational data. As part of the process of preparing to onboard new market segments, extensive load testing revealed that a new database solution was necessary to handle the dramatic increase in traffic. Migrating to AlloyDB for PostgreSQL has helped streamline operations, centralize solutions, and improve collaboration across the company.
Bayer uses the power of science to shape the future of farming. Our Global Data Assets team manages geospatial data for the company. We support hundreds of teams and applications with access to maps, phenotypic observations (observed physical characteristics of a plant), satellite imagery, and other environmental data like weather and soil strata.
Field Answers is a modern data solution we created to efficiently collect and compute billions of observations across field and greenhouse operations globally. This data is vital for the decisions made at various stages in our research and development (R&D) pipelines, including choosing the best seeds, optimizing the costs of production, and marketing our products to farmers. However, managing such a large-scale system presents its own challenges.
Weeding out database challenges
As we prepared to onboard a new market segment to Field Answers, we anticipated a dramatic increase in traffic to the tool. Field Answers is a distributed solution, and it is sensitive to ordering and replication lag, both of which can affect its performance. Based on extensive load testing, we knew our open-source PostgreSQL setup would not be able to meet latency and throughput demands. This would reduce access to the valuable datasets our teams require.
We needed a new database, and we needed it fast. After trying multiple products, AlloyDB for PostgreSQL emerged as our top choice. We received consistent support from the Google Cloud team as we tested AlloyDB, which assured us they would be there in case of any unforeseen migration issues. Because it was compatible with our existing Postgres database, we could migrate with zero application changes and hit our aggressive migration timelines. With the North American agricultural planting season just around the corner, that compatibility was huge!
Harvesting growth with AlloyDB
Migrating to AlloyDB has been transformative for our business. In our previous PostgreSQL setup, the primary writer was responsible for both write operations and replicating those changes to reader nodes. The anticipated increase in write traffic and reader count would have overwhelmed this node, leading to potential bottlenecks and increased replication lag. AlloyDB's architecture, which utilizes a single source of truth for all nodes, significantly reduced the impact of scaling read traffic. After migrating, we saw a dramatic improvement in performance, ensuring our ability to meet growing demands and maintain consistently low replication delay. In parallel load tests, a smaller AlloyDB instance reduced response times by over 50% on average and increased throughput by 5x compared to our previous PostgreSQL solution.
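For readers who want to derive comparable figures from their own load tests, here is a minimal sketch of the arithmetic behind a "response time reduction" and a "throughput multiplier". The function names and the sample numbers are hypothetical and chosen only for illustration; they are not Bayer's actual measurements or tooling.

```python
def percent_reduction(baseline_ms: float, candidate_ms: float) -> float:
    """Return how much lower the candidate's average response time is, in percent."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return 100 * (baseline_ms - candidate_ms) / baseline_ms


def throughput_multiplier(baseline_rps: float, candidate_rps: float) -> float:
    """Return the candidate's throughput as a multiple of the baseline's."""
    if baseline_rps <= 0:
        raise ValueError("baseline_rps must be positive")
    return candidate_rps / baseline_rps


# Hypothetical load-test results (illustrative values only).
baseline = {"avg_response_ms": 200, "requests_per_sec": 1000}
candidate = {"avg_response_ms": 90, "requests_per_sec": 5000}

reduction = percent_reduction(baseline["avg_response_ms"], candidate["avg_response_ms"])
multiplier = throughput_multiplier(baseline["requests_per_sec"], candidate["requests_per_sec"])

print(f"Response time reduced by {reduction:.0f}%")
print(f"Throughput increased {multiplier:.0f}x")
```

Running both systems under the same parallel load, as described above, is what makes the two ratios comparable.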
By migrating to AlloyDB, we've ensured that our business growth won't be hindered by database limitations, allowing us to focus on innovation. The true test of our migration came during our first peak harvest season, a time when performance is critical for product decision timelines. Due to agriculture’s seasonal nature, a delay of just a few days can postpone a product launch by an entire year. Our customers were understandably nervous, but thanks to Google Cloud and AlloyDB, the harvest season went as smoothly as we could have hoped.
Cultivating a thriving data architecture
Partnering with Google Cloud has played a crucial role in implementing our data strategy, an adaptation of the data mesh approach where each asset serves a particular data domain. This approach allows us to decentralize data ownership and management, enabling domain-driven teams to take responsibility for their data while ensuring quality, accessibility, and governance.
To support our data strategy, we have adopted a consistent architecture across our Google Cloud projects. For a typical project, the stack consists of Google Kubernetes Engine (GKE) hosted pods and pipelines for publishing events and analytics data. While Bayer uses Apache Kafka across teams and cloud providers for data streaming, individual teams regularly use Pub/Sub internally for messaging and event-driven architectures. Data for analytics and reporting is generally stored in BigQuery, with custom processes for materialization once it lands. By using cross-project BigQuery datasets, we are able to work with a larger, real-time user group and enhance our operational capabilities.
Nurturing innovation through collaboration
Looking ahead, we're excited about the potential to combine AlloyDB, Datastream, Pub/Sub, and BigQuery. With the built-in integrations between these tools, we see opportunities to reduce toil, increase reliability, and scale our applications more effectively. We're also eager to explore AlloyDB’s integration with Vertex AI, which could open up new opportunities to use machine learning and advanced analytics.
As we continue our journey with Google Cloud, we're confident we have the right tools to tackle the challenges and opportunities that lie ahead. By leveraging the power of AlloyDB and the Google Cloud ecosystem, we're not only enhancing our own operational capabilities but also contributing to the future of farming. With more efficient and innovative solutions, we can help farmers make data-driven decisions, optimize their operations, and ultimately, feed the world more sustainably. The future of agriculture is digital, and we're proud to be at the forefront of this transformation with Google Cloud by our side.
Next steps
- Discover how AlloyDB combines the best of PostgreSQL with the power of Google Cloud in our latest e-book.
- Try AlloyDB at no cost for 30 days with AlloyDB free trial clusters!
- Discover the power of BigQuery.