Digital media is one of the fastest growing areas on the internet. According to a market study conducted by Informa Telecoms & Media in 2012, the global online video market alone will reach $37 billion in 2017 [1]. Other common media types include images, music, and digital documents. One driving force for this phenomenal growth is the popularity of feature-rich mobile devices equipped with higher resolution cameras, bigger screens, and faster data connections. This has led to a massive increase in media content production and consumption. Another driving force is the trend among many social networks to incorporate media sharing as a core feature of their systems. Meanwhile, numerous startup companies are trying to build their own niches in this market.
This paper will use an example scenario to provide a technical deep-dive on how to use Google Cloud Platform to build a digital media asset management and sharing system.
Example Scenario - Photofeed
Photofeed, a fictitious start-up company, is interested in building a photo sharing application that allows users to upload and share photos with each other. This application also includes a social aspect and allows people to post comments about photos. Photofeed's product team believes that to be competitive in this space, users must be able to upload, view, and edit photos quickly and securely, with a great user experience. Additionally, they would like the application to scale easily as the number of users and photos increases. To achieve these goals, the system must also have an efficient pipeline for photo processing, such as resizing, cropping, and thumbnail generation. As the business grows, the system must allow the development team to rapidly introduce new features.
Challenges In Building Scalable Digital Media Systems
Building a scalable digital media system from scratch that supports a large number of users and stores a huge amount of media content is not a trivial task. The following list provides an overview of the common technical challenges associated with building scalable digital media systems:
- The system must allow end users to quickly and securely upload media objects while still providing a compelling user experience.
- Metadata of the media objects needs to be ingested, and kept in sync when media objects are modified or re-ingested.
- The ingestion workflow that defines the communication among all involved components needs to be managed.
- The system needs virtually unlimited storage for the media content, and the storage must be reliable, globally accessible, and cost effective [2].
- Scalable computing resources are required for media processing, such as document format conversion, image processing, and media transcoding.
- The media processing workflow needs to be managed.
- The system must allow end users to quickly and securely download media content while still providing a good user experience.
- The serving workflow needs to be managed.
- Media applications: The system supports the integration of media metadata with application-specific domain data. It also allows for the development of scalable media applications, such as asset management and content sharing, on top of this data.
- End user experiences: The system provides compelling user experiences for multiple clients, such as browsers, mobile devices, and desktop applications.
The solution presented in this paper demonstrates how Google Cloud Platform is able to address each of the challenges described above. The proposed system architecture is generally applicable to any media type. The solution serves as a reference for software architects and software developers for building their own digital media systems on Google Cloud Platform.
App Engine, Cloud Storage, and Compute Engine are three core products of Google Cloud Platform. As shown in Figure 1, these products work together to form the basis of this digital media asset management and sharing solution.
Both Cloud Storage and App Engine play a critical role in media content ingestion. During an upload, media content flows directly from the client, through the global Google network, into Cloud Storage. With its global reach, massive bandwidth, and integration with Cloud Storage, the Google network allows content to be ingested with low latency from almost anywhere. Cloud Storage supports two common upload mechanisms: HTTP POST using a signed URL, and the RESTful APIs.
App Engine is designed to power scalable web applications that handle millions of users. The front-end application for content ingestion can be developed on App Engine. The application is responsible for authentication, allowing only authorized users to upload content. It also manages the ingestion workflow and coordinates with the clients to upload content to Cloud Storage. For browser clients, the application implements the web user interface for content uploading. For mobile or desktop clients, the user interface resides in the client application, while the App Engine application exposes its functionality as RESTful APIs using Cloud Endpoints. The client-side applications call these APIs for authentication and for gaining access to Cloud Storage.
Another important role of the App Engine application is ingesting metadata and keeping it in sync with the media content. The metadata is stored along with the application data in the App Engine Datastore or in a Cloud SQL database. The choice of storage option depends on the characteristics of your application. There are a few ways to synchronize metadata ingestion with media content ingestion, for example: (1) by using a Blobstore upload callback URL, (2) by using Cloud Storage Object Change Notification, or (3) simply by exposing appropriate APIs from the App Engine application using Cloud Endpoints.
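As a rough illustration of option (2), the sketch below applies a change notification to a metadata store so that metadata follows uploads, re-ingestion, and deletions. It is a minimal sketch, not the paper's implementation: `MetadataStore` is a hypothetical in-memory stand-in for Datastore or Cloud SQL, and the state strings (`exists`, `not_exists`, `sync`) and payload field names (`bucket`, `name`, `size`, `contentType`, `updated`) are assumptions about the notification format that should be checked against the Cloud Storage documentation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class MediaMetadata:
    bucket: str
    name: str
    size: int
    content_type: str
    updated: str  # timestamp string, stored as delivered


class MetadataStore:
    """In-memory stand-in for Datastore or Cloud SQL (hypothetical)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], MediaMetadata] = {}

    def upsert(self, meta: MediaMetadata) -> None:
        # Keyed by (bucket, object name), so re-ingesting an object
        # overwrites its existing row instead of creating a duplicate.
        self._rows[(meta.bucket, meta.name)] = meta

    def delete(self, bucket: str, name: str) -> None:
        self._rows.pop((bucket, name), None)

    def get(self, bucket: str, name: str) -> Optional[MediaMetadata]:
        return self._rows.get((bucket, name))


def handle_notification(store: MetadataStore, resource_state: str, payload: dict) -> None:
    """Apply one notification. State names and payload keys are assumed."""
    if resource_state == "exists":
        store.upsert(MediaMetadata(
            bucket=payload["bucket"],
            name=payload["name"],
            size=int(payload.get("size", 0)),
            content_type=payload.get("contentType", "application/octet-stream"),
            updated=payload.get("updated", ""),
        ))
    elif resource_state == "not_exists":
        store.delete(payload["bucket"], payload["name"])
    # Any other state (for example an initial "sync" handshake)
    # carries no object change, so it is ignored.
```

Because the store is keyed by bucket and object name, handling the same notification twice leaves the store unchanged, which keeps the handler safe if notifications are redelivered.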
Cloud Storage provides virtually unlimited storage for media content at low cost. Media data is stored redundantly. By leveraging the Google network, content in Cloud Storage is globally accessible from App Engine, from Compute Engine, and from the public internet outside Google Cloud Platform.
Cloud Storage provides several storage classes to choose from to best meet your needs.
Compute Engine is well suited to batch computation. Media processing, such as document format conversion, transcoding, and image manipulation, is a natural candidate for Compute Engine. In this case, Cloud Storage acts as both the input source and the output destination of the media processing pipeline. Because Cloud Storage is well integrated with Compute Engine, for example through automatic authentication via a service account, it can be easily accessed from Compute Engine.
The media processing workflow is also managed by the App Engine application mentioned previously. After media content is uploaded into storage, the App Engine application creates media processing tasks and inserts them into a Task Queue. The media processing software running on Compute Engine pulls the enqueued tasks using RESTful APIs and executes them. The App Engine application can also track the processing status of the media content and the load on the virtual machines in order to autoscale the Compute Engine instances.
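The pull-based hand-off described above can be sketched with a small in-memory queue. This is an illustration of the lease-then-delete pattern only: `PullQueue`, `Task`, and `run_worker_once` are hypothetical names, not the App Engine Task Queue API, and the lease duration and batch size are arbitrary assumptions.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    task_id: int
    payload: str
    lease_expires_at: float = 0.0  # 0.0 means the task has never been leased


class PullQueue:
    """In-memory stand-in for a pull queue (hypothetical, not the real API)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._tasks: Dict[int, Task] = {}
        self._next_id = 1

    def add(self, payload: str) -> int:
        task_id = self._next_id
        self._next_id += 1
        self._tasks[task_id] = Task(task_id, payload)
        return task_id

    def lease(self, lease_seconds: float, max_tasks: int) -> List[Task]:
        # Hand out tasks whose lease is absent or expired, and hide them
        # from other workers until the new lease runs out.
        now = self._clock()
        leased: List[Task] = []
        for task in self._tasks.values():
            if len(leased) >= max_tasks:
                break
            if task.lease_expires_at <= now:
                task.lease_expires_at = now + lease_seconds
                leased.append(task)
        return leased

    def delete(self, task_id: int) -> None:
        self._tasks.pop(task_id, None)

    def __len__(self) -> int:
        return len(self._tasks)


def run_worker_once(queue: PullQueue, process: Callable[[str], None],
                    lease_seconds: float = 60.0, max_tasks: int = 10) -> int:
    """Lease a batch, process each task, and delete only the successes."""
    completed = 0
    for task in queue.lease(lease_seconds, max_tasks):
        try:
            process(task.payload)
        except Exception:
            continue  # keep the task; it becomes leasable again after expiry
        queue.delete(task.task_id)
        completed += 1
    return completed
```

Deleting a task only after it has been processed means a worker that crashes mid-task does not lose work: the lease expires and another worker can pick the task up.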
Cloud Storage leverages the Google network to allow media content to be served across the internet with low latency and high availability. The Google network automatically provides edge caching capability for public content, which can significantly lower the serving costs.
As is the case with ingestion, the App Engine application handles user authentication and authorization, and coordinates access to Cloud Storage from the clients. For browser clients, the App Engine application powers the web user interface for media content downloading. For mobile or desktop clients, the client-side applications implement the user interface and communicate with the App Engine application through APIs exposed using Google Cloud Endpoints.
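One way an application can hand a client time-limited access to a single object is a signed URL. The sketch below shows the general idea using an HMAC-SHA256 signature over the method, expiry time, and path. This is a swapped-in, simplified scheme for illustration: it is not the Cloud Storage signing format, which relies on service account credentials and is normally produced through Google's tools or client libraries. The function names, query parameter names, and secret are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, quote, unquote, urlencode, urlparse


def _signature(secret: bytes, method: str, path: str, expires_at: int) -> str:
    # The signed string binds the HTTP method, the expiry, and the resource.
    string_to_sign = f"{method}\n{expires_at}\n{path}"
    return hmac.new(secret, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_url(secret: bytes, method: str, path: str, expires_at: int) -> str:
    """Return a relative URL carrying an expiry and a signature."""
    query = urlencode({
        "Expires": expires_at,
        "Signature": _signature(secret, method, path, expires_at),
    })
    return f"{quote(path)}?{query}"


def verify_url(secret: bytes, method: str, url: str, now: int) -> bool:
    """Check that the URL is unexpired and was signed for this method and path."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["Expires"][0])
        signature = params["Signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    expected = _signature(secret, method, unquote(parsed.path), expires_at)
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(signature, expected)
```

For example, `sign_url(secret, "GET", "/photofeed-media/u1/cat.jpg", expires_at)` yields a URL that only verifies for a `GET` of that exact object before `expires_at`; changing the method, the path, or the expiry invalidates the signature.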
Various media applications can be built on top of the metadata and application data. Depending on the application domain, common examples include asset management, content sharing, and social gaming. App Engine provides a scalable platform to build media applications. App Engine applications are easy to build, easy to maintain, and easy to scale as your traffic and data storage needs grow. This allows developers to focus on building their core business and bring new features to market quickly.
In this solution, the App Engine application plays a critical role in defining user experiences for the system. As mentioned earlier, for browser clients, the App Engine application implements the web user interface for ingestion, serving, and media applications. For mobile and desktop clients, the App Engine application exposes its functionality as APIs using Cloud Endpoints. The native user interface at the client side is powered by these APIs.
The next section walks through the implementation details of the proposed digital media solution. It begins with a list of key components of the system and ends with a detailed presentation of the three important workflows of the system: media ingestion flow, media processing flow, and media serving flow.
- Frontend and Media Applications Running on App Engine
  - Authenticates and authorizes users and coordinates access to Cloud Storage.
  - Implements the user interface for browser clients, and/or exposes APIs using Cloud Endpoints to mobile and desktop clients.
  - Plays the role of system controller and is responsible for managing workflows for media ingestion, serving, and processing.
  - Powers scalable media applications, with built-in load balancing and autoscaling.
- Cloud Datastore
  - Stores media content metadata and the application data model.
- Cloud SQL
  - Stores media content metadata and the application data model, as an alternative to Cloud Datastore.
- App Engine Task Queue
  - Integrates the App Engine application with the media processing software running on Compute Engine.
- Image Services
  - Provides dynamic image processing services for App Engine applications, such as thumbnail generation, resizing, and cropping.
- Cloud Storage
  - Provides scalable and highly available storage for media content. The storage can be accessed by using RESTful APIs and/or signed URLs.
  - Leverages the Google network for (1) fast and secure content ingestion into, and serving from, the storage, and (2) edge caching of public content, which lowers serving costs.
- Media Processing Server
  - Executes media processing on Compute Engine.
Media Ingestion Workflow and Media Processing Workflow
The media ingestion workflow and the media processing workflow are often tied together. Both workflows are shown in the component communication diagram in Figure 2.
1. The client accesses the App Engine application to start an upload. Depending on the type of client, this request can be (1) a simple HTTP request from the browser or (2) a call to an endpoint implemented by the App Engine application from a mobile or desktop application, such as a batch uploader. The App Engine application is responsible for authenticating the client/user and coordinating access to Cloud Storage.
2. If the client is a web browser, the application can generate a signed upload URL for Cloud Storage, embedded in an HTTP POST form. Otherwise, if the client is a mobile or desktop application, the web application returns Cloud Storage access information in the endpoint call response.
3. Regardless of the client, media files are uploaded to Cloud Storage directly by using either the web form or the Cloud Storage RESTful APIs.
4. Cloud Storage returns a response to the client. Depending on the upload mechanism used in Step 3, the response is either an HTTP response for a form-based upload or a RESTful API response.
5. If the upload succeeds, the media metadata needs to be pushed into the App Engine application. There are a few different ways to streamline the process:
   - For browser clients using an upload form, a callback URL can be specified inside the upload URL. Based on the response, the browser can be redirected to this URL with limited metadata information embedded in the callback URL.
   - Cloud Storage can notify the App Engine application when an upload succeeds by using a Cloud Storage feature called Object Change Notification. The notification contains metadata of the uploaded media object.
   - Based on the content upload response from Cloud Storage, clients can also call the App Engine application's Cloud Endpoints directly to upload any metadata.
6. The App Engine application stores the metadata in a persistent store. There are two options for the data store, depending on the application setup: (1) the App Engine NoSQL Datastore or (2) Cloud SQL.
7. If media processing is required, the App Engine application can create a task on the task queue to start the media processing workflow. The App Engine application can also start or shut down virtual machines on demand, based on the workload.
8. The media processing software, running on Compute Engine, pulls the task from the queue and executes the required procedures.
9. The media processing software reads the media content from Cloud Storage, processes it, and stores the output back to Cloud Storage.
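The workload-based scaling mentioned in the workflow above can be reduced to a small sizing rule. The sketch below is one possible heuristic, not the paper's method: it sizes the worker pool so that the current queue backlog drains within a target time. The function name and every parameter (throughput per instance, drain target, instance bounds) are hypothetical and would need to be measured or chosen for a real deployment.

```python
import math


def target_instance_count(backlog_tasks: int,
                          tasks_per_instance_per_minute: float,
                          drain_minutes: float,
                          min_instances: int = 0,
                          max_instances: int = 10) -> int:
    """Return how many workers are needed to drain the backlog in time."""
    if tasks_per_instance_per_minute <= 0 or drain_minutes <= 0:
        raise ValueError("throughput and drain target must be positive")
    if backlog_tasks < 0:
        raise ValueError("backlog cannot be negative")
    # Work one instance can finish within the drain target.
    capacity_per_instance = tasks_per_instance_per_minute * drain_minutes
    needed = math.ceil(backlog_tasks / capacity_per_instance)
    # Clamp to the configured bounds to cap cost and keep a warm minimum.
    return max(min_instances, min(max_instances, needed))
```

For instance, with 95 queued tasks, 5 tasks per instance per minute, and a 10-minute drain target, each instance can absorb 50 tasks, so the rule asks for 2 instances.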
Media Serving And Download Workflow
Figure 3 describes the media serving and download workflow and is accompanied by a list of detailed descriptions.
1. The clients start the media download by contacting the App Engine application, which authenticates and authorizes the clients and also allows browsing and searching for specific media content. This is accomplished either by presenting a web user interface to browser clients or via a RESTful API provided by the App Engine application using Cloud Endpoints.
2. Based on the media metadata and application data in Datastore or Cloud SQL, the App Engine application can check the content sharing rules defined in the application and look up access information for the content stored in Cloud Storage.
3. For content to be securely downloaded from Cloud Storage, the App Engine application can generate a signed URL or provide an OAuth access token, along with the Cloud Storage bucket and object names, to the client. For browsers, the information is embedded in the web user interface. For mobile and desktop clients, the information is returned in the response of the RESTful API mentioned in Step 1.
4. The clients request the content from Cloud Storage by sending an HTTP request or by calling the RESTful API. Cloud Storage can leverage the caching capability of the Google network for public content. If the content is available in the cache, it is returned from the cache. Otherwise, the following occurs:
   - The content is retrieved from Cloud Storage and the cache is filled.
   - The App Engine application can proxy the content retrieved from Cloud Storage through App Engine Image Services for on-the-fly resizing and cropping of images.
5. The media content is served to the client.
- The metadata for the media content can be stored along with the application data, either in App Engine Datastore or in Cloud SQL. The choice depends on the size of the data, the characteristics of the overall data model, and the development team's expertise. For example, you may want to choose Cloud SQL if you have highly relational data. Alternatively, you may want to choose Datastore if you are scaling denormalized data to a massive data set. The trade-off between the two options is discussed in the Google I/O 2012 session, SQL vs NoSQL: Battle of the Backends.
- The provided solution uses Compute Engine for media processing. Compute Engine allows running custom software and packages on supported operating systems. The platform is suitable for general purpose media processing. Alternatively, for simple image and photo manipulations, App Engine provides an Image Service that can perform image processing on the fly.
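Whichever service performs the image manipulation, fitting a photo into a fixed-size slot comes down to the same geometry: scale the image until it covers the target box, then crop the overflow around the center. The sketch below computes only those numbers. The function `fit_and_crop` is hypothetical and does not call any image library or the App Engine image API; its output would be passed to whatever resize and crop operations the chosen service provides.

```python
from typing import Tuple

Size = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def fit_and_crop(src_w: int, src_h: int, dst_w: int, dst_h: int) -> Tuple[Size, Box]:
    """Return the resized dimensions and the centered crop box.

    The image is scaled, preserving its aspect ratio, until it covers the
    destination box; the crop box then trims the overflowing dimension.
    """
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Use the larger ratio so both dimensions reach at least the target size.
    scale = max(dst_w / src_w, dst_h / src_h)
    # Guard against rounding leaving a dimension one pixel short.
    new_w = max(dst_w, round(src_w * scale))
    new_h = max(dst_h, round(src_h * scale))
    left = (new_w - dst_w) // 2
    top = (new_h - dst_h) // 2
    return (new_w, new_h), (left, top, left + dst_w, top + dst_h)
```

A 100 x 400 portrait image fitted into a 50 x 50 thumbnail is first resized to 50 x 200, then cropped to the box (0, 75, 50, 125), keeping the vertical middle of the photo.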
A sample photo sharing application [3] has been developed to demonstrate how a media asset management and sharing solution, like the one described earlier in the scenario, can be implemented. The photo sharing application allows a user to upload photos and make them available for other users to view. A user can also post comments on uploaded photos. The following list details the key elements of this use case scenario:
- A user is required to log in with a valid Google account to use the application.
- A user uploads a photo and a description from a local disk.
- All photos uploaded into the photo sharing application are displayed in chronological order.
- A user can add comments, visible to all users, to any photo.
- When a photo is displayed, the image can be resized and cropped to fit into the user interface.
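The use case elements above suggest a small application data model. The sketch below is a guess at its shape, not the schema of the sample application: the class and field names are hypothetical, plain dataclasses stand in for Datastore or Cloud SQL entities, and the feed is sorted oldest first by default because the text only says "chronological order".

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Iterable, List


@dataclass
class Comment:
    author: str
    text: str
    posted_at: datetime


@dataclass
class Photo:
    owner: str
    object_name: str  # name of the media object in Cloud Storage
    description: str
    uploaded_at: datetime
    comments: List[Comment] = field(default_factory=list)


def photo_feed(photos: Iterable[Photo], newest_first: bool = False) -> List[Photo]:
    """Return all photos ordered by upload time."""
    return sorted(photos, key=lambda p: p.uploaded_at, reverse=newest_first)


def add_comment(photo: Photo, author: str, text: str, posted_at: datetime) -> Comment:
    """Attach a comment; an empty author models a user who is not logged in."""
    if not author:
        raise PermissionError("login required to comment")
    comment = Comment(author=author, text=text, posted_at=posted_at)
    photo.comments.append(comment)
    return comment
```

Storing only the object name on `Photo` keeps the image bytes in Cloud Storage while the metadata and comments live with the application data, which mirrors the split described in the architecture.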
The source code is hosted on GitHub.
Google Cloud Platform enables developers to quickly build a digital media asset management and sharing solution that scales to millions of users and petabytes of data. The solution presented in this paper combines the power of App Engine, Compute Engine, and Cloud Storage to solve the technical challenges presented by digital media systems.
Object Change Notification is currently a feature under the Trusted Tester program.
[2] Gartner: Consumers Will Drive Huge Growth for Cloud Storage, by Colleen Miller, July 2012.