Run LabelCat on Google Cloud Platform

Organizing the issues in your GitHub repositories can be a different kind of animal; that's why you need LabelCat.

LabelCat is a Node.js app that uses the Google Prediction API to automatically label GitHub issues as they are created. Who taught LabelCat this trick, you ask? You did, or rather, you will. After LabelCat is deployed, you can train it on previously labeled issues from a variety of repositories. These instructions show how to run LabelCat in the App Engine flexible environment for ease of use and scalability, but LabelCat can be deployed anywhere.

Many Google Cloud Platform Node.js samples are small and simple, focusing on a single concept or piece of functionality. In contrast, LabelCat is a larger Node.js app that addresses a number of production concerns. As such, LabelCat is an excellent resource as you build your own production applications on Google Cloud Platform.

This tutorial assumes that you are familiar with Node.js programming and that you have installed Node.js 4.0 or newer.

Before you begin

  1. Create a project in the Google Cloud Platform Console.
    If you haven't already created a project, create one now. Projects enable you to manage all Google Cloud Platform resources for your app, including deployment, access control, billing, and services.
    1. Open the Cloud Platform Console.
    2. In the drop-down menu at the top, select Create a project.
    3. Click Show advanced options. Under App Engine location, select a United States location.
    4. Give your project a name.
    5. Make a note of the project ID, which might be different from the project name. The project ID is used in commands and in configurations.
  2. Enable billing for your project, and sign up for a free trial.

    If you haven't already enabled billing for your project, enable billing now, and sign up for a free trial. Enabling billing allows the application to consume billable resources such as running instances and storing data. During your free trial period, you won't be billed for any services.

  3. Install the Google Cloud SDK.

    If you haven't already installed the Google Cloud SDK, install and initialize the Google Cloud SDK now. The SDK contains tools and libraries that enable you to create and manage resources on Google Cloud Platform.

  4. Enable APIs for your project.

    Enable the APIs that this tutorial uses from the Cloud Platform Console: the Google Cloud Datastore API, Google Cloud Pub/Sub API, and Google Prediction API.

  5. Create a service account for your project.
    1. Go to the Create service account key page in the Google Cloud Platform Console.
    2. Click Create service account.
    3. For name, type a name for the service account, such as labelcat.
    4. Check the box to Furnish a new private key.
    5. Leave the key type set to JSON.
    6. Click Create.
    7. A key file is downloaded to your computer. Keep this file in a secure place.

Register a developer application on GitHub

GitHub supports two kinds of developer applications: individual and organization.

If you intend to use LabelCat on repositories owned by an individual GitHub user, register an individual developer application here:

https://developer.github.com/v3/oauth/

If you intend to use LabelCat on repositories owned by an organization that you administer, register an organization developer application here. Replace YOUR_ORG_NAME with your organization name:

https://github.com/organizations/YOUR_ORG_NAME/settings/applications/new

To register your developer application, follow these steps:

  1. Click Register new application.
  2. Enter an application name, for example, "My LabelCat".
  3. For the homepage URL, enter https://<your-project-id>.appspot.com/.
  4. Optionally enter an application description.
  5. For the authorization callback URL, enter https://<your-project-id>.appspot.com/auth/github/callback.
  6. Click Register application.
  7. Copy the generated client ID and client secret, and save them for later.
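
To see how the client ID and the authorization callback URL from the steps above fit together, here is a minimal sketch that builds the URL to which an OAuth web flow redirects a user. The endpoint and the client_id, redirect_uri, and state parameter names are GitHub's. The helper name buildAuthorizeUrl and all sample values are hypothetical, and LabelCat itself may build this URL through a library instead.

```javascript
const querystring = require('querystring');

// Hypothetical helper: builds the URL that starts GitHub's OAuth web flow.
function buildAuthorizeUrl(clientId, projectId, state) {
  const params = querystring.stringify({
    client_id: clientId,
    // Must match the authorization callback URL entered during registration.
    redirect_uri: `https://${projectId}.appspot.com/auth/github/callback`,
    // An unguessable value that the callback handler checks to guard against CSRF.
    state: state
  });
  return `https://github.com/login/oauth/authorize?${params}`;
}

// Sample values only; substitute your own client ID and project ID.
console.log(buildAuthorizeUrl('my-client-id', 'my-project-id', 'random-state'));
```

If the redirect_uri sent here differs from the registered callback URL, GitHub rejects the request, which is why the two values need to stay in sync.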

Download and run the app

After you've completed the prerequisites, you can download and deploy the LabelCat sample app. The following sections guide you through getting the LabelCat app up and running.

Clone the LabelCat app

The code for the LabelCat sample app is in the GoogleCloudPlatform/LabelCat repository on GitHub. If you haven't already, copy the repository to your local machine:

git clone https://github.com/GoogleCloudPlatform/LabelCat.git

Go to the directory that contains the sample code:

cd LabelCat

Alternatively, you can download the sample as a zip file and extract it.

Configure settings

  1. Copy the default configuration file:

    cp config.default.js config.js
    
  2. Open config.js for editing.

  3. Under gcloud, set the value of projectId to your project ID, which is visible in the Cloud Platform Console. Set the value of keyFile to the path of the key file you downloaded when you created your service account.

  4. Under github, set the values of clientId and clientSecret to the client ID and secret that were generated when you registered your GitHub developer application. Edit the value of redirectUrl to include the Authorization callback URL that you specified when you registered your GitHub developer application.

  5. Save and close config.js.
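
As a rough illustration of the result, the edited settings might look like the sketch below. Only the gcloud and github sections and the key names mentioned in the steps above come from this tutorial. Every value is a placeholder, and the real config.default.js probably contains other settings that you should leave in place.

```javascript
// Sketch of the relevant parts of config.js after editing.
// All values are placeholders; replace them with your own.
const config = {
  gcloud: {
    // The project ID shown in the Cloud Platform Console.
    projectId: 'my-project-id',
    // Path to the JSON key file downloaded for the service account.
    keyFile: '/path/to/service-account-key.json'
  },
  github: {
    // Values generated when you registered the GitHub developer application.
    clientId: 'my-github-client-id',
    clientSecret: 'my-github-client-secret',
    // Must include the registered authorization callback URL.
    redirectUrl: 'https://my-project-id.appspot.com/auth/github/callback'
  }
};

module.exports = config;
```

Because config.js holds the client secret and the key file path, it is usually best kept out of version control.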

Run the app on your local computer

  1. Install dependencies:

    npm install
    
  2. Run the start script:

    npm start
    
  3. In your web browser, enter this address:

    http://localhost:8080
    

You can see the sample app displayed in the page. This page is delivered by the LabelCat web server running on your computer.

When you're ready to move forward, press Ctrl+C to stop the local web server.

Deploy the app to Google Cloud Platform

Enter this command to deploy the sample:

gcloud app deploy app.yaml worker.yaml

Wait for the message that notifies you that the app update has completed.

See the app run in the cloud

In your web browser, enter this address:

https://<your-project-id>.appspot.com

This time, the page is delivered by a web server running in the App Engine flexible environment.

If you update your app, you can deploy the updated version by entering the same command you used to deploy the app the first time. The new deployment creates a new version of your app and promotes it to the default version. The older versions of your app remain, as do their associated VM instances. Be aware that all of these app versions and VM instances are billable resources. For information about deleting or stopping your VM instances, see Cleaning up.

For convenience, you can use an npm script to run the gcloud command. Add these lines to your package.json file:


"scripts": {
  "start": "node --harmony ${SCRIPT:-app.js}",
  "monitor": "nodemon --harmony ${SCRIPT:-app.js}",
  "unit": "node --harmony node_modules/mocha/bin/mocha test/unit/app/index.test.js",
  "integration": "node --harmony node_modules/mocha/bin/mocha test/integration/app/index.test.js",
  "test": "npm run unit && npm run integration",
  "deploy": "gcloud app deploy app.yaml worker.yaml"
}

Now you can run this command to deploy your application:

npm run deploy

Understanding the code

LabelCat consists of an Angular.js single-page app on the front end and a Node.js RESTful web API on the back end. There is also a worker script, which uses the Prediction API to train models that the back end puts in a queue.
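
The hand-off between the back end and the worker can be pictured with the sketch below. It uses an in-memory array purely as a stand-in for the real queue, which presumably relies on the Cloud Pub/Sub API enabled earlier. The function names enqueueTraining and drainQueue, the job shape, and the trainModel callback are all hypothetical and are not taken from LabelCat's source.

```javascript
// In-memory stand-in for the queue between the web API and the worker.
const queue = [];

// Back end side: record a request to train a model for a repository.
function enqueueTraining(repoKey) {
  queue.push({ repoKey: repoKey, queuedAt: Date.now() });
}

// Worker side: take jobs off the queue in order and hand each one to a
// trainer. trainModel is a hypothetical callback standing in for the call
// that would use the Prediction API.
function drainQueue(trainModel) {
  const trained = [];
  while (queue.length > 0) {
    const job = queue.shift();
    trainModel(job.repoKey);
    trained.push(job.repoKey);
  }
  return trained;
}

enqueueTraining('my-org/my-repo');
drainQueue(function (repoKey) {
  console.log(`Training model for ${repoKey}`);
});
```

Separating the two sides this way lets the web API respond quickly while the slower training work happens in a different process.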

The app.js file contains the JavaScript code to create a Node.js server that responds to requests. The following code starts a web server on the port and host given in the configuration. This tutorial uses port 8080, which the App Engine flexible environment uses by default.

var server = http.createServer(app).listen(config.port, config.host, function () {
  console.log(`App listening at http://${server.address().address}:${server.address().port}`);
  console.log('Press Ctrl+C to quit.');
});
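
The listen call above reads its port and host from config. As a rough sketch of how such values could be resolved, the hypothetical helper below prefers a PORT environment variable and falls back to 8080. The helper name, the HOST variable, and the 0.0.0.0 default host are assumptions for illustration and are not LabelCat's actual configuration code.

```javascript
// Hypothetical helper: choose listen settings from environment variables.
function resolveListenOptions(env) {
  const port = parseInt(env.PORT, 10);
  return {
    // Fall back to 8080 when PORT is missing or not a number.
    port: Number.isNaN(port) ? 8080 : port,
    // Assumed default: listen on all network interfaces.
    host: env.HOST || '0.0.0.0'
  };
}

const listenOptions = resolveListenOptions(process.env);
console.log(`Would listen on ${listenOptions.host}:${listenOptions.port}`);
```

Reading the port from the environment keeps the same code usable on a local computer and in a hosted environment that assigns the port for you.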

The server uses the Express.js framework and defines several RESTful endpoints that the front end can use to make AJAX requests. The following code shows some of the RESTful endpoints:

// Search GitHub for a repo with the specified owner and name    
app.get('/api/repos/search/:owner/:repo', ensureAuthenticated, utils.makeSafe(repos.search));

app.route('/api/repos/:key')
  // Return a repo
  .get(ensureAuthenticated, utils.makeSafe(repos.findOne))
  // Update a repo
  .put(ensureAuthenticated, utils.makeSafe(repos.updateOne))
  // Delete a repo
  .delete(ensureAuthenticated, utils.makeSafe(repos.destroyOne));

app.route('/api/repos')
  // Return all repos
  .get(ensureAuthenticated, utils.makeSafe(repos.findAll));
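
Each route above passes through two helpers before it reaches its handler. The sketch below shows one plausible shape for them, assuming that handlers return promises and that an authenticated request carries a user property. Both functions are hypothetical stand-ins, and LabelCat's real ensureAuthenticated and utils.makeSafe may work differently.

```javascript
// Hypothetical stand-in for utils.makeSafe: wraps a handler so that a thrown
// error or a rejected promise is forwarded to Express's error-handling
// middleware through next(err) instead of being left unhandled.
function makeSafe(handler) {
  return function (req, res, next) {
    return Promise.resolve()
      .then(function () {
        return handler(req, res, next);
      })
      .catch(next);
  };
}

// Hypothetical stand-in for ensureAuthenticated: continues when a user is
// attached to the request, otherwise responds with 401 Unauthorized.
function ensureAuthenticated(req, res, next) {
  if (req.user) {
    return next();
  }
  res.status(401).end();
}
```

Placing the authentication check first means that the wrapped handler never runs for anonymous requests.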

Middleware helps the server handle requests by taking care of request body parsing, secure cookie-based session handling, CSRF protection, authentication, static file serving, logging, error handling, and more. The following code shows how the server responds to unknown requests, logs errors, and handles exceptions that might have occurred higher up in the middleware chain:

// Catch any other unknown requests by rendering the home page, where
// Angular will take care of the rest.
app.get('*', function (req, res, next) {
  return res.sendFile('index.html', {
    root: './public/'
  });
});

// Add the error logger after all middleware and routes so that
// it can log errors from the whole application. Any custom error
// handlers should go after this.
app.use(logger.errorLogger);

// Catch all handler, assumes error
app.use(errorHandler);
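
For illustration, a catch-all error handler could look like the sketch below. Express identifies error-handling middleware by its four-argument signature, so the next parameter must be declared even when it is rarely used. The response format and status logic here are assumptions, not LabelCat's actual errorHandler.

```javascript
// Hypothetical error handler with the (err, req, res, next) signature that
// Express requires for error-handling middleware.
function errorHandler(err, req, res, next) {
  // If the response has already started, hand the error to the next handler.
  if (res.headersSent) {
    return next(err);
  }
  const status = err.status || 500;
  res.status(status).json({
    // Avoid leaking internal details for unexpected errors.
    error: status === 500 ? 'Internal server error' : err.message
  });
}
```

Registering a handler like this last means that it sees errors from every route and middleware defined before it.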

Configuration

Every app that runs in the App Engine flexible environment requires an app.yaml file to describe its deployment configuration.

The app.yaml file configures LabelCat's default module:

runtime: nodejs
vm: true
api_version: 1

# Force https for all requests
handlers:
  - url: .*
    script: None
    secure: always

resources:
  cpu: .5
  memory_gb: 1.3
  disk_size_gb: 10

This minimal app.yaml file sets the runtime to nodejs with the flexible environment. There are many other configuration values you can specify in app.yaml that allow you to customize resources, scaling, and other settings. For details about the configuration settings for the flexible environment, see App Engine flexible environment.

The worker.yaml file configures LabelCat's worker module. In this case, the file specifies that the worker module can autoscale up to three instances as needed.

runtime: nodejs
vm: true
api_version: 1
module: worker

# Force https for all requests
handlers:
  - url: .*
    script: None
    secure: always

resources:
  cpu: .5
  memory_gb: 1.3
  disk_size_gb: 10

automatic_scaling:
  min_num_instances: 1
  max_num_instances: 3
  cool_down_period_sec: 60
  cpu_utilization:
    target_utilization: 0.75

LabelCat on GitHub

Fork, rate, and contribute to LabelCat on GitHub.

Cleaning up

If you're done with the tutorials and want to clean up resources that you've allocated, see Cleaning Up.