Excluding URLs in Scans

You can specify one or more excluded URL patterns to avoid testing sections of a site during a scan. The scanner does not request resources that match any of the exclusions. The following sections describe the pattern matching used by the Cloud Security Scanner.

Pattern matching for excluded URLs

Excluded URL matching is based on a set of URLs defined by match patterns. A match pattern is essentially a URL with three parts:

  • scheme: for example, http or *
  • host: for example, www.google.com, *.google.com, or *
  • path: for example, /*, /foo*, or /foo/bar

Here's the basic syntax:

<exclude-pattern> := <scheme>://<host><path>
<scheme> := '*' | 'http' | 'https'
<host> := '*' | '*.' <any char except '/' and '*'>+ | <any char except '/' and '*'>+
<path> := '/' <any chars>

The meaning of * depends on whether it's in the scheme, host, or path part. If the scheme is *, then it matches either HTTP or HTTPS. If the host is just *, then it matches any host. If the host is *.hostname, then it matches the specified host or any of its subdomains. In the path section, each * matches 0 or more characters.
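
As an illustration of these rules, the following Python sketch checks a URL against a single exclude pattern. It is not the scanner's implementation: the function name matches_exclusion is hypothetical, and treating the query string as part of the path is an assumption that the description above does not confirm.

```python
import re
from urllib.parse import urlsplit


def matches_exclusion(pattern: str, url: str) -> bool:
    """Return True if url matches the exclude pattern (illustrative sketch only)."""
    scheme_pat, sep, rest = pattern.partition("://")
    slash = rest.find("/")
    if not sep or slash == -1:
        raise ValueError(f"not a valid exclude pattern: {pattern!r}")
    host_pat, path_pat = rest[:slash], rest[slash:]

    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    if parts.query:
        # Assumption: the query string is matched as part of the path.
        path += "?" + parts.query

    # Scheme: '*' matches either http or https.
    if scheme_pat == "*":
        if scheme not in ("http", "https"):
            return False
    elif scheme != scheme_pat:
        return False

    # Host: '*' matches any host; '*.name' matches name and its subdomains.
    if host_pat == "*":
        pass
    elif host_pat.startswith("*."):
        base = host_pat[2:].lower()
        if host != base and not host.endswith("." + base):
            return False
    elif host != host_pat.lower():
        return False

    # Path: each '*' matches zero or more characters; all other text is literal.
    regex = ".*".join(re.escape(piece) for piece in path_pat.split("*"))
    return re.fullmatch(regex, path, flags=re.DOTALL) is not None
```

For example, matches_exclusion("https://*.google.com/foo*bar", "https://docs.google.com/foo/baz/bar") returns True, while the same pattern does not match the http:// version of that URL because the scheme differs.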

Valid pattern matches

The following table shows some valid patterns:

Pattern | Behavior | Sample matching URLs
http://*/* | Matches any URL that uses the HTTP scheme. | http://www.google.com/
http://*/foo* | Matches any URL that uses the HTTP scheme, on any host, as long as the path starts with /foo. | http://example.com/foo/bar.html
https://*.google.com/foo*bar | Matches any URL that uses the HTTPS scheme and is on a google.com host (such as www.google.com, docs.google.com, or google.com), as long as the path starts with /foo and ends with bar. | https://www.google.com/foo/baz/bar
http://example.org/foo/bar.html | Matches the specified URL. | http://example.org/foo/bar.html
...* | Matches any URL that uses the HTTP scheme and is on the host ... | ...
*://mail.google.com/* | Matches any URL that starts with http://mail.google.com or https://mail.google.com. | http://mail.google.com/foo/baz/bar

Invalid pattern matches

The following table shows some invalid patterns:

Pattern | Why the pattern is invalid
http://www.google.com | No path.
http://*foo/bar | * in the host can be followed only by a . or /.
http://foo.*.bar/baz | If * is in the host, it must be the first character.
http:/bar | Missing scheme separator ("/" should be "//").
foo://* | Invalid scheme.
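
The validity rules can be approximated with a single regular expression. The Python sketch below is one possible transcription of the syntax shown earlier, not the scanner's own validator. The name is_valid_exclusion is hypothetical, and accepting a plain hostname without a wildcard is an assumption based on valid examples such as http://example.org/foo/bar.html.

```python
import re

# Rough transcription of the exclude-pattern syntax:
#   scheme: '*', 'http', or 'https'
#   host:   '*', or an optional leading '*.' followed by characters other than '/' and '*'
#   path:   '/' followed by any characters
_EXCLUDE_PATTERN_RE = re.compile(
    r"(?P<scheme>\*|http|https)://"
    r"(?P<host>\*|(?:\*\.)?[^/*]+)"
    r"(?P<path>/.*)"
)


def is_valid_exclusion(pattern: str) -> bool:
    """Return True if pattern is well formed under the assumptions above."""
    return _EXCLUDE_PATTERN_RE.fullmatch(pattern) is not None
```

Under these assumptions, every pattern in the invalid table is rejected (for example, is_valid_exclusion("http://www.google.com") returns False because the path is missing), while a pattern such as *://mail.google.com/* is accepted.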
