Structure of a URL

A valid value for the URL data type can be composed of the following parts.

Example URL:

NOTE: IP addresses that include the protocol identifier do not contain domain identifiers and must be processed using a different set of methods. It might be easier to remove the protocol identifiers and change the data type to IP Address.

The hierarchy of domain names extends from right to left.
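
To illustrate the right-to-left ordering, here is a minimal Python sketch that splits a hostname into its labels and lists them by position. It is not how the Wrangle functions are implemented; the URL http://www.example.com/index.html and the helper name domain_levels are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def domain_levels(url):
    """Return the hostname labels of a URL, ordered from right to left.

    Hypothetical helper for illustration only; it splits purely by position
    and knows nothing about multi-part suffixes.
    """
    host = urlsplit(url).hostname or ""
    # Reverse the dot-separated labels so index 0 is the top-level domain.
    return list(reversed(host.split("."))) if host else []


levels = domain_levels("http://www.example.com/index.html")
print(levels)  # ['com', 'example', 'www']
```

Read in order, the result gives the top-level, second-level, and third-level domain. Because the split is purely positional, a hostname such as www.example.co.uk would yield uk, co, example, www, so the second position holds co rather than example.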

Top-level domain
  • Examples: com, net, org
  • Wrangle function: SUFFIX Function
  • Notes: Every valid URL must have at least one top-level domain.

    NOTE: When the DOMAIN function parses a multi-tiered top-level domain, the output is the first part of the domain value (e.g. co).

Second-level domain
  • Wrangle function: DOMAIN Function
  • Notes: This value can be extracted from a valid URL using the DOMAIN function. See DOMAIN Function.

Third-level domain
  • Examples: www
  • Wrangle function: SUBDOMAIN Function
  • Notes: This value can be extracted from a valid URL using the SUBDOMAIN function. See SUBDOMAIN Function.

Protocol identifier
  • Notes: You can use pattern matching to locate these protocol identifiers. In your Wrangle transforms, use the following Cloud Dataprep pattern:

    For an example, see IPTOINT Function.
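
The idea of locating a protocol identifier by pattern matching can also be sketched outside of Cloud Dataprep. The following Python example uses a regular expression, not a Cloud Dataprep pattern, to remove a leading protocol identifier so that the remaining value can be checked as an IP address. The regular expression, the helper names strip_protocol and is_ip_address, and the sample value are assumptions for illustration.

```python
import ipaddress
import re

# Assumed regex: a scheme (a letter, then letters, digits, "+", "." or "-")
# followed by "://" at the start of the value.
PROTOCOL_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def strip_protocol(value):
    """Remove a leading protocol identifier such as 'http://' if present."""
    return PROTOCOL_RE.sub("", value, count=1)


def is_ip_address(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


raw = "http://10.0.0.1"
cleaned = strip_protocol(raw)
print(cleaned, is_ip_address(cleaned))  # 10.0.0.1 True
```

Values without a protocol identifier pass through strip_protocol unchanged, so the helper can be applied to a mixed column before changing its data type.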

