public final class CharEscapers
Utility functions for encoding and decoding URIs.
Static Methods
decodeUri(String uri)
public static String decodeUri(String uri)
Decodes application/x-www-form-urlencoded strings. The UTF-8 character set determines what characters are represented by any consecutive sequences of the form "%XX".
This replaces each occurrence of '+' with a space, ' '. This method should not be used for non-application/x-www-form-urlencoded strings such as host and path.
Parameter | |
---|---|
Name | Description |
uri | String a percent-encoded US-ASCII string |
Returns | |
---|---|
Type | Description |
String | a string without any percent escapes or plus signs |
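The decoding rules above ('+' becomes a space, "%XX" sequences are read as UTF-8 bytes) are the same ones the JDK's java.net.URLDecoder applies. The sketch below uses only the JDK to illustrate the expected results. The class and method names (FormDecodeSketch, decodeForm) are hypothetical, and this is not presented as the library's implementation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class FormDecodeSketch {
    /** Decodes form-encoded text: '+' becomes ' ', and "%XX" runs are read as UTF-8 bytes. */
    static String decodeForm(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should be unreachable.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeForm("a+b%26c"));   // a b&c
        System.out.println(decodeForm("caf%C3%A9")); // "caf" followed by U+00E9
    }
}
```

Because '+' is turned into a space, passing a path or host through this kind of decoder would corrupt any literal plus sign.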
decodeUriPath(String path)
public static String decodeUriPath(String path)
Decodes the path component of a URI. Unlike java.net.URLDecoder#decode(String, String), this does not convert + into spaces. This method transforms URI-encoded values into their decoded symbols.
e.g. decodeUriPath("%3Co%3E") returns "<o>"
Parameter | |
---|---|
Name | Description |
path | String the value to be decoded |
Returns | |
---|---|
Type | Description |
String | decoded version of path |
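One way to get path decoding that leaves '+' untouched, using only the JDK, is to percent-encode every literal '+' before handing the string to java.net.URLDecoder. The sketch below assumes that approach. The names PathDecodeSketch and decodePath are hypothetical, and the code is an illustration of the documented behavior rather than a statement of how this class is implemented.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class PathDecodeSketch {
    /** Decodes "%XX" escapes as UTF-8 but keeps every literal '+' as a plus sign. */
    static String decodePath(String path) {
        try {
            // Protect '+' by rewriting it as "%2B", which URLDecoder turns back into '+'.
            return URLDecoder.decode(path.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should be unreachable.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodePath("%3Co%3E"));  // <o>
        System.out.println(decodePath("a+b%20c"));  // a+b c
    }
}
```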
escapeUri(String value)
public static String escapeUri(String value)
Escapes the string value so it can be safely included in application/x-www-form-urlencoded data. This is not appropriate for generic URI escaping. In particular, it encodes the space character as a plus sign instead of percent-escaping it, in contravention of the URI specification. For details on application/x-www-form-urlencoded encoding, see the HTML 4 specification, section 17.13.4.1.
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The special characters ".", "-", "*", and "_" remain the same.
- The space character " " is converted into a plus sign "+".
- All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.
Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences.
From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all
percent-encodings."
This escaper has identical behavior to (but is potentially much faster than):
- java.net.URLEncoder#encode(String, String) with the encoding name "UTF-8"
Parameter | |
---|---|
Name | Description |
value | String |
Returns | |
---|---|
Type | Description |
String |
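Since the description above states that this escaper behaves identically to java.net.URLEncoder#encode(String, String) with "UTF-8", the rules can be checked with the JDK alone. The following is a small sketch along those lines. FormEncodeSketch and encodeForm are hypothetical names used only for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FormEncodeSketch {
    /** Form-encodes a value: space becomes '+', other unsafe characters become "%XY". */
    static String encodeForm(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should be unreachable.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeForm("a b&c=d"));    // a+b%26c%3Dd
        System.out.println(encodeForm("x*y~z"));      // x*y%7Ez
        System.out.println(encodeForm("caf\u00e9"));  // caf%C3%A9
    }
}
```

Note that "~" is escaped here even though RFC 3986 treats it as unreserved, which is one reason form encoding is not a general-purpose URI escaper.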
escapeUriConformant(String value)
public static String escapeUriConformant(String value)
Escapes the string value so it can be safely included in any part of a URI. For details on escaping URIs, see RFC 3986 - section 2.4.
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The special characters ".", "-", "*", and "_" remain the same.
- The space character " " is converted into "%20".
- All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.
Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences.
From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all
percent-encodings."
Parameter | |
---|---|
Name | Description |
value | String |
Returns | |
---|---|
Type | Description |
String |
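The four rules above can be walked through in a few lines of plain Java. This is a minimal sketch of the rules as listed, not the library's escaper: ConformantEscapeSketch and escape are hypothetical names, and the handling of unpaired surrogates (they become "%3F" via the JDK's replacement byte) is an artifact of this sketch only.

```java
import java.nio.charset.StandardCharsets;

final class ConformantEscapeSketch {
    // Punctuation that stays unescaped, taken from the rule list above.
    private static final String SAFE_PUNCTUATION = ".-*_";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length());
        // Walk by code point so a surrogate pair is encoded as one UTF-8 sequence.
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            i += Character.charCount(cp);
            if (isAsciiAlphanumeric(cp) || SAFE_PUNCTUATION.indexOf(cp) >= 0) {
                out.append((char) cp);
                continue;
            }
            // Everything else, including the space, becomes uppercase "%XY" per UTF-8 byte.
            byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            for (byte b : utf8) {
                out.append('%').append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
            }
        }
        return out.toString();
    }

    private static boolean isAsciiAlphanumeric(int cp) {
        return (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z') || (cp >= '0' && cp <= '9');
    }

    public static void main(String[] args) {
        System.out.println(escape("a b+c"));      // a%20b%2Bc
        System.out.println(escape("caf\u00e9"));  // caf%C3%A9
    }
}
```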
escapeUriPath(String value)
public static String escapeUriPath(String value)
Escapes the string value so it can be safely included in URI path segments. For details on escaping URIs, see RFC 3986 - section 2.4.
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The unreserved characters ".", "-", "~", and "_" remain the same.
- The general delimiters "@" and ":" remain the same.
- The subdelimiters "!", "$", "&", "'", "(", ")", "*", ",", ";", and "=" remain the same.
- The space character " " is converted into %20.
- All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.
Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences.
From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all
percent-encodings."
Parameter | |
---|---|
Name | Description |
value | String |
Returns | |
---|---|
Type | Description |
String |
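Because "/" is not in the list of characters that remain the same, a path escaper of this kind is normally applied to each segment separately and the results are then joined with "/". The sketch below illustrates that pattern with a stand-in escaper whose safe set is transcribed from the rules above. PathSegmentSketch, escapeSegment, and joinPath are hypothetical names, and the stand-in is not the library's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class PathSegmentSketch {
    // Punctuation left unescaped, transcribed from the rules listed above (an assumption of this sketch).
    private static final String SAFE = ".-~_@:!$&'()*,;=";

    static String escapeSegment(String segment) {
        StringBuilder out = new StringBuilder();
        // All safe characters are ASCII, so it is enough to inspect the UTF-8 bytes one by one.
        for (byte b : segment.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean alnum = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z') || (v >= '0' && v <= '9');
            if (alnum || (v < 0x80 && SAFE.indexOf(v) >= 0)) {
                out.append((char) v);
            } else {
                out.append(String.format("%%%02X", v));
            }
        }
        return out.toString();
    }

    /** Escapes each segment on its own, then joins them with '/' and a leading '/'. */
    static String joinPath(List<String> segments) {
        return segments.stream()
            .map(PathSegmentSketch::escapeSegment)
            .collect(Collectors.joining("/", "/", ""));
    }

    public static void main(String[] args) {
        // The '/' inside the second segment is data, so it is escaped as %2F.
        System.out.println(joinPath(Arrays.asList("reports", "Q1 2024/final", "a&b")));
        // /reports/Q1%202024%2Ffinal/a&b
    }
}
```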
escapeUriPathWithoutReserved(String value)
public static String escapeUriPathWithoutReserved(String value)
Escapes a URI path but retains all reserved characters, including all general delimiters. This is the same as #escapeUriPath(String) except that it does not escape '?', '+', and '/'.
Parameter | |
---|---|
Name | Description |
value | String |
Returns | |
---|---|
Type | Description |
String |
escapeUriQuery(String value)
public static String escapeUriQuery(String value)
Escapes the string value so it can be safely included in URI query string segments. When the query string consists of a sequence of name=value pairs separated by &, the names and values should be individually encoded. If you escape an entire query string in one pass with this escaper, then the "=" and "&" characters used as separators will also be escaped.
This escaper is also suitable for escaping fragment identifiers.
For details on escaping URIs, see RFC 3986 - section 2.4.
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The unreserved characters ".", "-", "~", and "_" remain the same.
- The general delimiters "@" and ":" remain the same.
- The path delimiters "/" and "?" remain the same.
- The subdelimiters "!", "$", "'", "(", ")", "*", ",", and ";" remain the same.
- The space character " " is converted into %20.
- The equals sign "=" is converted into %3D.
- The ampersand "&" is converted into %26.
- All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.
Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences.
From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all
percent-encodings."
Parameter | |
---|---|
Name | Description |
value | String |
Returns | |
---|---|
Type | Description |
String |
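The advice to encode names and values individually can be sketched as follows. The stand-in escaper uses a safe set transcribed from the rules above, so "=" and "&" inside a name or value are escaped while the separators added afterwards are not. QueryStringSketch, escapeComponent, and buildQuery are hypothetical names, and this is an illustration under those assumptions rather than the library's code.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class QueryStringSketch {
    // Punctuation left unescaped, transcribed from the rules listed above; '=' and '&' are absent.
    private static final String SAFE = ".-~_@:/?!$'()*,;";

    static String escapeComponent(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean alnum = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z') || (v >= '0' && v <= '9');
            if (alnum || (v < 0x80 && SAFE.indexOf(v) >= 0)) {
                out.append((char) v);
            } else {
                out.append(String.format("%%%02X", v));
            }
        }
        return out.toString();
    }

    /** Escapes every name and value on its own, then adds the '=' and '&' separators. */
    static String buildQuery(Map<String, String> params) {
        return params.entrySet().stream()
            .map(e -> escapeComponent(e.getKey()) + "=" + escapeComponent(e.getValue()))
            .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", "a&b=c");
        params.put("lang", "en US");
        System.out.println(buildQuery(params)); // q=a%26b%3Dc&lang=en%20US
    }
}
```

Escaping the already assembled string "q=a&b=c&lang=en US" in one pass would instead turn every separator into %3D or %26, which is the pitfall described above.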
escapeUriUserInfo(String value)
public static String escapeUriUserInfo(String value)
Escapes the string value so it can be safely included in the user info part of a URI. For details on escaping URIs, see RFC 3986 - section 2.4.
When encoding a String, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The unreserved characters ".", "-", "~", and "_" remain the same.
- The general delimiter ":" remains the same.
- The subdelimiters "!", "$", "&", "'", "(", ")", "*", ",", ";", and "=" remain the same.
- The space character " " is converted into %20.
- All other characters are converted into one or more bytes using UTF-8 encoding and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.
Note: Unlike other escapers, URI escapers produce uppercase hexadecimal sequences.
From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all
percent-encodings."
Parameter | |
---|---|
Name | Description |
value | String |
Returns | |
---|---|
Type | Description |
String |