What characters are allowed in URIs?

We present a summary of which characters are allowed in URIs and what special purposes (if any) they have.

1 Allowed characters

The specification [URI] (p. 13) allows only three kinds of characters to appear in URIs: those in the class unreserved, those in the class reserved, and the character “%” (which is not part of either class).

1.1 Percent sign

  • “%” 37, U+0025. Used exclusively in percent-encoded characters.
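As a minimal sketch of what percent-encoding looks like in practice, the following uses Python's standard urllib.parse module. The input strings are illustrative assumptions, not taken from the specification. Passing safe="" asks quote to encode every character it does not treat as unreserved, including “/”.

```python
from urllib.parse import quote, unquote

# Illustrative input: contains a space and a literal percent sign.
raw = "a b%c"

# Each disallowed or special character becomes "%" followed by two hex digits.
encoded = quote(raw, safe="")
print(encoded)           # a%20b%25c

# Decoding reverses the transformation.
decoded = unquote(encoded)
print(decoded)           # a b%c

# Characters outside ASCII are encoded byte by byte (UTF-8 by default).
non_ascii = quote("é", safe="")
print(non_ascii)         # %C3%A9
```

Note that a literal “%” must itself be written as “%25”, since a bare “%” is only meaningful as the start of a percent-encoded sequence.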

1.2 Unreserved

  • “-” 45, U+002D
  • “.” 46, U+002E. Used as a separator in IPv4 addresses. Used in the dot-segments “.” and “..” of relative URI references, which are not themselves URIs.
  • “0” 48, U+0030 to “9” 57, U+0039
  • “A” 65, U+0041 to “Z” 90, U+005A
  • “_” 95, U+005F
  • “a” 97, U+0061 to “z” 122, U+007A
  • “~” 126, U+007E

1.3 Reserved

  • “!” 33, U+0021
  • “#” 35, U+0023. Used to introduce the fragment.
  • “$” 36, U+0024
  • “&” 38, U+0026. De facto used for “key1=val1&key2=val2” notation in queries. Such usage is allowed by the specification but is neither mandated nor given any particular status.
  • “'” 39, U+0027 APOSTROPHE
  • “(” 40, U+0028
  • “)” 41, U+0029
  • “*” 42, U+002A
  • “+” 43, U+002B
  • “,” 44, U+002C
  • “/” 47, U+002F. Used to delimit the authority (which in practice is the host, but may include user credentials). Used as a delimiter inside the path.
  • “:” 58, U+003A. Used to delimit the URI scheme (e.g. “http:”) and the port of the host. Used as a separator in IPv6 addresses.
  • “;” 59, U+003B
  • “=” 61, U+003D. De facto used in “key=val” notation in queries. Such usage is allowed by the specification but is neither mandated nor given any particular status.
  • “?” 63, U+003F. Used to introduce the query.
  • “@” 64, U+0040. Used to separate the user information from the host in the authority.
  • “[” 91, U+005B. Used to delimit IPv6 addresses.
  • “]” 93, U+005D. Used to delimit IPv6 addresses.
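The delimiter roles listed above can be observed by splitting a URI into its components. The following is a small sketch using urlsplit and parse_qsl from Python's standard library. The URI itself is a made-up example chosen to exercise each delimiter, and the key/value interpretation of the query is the de facto convention rather than something the URI specification requires.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URI, constructed only to show each delimiter at work.
uri = "http://user@[2001:db8::1]:8080/a/b?key1=val1&key2=val2#frag"

parts = urlsplit(uri)
print(parts.scheme)    # "http", terminated by ":"
print(parts.username)  # "user", separated from the host by "@"
print(parts.hostname)  # "2001:db8::1", IPv6 address delimited by "[" and "]"
print(parts.port)      # 8080, introduced by ":"
print(parts.path)      # "/a/b", segments delimited by "/"
print(parts.query)     # "key1=val1&key2=val2", introduced by "?"
print(parts.fragment)  # "frag", introduced by "#"

# De facto "key=val" pairs joined by "&" (a convention, not a URI rule).
pairs = parse_qsl(parts.query)
print(pairs)
```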

2 Not allowed

The following is the set of non-control characters in the range of codepoints 0 to 127 (commonly known as ASCII) that are not allowed to appear in URIs. Note that these characters may still be represented in percent-encoded form. Codepoints outside the range 0 to 127 are not allowed in URIs, but IRIs may contain them (see [IRI]).

  • “ ” 32, U+0020 SPACE
  • “"” 34, U+0022
  • “<” 60, U+003C
  • “>” 62, U+003E
  • “\” 92, U+005C
  • “^” 94, U+005E
  • “`” 96, U+0060
  • “{” 123, U+007B
  • “|” 124, U+007C
  • “}” 125, U+007D
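The list above can be cross-checked by subtracting the allowed characters from the printable ASCII range. The sketch below assumes that [URI] refers to RFC 3986 and hard-codes that document's unreserved set (letters, digits, “-”, “.”, “_”, “~”) and reserved set. The variable names are illustrative.

```python
import string

# Character classes, assumed to match RFC 3986.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
RESERVED = set(":/?#[]@" + "!$&'()*+,;=")
ALLOWED = UNRESERVED | RESERVED | {"%"}

# Non-control ASCII: codepoints 32 (space) through 126 (tilde).
printable = {chr(cp) for cp in range(32, 127)}

# Whatever remains is not allowed to appear literally in a URI.
not_allowed = sorted(printable - ALLOWED)
for ch in not_allowed:
    print(f"{ch!r} {ord(ch)}, U+{ord(ch):04X}")
```

Under these assumptions the difference contains exactly ten characters, one per entry in the list.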

3 References