Policy on the use of Internationalized Resource Identifiers (draft)

This document describes the policy of 17beta in allocating IRIs (Internationalized Resource Identifier, erroneously referred to as URLs in popular usage) for resources hosted or curated by 17beta.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

In this document the key words “availability” and “variance” are used in the meaning described in [per]. “allocation” is used as described in [webarch].

1 General considerations

The term “allocation” refers to associating a resource with an IRI as described in [webarch].

Allocation of IRIs MUST be documented. The source code of the web site MAY serve as the respective documentation. For identifiers not intended to be persistent in the span of decades, the scheme https SHOULD be used when available instead of http.

Allocated IRIs MUST be in normal form as described in [IRI] §5.3.2. Allocated IRIs MUST have a non-empty path. If the path is other than /, all segments MUST be non-empty. Example: https://17beta.top/en/ and https://17beta.top//abc are not allowed. Allocated IRIs for resources in web format MUST NOT contain filename extensions. It is RECOMMENDED that any human-language phrase that is part of an allocated IRI is written in lowercase and spaces are transcribed as U+005F “_”.
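The path rules above can be checked mechanically. The following is a minimal sketch rather than a normative test: the function names are hypothetical, only absolute IRIs are handled, normal form is not checked, and the heuristic for what counts as a filename extension (a dot plus one to five alphanumerics at the end of the last segment) is an assumption of this example.

```python
import re
from urllib.parse import urlsplit

# Assumption: a "filename extension" is a dot followed by 1-5 ASCII
# alphanumerics at the very end of the last path segment.
EXTENSION = re.compile(r"\.[A-Za-z0-9]{1,5}$")


def path_violations(iri: str) -> list[str]:
    """Return the path rules of this section that an absolute IRI breaks."""
    path = urlsplit(iri).path
    problems = []
    if path == "":
        problems.append("empty path")
    elif path != "/":
        # Drop the single leading "/" so that only real segments remain.
        segments = path.removeprefix("/").split("/")
        if "" in segments:
            problems.append("empty segment")
        if EXTENSION.search(segments[-1]):
            problems.append("filename extension")
    return problems


def transcribe_phrase(phrase: str) -> str:
    """Lowercase a human-language phrase and write spaces as U+005F."""
    return phrase.lower().replace(" ", "_")
```

With this sketch, both disallowed examples above (https://17beta.top/en/ and https://17beta.top//abc) are reported as having an empty segment, while https://17beta.top/en/mtf_pharmacology passes.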

For purposes of hierarchy semantics, the absolute path / is considered the first hierarchical component in all absolute paths under the same scheme and authority. The subsequent hierarchical components (if any) are the segments of the path. Example: The hierarchical components of /en/mtf_pharmacology are /, en and mtf_pharmacology in order of decreasing significance. For each allocated IRI, the IRIs obtained by removing complete hierarchical components assigned by 17beta SHOULD be allocated and SHOULD identify a more general resource in some suitable sense. For example, if the original IRI https://17beta.top/en/some_book/introduction identifies a chapter of a book in English, then https://17beta.top/en/some_book could identify the book as a whole and https://17beta.top/en could identify some resource in English from which the original IRI is (directly or indirectly) linked.
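The list of more general IRIs implied by an allocation can be derived by dropping trailing path segments one at a time. A minimal sketch, assuming an absolute IRI with non-empty segments (the function name is hypothetical, and query and fragment parts are discarded):

```python
from urllib.parse import urlsplit, urlunsplit


def more_general_iris(iri: str) -> list[str]:
    """List the IRIs obtained by removing trailing hierarchical components."""
    parts = urlsplit(iri)
    segments = [segment for segment in parts.path.split("/") if segment]
    ancestors = []
    # Keep one segment fewer on each step; the last step leaves only "/".
    for count in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:count])
        ancestors.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return ancestors
```

For https://17beta.top/en/some_book/introduction this yields https://17beta.top/en/some_book, https://17beta.top/en and https://17beta.top/, in that order.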

It is RECOMMENDED that all IRIs are allocated following the maxim “Do things as simple as possible, but not more simple”. Allocations SHOULD avoid placing spurious information in the IRI and avoid including information specific to the current technological setting. See section “Guideline 1: Choose URIs wisely” in [CHIPS]. Example: https://17beta.top/ is preferred to https://www.17beta.top/ because the www part is spurious information that contributes nothing to the allocation.

Allocated IRIs MAY use a simplified and generalized version of the title of the resource (if it has one) to reduce the probability that the IRI stops matching when the title of the resource changes. Example: The IRI https://17beta.top/en/mtf_pharmacology has remained meaningful even though the title of the resource identified changed from “Pharmacological notes about transsexualism” to “Pharmacology of transsexualism”. Likewise, categorization beyond what is required to identify the resource SHOULD NOT be included in the IRI because it is liable to change.

2 Persistence

At the time of allocation, or at any later time, an identifier MAY be designated as being persistent by its documentation. This signals the commitment of 17beta to keep such an allocation effective for as much time as feasible. Persistent IRIs MUST be allocated using the ARK system (see below). Any identifier published by 17beta as being persistent MUST have availability equal to lifetime or subinfinite; if none is explicitly stated in the documentation then lifetime is implied. Persistent IRIs MAY be published using the term “permalink”. Persistent IRIs MAY be allocated without being publicized; in such a case, there is no implicit statement about availability, only about persistence of the allocation as described in the documentation.

3 Archival Resource Key

The NAAN 21206 is allocated to 17beta (see the NAAN allocation registry). The resolver for NAAN 21206 is currently https://17beta.top/ and MAY change in the future.

ARKs allocated under NAAN 21206 MUST obey the following principles:

  • Non-reassignment<ark:/99152/h1215>. ARKs MUST NOT be re-assigned; that is, once an ARK-to-resource association has been made public, that association MUST be considered unique into the indefinite future.
  • Opaque identifiers<ark:/99152/h1218>. To help them age and travel well, the Name part of allocated ARKs MUST NOT contain widely recognizable semantic information (to the extent possible).

As of 2019, ARK is not a URI scheme. However, ARKs can be expressed as URIs by combining them with an NMAH as described in [ARK]. For the purposes of allocation, an equivalence class is defined for each bare ARK: the bare ARK, all of its equivalent bare ARKs and any corresponding IRI under any of the resolvers listed below are considered equivalent.

  • https://17beta.top/
  • https://n2t.net/

The allocation of any identifier in the equivalence class SHALL imply the allocation of all the other identifiers in that equivalence class to exactly the same resource. However, different identifiers within the same equivalence class MAY have different availability statements.
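As an illustration of the equivalence class, the sketch below enumerates some of its members for a given bare ARK. It is deliberately incomplete: it treats only the "ark:" and "ark:/" spellings as equivalent bare ARKs and does not implement any further normalization from [ARK]. The function name is hypothetical and the resolver list is copied from this section.

```python
# Resolvers listed in this section.
RESOLVERS = ("https://17beta.top/", "https://n2t.net/")


def equivalent_identifiers(bare_ark: str) -> list[str]:
    """Enumerate identifiers treated as equivalent to a bare ARK.

    Assumption: only the "ark:" and "ark:/" spellings are considered;
    other equivalences defined in [ARK] are not handled here.
    """
    if not bare_ark.startswith("ark:"):
        raise ValueError("not a bare ARK: " + bare_ark)
    rest = bare_ark[len("ark:"):].lstrip("/")
    spellings = ["ark:" + rest, "ark:/" + rest]
    members = list(spellings)
    for resolver in RESOLVERS:
        members.extend(resolver + spelling for spelling in spellings)
    return members
```

Allocating any one of the returned identifiers then implies allocating all the others to the same resource, although their availability statements may differ.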

3.1 Shoulderspaces

As recommended for the ARK system, the Name part is subdivided into shoulderspaces. For NAAN 21206 the following is the current policy: The shoulder is a sequence of zero or more digits in the range 2-9 followed by exactly one digit that is 0 or 1. Example: In the hypothetical ARK ark:21206/730591 the shoulder is 730. The shoulder “0” is reserved for subdivision using a different scheme. Names that do not begin with a decimal shoulder are reserved.
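The shoulder rule can be expressed as a short regular expression. This is a minimal sketch, assuming the digits 2-9 come first and the single 0 or 1 terminates the shoulder, which is the reading consistent with the example shoulder 730. The function name is hypothetical.

```python
import re

# Zero or more digits 2-9, then exactly one digit that is 0 or 1.
SHOULDER = re.compile(r"[2-9]*[01]")


def split_shoulder(name: str):
    """Split the Name part of an ARK into (shoulder, remainder).

    Returns None when the name does not begin with a decimal shoulder,
    i.e. when it falls in the reserved part of the name space.
    """
    match = SHOULDER.match(name)
    if match is None:
        return None
    return match.group(0), name[match.end():]
```

Because the terminating digit cannot occur inside the leading run, the shoulder is recognizable without a delimiter: split_shoulder("730591") returns ("730", "591").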

3.2 Resolution

The result of resolving any ARK allocated under NAAN 21206 through any resolver operated by 17beta MUST include a link (as described in [link]) with the “describedby” relation pointing to a resource with metainformation about the resource. The metainformation includes the name of the resource, the persistence property, the variance property and the date of allocation of the ARK (if known). The link MAY be included inline within the representation and it MAY be in the HTTP headers (if applicable), but only one option is REQUIRED. If the HTTP protocol is used, the status code MUST NOT be a permanent redirect (status code 301 or 308) but it MAY be a temporary redirect (status code 302, 303 or 307), a direct reply (status code 200 or 204) or another appropriate status code.
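The two HTTP-level requirements can be sketched as follows. Both helper names are hypothetical, and the header helper assumes that the address of the metainformation resource is already a URI (an IRI containing non-ASCII characters would need to be converted first).

```python
# Status codes that denote a permanent redirect and are therefore forbidden.
PERMANENT_REDIRECTS = frozenset({301, 308})


def status_allowed(status_code: int) -> bool:
    """Tell whether a resolver may answer an ARK request with this status."""
    return status_code not in PERMANENT_REDIRECTS


def describedby_link_header(metadata_uri: str) -> str:
    """Build the value of an HTTP Link header with the describedby relation."""
    return f'<{metadata_uri}>; rel="describedby"'
```

A resolver answering with 303 and a header such as `Link: <https://example.org/meta>; rel="describedby"` (the target is a placeholder) would satisfy both checks.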

4 IRIs reserved for resolvers of identifier schemes

resolver-reserved = web-prefix "/" scheme ":" *ipchar *("/" *ipchar)
scheme            = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

The IRIs that match resolver-reserved are reserved for resolvers of Archival Resource Keys (ARK) and similar future schemes; they MUST be allocated only for defined identifier schemes (not necessarily a URI scheme) and MUST match the semantics of that scheme.

5 IRIs for non-information resources

The term “information resource” is used as described in [coolURIs]. “non-information resource” has the corresponding meaning.

IRIs allocated for non-information resources that use the http or https URI scheme MUST NOT include an IRI fragment part and MUST return either a 404 or a 303 status code on dereference.

6 Namespaces

The notation used to describe syntax is that described in RFC 5234 augmented with set difference indicated by the operator “-” whose precedence is between Alternative and Concatenation. The syntactical terms ifragment, isegment-nz, iquery, ipchar and ucschar are as described by the IRI specification. web-prefix always refers to the domain names controlled by 17beta; the definition below is valid at the time of writing of this document.

web-prefix = http-s "://17beta.top" /
           http-s "://male-to-female.org"
http-s     = "http" / "https"
tail       = ["?" iquery] ["#" ifragment]
ark-prefix = web-prefix /
           http-s "://n2t.net"

6.1 Language namespace

lang-ns  = web-prefix "/" lang-tag *("/" *ipchar) tail
lang-tag = 2*3ALPHA ["-" 1*ipchar] / (("x-" / "X-") 1*ipchar)

The IRIs that match lang-ns constitute the “language” namespace. For IRIs allocated in this lang-ns namespace the lang-tag part MUST be a valid language tag as defined in BCP 47 or its successor; the lang-tag represents the language in which the tail of the IRI is written. The resource identified by the IRI SHOULD either be language-specific or available in the language identified by the lang-tag.

6.2 UUID namespace

uuid-ns   = web-prefix "/" uuid [slug] tail / "urn:uuid:" uuid
uuid      = 8hex-digit "-" 4hex-digit "-" 4hex-digit
          "-" 4hex-digit "-" 12hex-digit
hex-digit = DIGIT / %x41-46 / %x61-66    ; 0-9A-Fa-f
slug      = isegment-nz

The IRIs that match uuid-ns constitute the “UUID” namespace. IRIs allocated in this namespace MUST have a lowercase uuid part which MUST be a valid UUID as described in RFC 4122 or its successor. Once a UUID is used for the uuid part, it MUST NOT be reused. That is, a different resource MUST NOT be allocated using the same uuid part.
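The syntactic side of the uuid requirement can be checked with the Python standard library. This is a minimal sketch with a hypothetical function name. It verifies only that the text is the lowercase, hyphenated 8-4-4-4-12 form; it does not inspect the version or variant bits, and it cannot detect reuse.

```python
import uuid


def is_canonical_uuid(text: str) -> bool:
    """Tell whether text is a UUID in lowercase 8-4-4-4-12 form."""
    try:
        # uuid.UUID accepts several spellings; str() always yields the
        # lowercase hyphenated one, so equality rejects all the others.
        return str(uuid.UUID(text)) == text
    except ValueError:
        return False
```

The round trip through uuid.UUID is used instead of a hand-written pattern so that uppercase, braced and unhyphenated spellings are all rejected by the same comparison.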

6.3 Opaque namespace

opaque-ns    = (web-prefix "/" / ark-prefix "/ark:" ["/"]) opaque tail
opaque       = consonant *opaque-digit *(opaque-misc 1*opaque-digit) - 2*3ALPHA /
opaque-digit = DIGIT / consonant
consonant    = ALPHA - ("a" / "A" / "e" / "E" / "i" /
                      "I" / "o" / "O" / "u" / "U")
opaque-misc  = "-" / "_"

The IRIs that match opaque-ns constitute the “opaque” namespace. IRIs allocated within this namespace MUST have an opaque part that has no meaning beyond uniquely identifying the object. For convenience, the opaque part MAY be assigned sequentially based upon a positional numbering system (radix not specified here). However, after it is generated, the sequence number carries no meaning and MUST NOT be interpreted as metadata (e.g. to infer that some resource was created before another or that it forms a logical sequence with another resource).

6.4 Time-based namespace

datetime-ns = web-prefix "/" timespec ["/" *ucschar] tail
timespec    = 8DIGIT "T" 6DIGIT ["." 1*7DIGIT] "Z"

The IRIs that match datetime-ns constitute the “time-based” namespace. IRIs allocated within this namespace MUST have a timespec that is valid according to ISO 8601 and that is the time when they were allocated, and SHOULD use a time resolution that corresponds roughly with the frequency with which IRIs are generated by the application.
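A timespec in the form required by the grammar can be produced from a timezone-aware timestamp. This is a minimal sketch with a hypothetical function name. Python timestamps carry microseconds, so at most six of the seven fractional digits allowed by the grammar are generated, and the fraction is truncated rather than rounded.

```python
from datetime import datetime, timezone


def timespec(moment: datetime, fraction_digits: int = 0) -> str:
    """Format an aware datetime as 8DIGIT "T" 6DIGIT ["." digits] "Z"."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    if not 0 <= fraction_digits <= 6:
        raise ValueError("fraction_digits must be between 0 and 6")
    utc = moment.astimezone(timezone.utc)
    text = utc.strftime("%Y%m%dT%H%M%S")
    if fraction_digits:
        # Keep only the leading digits of the six-digit microsecond field.
        text += "." + f"{utc.microsecond:06d}"[:fraction_digits]
    return text + "Z"
```

The fraction_digits parameter is where an application would pick the time resolution: a generator that allocates a few IRIs per day could leave it at 0, while one that allocates many per second would raise it.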

6.5 Personal namespace

personal-ns = web-prefix "/~" name name-tail tail
name        = 1*namechar
name-tail   = *("/" 1*ipchar)
namechar    = ALPHA / DIGIT / "-" / "_" / ucschar

The IRIs that match personal-ns constitute the “personal” namespace. IRIs allocated within this namespace where name-tail is the empty string are reserved for personal home pages hosted by 17beta or archived versions of personal home pages formerly hosted by 17beta and their technical dependencies (e.g. stylesheets). This namespace MUST NOT be used for other purposes; name SHOULD be a name or nickname of the person of which it is a home page.

6.6 Miscellaneous namespace

IRIs that do not match any of the above described namespaces constitute the “Miscellaneous” namespace. There are specific restrictions on the use of this namespace.

7 References