Punycode is a set of rules that describes how strings of Unicode characters are mapped unambiguously to ASCII character strings. You will find the technical definition in RFC 3492 (Punycode: A Bootstring Encoding of Unicode for Internationalized Domain Names in Applications).

In an extremely simplified form, the following is what happens in this transposition:

The previously normalized IDN has the prefix "xn--" placed in front of it. All non-ASCII characters are taken out. The Punycode algorithm records which characters these were and where they stood, and appends this information in coded form to the end of the remaining string. To give an example: "zääz.de" is encoded as "xn--zz-viaa.de".
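The example above can be reproduced with the `punycode` and `idna` codecs in Python's standard library. This is only a minimal sketch: the helper name `to_ace_label` is illustrative, and the label is assumed to be lowercase and normalized already. Note that the built-in `idna` codec follows the older IDNA 2003 rules, so it should not be relied on for names containing "ß".

```python
ACE_PREFIX = "xn--"

def to_ace_label(label: str) -> str:
    """Convert one already-normalized label to its ASCII (ACE) form."""
    if label.isascii():
        return label  # pure ASCII labels are left untouched
    # The punycode codec returns the remaining ASCII characters,
    # a hyphen, and the coded non-ASCII characters and positions.
    return ACE_PREFIX + label.encode("punycode").decode("ascii")

print(to_ace_label("zääz"))                      # xn--zz-viaa
print("zääz.de".encode("idna").decode("ascii"))  # xn--zz-viaa.de
```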

Computers can only work with numbers. So letters of the alphabet and other characters have to be assigned to numbers before computers can process and store them. Before Unicode was developed, there were hundreds of different coding systems, and not one of them was complete. Even for a single language (such as German), no one system contained all the letters of the alphabet, punctuation marks and technical symbols in common use. The situation was made even more unsatisfactory by the fact that these coding systems could not be used side by side, since they assigned the same numbers to different characters. All this changed with the advent of Unicode, which ensures a unique assignment of characters to numbers, no matter what hardware and software is used. Texts that use Unicode can be exchanged throughout the world without problems or loss of information.
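The conflict between older coding systems can be illustrated with a short Python sketch. The two code pages used here (Latin-1 and the DOS code page 437) are merely examples chosen for illustration; the same byte value stands for a different character in each, whereas Unicode gives every character its own number.

```python
raw = bytes([0xE4])  # a single byte with the value 228

# Two legacy coding systems disagree about what this number means.
as_latin1 = raw.decode("latin-1")  # small a with diaeresis
as_cp437 = raw.decode("cp437")     # Greek capital sigma
assert as_latin1 != as_cp437

# Unicode assigns each of the two characters its own code point.
print(f"U+{ord(as_latin1):04X}")  # U+00E4
print(f"U+{ord(as_cp437):04X}")   # U+03A3
```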

The original definition of Unicode and its further development are in the hands of the Unicode Consortium, a non-profit body whose purpose is to standardize the representation of text data in computing. The consortium's members include many companies and institutions from the IT sector.

No. These entries are loaded directly into the name server without being converted first. Nothing changes for name-server host names and NS entries: the only permitted names are those consisting solely of the basic ASCII character set. It is therefore possible to enter the Punycode form (such as dns.xn--zz-viaa.de) as the host name in the name-server entries, but not the corresponding IDN (such as dns.zääz.de). This arrangement has one big advantage: hosts with Japanese, Chinese, Cyrillic, etc. names can also act as name servers, and there is no restriction to a particular character set. It is true that such host names cannot be registered under .de, but other registries are free to choose which letters and other characters they permit and, of course, their decisions are guided by the needs of the internet users they serve.
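A simple pre-check for name-server host names could look like the following Python sketch. The function name `is_ascii_hostname` is hypothetical, and the check covers only the character set (letters, digits and hyphen in every label), not the other rules that apply to host names.

```python
import re

# One label made up solely of ASCII letters, digits and hyphens.
_ASCII_LABEL = re.compile(r"[A-Za-z0-9-]+")

def is_ascii_hostname(name: str) -> bool:
    """Return True if every dot-separated label uses only basic ASCII."""
    return all(_ASCII_LABEL.fullmatch(label) for label in name.split("."))

print(is_ascii_hostname("dns.xn--zz-viaa.de"))  # True
print(is_ascii_hostname("dns.zääz.de"))         # False
```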

A .de domain must consist of at least one character, and its maximum length must not exceed 63 characters. In the case of IDNs, the question immediately arises as to whether these length constraints refer to the IDN itself (such as dänic.de) or to its conversion into an ASCII character string (xn--dnic-loa.de). For technical reasons, the maximum length of 63 characters applies to the ASCII character string, whereas the minimum length of one character applies to the IDN.
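The two length rules can be sketched in Python as follows. The function and constant names are illustrative, the minimum of one character is taken from the first sentence above, and the label (the part before ".de") is assumed to be normalized already.

```python
MAX_ACE_LENGTH = 63  # applies to the ASCII (ACE) form
MIN_IDN_LENGTH = 1   # applies to the IDN as the user writes it

def label_length_ok(label: str) -> bool:
    """Check the length rules for the part of the domain before '.de'."""
    if label.isascii():
        ace = label
    else:
        ace = "xn--" + label.encode("punycode").decode("ascii")
    return len(label) >= MIN_IDN_LENGTH and len(ace) <= MAX_ACE_LENGTH

print(label_length_ok("dänic"))   # True
print(label_length_ok("ä" * 60))  # False
```

The second call shows why the distinction matters: the IDN has only 60 characters, but its ASCII form is longer than 63.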

Yes. Just open the table DENIC has compiled for you. It contains a list of all the admissible characters, plus their corresponding numbers (in both decimal and hexadecimal) in the Unicode table, along with their official designations in both English and German. You can also copy characters straight from this table: simply mark a letter in the table and then paste it into your browser or DENIC's whois search.

Besides the ASCII characters (the 26 Latin letters, the ten digits and the hyphen), you may use some other characters for IDNs under .de. These include the German umlauts ä, ö and ü as well as the letter eszett ("ß") and letters with accents and other diacritics. We have compiled a table of the additional characters that are valid for IDNs under .de.

You might wonder why these particular 93 characters have been chosen and not others. There are several reasons:

  • DENIC supports all characters included in the Unicode Latin-1 Supplement and Latin Extended-A blocks which are marked as "PROTOCOL VALID" in RFC 5892 (The Unicode Code Points and Internationalized Domain Names for Applications).
  • DENIC is an open registry free from any form of discrimination. In Germany, there is no meaningful way of drawing a line between the various character sets, since the written German language now includes characters that originated in the languages of the northern, southern and eastern parts of Europe. The sensible and appropriate solution for us therefore seemed to be to adopt two blocks that cover the necessary European character set for those languages that are based on the Latin alphabet, including some additional characters.
  • The most frequently used new characters of these character sets can be entered via standard German keyboards without requiring any additional equipment or outlay.
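As a rough illustration of the first point, the following Python sketch lists candidate characters from the two Unicode blocks. It does not consult RFC 5892 itself. Instead it uses a heuristic that is purely an assumption of this sketch: it keeps lowercase letters that compatibility normalization (NFKC) leaves unchanged. The table compiled by DENIC remains the authoritative list.

```python
import unicodedata

# Code point ranges of the two Unicode blocks named above.
BLOCKS = ((0x0080, 0x00FF),   # Latin-1 Supplement
          (0x0100, 0x017F))   # Latin Extended-A

def candidate_characters() -> list[str]:
    """Heuristic approximation: lowercase letters stable under NFKC."""
    result = []
    for start, end in BLOCKS:
        for code_point in range(start, end + 1):
            ch = chr(code_point)
            if (unicodedata.category(ch) == "Ll"
                    and unicodedata.normalize("NFKC", ch) == ch):
                result.append(ch)
    return result

chars = candidate_characters()
print(len(chars))
```

With the Unicode data in recent Python versions this heuristic yields 93 characters, the same count as mentioned above, but that agreement should not be taken as proof that the two lists are identical.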

Early in 2003, the IETF (Internet Engineering Task Force) adopted a standard that considerably increases the number of characters that can potentially be used in domains.

Until then, only ASCII characters were permitted in domains (that is to say: a-z, 0-9 and the hyphen). Countries using alphabets other than the Latin one were particularly active in the efforts to broaden the possible character set to include the characters they use nationally. Such a standard now exists, and it is called IDNA (Internationalized Domain Names in Applications).

There is a ready explanation for the problem that used to stand in the way of using specific national characters in domains. The Internet protocols that are currently in use throughout the world were only ever designed for pure ASCII characters. It would have been prohibitively expensive to replace all the server systems and operating-system components, such as resolvers, that are based on these protocols. So the requirement set for the new protocol was that it must be possible to convert every domain containing non-ASCII characters into a pure ASCII string.

A procedure known as the Punycode algorithm is applied to the IDN to convert it into an ASCII character string, known as an "ACE string" (ACE = ASCII Compatible Encoding). To distinguish ACE strings from others, they all have to start with a particular prefix, namely "xn--". So the ACE domain consists of three blocks: firstly, the prefix; secondly, all the domain's classical ASCII components in the correct order; and, thirdly, a code which indicates all the non-ASCII characters and their positions.
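The three blocks of an ACE string can be made visible with a few lines of Python. This is only a sketch: the function names are illustrative, and the decoding step relies on the `punycode` codec from Python's standard library rather than on a full IDNA implementation.

```python
def split_ace_label(ace: str) -> tuple[str, str, str]:
    """Split an ACE label into prefix, ASCII part and coded part."""
    prefix, body = ace[:4], ace[4:]
    if prefix.lower() != "xn--":
        raise ValueError("not an ACE label")
    # The last hyphen separates the ASCII characters from the code;
    # if there is no hyphen, the label had no ASCII characters at all.
    ascii_part, _, code = body.rpartition("-")
    return prefix, ascii_part, code

def ace_to_unicode(ace: str) -> str:
    """Decode an ACE label back into the original IDN label."""
    return ace[4:].encode("ascii").decode("punycode")

print(split_ace_label("xn--zz-viaa"))  # ('xn--', 'zz', 'viaa')
assert ace_to_unicode("xn--zz-viaa") == "zääz"
```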