ISO/IEC FCD 19774 — Humanoid animation (H-Anim)
5 Abstract data types
This clause describes the syntax and general semantics of data types used by H-Anim to define the properties of H-Anim objects.
Table 5.1 provides links to the major topics in this clause.
There are two general classes of data types: data types that contain a single elemental value or a fixed-length array of elemental values, and data types that contain an ordered list of elemental values.
Elemental value data types can specify real numbers, text strings, or H-Anim objects. When a fixed number of elemental values is needed, an array of an elemental value data type may be specified in the form:
DataTypeName[array_size]
Each of the allowed elemental data types is specified below.
Ordered list data types are specified in the form:
sequence<DataTypeName>
where DataTypeName is the name of an elemental value data type. The element type of a sequence may also be an elemental value array data type. An ordered list may be empty, indicating that it contains zero values.
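The two notations above can be illustrated with a small parser. This is only a sketch: the names TypeSpec and parse_type_spec are hypothetical and are not defined by this International Standard, and the parser assumes that type names consist of letters, digits, and underscores.

```python
import re
from typing import NamedTuple, Optional


class TypeSpec(NamedTuple):
    base: str                   # elemental type name, e.g. "float"
    array_size: Optional[int]   # fixed array length, or None for a single value
    is_sequence: bool           # True for the sequence<...> form


# DataTypeName or DataTypeName[array_size]
_ELEMENT = re.compile(r"^(\w+)(?:\[(\d+)\])?$")
# sequence<DataTypeName> or sequence<DataTypeName[array_size]>
_SEQUENCE = re.compile(r"^sequence<(.+)>$")


def parse_type_spec(text: str) -> TypeSpec:
    """Split a type specification into base name, array size, and sequence flag."""
    text = text.strip()
    seq = _SEQUENCE.match(text)
    inner = seq.group(1).strip() if seq else text
    elem = _ELEMENT.match(inner)
    if elem is None:
        raise ValueError(f"unrecognized type specification: {text!r}")
    size = int(elem.group(2)) if elem.group(2) else None
    return TypeSpec(elem.group(1), size, seq is not None)


print(parse_type_spec("float[3]"))
print(parse_type_spec("sequence<float[3]>"))
```

Under these assumptions, "float[3]" denotes a fixed array of three float values, and "sequence<float[3]>" denotes an ordered, possibly empty, list of such arrays.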
The float data type specifies one single-precision floating point value.
Implementation of this data type is targeted at the single-precision floating point capabilities of processors. However, it is allowable to implement this data type using fixed point numbering, provided that at least six decimal digits of precision are maintained and that exponents have a range of at least [-12, 12] for both positive and negative numbers.
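The precision requirement can be checked informally. The sketch below rounds a value to IEEE 754 single precision with Python's struct module and compares six significant decimal digits before and after; the helper names to_single and keeps_six_digits are hypothetical, and the check is meaningful only for inputs written with at most six significant digits.

```python
import struct


def to_single(value: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def keeps_six_digits(value: float) -> bool:
    """True if six significant decimal digits survive the single-precision round trip."""
    return f"{to_single(value):.5e}" == f"{value:.5e}"


# Values at both ends of the required exponent range [-12, 12].
for sample in (1.23456e12, -9.87654e-12, 0.123456):
    print(sample, keeps_six_digits(sample))
```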
The int data type specifies one integer value supporting at least the range [-2147483647, 2147483647].
The Object data type is a private data type that specifies the representation of one H-Anim object. The exact form of the representation of an instance of Object is specified by the binding of this International Standard to a presentation system.
The string data type represents text strings encoded with the UTF-8 universal character set (see 2.[I10646-1]). Instances of the string data type are specified as a sequence of UTF-8 octets.
Any characters (including linefeeds and '#') may appear within the string.
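As an illustration, the following sketch converts a text string to its UTF-8 octets using Python's built-in codec. The helper name to_octets is hypothetical. It shows that a linefeed and '#' are ordinary octets within the string, and that a non-ASCII character occupies more than one octet.

```python
def to_octets(text: str) -> bytes:
    """Return the UTF-8 octet sequence for a text string."""
    return text.encode("utf-8")


sample = "A#\n\u00e9"   # 'A', '#', linefeed, and e with acute accent
octets = to_octets(sample)

# Print each octet in hexadecimal.
print(" ".join(f"0x{octet:02X}" for octet in octets))

# Decoding the octets recovers the original string unchanged.
assert octets.decode("utf-8") == sample
```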
Characters in 2.[I10646-1] are encoded in multiple octets. Code space is divided into four units, as follows:
+-------------+-------------+-----------+------------+
| Group-octet | Plane-octet | Row-octet | Cell-octet |
+-------------+-------------+-----------+------------+
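2.[I10646-1] allows two basic forms for characters: UCS-2, in which each character is encoded in two octets (row and cell), and UCS-4, in which each character is encoded in the full four octets.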
In addition, two transformation formats (UCS Transformation Format or UTF) are accepted: UTF-8 and UTF-16. Each name indicates the size of the unit used by the transformation: 8 bits or 16 bits. UTF-8 and UTF-16 are referenced in 2.[I10646-1].
UTF-8 maintains transparency for all ASCII code values (0..127). It allows ASCII text (0x00..0x7F) to appear without any changes and encodes each character in the range 0x80..0x7FFFFFFF as a series of two to six bytes.
If the most significant bit of the first byte is 0, the remaining seven bits are interpreted as an ASCII character. Otherwise, the number of leading 1 bits gives the total number of bytes in the sequence, including the first byte. There is always a zero bit between the count bits and any data.
The first byte is one of the following. The X indicates bits available to encode the character:
First byte   Sequence length   Character values
0XXXXXXX     only one byte     0..0x7F (ASCII)
110XXXXX     two bytes         Maximum character value is 0x7FF
1110XXXX     three bytes       Maximum character value is 0xFFFF
11110XXX     four bytes        Maximum character value is 0x1FFFFF
111110XX     five bytes        Maximum character value is 0x3FFFFFF
1111110X     six bytes         Maximum character value is 0x7FFFFFFF
All following bytes have the format 10XXXXXX.
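The byte layouts above can be sketched as an encoder. The function name encode_utf8 is hypothetical, and the code follows the one-to-six-byte scheme described in this clause. Note that the later UTF-8 definition in RFC 3629 restricts sequences to four bytes and code values to 0x10FFFF, so five-byte and six-byte output is not accepted by current decoders.

```python
def encode_utf8(code: int) -> bytes:
    """Encode one character value using the one-to-six-byte layouts of this clause."""
    if not 0 <= code <= 0x7FFFFFFF:
        raise ValueError("character value out of range")
    if code <= 0x7F:
        return bytes([code])            # 0XXXXXXX: ASCII passes through unchanged

    # (maximum character value, total bytes, first-byte prefix)
    layouts = [
        (0x7FF, 2, 0xC0),               # 110XXXXX
        (0xFFFF, 3, 0xE0),              # 1110XXXX
        (0x1FFFFF, 4, 0xF0),            # 11110XXX
        (0x3FFFFFF, 5, 0xF8),           # 111110XX
        (0x7FFFFFFF, 6, 0xFC),          # 1111110X
    ]
    for limit, length, prefix in layouts:
        if code <= limit:
            break

    # Peel off six data bits per following byte, least significant first.
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (code & 0x3F))   # 10XXXXXX
        code >>= 6

    # The remaining high bits go into the first byte after the count prefix.
    return bytes([prefix | code] + tail[::-1])


print(encode_utf8(0x20AC).hex())
```

For character values up to 0x10FFFF (excluding surrogates), the result agrees with Python's built-in "utf-8" codec, which offers a convenient cross-check.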
As a two byte example, the symbol for a registered trade mark ®, encoded as 0x00AE in UCS-2 of ISO 10646-1, has the following two byte encoding in UTF-8: 0xC2, 0xAE.
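The example can be verified in the reverse direction with a small decoder. This is a minimal sketch with a hypothetical function name, decode_first. It counts the leading 1 bits of the first byte to find the sequence length and then collects six data bits from each following byte. It does not reject overlong encodings, which a production decoder would need to do.

```python
from typing import Tuple


def decode_first(data: bytes) -> Tuple[int, int]:
    """Decode the first character in data; return (character value, bytes consumed)."""
    if not data:
        raise ValueError("empty input")
    lead = data[0]
    if lead < 0x80:
        return lead, 1                   # 0XXXXXXX: single ASCII byte

    # Count the leading 1 bits of the first byte.
    length = 0
    mask = 0x80
    while lead & mask:
        length += 1
        mask >>= 1
    if length == 1 or length > 6:
        raise ValueError("invalid first byte")
    if len(data) < length:
        raise ValueError("truncated sequence")

    # Bits below the terminating zero bit are data bits.
    code = lead & (mask - 1)
    for byte in data[1:length]:
        if byte & 0xC0 != 0x80:          # following bytes must be 10XXXXXX
            raise ValueError("invalid following byte")
        code = (code << 6) | (byte & 0x3F)
    return code, length


value, used = decode_first(bytes([0xC2, 0xAE]))
print(hex(value), used)
```

Here 0xC2 is 11000010 in binary: two leading 1 bits give a two-byte sequence with data bits 00010, and 0xAE contributes the data bits 101110, which together form 0x00AE.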