Symbian OS Explained - Effective C++ Programming For Smartphones (2005) [eng].pdf

5 Descriptors: Symbian OS Strings

Get your facts first, then you can distort them as you please

Mark Twain

The Symbian OS string is known as a “descriptor”, because it is self-describing. A descriptor holds the length of the string of data it represents as well as its “type”, which identifies the underlying memory layout of the descriptor data. Descriptors have something of a reputation among Symbian OS programmers because they take some time to get used to. The key point to remember is that they were designed to be very efficient on low-memory devices, using the minimum amount of memory necessary to store the string, while describing it fully in terms of its length and layout. There is, necessarily, some trade-off between efficiency and simplicity of use, which this chapter illustrates. The chapter is intended to give a good understanding of the design and philosophy of descriptors. The next chapter will show how to use descriptors most effectively by looking at some of the more frequently used descriptor API functions and describing some common descriptor mistakes and misconceptions.

Descriptors have been part of Symbian OS since its initial release and they have a well-established base of documentation. Despite this, they can still appear confusing at first sight, perhaps because there are quite a number of descriptor classes, all apparently different although interoperable.¹ They’re not like standard C++ strings, Java strings or the MFC CString (to take just three examples) because their underlying memory allocation and cleanup must be managed by the programmer. But they are not like C strings either; they protect against buffer overrun and don’t rely on NULL terminators to determine the length of the string. So let’s discuss what they are and how they work – initially by looking at a few concepts before moving on to the different descriptor classes.

First, I should make the distinction between descriptors and literals; the latter can be built into program binaries in ROM because they are constant. Literals are treated a bit differently to descriptors and I’ll come back to them later in the chapter. For now, the focus is on descriptors.

¹ To paraphrase Andrew Tanenbaum: The nice thing about descriptors is that there are so many to choose from.

Another issue is the “width” of the string data, that is, whether an individual character is 8 or 16 bits wide. Early releases, up to and including Symbian OS v5, were narrow builds with 8-bit native characters, but since that release Symbian OS has been built with wide 16-bit characters as standard, to support Unicode character sets. The operating system was designed to manage both character widths from the outset by defining duplicate sets of descriptor classes for 8- and 16-bit data. The behavior of the 8- and 16-bit descriptor classes is identical except for Copy() and Size(), both of which are described in the next chapter. In addition, a set of neutral classes is typedef’d to either the narrow or wide descriptor classes, depending on the build width. You can identify the width of a class from its name. If it ends in 8 (e.g. TPtr8) it assumes narrow 8-bit characters, while descriptor class names ending with 16 (e.g. TPtr16) manipulate 16-bit character strings. The neutral classes have no number in their name (e.g. TPtr) and, on releases of Symbian OS since v5u,² they are implicitly wide 16-bit strings.
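The typedef arrangement can be pictured with a small stand-alone sketch in standard C++. It is only an illustration of the idea: SketchPtr8, SketchPtr16, SketchPtr, SizeInBytes() and the SKETCH_NARROW_BUILD macro are hypothetical names invented here, not Symbian OS identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a narrow and a wide descriptor class.
// Only the character type matters for this illustration.
struct SketchPtr8  { typedef std::uint8_t  CharType; };
struct SketchPtr16 { typedef std::uint16_t CharType; };

// The neutral name resolves to one of the two at build time.
// SKETCH_NARROW_BUILD is an assumed macro name, used only in this sketch.
#if defined(SKETCH_NARROW_BUILD)
typedef SketchPtr8 SketchPtr;
#else
typedef SketchPtr16 SketchPtr;
#endif

// Number of bytes occupied by a string of the given length in characters.
template <typename Desc>
std::size_t SizeInBytes(std::size_t lengthInChars) {
    return lengthInChars * sizeof(typename Desc::CharType);
}
```

With the macro left undefined, SizeInBytes<SketchPtr>(5) gives the same result as the wide class, which mirrors how code written against a neutral name picks up the build width without being edited.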

The neutral classes were defined for source compatibility purposes to ease the switch between narrow and wide builds. Although today Symbian OS is always built with 16-bit wide characters, you are well advised to continue to use the neutral descriptor classes where you do not need to state the character width explicitly.

Descriptors can also be used for binary data because they don’t rely on a NULL terminating character to determine their length. The unification of binary and string-handling APIs makes it easier for programmers and, of course, the ability to re-use string manipulation code on data helps keep Symbian OS compact. To work with binary data, you need to code specifically with the 8-bit descriptor classes. The next chapter discusses how to manipulate binary data in descriptors in more detail.
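To see why carrying an explicit length makes binary data safe to handle, consider the following minimal sketch in standard C++. SketchBytes and CountZeroBytes() are hypothetical names used only for illustration; they are not descriptor APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical length-carrying view of 8-bit data: the length is stored
// alongside the pointer, so a zero byte is just another byte of data.
struct SketchBytes {
    const std::uint8_t* data;
    std::size_t length;
};

// Walks the full stored length rather than stopping at the first zero byte.
inline std::size_t CountZeroBytes(const SketchBytes& bytes) {
    assert(bytes.data != nullptr || bytes.length == 0);
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < bytes.length; ++i) {
        if (bytes.data[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}
```

A terminator-based scan over the bytes 01 00 02 00 03 would stop after the first byte, whereas the length-based walk above visits all five.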

So, with that knowledge in hand, we can move on to consider the descriptor classes in general.

5.1 Non-Modifiable Descriptors

All (non-literal) descriptors derive from the base class TDesC, which is typedef’d to TDesC16 in e32std.h and defined in e32des16.h (the narrow version, TDesC8, can be found in e32des8.h). Chapter 1 discusses Symbian OS class naming conventions and explains what the “T” prefix represents. The “C” at the end of the class name is more relevant to this discussion, however; it reflects that the class defines a non-modifiable type of descriptor, whose contents are constant. The class provides methods for determining the length of the descriptor and accessing the data.

² Symbian OS v5u was used in the Ericsson R380 mobile phone. This version is also sometimes known as “ER5U”, which is an abbreviation of “EPOC Release 5 Unicode”.

The length of the descriptor is returned, unsurprisingly, by the Length() method. The layout of every descriptor object is the same, with 4 bytes holding the length of the data it currently contains. (Actually, only 28 of the available 32 bits are used to hold the length of the descriptor data; 4 bits are reserved for another purpose, as I’ll describe very shortly. This means that the maximum length of a descriptor is limited to 2^28 bytes (256 MB), which should be more than sufficient!)

The Length() method in TDesC is never overridden by its subclasses since it is equally valid for all types of descriptor. Access to the descriptor data, however, differs according to the implementation of each derived descriptor class. Even so, Symbian OS does not require each subclass to implement its own data access method using virtual functions, because virtual function overriding would place the burden of an extra 4 bytes on each derived descriptor object, added by C++ as a virtual pointer (vptr) to access the virtual function table. As I’ve already described, descriptors were designed to be as efficient as possible and the size overhead to accommodate a vptr was considered undesirable. Instead, to allow for the specialization of derived classes, the top 4 bits of the 4 bytes that store the length of the descriptor object are reserved to indicate the type of descriptor.
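The cost of a vptr is easy to observe in standard C++. The sketch below compares two hypothetical header structs, PlainHeader and VirtualHeader, which are invented for this illustration; the exact sizes depend on the compiler and target, so the code only relies on the virtual version being larger.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical header with no virtual functions: just the 4-byte word.
struct PlainHeader {
    std::uint32_t lengthAndType;
    std::uint32_t Length() const { return lengthAndType & 0x0FFFFFFFu; }
};

// The same data, but with virtual functions. Mainstream compilers add a
// hidden pointer to the virtual function table to every object.
struct VirtualHeader {
    std::uint32_t lengthAndType;
    virtual ~VirtualHeader() {}
    virtual const void* Ptr() const { return nullptr; }
};

// Extra bytes per object attributable to the vptr (plus any padding).
inline std::size_t VptrOverheadBytes() {
    return sizeof(VirtualHeader) - sizeof(PlainHeader);
}
```

On a 32-bit target the difference would typically be the 4 bytes mentioned above; on a 64-bit desktop compiler it is usually larger because the pointer is 8 bytes and padding is added.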

There are currently five derived descriptor classes, each of which sets the identifying bits as appropriate upon construction. The use of 4 bits to identify the type limits the number of different types of descriptor to 2^4 (=16), but since only five types have been necessary in current and previous releases of Symbian OS, it seems unlikely that the range will need to be extended significantly in the future.
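The arithmetic behind the 28-bit length and 4-bit type can be sketched as follows. This is a stand-alone model in standard C++ rather than the Symbian OS source; the constant and function names (kTypeShift, kLengthMask, kMaxTypes, PackLengthAndType() and so on) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout for this sketch: type in the top 4 bits of a 32-bit
// word, length in the remaining 28 bits.
const std::uint32_t kTypeShift  = 28;
const std::uint32_t kLengthMask = (1u << kTypeShift) - 1u;  // 0x0FFFFFFF
const std::uint32_t kMaxTypes   = 1u << 4;                  // 16 possible types

// Combines a type identifier and a length into one word.
inline std::uint32_t PackLengthAndType(std::uint32_t type, std::uint32_t length) {
    assert(type < kMaxTypes);       // must fit in 4 bits
    assert(length <= kLengthMask);  // must fit in 28 bits
    return (type << kTypeShift) | length;
}

// Recovers the length by masking off the type bits.
inline std::uint32_t UnpackLength(std::uint32_t word) { return word & kLengthMask; }

// Recovers the type by shifting the length bits away.
inline std::uint32_t UnpackType(std::uint32_t word) { return word >> kTypeShift; }
```

Packing a type of 4 with a length of 43 and then unpacking the word returns both values unchanged, while the mask shows where the 2^28 limit comes from.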

Access to the descriptor data for all descriptors goes through the non-virtual Ptr() method of the base class, TDesC, which uses a switch statement to check the 4 bits, identify the type of descriptor and return the correct address for the beginning of its data. Of course, this requires that the TDesC base class has knowledge of the memory layout of its subclasses hardcoded into Ptr().
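The following sketch shows the general shape of such a switch-based, non-virtual Ptr() in standard C++. It is a simplified model under stated assumptions, not the real implementation: the classes SketchDesC, SketchPtrC and SketchBufC, the two type values and the 16-byte buffer are all hypothetical, and only two layouts are modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

const std::uint32_t kTypeShift  = 28;           // assumed: type in top 4 bits
const std::uint32_t kLengthMask = 0x0FFFFFFFu;  // assumed: length in low 28 bits

// Hypothetical base class: no virtual functions, so no vptr.
class SketchDesC {
public:
    std::size_t Length() const { return iLength & kLengthMask; }
    const char* Ptr() const;  // non-virtual; dispatches on the type bits
protected:
    enum SketchType { EPointer = 0, EBuffer = 1 };
    SketchDesC(SketchType type, std::size_t length)
        : iLength((static_cast<std::uint32_t>(type) << kTypeShift) |
                  (static_cast<std::uint32_t>(length) & kLengthMask)) {}
private:
    std::uint32_t iLength;
};

// Pointer layout: the data lives elsewhere.
class SketchPtrC : public SketchDesC {
public:
    SketchPtrC(const char* data, std::size_t length)
        : SketchDesC(EPointer, length), iPtr(data) {}
private:
    friend class SketchDesC;
    const char* iPtr;
};

// Buffer layout: the data is part of the object.
class SketchBufC : public SketchDesC {
public:
    SketchBufC(const char* data, std::size_t length)
        : SketchDesC(EBuffer, length) {
        assert(length <= sizeof(iBuf));
        std::memcpy(iBuf, data, length);
    }
private:
    friend class SketchDesC;
    char iBuf[16];
};

// The base class hardcodes knowledge of each subclass layout.
inline const char* SketchDesC::Ptr() const {
    switch (iLength >> kTypeShift) {
    case EPointer: return static_cast<const SketchPtrC*>(this)->iPtr;
    case EBuffer:  return static_cast<const SketchBufC*>(this)->iBuf;
    default:       assert(false); return nullptr;
    }
}
```

Code holding only a const SketchDesC& can call Length() and Ptr() on either layout. The price is that adding a new layout means editing the base class switch, which is the trade-off the text describes.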

With the Length() and Ptr() methods, the TDesC base class can implement all the operations you’d typically expect to perform on a constant string (such as data access, comparison and search). Some of these methods are described in detail in the next chapter, and all will be documented in full in your preferred SDK. The derived classes all inherit these methods and, in consequence, all constant descriptor manipulation is performed by the same base class code, regardless of the type of the descriptor.


The non-modifiable descriptor class TDesC is the base class from which all non-literal descriptors derive. It provides methods to determine the length of the descriptor and to access its data. In addition, it implements all the operations you’d typically expect to perform on a constant string.

5.2 Modifiable Descriptors

Let’s now go on to consider the modifiable descriptor types, which all derive from the base class TDes, itself a subclass of TDesC. TDes has an additional member variable to store the maximum length of data allowed for the current memory allocated to the descriptor. The MaxLength() method of TDes returns this value. Like the Length() method of TDesC, it is not overridden by the derived classes.

TDes defines the range of methods you’d expect for modifiable string data, including those to append, fill and format the descriptor data. Again, all the manipulation code is inherited by the derived classes, and acts on them regardless of their type. Typically, the derived descriptors only implement specific methods for construction and copy assignment.

None of the methods allocates memory, so if they extend the length of the data in the descriptor, as Append() does, for example, you must ensure that there is sufficient memory available for them to succeed before calling them. Of course, the length of the descriptor can be less than the maximum length allowed and the contents of the descriptor can shrink and expand, as long as the length does not exceed the maximum length. When the length of the descriptor contents is shorter than the maximum length, the final portion of the descriptor is simply unused.

The modification methods use assertion statements to check that the maximum length of the descriptor is sufficient for the operation to succeed. These will panic if an overflow would occur if they proceeded, allowing you to detect and fix the programming error (panics are described in detail in Chapter 15 and assertions in Chapter 16).

The very fact that you can’t overflow the descriptor makes the code robust and less prone to hard-to-trace memory scribbles. In general, descriptor classes use __ASSERT_ALWAYS to check that there is sufficient memory allocated for an operation, raising a USER category panic if the assertion fails. In the event of such a panic, it can be assumed that no illegal access of memory has taken place and that no data was moved or corrupted.
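The check-before-modify behavior can be modeled in a few lines of standard C++. In this sketch the hypothetical SketchDes class stands in for a modifiable descriptor, and a thrown std::length_error stands in for the panic purely so that the outcome can be observed; a real panic terminates the thread rather than throwing an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical modifiable descriptor over caller-supplied storage.
// It never allocates: the caller decides how much memory is available.
class SketchDes {
public:
    SketchDes(char* storage, std::size_t maxLength)
        : iData(storage), iLength(0), iMaxLength(maxLength) {
        assert(storage != nullptr);
    }

    std::size_t Length() const { return iLength; }
    std::size_t MaxLength() const { return iMaxLength; }
    const char* Ptr() const { return iData; }

    // Validates the request first, so a failed call leaves the existing
    // data and the memory beyond the buffer untouched.
    void Append(const char* src, std::size_t srcLength) {
        if (srcLength > iMaxLength - iLength) {
            // Stand-in for the panic raised on overflow.
            throw std::length_error("descriptor overflow");
        }
        std::memcpy(iData + iLength, src, srcLength);
        iLength += srcLength;
    }

private:
    char* iData;
    std::size_t iLength;
    std::size_t iMaxLength;
};
```

Appending five characters to an 8-character buffer succeeds; a further append of four characters is rejected before any byte is copied, so the length and contents stay as they were.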

The base classes provide and implement the APIs for constant and modifiable descriptor operations for consistency, regardless of the actual type of the derived descriptor. For this reason, the base classes should be used as arguments to functions and return types, allowing descriptors to be passed around in code without forcing a dependency on a particular type. However, if you attempt to create objects of type TDesC and TDes you’ll find that they cannot be instantiated directly because their default constructors are protected.³
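The pattern of accepting the base class while instantiating only derived classes looks like this in a stand-alone model. The names SketchDesC, SketchPtrC and IsEmpty() are hypothetical and exist only for this sketch, which assumes nothing about the real classes beyond a protected default constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical base class with protected constructors: it can be used as
// a parameter type but cannot be default-constructed by client code.
class SketchDesC {
public:
    std::size_t Length() const { return iLength; }
protected:
    SketchDesC() : iLength(0) {}
    explicit SketchDesC(std::size_t length) : iLength(length) {}
private:
    std::size_t iLength;
};

// Hypothetical derived class that client code actually instantiates.
class SketchPtrC : public SketchDesC {
public:
    SketchPtrC(const char* data, std::size_t length)
        : SketchDesC(length), iPtr(data) {
        assert(data != nullptr || length == 0);
    }
    const char* Ptr() const { return iPtr; }
private:
    const char* iPtr;
};

// Taking the base class by reference keeps the function independent of
// whichever derived type the caller happens to hold.
inline bool IsEmpty(const SketchDesC& text) {
    return text.Length() == 0;
}

static_assert(!std::is_default_constructible<SketchDesC>::value,
              "the base class cannot be default-constructed from outside");
```

Because no copy constructor is declared in the sketch either, the compiler-generated public one still allows a base object to be created by copying a derived one, which is the loophole noted in the footnote.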

So it’s to the derived descriptor types that we now turn, since these are the descriptor classes that you’ll actually instantiate and use. As I mentioned earlier, it can at first sight appear quite confusing because there is a proliferation of descriptor classes. I’ve already explained why there are three versions of each class, e.g. TDes8, TDes16 and TDes, for narrow, wide and neutral (implicitly wide) classes respectively. Let’s now look at the main descriptor types, initially considering their general layout in memory before moving on to look at each class in more detail. I’ll describe the differences between the classes and the methods each defines over and above those provided by the TDesC and TDes base classes. The following chapter will go further into how to use the base class APIs, as well as noting any useful tips or mistakes commonly made when doing so. For comprehensive information about the descriptor APIs, you should refer to the SDK documentation.

As I’ll describe, descriptors come in two basic layouts: pointer descriptors, in which the descriptor holds a pointer to the location of a character string stored elsewhere, and buffer descriptors, where the string of characters forms part of the descriptor.

TDes is the base class for all modifiable descriptors, and itself derives from TDesC. It has a method to return the maximum amount of memory currently allocated to hold data, and a range of methods for modifying string data.

When using descriptors, memory management is your responsibility. Descriptors do not perform allocation, re-allocation or garbage collection, because of the extra overhead that would carry. However, descriptor functions do check against access beyond the end of the data, and will panic if passed out-of-bounds parameters.

³ There is no copy constructor declared, so the compiler generates a default public version which can be used to instantiate a TDes or TDesC by copy, although you are unlikely to have a valid reason for doing this:

_LIT(KExampleLiteral, "The quick brown fox jumps over the lazy dog");
TPtrC original(KExampleLiteral);
TDesC copy(original); // Shallow copy the type, length & data