Advanced CORBA Programming wit C++ - M. Henning, S. Vinoski.pdf

IT-SC book: Advanced CORBA® Programming with C++

Chapter 13. GIOP, IIOP, and IORs

13.1 Chapter Overview

Even though CORBA goes to great lengths to shield applications from the details of networking, it is useful to have at least a basic understanding of what happens under the hood of an ORB. In this chapter, we present an overview of the General Inter-ORB Protocol (GIOP) and the Internet Inter-ORB Protocol (IIOP), and we explain how protocol-specific information is encoded in object references. Our treatment is by no means exhaustive. We show just enough of the protocols to give you a basic understanding of how CORBA achieves interoperability without losing extensibility. Unless you are building your own ORB, the precise protocol details are irrelevant. You can consult the CORBA specification [18] if you want to learn more.

Sections 13.2 to 13.6 provide an overview of GIOP, including the requirements it makes on the underlying transport and its data encoding and message formats. Section 13.7 then describes IIOP, which is a concrete realization of the abstract GIOP specification. Section 13.8 shows how IORs encode information so that the protocols available for communication can be extended without affecting interoperability. Section 13.9 outlines changes made to the protocols with the CORBA 2.3 revision.

13.2 An Overview of GIOP

The CORBA specification defines the GIOP as its basic interoperability framework. GIOP is not a concrete protocol that can be used directly to communicate between ORBs. Instead, it describes how specific protocols can be created to fit within the GIOP framework. IIOP is one concrete realization of GIOP. The GIOP specification consists of the following major elements.

Transport assumptions

GIOP makes a number of assumptions about the underlying transport layer that carries GIOP protocol implementations.

Common Data Representation (CDR)

GIOP defines an on-the-wire format for each IDL data type, so sender and receiver agree on the binary layout of data.

Message formats

GIOP defines eight message types that are used by clients and servers to communicate. Only two of these messages are necessary to achieve the basic remote procedure call semantics of CORBA. The remainder are control messages or messages that support certain optimizations.

13.2.1 Transport Assumptions

GIOP makes the following assumptions about the underlying transport that is used to carry messages.

The transport is connection-oriented.

A connection-oriented transport allows the originator of a message to open a connection by specifying the address of the receiver. After a connection is established, the transport returns a handle to the originator that identifies the connection. The originator sends messages via the connection without specifying the destination address with each message; instead, the destination address is implicit in the handle that is used to send each message.

Connections are full-duplex.

The receiving end of a connection is notified when an originator requests a connection. The receiver can either accept or reject the connection. If the receiver accepts the connection, the transport returns a handle to the receiver. The receiver not only uses the handle to receive messages but can also use it to reply to the originator. In other words, the receiver can reply to the requests sent by the originator via the same single connection and does not need to know the address of the originator in order to send replies.

Connections are symmetric.

After a connection is established, either end of the connection can close it.

The transport is reliable.

The transport guarantees that messages sent via a connection are delivered no more than once in the order in which they were sent. If a message is not delivered, the transport returns an error indication to the sender.

The transport provides a byte-stream abstraction.

The transport does not impose limits on the size of a message and does not require or preserve message boundaries. In other words, the receiver views a connection as a continuous byte stream. Neither receiver nor sender need be concerned about issues such as message fragmentation, duplication, retransmission, or alignment.

The transport indicates disorderly loss of a connection.

If a network connection breaks down—for example, because one of the connection endpoints has crashed or the network is physically disrupted—both ends of the connection receive an error indication.

This list of assumptions exactly matches the guarantees provided by TCP/IP. However, other transports also meet these requirements. They include Systems Network Architecture (SNA), Xerox Network Systems' Internet Transport Protocol (XNS/ITP), Asynchronous Transfer Mode (ATM), HyperText Transfer Protocol Next Generation (HTTP-NG), and Frame Relay.[1]

[1] The only standardized protocol based on GIOP is IIOP, which uses TCP/IP as its transport. However, the OMG is likely to specify inter-ORB protocols for other transports in the future.

13.3 Common Data Representation

GIOP defines a Common Data Representation that determines the binary layout of IDL types for transmission. CDR has the following main characteristics.

CDR supports both big-endian and little-endian representation.

CDR-encoded data is tagged to indicate the byte ordering of the data. This means that both big-endian and little-endian machines can send data in their native format. If the sender and receiver use different byte ordering, the receiver is responsible for byte swapping. This model, called "receiver makes it right," has the advantage that if both sender and receiver have the same endianness, they can communicate using the native data representation of their respective machines. This is preferable to encodings such as XDR, which require big-endian encoding on the wire and therefore penalize communication if both sender and receiver use little-endian machines.
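The receiver-makes-it-right rule can be sketched in a few lines of C++. This is an illustration only, not code from any ORB: the names host_is_little_endian, swap32, and read_ulong are hypothetical, and the parameter sender_little_endian stands in for the byte-order tag that accompanies CDR-encoded data.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true if this machine stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Reverses the byte order of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Reads a 4-byte unsigned value that the sender wrote in its own native
// byte order. The receiver swaps only when its byte order differs from
// the sender's; otherwise the bytes are used as they arrived.
inline std::uint32_t read_ulong(const unsigned char* buf,
                                bool sender_little_endian) {
    assert(buf != nullptr);
    std::uint32_t v;
    std::memcpy(&v, buf, sizeof v);
    if (sender_little_endian != host_is_little_endian())
        v = swap32(v);
    return v;
}
```

When both ends share the same byte order, read_ulong reduces to a plain copy, which is the efficiency argument made above.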

CDR aligns primitive types on natural boundaries.

CDR aligns primitive data types on byte boundaries that are natural for most machine architectures. For example, short values are aligned on a 2-byte boundary, long values are aligned on a 4-byte boundary, and double values are aligned on an 8-byte boundary. Encoding data according to these alignments wastes some bandwidth because part of a CDR-encoded byte stream consists of padding bytes. However, despite the padding, CDR is more efficient than a more compact encoding because, in many cases, data can be marshaled and unmarshaled simply by pointing at a value that is stored in memory in its natural binary representation. This approach avoids expensive data copying during marshaling.

CDR-encoded data is not self-identifying.

CDR is a binary encoding that is not self-identifying. For example, if an operation requires two in parameters, a long followed by a double, the marshaled data consists of 16 bytes. The first 4 bytes contain the long value, the next 4 bytes are padding with undefined contents to maintain alignment, and the final 8 bytes contain the double value. The receiver simply sees 16 bytes of data and must know in advance that these 16 bytes contain a long followed by a double in order to correctly unmarshal the parameters.

This means that CDR encoding requires an agreement between sender and receiver about the types of data that are to be exchanged. This agreement is established by the IDL definitions that are used to define the interface between sender and receiver. The receiver has no way to prevent misinterpretation of data if the agreement is violated. For example, if the sender sends two double values instead of a long followed by a double, the receiver still gets 16 bytes of data but will silently misinterpret the first 4 bytes of the first double value as a long value.
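The following sketch illustrates the silent misinterpretation described above. It is a simplified model rather than real ORB marshaling code: the function names and the LongAndDouble structure are hypothetical, data is written in the native byte order of the machine, and the buffer is assumed to start at offset 0 of the byte stream.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == 8, "sketch assumes an 8-byte double");

// What the receiver expects: an IDL long followed by an IDL double.
struct LongAndDouble {
    std::int32_t l;
    double       d;
};

// Correct sender: long at offset 0, padding at 4..7, double at offset 8.
inline void marshal_long_double(std::int32_t l, double d, unsigned char* out) {
    assert(out != nullptr);
    std::memset(out, 0, 16);  // CDR leaves padding undefined; zeroed here
    std::memcpy(out, &l, 4);
    std::memcpy(out + 8, &d, 8);
}

// Incorrect sender: two doubles, which also occupy exactly 16 bytes.
inline void marshal_two_doubles(double a, double b, unsigned char* out) {
    assert(out != nullptr);
    std::memcpy(out, &a, 8);
    std::memcpy(out + 8, &b, 8);
}

// Receiver: has no type tags to check, so it simply trusts the agreement.
inline LongAndDouble unmarshal_long_double(const unsigned char* in) {
    assert(in != nullptr);
    LongAndDouble r;
    std::memcpy(&r.l, in, 4);      // first 4 bytes taken as the long
    std::memcpy(&r.d, in + 8, 8);  // bytes 8..15 taken as the double
    return r;
}
```

Feeding the output of marshal_two_doubles to unmarshal_long_double produces no error: the second double arrives intact, and the long is whatever the first 4 bytes of the first double happen to contain.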

CDR encoding is a compromise that favors efficiency. Because CDR supports both little-endian and big-endian representations and aligns data on natural boundaries, marshaling is both simple and efficient. The downside of CDR is that certain type mismatches cannot be detected at run time. In practice, this is rarely a problem because the stubs and skeletons generated by the C++ mapping make it impossible to send data of the wrong type. However, if you use the DII or DSI, you must take care not to send data of the wrong type as operation parameters because, at least in some cases, the type mismatch will go undetected at run time.

Other encodings do not suffer from this problem. For example, the Basic Encoding Rules (BER) used by ASN.1 use a Tag-Length-Value (TLV) encoding, which tags each primitive data item with both its type and its length. Such encodings provide better type safety at run time but are less efficient in both marshaling overhead and bandwidth. For this reason, most modern RPC mechanisms use encodings similar to CDR, in which data is not tagged with its type during transmission.

13.3.1 CDR Data Alignment

This section presents an overview of the CDR encoding rules. Again, we do not cover all of CDR here. Instead, we show the encoding of a few IDL types to illustrate the basic ideas.

Alignment for Primitive Fixed-Length Types

Each primitive type must start at a particular byte boundary relative to the start of the byte stream it appears in. The same requirements apply to both little-endian and big-endian machines. Table 13.1 shows the alignment requirements for fixed-length primitive types.

 

Table 13.1. CDR alignment of primitive fixed-length types.

Alignment     IDL Types
1             char, octet, boolean
2             short, unsigned short
4             long, unsigned long, float, enumerated types
8             long long, unsigned long long, double, long double
1, 2, or 4    wchar (alignment depends on codeset)
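Given the alignments in Table 13.1, the number of padding bytes before a value follows from simple arithmetic on the current stream offset. The helper names align_up and padding_before below are hypothetical and serve only as a minimal sketch of that calculation.

```cpp
#include <cassert>
#include <cstddef>

// Rounds offset up to the next multiple of alignment.
// Only the alignments that appear in Table 13.1 are accepted.
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
    assert(alignment == 1 || alignment == 2 ||
           alignment == 4 || alignment == 8);
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Number of padding bytes needed before a value with the given
// alignment can be written at the current offset.
inline std::size_t padding_before(std::size_t offset, std::size_t alignment) {
    return align_up(offset, alignment) - offset;
}
```

For example, a double that would otherwise start at offset 1 requires padding_before(1, 8), or 7, bytes of padding, whereas a char never requires any.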

Encoding of Strings

Strings and wide strings are encoded as an unsigned long (aligned on a 4-byte offset) that indicates the length of the string, including its terminating NUL byte, followed by the bytes of the string, terminated by a NUL byte. For example, the string "Hello" occupies 10 bytes. The first 4 bytes are an unsigned long with value 6, the next 5 bytes contain the characters Hello, and the final byte contains an ASCII NUL byte. This means that an empty string occupies 5 bytes: 4 bytes containing a length of 1, followed by a single NUL byte.
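The string layout just described can be sketched as follows for narrow strings. The function encode_string is a hypothetical helper, not part of any ORB API; it writes the length in the native byte order of the machine and fills alignment padding with zeros, although CDR leaves padding contents undefined.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Appends the CDR encoding of s to out: a 4-byte-aligned unsigned long
// holding the length (including the terminating NUL), then the
// characters, then the NUL byte itself.
inline void encode_string(std::vector<unsigned char>& out,
                          const std::string& s) {
    // IDL strings cannot contain embedded NUL characters.
    assert(s.find('\0') == std::string::npos);

    while (out.size() % 4 != 0)  // align the length on a 4-byte offset
        out.push_back(0);

    const std::uint32_t len = static_cast<std::uint32_t>(s.size() + 1);
    unsigned char raw[4];
    std::memcpy(raw, &len, sizeof raw);
    out.insert(out.end(), raw, raw + 4);

    out.insert(out.end(), s.begin(), s.end());
    out.push_back('\0');
}
```

Encoding "Hello" into an empty buffer yields 10 bytes, and encoding an empty string yields 5 bytes, matching the sizes given above.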

Encoding of Structures

Structures are encoded as a sequence of structure members in the order in which they are defined in IDL. Each structure member is aligned according to the rules in Table 13.1; padding bytes of undefined value are inserted to maintain alignment. Consider the following structure:

struct CD {
    char    c;
    double  d;
};

This structure contains a character, which can occur anywhere in a byte stream, followed by a double value, which must be aligned on an 8-byte boundary. Figure 13.1 shows how this structure would appear on the wire, assuming it starts at the beginning of a byte stream.

Figure 13.1 Structure of type CD encoded at the beginning of a byte stream.

Figure 13.1 indicates the offsets at which each value is encoded. The first byte of the stream, at offset 0, contains the value of the member c of the structure. This is followed by 7 padding bytes at offset 1 and, beginning at offset 8, the 8 bytes for the member d of the structure.

It is interesting to note that a structure of type CD does not always appear as a 16-byte value. Depending on the other data that precedes the structure on the wire, the length of the structure may vary. For example, consider the following operation, which accepts a string followed by a structure of type CD:

interface foo {
    void op(in string s, in CD ds);
};

When a client marshals a request to invoke op, it sends all the in parameters end-to-end according to CDR encoding rules. Assume for the moment that the parameters when sent inside the request begin at an 8-byte offset and that the client sends the string "Hello" as the value of the parameter s. Figure 13.2 shows the resulting encoding.

Figure 13.2 CDR encoding of the string "Hello" followed by a structure of type CD.
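The offsets for this example can be worked out mechanically from the alignment rules. The sketch below is illustrative only: OpLayout, align_to, and layout_op are hypothetical names, and the code computes positions within the byte stream rather than producing encoded data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stream offsets of each piece of the in parameters of op.
struct OpLayout {
    std::size_t length_at;  // unsigned long holding the string length
    std::size_t chars_at;   // first character of the string
    std::size_t c_at;       // member c of the CD structure
    std::size_t d_at;       // member d of the CD structure
    std::size_t end;        // first offset past the encoded parameters
};

// Rounds offset up to the next multiple of alignment.
inline std::size_t align_to(std::size_t offset, std::size_t alignment) {
    assert(alignment != 0);
    return (offset + alignment - 1) / alignment * alignment;
}

// Lays out a string followed by a CD structure, starting at offset start.
inline OpLayout layout_op(std::size_t start, const std::string& s) {
    OpLayout r;
    r.length_at = align_to(start, 4);        // length is 4-byte aligned
    r.chars_at  = r.length_at + 4;
    r.c_at      = r.chars_at + s.size() + 1; // char needs no alignment
    r.d_at      = align_to(r.c_at + 1, 8);   // double is 8-byte aligned
    r.end       = r.d_at + 8;
    return r;
}
```

For layout_op(8, "Hello"), the length sits at offset 8, the characters at offsets 12 to 17, the member c at offset 18, and the member d at offset 24. The structure therefore spans offsets 18 to 31, which is 14 bytes rather than the 16 bytes it occupies at the beginning of a byte stream.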

The encoding for the value "Hello" consumes 10 bytes: 4 bytes for the length and 6 bytes for the actual string. The second parameter is the structure of type CD. Because the
