Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:
Springer Science - 2005 - Reverse Engineering of Object Orie.pdf
Скачиваний:
17
Добавлен:
15.08.2013
Размер:
6.11 Mб
Скачать

3.3 Containers

51

Fig. 3.3. OFG for the binary search tree example.

After flow propagation, the following out set is determined for the attribute obj of class BinaryTreeNode:

Thus, an association can be drawn in the class diagram from BinaryTreeNode to Student. On the contrary, the analysis of the declared type would miss completely this interclass relationship, because the declared type of BinaryTreeNo- de. obj is Comparable.

As apparent from the example above, the declared types of variables are a good starting point to infer the relationships that hold among the user-defined classes represented in a class diagram. However, they may lead to imprecise diagrams, where some of the existing relationships are absent. One of the main reasons for the inaccuracy is the declaration of program locations whose type is an interface. In this case, the declared type is not very informative. An OFG based analysis of the actual object types can be used to obtain a more accurate class diagram.

3.3 Containers

Containers are classes that implement a data structure to store, manage, and access other objects. Classical examples of such data structures are: list, tree, graph, vector, hash table, etc. Weakly typed containers are containers that collect objects the type of which is not declared. With the current version of Java, that does not yet support genericity, all containers are weakly typed.

52 3 Class Diagram

Thus, an object x of type List that is used to store objects from class A is declared as: “List x;, without any explicit mention of the contained object type, A. Knowledge about the kind of objects that can be inserted into x and that are retrieved from x is not part of the program’s syntax.

Weakly typed containers expose programmers to errors that are not detected at compile time, and are typically due to a wrong type assumed for contained objects. Moreover, they make reverse engineering a difficult task. In fact, interclass relationships, such as associations and dependencies, are determined from the type declared for attributes, local variables and parameters. When containers are involved, the relationships to recover should connect the given class to the classes of the contained objects. However, information about the contained object classes is not directly available in the program.

eLib example

Let us consider the eLib example. Class Library has an attribute loans (line 6) of declared type Collection, and two attributes, users and documents (lines 4, 5), of type Map. Since both Collection and Map are interfaces, the algorithm described in Section 3.2 can be applied to determine a more accurate type for these class attributes. The result does not help reverse engineer the associations implemented through these attributes. In fact, the classes that implement the Collection and Map interfaces and are actually used for the corresponding attributes of class Library are respectively LinkedList and HashMap, that is, two weakly typed containers. Since HashMap and LinkedList are library classes, no relationship is drawn in the class diagram for them (only user defined classes are considered). However, a closer inspection of the source code reveals that the attribute documents holds the mapping between a document code and the corresponding Document object. Similarly, the attribute users associates a user code to the related User object. The attribute loans stores the list of all active loans of the library, represented as objects of the class Loan. Thus, three association relationships are missed when only declared types are considered, one between Library and Document, another one between Library and User, and a third one between Library and Loan. Correspondingly, the reverse engineered class diagram is very poor and does not show important information such as the way to access the Document objects managed by the Library, the library users (User objects), and the loans (missing association with class Loan).

3.3.1 Flow propagation

It is possible to define a specialization of the flow propagation algorithm presented in Chapter 2, aimed at estimating the type of the contained objects for weakly typed containers. The basic idea is that before insertion into a container each object has to be allocated, and allocation requires the full speci-

3.3 Containers

53

fication of the object type. Symmetrically, after extraction from a container each object has to be constrained to a specific type, in order to be manipulated with type-dependent operations. Flow propagation of the pre-insertion and post-extraction type information results in a static approximation of the contained object types. Such information can be used to refine the class diagrams extracted from the code, by recovering some of the otherwise missing relations between classes.

Container classes offer two basic functionalities to user classes: insertion methods, to store objects into the container, and extraction methods, to retrieve objects out of a container. During OFG construction, these functionalities are abstracted by the two methods insert and extract. Their effects on the object flows are accounted for by replacing their invocations with assignment statements, equivalent to the method calls from the point of view of the data flows (see Chapter 2, Section 2.3).

Given the OFG produced by taking container flows into account, a specialization of the flow propagation algorithm to determine the type of the contained objects is obtained by defining gen and kill sets of each OFG node. Two different kinds of flow information can be used to infer the type of contained objects: the type of inserted objects can be obtained from their allocation, while the type of extracted objects can be obtained from their type coercion. For example, (abstract) statements such as can be exploited to estimate the contained object type as that of the allocation, while the coerced type in a statement such as where ”(A)” is the syntax for type coercion, can be exploited to associate type A to container

Correspondingly, two executions of the flow propagation algorithm have to be conducted, with two different sets of gen and kill sets associated with OFG nodes. Moreover, the direction of flow propagation changes when insertion vs. extraction information is used.

Fig. 3.4. Flow propagation specialization to determine the type of objects stored inside weakly typed containers, accounting for object insertions and based on allocation information. Forward propagation.

54 3 Class Diagram

Fig. 3.4 provides the gen and kill sets to use when the contained object type is estimated from insertion information. Object allocation statements provide the precise type of allocated objects. This information is propagated from object constructors to the containers, according to the fixpoint algorithm described in Chapter 2. The direction of propagation is forward, so that incoming information of each node is obtained from the predecessors. It can be noted that the same flow analysis specialization has been used to refine associations when declared types are superclasses of actual types or interfaces (see Fig. 3.2).

Fig. 3.5. Flow propagation specialization to determine the type of objects stored inside weakly typed containers, accounting for object extractions and based on type coercion. Backward propagation.

Fig. 3.5 gives gen and kill sets for the second execution of the flow propagation algorithm, exploiting extraction information. The abstract syntax given in Chapter 2 has been enriched with a type coercion operator, “()”. Each time a type coercion occurs on a program location or on the value returned by a method, the related type information is generated at the corresponding OFG node. In order to reach the container from which an object has been extracted, this type information has to be propagated backward in the OFG, that is, from the successors of a node to the node itself. In fact, type coercion occurs after an object has flown out of a container up to a given location. Such data flow has to be reversed to propagate the coerced type back to the container.

After the two flow propagations are complete, the two respective out sets of each container location hold the contained object types computed by the two specializations described above. The union of these two out sets gives the final results, i.e., the set of types estimated for the contained objects. If several classes from an inheritance subtree are included in the out set of a container, it may be appropriate to replace them with the LCA, thus reducing the number of connections among entities in the class diagram, and improving its readability.

3.3 Containers

55

eLib example

Let us consider the eLib program in Appendix A, and in particular, let us focus on methods addUser (line 8) and searchDocumentByTitle (line 90) of class Library. Their abstract statements are respectively:

where the assignment has been obtained by transforming the insertion method put invoked on Library.users at line 10, and:

where the first and second assignments are the result of transforming invocations of extraction methods (iterator at line 92 and next at line 94, resp.), while the fourth assignment results from the conversion of an insertion (invocation of add on docsFound at line 96). For completeness, let us consider a code fragment from class Main (Appendix B), that performs a user insertion into the library:

The abstract statements of this code fragment are:

Fig. 3.6 shows (a portion of) the OFG associated with the abstract statements above. Sets gen1 and gen2 have been obtained according to the rules in Fig. 3.4 and 3.5 respectively. Thus, gen1 is used during the first, forward propagation, while gen2 is used in the second, backward flow propagation. The cumulative result is: