Springer Science - 2005 - Reverse Engineering of Object Orie.pdf
3.5 Related Work

Fig. 3.9. Class diagram for the eLib program including dependency relationships.

The use of points-to analysis to improve the accuracy of the interclass relationships is described in [56], where the type of pointed-to objects is used to replace the declared type. The results obtained by points-to analysis are comparable to those obtained by the OFG-based algorithm to handle inheritance, given in Section 3.2. Both approaches exploit the object type used in allocation points to infer the actual type of referenced objects. As discussed in [56], this represents a substantial improvement over the Class Hierarchy Analysis (CHA) [17], which determines all direct and transitive subclasses of the declared type as possibly referenced by a given program location. CHA becomes particularly imprecise in the presence of interfaces as declared types. In fact, it is quite typical that a large number of classes implement general-purpose interfaces (such as the Comparable interface). If all of them are accounted for as possible targets of interclass relationships, a completely unusable class diagram is derived from the code. In [56], the output of two points-to analysis algorithms, described respectively in [68] and [57], is used to determine the possibly pointed-to locations for each variable in the given program. The experimental data show that such information is crucial to refine the interclass relationships associated with dynamic binding.
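The difference between CHA and an allocation-based inference can be illustrated with a minimal sketch. It is not the OFG algorithm of Section 3.2, nor either of the points-to algorithms in [68] and [57]; it is a simplified, flow-insensitive propagation written only for illustration. The class names, the variable names, and the two-statement program model (allocations and plain assignments) are all assumptions.

```python
from collections import defaultdict

# Hypothetical hierarchy: type -> its direct supertypes (classes or interfaces).
SUPERTYPES = {
    "User": ["Comparable"],
    "Document": ["Comparable"],
    "Book": ["Document"],
    "Journal": ["Document"],
}

def cha_targets(declared_type, supertypes):
    """CHA: the declared type plus all its direct and transitive subtypes."""
    children = defaultdict(set)
    for child, parents in supertypes.items():
        for parent in parents:
            children[parent].add(child)
    result, worklist = set(), [declared_type]
    while worklist:
        current = worklist.pop()
        if current not in result:
            result.add(current)
            worklist.extend(children[current])
    return result

def allocation_targets(allocations, assignments):
    """Propagate allocated types along assignments until nothing changes.

    allocations: (variable, type) pairs modelling `variable = new type()`.
    assignments: (lhs, rhs) pairs modelling `lhs = rhs`.
    """
    types = defaultdict(set)
    for variable, allocated in allocations:
        types[variable].add(allocated)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in assignments:
            missing = types[rhs] - types[lhs]
            if missing:
                types[lhs] |= missing
                changed = True
    return dict(types)

# `item` is declared as Comparable but only ever receives a Book.
cha = cha_targets("Comparable", SUPERTYPES)
inferred = allocation_targets([("b", "Book")], [("item", "b")])
print(sorted(cha))               # every type implementing Comparable
print(sorted(inferred["item"]))  # only the allocated type
```

Under these assumptions, CHA would draw a relationship toward every implementor of the interface, whereas the allocation-based result keeps only the type actually created and assigned.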

In [18], container types are analyzed with the purpose of moving to a hypothetical strongly typed version of the Java containers. A set of constraints is derived on the type parameters that are introduced for each potentially generic class (e.g., containers). A templated instance of the original class which respects such constraints can safely replace the weakly typed one, thus making most of the downcasts unnecessary and allowing for a deeper static check of the code. Although based on a different algorithm, this approach is comparable to that described in Section 3.3. In fact, more accurate information about the type of objects inserted into containers is inferred from type-related statements in the code under analysis.
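The idea of inferring the element type of a weakly typed container from the statements that insert into it can be sketched as follows. This is neither the constraint-based algorithm of [18] nor the algorithm of Section 3.3; it is a simplified fixpoint propagation over a hypothetical three-statement model (`new`, `copy`, `insert`), and all variable and class names are assumptions.

```python
from collections import defaultdict

def infer_element_types(statements):
    """Map each container variable to the allocated types inserted into it.

    Each statement is a (kind, target, source) triple:
      ("new", v, T)      models  v = new T()
      ("copy", v, w)     models  v = w
      ("insert", c, v)   models  c.add(v)
    """
    var_types = defaultdict(set)  # variable -> possible allocated types
    contents = defaultdict(set)   # container -> possible element types
    changed = True
    while changed:
        changed = False
        for kind, target, source in statements:
            if kind == "new":
                dest, incoming = var_types[target], {source}
            elif kind == "copy":
                dest, incoming = var_types[target], var_types[source]
            elif kind == "insert":
                dest, incoming = contents[target], var_types[source]
            else:
                raise ValueError(f"unknown statement kind: {kind}")
            if not incoming <= dest:
                dest |= incoming  # in-place update of the stored set
                changed = True
    return dict(contents)

STATEMENTS = [
    ("new", "b", "Book"),
    ("new", "j", "Journal"),
    ("copy", "d", "b"),
    ("insert", "documents", "d"),
    ("insert", "documents", "j"),
    ("new", "u", "User"),
    ("insert", "users", "u"),
]

element_types = infer_element_types(STATEMENTS)
print({name: sorted(types) for name, types in element_types.items()})
```

In this sketch the declared element type (for example Object) plays no role: the associations recovered for `documents` and `users` come only from what is actually inserted.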

An empirical study comparing the results obtained with and without container analysis is described in [87]. The class diagrams for the subsystems in a large C++ code base were reverse engineered. The number of associations missed in the absence of container analysis turned out to be high, and the visual inspection of the related class diagrams revealed that container analysis plays a fundamental role in reverse engineering, when weakly typed container libraries are used.

3.5.1 Object identification in procedural code

In this chapter, reverse engineering of the class diagram has been presented with reference to Object Oriented programs. A lot of work [12, 13, 51, 75, 80, 88, 102] has been conducted within the reverse engineering research community, aimed at identifying abstract data types in procedural code. Thus, classes are tentatively reverse engineered from procedural (instead of Object Oriented) code.

The purpose of the analyses considered in these works is supporting the migration from procedural to Object Oriented programming. It was recognized that this migration process cannot be fully automated and the results available in the literature provide local approaches which help in some cases, but not in others. If a software system was built around data types in the first place, it is possible to identify and extract them as objects. If not, it is hard to retrofit objects into the system and, until now, no one has come up with a general, automated solution for transforming procedural systems into Object Oriented ones. In such a case, the output of reverse engineering may be only the starting point for a highly human-intensive reengineering activity.

In [51] the main methods for class identification are classified as global-based or type-based, depending on whether functions are clustered around globally accessible objects or around formal parameter and return types. A new identification method, based on the concept of receiver parameter type, is also proposed. The approach presented in [12], which considers accesses to global variables, uses an internal connectivity index to decide which functions should be clustered around the recognized class. Such a method is extended in [13] to include type-based relations and it is combined with the strong direct dominance tree to obtain a more refined result. The recovery technique described in [102] builds a graph showing the references of the procedures to the internal fields of structures. Accesses to global variables drive the recognition of classes.
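The distinction between global-based and type-based clustering can be made concrete with a small sketch. It only groups functions by a chosen kind of fact; it does not implement the receiver parameter type method of [51], the internal connectivity index of [12], or the dominance-tree refinement of [13]. The function names, globals, and types below are hypothetical.

```python
from collections import defaultdict

# Hypothetical summary of a procedural program: for each function, the global
# variables it accesses and the user-defined types appearing in its signature.
FUNCTIONS = {
    "push":      {"globals": {"stack_top"},  "types": {"Stack"}},
    "pop":       {"globals": {"stack_top"},  "types": {"Stack"}},
    "enqueue":   {"globals": {"queue_tail"}, "types": {"Queue"}},
    "dequeue":   {"globals": {"queue_head"}, "types": {"Queue"}},
    "log_error": {"globals": set(),          "types": set()},
}

def cluster(functions, key):
    """Group function names by each item found under `key`.

    key="globals" gives a global-based clustering, key="types" a type-based one.
    """
    groups = defaultdict(set)
    for name, facts in functions.items():
        for item in facts[key]:
            groups[item].add(name)
    return dict(groups)

global_based = cluster(FUNCTIONS, "globals")
type_based = cluster(FUNCTIONS, "types")
print({k: sorted(v) for k, v in global_based.items()})
print({k: sorted(v) for k, v in type_based.items()})
```

On this toy input the two criteria disagree: the global-based view splits the queue operations across two globals, while the type-based view gathers them around a single candidate class. Functions such as `log_error`, which touch neither globals nor user-defined types, are left unassigned by both.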

In [27] the star diagram is proposed as a support to help programmers restructure programs by improving the encapsulation of abstract data types. Another decomposing and restructuring system is described in [58]. Both of them provide sophisticated interaction means to assist the user in the process of analyzing and restructuring a program.

Several works [50, 75, 80, 88] on identification and remodularization of abstract data types are based on the output produced by concept analysis [25]. The relation between procedures and global variables is analyzed by means of concept analysis in [50]. The resulting lattice is used to identify module candidates. Concept analysis is used in [75] to identify modules, by considering both positive and negative information about the types of the function arguments and of the return value. An example of how to identify class candidates from a C implementation of two tangled data structures is provided in [75]. Concept analysis succeeds in separating them into two distinct classes. In [88], encapsulation around dynamically allocated memory locations and module restructuring are considered. Points-to analysis is used to determine dynamic memory accesses, while concept analysis permits grouping functions around the accessed dynamic locations. Concept analysis is exploited in [80] to reengineer class hierarchies. A context describing the usage of a class hierarchy is the starting point for the construction of a concept lattice, from which redesign possibilities are derived.
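As a rough illustration of how concept analysis yields module candidates, the sketch below computes all formal concepts of a small context relating procedures to the global variables they access. It uses the standard fact that concept intents are exactly the intersections of the objects' attribute sets (together with the full attribute set); it does not reproduce the specific techniques of [50], [75], [80], or [88], and the procedure and variable names are hypothetical.

```python
def concepts(context):
    """Return all (extent, intent) pairs for a context: object -> attribute set."""
    all_attributes = frozenset().union(*context.values())
    # Start from the full attribute set and close under intersection with
    # each object's attributes; the result is the set of all concept intents.
    intents = {all_attributes}
    for attrs in context.values():
        intents |= {intent & attrs for intent in intents}
    result = []
    for intent in intents:
        # The extent is every object that has all attributes of the intent.
        extent = frozenset(o for o, attrs in context.items() if intent <= attrs)
        result.append((extent, intent))
    return result

# Hypothetical context: procedures and the global variables they access.
CONTEXT = {
    "push":    {"stack"},
    "pop":     {"stack"},
    "enqueue": {"queue"},
    "dequeue": {"queue"},
    "init":    {"stack", "queue"},
}

lattice = concepts(CONTEXT)
for extent, intent in sorted(lattice, key=lambda c: (len(c[1]), sorted(c[1]))):
    print(sorted(intent), "->", sorted(extent))
```

In this toy context the concepts whose intents are `{stack}` and `{queue}` are the natural module candidates; `init` appears in both extents, which is the kind of tangling that an analyst would then have to resolve by hand.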
