Springer Science - 2005 - Reverse Engineering of Object Orie.pdf

7

Package Diagram

The complexity involved in managing and describing large software systems can be addressed by partitioning the overall collection of constituent entities into smaller, more manageable units. Packages offer a general grouping mechanism that can be used to decompose a given system into sub-systems and to provide a separate description for each of them.

Packages represented in the package diagram show the decomposition of a given system into cohesive units that are loosely coupled with each other. Each package can in turn be decomposed into sub-packages or it can contain the final, atomic entities, typically consisting of the classes and of their mutual relationships.

The dependency relationships shown in a package diagram represent the usage of resources available from other packages. For example, if a method of a class contained in a package calls a method of a class that belongs to a different package, a dependency relationship exists between the two packages.
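The rule above can be illustrated with a minimal sketch, assuming that the class-to-package assignment and the class-level call edges have already been extracted. The function name `package_dependencies` and all class and package names below are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def package_dependencies(class_to_package, class_calls):
    """Lift class-level call edges to package-level dependency edges."""
    deps = defaultdict(set)
    for caller, callee in class_calls:
        src = class_to_package[caller]
        dst = class_to_package[callee]
        # A call between classes of the same package creates no
        # dependency between packages.
        if src != dst:
            deps[src].add(dst)
    return {pkg: sorted(targets) for pkg, targets in deps.items()}


# Hypothetical input data, not taken from any real system.
class_to_package = {
    "Main": "ui",
    "Parser": "core",
    "Lexer": "core",
    "Report": "model",
}
class_calls = [
    ("Main", "Parser"),    # ui -> core
    ("Parser", "Lexer"),   # core -> core, ignored
    ("Lexer", "Report"),   # core -> model
]

dependencies = package_dependencies(class_to_package, class_calls)
print(dependencies)
```

Other kinds of reference (for example, the use of a type declared in another package) could be handled in the same way by adding the corresponding class-level edges to the input.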

Most Object Oriented programming languages provide an explicit construct to define packages. Thus, their recovery from the source code requires only a simple syntactic analysis. Dependencies among packages are also quite easy to retrieve, since they correspond to references to resources owned by other packages (method calls, usage of types, etc.).

A more interesting and challenging situation is one in which no package structure was defined for a given software system, but its evolution over time has made one necessary (for example, because the system has grown in size). Code analysis techniques can be employed to determine appropriate groupings of entities to be placed in the same package. In this scenario, packages are recovered from a system that does not possess any package structure at all. A similar scenario consists of restructuring an existing package organization. If there are reasons to believe that the current decomposition of the system into packages is not satisfactory, code analysis can be used to determine an alternative decomposition, with more cohesive and less coupled packages. Migration to the new package structure can thus be supported by the recovery of an alternative package organization from the code, ignoring the existing one. The exercise of recovering a package structure from the code can also be useful to assess the validity of the current decomposition into packages, by contrasting the recovered structure with the existing one.

The scenarios in which package diagram recovery applies are clarified in Section 7.1. Among the techniques available for the identification of cohesive groups of classes, clustering is considered in detail in Section 7.2, while concept analysis is presented in Section 7.3. Application of these two methods to the eLib program is described in Section 7.4. A discussion of related work concludes the chapter.

7.1 Package Diagram Recovery

The complexity of large software systems can be managed by decomposing the overall system into smaller units, called packages, that are internally highly cohesive and that exhibit low coupling with the other packages in the decomposition. In turn, each package can be decomposed into sub-packages when its complexity requires a finer-grained subdivision. The atomic elements eventually included in the lower-level packages are usually the classes used in each subsystem. Although the decomposition into packages is a general mechanism that can also be used with entities other than classes (e.g., states in state diagrams), in the following we will focus on the most frequently occurring case, in which packages contain groups of classes (or other sub-packages).

Since modern Object Oriented programming languages, such as Java, provide an explicit mechanism for package definition, recovery of the organization of the classes into packages and of the decomposition of packages into sub-packages is straightforward and requires just the ability to parse the source code. The dependency relationship between packages is also easy to retrieve. In fact, once the kinds of relevant dependencies are defined (e.g., method calls between classes in different packages; declaration of variables whose type is defined in another package), their identification in the source code is typically just a matter of performing some simple syntactic or semantic analysis (the latter requiring construction of a symbol table with type information).
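As a rough illustration of the purely syntactic part of this analysis, the following sketch extracts the declared package and the imported packages from the text of a Java compilation unit. It relies on regular expressions rather than a real parser, so it is only an approximation: it ignores static imports and fully qualified names used without an import, and it can be misled by declarations that appear inside comments. The function names and the sample source are hypothetical.

```python
import re

# "package a.b.c;" at the start of a line.
PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
# "import a.b.C;" or "import a.b.*;" (static imports are not matched).
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+?)(\.\*)?\s*;", re.MULTILINE)


def declared_package(source):
    """Return the declared package, or "" for the default package."""
    match = PACKAGE_RE.search(source)
    return match.group(1) if match else ""


def imported_packages(source):
    """Return the set of packages named by the import declarations."""
    packages = set()
    for name, wildcard in IMPORT_RE.findall(source):
        if wildcard:
            # import a.b.*;  -> the qualified name is the package itself
            packages.add(name)
        else:
            # import a.b.C;  -> drop the trailing class name
            packages.add(name.rpartition(".")[0])
    return packages


# Hypothetical compilation unit used only for illustration.
sample = """
package org.example.ui;

import java.util.List;
import org.example.model.*;
import org.example.core.Engine;

public class Main { }
"""

print(declared_package(sample))
print(sorted(imported_packages(sample)))
```

Import declarations give only a coarse approximation of package dependencies. Detecting the actual method calls and type usages mentioned above requires the symbol table with type information.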

Software systems tend to evolve over time in a manner that is difficult to predict in advance, so that their periodic reorganization is often necessary to preserve the original quality of the design. In this context, recovery of the package diagram from the source code cannot be based on the declared packages, since these may reflect the initial decomposition of the system, which no longer corresponds to its actual structure. Techniques for the reverse engineering of highly cohesive and loosely coupled groups of classes play an important role in this situation.

Three possible scenarios in which package diagram recovery should be based on the actual code organization, instead of the declared package structure, are depicted in Fig. 7.1. When classes are not grouped into packages (see Fig. 7.1, (a)) or when the existing package structure is considered inappropriate (see Fig. 7.1, (b)), recovery of the package diagram from the code may provide useful indications on how to (re-)organize classes into packages. In these two cases, either no package structure exists, or the available package structure is ignored. A third situation may occur, in which the existing package structure is evaluated to identify opportunities for improvement (see Fig. 7.1, (c)). In such a scenario, the recovered package diagram is expected to have a large overlap with the existing package organization, and interesting information is provided by the differences (if any). Classes that are assigned to different packages in the two package diagrams (the actual and the recovered one) should be carefully inspected to assess whether they should be reassigned. The resulting organization of the system, in all three cases sketched above, will be characterized by more cohesive packages with fewer dependencies on one another. This is expected to have a positive effect on program understanding and code evolution.

Fig. 7.1. Scenarios of package diagram recovery from code properties.
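The comparison in scenario (c) can be sketched as follows. Since recovered groups carry no package names, this sketch assumes a simple majority rule: each recovered group is matched with the declared package that most of its members belong to, and the remaining members are reported for inspection. The majority rule, the function name `classes_to_inspect` and the data are assumptions made for illustration, not a procedure prescribed by the text.

```python
from collections import Counter


def classes_to_inspect(existing, recovered):
    """Return the classes whose declared package differs from the
    dominant declared package of their recovered group.

    existing:  class name -> declared package
    recovered: class name -> identifier of the recovered group
    """
    groups = {}
    for cls, group in recovered.items():
        groups.setdefault(group, []).append(cls)

    flagged = []
    for members in groups.values():
        counts = Counter(existing[cls] for cls in members)
        # In case of a tie, the package encountered first wins; a real
        # tool would probably report such groups separately.
        dominant, _ = counts.most_common(1)[0]
        flagged.extend(cls for cls in members if existing[cls] != dominant)
    return sorted(flagged)


# Hypothetical data: class C is declared in package "q" but was
# grouped with the classes of package "p" by the recovery step.
existing = {"A": "p", "B": "p", "C": "q", "D": "q", "E": "q"}
recovered = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}

to_inspect = classes_to_inspect(existing, recovered)
print(to_inspect)
```

The output is only a list of candidates. Whether a flagged class should actually be moved remains a decision for the developer.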

Recovery of the package diagram in the three scenarios of Fig. 7.1 is based on suitable code properties. Classes that exhibit commonalities in such properties are grouped in the same package. Several algorithms can be employed to identify such commonalities and to group classes together. The code properties to consider in the recovery process vary accordingly, and may be customized based on the available knowledge about the system. Typical examples of such properties are the types of class attributes and of method variables and parameters, and the invocations of methods that belong to other classes. The fact that a group of classes operate on the same types, or depend on one another through method invocations, hints that they should be grouped into the same package. In the next two sections more details are provided on which properties to consider and how to infer packages (i.e., highly cohesive and loosely coupled groupings of classes) from such properties.
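To make the notion of commonality concrete, the sketch below represents each class by the set of types it uses and compares classes pairwise. The Jaccard coefficient is used here only as one common choice of set similarity; it is an assumption of this example and not necessarily the measure adopted in the following sections. The class names and feature sets are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(a | b)


def pairwise_similarity(features):
    """Similarity for every unordered pair of classes."""
    return {
        (x, y): jaccard(features[x], features[y])
        for x, y in combinations(sorted(features), 2)
    }


# Hypothetical feature sets: the types each class declares or uses.
features = {
    "Invoice": {"Customer", "Money", "Date"},
    "Order": {"Customer", "Money", "Product"},
    "Renderer": {"Canvas", "Font"},
}

similarity = pairwise_similarity(features)
for pair, value in similarity.items():
    print(pair, value)
```

In this example, Invoice and Order share two of the four types they use between them, which suggests placing them in the same package, while Renderer shares none with either.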