Springer Science - 2005 - Reverse Engineering of Object Orie.pdf

148 7 Package Diagram

alternative organizations of the packages into cohesive units, that occasionally are allowed to violate encapsulation.

It might be the case that no meaningful concept partition can be determined from the initial context, although each concept, taken in isolation, represents a meaningful grouping of classes into a package. In this situation, the package organization indicated by the concepts can still be taken into account by relaxing the constraint on the concept partitions. One way to achieve this, described in [88], consists of determining concept sub-partitions, instead of concept partitions, that can eventually be extended to a full partition of the set of classes under analysis.

7.4 The eLib Program

The eLib program is a small application consisting of just 8 classes. Thus, it makes no sense to organize them into packages. However, the exercise of applying the package diagram recovery techniques to the eLib program may be useful to understand how the different techniques work in practice and how their output can be interpreted.

Table 7.2 summarizes the results obtained by the agglomerative clustering method (first two lines, labeled Agglom.), by the modularity optimization method (lines 3 and 4, labeled Mod. opt.), and by concept analysis (last line, labeled Concept). The second column contains the kind of features or relationships that have been taken into account (a detailed explanation follows). The last column gives the resulting package diagram, expressed as a partition of the set of classes in the program.

In the application of the agglomerative clustering algorithm, two kinds of feature vectors have been used. In the first case, each entry in the feature vector represents one of the user-defined types (i.e., each of the 8 classes in the program). The associated value counts the number of references to such a type in the declarations of class attributes, method parameters, local variables or return values. Table 7.3 shows the feature vectors based on the type information. The types in each position of the vectors read as follows:

It should be noted that the feature vectors for classes Book and InternalUser are empty. This indicates that the chosen features do not characterize these two classes at all, and consequently they do not permit grouping these two classes with any cluster.

Fig. 7.7. Clustering hierarchy for the eLib program (clustering method AgglomTypes).


Fig. 7.7 shows the clustering hierarchy produced by the agglomerative algorithm applied to the feature vectors in Table 7.3. The (manually) selected cut point is indicated by a dashed line. The results shown in the first line of Table 7.2 correspond to this cut point. Classes User, Document, Library, Loan are clustered together. So are Journal, TechnicalReport, while Book and InternalUser remain isolated, due to their empty description.
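The clustering and cut-point selection described above can be sketched as follows. This is a minimal illustration, not the implementation used for Table 7.2: the similarity measure (cosine) and the linkage rule (complete linkage) are assumptions, the class names A to E and their feature vectors are hypothetical, and the cut point is expressed as a similarity threshold. Class E has an empty (all-zero) vector, so its similarity to every other class is zero and it stays isolated, as happens to Book and InternalUser.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def agglomerate(vectors, threshold):
    """Merge the two most similar clusters until similarity drops below threshold.

    Complete linkage (an assumption): the similarity of two clusters is the
    minimum similarity over all pairs of their members.
    """
    clusters = [frozenset([name]) for name in vectors]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = min(cosine(vectors[a], vectors[b])
                          for a in clusters[i] for b in clusters[j])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        sim, i, j = best
        if sim < threshold:  # the cut point: stop merging here
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return set(clusters)

# Hypothetical feature vectors (NOT the contents of Table 7.3).
vectors = {
    "A": [2, 1, 0, 0],
    "B": [1, 1, 0, 0],
    "C": [0, 0, 1, 2],
    "D": [0, 0, 2, 1],
    "E": [0, 0, 0, 0],  # empty description: cannot join any cluster
}

partition = agglomerate(vectors, threshold=0.5)
```

With these hypothetical vectors the result is the partition {A, B}, {C, D}, {E}. Raising or lowering the threshold corresponds to moving the dashed cut line up or down the hierarchy.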

The agglomerative clustering algorithm was re-executed on the eLib program, with different feature vectors. The number of invocations of each method is stored in the respective entry of the new feature vectors. Thus, for example, the first component of the feature vectors, associated with method User.getCode, holds the value 1 for classes Document, Library and Loan, since each of them contains one invocation of this method (at lines 220, 10 and 152, respectively), while it holds zero in the feature vectors of all the other classes, which do not call method getCode of class User.
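A minimal sketch of how such call-based feature vectors could be assembled is given below. The function name call_feature_vectors and the input format (a list of caller class and callee method pairs) are assumptions made for illustration. Only the User.getCode column is filled in, using the three invocations quoted above; the complete vectors would have one column per method of the program.

```python
from collections import Counter

# The 8 classes of the eLib program, as named in this section.
CLASSES = ["User", "InternalUser", "Document", "Book",
           "Journal", "TechnicalReport", "Library", "Loan"]

# One record per call site: (calling class, invoked method).
# Only the three invocations of User.getCode mentioned in the text are listed.
calls = [
    ("Document", "User.getCode"),
    ("Library", "User.getCode"),
    ("Loan", "User.getCode"),
]

def call_feature_vectors(classes, methods, call_records):
    """Entry k of a class's vector = number of calls it makes to methods[k]."""
    counts = Counter(call_records)
    return {c: [counts[(c, m)] for m in methods] for c in classes}

vectors = call_feature_vectors(CLASSES, ["User.getCode"], calls)
print(vectors["Loan"])  # [1]
print(vectors["Book"])  # [0]
```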

The class partition obtained by cutting the clustering hierarchy associated with these feature vectors is reported in the second line of Table 7.2. Now the two classes Book and InternalUser have a non-empty description, so that they can be properly clustered. The resulting package diagram is the same as the one produced with the feature vectors based on the declared variable types, except for class Book, which is aggregated with {Journal, TechnicalReport}.

Fig. 7.8. Inter-class relationships considered in the first application of the modularity optimization method.

The clustering method that determines the partition optimizing the Modularity Quality (MQ) measure depends on the inter-class relationships being considered. Two kinds of such relationships have been investigated: (1) those depicted in the class diagram reported in Fig. 3.9 (i.e., inheritance, association and dependency); (2) the method calls.

Fig. 7.8 shows the inter-class relationships considered in the first case. Given the low number of classes involved, an exhaustive search was conducted to determine the partition which maximizes MQ. The result is the partition in the third line of Table 7.2 (see also the box in Fig. 7.8). It corresponds to a value of MQ equal to 0.91 and it was obtained by giving the same weight to all kinds of relationships. Actually, giving different weights to different kinds of relationships does not change the result, as long as the ratios between the weights remain small enough (less than 5). Big ratios between the weights lead to an optimal MQ reached when all classes are in just one cluster.
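An exhaustive search of this kind can be sketched as follows. The MQ formula is not restated in this section, so the code assumes the formulation of Mancoridis et al.: the intra-connectivity of cluster i is A_i = mu_i / N_i^2, the inter-connectivity of clusters i and j is E_ij = eps_ij / (2 N_i N_j), and MQ is the mean of the A_i minus the mean of the E_ij (or simply A_1 when there is a single cluster). The graph below is hypothetical and unweighted; it is not the one in Fig. 7.8.

```python
def partitions(items):
    """Yield every partition of a list as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ...or into a new block of its own.
        yield [[first]] + part

def mq(partition, edges):
    """Modularity Quality (assumed Mancoridis et al. definition)."""
    k = len(partition)
    block_of = {n: i for i, block in enumerate(partition) for n in block}
    intra = [0] * k   # mu_i: edges inside cluster i
    inter = {}        # eps_ij: edges between clusters i and j (either direction)
    for src, dst in edges:
        i, j = block_of[src], block_of[dst]
        if i == j:
            intra[i] += 1
        else:
            key = (min(i, j), max(i, j))
            inter[key] = inter.get(key, 0) + 1
    a = sum(intra[i] / len(partition[i]) ** 2 for i in range(k))
    if k == 1:
        return a
    e = sum(n / (2 * len(partition[i]) * len(partition[j]))
            for (i, j), n in inter.items())
    return a / k - e / (k * (k - 1) / 2)

# Hypothetical directed relationships among four classes.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "a"), ("c", "d"), ("d", "c"), ("b", "c")]

best = max(partitions(nodes), key=lambda p: mq(p, edges))
best_blocks = {frozenset(block) for block in best}
```

For this hypothetical graph the search visits all 15 partitions and selects {a, b}, {c, d}. The number of partitions grows very quickly with the number of classes, which is why exhaustive search is only practical for programs as small as eLib.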

Fig. 7.9. Call relationships considered in the second application of the modularity optimization method.

In the second case (call relationships), the optimal partition is associated with MQ = 0.87, and it differs from the previous one only for the position of class Library, which is merged with {User, Document, Loan} (see Table 7.2). Call relationships considered in this second clustering based on MQ are weighted by the number of calls issued within each class. Thus, the call relationship between Loan and User is weighted 3 because there are three invocations of methods belonging to class User, issued from methods of class Loan (resp. at lines 148, 152, 153). Fig. 7.9 shows the weighted call relationships considered in this second application of the modularity optimization method (the only non-singleton cluster is surrounded by a box).
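The weighting scheme can be illustrated with a short sketch. The record format (caller class, callee class, source line) and the function name call_weights are assumptions; the only data used are the three Loan to User call sites quoted above.

```python
from collections import Counter

# Call sites as (caller class, callee class, source line). Only the three
# Loan -> User call sites quoted in the text are listed; a real extraction
# would enumerate every call site in the program.
call_sites = [
    ("Loan", "User", 148),
    ("Loan", "User", 152),
    ("Loan", "User", 153),
]

def call_weights(sites):
    """Weight of each caller -> callee relationship = number of call sites."""
    return Counter((caller, callee) for caller, callee, _line in sites)

weights = call_weights(call_sites)
print(weights[("Loan", "User")])  # 3
```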

Finally, concept analysis was applied to the context that relates the classes to the declared type of attributes, method parameters and local variables (see Table 7.4). Classes Book and InternalUser have been excluded, since they do not declare any variable of a user-defined type (see discussion of the feature vectors in Table 7.3 given above). Two concepts are determined from such a context: