Generic Modularization Case Study
This material accompanies our publication at the Modularity in Modelling Workshop (MOMO). We show how our generic modularization approach is applied, using Ecore as the example.
Meta-Model
In our generic modularization meta-model, a language is the root of the object graph.
A Language consists of a set of Modules, which in turn consist of entities. In the unmodularized case we only have one module containing all entities.
Entities are identified by their name and may have relationships among them (e.g., associations or inheritance).
A Relationship is given a weight that is considered during the search process. The higher the weight, the more important the relationship is, i.e., the closer the respective elements should be grouped together.
A common, abstract superclass NamedElement ensures that all elements in the system have a name.
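To make the structure concrete, the meta-model can be sketched as plain Python data classes. This is only an illustration under stated assumptions: the field names are chosen for this sketch, and treating Relationship as an unnamed element is an assumption, since the text does not say whether it inherits from NamedElement.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # identity-based equality, safe for cyclic object graphs
class NamedElement:
    """Common superclass: every element carries a name."""
    name: str


@dataclass(eq=False)
class Relationship:
    """Weighted link to another entity (assumed unnamed in this sketch)."""
    relationship_end: "Entity"
    weight: int = 1


@dataclass(eq=False)
class Entity(NamedElement):
    relationships: List[Relationship] = field(default_factory=list)


@dataclass(eq=False)
class Module(NamedElement):
    entities: List[Entity] = field(default_factory=list)


@dataclass(eq=False)
class Language(NamedElement):
    modules: List[Module] = field(default_factory=list)


# Unmodularized case: a single module containing all entities.
a = Entity("A")
b = Entity("B")
a.relationships.append(Relationship(b, weight=2))
language = Language("Demo", [Module("Demo", [a, b])])
```

The language object is the root of the graph, so every module, entity, and relationship can be reached from it.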
Ecore 2 Generic Modularization
For example, when we aim to modularize Ecore models, we could provide the following mapping.
Ecore | Generic Modularization Language |
---|---|
EPackage | Module |
EClass | Entity |
EDataType | Entity |
EEnum | Entity |
eSuperType | Relationship (weight=2) |
EAttribute | Relationship (weight=1) |
EReference (not containment) | Relationship (weight=1) |
EReference (containment) | Relationship (weight=3) |
This mapping is realized using an ATL transformation. The complete project containing this transformation can be downloaded from our repository. An excerpt of the transformation is shown below:
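-- Creates the single Language root. The guard restricts the match to one
-- (package, class) pair, so the rule fires exactly once.
rule EPackage_Language {
    from
        p : Ecore!EPackage,
        c : Ecore!EClass (p.refImmediateComposite().oclIsUndefined() and
            Ecore!EPackage.allInstancesFrom('IN').first() = p and
            Ecore!EClass.allInstancesFrom('IN').first() = c)
    to
        l : Module!Language (
            name <- p.name,
            modules <- Ecore!EPackage.allInstancesFrom('IN')
        )
}

-- Called rule: creates a weighted Relationship pointing at the given
-- classifier and returns it.
rule Relationship(target : Ecore!EClassifier, weight : Integer) {
    to
        r : Module!Relationship (
            relationshipEnd <- target,
            weight <- weight
        )
    do { r; }
}

-- Each EEnum becomes an Entity with the same name.
rule Entity_Enum {
    from
        enum : Ecore!EEnum
    to
        e : Module!Entity (
            name <- enum.name
        )
}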
...
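Independently of ATL, the weights from the mapping table can be read as a small lookup. The following Python sketch only restates that table; the function name `relationship_weight` and the string keys are hypothetical and are not part of the actual transformation.

```python
# Ecore classifiers that the mapping table turns into entities.
ENTITY_KINDS = {"EClass", "EDataType", "EEnum"}


def relationship_weight(feature_kind: str, containment: bool = False) -> int:
    """Return the relationship weight from the mapping table.

    feature_kind is one of "eSuperType", "EAttribute", "EReference"
    (hypothetical string keys used only in this sketch).
    """
    if feature_kind == "eSuperType":
        return 2
    if feature_kind == "EAttribute":
        return 1
    if feature_kind == "EReference":
        # Containment references bind elements more tightly than plain ones.
        return 3 if containment else 1
    raise ValueError(f"no mapping defined for {feature_kind!r}")
```

With these weights, a containment reference pulls two entities together three times as strongly as an attribute or a plain reference.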
Rules
Move Entity:
Since at the beginning there is only one module containing all entities of the input model, we create modules to which the entities can be moved.
This rule moves an entity with the name entity from a module with the name source to another module with the name target.
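The effect of the Move Entity rule can be sketched in Python on a simplified representation, where a language is just a dictionary from module names to lists of entity names. This representation and the function `move_entity` are assumptions made for illustration; the actual rule operates on the model graph.

```python
from typing import Dict, List


def move_entity(modules: Dict[str, List[str]],
                entity: str, source: str, target: str) -> None:
    """Move `entity` from module `source` to a different module `target`."""
    if source == target:
        raise ValueError("source and target must be different modules")
    if source not in modules or target not in modules:
        raise ValueError("both modules must already exist")
    if entity not in modules[source]:
        raise ValueError(f"{entity!r} is not contained in {source!r}")
    modules[source].remove(entity)
    modules[target].append(entity)


# One module holds every entity at the start; M2 was created empty.
modules = {"M1": ["A", "B", "C"], "M2": []}
move_entity(modules, "B", "M1", "M2")
print(modules)  # {'M1': ['A', 'C'], 'M2': ['B']}
```

The preconditions mirror the parameters of the rule: a match exists only if the named entity really sits in the source module and the target is a different, existing module.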
Objectives and Constraints
Since modularization is a common and well-studied problem, many metrics have been proposed that indicate the quality of a module. Common metrics include coupling and cohesion. For our example, we follow the Equal-Size Cluster Approach, as described by Praditwong et al. in Software Module Clustering as a Multi-Objective Search Problem. The goal of this approach is to produce equally-sized modules, i.e., modules that have a similar number of entities. Therefore, besides coupling and cohesion, we also aim to maximize the number of modules and to minimize the difference between the minimum and maximum number of entities in a module.
To improve efficiency, we have moved the evaluation of the objectives and constraints into a separate class (MetricsCalculator), which calculates the values in a single pass over the model. The configuration example below shows how this external calculation can be integrated into the fitness evaluation of MOMoT.
Coupling: Coupling refers to the number of external relationships a specific module has, i.e., the sum of inter-relationships with other modules. Typically, low coupling is preferred as this indicates that a group covers separate functionality aspects of a system, improving the maintainability, readability and testability of the overall system. In our case study, not all relationships are considered equal, therefore the coupling is the sum of all inter-relationship weights instead of just the number of all inter-relationships.
Cohesion: Cohesion refers to the relationships within a module, i.e., the sum of intra-relationships in the module. As opposed to coupling, the cohesion within one module should be maximized to ensure that it does not contain parts that are not part of its functionality. In our case study, not all relationships are considered equal, therefore the cohesion is the sum of all intra-relationship weights instead of just the number of all intra-relationships.
Number of Modules: We aim to maximize the number of modules to avoid having all entities in a single large module.
Min-Max Difference: The difference between the module with the lowest number of entities and the module with the highest number of entities should be minimized. By doing so, we aim to produce equally-sized modules as the optimal difference would be 0.
fitness = {
    preprocess = { // use attribute storage for external calculation
        val root = MomotUtil.getRoot(solution.execute, typeof(Language))
        solution.setAttribute("metrics", MetricsCalculator.calculate(root))
    }
    objectives = {
        Coupling : minimize { // java-like syntax
            val metrics = solution.getAttribute("metrics", typeof(LanguageMetrics))
            metrics.coupling
        }
        Cohesion : maximize {
            val metrics = solution.getAttribute("metrics", typeof(LanguageMetrics))
            metrics.cohesion
        }
        NrModules : maximize { // only non-empty modules are counted
            (root as Language).^modules.filter[m | !m.entities.empty].size
        }
        MinMaxDiff : minimize { // size difference between largest and smallest non-empty module
            val sizes = (root as Language).^modules.filter[m | !m.entities.empty].map[m | m.entities.size]
            sizes.max - sizes.min
        }
    }
}
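The four objectives described above can be computed in one pass over the relationships. The Python sketch below is not the actual MetricsCalculator; it uses a simplified representation (module name to entity names, plus weighted relationship triples) and assumes that each inter-module relationship is counted once in the overall coupling.

```python
from typing import Dict, List, Tuple


def calculate_metrics(modules: Dict[str, List[str]],
                      relationships: List[Tuple[str, str, int]]) -> Dict[str, int]:
    """Compute coupling, cohesion, module count and min-max difference.

    relationships holds (source entity, target entity, weight) triples.
    """
    module_of = {e: m for m, entities in modules.items() for e in entities}
    coupling = 0
    cohesion = 0
    for source, target, weight in relationships:
        if module_of[source] == module_of[target]:
            cohesion += weight  # intra-module relationship
        else:
            coupling += weight  # inter-module relationship (counted once)
    # Empty modules are ignored, as in the NrModules and MinMaxDiff objectives.
    sizes = [len(entities) for entities in modules.values() if entities]
    return {
        "coupling": coupling,
        "cohesion": cohesion,
        "nr_modules": len(sizes),
        "min_max_diff": max(sizes) - min(sizes) if sizes else 0,
    }


example_modules = {"M1": ["A", "B"], "M2": ["C"], "M3": []}
example_relationships = [("A", "B", 2), ("A", "C", 1), ("C", "B", 3)]
metrics = calculate_metrics(example_modules, example_relationships)
print(metrics)
# {'coupling': 4, 'cohesion': 2, 'nr_modules': 2, 'min_max_diff': 1}
```

In this made-up example, only the relationship between A and B stays inside a module, so the cohesion is 2, while the two relationships involving C cross a module boundary and add up to a coupling of 4.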
References
- Praditwong K, Harman M, Yao X. Software Module Clustering as a Multi-Objective Search Problem. IEEE Transactions on Software Engineering 2011; 37(2):264–282, doi:10.1109/TSE.2010.26.
- Example project on GitHub
Input Examples
As example input models, we show the modularization of four Ecore-based languages: HTML, JAVA, OCL, and QVT. The initial values for the case studies are given in the following table. The second module for HTML and OCL is retrieved from the PrimitiveTypes package, which contains the data types Integer, Boolean, and String for both case studies, and additionally Double for OCL. The eight modules for the QVT case study are retrieved from the following packages: QVT Template, Imperative OCL, EMOF, QVT Operational, QVT Core, QVT Base, QVT Relation, and Essential OCL.
Metric | HTML | JAVA | OCL | QVT |
---|---|---|---|---|
Entities | 62 | 132 | 77 | 151 |
Coupling | 0 | 0 | 0 | 216 |
Cohesion | 119 | 856 | 304 | 587 |
Modules | 2 | 1 | 2 | 8 |
MinMaxDiff | 56 | 0 | 69 | 38 |
Output
Since each case study yields many solutions, we apply a knee point strategy to select one solution per case study. The values for the modularization results are as follows:
Metric | HTML | JAVA | OCL | QVT |
---|---|---|---|---|
Coupling | 18 | 339 | 42 | 355 |
Cohesion | 101 | 517 | 262 | 448 |
Modules | 5 | 7 | 4 | 8 |
MinMaxDiff | 31 | 2 | 45 | 2 |
Please note that for the QVT case study, where many modules were already available, the solution we have selected seems to focus on the MinMaxDiff objective; its coupling and cohesion values are therefore somewhat worse than in the initial version.
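The text does not state which knee point strategy is used. As one possible illustration, the Python sketch below selects the solution closest to the ideal point after normalizing every objective, a simple heuristic that is sometimes used to approximate a knee point. The function `select_compromise` and the sample front are invented for this sketch and are not results from the case studies.

```python
from typing import List, Sequence


def select_compromise(solutions: Sequence[Sequence[float]],
                      maximize: Sequence[bool]) -> int:
    """Return the index of the solution nearest to the normalized ideal point."""
    if not solutions:
        raise ValueError("at least one solution is required")
    n_obj = len(maximize)
    # Negate maximized objectives so that lower is better everywhere.
    costs: List[List[float]] = [
        [-v if mx else v for v, mx in zip(s, maximize)] for s in solutions
    ]
    lows = [min(c[i] for c in costs) for i in range(n_obj)]
    highs = [max(c[i] for c in costs) for i in range(n_obj)]

    def distance(cost: Sequence[float]) -> float:
        total = 0.0
        for i in range(n_obj):
            span = highs[i] - lows[i]
            normalized = (cost[i] - lows[i]) / span if span else 0.0
            total += normalized ** 2
        return total ** 0.5

    return min(range(len(costs)), key=lambda k: distance(costs[k]))


# Objective order: coupling, cohesion, number of modules, MinMaxDiff.
directions = (False, True, True, False)
front = [(10, 100, 2, 30), (20, 90, 5, 5), (60, 50, 8, 0)]  # invented values
chosen = select_compromise(front, directions)
print(chosen, front[chosen])  # 1 (20, 90, 5, 5)
```

The two extreme solutions each excel in two objectives and do poorly in the other two, so the balanced middle solution is selected.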