PROPERTIES OF A FEATURE IN CODE-ASSETS: AN EXPLORATORY STUDY

Software product line engineering is a paradigm for developing a family of software products from a repository of reusable assets rather than developing each individual product from scratch. In feature-oriented software product line engineering, the common and the variable characteristics of the products are expressed in terms of features. Using the software product line engineering approach, software products are produced en masse by means of two engineering phases: (i) Domain Engineering and (ii) Application Engineering. In the domain engineering phase, reusable assets are developed with variation points where variant features may be bound for each of the diverse products. In the application engineering phase, individual and customized products are developed from the reusable assets. Ideally, the reusable assets should be adaptable with little effort to support additional variations (features) that were not planned beforehand, so that the usage context of the software product line (SPL) can grow as markets expand or as new usage contexts emerge. This paper presents an exploratory study of the properties of features in a code-asset implemented in an object-oriented programming style. In the exploration, we observed that program elements of disparate features formed unions as well as intersections that may affect the modifiability of the code-assets. The implication of this research for practice is that an unstable product line with a tendency towards emerging variations should aim for techniques that limit the number of intersections between program elements of different features. Similarly, the implication of the observation for research is that there should be subsequent investigations using multiple case studies in different software domains and programming styles to improve the understanding of the findings.


INTRODUCTION
Software product line engineering (SPLE) is a paradigm for developing a family of software products from a repository of reusable assets rather than developing each individual product from scratch. The driver of SPLE is pre-planned software reuse within a specific problem area known as a domain. In feature-oriented SPLE [1]-[3], the common and the variable characteristics of the products are expressed in terms of features. Thus, a feature is used as the key abstraction to distinguish between the members of the family. Consequently, the products in the product line are said to share 'common' features and to differ in 'variable' features.

Feature Model
A feature model [2] is a graphical tree structure in which the features of a product line are identified and organized with their types and their relationships. A feature type falls into one of the following broad categories:

• A mandatory type: a common feature that is manifested in all the products of a product line.
• A variable type: a feature of this type is either optional, part of an alternative group (only one feature in the group can be selected), or part of an inclusive OR group (more than one feature can be selected from the group). Unlike an alternative group, the features of an OR group are not mutually exclusive; the group either consists entirely of optional features or requires that at least one of its features be selected.

Figure 1 depicts an example feature model of a car [3]. In the feature model, the Powertrain and Transmission features are mandatory because they must be available in every car. Comfort and Advanced Driver Assistance (ADAS) are optional features because they do not have to be in every vehicle produced. Automatic and Manual are alternative features since the two cannot co-exist in the same car. Air Conditioning and Sunroof are OR group features because a car can have zero or more of these features.
A selection of a valid combination of features is known as a product configuration. For example, the following is a valid configuration of features from Figure 1: Horsepower 150, Powertrain, Automatic, Air conditioning.
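The notion of a valid configuration can be illustrated with a minimal Java sketch. It is not taken from the cited feature model: the class name, the flat encoding of the constraints, and the assumption that Automatic and Manual are the alternatives that realize the mandatory Transmission feature are all hypothetical. Features that the text does not describe (such as Horsepower 150) are left unconstrained.

```java
import java.util.Set;

class FeatureModelSketch {
    // Hypothetical flat encoding of the constraints described in the text.
    static final String MANDATORY_POWERTRAIN = "Powertrain";
    // Assumption: the mandatory Transmission feature is realized by exactly
    // one member of this alternative group.
    static final Set<String> TRANSMISSION_ALTERNATIVES = Set.of("Automatic", "Manual");

    /** Returns true if the selection satisfies the constraints encoded above. */
    static boolean isValid(Set<String> selection) {
        // Mandatory feature: must appear in every configuration.
        if (!selection.contains(MANDATORY_POWERTRAIN)) {
            return false;
        }
        // OR group (Air Conditioning, Sunroof): zero or more members may be
        // selected, so it imposes no check here.
        // Alternative group: exactly one member must be selected.
        long chosen = selection.stream()
                .filter(TRANSMISSION_ALTERNATIVES::contains)
                .count();
        return chosen == 1;
    }

    public static void main(String[] args) {
        Set<String> fromText =
                Set.of("Horsepower 150", "Powertrain", "Automatic", "Air conditioning");
        System.out.println(isValid(fromText));                                     // true
        System.out.println(isValid(Set.of("Powertrain", "Automatic", "Manual")));  // false
    }
}
```

A real feature model is a tree with cross-tree constraints; the flat checks above only mirror the three feature types named in the text.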
Using the software product line engineering approach, software products are produced en masse by means of two engineering phases: (i) Domain Engineering (DE) and (ii) Application Engineering (AE). In the domain engineering phase, reusable assets are developed with variation points where variant features may be bound for each of the diverse products. In the application engineering phase, individual and customized products are developed from the reusable assets [4]. Figure 2 summarizes the relationship between the feature space and the outputs of both domain engineering and application engineering activities. As shown in Figure 2, features in the feature space are mapped to the reusable assets (the deliverable of the domain engineering activities). Similarly, each product or partial product, as the case may be, is derived from the reusable assets based on a feature configuration from the feature space (i.e. a valid selection of features). Lastly, a product-specific artefact may be fed back into the reusable assets.

Justification for the study
Modifiable assets are needed to support additional variations that were not planned beforehand, so that the usage context of the software product line (SPL) can grow as markets expand. For example, Kastner [5] observed that, in the Berkeley Database Engine (BDE) product line [6], features such as Statistics and Transactions were implemented as mandatory. However, such features have to be made optional to make BDE configurable for other usage contexts such as smartcard products, because such products cannot afford the footprint of extraneous features. Therefore, to increase the usage context of BDE, its code-assets must be modified to make the Statistics and Transactions features optional. Failure to inject additional variations may lead to the delivery of a product with extraneous code-assets, which is undesirable for lean-memory applications and may also cause problems, especially if the product is to be integrated into other software products [7].
The objective of this paper is to explore the properties of features in the code-asset that may affect its modifiability when additional variations are injected. Thus, the paper attempts to answer the following research question: RQ. What are the characteristics of a feature in the code-asset that may potentially affect the modifiability of the code-asset when injecting additional variations?
The contribution of this paper is an exploration of the properties of features at the implementation level, together with a description of the properties that may affect the flexibility of reusable assets.

STUDY DESIGN
This section provides explanations of the key design decisions for this study. The section begins with clarifications of the key terms that appear repeatedly in the paper.

Definition 2 (Feature module)
A feature module $f_1$ consists of a set of program elements, $E_{f_1}$, such that $E_{f_1} \subset E$.

Case Study: Oracle Berkeley Database Engine
Oracle Berkeley Database Engine (BDE), Java Edition [6], is an embedded storage engine designed to integrate the application logic and the storage requirements of a software product in a single binary installation. It is intended for applications that target the Java Virtual Machine (JVM) and for which no separate installation of a database server is required. Hence, BDE runs in the same memory address space as the integrated application logic, thereby eliminating the overhead of process switching. Thus, BDE can be embedded in a wide range of applications.
BDE has many features that are extraneous to the requirements of some applications in the domain. For example, features for concurrent access and atomic transactions are not required in an application with single access and simple data requirements. Similarly, by default, BDE gathers statistics on almost every operation of the database, such as tree traversal and memory usage. The implementation of the statistics collection adds a significant footprint that may become a burden to applications that do not require the collection of statistics. Therefore, those features should be optional and the code-asset should reflect their variability.
We selected BDE as a case study because its legacy code-asset requires additional variations to derive customized products for different applications and because it has been used in previous related research [5], [8]. Fig. 3 depicts the partial feature model of BDE, showing 11 features that were hitherto not optional but have to be made optional to increase its usage context. All the features were identified from previous studies [5], [8] and vary in size in terms of lines of code. We limited our selection to 11 features because retrofitting variations into features is tediously repetitive. Table 1 presents a brief explanation of the selected features.
As program elements of a specific feature are often scattered across the code-assets [9], we followed the iterative process depicted in Fig. 4 to explore the properties of features in the code-asset. We started by consulting the BDE documentation [6] to gain reasonable domain knowledge. We then checked the features already identified in the previous studies [5], [8]. We then searched through the code-asset to trace the implementation of the identified features.

Case study exploration
The Statistics feature is one of the features that are spread across the code-assets. With the support of a tool, we marked the program elements of the traced features with annotations. Annotation is the most popular means of enforcing variations in code-assets [10].
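The tool used in the study and its annotation mechanism are not described here, so the following Java sketch only illustrates the general idea of marking program elements with the feature they implement. The `FeatureMark` annotation, the `TreeSketch` class, and all member names are hypothetical and do not come from the BDE code-asset.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.SortedSet;
import java.util.TreeSet;

class AnnotationTraceSketch {
    /** Lists the fields and methods of a class that are marked with the given feature. */
    static SortedSet<String> elementsOf(Class<?> type, String feature) {
        SortedSet<String> names = new TreeSet<>();
        for (Field field : type.getDeclaredFields()) {
            FeatureMark mark = field.getAnnotation(FeatureMark.class);
            if (mark != null && mark.value().equals(feature)) {
                names.add(field.getName());
            }
        }
        for (Method method : type.getDeclaredMethods()) {
            FeatureMark mark = method.getAnnotation(FeatureMark.class);
            if (mark != null && mark.value().equals(feature)) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementsOf(TreeSketch.class, "Statistics"));
    }
}

/** Hypothetical marker that records which feature a program element belongs to. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface FeatureMark {
    String value();
}

/** Hypothetical stand-in for a class whose members belong to different features. */
class TreeSketch {
    private int size;

    @FeatureMark("Statistics")
    private long traversals;

    void insert(int key) {
        size++;
        countTraversal(); // call into the Statistics feature from unmarked code
    }

    @FeatureMark("Statistics")
    void countTraversal() {
        traversals++;
    }
}
```

The call to `countTraversal()` inside `insert` shows why tracing is laborious: the feature also surfaces as individual statements inside elements that belong to other code, which member-level markers like these do not capture.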

Observation
In the process of marking the implementation of the eleven (11) features, we observed that the code-asset is a union of feature modules and that the program elements in the feature modules intersect with each other. Fig. 7 depicts this observation as a Venn diagram. In Fig. 7, each of the feature modules, represented by F1...Fn, internally contains program elements (attributes, declarations, and operations) that belong only to the containing feature as well as program elements that are shared with other features. We refer to the program elements that are not shared with any other feature as exclusive to the given feature and to the shared part as an intersection between features. Thus, we define exclusivity and intersection (semiformally) in Definition 3 and Definition 4 respectively:

Definition 3 (Exclusivity)
A program element $e_{f_1}$ is said to be exclusive to a feature $f_1$ if it satisfies a disjoint union relationship with other features in the code-asset CA, i.e. $e_{f_1} \cap e_{f_2} \cap e_{f_3} \cap \dots \cap e_{f_n} = \emptyset$.

Definition 4 (Intersection)
An intersection between program elements of different features $e_{f_1}$ and $e_{f_2}$ is defined as $e_{f_1} \# e_{f_2}$, which is a modification and/or integration of $e_{f_1}$ and $e_{f_2}$ so that they work correctly together [12].
For the 11 features explored in the study, Table 2 shows the summary of intersecting features. Note that some of the intersecting features in Table 2 are not among the 11 selected features but were picked up while annotating the 11 features. Also note that the intersection relation is symmetric, i.e. if feature A intersects feature B, then feature B also intersects feature A. Fig. 8 illustrates 7 features intersecting the Statistics feature.
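The two notions can be made concrete with a small Java sketch. It rests on a simplifying assumption: each feature module is modeled as a set of program-element names, and an intersection is approximated as plain set overlap, which is coarser than the integration code that Definition 4 describes. The feature "FeatureX" and all element names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FeatureOverlapSketch {
    /** Elements of the given feature module that appear in no other feature module. */
    static Set<String> exclusiveTo(String feature, Map<String, Set<String>> modules) {
        Set<String> exclusive = new TreeSet<>(modules.get(feature));
        for (Map.Entry<String, Set<String>> other : modules.entrySet()) {
            if (!other.getKey().equals(feature)) {
                exclusive.removeAll(other.getValue());
            }
        }
        return exclusive;
    }

    /** Elements that two feature modules have in common (set overlap). */
    static Set<String> sharedBetween(String a, String b, Map<String, Set<String>> modules) {
        Set<String> shared = new TreeSet<>(modules.get(a));
        shared.retainAll(modules.get(b));
        return shared;
    }

    public static void main(String[] args) {
        // Hypothetical mapping from features to the program elements they touch.
        Map<String, Set<String>> modules = Map.of(
                "Statistics", Set.of("StatsCounter", "Tree.search"),
                "Transactions", Set.of("TxnLog", "Tree.search", "Cursor.commit"),
                "FeatureX", Set.of("XHelper", "Cursor.commit"));

        System.out.println(exclusiveTo("Statistics", modules));                   // [StatsCounter]
        System.out.println(sharedBetween("Statistics", "Transactions", modules)); // [Tree.search]
    }
}
```

Because set overlap does not depend on argument order, `sharedBetween(a, b, ...)` and `sharedBetween(b, a, ...)` return the same elements, matching the symmetry noted for Table 2.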

Discussion
The consequence of this observation is that injecting additional variations requires decoupling features in the code-asset: (i) the program elements of the feature to be made optional have to be traced in the code-asset; (ii) both the exclusive and the intersecting program elements have to be separated; and (iii) the separated program elements should be added to a product configuration only if the variant feature is selected.
We found this observation to hold within the context of object-oriented development. Using various design patterns in OOP [13], a designer should be able to encapsulate features to some extent. For example, Keypad, Fingerprint, and Remote control, as OR group subfeatures of an Access Control feature in a smart home, can be implemented using the Strategy pattern. In that case, a programmer implements each feature as a separate strategy for accessing the smart home, and each strategy contains only program elements exclusive to one of the features. At the point of invocation, however, a request to access the smart home has to be resolved to one of the concrete strategies, and that is the point of intersection.
Similarly, when class inheritance, as a form of polymorphism, is used to implement an optional feature as a subclass of one of the mandatory classes, the subclass is the program element exclusive to the optional feature. In that case, the intersection points are the places where references are made to the subclass.
Thus, thinking in terms of exclusiveness and intersection can be beneficial. Intuitively, limiting the number of intersections may be better for a product line that is not stable and in which new usage contexts tend to emerge. Of particular note, the Factory pattern encapsulates a specific form of intersection: a point at which one of the variant objects is created.
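The smart-home example can be sketched in Java as follows. This is a minimal illustration under stated assumptions rather than an implementation from the study: the interface, the class names, and the hard-coded credentials are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class AccessControlSketch {
    /** Common interface of all access strategies. */
    interface AccessStrategy {
        boolean grants(String credential);
    }

    // Each strategy holds only program elements exclusive to its feature.
    static class Keypad implements AccessStrategy {
        public boolean grants(String credential) {
            return "1234".equals(credential); // hypothetical PIN
        }
    }

    static class Fingerprint implements AccessStrategy {
        public boolean grants(String credential) {
            return "owner-print".equals(credential); // hypothetical template id
        }
    }

    static class RemoteControl implements AccessStrategy {
        public boolean grants(String credential) {
            return "fob-7".equals(credential); // hypothetical key fob id
        }
    }

    static class SmartHome {
        private final Map<String, AccessStrategy> bound = new HashMap<>();

        /** Binds a selected feature to its strategy when a product is configured. */
        void bind(String feature, AccessStrategy strategy) {
            bound.put(feature, strategy);
        }

        /** Intersection point: the request is resolved to one concrete strategy. */
        boolean requestAccess(String feature, String credential) {
            AccessStrategy strategy = bound.get(feature);
            return strategy != null && strategy.grants(credential);
        }
    }

    public static void main(String[] args) {
        SmartHome home = new SmartHome();
        home.bind("Keypad", new Keypad());
        home.bind("Fingerprint", new Fingerprint());

        System.out.println(home.requestAccess("Keypad", "1234"));          // true
        System.out.println(home.requestAccess("Remote control", "fob-7")); // false: not bound
    }
}
```

In this sketch, adding or removing a subfeature touches only its own strategy class and the `bind` calls, while `requestAccess` remains the single place where the features meet.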
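The inheritance case and the role of a factory can be sketched together in Java. The sketch assumes a hypothetical mandatory `Store` class and a hypothetical optional Statistics feature implemented as the subclass `CountingStore`; none of these names come from BDE.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class FactoryIntersectionSketch {
    /** Mandatory class: present in every product. */
    static class Store {
        private final Map<String, String> data = new HashMap<>();

        void put(String key, String value) {
            data.put(key, value);
        }

        String get(String key) {
            return data.get(key);
        }
    }

    /** Optional feature as a subclass: everything in here is exclusive to Statistics. */
    static class CountingStore extends Store {
        private int writes;

        @Override
        void put(String key, String value) {
            writes++;
            super.put(key, value);
        }

        int writes() {
            return writes;
        }
    }

    /** Factory method: the only place that refers to the subclass by name. */
    static Store create(Set<String> selectedFeatures) {
        return selectedFeatures.contains("Statistics") ? new CountingStore() : new Store();
    }

    public static void main(String[] args) {
        Store plain = create(Set.of());
        Store counting = create(Set.of("Statistics"));
        counting.put("k", "v");

        System.out.println(plain instanceof CountingStore);     // false
        System.out.println(((CountingStore) counting).writes()); // 1
    }
}
```

Client code that works against `Store` never names `CountingStore`, so the reference to the optional feature is confined to `create`. Code that reads the counter, as `main` does here, is a further intersection point.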

RELATED WORK
In this section, we present similar studies from the literature to highlight the novelty of our approach. One approach, known as the extractive approach, takes existing software assets and transforms them into the reusable assets of a software product line; there have been positive arguments and studies for it [14]-[16]. For example, [16] proposed a semi-automated, stepwise process to identify, map, and visualize features in the code-assets of a legacy system that is being transformed into a software product line following the extractive approach. However, the focus was mostly on the benefits of the approach in easing the transition from single software product development to a software product line.
There are studies aimed at visualizing features in the code-assets [16], [17]. For example, [18] developed a tool for interactive visualization of features. It provided easy-to-use support for visualizing and locating features in the code-assets. The tool can also be used to answer questions such as which feature has the most lines of code or which feature is the most widely spread across the code-assets. However, the aim was to aid feature comprehension, not to identify properties of features that may affect modifiability. Our description of the properties of features in the code-assets, as well as the visualization, is meant to give insight into the modifiability of the product line code-assets when additional variations are introduced.
Tracing features in the code-assets is one of the common activities in software product line engineering [19], and tools [20]-[23] have been developed to facilitate the activity. Nonetheless, less attention has been paid to the actual properties of features that have the potential to affect the modifiability of the code-asset. Rather, studies on the exploration of features in the code-assets have focused on gaining insights that improve feature traceability [24], on the influence of varying project scope on feature traceability, and on the effect of feature traceability on program comprehension [25].
May et al. [26] conducted an exploration of features in the code-assets of a micro-service-based implementation of a webshop. They went a step further by re-engineering the code-assets to produce a product line architecture, introducing variations using the principles of delta-oriented software product lines [27]. In another exploratory study, Murphy et al. [28] investigated the flexibility of three language-based techniques when used to untangle features from a code-asset. All three techniques, Hyper/J, AspectJ [29], and the authors' own technique, were designed for advanced separation of concerns. The authors qualitatively characterized the effect the different techniques had on the structure of the code-asset and also characterized how to restructure the code-asset to untangle features with each of the techniques. Nevertheless, these studies did not describe the properties of features in the code-assets that may affect the modifiability of the code-assets.
In summary, this study attempted to provide insights into the properties of features that may affect the modifiability of code-assets, to pave the way for research on comparable software product line implementation techniques that can be used to decouple features in the code-assets in order to introduce new variations.

FUTURE WORK AND CONCLUSION
In future research, the properties of features in other domains and programming styles will be explored, and newer software product line implementation techniques will be compared with respect to the modifiability of existing code-assets when introducing variations that were not planned beforehand.
In this paper, we explored the properties of features in the code-asset using Berkeley Database Engine (Java Edition) as a case study. In the exploration, we observed that program elements of disparate features formed unions as well as intersections that may affect the modifiability of the code-assets. The implication of this research for practice is that an unstable product line with a tendency towards emerging variations should aim for a programming style that limits the number of intersections between program elements of different features. Similarly, the implication of the observation for research is that there should be subsequent investigations using multiple case studies in different software domains and programming styles in order to improve the understanding of the findings.