
Driven by the continuous expansion of software applications and the growth in component variety and size, the so-called component mismatch problem has become a more severe hurdle for component selection and integration. Although many component repositories and search tools have been proposed, so far there is no satisfactory solution that is simultaneously automated, semantic-based, and precise. This paper presents a novel component repository and associated search tool which implement holistic, semantic-based and adaptation-aware component specification and retrieval. The repository and tool are based on a Multiple-View and Interrelated Component Specification ontology model (MVICS), which integrates smoothly with domain related software system ontologies. The MVICS provides a formally defined and ontology-based architecture to specify components automatically in a spectrum of perspectives. The integration enhances the function and application scope of the MVICS model by bringing domain semantics into component specification and retrieval. The repository and search tool contribute to the current state of the art with four unique features: an ontology-based component specification mechanism, a semantic-based component retrieval method, adaptive component matching, and a comprehensive result component profile. The repository and tool have been widely tested and evaluated via their online version and follow-on survey reports, which concluded that they are effective for avoiding the component mismatch problem and are promising for industrial use.


INTRODUCTION
As a popular software development methodology, Component-Based Development (CBD) has well-documented advantages [5] [16] [19], such as a shortened development life cycle, reduced time-to-market and reduced development costs. However, CBD is still not as widely accepted as it should be, because it is increasingly difficult to find perfectly matching components automatically, in particular for large complex software applications and from a huge collection of diverse and often very similar components. Severe mismatches often exist between the user query and result components due to the lack of consideration of full semantics, the lack of precise understanding of the semantics and the lack of automation support. These problems not only prevent CBD from reaching its full potential, but also hinder the acceptance of many existing component repositories.
To overcome the above problems, several component search tools have employed a variety of technologies to support better component specification and retrieval, among which several recent research projects attempted to use domain models and ontology in component retrieval [1][9] [17][21] [22]. Based on an analysis of the existing semantic-based component specification and retrieval approaches (Section 2), it is clear that the ontology in these approaches has a monolithic structure and too few relationships to deal with the specification and retrieval of modern components, which narrows its application scope. In this paper a novel ontology-based approach is developed and then fully realized for holistic and semantic-based component specification and follow-on automatic and precise component retrieval. As the foundation of the proposed approach, a Multiple-View and Interrelated Component Specification ontology model (MVICS) is developed for component specification and repository building. A formal definition of the MVICS model is first presented in the paper, which ensures the rigour of the model and the high level of automation of the tool. The MVICS model provides an ontology-based architecture to specify components in a spectrum of perspectives; it integrates the knowledge of Component Based Software Engineering (CBSE) and the domain knowledge of application domains, and supports ontology evolution to reflect the continuous developments in CBD and components. Moreover, the integration with a domain related software system ontology model enhances the function and application scope of the MVICS model by bringing more domain semantics into the component specification and retrieval. Based on the MVICS model and the integration, an MVICS based component repository and search tool have been developed and then published as an online system via the project web site.
www.ijctonline.com
The repository and search tool support semantic-based component matching and adaptive component matching; they are fully automated and present a comprehensive profile of the result components instead of a mere relevance number. The result of retrieval includes not only the matching components but also an accurate relevance rating and the unsatisfied discrepancies, which are presented to CBD engineers in the component matching profile.
The rest of the paper is organized as follows: Section 2 discusses related work. Section 3 and Section 4 describe the formal definition of MVICS and its realization in a component repository. Section 5 describes the realization of the MVICS based component search tool. Section 6 presents the case study. Section 7 evaluates the repository and tool via real-life use. Finally, Section 8 presents the conclusion and future work.

RELATED WORK
In Component Based Development (CBD), the mismatch between the requirements and the selected components has been a persistent problem, which is becoming increasingly severe with the emergence of modern software systems and the evolution of CBD itself. At an early stage, software developers modified the code of accessible components to satisfy the requirements. This approach is known as white-box reuse; it is applicable to local repositories and incurs considerable cost in making the changes. Thereafter, most local repositories extended to external or even global markets, and components are now usually reused "as is", i.e. without changes to their code. This type of reuse is known as black-box reuse. As the investigation in [ [18] which affect CBD. Three of them play a crucial role in overcoming the mismatch problem: user query formulation, standard specification of components, and a component search engine with search relevance rating. This paper focuses on the last two issues, which are referred to as component specification and retrieval respectively.
Component search tools were developed on the basis of component description and retrieval approaches. The existing approaches can be classified into two types: traditional and ontology-based. The traditional approaches [11] [12][13] [23] include keyword searching, faceted classification, signature matching and behavioral matching. Two typical examples of component search tools from traditional approaches are Agora [15] and Zaremski [23].
Traditional approaches are not effective for component selection: they suffer from low recall and precision, i.e., poor completeness and accuracy of component matching [17], and they are rather limited in accommodating the semantics of user queries and domain knowledge. To solve this problem, ontology was introduced to help understand the semantics of the components [17] [20][21] [22]. Sugumaran's tool [17] is developed on the basis of his proposed semantic-based component retrieval approach, which enables a user to execute more intelligent queries by using domain ontology and natural language parsing techniques. With the help of domain ontology, the tool generates three layers of optional user query refinement panels, from which the user selects to identify the accurate requirement. In this process, the initial query is augmented or revised by exploiting the additional knowledge from the domain ontology. The specifications of the result components and other related components are shown on the search result panel at the end. This is a good starting point for using domain ontology models to refine the user query and specify the component. However, the tool is not mature enough to be applied to large-scale domain ontologies, because it does not address how to acquire and evaluate the domain ontology. Moreover, the display method for result components is too simple.
From the above literature analysis, we conclude that existing component repositories and search tools lack a sound semantic model as their foundation and therefore do not reach a sufficient level of automation or comprehension of the semantics of components. Consequently, these repositories and tools have serious drawbacks in delivering the desired aims. In detail, these limitations include: i) the ontology models in existing repositories are all domain specific, so a generic computing-oriented overview is missing, often leaving the component search unsystematic and too narrow in scope; ii) the domain ontology in existing component repositories is too simple for a holistic specification of components, in particular large and complex ones; iii) the architecture of the domain ontology is monolithic and has few relationships, which limits its semantic expressiveness; iv) the evolution of the domain ontology is not considered; v) existing ontology-based repositories and search tools presume that the domain ontology in use already exists; the method to acquire such a domain ontology is not mentioned.

THE FOUNDATION FOR SEMANTIC COMPREHENSION AND TOOL AUTOMATION
To overcome the above limitations, a formal and machine recognizable semantic model is needed as the foundation for component specification and follow-on automated component search. The semantic model needs a domain-independent structure and, at the same time, a domain-friendly interface for integration with a set of common application domains. This section defines the foundation for achieving the targeted holistic semantic-based automatic component retrieval, which includes the Multiple View Interrelated Component Specification ontology model (MVICS), its linkage with domain models, and the formal definition of both.

Multiple View Interrelated Component Specification Ontology Model
The MVICS ontology model has a pyramid architecture, which contains four facets: a function model, an intrinsic model, a context model and a meta-relationship model. Each of the four models specifies one perspective of a component, and as a whole they construct a complete spectrum of semantic-based component specification. All four models are ontology-based; they are extracted from an analysis of CBSE knowledge and have extension slots for specific application domains. The first three models (function model, intrinsic model and context model) can be viewed as sub-ontology models, each of which describes one facet of the component specification. The fourth (meta-relationship model) is used to store four types of inter-relationships among the classes of the first three models. These relationships capture more of the semantics among the facets in the model, while the architecture of the model is not thereby changed. A detailed description of the MVICS model can be found in [7], as a supplement to the brief introduction that follows (3.1.1-3.1.4).

Intrinsic Model
The intrinsic model specifies the essential information of a component, which does not have to be relevant to the functionality and applicable context of the component, e.g. its name, type, and applicable software engineering phases. In

Function Model
The

Context Model
The context model is used to represent the reuse context information of the components, including but not limited to the application environment, hardware and software platform, required resources and possible dependency with other components. The top-level classes consist of operating system, component container, hardware requirement and software requirement.

Meta-Relationship Model
The meta-relationship model provides a semantic description of the relationships among the classes in different facets (sub-models) of MVICS. Four types of relationships are identified, namely: Matching Propagation Relationship, Conditional Matching Propagation Relationship, Matching Negation Relationship and Supersedure Relationship.
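The four-facet structure described above can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not the MVICS implementation: the names `Facet`, `OntologyClass`, `MetaRelationship` and `ComponentSpec`, and the example classes "encryption" and "Linux", are hypothetical. The only facts taken from the text are the three sub-models, the four relationship type names, and the statement that meta-relationships connect classes in different facets.

```python
from dataclasses import dataclass, field
from enum import Enum


class Facet(Enum):
    """The three sub-ontology models of MVICS."""
    FUNCTION = "function"
    INTRINSIC = "intrinsic"
    CONTEXT = "context"


class MetaRelation(Enum):
    """The four relationship types named in the meta-relationship model."""
    MATCHING_PROPAGATION = "Matching Propagation Relationship"
    CONDITIONAL_MATCHING_PROPAGATION = "Conditional Matching Propagation Relationship"
    MATCHING_NEGATION = "Matching Negation Relationship"
    SUPERSEDURE = "Supersedure Relationship"


@dataclass(frozen=True)
class OntologyClass:
    name: str
    facet: Facet


@dataclass(frozen=True)
class MetaRelationship:
    kind: MetaRelation
    source: OntologyClass
    target: OntologyClass

    def __post_init__(self):
        # Meta-relationships link classes that live in different facets.
        if self.source.facet == self.target.facet:
            raise ValueError("meta-relationships connect classes in different facets")


@dataclass
class ComponentSpec:
    """A component annotated with classes from each facet (hypothetical layout)."""
    name: str
    classes: dict = field(default_factory=lambda: {f: set() for f in Facet})

    def annotate(self, cls: OntologyClass) -> None:
        self.classes[cls.facet].add(cls.name)


# Hypothetical example classes, for illustration only.
encryption = OntologyClass("encryption", Facet.FUNCTION)
linux = OntologyClass("Linux", Facet.CONTEXT)

spec = ComponentSpec("ExampleCipher")
spec.annotate(encryption)
spec.annotate(linux)
link = MetaRelationship(MetaRelation.MATCHING_PROPAGATION, encryption, linux)
```

Keeping the relationships in a separate structure mirrors the point made in the text that adding inter-relationships does not change the architecture of the three sub-models.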

Linkage between domain related software system ontology and MVICS
The MVICS model is a component specification ontology model based on IT-specific functions, rather than on application domain related functions and other features. Therefore the MVICS model on its own does not support domain oriented component specification or selection. To extend the semantic-based component search into a specific domain, two mechanisms, namely Association Link (AssL) and Aggregation Link (AggL), were developed to integrate the domain related software system ontology into MVICS. With such integration, the domain ontology is linked to MVICS effectively and thus extends the application scope of MVICS without changing the architecture of the model. Because the precision of MVICS component search is calculated based on the weights of the search paths, and the impacts of AssL and AggL on search path identification are equivalent to those of the original relationships in the MVICS or the domain ontology, the calculation of search precision is not affected by the integration. In the function model of MVICS, the class "component domain" is set up to interface the domain ontology with MVICS. The AssL and AggL generated from the integration are stored under the class component domain.
The fundamentals of the AssL and AggL mechanisms, which are used to link two different kinds of domain ontology classes to the MVICS, were first introduced in [8]. In this paper, these two mechanisms are defined formally and further utilised to automate the repository building and component search.

Component adaptation model in MVICS
Component adaptation is a common method to change the functionality and quality features of pre-qualified components [2][3] [10]. The MVICS model provides a new adaptation model, which records the impact of adaptation in the specification and selection of matching components. This unique feature gives more choices for system developers to opt for the suitable and low cost result components. We name those components whose function and QoS may vary via the application of adaptation assets as "adaptive components". In MVICS, the adaptive components are linked to a class via adaptation method/assets when the component is relevant to that class, after adaptation with that method or asset. Such adaptation methods/assets are defined as classes or instances in the adaptation model of MVICS.
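A minimal sketch of how such adaptation-aware links might be represented is given below. It assumes a flat list of links between components and ontology classes, where a link optionally records the adaptation asset through which the component becomes relevant; the names `Link` and `find_components` and all sample data are hypothetical and are not taken from the MVICS tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    """Connects a component to an ontology class (hypothetical structure)."""
    component: str
    ontology_class: str
    # None means the component matches the class as is; otherwise the name of
    # the adaptation method/asset that makes the component relevant to the class.
    via_asset: Optional[str] = None


def find_components(links, ontology_class):
    """Return (direct matches, adaptive matches) for one ontology class."""
    direct = [l.component for l in links
              if l.ontology_class == ontology_class and l.via_asset is None]
    adaptive = [(l.component, l.via_asset) for l in links
                if l.ontology_class == ontology_class and l.via_asset is not None]
    return direct, adaptive


# Hypothetical sample data.
links = [
    Link("CipherA", "encryption"),
    Link("CodecB", "encryption", via_asset="encryption wrapper"),
    Link("CodecB", "compression"),
]
direct, adaptive = find_components(links, "encryption")
```

Here `direct` holds the components that already match the class, while `adaptive` pairs each adaptive component with the asset it needs, which is the extra choice the adaptation model offers to system developers.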

Formal definition of MVICS model
Once the architecture of MVICS is set up and the relevant classes are in place, OWL-DL is used to define the classes, individuals and relationships of the four sub-models of MVICS. With these formal definitions in OWL-DL, automatic semantic extension, automatic ontology validation, and semi-automated ontology evolution can be achieved with the support of an ontology reasoner.

Original MVICS Model Definition
To define the function model, the intrinsic model and the context model in OWL-DL, let $\top$ represent the top class. For the isA relationship, we assert that the range of the relationship is the respective class $C_i^n$; for example, the assertion for $C_{DLL}^2$ defines that the relationship isA links the class DLL to the class component type.
For the isAttributeof relationship, we assert that the range of the relationship is the respective attribute class

Domain Ontology Definition
The method to define the domain ontology in OWL-DL is the same as for the original MVICS ontology, except that the classes located at the same level are not disjoint in the domain ontology. Among these classes, the hasA relationship is used to describe super- and sub-class links between classes in adjacent levels. The relationship is defined as follows: For the hasA relationship, we assert that the range of the relationship is the respective class

Linkage Definition
The linkage between the domain ontology and the MVICS is established by Association Class, Aggregation, AssL and AggL. Because the Association Class is a kind of domain ontology class, the definition of the Association Class is the same as that of the domain ontology classes.
To define AssL in OWL-DL, let
It provides not only the matched adaptation assets/methods, but also their suggested effort. The adaptive search results give the user more options during system development.

Refined User Keywords Parser
Prior to the component search, the refined user keywords are further processed by the Refined User Keywords Parser (RUK Parser). In the first step, the parser parses the keywords based on the sub-models of MVICS and classifies them into three groups: function keywords (domain related keywords belong to the function keywords), intrinsic keywords and context keywords. In the second step, the parser generates several scratch storages for the component search, according to the number of keywords. The scratch storages hold the temporary search data generated during the searching process.
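The two parser steps can be sketched as follows. This is a simplified illustration, not the RUK Parser itself: it assumes that each facet exposes a flat vocabulary of known terms (in the tool this lookup would go through the MVICS ontology), and that one scratch storage is created per classified keyword. The names `FACET_VOCABULARY` and `parse_refined_keywords` and the sample terms are hypothetical.

```python
# Hypothetical per-facet vocabularies standing in for the MVICS sub-models.
FACET_VOCABULARY = {
    "function": {"encryption", "cash management", "user interface"},
    "intrinsic": {"dll", "design phase"},
    "context": {"windows", "linux"},
}


def parse_refined_keywords(keywords, vocabulary=FACET_VOCABULARY):
    """Classify keywords by facet, then create one scratch storage per keyword."""
    groups = {facet: [] for facet in vocabulary}
    unmatched = []
    # Step 1: classify each keyword into the function, intrinsic or context group.
    for keyword in keywords:
        normalised = keyword.strip().lower()
        for facet, terms in vocabulary.items():
            if normalised in terms:
                groups[facet].append(normalised)
                break
        else:
            unmatched.append(normalised)
    # Step 2: allocate an empty scratch storage for every classified keyword;
    # the search would later fill it with temporary search data.
    scratch = {kw: [] for kws in groups.values() for kw in kws}
    return groups, scratch, unmatched


groups, scratch, unmatched = parse_refined_keywords(
    ["Encryption", "DLL", "Linux", "blockchain"]
)
```

Keywords that match no facet are returned separately rather than dropped, so a caller could feed them back into query refinement; that behaviour is a design choice of this sketch, not something the text specifies.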

MVICS Component Search Path
The last type is the adaptive search path, which is obtained during the adaptive component search. In MVICS, adaptive components are linked to a class via adaptation methods/assets if the component becomes relevant to that class after adaptation with that method or asset. The retrieval path is then recorded as an adaptive component search path, in contrast to the first three types, which are obtained during the original MVICS component search.

Adaptive Component Search Scratch Storage and Adaptive Suggestion Processing
After the adaptive component search, a scratch storage is generated for each adaptive search path to keep the path for further precision calculation and adaptation suggestion. The searched keywords and their matched adaptation methods/assets are saved in the corresponding storages. In addition, adaptation suggestion processing gives every matched method/asset an effort suggestion. The effort suggestions are classified into three levels: Strong, Medium and Weak. For each available adaptation method/asset, the effort suggestion information is defined as attributes of the relevant classes, which are stored in the MVICS model. The adaptation suggestion processing retrieves these attributes and saves them in the adaptive search result storage for display to the user.
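The storage and suggestion step might look like the sketch below. It assumes that each adaptation asset carries its effort level as an attribute and that one storage object is kept per adaptive search path; `Effort`, `AdaptationAsset`, `AdaptivePathStorage`, `suggest_efforts` and the sample path are hypothetical names and data, with only the three effort levels taken from the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class Effort(Enum):
    """The three effort suggestion levels."""
    STRONG = "Strong"
    MEDIUM = "Medium"
    WEAK = "Weak"


@dataclass(frozen=True)
class AdaptationAsset:
    name: str
    effort: Effort  # stored as an attribute of the asset's class in the ontology


@dataclass
class AdaptivePathStorage:
    """Scratch storage for one adaptive search path (hypothetical layout)."""
    keyword: str
    path: list
    assets: list = field(default_factory=list)


def suggest_efforts(storages):
    """Collect (keyword, asset name, effort level) rows for display to the user."""
    return [
        (storage.keyword, asset.name, asset.effort.value)
        for storage in storages
        for asset in storage.assets
    ]


# Hypothetical sample data.
storage = AdaptivePathStorage(
    keyword="encryption",
    path=["component domain", "security", "encryption"],
    assets=[AdaptationAsset("encryption wrapper", Effort.MEDIUM)],
)
rows = suggest_efforts([storage])
```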

Result Component Oriented Data Conversion
After the component search, the search results will be processed by the Result Component Oriented Data Converter. The data converter will implement three tasks.

Precision Calculation
Based on the converted data, the match precision of a result component (Pc) is calculated with the following unified formula as mentioned in [7].
The numerators in the formula represent the path weights of the result components that partially match the keywords in each facet, and the denominator represents the path weight of a perfect match. X denotes the fiducial weight: X = 0.5 for a class in the function model, X = 0.3 for a class in the intrinsic model, and X = 0.2 for a class in the context model. The initial value of X for each sub-model is based on our experience, and it is updated dynamically by the dynamic fiducial weight assignment.
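Since the formula itself is given in [7], the following sketch only illustrates one plausible reading of the description: a sum over the three facets of the fiducial weight X multiplied by the ratio of the matched path weight to the perfect-match path weight. That functional form, the function name `match_precision` and the sample path weights are assumptions; only the three values of X come from the text.

```python
# Fiducial weights X per sub-model, as stated in the text.
FIDUCIAL_WEIGHTS = {"function": 0.5, "intrinsic": 0.3, "context": 0.2}


def match_precision(matched, perfect, weights=FIDUCIAL_WEIGHTS):
    """Weighted precision of one result component (assumed form, see lead-in).

    matched: facet -> path weight achieved by the result component
    perfect: facet -> path weight of a perfect match for the query
    """
    precision = 0.0
    for facet, x in weights.items():
        perfect_weight = perfect.get(facet, 0.0)
        if perfect_weight > 0:
            precision += x * matched.get(facet, 0.0) / perfect_weight
        # Assumption: a facet without keywords (zero perfect weight) adds nothing.
    return precision


# Hypothetical path weights for one result component.
matched = {"function": 3.0, "intrinsic": 2.0, "context": 1.0}
perfect = {"function": 4.0, "intrinsic": 2.0, "context": 2.0}
pc = match_precision(matched, perfect)  # 0.5*0.75 + 0.3*1.0 + 0.2*0.5
```

With these sample values the component scores about 0.775, and a component that matches every facet perfectly scores 1 because the three fiducial weights sum to 1.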

Dynamic Fiducial Weight Assignment
The MVICS component search tool is based on the tree structure of the MVICS model, and calculates the precision of the result component by substituting the values of the search path weights and the fiducial weights (X) into the formula. The fiducial weights (X) of the classes in each model are given as tentative values, which are adjusted by the dynamic fiducial weight assignment mechanism through analysis of the user keywords. Each group of keywords is recorded by the Requirements Recorder when the user clicks the search button on the UI. The keywords and their respective sub-models of the MVICS are stored in an XML document. According to the collected data, the fiducial weights are updated dynamically after every 100 groups of user keywords are obtained. The rule of dynamic fiducial weight assignment is: the more frequently keywords are used in a facet, the heavier the fiducial weight of that facet [8].
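The update rule can be sketched as below. The text only states that more frequently used facets receive heavier weights and that the update happens after every 100 keyword groups; setting each weight to the facet's share of all recorded keywords is an assumption of this sketch, as are the class layout and the in-memory counters (the tool itself stores the records in an XML document).

```python
from collections import Counter

BATCH_SIZE = 100  # the text states an update after every 100 keyword groups


class RequirementsRecorder:
    """Records keyword groups and refreshes the fiducial weights per batch."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self._counts = Counter()
        self._groups_seen = 0

    def record(self, group):
        """group maps a facet name to the keywords used for it in one search."""
        for facet, keywords in group.items():
            self._counts[facet] += len(keywords)
        self._groups_seen += 1
        if self._groups_seen == BATCH_SIZE:
            self._update_weights()

    def _update_weights(self):
        total = sum(self._counts.values())
        if total:
            # Assumed rule: weight = share of keywords that fell into the facet.
            self.weights = {f: self._counts[f] / total for f in self.weights}
        self._counts.clear()
        self._groups_seen = 0


recorder = RequirementsRecorder({"function": 0.5, "intrinsic": 0.3, "context": 0.2})
for _ in range(BATCH_SIZE):
    recorder.record({"function": ["a", "b"], "intrinsic": ["c"], "context": ["d"]})
```

After the hundredth group the weights become 0.5, 0.25 and 0.25 for the function, intrinsic and context facets, reflecting the keyword shares in this hypothetical batch; a production rule might smooth the change instead of replacing the weights outright.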

CASE STUDY
To exemplify the use of the MVICS component search tool, several search scenarios with corresponding search results are offered on the project website. Users can test some of these given scenarios, or construct their own for testing. Here we take a financial domain scenario of developing an encrypted Cash Management System with Useful Interface to illustrate the function and process of the MVICS repository and its linkage with the domain related software system ontology. A financial domain related software system ontology was built by immigrating an existing financial operation ontology. Following the proposed domain ontology immigration method, a financial operation ontology was retrieved from the Protégé ontology library with the help of Google filetype search [4]. The selected financial operation ontology was then updated by recording and adding the financial operations with the characteristics that maximize the use of the CBD approach, together with the types of the components, on the basis of the MVICS format component specification in the repository. Each class in this ontology represents one software system or module that carries out a financial operation. Superior-subordinate relationships are used to describe the affiliations of the functions of these systems or modules. To date, 125 users have tested the tool in practice, of whom 43% are from Europe, 33% from Asia, 17% from North America and 7% from the rest of the world. In a self-appraisal, users rated their own software engineering experience on a scale from level 1 to 5 (level 1 indicates more than 10 years of software engineering experience, followed by 8 years, 5 years and 2 years, and level 5 indicates no experience): 15% opted for level 1, 7% for level 2, 44% for level 3, 23% for level 4 and 11% for level 5.
The professional background of the test participants is shown in Table 1.
Years of experience (Average): 7, 5
faster than the MVICS tool when the size of a repository is less than 500 components; however, the MVICS tool is faster when the repository is large-scale, i.e. when the number of components exceeds 500 in this case study. This is because the MVICS tool searches classes in the ontology along its multi-faceted and hierarchical structure. Another reason for the speed loss in the MVICS component search tool is its additional semantic processing, i.e., the semantic-based precision calculation, the adaptive component search and the data collection for a whole profile of result components.
Regarding maintenance effort, it is observed that the ontology-based search tools (MVICS and the existing domain ontology-based approach) are easier to manage for medium-size repositories (Figure 4f). Again, this advantage comes from the fact that the MVICS component search tool uses an ontology formally defined in OWL-DL, which makes automatic validation through ontology reasoners possible.

CONCLUSION
The presented work provides a novel and effective solution to avoid the component mismatch problem with precise semantic-based and adaptation-aware component selection. Its key contributions lie in: i) the formalized MVICS model as the basis for semantic comprehension and process automation; ii) the integration of domain knowledge into the MVICS model via the ontological interface of the developed relationships; iii) the automation of the retrieval process and repository building with the MVICS component repository and the component search tool. Our literature investigation has shown that similar work has not been done prior to the MVICS project. Our user testing and evaluation have also shown that the repository and tool are suitable for industrial use, delivering much better functionality and performance.
The MVICS based repository and search tool provide an integral mechanism for component selection, including component specification, component retrieval, user query refinement, component registration, and online repository management. As the foundation of the search tool, the MVICS ontology model and its linkage with domain ontology not only overcome the limited descriptive capability of the ontologies used in existing semantic-based component search tools, but also, through the formal definition in OWL-DL, provide adequate reasoning capability for component search and ontology evaluation. Moreover, the MVICS avoids the over-complication of traditional monolithic ontologies by using a set of coupled sub-models that are highly coherent yet relatively flexible. The inter-relationships among the classes in different sub-models also ensure a holistic view in component specification and selection, and thus improve the search precision. On the retrieval side, the MVICS tool supports dynamic and user group oriented retrieval by adjusting the fiducial facet weights, which further decreases the mismatch between the result components and the user query. The use of AssL and AggL improves the functionality and application scope of component retrieval and provides a practicable way to integrate domain related system ontologies. The adaptive component matching and the search result profile are novel concepts in component search tools; they make the MVICS approach "holistic" for component retrieval and specification.
For future work, we will further refine the MVICS model, extend the ontology linkage method to more relationships, and refine the formulae for weight assignment and component precision calculation on the basis of wider and more industrial test results and user feedback.