KNN AND STEERABLE PYRAMID BASED ENHANCED CONTENT BASED IMAGE RETRIEVAL MECHANISM

Recently, digital content has become a significant and inevitable asset of any enterprise, and the need for visual content management is on the rise as well. Attention towards the automated management and retrieval of digital images has increased owing to the drastic growth in the number and size of image databases. A significant and increasingly popular approach that aids the retrieval of image data from a huge collection is content-based image retrieval (CBIR). CBIR has attracted voluminous research in the last decade, paving the way for the development of numerous techniques and systems besides creating interest in the fields that support them. CBIR indexes images based on features obtained from their visual content so as to facilitate speedy retrieval, and retrieval from large resources has become an area of wide interest in many applications. In this thesis work, we present a steerable pyramid based image retrieval system that uses color, contours and texture as visual features to describe the content of an image region. To speed up retrieval and similarity computation, the database images are classified and the extracted regions are clustered according to their feature vectors using the KNN algorithm. We use the steerable pyramid to extract texture features from the query image and the classified database images and store them in a feature database. Therefore, to answer a query our system does not need to search the entire image database; instead, only a number of candidate images need to be searched for image similarity. The proposed system has the advantage of increasing retrieval accuracy while decreasing retrieval time.


INTRODUCTION
Content-based image retrieval is based on (automated) matching of the features of a query image with those of the images in a database through some image-to-image similarity evaluation. Images are therefore indexed according to their own visual content in light of the underlying (chosen) features, such as color (the distribution of color intensity across the image), texture (the presence of visual patterns that have properties of homogeneity and do not result from a single color or intensity), shape (the boundaries or interiors of objects depicted in the image), or any other visual feature or combination of elementary visual features. Needless to say, the beneficiaries and end users of such systems range from simple users searching for a particular image on the web to various professional bodies, such as police forces using picture recognition or journalists requesting pictures that match some query. From a historical perspective, probably the first use of CBIR goes back to Kato in the early nineties, who implemented what appears to be the first automated image retrieval system using color and shape features. Since Kato's pioneering work, many prototypes of CBIR systems have been developed, and some of them reached the commercial market, e.g., IBM's QBIC system, which supports color, shape and texture features, and Virage, developed by Virage Inc., which supports color, texture, color layout and shape. However, the lack of maturity of the current technology has been acknowledged, which has limited its large-scale deployment. This motivated the intensive research carried out in many aspects of CBIR, including image indexing, feature selection and extraction, image similarity calculation and user feedback/interaction. A typical task solved by CBIR systems is that a user submits a query image or a series of images and the system is required to retrieve the most similar images from the database.
Another task is support for browsing through large image databases, where the images are grouped or organized according to similar properties. Although image retrieval has been an active research area for many years, this difficult problem is still far from being solved. Any CBIR system involves at least four main steps:
•Feature extraction and indexing of the image database according to the chosen visual features, which form the perceptual feature space, e.g., color, shape, texture or any combination of the above.
•Feature extraction of the query image(s).
•Matching the query image to the most similar images in the database according to some image-to-image similarity measure. This forms the search part of the CBIR system.
•User interface and feedback, which governs the display of the outcomes, their ranking, and the type of user interaction, with the possibility of refining the search through some automatic or manual preference (weighting) scheme.
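The steps above can be sketched as a minimal retrieval loop. This is an illustrative Python sketch only: the grey-level histogram feature and Euclidean distance are one simple choice among many, and the function names are our own.

```python
import numpy as np

def feature(img):
    """A simple visual feature: a normalised 16-bin grey-level histogram."""
    hist, _ = np.histogram(img, bins=16, range=(0, 256))
    return hist / hist.sum()

def retrieve(query_img, database, top=3):
    """Rank database images by feature-space distance to the query."""
    q = feature(query_img)
    dists = [(name, np.linalg.norm(q - feature(img))) for name, img in database]
    return sorted(dists, key=lambda p: p[1])[:top]
```

In a full system the same `feature` routine builds the feature database offline, and an indexing scheme replaces the linear scan over `database`.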
In a typical CBIR system (Figure 1), image low-level features such as color, texture, shape and spatial location are represented in the form of a multidimensional feature vector. The feature vectors of the images in the database form a feature database. The retrieval process is initiated when a user queries the system with an example image or a sketch of the object. The query image is converted into the internal representation of a feature vector using the same feature extraction routine that was used for building the feature database. A similarity measure is employed to calculate the distance between the feature vector of the query image and those of the target images in the feature database. Finally, the retrieval is performed using an indexing scheme which facilitates the efficient searching of the image database.

Figure 1. Architecture of CBIR
Recently, the user's relevance feedback has also been incorporated to further improve the retrieval process and produce perceptually and semantically more meaningful retrieval results. In this chapter, we discuss these fundamental techniques for content-based image retrieval. CBIR is the application of computer vision to the task of searching for digital images in a large database based on the comparison of low-level image features. The search is carried out using the contents of the images themselves rather than relying on human-supplied metadata such as captions or keywords describing the image. Compared to text-based retrieval systems, CBIR is more feasible for large-scale databases and is usually used in environments that require fast retrieval and real-time operation. Software that implements CBIR is known as a content-based image retrieval system (CBIRS). CBIR came to the interest of researchers as it offers the ability to index images based on the content of the image itself [4]. The retrieval process follows the architecture of Figure 1: the feature vectors of the database images form a feature database; the query image or sketch is converted into a feature vector using the same extraction procedure [3]; a similarity measure computes the distance between the query's feature vector and those of the target images; and an indexing scheme makes the search over the image database efficient.
CBIR retrieves images based on visual features such as colour, texture and shape [3]. In this method, the colour, shape and texture of an image are classified automatically or semi-automatically, possibly with the aid of a human classifier. Retrieval results are obtained by calculating the similarity between the query and the images stored in the database using a predefined distance measure; the results are then ranked according to the highest similarity score. One study (2016) proposes content-based image retrieval as one of the most important techniques of data and multimedia technology: image collections are growing at a rapid rate, and the demand for efficient and effective tools for retrieving query images from databases has increased significantly. Zhijie Zhao et al. (2016) propose a scheme based on three notable algorithms: color distribution entropy (CDE), the color level co-occurrence matrix (CLCM) and invariant moments. CDE takes the correlation of the color spatial distribution in an image into consideration. CLCM is a texture feature of the image, a newly proposed descriptor grounded on the co-occurrence matrix to capture variation in texture.

RESEARCH MOTIVATION
Image retrieval is an extension of traditional information retrieval. Approaches to image retrieval are partly derived from conventional information retrieval and are designed to manage the more versatile and enormous amounts of visual data that now exist. Low-level visual features such as color, texture, shape and spatial relationships are directly related to perceptual aspects of image content. Since it is usually easy to extract and represent these features, and fairly convenient to design similarity measures using their statistical properties, a variety of content-based image retrieval techniques have been proposed in the past few years. High-level concepts, however, are not extracted directly from visual content; rather, they represent the relatively more important meanings of objects and scenes in the images as perceived by human beings. These conceptual aspects are more closely related to users' preferences and subjectivity. Concepts may vary significantly in different circumstances, and subtle changes in semantics may lead to dramatic conceptual differences. Needless to say, it is a very challenging task to extract and manage meaningful semantics and to make use of them to achieve more intelligent and user-friendly retrieval.

KNN
An instance-based learning method called the K-Nearest Neighbor (K-NN) algorithm has been used in many applications in areas such as data mining, statistical pattern recognition and image processing. Successful applications include the recognition of handwriting, satellite images and EKG patterns. In data mining, we often need to compare samples to see how similar they are to each other. For samples whose features have continuous values, it is customary to consider samples similar to each other if the distances between them are small. Besides the most popular choice of Euclidean distance, there are of course many other ways to define distance. The k-means clustering algorithm attempts to split a given anonymous data set (a set containing no information as to class identity) into a fixed number (k) of clusters. Initially, k so-called centroids are chosen. A centroid is a data point (imaginary or real) at the center of a cluster. In this context, each centroid is an existing data point in the given input data set, picked at random, such that all centroids are unique (that is, for all centroids ci and cj, ci ≠ cj). These centroids are used to train a KNN classifier. The resulting classifier is used to classify the data (using k = 1) and thereby produce an initial randomized set of clusters. Each centroid is thereafter set to the arithmetic mean of the cluster it defines. The process of classification and centroid adjustment is repeated until the values of the centroids stabilize. The final centroids are used to produce the final classification/clustering of the input data, effectively turning the initially anonymous data points into data points each with a class identity. The class label c is a categorical variable, and there is a function f which assigns a class c = f(x) to every such feature vector x. We do not know anything about f (otherwise there would be no need for data mining) except that we assume it is smooth in some sense.
We suppose that a set of T such vectors is given together with their corresponding classes.
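The K-NN classification step described above can be sketched as a minimal nearest-neighbour classifier over feature vectors. This is a simplified illustration in Python with NumPy; the function name, toy feature vectors and class labels are our own.

```python
import numpy as np
from collections import Counter

def knn_classify(train_X, train_y, query, k=3):
    """Assign `query` the majority class among its k nearest training vectors."""
    dists = np.linalg.norm(train_X - query, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]                  # indices of the k closest
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy feature vectors for two hypothetical image categories
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = ["horses", "horses", "buses", "buses"]
knn_classify(train_X, train_y, np.array([0.05, 0.1]))  # → "horses"
```

With k = 1 the same routine performs the cluster-assignment step of the k-means procedure described above, assigning each point to its nearest centroid.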

STEERABLE PYRAMID
The steerable pyramid is a linear multi-scale, multi-orientation image decomposition that provides a useful front end for image-processing and computer vision applications. The representation was developed in 1990 in order to overcome the limitations of the orthogonal separable wavelet decompositions that were then becoming popular for image processing (specifically, those representations are heavily aliased and do not represent oblique orientations well). Once the orthogonality constraint is dropped, it makes sense to completely reconsider the filter design problem, as opposed to just re-using orthogonal wavelet filters in a redundant representation, as is done in cycle-spinning or undecimated wavelet transforms.
The basis functions of the steerable pyramid are Kth-order directional derivative operators (for any choice of K) that come in different sizes and K+1 orientations. As directional derivatives, they span a rotation-invariant subspace, and they are designed and sampled such that the whole transform forms a tight frame. As an example, the decomposition of an image of a white disk on a black background might use a steerable pyramid with 4 orientation sub-bands at 2 scales, the smallest sub-band being the residual low-pass information, together with a residual high-pass sub-band. The block diagram for the decomposition (both analysis and synthesis) is shown in Figure 2. Initially, the image is separated into low-pass and high-pass sub-bands using filters L0 and H0.

Figure 2. Decomposition using Steerable Pyramids
The low-pass sub-band is then divided into a set of oriented band-pass sub-bands and a lower-pass sub-band. This lower-pass sub-band is subsampled by a factor of 2 in the X and Y directions. The recursive (pyramid) construction is achieved by inserting a copy of the shaded portion of the diagram at the location of the solid circle (i.e., the low-pass branch). The right side of the diagram is the synthesis part: the image is reconstructed by upsampling the lowest low-pass sub-band by a factor of 2 and adding it to the set of band-pass sub-bands and the high-pass sub-band.
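The oriented-band-plus-subsample structure described above can be illustrated with a much-simplified sketch using first-order Gaussian derivative filters, whose steerability property (a derivative at angle θ is cos θ times the x-derivative plus sin θ times the y-derivative) mirrors that of the pyramid's basis functions. This shows the recursive structure only; it is not a tight-frame steerable pyramid, and the function names and parameters are our own.

```python
import numpy as np
from scipy import ndimage

def oriented_band(img, theta, sigma=2.0):
    """First-order Gaussian directional derivative at angle theta (steerable)."""
    dx = ndimage.gaussian_filter(img, sigma, order=(0, 1))  # d/dx
    dy = ndimage.gaussian_filter(img, sigma, order=(1, 0))  # d/dy
    return np.cos(theta) * dx + np.sin(theta) * dy

def simple_pyramid(img, scales=2, n_orient=4):
    """Recursively split into oriented bands, then blur and subsample by 2."""
    levels = []
    low = img.astype(float)
    for _ in range(scales):
        bands = [oriented_band(low, k * np.pi / n_orient)
                 for k in range(n_orient)]
        levels.append(bands)
        low = ndimage.gaussian_filter(low, 2.0)[::2, ::2]  # low-pass branch
    return levels, low  # oriented sub-bands per scale + low-pass residual
```

For a 64×64 input with 2 scales and 4 orientations, this yields four 64×64 bands, four 32×32 bands, and a 16×16 low-pass residual, matching the recursive structure of Figure 2.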

RESEARCH METHODOLOGY
•D represents the number of images in the database and Q represents the set of query images.
•Extract low-level features such as color, shape and texture from all database images as well as the query images.
•Match all the database images with the query images using KNN. The result is a split into classified and unclassified images: classified images are stored in a labeled folder and unclassified images in an unlabeled folder.
•The steerable pyramid mechanism is used to extract features such as color, texture and contour from the query image and the classified database images.
•The steerable pyramid provides multiple scales and rotation invariance for feature extraction.
•Finally, match the classified images with the query images and display the output.
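The candidate-filtering idea in the steps above, comparing the query only against database images whose class matches the query's predicted class rather than against the whole database, can be sketched as follows (illustrative Python; the function and variable names are our own):

```python
import numpy as np

def search_candidates(query_vec, db_vecs, db_labels, predicted_class, top=5):
    """Rank only database images of the predicted class by distance to the query."""
    idx = [i for i, lab in enumerate(db_labels) if lab == predicted_class]
    dists = [(i, np.linalg.norm(db_vecs[i] - query_vec)) for i in idx]
    return sorted(dists, key=lambda p: p[1])[:top]
```

Restricting the distance computation to one class is what reduces retrieval time: the linear scan shrinks from D images to roughly D divided by the number of classes.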

EXPERIMENTAL SETUP
The user query expresses the user's information need: to retrieve images from the database collection that conform to human perception. Querying by visual example is a paradigm particularly suited to expressing perceptual aspects of low- and intermediate-level visual content. The initial step is preprocessing the input image for better performance. Input images have different contents and different formats. Sometimes the same image is submitted twice with some small change, in which case a pixel-by-pixel comparison of the two images may not give a positive result. The proposed method therefore applies a few enhancement techniques and compares the enhanced images rather than the originals. The WANG database, used here as a general-purpose experimental set, contains 1000 JPEG images that can be classified into different domains, namely Buses, Dinosaurs, Flowers, Buildings, Elephants, Mountains, Food, African people, Beaches and Horses. The images are stored at size 256×256 and each image is represented in the RGB color space.

Figure 4. WANG dataset images
The open source computer vision library OpenCV began as a research project at Intel in 1998. It has C++, C, Python, Java and MATLAB interfaces and supports Windows, Linux, Android and Mac OS. OpenCV leans mostly towards real-time vision applications and takes advantage of MMX and SSE instructions when available; full-featured CUDA and OpenCL interfaces are under active development. There are over 500 algorithms and about 10 times as many functions that compose or support those algorithms. OpenCV is written natively in C++ and has a templated interface that works seamlessly with STL containers. Officially launched in 1999, the OpenCV project was initially an Intel Research initiative to advance CPU-intensive applications, part of a series of projects including real-time ray tracing and 3D display walls. The main contributors to the project included a number of optimization experts at Intel Russia, as well as Intel's Performance Library Team. In our experiments, a retrieved image is considered correct if and only if it is in the same category as the query. The experiments are carried out on a personal computer with an Intel Core i5 processor and 8GB RAM. The program is developed using the OpenCV libraries and the Visual Studio IDE, and the features of the query and database images are extracted using OpenCV. Figure 5 shows a sample image along with derived views such as its grayscale version, extracted color key points and contours.

PERFORMANCE EVALUATION
The performance of a retrieval system is evaluated based on several criteria. Some of the commonly used performance measures are average precision, average recall and average retrieval rate; all of these are computed from the precision and recall values of each query image. The precision of the retrieval is defined as the fraction of the retrieved images that are indeed relevant to the query, while the recall is the fraction of relevant images that are returned by the query. Table 1 shows the processing time, in milliseconds, of the existing work and the proposed work for the same set of images. Processing time is reduced in the proposed work, which shows its effectiveness.
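The precision and recall definitions above can be written directly as a small helper (the function name is our own):

```python
def precision_recall(retrieved, relevant):
    """Precision: relevant fraction of retrieved; recall: retrieved fraction of relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a query retrieves images {1, 2, 3, 4} and the relevant set is {2, 3, 5}, precision is 2/4 = 0.5 and recall is 2/3. The averages reported in Table 1 and Figures 8 and 9 are taken over all query images.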

Figure 8. Precision Comparison
Figures 8 and 9 illustrate the precision and recall graphs of the proposed work. Both graphs depict a significant improvement, showing the effectiveness of the proposed approach; in particular, recall shows an upward trend relative to the existing work.

CONCLUSION
CBIR is a vast research area with many open questions and challenges. Designing a CBIR system involves choosing particular feature representation techniques, an optimal dimensionality and reliable similarity functions in order to achieve the best results. The ultimate aim is to reduce the gap between the semantic information in the image and the extracted low-level features. The developed CBIR system can be extended to include stronger features and additional learning capabilities, which would provide higher accuracy and facilitate the investigation of results. A larger database can also be used to increase confidence in the results obtained, and experiments can be run on different data sets for a more rigorous proof of concept. We have reviewed the main components of a content-based image retrieval system, applying KNN with the steerable pyramid and using multiple query images, including image feature representation, indexing, query processing, query-image matching and user interaction, while highlighting the current state of the art and the key challenges. It has been acknowledged that much room for improvement remains in the development of content-based image retrieval systems due to the semantic gap between image similarity outcomes and the user's perception. After evaluating the results, we conclude that the mechanism proposed in this work improves the CBIR process. We will further test and benchmark this integrated image retrieval framework on various large image databases, along with tuning the relevance feedback to achieve optimal performance with highly reduced dimensionality.