Content Based Image Retrieval using Texture, Color and Shape for Image Analysis

Content-based image retrieval (CBIR), or QBIR, is an important field of research that has gained much popularity in the past. A CBIR system [1] helps users retrieve relevant images based on their contents, which it represents with low-level features such as texture, color and shape. In this paper, we compare several feature extraction techniques [5], i.e., GLCM, histogram and shape properties, for texture, color and shape respectively. The experiments show the similarity between these features and also that the output obtained using the combination of color, texture and shape is better than the output obtained with a single feature.


I. INTRODUCTION
For a long time, images have been a mode of communication for human beings. Nowadays, we are able to generate, store, send and share large amounts of data because of the growth of information and communication technology. After a decade of intensive research, CBIR technology is now beginning to move out of the laboratory and into the marketplace, in the form of commercial products like QBIC and Virage [3].

Fig. 1: Block diagram of CBIR
For a given query image, its feature vector is computed. If the distance between the features of the query image and those of an image in the database is small, that database image is considered a match to the query. The search is usually based on similarity rather than on exact match, and the retrieval results are then ranked according to a similarity index [6]. The CBIR system operates on the query image and returns the output relevant to it with respect to the features discussed in this paper [11]. The main component of a CBIR system is feature extraction. This paper performs low-level feature extraction and discusses texture, color and shape features. Texture features [13] are extracted using the GLCM, a histogram is used for color feature extraction, and several shape features are extracted from the query image. From the output obtained, it is found that the combination of low-level features provides better results in image retrieval.
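The retrieval step described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the function name `rank_by_similarity`, the dictionary layout of the database and the toy feature values are all assumptions made for illustration, and the standard-library `math.dist` (Python 3.8 or later) stands in for the distance measure.

```python
import math

def rank_by_similarity(query_features, database):
    """Return (distance, image_name) pairs, closest match first.

    `database` is a hypothetical mapping from image name to feature vector.
    """
    scored = [(math.dist(query_features, features), name)
              for name, features in database.items()]
    return sorted(scored)

# Toy feature vectors (illustrative values only, not data from the paper).
toy_database = {
    "img_a": [0.0, 0.0],
    "img_b": [1.0, 1.0],
    "img_c": [0.2, 0.1],
}
ranking = rank_by_similarity([0.0, 0.0], toy_database)
```

The first entry of `ranking` is the database image whose features lie closest to the query, which is the image a CBIR system would return first.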

II. MODEL
In this model, we consider the gray level co-occurrence matrix (GLCM) [20] for texture feature extraction.

Gray Level Co-occurrence Matrix
The gray level co-occurrence matrix (GLCM) is one of the best-known texture analysis methods. It estimates image properties related to second-order statistics. The GLCM is created by calculating how often a pixel with gray-level (grayscale intensity) value i occurs horizontally adjacent to a pixel with the value j. Each element (i,j) in the GLCM specifies the number of times that the pixel with value i occurred horizontally adjacent to a pixel with value j. The features [10] obtained are homogeneity, contrast, energy and correlation, as described below:
CONTRAST returns a measure of the intensity contrast between a pixel and its neighbor over the whole image. Contrast is 0 for a constant image.
CORRELATION returns a measure of how correlated a pixel is to its neighbor over the whole image. Correlation is 1 or -1 for a perfectly positively or negatively correlated image.
ENERGY returns the sum of the squared elements in the GLCM [16]. It is 1 for a constant image.
HOMOGENEITY returns a value that measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal.
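As an illustration of the definitions above, the following Python sketch builds a horizontal-neighbor GLCM and derives the four features from its normalized form. It is a sketch under stated assumptions rather than the code used in our experiments: the function names are hypothetical, the image is assumed to be a list of rows of integer gray levels in the range 0 to levels-1, and the feature formulas are the commonly used ones (squared difference weighting for contrast, 1/(1+|i-j|) weighting for homogeneity).

```python
import math

def glcm_horizontal(image, levels):
    """Count pairs (i, j) where gray level j sits immediately right of i."""
    glcm = [[0] * levels for _ in range(levels)]
    for row in image:
        for left, right in zip(row, row[1:]):
            glcm[left][right] += 1
    return glcm

def glcm_features(glcm):
    """Contrast, correlation, energy and homogeneity of the normalized GLCM."""
    total = sum(map(sum, glcm))
    # Normalize counts to probabilities p(i, j).
    cells = [(i, j, count / total)
             for i, row in enumerate(glcm)
             for j, count in enumerate(row)]
    contrast = sum((i - j) ** 2 * p for i, j, p in cells)
    energy = sum(p * p for _, _, p in cells)
    homogeneity = sum(p / (1 + abs(i - j)) for i, j, p in cells)
    mean_i = sum(i * p for i, _, p in cells)
    mean_j = sum(j * p for _, j, p in cells)
    var_i = sum((i - mean_i) ** 2 * p for i, _, p in cells)
    var_j = sum((j - mean_j) ** 2 * p for _, j, p in cells)
    if var_i > 0 and var_j > 0:
        covariance = sum((i - mean_i) * (j - mean_j) * p for i, j, p in cells)
        correlation = covariance / math.sqrt(var_i * var_j)
    else:
        correlation = float("nan")  # undefined when the image is constant
    return {"contrast": contrast, "correlation": correlation,
            "energy": energy, "homogeneity": homogeneity}

# Small 4-level test image (illustrative values only).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
texture_vector = glcm_features(glcm_horizontal(image, levels=4))
```

For a constant image all pairs fall into a single diagonal cell, so the sketch reproduces the stated limits: contrast 0 and energy 1.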

III. HISTOGRAM
Color is one of the most reliable visual features and is also among the easiest to apply in image retrieval systems. It is independent of image size and orientation, and it is robust to background complication. The color histogram is the most common method for extracting the color features of color images and is widely used in CBIR systems. [18] The image histogram shows the distribution of gray levels from 0 to 255; all of these values cannot be used directly as a feature vector, because its dimension would be too large to store or compare. The histogram must therefore be quantized into a smaller number of bins to reduce the dimension of the feature vector. Choosing the number of bins matters: a very small bin width produces a spiky histogram that carries little usable information, whereas a large bin width increases the frequencies in each bin and can no longer distinguish between different types of objects in the image, so retrieval accuracy decreases.
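The binning step can be sketched in Python as follows. This is an illustrative sketch, not the exact procedure used in this paper: the function names, the default of 16 bins, the per-channel RGB treatment and the normalization by pixel count are all assumptions.

```python
def binned_histogram(values, bins=16, levels=256):
    """Normalized histogram of integer intensities in the range [0, levels)."""
    counts = [0] * bins
    for v in values:
        counts[v * bins // levels] += 1  # map intensity to its bin index
    total = len(values)
    # Dividing by the pixel count makes the feature independent of image size.
    return [c / total for c in counts]

def color_feature(rgb_pixels, bins=16):
    """Concatenate the per-channel histograms of (r, g, b) tuples."""
    feature = []
    for channel in zip(*rgb_pixels):
        feature.extend(binned_histogram(channel, bins))
    return feature

# Two identical pixels (illustrative values) with 4 bins per channel.
feature = color_feature([(0, 128, 255), (0, 128, 255)], bins=4)
```

With 16 bins per channel the feature vector has 48 components instead of 768, which shows how the number of bins trades vector size against discriminative detail.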

IV. SHAPE FEATURE EXTRACTION
There are many techniques for shape description and recognition; an overview of shape description techniques is given here [18]. These techniques can be broadly categorized into two types: boundary-based and region-based. Boundary-based methods use only the contour or border of the object's shape and ignore its interior; hence, they are also called external methods of shape extraction. Recognizing a shape by its boundary means comparing and identifying shapes by analyzing their boundaries, but the local structural organization is always hard to describe. The features proposed in this paper are area, eccentricity, Euler number and filled area.
Area: A scalar. It is defined as the actual number of pixels in the region.
Eccentricity: A scalar. It is the ratio of the distance between the foci of the ellipse that has the same second moments as the region and its major axis length. The value is between 0 and 1. This property is supported only for 2-D input label matrices.
EulerNumber: A scalar. It is equal to the number of objects in the region minus the number of holes in those objects. This property is supported only for 2-D input label matrices [26].
FilledArea: A scalar. It is defined as the number of on pixels in the filled image.
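The four shape properties can be approximated with the following Python sketch. It is a simplified illustration, not the routine used in our experiments. The assumptions are: the input is a non-empty binary mask given as a list of rows of 0/1 values, objects are 8-connected, holes are 4-connected background regions that do not touch the image border, and all foreground pixels are treated as one region when computing eccentricity. All function names are hypothetical.

```python
import math

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def _flood(mask, start, target, neighbors, seen):
    """Add to `seen` every pixel equal to `target` connected to `start`."""
    rows, cols = len(mask), len(mask[0])
    stack = [start]
    seen.add(start)
    while stack:
        r, c = stack.pop()
        for dr, dc in neighbors:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and mask[nr][nc] == target and (nr, nc) not in seen):
                seen.add((nr, nc))
                stack.append((nr, nc))

def shape_features(mask):
    rows, cols = len(mask), len(mask[0])
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    pixels = [(r, c) for r, c in coords if mask[r][c]]
    area = len(pixels)

    # Objects: 8-connected groups of foreground pixels.
    seen_fg, objects = set(), 0
    for p in pixels:
        if p not in seen_fg:
            objects += 1
            _flood(mask, p, 1, N8, seen_fg)

    # Background reachable from the border is "outside"; the rest are holes.
    seen_bg = set()
    for r, c in coords:
        on_border = r in (0, rows - 1) or c in (0, cols - 1)
        if on_border and not mask[r][c] and (r, c) not in seen_bg:
            _flood(mask, (r, c), 0, N4, seen_bg)
    outside = len(seen_bg)
    holes = 0
    for r, c in coords:
        if not mask[r][c] and (r, c) not in seen_bg:
            holes += 1
            _flood(mask, (r, c), 0, N4, seen_bg)
    hole_pixels = len(seen_bg) - outside

    # Eccentricity of the ellipse with the same second moments as the region.
    # The 1/12 term is the second moment of a unit-square pixel.
    mean_r = sum(r for r, _ in pixels) / area
    mean_c = sum(c for _, c in pixels) / area
    u_rr = sum((r - mean_r) ** 2 for r, _ in pixels) / area + 1 / 12
    u_cc = sum((c - mean_c) ** 2 for _, c in pixels) / area + 1 / 12
    u_rc = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / area
    common = math.sqrt((u_rr - u_cc) ** 2 + 4 * u_rc ** 2)
    ratio = (u_rr + u_cc - common) / (u_rr + u_cc + common)
    eccentricity = math.sqrt(max(0.0, 1 - ratio))

    return {"area": area, "eccentricity": eccentricity,
            "euler_number": objects - holes,
            "filled_area": area + hole_pixels}

# A square ring: one object with one hole (illustrative input).
ring = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
ring_features = shape_features(ring)
```

For the ring, the object count and the hole count cancel, so the Euler number is 0, and filling the hole adds one pixel to the area.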

V. SIMILARITY FEATURE EXTRACTION
The different feature extraction methods are explained separately above. The similarity measure used for comparing the features is the Euclidean distance. To retrieve similar images from a large image dataset, three distance metrics can be used: Euclidean distance, chi-square distance and weighted Euclidean distance. The proposed method uses the Euclidean distance.

Euclidean Distance: The Euclidean distance between the query feature vector Q = (q_1, ..., q_n) and a database feature vector P = (p_1, ..., p_n) is
$$d(Q, P) = \sqrt{\sum_{k=1}^{n} (q_k - p_k)^2}$$
The minimum distance value signifies the closest match to the query, and a distance of zero signifies an exact match. Euclidean distance is not always the best metric. Because the differences in each dimension are squared before summation, great emphasis is placed on those features for which the dissimilarity is large. Hence it is necessary to normalize the individual feature components before finding the distance between two images.
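The effect of normalization can be seen in a short Python sketch. The paper does not specify which normalization is applied, so min-max scaling of each feature component over the database is assumed here purely for illustration. The function names and the sample values are hypothetical.

```python
import math

def min_max_normalize(vectors):
    """Scale every feature component to [0, 1] across all vectors."""
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [[(v - low) / span if span else 0.0
             for v, low, span in zip(vec, lows, spans)]
            for vec in vectors]

def euclidean(a, b):
    """Square root of the summed squared component differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# First component (e.g. an area in pixels) dwarfs the second (e.g. eccentricity).
raw = [[100.0, 0.1], [200.0, 0.9], [110.0, 0.8]]
scaled = min_max_normalize(raw)

raw_distance = euclidean(raw[0], raw[2])        # dominated by the first component
scaled_distance = euclidean(scaled[0], scaled[2])  # both components contribute
```

Before scaling, the distance between the first and third vectors is determined almost entirely by the large-valued component; after scaling, the difference in the second component contributes as well.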

VI. RESULTS AND DISCUSSIONS
The simulations were carried out in MATLAB 7.5.0. To test the proposed approach, an image database [19] containing 90 images (a fairly good amount for testing) was used. It contains multiple copies of an image as well as the same images in arbitrary positions and rotations. The query image is also one of the images in the database. Our CBIR framework is built in MATLAB. Test results for some objects are shown in the figures below, where only the retrieved images are shown for the query image on the right. GLCM is a better method for texture than the Gabor and wavelet methods. The measures used for evaluating the output are precision, recall and accuracy. The output obtained is shown in the figures.
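For reference, the three evaluation measures named in this section can be computed as in the Python sketch below. The paper does not state its formulas, so the standard definitions are assumed: precision is the fraction of retrieved images that are relevant, recall is the fraction of relevant images that are retrieved, and accuracy is the fraction of all database images classified correctly. The function name and the sample identifiers are hypothetical and are not results from our experiments.

```python
def retrieval_scores(retrieved, relevant, database_size):
    """Precision, recall and accuracy for one query (standard definitions)."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)   # relevant images that were retrieved
    fp = len(retrieved - relevant)   # retrieved images that are not relevant
    fn = len(relevant - retrieved)   # relevant images that were missed
    tn = database_size - tp - fp - fn
    precision = tp / (tp + fp) if retrieved else 0.0
    recall = tp / (tp + fn) if relevant else 0.0
    accuracy = (tp + tn) / database_size
    return precision, recall, accuracy

# Illustrative image identifiers only.
precision, recall, accuracy = retrieval_scores(
    retrieved=[1, 2, 3, 4], relevant=[2, 3, 5], database_size=10)
```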

VII. CONCLUSIONS AND FUTURE WORK
In this paper we have worked with three features, i.e., texture, color and shape, and their different combinations. The GLCM is used for texture feature extraction, a histogram for color feature extraction, and, for shape, properties such as area, eccentricity, Euler number and filled area. The experimental results show that the output obtained using these three features together is better than that obtained with a single feature.