Wavelets and Their Applications
Comparing images directly is impractical and inefficient for large-scale image retrieval. The wavelet transform is a tool for processing images at multiple resolutions. In this project we use the discrete wavelet transform (DWT). Wavelets offer an efficient, highly intuitive framework for the representation and storage of images, and they provide insight into an image’s spatial and frequency characteristics.
The wavelet transform is a tool for analyzing functions at different levels of detail, and the DWT in particular can analyze images at multiple resolutions. It is similar to the Fourier transform, but it encodes both frequency and spatial information.
Wavelet tools are used in a wide range of image research, including identifying pure frequencies, de-noising, and image compression, to name a few. In this project, the wavelet transform is used to decompose images so that their signature content (CSGV) can be retrieved.
By saving the few largest wavelet coefficients for an image, it is possible to recover a fairly accurate representation of the image.
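To make the idea concrete, here is a minimal sketch of coefficient truncation on a one-dimensional signal. It assumes an orthonormal Haar wavelet and a made-up eight-sample signal; the project itself does not state which wavelet family it uses, and the function names are illustrative only.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Full orthonormal Haar decomposition; length must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:n] = averages + details  # coarse part first, details after it
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward by rebuilding the signal from coarse to fine."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for avg, det in zip(out[:n], out[n:2 * n]):
            merged.append((avg + det) / SQRT2)
            merged.append((avg - det) / SQRT2)
        out[:2 * n] = merged
        n *= 2
    return out


def keep_largest(coeffs, k):
    """Zero every coefficient except the k with the largest magnitude."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


# Hypothetical signal, chosen only for illustration.
signal = [9.0, 7.0, 3.0, 5.0, 6.0, 10.0, 2.0, 6.0]
coeffs = haar_forward(signal)
exact = haar_inverse(coeffs)                    # uses all 8 coefficients
approx = haar_inverse(keep_largest(coeffs, 3))  # uses only the 3 largest
squared_error = sum((a - b) ** 2 for a, b in zip(signal, approx))
print([round(v, 2) for v in approx], round(squared_error, 2))
```

Because the transform is orthonormal, the squared reconstruction error equals the energy of the discarded coefficients, which is why dropping the smallest ones costs the least accuracy.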
The Wavelet Transform
[Figure: the same image reconstructed from its 20, 100, and 400 largest wavelet coefficients (left to right), followed by the original (16,000 coefficients).]
The example shown above illustrates image compression: the third image from the left would require about 3% of the disk space of the original image (400 of 16,000 coefficients). In our project, we decompose images up to 3 levels, which yields a significant “signature content”; because this content is small, it allows faster searching for images in a large-scale image database.
The Wavelet Toolbox (a collection of MathWorks functions for wavelet analysis) is used in the implementation phase to decompose images to the level the user requests.
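As a rough illustration of multi-level decomposition, the sketch below applies three levels of an unnormalized 2-D Haar step and keeps only the final coarse band as a signature. This is an assumption made for illustration: the project's actual signature (CSGV) and wavelet family are not described here, and the 16x16 gradient image is invented test data.

```python
def haar_level_2d(image):
    """Split an image (list of equal-length rows, even dimensions) into four
    half-size sub-bands using unnormalized 2x2 Haar averages and differences."""
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 4.0)  # local average (coarse image)
            h_row.append((p - q + s - t) / 4.0)  # change from left to right
            v_row.append((p + q - s - t) / 4.0)  # change from top to bottom
            d_row.append((p - q - s + t) / 4.0)  # diagonal change
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, (horiz, vert, diag)


def signature(image, levels=3):
    """Decompose `levels` times and flatten the final approximation band."""
    approx = image
    for _ in range(levels):
        approx, _details = haar_level_2d(approx)
    return [value for row in approx for value in row]


# Hypothetical 16x16 test image: a diagonal brightness gradient.
image = [[float(r + c) for c in range(16)] for r in range(16)]
sig = signature(image, levels=3)
print(len(sig), "values instead of", 16 * 16)
```

Each level halves both dimensions, so three levels shrink the data by a factor of 64. That reduction is what makes a signature cheap to store and compare.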
Image Retrieval
An image retrieval system is a computer system for browsing, searching and retrieving images from a large database of digital images.
A common saying goes “A picture is worth a thousand words”. Images described only by tags, labels, or captions tend to lose the information the image actually carries. Images convey a large amount of information to both human and computer vision, and attaching multiple tags does not describe an image well enough for efficient retrieval. Content-based image retrieval (CBIR) uses the actual content of the image, which proves more efficient but also more challenging. The most important factor in image retrieval is accuracy. One problem with using image search results as a training set for a classifier is the high percentage of unrelated images among the results; estimates have shown a high rate of inaccurate results in Google image search. Problems with traditional methods of image indexing have led to growing interest in techniques for retrieving images on the basis of automatically derived features such as color, texture, and shape.
Most traditional methods of image retrieval add metadata such as captions, keywords, tags, or descriptions to the images so that retrieval can be performed over the annotation words. Manual image annotation is time-consuming, laborious, and expensive; to address this, a large amount of research has been done on automatic image annotation.
Content-based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision to the image retrieval problem. In other words, an image yields image data in the form of rows and columns. Computations on this data can produce vectors or quantifiers, which then serve as the primary key (index) by which an image is retrieved from a large database.
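The indexing idea can be sketched as a nearest-neighbor lookup over feature vectors. The example assumes Euclidean distance as the similarity measure, and the file names and four-element vectors are hypothetical; the project's real index structure and distance measure are not specified here.

```python
import math


def euclidean(u, v):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def retrieve(query, index, top_k=3):
    """Return the names of the top_k stored images closest to the query."""
    ranked = sorted(index.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _vector in ranked[:top_k]]


# Hypothetical index: image name -> precomputed feature vector.
index = {
    "sunset.png": [0.9, 0.4, 0.1, 0.0],
    "forest.png": [0.1, 0.8, 0.2, 0.1],
    "ocean.png": [0.1, 0.3, 0.9, 0.2],
}

query = [0.8, 0.5, 0.1, 0.1]  # feature vector of the query image
results = retrieve(query, index, top_k=2)
print(results)
```

A linear scan like this compares the query against every stored vector, so it only stays practical when each vector is short, as a wavelet signature would be.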
“Content-based” means that the search analyzes the actual contents of the image. The term ‘content’ in this context might refer to colors, shapes, textures, or any other visual information that can be derived from the image itself. Without the ability to examine image content, searches must rely on metadata, as in the traditional methods.
Metadata is hard to generate, which makes it expensive. A security camera capturing pictures can caption them with the time, date, and location, but not with the actual contents of the image. This is where CBIR comes into play: it derives image data that can be analyzed and used across different image-related problem areas.
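As one example of a feature derived from the image itself, the sketch below computes a coarse color histogram. The bin count, the pixel format (a list of 8-bit RGB tuples), and the sample pixels are assumptions for illustration, not something this project prescribes.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Normalized RGB histogram for a list of (r, g, b) tuples with 0-255 values."""
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin number in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    total = len(pixels)
    return [count / total for count in hist]


# Hypothetical four-pixel image: three red pixels and one blue pixel.
pixels = [(255, 0, 0), (255, 0, 0), (255, 0, 0), (0, 0, 255)]
hist = color_histogram(pixels)
print(hist)
```

Two images with similar color distributions produce similar histograms regardless of how they were captioned, which is the sense in which the search is driven by content rather than metadata.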
