Combined Search Method For Content-Based Image Retrieval
Background of Invention
This invention relates generally to computer technology in the field of image management systems for database classification, search and retrieval, and relates in particular to an image management system which characterizes, sorts and organizes images on the basis of the shape, texture and colour of the image content and stores these characteristics in a hierarchical structure, enabling images having similar characteristics to be retrieved from an image database based upon combinations of the characteristics of shape, colour and texture. Feature similarities or differences between a stored image and an input image are compared for efficient storage and retrieval of images similar to the input image.
Background And Description of Prior Art
In today's information technology market, search engine and relevant information retrieval technology has been widely applied to make global knowledge easily accessible throughout the world. Applications of advanced web technology greatly benefit people by providing relevant information through simple graphical interfaces.
Finding an efficient method of data storage and data retrieval is a top priority in an information retrieval society, from services for personal users to larger enterprises. Successful methodologies of content organisation and retrieval have empowered Yahoo, Google, AltaVista and other well known Internet portals to provide a powerful and profitable global service. At the heart of their systems are hierarchical structures which allow fast, easy and interactive information retrieval. These constructions make large knowledge collection management and fast information retrieval mechanisms publicly available.
In advanced digital library environments, in addition to larger collections of text information, useful information also comes in the form of drawings, pictures, images, GIS maps, medical images, fingerprints, faces, industrial photographs, DNA maps, videos and other non-text media content. In Government organizations, enormous amounts of resources are required to process business and intellectual property registrations such as business names, trademarks, patents and designs, many of which also come in visual representations. In a well-developed environment, there already exists an overwhelming number of visual objects, the search of which is extremely time consuming, especially if the subject cannot be adequately expressed by text. Fast, accurate and robust methods for searching non-text information will play a key role in advancing search engine technology. Higher performance in visual content retrieval will result in faster and better quality services in wider data warehouse environments.
From a technical viewpoint, all search engines are a combination of sorting and searching methodologies. Sorting mechanisms provide a fast way to organize relevant information into data structures. Search mechanisms apply the reverse operation to retrieve stored information, using keywords to query text or geometric features to query graphical content. Today, a myriad of data organisation methods are available, including Artificial Neural Networks (ANN), K-Nearest Neighbours (KNN), Relational Databases (RDB), Object-Oriented Databases (OODB) and Balanced Trees (BT). However, when a system involves more than three key variables, it is difficult to use only a single principal parameter as the primary factor to satisfy complicated requirements. To make general query results link with multiple variables, many statistical and probabilistic approaches have been developed.
Since this type of optimization is applied to discrete objects, most practical methods need to apply exhaustive search schemes or use single attributes to construct specialised databases on the given property. To reduce system cost, multiple joined properties are handled by a series of merging operations among refined single attributes from selected key elements. In relational database practice, a significantly larger number of computations is required to make a complex query on multiple joined attributes. This type of joined query is very difficult to implement in practice and poses a challenge for database system search methods. In addition to applying optimal methodologies to arrange individual key features into searchable constructions, it is also important to have practical methodologies that use a combination of multiple features to obtain more reliable results in complicated query environments. For this purpose, we have implemented an efficient methodology that allows content-based image retrieval applications to generate robust results.
Principles of Search Methods
To clearly describe the major principle of the method of our invention, it is necessary first to introduce the basic concepts and techniques which we use.
Image Indexing and Feature Vector
Any content-based image index with n measures can be represented as a numeric vector with n positions, in feature vector form:
$$I = (I_0, \ldots, I_i, \ldots, I_{n-1}), \quad i \in [0, n)$$
For any given image x, there is a numeric projection on each given component as a numeric measure. This creates a numeric vector denoted as I(x):
$$I(x) = (I_0(x), \ldots, I_i(x), \ldots, I_{n-1}(x)), \quad i \in [0, n)$$
Under this definition, I(x) is an image index of the image x, representing the object as a measurable vector in feature vector form. Different applications may require feature vectors of different lengths. The length n of the vector is an important number which indicates the complexity of the possible vector spaces. In general, longer vectors have more complicated properties than shorter ones. A shorter vector can simply be extended by filling its non-matching positions with zero values so that the two vectors have the same length. Using vector computation, a well-known equation can be used to compare two vectors.
Similarity Measures between Two Feature Vectors
Let I(x) and I(y) be two indexes. The similarity measure S is defined by
$$S(x, y) = \frac{I(x) \cdot I(y)}{|I(x)| \, |I(y)|} \in [-1, 1]$$
where
$$I(x) \cdot I(y) = \sum_{i=0}^{n-1} I_i(x) \times I_i(y)$$
denotes the inner product between I(x) and I(y), and $|I(x)|$ is the norm of the vector I(x).
According to vector computation, the similarity measure takes real values in the range [-1, 1]. It can be proved that S(x, y) = 1 if and only if I(x) and I(y) point in the same direction, with |I(x)| > 0 and |I(y)| > 0; for indexes normalized to the same length this means I(x) = I(y), and the two vectors are exactly the same. For all comparisons in this invention, only S(x, y) in [0, 1] will be used. Under general conditions, an indexing vector may contain many attributes. This makes the similarity equation very powerful in theory. In practice, however, the equation may not be so useful, since longer vectors imply extremely complicated combinations in their combinatorial space. Optimal conditions on multiple variables, even with the most advanced modern computational tools, are extremely difficult problems to solve.
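The similarity measure described above can be sketched in a few lines of Python. This is only an illustration, not the implementation of the invention: the function name `similarity` and the choice to raise an error for a zero vector are assumptions made for this example.

```python
import math
from typing import Sequence


def similarity(ix: Sequence[float], iy: Sequence[float]) -> float:
    """Inner product of two index vectors divided by the product of their norms."""
    if len(ix) != len(iy):
        raise ValueError("index vectors must have the same length n")
    dot = sum(a * b for a, b in zip(ix, iy))
    norm_x = math.sqrt(sum(a * a for a in ix))
    norm_y = math.sqrt(sum(b * b for b in iy))
    if norm_x == 0.0 or norm_y == 0.0:
        # Assumption for this sketch: the measure is undefined for a zero vector.
        raise ValueError("similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)
```

Identical vectors give a value of 1, vectors with no overlapping non-zero positions give 0, and opposite vectors give -1, which is why only the range [0, 1] is of interest for non-negative image measures.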
Segments of Feature Vectors
When we do not focus attention on general considerations but instead restrict our target to a specific problem such as content-based image search, it is feasible to assign useful measurements for certain categories which are allocated to set positions on the vector. Under this arrangement, a feature vector composed of segments can be assigned. Each non-overlapping segment can correspond to one property of the image. Using conventional terminology, there can be m distinct partitions on the vector. Each partition is a segment with a fixed number of attributes to represent values of special measures. For example, common image indexing uses colour, shape and texture information for content-based image searching. These three special features could be represented by three segments on the vector, as shown in Figure 1.
Figure 1 Indexing Vector & Feature Segments. An indexing vector I of length n is divided into three feature segments F (m = 3): F0 = C = Colour, F1 = S = Shape, F2 = T = Texture.
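A segmented index vector of this kind can be sketched as a flat list plus a table of segment boundaries. The segment positions below are purely hypothetical (the text does not state how many positions each feature occupies), and the names `SEGMENTS` and `segment` are chosen only for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (start, end) positions of each feature on a length-10 vector.
SEGMENTS: Dict[str, Tuple[int, int]] = {
    "colour": (0, 4),
    "shape": (4, 7),
    "texture": (7, 10),
}


def segment(index: List[float], feature: str) -> List[float]:
    """Return the non-overlapping slice of the index vector that holds one feature."""
    start, end = SEGMENTS[feature]
    return index[start:end]
```

Because the segments do not overlap, each position on the vector belongs to exactly one feature.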
Phase Space of Segmented Feature Vector
Since each feature is treated as an individual segment, the possible combinations of features need to be distinguished. Each of the m features can be either selected or not selected. A vector form F with m components is shown as follows:
$$F = (F_0, \ldots, F_j, \ldots, F_{m-1}) = \sum_j F_j, \quad j \in [0, m)$$
The vector form can make a total of 2^m combinations, which compose its feature space (a phase space or a combinatorial state space), simply by selecting or unselecting specific components. Using conventional terminology, all possible combinations compose a feature space denoted by Ω. That is,
$$\Omega(m) = \{F^i \mid i \in [0, 2^m)\}$$
where the components selected in $F^i$ are determined by $i_j$, the j-th bit of the integer i, $j \in [0, m)$. $F^{2^m-1}$ is an empty operator: it does not matter what values the m features take.
When m = 3, eight vectors can be constructed in the feature space Ω(3).
In general, there are 2^m - 1 non-trivial combinations in the phase space using m features. The number of combinations increases at an exponential rate. For example, when m = 10, the feature space contains 1024 combinations.
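The enumeration of feature combinations by the bits of an integer can be sketched as follows. This sketch assumes that a set bit j means that feature j is selected, so that integer 0 stands for the empty combination; the numbering used for the empty combination in the original figures may differ. The function name is illustrative only.

```python
from typing import List, Sequence


def feature_combinations(features: Sequence[str]) -> List[List[str]]:
    """List all 2^m selections of the m features, indexed by an integer bit mask."""
    m = len(features)
    combos = []
    for i in range(2 ** m):
        # Feature j is selected when the j-th bit of i is set (assumed convention).
        combos.append([features[j] for j in range(m) if (i >> j) & 1])
    return combos
```

For the three features colour, shape and texture this yields eight combinations, seven of which are non-trivial; for ten features it yields 1024.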
Measure Space of Feature Combination
Let x be an image in an image database Ψ, x ∈ Ψ, with a total number of N images in the database. By convention, an index can be represented as an integer I, 0 ≤ I < N, and the measure space of the entire image database can be denoted by Θ, so that
$$\{\exists x \in \Psi \mid I = F(x),\; I \in [0, N),\; N = |\{x\}|,\; \Psi = \{x\}\}$$
$$\forall I \mid I = \{F^i\},\quad F^i = (F^i_0, \ldots, F^i_j, \ldots, F^i_{m-1}),\quad F^i_j = F_j(x),\quad i \in [0, 2^m),\; j \in [0, m)$$
The above equations indicate that each index I has (2^m - 1) non-trivial combinations with m features as measure vectors. In addition, a total of N*(2^m - 1) distinct indexes plus one empty vector are in the measure space, forming the bottom levels of the combined Meta KB:
$$\Theta(N) = \{\forall I \mid I \in [0, N)\}, \qquad |\Theta(N)| \le N \cdot (2^m - 1) + 1$$
Since 2^m grows much faster than m for m > 1, the number of combinations in the feature space increases exponentially with the number of features used, and thus the range of m should be chosen very carefully. As is well known in advanced mathematics, the solutions of combinatorial superposition vectors cannot be easily calculated from multiple measure vectors and their solutions. It is thus necessary in practical applications to process a total of N*(2^m - 1) vectors in the measure space of the Meta KB.
Computer software containing methods and systems for processing and retrieving images on a computer database is disclosed in patent documents such as
U.S. Patent 5,325,445, granted to Eastman Kodak Company, which describes a feature classification system for images which involves classifying features of defects in an image rather than the overall image and uses a hierarchical data structure for efficient access in order to classify features of an input image in real time. Specifically, the system assigns unique classifications to selected features of the same type to produce an n-element feature vector for each feature, thereby defining an associated n-dimensional feature space. The assigned classifications are used to cluster the corresponding feature vectors in feature space and define a tree-like hierarchical decomposition of the n-dimensional feature space. The hierarchical structure disclosed is used to classify individual features, such as photographic defects of an image, rather than characteristics of the overall image.
U.S. Patent Application 2002/0021841, filed by Hiroto Yoshii, discloses an information processing method for recognizing patterns within an image. The method includes generating a classification tree on the basis of learning patterns by generating a hierarchy of new feature amounts based on a linear combination of feature amounts derived from the learning patterns.
U.S. Patent Application 2002/0122587, filed by Samsung Electronics Co. Ltd, describes an image retrieval method in which an image similar to an input image is retrieved based on combinations of colour and texture features of the input image, by determining colour and texture distances between the input image and each data reference image in an image database and weighting the colour and texture distances with predetermined weighting factors which take human visual perception attributes into account. However, there does not appear to be any disclosure or suggestion of using a hierarchical data structure to store the colour and texture information.
Summary of The Invention
The invention provides a method and system for performing combined image classification, storage and rapid visual content retrieval on a computer database by using random combinatory information, sorting relevant information into categories and organizing it into data structures according to allocated indices.
In more detail, the method includes the step of vector calculation by comparison of multiple feature vectors of each image. Each image is equivalent to one vector system of five components for image classification, involving a number of combinative category steps selected from image space, base space, base measure space, feature space and measure space, using a combination of measure vectors and three structure levels represented by five spaces. The three structure levels are measured with features of the vector segments of colour, shape and texture derived from an initial enquiry from a selected image as input. A combined search is automatically performed using multiple levels of visual content image structure recognition represented by multiple spaces, comprising a first level containing an image database space (designated Image DB) and a feature space (Base Space), a second level containing a meta-knowledge space (Meta KB or Base Measure Space) and a third level containing a phase space (Feature Space); an additional level is a combined meta-knowledge base (Measure Space).
Method of Combined Search Scheme
Three Levels of Structure
The whole structure is composed of three levels represented by five spaces.
The first level contains N images and m features, composed of an image space (Image DB) and a base space (Feature Base). The second level contains N indexes and 2^m combinatorial members in a base measure space (Meta KB) and a feature space (Phase Space). To extend the feature space and the base measure space, a third level is created as a measure space (Combined Meta KB) containing N*(2^m - 1) non-trivial indexes. The concept relationships are represented in Figure 2.
Figure 2 Five Essential Spaces in Combined Search Scheme
There are five main parts in Figure 2, denoted by (1)-(5) respectively.
(1) Image database (Image Space) containing N images;
(2) Feature base (Base Space) with m distinct features;
(3) Meta-Knowledge Base (Meta KB, Base Measure Space) containing N indexes;
(4) Phase space (Feature Space) containing 2^m - 1 non-trivial forms;
(5) Combined Meta Knowledge Base (Measure Space) composed of N*(2^m - 1) non-trivial indexes.
Conditions For Using Measure Space
The most difficult issue in automatic visual object recognition is that all visual data first needs to be translated into numeric data for the artificial recognition process. However, due to the approximate nature of this translation process, the subsequent numeric analysis can only partially emulate human recognition. Different people, each influenced by varying personal experiences, may hold opinions that disagree with machine selected categories. Interpretations of any given image which depend upon using numerical measures to distinguish visual objects have similar limitations. Simply using colour, shape and texture as matching criteria is insufficient for generating perfect results. Since multiple viewpoints naturally exist in both imagination and reality, it is important to have an efficient mechanism to conveniently use all available information in multiple cases to support selected queries. The contribution of each feature state may considerably enhance the accuracy of the final results. A proper method and system that optimally utilises combinatorial information is needed to increase the reliability of the retrieval process.
Essential Organization of Combined Meta KB
A system provided by the present invention is shown in Figure 3. The main components in the system are shown as nine diagrams, (6)-(14).
Figure 3 Image, indexing, and combination among Image DB and Meta KB. Flowchart describing the interaction between the image, Image DB and Meta KB. Arrow types: indexing, collection, combination, hierarchy and linking.
(6) An individual image;
(7) Image collection has N elements containing the image (diagram 6);
(8) Image Database provides the image collection (7) with physical storage;
(9) An index corresponds to image (6);
(10) Index base contains N indexes corresponding to the image collection (7);
(11) N indexes are hierarchically organised into the Meta KB structure;
(12) 2^m indexes can be generated from the index in diagram (9) under 2^m feature combinations;
(13) N*2^m indexes can be created from the N indexes (diagram 10) under 2^m feature combinations;
(14) The combined Meta KB provides a data structure into which the N*2^m indexes are hierarchically organised.
There are combination links between diagrams (9)-(12) and (10)-(13).
Each individual index has 2^m combinations from its feature space. There is a one-to-one hierarchical mapping between the linear list of images (7) and the hierarchical image storage database (8). In practice, this type of linking can be implemented using a LUT (Look Up Table). Hierarchical relationships between the diagrams (10) and (11) strongly depend on the similarity measures. If the Meta KB in the system relies on non-structured organization, it must apply an exhaustive method in which the search time is proportional to O(N) to find the best candidate for each query. Using ANN, KNN, BT and other hierarchical technologies, fewer operations are needed to retrieve values within the index set and Meta KB, reducing the search time complexity to O(log N) or O(√N) depending on how well the data structure is implemented. From a structural viewpoint, the diagrams (13) and (14) may use the same methods as diagrams (10) and (11). A clear difference, however, is that the number of indexes contained in the combined Meta KB is significantly larger than in the original Meta KB. To find the best candidate from all non-trivial indexes using exhaustive search will require O(N) computations between diagrams (10) and (11), and at least O(N*(2^m - 1)) computations between diagrams (13) and (14).
Combined Search Method
Under typical conditions, the similarity operation provides orthogonal projections to partition the original O(N*(2^m - 1)) vectors in the measure space into (2^m - 1) portions, each of which may contain a projection of the original N measure vectors on selected features. Each measure vector needs to match a set of best candidates (the vectors with the greatest similarity to the vector of the original query) from a query. Orthogonal properties of measures provide an efficient framework for matching the queried best vector candidates collected from the Combined Meta KB with a vector from the same feature space. For example, when the joint properties Colour + Texture are used to find the best candidates from the same measure space, the most relevant candidates will also come from the Colour + Texture group, due to the different feature segments and their orthogonal properties. In general, the (2^m - 1) different query sequences as a whole may contain a large amount of redundant information. In practice, only a small number of the best candidates need to be collected from the original N members. To make good use of the advantages of multiple features, the Meta KB needs to be searched with combined queries. After collecting the information from all possible projections, appropriate merging operations are required to eliminate redundant results, thus increasing relevancy. A final sorting operation is applied to the remaining candidates, arranging them in linear descending order with the best candidate first. A number of the best candidates are then selected to obtain more robust results than with a single feature selection. The entire system of operations is shown in Figure 4.
Figure 4 Combined Search Method and System
There are ten diagrams in Figure 4, denoted by (15)-(24) respectively.
(15) Initial query from a selected image as input;
(16) Content-based indexing of the image based on m features;
(17) (2^m - 1) measure vectors prepared for combinatorial measures;
(18) Each measure matched with the Combined Meta KB (diagram 19) to obtain p best candidates;
(19) Combined Meta KB containing at least N*(2^m - 1) measures in the measure space;
(20) All matched results, ≤ p*(2^m - 1), collected from diagram (18);
(21) Results merged under certain conditions to reduce redundancy;
(22) Merged results sorted by similarity measures;
(23) Best q candidates selected from list;
(24) Final q results outputted.
In the proposed Combined Search Method and System, the three diagrams 18, 19 and 21 illustrate the key novelty of the invention.
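The overall flow of the combined search (project the query on every non-trivial feature combination, collect the p best matches per combination, merge, sort, and keep the best q) can be sketched as below. This is a minimal illustration under stated assumptions: it uses an exhaustive scan in place of the hierarchically organised Combined Meta KB, uses a maximum-based merge, and all names (`cosine`, `project`, `combined_search`) and the segment layout are hypothetical.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity measure; returns 0.0 when either vector is all zeros (assumption)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def project(index: Sequence[float], segments: Sequence[Tuple[int, int]], mask: int) -> List[float]:
    """Keep the segments whose bit is set in mask and zero all other positions."""
    out = [0.0] * len(index)
    for j, (start, end) in enumerate(segments):
        if (mask >> j) & 1:
            out[start:end] = index[start:end]
    return out


def combined_search(query: Sequence[float], database: Dict[str, Sequence[float]],
                    segments: Sequence[Tuple[int, int]], p: int, q: int) -> List[Tuple[str, float]]:
    m = len(segments)
    collected = []  # (similarity, image id, combination mask)
    for mask in range(1, 2 ** m):  # the 2^m - 1 non-trivial combinations
        q_proj = project(query, segments, mask)
        scored = [(cosine(q_proj, project(idx, segments, mask)), image_id, mask)
                  for image_id, idx in database.items()]  # exhaustive scan, for illustration
        collected.extend(heapq.nlargest(p, scored))
    best: Dict[str, float] = {}
    for s, image_id, _mask in collected:  # merge duplicates, keeping the largest measure
        best[image_id] = max(s, best.get(image_id, 0.0))
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:q]
```

In a practical system the inner scan would be replaced by a lookup in a structured index, since the exhaustive version costs on the order of N*(2^m - 1) similarity computations per query.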
Key Parts of Combined Search Method
Conditions to get Best Candidates from Meta KB
Two key components determine whether the proposed system is a practical one. Figure 5 focuses on diagrams 18 and 19 from the system for further discussion.
Figure 5 Obtaining best candidates from Combined Meta KB. (18) For each i, 0 ≤ i < (2^m - 1), F^i(x) is matched with all measures in the Combined Meta KB to obtain ≤ p best members; (19) Combined Meta KB.
In general, it is necessary to perform (2^m - 1) queries on the Combined Meta KB (diagram 19) to find the best p candidates from each query. Considering that the Meta KB (19) contains at least N*(2^m - 1) items, if all of them are in random order without structural organization, then it is necessary to perform O(p*N*(2^m - 1)) operations using an exhaustive search to find the p best candidates from the Meta KB (19). Performing all (2^m - 1) queries would therefore require O(p*N*(2^m - 1)^2) operations (for any m > 1, (2^m - 1) » 1), making this type of search operation extremely expensive. To feasibly implement such a method, it is essential to use a fast search mechanism for searching the Combined Meta KB (19), which therefore must be structurally organised.
All N*(2^m - 1) items have intrinsic connections to the original N elements. This condition allows the entire structure to be linked to a full combinatorial lattice. Proper construction involves organizing all relevant elements into a lattice with all N*(2^m - 1) indexes as the end nodes. In addition, complicated hierarchical construction may be required to pre-organize nodes into a structure composed of clustered intermediate nodes (implicit nodes). From artificial intelligence and advanced information retrieval applications, a number of structural organisation techniques, such as ANN and KNN, can be applied to the lattice. Even though this is an extremely important issue for applications, it is not a subject for our monopoly claim and others are free to use any method to complete this task.
Determination of Usability and Suitable Configurations
Two typical search models will be used for comparison. The first takes O(log N) computations to search N ordered items, providing the fastest possible theoretical computational time for a clustering approach, while the second model takes O(√N) computations on the same collection. These may provide realistic bounds for the most commonly used clustering approaches such as ANN, KNN and even Quantum Database Search methods. In the method according to the present invention, it is necessary to apply state-of-the-art clustering technologies to the data structure to reduce the search time. The time required to find the best candidate is between O(log(N*(2^m - 1))) ≈ O(log N) + O(m) and O(√(N*(2^m - 1))) ≈ O(√N * 2^(m/2)).
To collect all p*(2^m - 1) best candidates would require between O((2^m - 1) * (log N + m)) and O((2^m - 1) * √(N*(2^m - 1))) computations. In terms of computational complexity, there are always O(2^m) factors involved in this type of problem. To use this method, it is necessary to restrict m to the range [2, 4] to avoid significant increases in computational complexity when handling larger data structures. To readily compare results, speedup ratios between an exhaustive search and two combined hierarchical clustering search algorithms will be evaluated using the above complexity equations to determine the range of N and m in which the proposed method is practicable. In Table 1, the following equation is applied to the range N = 100 to 1,000,000:
$$\text{Speedup Ratio I} = \frac{N}{(2^m - 1) \times (\log N + m)}$$
Table 1 Speedup Ratio I (all values rounded to nearest integer)
Under the same conditions, the second type of speedup ratio is shown in Table 2 using the following equation:
$$\text{Speedup Ratio II} = \frac{N}{(2^m - 1) \times \sqrt{N \times (2^m - 1)}}$$
From the speedup ratios listed in Tables 1 and 2, it can be seen that the proposed method becomes more valuable for larger values of N. This speedup ratio property provides a measure of economic viability, allowing a judgement of whether the proposed method is better than exhaustive search in real life applications. The values in Tables 1 and 2 provide bounds against which real life search time efficiency can be compared. It is feasible to determine the optimal properties of the applied method directly without involving any complex analysis. In short, only optimal clustering constructions can support the proposed scheme in reality. Without an efficient mechanism to find the best p candidates significantly faster than exhaustive search, the proposed method cannot be efficiently applied to real life environments.
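Speedup Ratio I can be evaluated directly from its equation. The sketch below assumes a base-2 logarithm; the text does not state the base, but base 2 reproduces the value 11 quoted in the Appendix for m = 3 and N = 1000. The function name is illustrative.

```python
import math


def speedup_ratio_1(n: int, m: int) -> float:
    """N / ((2^m - 1) * (log N + m)), with log taken to base 2 (assumption)."""
    return n / ((2 ** m - 1) * (math.log2(n) + m))
```

For example, `round(speedup_ratio_1(1000, 3))` evaluates to 11, and the ratio grows as N increases for a fixed m.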
Merging Three Methods to Increase Visibility and Reliability
When the best p*(2^m - 1) candidates have been collected in the best candidate collection (BCC), there may exist many candidates linked to the same indexes. It is important for the system to merge duplicate samples by removing redundant information.
Figure 6 Merging Methods and Their Renormalization. (21) Merging methods applied to the relevant members in BCC (non-merging, add-merging, max-merging), followed by renormalization of measures.
Since (2^m - 1) queries are processed in the search process, it is highly probable that the same sample is obtained many times with various similarity measures. The three merging methods used in the system are:
• Non-merging method
• Add-merging method
• Max-merging method
Non-merging Method
The non-merging method provides a restricted merging action which merges all relevant original samples from the same index into a representative. The representative always carries the largest similarity measure in the group, as shown in Diagram 25.
(25) For all I, J in BCC on x, do
    If I, J are the same index then
        Remove I from BCC, if S(I, x) ≤ S(J, x)
        or Remove J from BCC, if S(I, x) > S(J, x)
    Where S(I, x) and S(J, x) are I and J's similarity measures
    Repeat until non-redundancy in BCC
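One way to read the non-merging rule is sketched below. It assumes that each candidate in the BCC is a tuple of base image number, feature combination mask and similarity measure, and that "the same index" means the same image under the same combination; both the tuple layout and this reading are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Candidate = Tuple[int, int, float]  # (base image number, combination mask, similarity)


def non_merging(bcc: List[Candidate]) -> List[Candidate]:
    """Keep one entry per (image, combination) pair, with its largest similarity."""
    best: Dict[Tuple[int, int], float] = {}
    for image, mask, s in bcc:
        key = (image, mask)
        if key not in best or s > best[key]:
            best[key] = s
    return [(image, mask, s) for (image, mask), s in best.items()]
```

Under this reading the same image can still appear several times in the result, once for each feature combination that matched it.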
Add-merging Method
The add-merging method is an enhanced merging mechanism that merges all combined samples into the base index and uses a summed measure as the representative, as shown in Diagram 26.
(26) For all I, J in BCC on x, do
    If I, J are combined indexes relevant to the same group then
        Renew S(I, x) = S(I, x) + S(J, x)
        Remove J from BCC
    Where S(I, x) and S(J, x) are I and J's similarity measures
    Repeat until non-redundancy in BCC
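The add-merging rule can be sketched as a grouped sum. The candidate layout (base image number, combination mask, similarity) and the use of the base image number as the group key are assumptions for this illustration.

```python
from typing import Dict, List, Tuple


def add_merging(bcc: List[Tuple[int, int, float]]) -> Dict[int, float]:
    """Sum the similarity measures of all combined indexes of the same base image."""
    totals: Dict[int, float] = {}
    for image, _mask, s in bcc:
        totals[image] = totals.get(image, 0.0) + s
    return totals
```

The summed measure of an image matched by several combinations can exceed 1, which is why a renormalization step is needed afterwards.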
Max-merging Method
The max-merging method is an alternative enhanced merging mechanism which merges a group of all relevant combined samples into one base index and uses the maximal measure of the group as the representative, as shown in Diagram 27.
(27) For each I, J in BCC on x, do
    If I, J are two combined indexes relevant to a group then
        Remove I from BCC, if S(I, x) ≤ S(J, x)
        or Remove J from BCC, if S(I, x) > S(J, x)
    Where S(I, x) and S(J, x) are I and J's similarity measures
    Repeat until non-redundancy in BCC
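The max-merging rule can be sketched as a grouped maximum. As before, the candidate layout (base image number, combination mask, similarity) and the grouping by base image number are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def max_merging(bcc: List[Tuple[int, int, float]]) -> Dict[int, float]:
    """Keep, for each base image, the largest similarity over all its combined indexes."""
    best: Dict[int, float] = {}
    for image, _mask, s in bcc:
        if image not in best or s > best[image]:
            best[image] = s
    return best
```

Unlike the summed measure, the maximum of measures in [0, 1] stays within [0, 1].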
Renormalization Method
Both the non-merging and max-merging methods keep the range of measure values within [0, 1]. However, the add-merging operation can push the measure of a representative outside this range when redundant indexes are merged. Renormalization converts combined measures back into the original measure range. For consistent results and better illustration, it is necessary to use a transforming equation to renormalize measures and restrict similarity measure values to [0, 1], as shown in Diagram 28.
(28) For each I in BCC on x, do
    Renew S(I, x) = 1 - exp(-S(I, x) / 2)
After renormalization, all measures are contained within the range [0, 1], with one measure for each representative. A sorting scheme can be used to order the elements in the best candidate collection in linear order allowing the first q best candidates to be selected as the final output (Diagrams 23 - 24).
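The renormalization and final selection steps can be sketched as follows. The transform 1 - exp(-S/2) is one reading of Diagram 28 and should be treated as an assumption, as should the function names and the dictionary layout (base image number mapped to merged measure).

```python
import math
from typing import Dict, List, Tuple


def renormalize(s: float) -> float:
    """Map a non-negative merged measure into [0, 1) (assumed form of Diagram 28)."""
    return 1.0 - math.exp(-s / 2.0)


def select_best(merged: Dict[int, float], q: int) -> List[Tuple[int, float]]:
    """Renormalize every representative, sort in descending order, keep the first q."""
    ranked = sorted(((image, renormalize(s)) for image, s in merged.items()),
                    key=lambda item: item[1], reverse=True)
    return ranked[:q]
```

Because the transform is monotonically increasing, it changes the measure values but not the order of the candidates.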
Possible Applications of Our Search Method
Our method uses feature combinations and the transformation of redundant samples to enhance reliability in information retrieval. To demonstrate the method's usability in practical applications, a variety of test cases are discussed.
Determining the Quality of Content-based Retrieval Systems
Multiple features and combinations provide a natural way of representing complicated objects with multiple viewpoints. Under the proposed construction, each index is matched with its maximal similarity. This property can be easily applied to determine search accuracy on any implemented system.
Lack of Accuracy in Feature Indexes
This problem within an information retrieval system can be observed using the following combined approach. We may observe many images with the same similarity measures. The best matched candidates might be in the list but might not necessarily be in the first position if the hierarchies of the Meta KB were not constructed by optimal methods. Such a problem is usually caused by a lack of separable information to make significantly distinct projections among the features used. Under such conditions, significant enhancements to the feature indexes are required. This effect can also be observed under the non-merging and max-merging strategies, as shown in Figures 8-10.
Sufficient Information on Feature Indexes
In contrast to the previous condition, if each index already contains sufficient information in a well organized hierarchical structure, then the queried object will always appear as the first match within the results. When every selected index has this property, it indicates that the whole Meta KB is constructed well, with accurate links from each feature and combination to their real objects. This effect can be observed under the non-merging, max-merging and add-merging strategies, as shown in Figures 16-19.
A Simple Way to Check Retrieval Quality
For well indexed and hierarchically constructed databases, the non-merging method can be applied to determine the retrieval system's quality. For well indexed features and a well constructed Meta KB, a proper query result needs to contain the (2^m - 1) non-trivial combinations occupying the first (2^m - 1) positions of the best q candidates, where q > (2^m - 1). This effect is shown in Figure 17.
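This check can be sketched as a small predicate over a ranked non-merging result. It assumes that each ranked entry is a tuple of base image number, combination mask and similarity, and that the query image itself is stored in the database; the function name and tuple layout are hypothetical.

```python
from typing import List, Tuple


def passes_quality_check(ranked: List[Tuple[int, int, float]], query_image: int, m: int) -> bool:
    """True when the first 2^m - 1 entries are the query image under distinct combinations."""
    k = 2 ** m - 1
    top = ranked[:k]
    same_image = all(image == query_image for image, _mask, _s in top)
    distinct_masks = len({mask for _image, mask, _s in top}) == k
    return len(top) == k and same_image and distinct_masks
```

A result that fails this predicate suggests that some feature index or some part of the hierarchy does not separate the query image from its neighbours.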
Checking for Duplicates Within Image DB
In a properly constructed retrieval system using the add-merging or max-merging method, duplicate images can be found efficiently from the queried list simply by checking neighbouring entries within the Combined Meta KB. In real world environments, it is common to have a large number of duplicated images stored in different directories with different filenames and varying sizes. The proposed method can thus serve as a convenient method of removing this type of duplication. This will assist greatly in creating refined image databases and in removing redundancy from base image collections. This process is shown in Figures 12-15.
Appendix: All example figures were produced under the following parameters. There are 1005 images in the image database and a total of 7035 combined indexes in the combined Meta KB. An exhaustive search on the samples needs to match all 7035 indexes.
Base Image Number = 1005
Combined Meta KB indexes = 7035
Feature Number = 3
Non-trivial combination number = 7
Index vector length = 86
Table 3 Time Measures & Speedup Ratios in Test Samples
Compared with the value of Speedup Ratio I in Table 1 (m = 3, N = 1000), which is 11, a well-constructed test system in optimal conditions may belong to the Speedup Ratio I category.