US20120250984A1 - Image segmentation for distributed target tracking and scene analysis - Google Patents
- Publication number
- US20120250984A1 (application US13/309,551)
- Authority
- US
- United States
- Prior art keywords
- segmentation
- image
- planes
- feature
- pixels
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/292—Multi-camera tracking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
- H04N7/181—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
Definitions
- Embodiments of the present invention address and overcome one or more of the above shortcomings and drawbacks, by providing devices, systems, and methods for segmenting images.
- This technology is particularly well-suited for, but by no means limited to, real-time segmentation of images for identifying salient features of an image.
- Embodiments of the present invention are directed to a method for identifying salient segments in an image comprising the steps of: choosing a plurality of planes that divide a feature space into a plurality of cells; generating a first set of hash codes for a first set of pixels in an image based on a location in the feature space of a feature vector associated with each pixel in the first set, whereby the location of each feature vector relative to each of the plurality of planes contributes a binary value to each hash code; determining a second set of hash codes selected from the first set of hash codes, wherein the second set of hash codes indicates local maxima of clusters of pixels; assigning each of the first set of pixels to one of the second set of hash codes based on the Hamming distance between a first hash code from the first set of hash codes assigned to each pixel and each of the second set of hash codes; and identifying segments in the image based on groups of adjacent pixels sharing common hash codes.
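The claimed steps can be illustrated with a short, self-contained sketch. It is a rough approximation under assumed parameter names and a toy two-color image, not the patented implementation itself:

```python
import numpy as np

def segment(image, n_planes=8, k=1, seed=0):
    """Sketch of the claimed pipeline: hash pixels with random splitting
    planes, find population maxima in code space, then label pixels by
    Hamming distance to the nearest maximum."""
    rng = np.random.default_rng(seed)
    h, w, d = image.shape
    feats = image.reshape(-1, d).astype(float)

    # Choose random splitting planes and cast each plane through the
    # centroid of the projected values (the mean splitting value).
    normals = rng.standard_normal((n_planes, d))
    proj = feats @ normals.T                         # (N, n_planes)
    bits = proj > proj.mean(axis=0)                  # side of each plane
    codes = bits.astype(int) @ (1 << np.arange(n_planes))

    # Population of each cell (hash code) of the fractured feature space.
    pop = np.bincount(codes, minlength=1 << n_planes)

    # Precompute pairwise Hamming distances between all 2^n codes.
    popcnt = np.array([bin(i).count("1") for i in range(1 << n_planes)])
    every = np.arange(1 << n_planes)
    ham = popcnt[np.bitwise_xor.outer(every, every)]

    # A populated code whose population exceeds that of every code within
    # Hamming distance k is declared a cluster center.
    centers = []
    for c in every:
        if pop[c] == 0:
            continue
        neigh = ham[c] <= k
        neigh[c] = False
        if pop[c] > pop[neigh].max():
            centers.append(c)
    centers = np.array(centers)

    # Assign every pixel to the nearest center by Hamming distance.
    labels = centers[np.argmin(ham[codes][:, centers], axis=1)]
    return labels.reshape(h, w)

# Toy example: the two halves of a two-color image should come back as
# two distinct clusters.
img = np.zeros((8, 8, 3))
img[:, 4:] = [200.0, 30.0, 30.0]
labels = segment(img)
```

The sketch uses a dense 2^n-by-2^n Hamming table for clarity, which is only practical for small n; a production version would enumerate neighbors by bit flips instead.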
- the step of choosing a plurality of planes may further comprise selecting a predetermined number of random planes in the feature space.
- the feature space may be a three-dimensional color space.
- the step of choosing a plurality of planes may further comprise training the plurality of planes based on human interpretations of images.
- FIG. 1A is a two-dimensional view of a simplified version of the randomized hashing scheme, in which a feature space is fractured into regions by a set of randomly chosen splitting planes. Each region is associated with a hash code indicating where it falls with respect to the splitting planes;
- FIG. 1B is a view of the relationship between hash codes and the vertices of a four-dimensional hypercube in an exemplary embodiment;
- FIG. 2 is a flow chart showing operation of an exemplary algorithm for use in segmenting images in accordance with some embodiments of the random hashing scheme;
- FIG. 3 shows two histograms of the distributions of the GCE and Rand Index metrics over a sample data set as a result of the application of certain embodiments of the random hash scheme.
- FIG. 3 includes sample images along with the human segmentation and the machine segmentation.
- FIG. 4 shows a plurality of sample images, along with the human segmentation and the machine segmentation, including both prior art and an embodiment of the random hashing scheme.
- the segmentation scheme described in this application employs a feature based approach.
- Each pixel in the image is described by a feature vector which encodes a set of properties used to describe that pixel.
- Embodiments of the present invention can employ a simple color descriptor vector, which is an example of a feature vector, but some embodiments also use more sophisticated feature vectors such as a histogram of color values or a vector of texture coefficients.
- Some embodiments can employ an approach to segmenting natural images which leverages the idea of randomized hashing. The procedure aims to replace the problem of finding clusters in the feature space with the problem of finding local maxima in a graph whose topology approximates the geometry of the underlying feature space. In so doing the method can bypass the computational effort associated with computing distances between feature vectors which can comprise a significant fraction of the effort in other techniques such as k-means clustering and mean shift segmentation.
- the method can be controlled by a few parameters, namely: the number of random splitting planes, n; the Hamming distance threshold, k; and the window size that is used to average the color vectors, w.
- the algorithm can be made to produce over-segmentations or under-segmentations of the input imagery.
- the number of segments that are produced is implicitly controlled by these parameters rather than explicitly provided as an input to the algorithm.
- a feature space for use with some embodiments includes a 3D space with orthogonal axes for each color: red, green, and blue (RGB).
- the feature space uses axes for Y, Pb, and Pr.
- more than three dimensions can be used.
- other dimensions can include motion vectors, depth, or other information that can be obtained related to each pixel, so that each pixel (or group of pixels) can be mapped to the feature space.
- Entries in this feature vector characterize salient properties of the region surrounding that pixel such as color (e.g. RGB, YPbPr, or multispectral properties, such as IR, UV, or information from a FLIR camera), texture, frequency content, and/or motion properties.
- FIG. 1A depicts a simplified 2D version of the randomized hashing scheme in a feature space fractured into regions by a set of randomly chosen splitting planes. Each region is associated with a hash code indicating where it falls with respect to the splitting planes.
- the set of all hash codes can be associated with the vertices of a hypercube as shown in FIG. 1B, where the shading of the nodes indicates how many feature vectors are hashed to that code.
- the segmentation scheme proceeds by identifying local maxima in this hash code space.
- FIGS. 1A and 1B show a simplified view of this procedure in two dimensions.
- the random splitting planes 0, 1, 2, and 3 are used to hash the feature vectors into a set of disjoint cells based on their location.
- One can hash a set of vectors into a set of discrete bins in order to accelerate the search for nearest neighbors.
- This randomized hashing procedure tends to preserve locality, so points that are near to each other in the feature space are hashed to the same bin with high probability.
- the proposed segmentation scheme leverages these phenomena to cluster the feature vectors into groups.
- each splitting plane can also be considered a normal vector and a decision point (such as a mean) in the feature space. By projecting each pixel onto that normal vector and determining its position relative to the decision point, a bit associated with each normal vector can be assigned.
- Neighboring cells in the feature space differ by a single bit so the Hamming distance between the codes provides some indication of the distance between vectors in the feature space. More generally we can construct a correspondence between the set of all possible hash codes and the vertices of an n-dimensional hypercube.
- the topology of the hypercube 100 reflects the structure of the feature space since neighboring cells in feature space will correspond to neighboring vertices in the hypercube.
- the shaded nodes, 1001, 0000, and 0111, are those bins that have local maxima clusters in FIG. 1A.
- the clustering procedure can record how many feature vectors are mapped to that code.
- clusters in feature space will induce population maxima in the code space. That is, if we consider the hypercube as a graph we would expect to observe that some of the hash codes have a greater population than their neighbors. This allows us to replace the original problem of clustering vectors in the feature space in favor of the simpler problem of looking for population maxima in the code space graph.
- the algorithm interrogates all of the codes that differ from the current code by k bits or fewer.
- This parameter, k, is referred to as the Hamming distance threshold. If the code under consideration has a population greater than all of its neighbors, it is declared a local maximum and a cluster center. In this way the number of clusters recovered by the procedure is determined automatically based on the data, as opposed to being imposed a priori as in k-means. Note that this scheme can be used to distinguish up to 2^(n-k) local maxima.
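The neighborhood interrogation described here can be sketched directly. Function names and the toy population table are illustrative assumptions:

```python
from itertools import combinations

def neighbors_within(code, n_bits, k):
    """Yield every code within Hamming distance 1..k of `code`."""
    for d in range(1, k + 1):
        for positions in combinations(range(n_bits), d):
            flipped = code
            for p in positions:
                flipped ^= 1 << p   # flip bit p
            yield flipped

def is_local_maximum(code, population, n_bits, k):
    """A code is declared a cluster center when its population exceeds
    that of every code within the Hamming distance threshold k."""
    return all(population.get(q, 0) < population[code]
               for q in neighbors_within(code, n_bits, k))

# Toy population counts over 4-bit codes.
population = {0b1001: 40, 0b1000: 10, 0b0111: 25}
```

Here 0b1001 is a local maximum for k=1 because its only populated neighbor, 0b1000, has a smaller population; 0b1000 itself is not.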
- the normals associated with the splitting planes, u_i, can be chosen randomly or based on a priori knowledge.
- the splitting values, s_i, can be chosen by considering the distribution of the projected values, (v_i · u_i).
- the mean of the distribution, which corresponds to casting all of the splitting planes through the centroid of the distribution, is used.
- the median value, or the value midway between the maximum and minimum projected values, is used as the splitting value (i.e. the decision point for assigning bit values to pixels projected on the normal vector).
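A minimal sketch of these alternative splitting-value choices (the function and the `method` keyword are our naming, not the patent's):

```python
import numpy as np

def splitting_value(projections, method="mean"):
    """Pick the decision point s_i along a normal from the projected
    values v_i . u_i."""
    p = np.asarray(projections, dtype=float)
    if method == "mean":        # casts the plane through the centroid
        return p.mean()
    if method == "median":      # balances the population on both sides
        return np.median(p)
    return (p.max() + p.min()) / 2.0   # midway between extremes
```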
- each of the feature vectors is labeled with the hash code of the closest local maximum based on the Hamming distance.
- the Euclidean distance between the feature vector and the mean cluster vector is used to break the tie and decide the label.
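The Hamming-distance assignment with a Euclidean tie-break might look like the following sketch; the `centers` mapping from hash code to mean cluster vector is an assumed representation:

```python
import numpy as np

def assign_cluster(code, feat, centers):
    """centers: dict mapping each local-maximum hash code to the mean
    feature vector of its cluster."""
    dists = {c: bin(code ^ c).count("1") for c in centers}
    best = min(dists.values())
    tied = [c for c, d in dists.items() if d == best]
    if len(tied) == 1:
        return tied[0]
    # Tie: fall back to Euclidean distance to the mean cluster vector.
    return min(tied, key=lambda c: np.linalg.norm(feat - centers[c]))

centers = {0b0000: np.array([0.0, 0.0, 0.0]),
           0b1111: np.array([1.0, 1.0, 1.0])}
```

A code such as 0b0011 is equidistant (Hamming distance 2) from both centers, so the pixel's feature vector decides the label.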
- FIG. 2 shows method 300 for segmenting an image.
- a feature space is divided by a predetermined number, n, of splitting planes.
- This step can include randomly assigning feature planes and using a calibration image or a first image to position the splitting plane for a given orientation. For example, the position of each plane can be chosen such that feature vectors (e.g. those feature vectors associated with each pixel in a test image) within the feature space are evenly divided on either side of the splitting plane.
- each splitting plane can be created by choosing a random normal vector and assigning a decision point along that vector such that the decision point is at the mean or median of the distribution of feature vectors (e.g. pixels projected on that normal vector).
- step 302 takes into account predetermined splitting planes that have been created via prior training or have been manually assigned.
- the splitting planes can include a combination of random splitting planes and preassigned splitting planes.
- each feature vector in the image (or a subset of the feature vectors in the image) is hashed using the splitting planes.
- This process can be computationally simple, as each bit in the hash simply indicates on which side of the splitting plane the feature vector resides. Because only a bit is used for this hash, the additional computational overhead needed for deriving the Euclidean distance from the splitting plane is not necessary.
- Step 304 can be an iterative loop whereby each feature vector is taken and compared in succession to each of the n splitting planes. It will be appreciated that massive parallelism may be possible using an appropriate processor, such as a DSP or graphics processor, to perform this step.
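The per-plane bit test can be written as a plain loop to emphasize that each bit costs only a dot product and a comparison (names and plane values are illustrative):

```python
def hash_pixel(feat, planes):
    """planes: list of (normal_vector, splitting_value) pairs.
    Each bit costs one dot product and one comparison; no Euclidean
    distances are computed."""
    code = 0
    for i, (normal, split) in enumerate(planes):
        proj = sum(f * u for f, u in zip(feat, normal))
        if proj > split:
            code |= 1 << i          # bit i: positive side of plane i
    return code

# Two axis-aligned planes in a 3D color space (illustrative values).
planes = [((1.0, 0.0, 0.0), 0.5), ((0.0, 1.0, 0.0), 0.5)]
```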
- the algorithm proceeds to step 306 .
- the number of feature vectors resident in each cell in the feature space is counted.
- the population counts for each cell in the feature space are compared to choose a number of local maxima.
- the number of local maxima, M, is predetermined prior to image processing.
- the number of local maxima, M, is derived dynamically from the image based on the results of the residency counts of each cell in the feature space.
- the maximum number of local maxima, M, can be determined based on the number of splitting planes used and the Hamming distance requirements for clusters used in image segmentation. Once the local maxima are identified, these can be used as the centers of the clusters used for image segmentation.
- each feature vector under test can be assigned to each of the cluster centers determined in step 308 .
- this assignment has low computational overhead because the hash of each feature vector is compared to each of the cluster centers and the cluster having the nearest Hamming distance to the hash of the vector is selected as the cluster to which the feature vector will be assigned.
- ties between competing clusters (i.e. clusters that are the same Hamming distance away from the hashed feature vector) can be broken, for example, by using the Euclidean distance between the feature vector and the mean cluster vector.
- image segments can be identified within the image at step 312 .
- adjacent (i.e. connected) pixels in the image plane that have had their feature vectors assigned to the same cluster can be considered part of the same image segment.
- Any known technique can be used, such as minimum threshold sizes for segments of adjacent pixels. In this way, image segments can be rapidly deduced from the image by scanning pixels in the image plane and determining whether they share the same cluster in the feature space.
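One way to group adjacent pixels sharing a cluster label is a simple 4-connected flood fill; this is a generic sketch, not the patent's specific connected-components routine:

```python
from collections import deque

def connected_segments(labels):
    """Group 4-connected pixels sharing a cluster label into segments.
    `labels` is a 2D list of per-pixel cluster ids."""
    h, w = len(labels), len(labels[0])
    seg = [[-1] * w for _ in range(h)]
    next_id = 0
    for sy in range(h):
        for sx in range(w):
            if seg[sy][sx] != -1:
                continue
            # Flood fill from this unvisited pixel.
            seg[sy][sx] = next_id
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and seg[ny][nx] == -1
                            and labels[ny][nx] == labels[y][x]):
                        seg[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return seg, next_id
```

Note that two regions with the same cluster label but no connecting path become distinct segments, which is the intended behavior.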
- step 314 may optionally adjust the splitting planes based on the results in step 312 .
- the results of step 312 can be compared via a consistency score with a desired segmentation result. Based on this result, the splitting planes can be adjusted to improve this consistency score.
- incremental improvement of splitting planes is done in real time based on other criteria, such as reducing the number of identified segments in step 312 or increasing the number of identified segments in step 312 to reach a desired range of segments.
- the proposed scheme can be similar in spirit to the Mean Shift segmentation algorithm which also seeks to identify modes in the distribution of feature vectors.
- the mean shift scheme uses a Parzen Window based scheme to estimate density in feature space.
- the proposed scheme uses hashing, which can be randomized or tailored to apriori information, to identify salient groupings of feature vectors.
- the segmentation scheme can make implicit use of the Johnson-Lindenstrauss theorem which justifies the use of random projection by bounding the distortion of the relative distances between the feature vectors induced by the projection process.
- the hashing scheme may use planes that are predetermined, selected from a bounded group of planes, estimated or calculated based on some selected criteria. For example, if a particular color is known to be important, the hashing scheme can take this into account, by using at least some predetermined feature planes to create the hash or introduce a bias into the selection process, which might otherwise be random.
- the hashing scheme may also include an adaptive selection algorithm to learn how to select more appropriate feature planes in future video frames. This learning process can be by any scheme known to a person of ordinary skill in the art, including using human feedback to help train parameters, genetic algorithms, decision trees, pruning, beam searching, or the like.
- n denotes the number of projection directions
- m denotes the dimension of the feature space
- N denotes the total number of pixels or feature vectors. Note that the scheme can avoid the explicit distance computations between the feature vectors that one uses in most agglomerative and k-means segmentation schemes in favor of randomized hashing.
- the proposed segmentation scheme can be carried out using training.
- For example, the Berkeley Segmentation Database, which contains 1633 manual segmentations of 300 color images, can be used for this purpose.
- the selected splitting planes can be improved.
- training can be used to assist in selecting the appropriate number of splitting planes or in selecting specific splitting planes to include.
- the manual segmentations provided by the users can be compared with the segmentations produced by the algorithm using two different measures, the Global Consistency Error and the Rand Index.
- the Global Consistency Error measure proposed by Martin, Fowlkes, Tal and Malik can capture the difference between two segmentations in a single number between 0 and 1, where lower numbers indicate lower error.
- the measure can be designed such that if one segmentation is a refinement of the other the score will be zero. This is a useful feature since it accounts for the fact that human subjects often choose to segment scenes to various levels, however, it also implies that machine segmentations that are strongly over or under segmented can also yield very low GCE scores which can be misleading.
- the Rand Index represents the fraction of the pixel pairs that are labeled consistently in the two segmentations; values that are closer to 1 indicate better segmentations. Unlike the GCE, the Rand Index will suffer if the machine segmentation is over or under segmented with respect to the human segmentation.
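The Rand Index as described can be computed naively over all pixel pairs; this is fine for illustration, though it is quadratic in the number of pixels:

```python
from itertools import combinations

def rand_index(seg_a, seg_b):
    """Fraction of pixel pairs on which the two segmentations agree
    about whether the pair shares a label."""
    agree = total = 0
    for (a1, b1), (a2, b2) in combinations(list(zip(seg_a, seg_b)), 2):
        agree += (a1 == a2) == (b1 == b2)
        total += 1
    return agree / total
```

Because only pairwise agreement matters, relabeling a segmentation (swapping label names) does not change its score.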
- a series of segmentation experiments can be carried out using feature spaces based on color information.
- the color values can be averaged over a square window of width w centered around each pixel. Effectively, this averaging is a preprocessing step that can aid in smoothing noise out of the image. Increasing the size of this window can increase the level of smoothing and leads to a coarser segmentation. It will be appreciated that other filtering schemes can be used to smooth or preprocess the image before performing image segmentation.
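The window-averaging preprocessing can be sketched as a simple box filter; the function name and edge handling are our assumptions:

```python
import numpy as np

def window_average(image, w):
    """Average the color vectors over a w-by-w window centered on each
    pixel; border pixels reuse the nearest edge value."""
    pad = w // 2
    padded = np.pad(image.astype(float),
                    ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(w):
        for dx in range(w):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (w * w)
```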
- FIG. 3 shows histograms of the distributions of the GCE and Rand Index metrics over the entire data set.
- the first graph on the left of FIG. 3 indicates the distribution of the GCE values over all of the segmentations in the database while the graph on the right denotes the distribution of the Rand Index values.
- FIG. 4 shows a few of the images from the data set along with the human segmentation and the machine segmentation. This figure compares the output of the automated segmentation procedure to human labeled segmentations.
- the first and fourth rows contain the input imagery, the second and fifth rows contain human segmentations while the third and sixth rows contain machine segmentations.
- FIG. 5 provides a direct comparison of segmentations produced by the methods of the present invention with those produced by the Mean Shift procedure for a few randomly chosen images in the data set.
- This figure compares the output of the proposed segmentation scheme with the results obtained using the Edison segmentation tool.
- the first row corresponds to the input image, the second to a human segmentation, the third to the mean shift result and the fourth to the randomized hash result.
- One advantage of the proposed segmentation scheme is that the computational effort required scales linearly in the number of pixels and the operations required are simple and regular.
- a real time version of the scheme was implemented on a Macbook Pro laptop computer.
- This implementation was used to segment 640 by 480 video frames at a rate of 10 frames per second using a single core of an Intel Core 2 Duo processor running at 2.33 GHz.
- This rate includes the time taken for all phases of the algorithm: image acquisition, randomized hashing, local maxima detection, and connected components processing. Since almost all of the steps in the procedure are embarrassingly parallel, the algorithm is well suited to implementation on modern multi-core processors and GPUs and should be amenable to further acceleration.
- the proposed algorithm can be highly parallelizable and can be implemented in real time on modest hardware. This is an advantage since it means that the method could be used as a cheap preprocessing step in a variety of image interpretation applications much as edge detection is used today.
- the segmentation algorithm is used instead of edge detection in image processing in a real-time environment. This can allow objects to be identified when coupled with pattern matching or relative motion to a background.
- the method can be used on a mobile robot to produce a fast, rough segmentation of the scene into sky, ground, road, and tree regions.
- This mobile robot can use image segmentation as described herein as well as a ranging mechanism to model its surrounding environment as described in concurrently filed application titled “Scene Analysis Using Image And Range Data,” by C. J. Taylor, which is incorporated herein by reference.
- the segmentation scheme could be used as part of the loop in real time tracking applications where it would allow the system to automatically delineate targets.
- the lower processing requirements of hashing could make object detection faster and cheaper for real-time tracking.
- the segmentation scheme could be performed by CPUs on the cameras to detect an object and track it.
- a suitable camera network with tracking ability is described in concurrently filed application titled “Distributed Target Tracking Using Self Localizing Smart Camera Networks,” by C. J. Taylor, which is incorporated herein by reference.
- the real time segmentation scheme can be used as a pre-processing step which would suggest possible groupings in the image to higher level interpretation algorithms.
- a system could use this segmentation scheme as a preprocessing stage to any preferred image processing technique suitable for the application.
- the system could, for instance, focus its attention on regions based on their size, shape, texture or position in the image.
- Suitable processing environments can include Intel, PowerPC, ARM, or other CPU-based systems having memory and a processor, but can also include any suitable embedded systems, DSP, GPU, APU, or other multi-core processing environment including related hardware and memory.
- the algorithms taught herein can be implemented by dedicated logic.
- execution of these algorithm and techniques is not limited to a single processor environment, and can, in some contemplated embodiments, be performed in a client server environment, a cloud computing environment, multicore environment, multithreaded environment, mobile device or devices, etc.
Abstract
Pixels in a feature space can be divided using a hashing method. Random planes are selected within a feature space and a pixel's relationship to each plane determines a bit of a hash code. Clusters of pixels can be identified by local maxima in the hash cells in the feature space. Nearby pixels in the feature space can be further assigned to these local maxima based on Hamming distance. An image can be segmented by observing adjacent pixels sharing a common hash code.
Description
- The present application claims priority to provisional patent applications 61/418,789, 61/418,805, and 61/418,799 which are incorporated by reference in their entirety.
- The present application relates to co-pending patent applications entitled "Scene Analysis Using Image and Range Data" and "Distributed Target Tracking Using Self Localizing Smart Camera Networks," both of which are incorporated by reference in their entirety and filed on the same day as the present application entitled "Image Segmentation for Distributed Target Tracking and Scene Analysis."
- The present invention relates generally to machine vision systems and methods and specifically to image segmentation to determine salient features in an image. The present invention relates generally to machine vision systems and methods and specifically to object tracking and scene analysis for distributed or mobile applications.
- Segmentation is a method of breaking an image into coherent regions and is a common problem in Computer Vision. Many known methods of segmentation are computationally intensive, making them unsuitable for fast, low power, or low cost applications, such as for use with distributed or mobile devices. However, there is a need for an algorithm that is more amenable to real-time implementation.
- To date, most of the approaches that have been developed to tackle the segmentation problem can be broadly divided into two groups. The first group consists of algorithms that view the image as a graph and use various metrics to measure the difference in appearance between neighboring pixels or regions. Once the problem has been formulated in this way, the algorithms center on the problem of dividing this graph into pieces so as to maximize coherence. The Normalized Cut algorithm developed by Shi and Malik proceeds by recasting the graph segmentation problem in terms of a spectral analysis. (See Jianbo Shi and Jitendra Malik, "Normalized cuts and image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, 22:888-905, 2000.) This approach involves computing the distance between the pixels in the image and then solving a series of large but sparse eigenvector problems.
- Felzenszwalb and Huttenlocher proposed an efficient approach to grouping pixels in an image by making use of a spanning tree and showed that locally greedy grouping decisions can yield plausible results. (Pedro F. Felzenszwalb and Daniel P. Huttenlocher, "Efficient graph-based image segmentation," Int. J. Comput. Vision, 59(2): 167-181, 2004. ISSN 0920-5691.) This approach also revolves around the computation of multiple pairwise distance values. There remains a need for avoiding the computational costs associated with distance computation.
- Another broad class of segmentation schemes are termed feature based methods because these proceed by associating a feature vector with each pixel in the image. (See Dorin Comaniciu and Peter Meer, "Mean shift: A robust approach toward feature space analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, 24:603-619, 2002; W Y Ma and B. S. Manjunath, "Texture features and learning similarity," Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, 0:425, 1996, ISSN 1063-6919. doi: http://doi.ieeecomputersociety.org/10.1109/CVPR.1996.517107; and Eduard Vazquez, Joost Weijer and Ramon Baldrich, "Image segmentation in the presence of shadows and highlights," In ECCV '08: Proceedings of the 10th European Conference on Computer Vision, pages 1-14, Berlin, Heidelberg, 2008, Springer-Verlag. ISBN 978-3-540-88692-1. doi: http://dx.doi.org/10.1007/978-3-540-88693-8_1.)
- Another clustering method is the k-means algorithm, which seeks to divide the population into k clusters using an Expectation Maximization approach. (See Morten Rufus Blas, Motilal Agrawal, Aravind Sundaresan, and Kurt Konolige, "Fast color/texture segmentation for outdoor robots," In IROS, pages 4078-4085, 2008.) K-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. One issue that needs to be addressed in applying this algorithm to segmentation problems is the question of choosing an appropriate value for k, which is typically not known beforehand. A second issue is the fact that the k-means scheme involves repeated rounds of distance computations. This means that the computational complexity grows with the number of pixels, the dimension of the feature space, and the number of clusters. K-means clustering is typically considered an NP-hard problem. While some approaches have been proposed to mitigate this problem, including the method developed by Elkan, which seeks to accelerate the process by invoking the triangle inequality, and Locality Sensitive Hashing schemes that search for near neighbors in the feature space, these have been unable to fully mitigate the distance computations required. (See Charles Elkan, "Using the triangle inequality to accelerate k-means," In International Conference on Machine Learning, 2003; Piotr Indyk and Rajeev Motwani, "Approximate nearest neighbors: towards removing the curse of dimensionality," In STOC '98: Proceedings of the thirtieth annual ACM symposium on Theory of computing, pages 604-613, New York, N.Y., USA, 1998. ACM. ISBN 0-89791-962-9. doi: http://doi.acm.org/10.1145/276698.276876.)
- There have also been attempts to apply the Mean Shift segmentation algorithm to subdivide color images into regions. (See Dorin Comaniciu and Peter Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 24:603-619, 2002.) This feature based approach proceeds by searching for modes of the distribution in the feature space using a Parzen Window based approach (sometimes referred to as the Parzen-Rosenblatt window, a well known non-parametric way of estimating the probability density function of a random variable). The method involves tracing the paths of various feature vectors as they evolve under the mean shift rule. This non-parametric estimation scheme can be very time consuming, which makes it less useful in situations where real time response is desired. The Parzen Window density estimation scheme employed in this approach also limits the dimension of the feature spaces to which it can be applied effectively. In contrast, the method proposed in this application can be applied to arbitrary feature spaces and has been implemented in real time on modest hardware.
- When labeled image data is available, algorithms that learn how to classify pixels and segment images have also been proposed. (See J. Shotton, M. Johnson, and R. Cipolla, “Semantic texton forests for image categorization and segmentation,” in Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1-8, 2008; Michael Maire, Pablo Arbelaez, Charles Fowlkes, and Jitendra Malik, “Using contours to detect and localize junctions in natural images,” in Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1-8, June 2008.) These approaches rely on leveraging the training data to associate semantic labels with pixels and segments.
- With the emergence of low-cost, ubiquitous cameras, there exists a need for image segmentation methods that can be used reliably with low-cost computation and that can handle large volumes of image data. Similarly, it is desirable for a segmentation method to operate without the need for structured training data, which may not be readily available for use with low-cost or large volume cameras and processors.
- Embodiments of the present invention address and overcome one or more of the above shortcomings and drawbacks, by providing devices, systems, and methods for segmenting images. This technology is particularly well-suited for, but by no means limited to, real-time segmentation of images for identifying salient features of an image.
- Embodiments of the present invention are directed to a method for identifying salient segments in an image comprising the steps of choosing a plurality of planes that divide a feature space into a plurality of cells, generating a first set of hash codes for a first set of pixels in an image based on a location in the feature space of a feature vector associated with each pixel in the first set whereby the location of each feature vector relative to each of the plurality of planes contributes a binary value to each hash code, determining a second set of hash codes selected from the first set of hash codes, wherein the second set of hash codes indicates local maxima of clusters of pixels, assigning each of the first set of pixels to one of the second set of hash codes based on the hamming distance between a first hash code from the first set of hash codes assigned to each pixel and each of the second set of hash codes, and identifying segments in an image based on groups of adjacent pixels sharing common hash codes. The step of choosing a plurality of planes may further comprise selecting a predetermined number of random planes in the feature space. The feature space may be a three-dimensional color space. The step of choosing a plurality of planes may further comprise training the plurality of planes based on human interpretations of images.
- Additional features and advantages of the invention will be made apparent from the following detailed description of illustrative embodiments that proceeds with reference to the accompanying drawings.
- The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
- The foregoing and other aspects of the present invention are best understood from the following detailed description when read in connection with the accompanying drawings. For the purpose of illustrating the invention, there is shown in the drawings embodiments that are presently preferred, it being understood, however, that the invention is not limited to the specific instrumentalities disclosed. Included in the drawings are the following Figures:
-
FIG. 1A is a two-dimensional view of a simplified 2D version of the randomized hashing scheme in a feature space fractured into regions by a set of randomly chosen splitting planes. Each region is associated with a hash code indicating where it falls with respect to the splitting planes; -
FIG. 1B is a four-dimensional view of the relationship between hash codes and the vertices of a hypercube in an exemplary embodiment; -
FIG. 2 is a flow chart showing operation of an exemplary algorithm for use with segmenting images in accordance with some embodiments of the random hashing scheme; -
FIG. 3 shows two histograms of the distributions of the GCE and Rand Index metrics over a sample data set as a result of the application of certain embodiments of the random hash scheme. -
FIG. 4 includes sample images along with the human segmentation and the machine segmentation. -
FIG. 5 shows a plurality of sample images, along with the human segmentation and the machine segmentation, including both prior art and an embodiment of the random hashing scheme. - The segmentation scheme described in this application employs a feature based approach. Each pixel in the image is described by a feature vector which encodes a set of properties used to describe that pixel. Embodiments of the present invention can employ a simple color descriptor vector, which is an example of a feature vector, but some embodiments also use more sophisticated feature vectors such as a histogram of color values or a vector of texture coefficients. Some embodiments can employ an approach to segmenting natural images which leverages the idea of randomized hashing. The procedure aims to replace the problem of finding clusters in the feature space with the problem of finding local maxima in a graph whose topology approximates the geometry of the underlying feature space. In so doing, the method can bypass the computational effort associated with computing distances between feature vectors, which can comprise a significant fraction of the effort in other techniques such as k-means clustering and mean shift segmentation.
- The method can be controlled by a few parameters, namely the number of random splitting planes, n, the Hamming Distance threshold, k, and the window size that is used to average the color vectors, w. By adjusting these parameters the algorithm can be made to produce over-segmentations or under-segmentations of the input imagery. Importantly, the number of segments that are produced is implicitly controlled by these parameters rather than explicitly provided as an input to the algorithm.
- These techniques can be employed in a computing environment available to a person of ordinary skill in the art, including performing the prescribed calculations on a PC, embedded processor, mobile device, cloud computing environment, client-server environment, DSP, or dedicated hardware circuit capable of performing the methods disclosed herein.
- Given a set of feature vectors (e.g. pixels having predetermined properties, where each selected property is a dimension in a feature space), the goal of the segmentation procedure is to divide them into a set of clusters which capture the most salient groupings in the distribution. An example of a feature space for use with some embodiments includes a 3D space with orthogonal axes for each color: red, green, and blue (RGB). In some embodiments, the feature space uses axes for Y, Pb, and Pr. In some embodiments, more than three dimensions can be used. For example, other dimensions can include motion vectors, depth, or other information that can be obtained related to each pixel, so that each pixel (or group of pixels) can be mapped to the feature space.
- Entries in this feature vector characterize salient properties of the region surrounding that pixel such as color (e.g. RGB, YPbPr, or multispectral properties, such as IR, UV, or information from a FLIR camera), texture, frequency content, and/or motion properties. Once all of the pixels have been mapped to the feature space, the segmentation process is treated as a clustering problem where the goal can be to identify salient clusters in the population of feature vectors.
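- As an illustrative sketch (not part of the disclosed embodiments), the mapping from pixels to feature vectors can be as simple as flattening an image so that each pixel contributes one color vector. The snippet below assumes NumPy; the function name pixels_to_features is our own, and richer descriptors (texture, motion) would simply append additional columns.

```python
import numpy as np

def pixels_to_features(image):
    """Map an H x W x 3 color image to an (H*W) x 3 array of feature
    vectors, one per pixel. Richer descriptors would add columns."""
    h, w, c = image.shape
    return image.reshape(h * w, c).astype(np.float64)

# A tiny 2 x 2 RGB image yields four 3-dimensional feature vectors.
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
feats = pixels_to_features(img)
```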
-
FIG. 1A depicts a simplified 2D version of the randomized hashing scheme in a feature space fractured into regions by a set of randomly chosen splitting planes. Each region is associated with a hash code indicating where it falls with respect to the splitting planes. The set of all hash codes can be associated with the vertices of a hypercube as shown in FIG. 1B, where the shading of the nodes indicates how many feature vectors are hashed to that code. The segmentation scheme proceeds by identifying local maxima in this hash code space. - This scheme employs a series of randomly chosen splitting planes.
FIGS. 1A and 1B show a simplified view of this procedure in two dimensions. Here the random splitting planes 0, 1, 2, and 3 are used to hash the feature vectors into a set of disjoint cells based on their location. - One can hash a set of vectors into a set of discrete bins in order to accelerate the search for nearest neighbors. One can further leverage the fact that this randomized hashing procedure tends to preserve locality, so points that are near to each other in the feature space are hashed to the same bin with high probability. The proposed segmentation scheme leverages these phenomena to cluster the feature vectors into groups. In
FIG. 1A , the n=4 splitting planes fracture the feature space into a set of 2^n disjoint convex cells, each of which corresponds to an n-bit hash code. It should be understood that each splitting plane can also be considered a normal vector and a decision point (such as a mean) in the feature space. By projecting each pixel onto that normal vector, and determining its position relative to the decision point, a bit associated with each normal vector can be assigned. - More specifically, each sample vector (e.g. pixel or object) in the feature space vj is assigned an n-bit hash code where the ith bit in the code, bij, is derived from the ith splitting plane as follows: bij=(vj·ui)>si, where ui denotes the normal associated with the ith splitting plane and si denotes the corresponding splitting value. Neighboring cells in the feature space differ by a single bit, so the Hamming distance between the codes provides some indication of the distance between vectors in the feature space. More generally we can construct a correspondence between the set of all possible hash codes and the vertices of an n-dimensional hypercube. The topology of the
hypercube 100 reflects the structure of the feature space since neighboring cells in feature space will correspond to neighboring vertices in the hypercube. In this example, the shaded nodes 1001, 0000, and 0111 are those bins that have local maxima clusters in FIG. 1A . - For each of the hash codes the clustering procedure can record how many feature vectors are mapped to that code. In some embodiments, clusters in feature space will induce population maxima in the code space. That is, if we consider the hypercube as a graph we would expect to observe that some of the hash codes have a greater population than their neighbors. This allows us to replace the original problem of clustering vectors in the feature space with the simpler problem of looking for population maxima in the code space graph.
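- The bit rule bij=(vj·ui)>si can be sketched in a few lines of NumPy. This is an illustrative implementation rather than the patented code; the names hash_features, normals, and splits are ours, and the splitting values here are placed at the centroid of the projections, one of the schemes described in the text.

```python
import numpy as np

def hash_features(features, normals, splits):
    """Hash each feature vector to an n-bit integer code: bit i is 1
    when the projection onto normal u_i exceeds the splitting value
    s_i. `features` is N x m, `normals` is n x m, `splits` length n."""
    proj = features @ normals.T                  # N x n projections v_j . u_i
    bits = (proj > splits).astype(np.uint32)     # one bit per splitting plane
    # Pack the n bits of each row into a single integer hash code.
    weights = (1 << np.arange(bits.shape[1])).astype(np.uint32)
    return bits @ weights

rng = np.random.default_rng(0)
features = rng.random((100, 3))            # e.g. 100 color vectors in [0, 1)
normals = rng.standard_normal((4, 3))      # n = 4 random splitting planes
splits = (features @ normals.T).mean(axis=0)   # planes through the centroid
codes = hash_features(features, normals, splits)
```

Each code is an integer in [0, 2^n), so nearby feature vectors tend to share codes or differ in only a few bits.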
- For every populated code in the hypercube the algorithm interrogates all of the codes that differ from the current code by k bits or fewer. This parameter, k, is referred to as the Hamming Distance Threshold. If the code under consideration has a population greater than all of its neighbors it is declared a local maximum and a cluster center. In this way the number of clusters recovered by the procedure is determined automatically based on the data as opposed to being imposed a priori as in k-means. Note that this scheme can be used to distinguish up to 2^(n-k) local maxima.
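- A minimal sketch of this local maxima search, assuming Python/NumPy and integer hash codes; local_maxima and its argument names are illustrative. A code is declared a cluster center when its population strictly exceeds that of every code within Hamming distance k.

```python
import numpy as np
from itertools import combinations

def local_maxima(codes, n, k):
    """Return hash codes whose population exceeds that of every code
    within Hamming distance k. `codes` holds integer n-bit codes."""
    pop = np.bincount(codes, minlength=1 << n)
    # Precompute all bit-flip masks with 1..k bits set.
    masks = [sum(1 << b for b in bits)
             for r in range(1, k + 1)
             for bits in combinations(range(n), r)]
    maxima = []
    for code in np.flatnonzero(pop):
        if all(pop[code] > pop[code ^ m] for m in masks):
            maxima.append(int(code))
    return maxima

# Toy example: two clusters hashed to codes 0 and 5 (n = 3, k = 1).
codes = np.array([0, 0, 0, 1, 5, 5, 5, 5, 4])
centers = local_maxima(codes, n=3, k=1)
```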
- The normals associated with the splitting planes, ui, can be chosen randomly or based on a priori knowledge. The splitting values, si, can be chosen by considering the distribution of the projected values, (vj·ui). In some embodiments, the mean of the distribution, which corresponds to casting all of the splitting planes through the centroid of the distribution, is used. In other embodiments, the median value, or the value midway between the maximum and minimum projected values, is used as the splitting value (i.e. the decision point for assigning bit values to pixels projected on the normal vector). These schemes tend to produce similar results in practice.
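- The three splitting-value schemes mentioned above (mean, median, and the midpoint of the projected range) can be sketched as follows; splitting_value is an illustrative helper of our own naming.

```python
import numpy as np

def splitting_value(projections, scheme="mean"):
    """Pick the splitting value s_i for one plane from the projected
    values (v_j . u_i), using one of the schemes in the text."""
    if scheme == "mean":
        return projections.mean()
    if scheme == "median":
        return np.median(projections)
    if scheme == "midrange":
        return 0.5 * (projections.max() + projections.min())
    raise ValueError(scheme)

proj = np.array([0.0, 1.0, 2.0, 7.0])   # projections of four vectors
```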
- After the local maxima have been identified, each of the feature vectors is labeled with the hash code of the closest local maximum based on the Hamming Distance. In the case where a feature vector is equidistant from two or more local maxima based on Hamming Distance, the Euclidean distance between the feature vector and the mean cluster vector is used to break the tie and decide the label. Once each of the pixels has been labeled with the index of its local maximum, a connected components procedure is run to divide the image into coherent connected regions.
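- The labeling rule with its Euclidean tie-break can be sketched as below, assuming NumPy; assign_labels and its arguments are illustrative names rather than the disclosed implementation.

```python
import numpy as np

def assign_labels(codes, features, maxima, cluster_means):
    """Label each feature vector with the index of the nearest local
    maximum by Hamming distance; Euclidean distance to the cluster
    mean breaks ties."""
    labels = np.empty(len(codes), dtype=np.int64)
    for j, (code, feat) in enumerate(zip(codes, features)):
        hd = [bin(int(code) ^ m).count("1") for m in maxima]
        best = min(hd)
        tied = [i for i, d in enumerate(hd) if d == best]
        if len(tied) == 1:
            labels[j] = tied[0]
        else:  # equidistant in code space: fall back on Euclidean distance
            dists = [np.linalg.norm(feat - cluster_means[i]) for i in tied]
            labels[j] = tied[int(np.argmin(dists))]
    return labels

# Two maxima at codes 0b00 and 0b11; code 0b10 and 0b01 are ties.
maxima = [0b00, 0b11]
cluster_means = np.array([[0.0], [1.0]])
codes = np.array([0b00, 0b10, 0b01])
features = np.array([[0.0], [0.9], [0.2]])
labels = assign_labels(codes, features, maxima, cluster_means)
```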
- The entire scheme is outlined below in pseudo-code. This algorithm is elaborated in
FIG. 2 . - Algorithm 1 Segmentation via Randomized Hashing
- 1: Hash each feature vector to an n-bit code using the n randomly chosen splitting planes
- 2: Maintain a count of the number of feature vectors mapped to each hash code
- 3: Identify local maxima in the code space—these are the cluster centers
- 4: Assign each feature vector to the closest local maximum
- 5: Run connected components on the labeled pixels to identify coherent connected regions.
-
FIG. 2 shows method 300 for segmenting an image. At step 302 a feature space is divided by a predetermined number, n, of splitting planes. This step can include randomly assigning feature planes and using a calibration image or a first image to position each splitting plane for a given orientation. For example, the position of each plane can be chosen such that feature vectors (e.g. those feature vectors associated with each pixel in a test image) within the feature space are evenly divided on either side of the splitting plane. As discussed above, each splitting plane can be created by choosing a random normal vector and assigning a decision point along that vector such that the decision point is at the mean or median of the distribution of feature vectors (e.g. pixels) projected on that normal vector. This can be done using a calibration image, the image under test, or the previous image under test. In some embodiments, step 302 takes into account predetermined splitting planes that have been created via prior training or have been manually assigned. At step 302, the splitting planes can include a combination of random splitting planes and preassigned splitting planes. - At
step 304, each feature vector in the image (or a subset of the feature vectors in the image) is hashed using the splitting planes. This process can be computationally simple, as each bit in the hash simply indicates on which side of the splitting plane the feature vector resides. Because only a bit is used for this hash, the additional computational overhead needed for deriving the Euclidean distance from the splitting plane is not necessary. Step 304 can be an iterative loop whereby each feature vector is taken and compared in succession to each of the n splitting planes. It will be appreciated that massive parallelism may be possible using the right processor, such as a DSP or graphics processor, to perform this step. - Once each feature vector is hashed into the cells created by the splitting planes, the algorithm proceeds to step 306. At
step 306, the number of feature vectors resident in each cell in the feature space is counted. At step 308, the population counts for each cell in the feature space are compared to choose a number of local maxima. In some embodiments the number of local maxima, M, is predetermined prior to image processing. In other embodiments the number of local maxima, M, is derived dynamically from the image based on the results of the residency counts of each cell in the feature space. As discussed, the maximum number of local maxima, M, can be determined based on the number of splitting planes used and the Hamming distance requirements for clusters used for image segmentation. Once the local maxima are identified, these can be used as the center of each cluster used for image segmentation. - At step 310 each feature vector under test can be assigned to one of the cluster centers determined in step 308. In some embodiments, this assignment has low computational overhead because the hash of each feature vector is compared to each of the cluster centers and the cluster having the nearest Hamming distance to the hash of the vector is selected as the cluster to which the feature vector will be assigned. In some embodiments, ties between competing clusters (i.e. clusters that are the same Hamming distance away from the hashed feature vector) can be resolved by estimating the Euclidean distance between the center of each cluster and the current feature vector. It will be appreciated that other techniques for resolving conflicts between equidistant clusters can be used, including assigning each feature vector to the equidistant cluster that has the least number of feature vectors currently assigned, or the most.
- Once each of the feature vectors is assigned to a cluster, image segments can be identified within the image at step 312. For example, adjacent (i.e. connected) pixels in the image plane that have had their feature vectors assigned to the same cluster can be considered part of the same image segment. Any known technique can be used, such as minimum threshold sizes for segments of adjacent pixels. In this way, image segments can be rapidly deduced from the image by scanning pixels in the image plane and determining whether they share the same cluster in the feature space.
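- One simple way to realize the connected components step is a flood fill over 4-connected pixels that share a cluster label. The sketch below is a plain Python/NumPy illustration, not the disclosed implementation; connected_components is our name for it.

```python
import numpy as np
from collections import deque

def connected_components(label_image):
    """Group 4-connected pixels that share the same cluster label into
    image segments. Returns an integer segment id per pixel."""
    h, w = label_image.shape
    seg = -np.ones((h, w), dtype=np.int64)
    next_id = 0
    for sy in range(h):
        for sx in range(w):
            if seg[sy, sx] >= 0:
                continue
            # Flood fill from this seed over same-label neighbors.
            q = deque([(sy, sx)])
            seg[sy, sx] = next_id
            while q:
                y, x = q.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and seg[ny, nx] < 0
                            and label_image[ny, nx] == label_image[y, x]):
                        seg[ny, nx] = next_id
                        q.append((ny, nx))
            next_id += 1
    return seg

# Two label-0 regions separated by a label-1 region give three segments.
labels = np.array([[0, 0, 1],
                   [1, 1, 1],
                   [0, 1, 1]])
seg = connected_components(labels)
```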
- In some embodiments,
algorithm 300 repeats after step 312, with the next captured image. In some embodiments, step 314 may optionally adjust the splitting planes based on the results in step 312. For example, in a training scheme the results of step 312 can be compared via a consistency score with a desired segmentation result. Based on this result, the splitting planes can be adjusted to improve this consistency score. In some embodiments, incremental improvement of splitting planes is done in real time based on other criteria, such as reducing the number of identified segments in step 312 or increasing the number of identified segments in step 312 to reach a desired range of segments. Once the splitting planes are adjusted the algorithm can return to step 302. - The proposed scheme can be similar in spirit to the Mean Shift segmentation algorithm which also seeks to identify modes in the distribution of feature vectors. Where the mean shift scheme uses a Parzen Window based scheme to estimate density in feature space, the proposed scheme uses hashing, which can be randomized or tailored to apriori information, to identify salient groupings of feature vectors.
- Like Locality Sensitive Hashing, the segmentation scheme can make implicit use of the Johnson-Lindenstrauss theorem, which justifies the use of random projection by bounding the distortion of the relative distances between the feature vectors induced by the projection process. Additionally or alternatively, the hashing scheme may use planes that are predetermined, selected from a bounded group of planes, or estimated or calculated based on some selected criteria. For example, if a particular color is known to be important, the hashing scheme can take this into account by using at least some predetermined feature planes to create the hash, or by introducing a bias into the selection process, which might otherwise be random. The hashing scheme may also include an adaptive selection algorithm to learn how to select more appropriate feature planes in future video frames. This learning process can be by any scheme known to a person of ordinary skill in the art, including using human feedback to help train parameters, genetic algorithms, decision trees, pruning, beam searching, or the like.
- From a computational perspective the principal effort revolves around computing the hash codes which involves O(nmN) operations where n denotes the number of projection directions, m denotes the dimension of the feature space and N denotes the total number of pixels or feature vectors. Note that the scheme can avoid the explicit distance computations between the feature vectors that one uses in most agglomerative and k-means segmentation schemes in favor of randomized hashing.
- In searching for the local maxima in the code space one can simply store the hash code populations in an array with 2^n entries. For each populated hash code the procedure involves interrogating on the order of C(n,1)+ . . . +C(n,k) neighboring codes, where C(n,i) denotes the binomial coefficient. For example, to run the local maxima detection algorithm on n=12 dimensions with a Hamming distance threshold of k=2, one can construct a table with 2^12=4096 entries, and each hash code would have C(12,1)+C(12,2)=12+66=78 neighbors. Typically many of the hash bins are empty, which further simplifies processing. For larger values of n one could employ a binary tree data structure to store and query the contents of the hash table efficiently.
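- The neighbor count for a given n and k is a sum of binomial coefficients, which can be checked directly with a short helper (num_neighbors is an illustrative name):

```python
from math import comb

def num_neighbors(n, k):
    """Number of n-bit codes within Hamming distance 1..k of a code:
    C(n,1) + C(n,2) + ... + C(n,k)."""
    return sum(comb(n, i) for i in range(1, k + 1))

# n = 12, k = 2 gives 12 + 66 = 78 neighbors per code, and the
# population table for n = 12 has 2**12 = 4096 entries.
neighbors = num_neighbors(12, 2)
table_size = 2 ** 12
```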
- In some embodiments, the proposed segmentation scheme can be carried out using training. For example, the Berkeley Segmentation Database, which contains 1633 manual segmentations of 300 color images, can be used for this purpose. By comparing the results of random hash segmentation of one or more of the images to one or more of the manual segmentation results, the selected splitting planes can be improved. For example, training can be used to assist in selecting the appropriate number of splitting planes or in selecting specific splitting planes to include. The manual segmentations provided by the users can be compared with the segmentations produced by the algorithm using two different measures, the Global Consistency Error and the Rand Index. The Global Consistency Error (GCE) developed by Martin, Fowlkes, Tal, and Malik can capture the difference between two segmentations in a single number between 0 and 1, where lower numbers indicate lower error. (See David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik, “A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics,” in Proc. 8th Int'l Conf. Computer Vision, pages 416-423, 2001.) The measure is designed such that if one segmentation is a refinement of the other the score will be zero. This is a useful feature since it accounts for the fact that human subjects often choose to segment scenes to various levels; however, it also implies that machine segmentations that are strongly over or under segmented can yield very low GCE scores, which can be misleading.
- To provide a different but related perspective on the algorithm one can record and report the Rand Index for each segmentation. This measure is commonly used in statistics to measure the quality of clustering algorithms. (See Ranjith Unnikrishnan, Caroline Pantofaru, and Martial Hebert, “Toward objective evaluation of image segmentation algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6):929-944, 2007, ISSN 0162-8828, doi: http://doi.ieeecomputersociety.org/10.1109/TPAMI.2007.1046.) In order to compute the Rand Index one can consider every pair of pixels in the image and determine whether they are labeled consistently in the human and machine segmentations. That is, if the two pixels have the same label in the human segmentation they should have the same label in the machine segmentation, and vice versa. The Rand Index represents the fraction of the pixel pairs that are labeled consistently in the two segmentations; values that are closer to 1 indicate better segmentations. Unlike the GCE, the Rand Index will suffer if the machine segmentation is over or under segmented with respect to the human segmentation.
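- The pairwise consistency computation described above can be sketched naively (O(N^2) in the number of pixels, so suitable only for small examples); rand_index is an illustrative name.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of pixel pairs labeled consistently in two
    segmentations: same segment in both, or different in both."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)

# Identical groupings score 1.0 even if the label names differ;
# a disagreement on one pixel lowers the score.
perfect = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
partial = rand_index([0, 0, 1], [0, 1, 1])
```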
- A series of segmentation experiments can be carried out using feature spaces based on color information. In some embodiments, the color values can be averaged over a square window of width w centered around each pixel. Effectively, this averaging is a preprocessing step that can aid in smoothing noise out of the image. Increasing the size of this window can increase the level of smoothing and leads to a coarser segmentation. It will be appreciated that other filtering schemes can be used to smooth or preprocess the image before performing image segmentation.
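- The w x w window averaging can be sketched as a simple box filter; window_average is an illustrative, unoptimized helper (a practical implementation would use integral images or a separable filter), with edge pixels averaged over their available neighbors.

```python
import numpy as np

def window_average(channel, w):
    """Average one color channel over a w x w window centered on each
    pixel; a simple smoothing preprocessing step."""
    h, wd = channel.shape
    r = w // 2
    out = np.empty_like(channel, dtype=np.float64)
    for y in range(h):
        for x in range(wd):
            patch = channel[max(0, y - r):y + r + 1,
                            max(0, x - r):x + r + 1]
            out[y, x] = patch.mean()
    return out

# Smoothing a constant channel changes nothing; an isolated spike is
# spread over its window.
flat = window_average(np.full((4, 4), 5.0), 3)
spike = np.zeros((3, 3)); spike[1, 1] = 9.0
smoothed = window_average(spike, 3)
```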
- A first set of experiments was designed to determine how the performance of the segmentation scheme varied as the color space was varied. Experiments were carried out using the standard RGB values, the HSV color space, the LAB color space, and a color vector that concatenated the RGB and HSV values into a six dimensional color vector. These experiments were carried out using a randomly chosen subset of 150 of the segmentations in the database, and the average GCE and Rand Index values are reported. In all of these experiments the value of n was fixed at 12, the value of k was fixed at 1, and the value of w was fixed at 3. Table 1 summarizes the results of these experiments and indicates that the RGBHSV color space offers the best performance with respect to the two metrics.
-
TABLE 1
Results of running the segmentation procedure using various color spaces

Color     GCE      Rand
RGB       0.2805   0.7327
HSV       0.2421   0.7527
LAB       0.2578   0.7351
RGBHSV    0.2014   0.7614

- A second set of experiments explored how the performance of the scheme varied as we varied the number of splitting planes, n, and the Hamming Distance Threshold used to find local maxima, k. The experiments were carried out using the HSV color space on the same subset of 150 segmentations from the database. The mean GCE and Rand Index values were recorded for every combination of parameters and the results are summarized in Table 2. In practice, increasing values of the n parameter provide more ways to distinguish between feature vectors and can lead to over segmentation, while increasing the k parameter decreases the number of local maxima detected in the code space and leads to under segmentation.
-
TABLE 2
Results of running the segmentation procedure using various values for the n and k parameters

n     k     GCE      Rand Index
8     1     0.1952   0.7632
8     2     0.2511   0.7443
8     3     0.2450   0.7104
12    1     0.1652   0.7535
12    2     0.2250   0.7640
12    3     0.2482   0.7417
16    1     0.1005   0.7438
16    2     0.1670   0.7583
16    3     0.2236   0.7527

- A third set of experiments was carried out to investigate how the performance of the scheme varied as the window size parameter, w, was varied. The experiments were carried out using the HSV color space with the n and k parameters fixed at 12 and 1 respectively. Table 3 contains the results of these trials. Increasing the value of w increases the level of smoothing, which typically leads to under segmentation.
-
TABLE 3
Results of running the segmentation procedure using various values for the size of the smoothing window, w, in pixels

w     GCE      Rand Index
3     0.1219   0.7479
5     0.1350   0.7494
7     0.1477   0.7513
11    0.1660   0.7520
21    0.1992   0.7537

- A fourth experiment was run to compare the results of the automated segmentation procedure to each of the 1633 human segmentations in the database. For this experiment the HSV color space was employed, the number of splitting planes, n, was 12, the Hamming Distance threshold, k, was 2, and the window size, w, was 3. These parameter values were chosen to produce a visually pleasing over segmentation of the images rather than to optimize the GCE or Rand Index values. Over the entire database the mean GCE value was 0.2235 and the median GCE value was 0.2157; the mean Rand Index value was 0.7370 and the median Rand Index was 0.7833.
-
FIG. 3 shows histograms of the distributions of the GCE and Rand Index metrics over the entire data set. The graph on the left of FIG. 3 indicates the distribution of the GCE values over all of the segmentations in the database while the graph on the right denotes the distribution of the Rand Index values. -
FIG. 4 shows a few of the images from the data set along with the human segmentation and the machine segmentation. This figure compares the output of the automated segmentation procedure to human labeled segmentations. The first and fourth rows contain the input imagery, the second and fifth rows contain human segmentations while the third and sixth rows contain machine segmentations. -
FIG. 5 provides a direct comparison of segmentations produced by the methods of the present invention with those produced by the Mean Shift procedure for a few randomly chosen images in the data set. This figure compares the output of the proposed segmentation scheme with the results obtained using the Edison segmentation tool. The first row corresponds to the input image, the second to a human segmentation, the third to the mean shift result and the fourth to the randomized hash result. The parameters used for the Edison tool were (hs,hr,M)=(7,6.5,20) and the parameters used for the randomized method were (n,k,w)=(12,2,3). - One advantage of the proposed segmentation scheme is that the computational effort required scales linearly in the number of pixels and the operations required are simple and regular. In order to demonstrate this a real time version of the scheme was implemented on a Macbook Pro laptop computer. This implementation was used to segment 640 by 480 video frames at a rate of 10 frames per second using a single core of an
Intel Core 2 Duo processor running at 2.33 GHz. This rate includes the time taken for all phases of the algorithm: image acquisition, randomized hashing, local maxima detection, and connected components processing. Since almost all of the steps in the procedure are embarrassingly parallel, the algorithm is well suited to implementation on modern multi-core processors and GPUs and should be amenable to further acceleration. - The proposed algorithm can be highly parallelizable and can be implemented in real time on modest hardware. This is an advantage since it means that the method could be used as a cheap preprocessing step in a variety of image interpretation applications, much as edge detection is used today. In some embodiments, the segmentation algorithm is used instead of edge detection in image processing in a real time environment. This can allow objects to be identified when coupled with pattern matching or relative motion to a background.
- The method can be used on a mobile robot to produce a fast, rough segmentation of the scene into sky, ground, road and tree regions. This mobile robot can use image segmentation as described herein, as well as a ranging mechanism, to model its surrounding environment as described in the concurrently filed application titled "Scene Analysis Using Image And Range Data," by C. J. Taylor, which is incorporated herein by reference.
- Similarly, the segmentation scheme could be used as part of the loop in real time tracking applications where it would allow the system to automatically delineate targets. The lower processing requirements of hashing could make object detection faster and cheaper for real-time tracking. For example, if used in a distributed camera network, the segmentation scheme could be performed by CPUs on the cameras to detect an object and track it. A suitable camera network with tracking ability is described in concurrently filed application titled “Distributed Target Tracking Using Self Localizing Smart Camera Networks,” by C. J. Taylor, which is incorporated herein by reference.
- In any of these applications, the real time segmentation scheme can be used as a pre-processing step that suggests possible groupings in the image to higher level interpretation algorithms. In this way a person of ordinary skill in the art could use this segmentation scheme as a preprocessing stage to any preferred image processing technique suitable for the application. The system could, for instance, focus its attention on regions based on their size, shape, texture or position in the image.
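- As one illustration of such attention-by-region filtering, the segment label image can be summarized into per-region size and position statistics that a higher level algorithm could filter on. The helper below is a hypothetical sketch and assumes segments are supplied as an integer label image.

```python
import numpy as np

def region_stats(segments):
    """Per-segment pixel count and centroid from an integer label image,
    as a basis for filtering regions by size or position."""
    ys, xs = np.indices(segments.shape)
    stats = {}
    for label in np.unique(segments):
        mask = segments == label
        stats[int(label)] = {
            "size": int(mask.sum()),                       # area in pixels
            "centroid": (float(ys[mask].mean()),           # (row, col)
                         float(xs[mask].mean())),
        }
    return stats
```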
- While the segmentation algorithm has been discussed and implemented in the context of color descriptors, it could equally easily be applied to feature spaces with higher dimension and can involve any combination of features, including both color and texture values and/or motion vectors.
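- The hashing step itself is unchanged when the per-pixel descriptor grows: only the dimension of the random planes changes. The following hypothetical sketch stacks color with a crude texture cue (gradient magnitude) into a 4-dimensional feature vector; the choice of gradient magnitude as the texture channel is an assumption for illustration.

```python
import numpy as np

def pixel_features(image):
    """Stack RGB color with gradient magnitude (a crude texture cue)
    into a 4-D feature vector per pixel."""
    img = image.astype(float)
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)
    texture = np.sqrt(gx ** 2 + gy ** 2)
    return np.dstack([img, texture[..., None]]).reshape(-1, 4)

def hash_features(feats, n_planes=12, seed=0):
    """Same randomized-plane hashing as in the color-only case;
    the planes simply live in the higher-dimensional feature space."""
    rng = np.random.default_rng(seed)
    d = feats.shape[1]
    normals = rng.normal(size=(n_planes, d))
    anchors = feats[rng.integers(len(feats), size=n_planes)]
    biases = np.einsum('ij,ij->i', normals, anchors)
    bits = (feats @ normals.T) > biases
    return bits.astype(np.int64) @ (1 << np.arange(n_planes))
```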
- It should be readily apparent that the image processing techniques taught herein are suitable for execution in a computing environment that includes at least one processor. Suitable processing environments can include Intel, PowerPC, ARM, or other CPU-based systems having memory and a processor, but can also include any suitable embedded system, DSP, GPU, APU, or other multi-core processing environment including related hardware and memory. Similarly, the algorithms taught herein can be implemented by dedicated logic. Execution of these algorithms and techniques is not limited to a single processor environment, and can, in some contemplated embodiments, be performed in a client-server environment, a cloud computing environment, a multicore environment, a multithreaded environment, a mobile device or devices, etc.
- Although the invention has been described with reference to exemplary embodiments, it is not limited thereto. Those skilled in the art will appreciate that numerous changes and modifications may be made to the preferred embodiments of the invention and that such changes and modifications may be made without departing from the true spirit of the invention. It is therefore intended that the appended claims be construed to cover all such equivalent variations as fall within the true spirit and scope of the invention.
Claims (4)
1. A method for identifying salient segments in an image comprising the steps of:
choosing a plurality of planes that divide a feature space into a plurality of cells;
generating a first set of hash codes for a first set of pixels in an image based on a location in the feature space of a feature vector associated with each pixel in the first set, whereby the location of each feature vector relative to each of the plurality of planes contributes a binary value to each hash code;
determining a second set of hash codes selected from the first set of hash codes, wherein the second set of hash codes indicates local maxima of clusters of pixels;
assigning each of the first set of pixels to one of the second set of hash codes based on the Hamming distance between a first hash code from the first set of hash codes assigned to each pixel and each of the second set of hash codes; and
identifying segments in an image based on groups of adjacent pixels sharing common hash codes.
2. The method of claim 1, wherein the step of choosing a plurality of planes further comprises selecting a predetermined number of random planes in the feature space.
3. The method of claim 1, wherein the feature space is a three-dimensional color space.
4. The method of claim 1, wherein the step of choosing a plurality of planes further comprises training the plurality of planes based on human interpretations of images.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/309,551 US20120250984A1 (en) | 2010-12-01 | 2011-12-01 | Image segmentation for distributed target tracking and scene analysis |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US41880510P | 2010-12-01 | 2010-12-01 | |
US41879910P | 2010-12-01 | 2010-12-01 | |
US41878910P | 2010-12-01 | 2010-12-01 | |
US13/309,551 US20120250984A1 (en) | 2010-12-01 | 2011-12-01 | Image segmentation for distributed target tracking and scene analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
US20120250984A1 true US20120250984A1 (en) | 2012-10-04 |
Family
ID=46926741
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/309,543 Active 2033-09-09 US9449233B2 (en) | 2010-12-01 | 2011-12-01 | Distributed target tracking using self localizing smart camera networks |
US13/309,558 Active US8867793B2 (en) | 2010-12-01 | 2011-12-01 | Scene analysis using image and range data |
US13/309,551 Abandoned US20120250984A1 (en) | 2010-12-01 | 2011-12-01 | Image segmentation for distributed target tracking and scene analysis |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/309,543 Active 2033-09-09 US9449233B2 (en) | 2010-12-01 | 2011-12-01 | Distributed target tracking using self localizing smart camera networks |
US13/309,558 Active US8867793B2 (en) | 2010-12-01 | 2011-12-01 | Scene analysis using image and range data |
Country Status (1)
Country | Link |
---|---|
US (3) | US9449233B2 (en) |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120170801A1 (en) * | 2010-12-30 | 2012-07-05 | De Oliveira Luciano Reboucas | System for Food Recognition Method Using Portable Devices Having Digital Cameras |
US8495425B2 (en) * | 2011-03-01 | 2013-07-23 | International Business Machines Corporation | System and method to efficiently identify bad components in a multi-node system utilizing multiple node topologies |
US20130197859A1 (en) * | 2012-01-30 | 2013-08-01 | International Business Machines Corporation | Tracking Entities by Means of Hash Values |
US20140040262A1 (en) * | 2012-08-03 | 2014-02-06 | Adobe Systems Incorporated | Techniques for cloud-based similarity searches |
US8867793B2 (en) | 2010-12-01 | 2014-10-21 | The Trustees Of The University Of Pennsylvania | Scene analysis using image and range data |
CN104574440A (en) * | 2014-12-30 | 2015-04-29 | 安科智慧城市技术(中国)有限公司 | Video movement target tracking method and device |
CN104637052A (en) * | 2015-01-22 | 2015-05-20 | 西南交通大学 | Object tracking method based on target guide significance detection |
CN105989611A (en) * | 2015-02-05 | 2016-10-05 | 南京理工大学 | Blocking perception Hash tracking method with shadow removing |
US20180293742A1 (en) * | 2014-09-19 | 2018-10-11 | Brain Corporation | Apparatus and methods for saliency detection based on color occurrence analysis |
US20190068940A1 (en) * | 2017-08-31 | 2019-02-28 | Disney Enterprises Inc. | Large-Scale Environmental Mapping In Real-Time By A Robotic System |
CN109598726A (en) * | 2018-10-26 | 2019-04-09 | 哈尔滨理工大学 | A kind of adapting to image target area dividing method based on SLIC |
US10298970B2 (en) * | 2014-12-12 | 2019-05-21 | Huawei Technologies Co., Ltd. | Image transmission method and apparatus |
CN109844807A (en) * | 2016-08-19 | 2019-06-04 | 讯宝科技有限责任公司 | For the mthods, systems and devices of size to be split and determined to object |
US20190180086A1 (en) * | 2017-06-30 | 2019-06-13 | Beijing Didi Infinity Technology And Development Co. Ltd. | Systems and methods for verifying authenticity of id photo |
CN111680176A (en) * | 2020-04-20 | 2020-09-18 | 武汉大学 | Remote sensing image retrieval method and system based on attention and bidirectional feature fusion |
WO2021086721A1 (en) * | 2019-10-31 | 2021-05-06 | Siemens Healthcare Diagnostics Inc. | Methods and apparatus for hashing and retrieval of training images used in hiln determinations of specimens in automated diagnostic analysis systems |
US11184604B2 (en) * | 2016-04-04 | 2021-11-23 | Compound Eye, Inc. | Passive stereo depth sensing |
US11270467B2 (en) | 2020-01-21 | 2022-03-08 | Compound Eye, Inc. | System and method for camera calibration |
CN115578694A (en) * | 2022-11-18 | 2023-01-06 | 合肥英特灵达信息技术有限公司 | Video analysis computing power scheduling method, system, electronic equipment and storage medium |
US11651581B2 (en) | 2019-11-27 | 2023-05-16 | Compound Eye, Inc. | System and method for correspondence map determination |
US11935249B2 (en) | 2020-01-21 | 2024-03-19 | Compound Eye, Inc. | System and method for egomotion estimation |
Families Citing this family (140)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9526156B2 (en) * | 2010-05-18 | 2016-12-20 | Disney Enterprises, Inc. | System and method for theatrical followspot control interface |
EP4290856A3 (en) | 2010-09-13 | 2024-03-06 | Contour IP Holding, LLC | Portable digital video camera configured for remote image acquisition control and viewing |
US8451344B1 (en) * | 2011-03-24 | 2013-05-28 | Amazon Technologies, Inc. | Electronic devices with side viewing capability |
US9117281B2 (en) * | 2011-11-02 | 2015-08-25 | Microsoft Corporation | Surface segmentation from RGB and depth images |
US9072929B1 (en) * | 2011-12-01 | 2015-07-07 | Nebraska Global Investment Company, LLC | Image capture system |
US9183631B2 (en) * | 2012-06-29 | 2015-11-10 | Mitsubishi Electric Research Laboratories, Inc. | Method for registering points and planes of 3D data in multiple coordinate systems |
US8995903B2 (en) | 2012-07-25 | 2015-03-31 | Gopro, Inc. | Credential transfer management camera network |
US9036016B2 (en) | 2012-07-25 | 2015-05-19 | Gopro, Inc. | Initial camera mode management system |
US9025014B2 (en) | 2012-07-25 | 2015-05-05 | Gopro, Inc. | Device detection camera system |
US8994800B2 (en) | 2012-07-25 | 2015-03-31 | Gopro, Inc. | Credential transfer management camera system |
US9189888B1 (en) * | 2013-01-14 | 2015-11-17 | Bentley Systems, Incorporated | Point cloud modeling based on user-provided seed |
DE102013002554A1 (en) * | 2013-02-15 | 2014-08-21 | Jungheinrich Aktiengesellschaft | Method for detecting objects in a warehouse and / or for spatial orientation in a warehouse |
TW201442511A (en) * | 2013-04-17 | 2014-11-01 | Aver Information Inc | Tracking shooting system and method |
CN103235825B (en) * | 2013-05-08 | 2016-05-25 | 重庆大学 | A kind of magnanimity face recognition search engine design method based on Hadoop cloud computing framework |
US9614898B1 (en) * | 2013-05-27 | 2017-04-04 | Surround.IO | Distributed event engine |
DE102013209940A1 (en) * | 2013-05-28 | 2014-12-04 | Conti Temic Microelectronic Gmbh | Camera system for vehicles |
WO2014208337A1 (en) * | 2013-06-28 | 2014-12-31 | シャープ株式会社 | Location detection device |
JP5438861B1 (en) * | 2013-07-11 | 2014-03-12 | パナソニック株式会社 | Tracking support device, tracking support system, and tracking support method |
US10038740B2 (en) | 2013-08-19 | 2018-07-31 | Nant Holdings Ip, Llc | Camera-to-camera interactions, systems and methods |
JP2015126474A (en) * | 2013-12-27 | 2015-07-06 | ソニー株式会社 | Information processing apparatus, imaging apparatus, information processing method, information processing program, and imaging system |
US10009099B2 (en) * | 2014-03-29 | 2018-06-26 | Intel Corporation | Techniques for communication with body-carried devices |
US9607207B1 (en) * | 2014-03-31 | 2017-03-28 | Amazon Technologies, Inc. | Plane-fitting edge detection |
CN104038729A (en) * | 2014-05-05 | 2014-09-10 | 重庆大学 | Cascade-type multi-camera relay tracing method and system |
US9844360B2 (en) | 2014-10-27 | 2017-12-19 | Clear Guide Medical, Inc. | System and devices for image targeting |
US9423669B2 (en) | 2014-11-04 | 2016-08-23 | Qualcomm Incorporated | Method and apparatus for camera autofocus based on Wi-Fi ranging technique |
US9600892B2 (en) * | 2014-11-06 | 2017-03-21 | Symbol Technologies, Llc | Non-parametric method of and system for estimating dimensions of objects of arbitrary shape |
US9396554B2 (en) | 2014-12-05 | 2016-07-19 | Symbol Technologies, Llc | Apparatus for and method of estimating dimensions of an object associated with a code in automatic response to reading the code |
CN104463899B (en) * | 2014-12-31 | 2017-09-22 | 北京格灵深瞳信息技术有限公司 | A kind of destination object detection, monitoring method and its device |
WO2016140680A1 (en) * | 2015-03-05 | 2016-09-09 | Hewlett Packard Enterprise Development Lp | Multi-level object re-identification |
CN106709899B (en) | 2015-07-15 | 2020-06-02 | 华为终端有限公司 | Method, device and equipment for calculating relative positions of two cameras |
US9928605B2 (en) * | 2015-09-25 | 2018-03-27 | Intel Corporation | Real-time cascaded object recognition |
US9953430B1 (en) * | 2015-10-29 | 2018-04-24 | Indoor Reality Inc. | Methods for detecting luminary fixtures |
US10352689B2 (en) | 2016-01-28 | 2019-07-16 | Symbol Technologies, Llc | Methods and systems for high precision locationing with depth values |
US10145955B2 (en) | 2016-02-04 | 2018-12-04 | Symbol Technologies, Llc | Methods and systems for processing point-cloud data with a line scanner |
US20170230637A1 (en) * | 2016-02-07 | 2017-08-10 | Google Inc. | Multiple camera computing system having camera-to-camera communications link |
US10721451B2 (en) | 2016-03-23 | 2020-07-21 | Symbol Technologies, Llc | Arrangement for, and method of, loading freight into a shipping container |
CN105844669B (en) * | 2016-03-28 | 2018-11-13 | 华中科技大学 | A kind of video object method for real time tracking based on local Hash feature |
US11356334B2 (en) * | 2016-04-15 | 2022-06-07 | Nec Corporation | Communication efficient sparse-reduce in distributed machine learning |
US9805240B1 (en) | 2016-04-18 | 2017-10-31 | Symbol Technologies, Llc | Barcode scanning and dimensioning |
US10497014B2 (en) * | 2016-04-22 | 2019-12-03 | Inreality Limited | Retail store digital shelf for recommending products utilizing facial recognition in a peer to peer network |
US9946256B1 (en) | 2016-06-10 | 2018-04-17 | Gopro, Inc. | Wireless communication device for communicating with an unmanned aerial vehicle |
US9998907B2 (en) * | 2016-07-25 | 2018-06-12 | Kiana Analytics Inc. | Method and apparatus for uniquely identifying wireless devices |
CN106295563B (en) * | 2016-08-09 | 2019-06-07 | 武汉中观自动化科技有限公司 | A kind of system and method that airbound target flying quality is assessed based on multi-vision visual |
CN106295594B (en) * | 2016-08-17 | 2019-10-15 | 北京大学 | A kind of across camera method for tracking target and device based on dynamic route tree |
GB2553108B (en) * | 2016-08-22 | 2020-07-15 | Canon Kk | Method, processing device and system for managing copies of media samples in a system comprising a plurality of interconnected network cameras |
US10044972B1 (en) | 2016-09-30 | 2018-08-07 | Gopro, Inc. | Systems and methods for automatically transferring audiovisual content |
US10397415B1 (en) | 2016-09-30 | 2019-08-27 | Gopro, Inc. | Systems and methods for automatically transferring audiovisual content |
DE102016120386A1 (en) * | 2016-10-26 | 2018-04-26 | Jungheinrich Aktiengesellschaft | Method for detecting objects in a warehouse and industrial truck with a device for detecting objects in a warehouse |
JP7256746B2 (en) * | 2016-10-31 | 2023-04-12 | ヴィザル・テクノロジー・ソシエテ・ア・レスポンサビリテ・リミテ | Apparatus and method for detecting optically modulated signals in a video stream |
US11042161B2 (en) | 2016-11-16 | 2021-06-22 | Symbol Technologies, Llc | Navigation control method and apparatus in a mobile automation system |
US10451405B2 (en) | 2016-11-22 | 2019-10-22 | Symbol Technologies, Llc | Dimensioning system for, and method of, dimensioning freight in motion along an unconstrained path in a venue |
CN108089152B (en) * | 2016-11-23 | 2020-07-03 | 杭州海康威视数字技术股份有限公司 | Equipment control method, device and system |
US10354411B2 (en) | 2016-12-20 | 2019-07-16 | Symbol Technologies, Llc | Methods, systems and apparatus for segmenting objects |
KR102629934B1 (en) * | 2016-12-22 | 2024-01-26 | 에스케이플래닛 주식회사 | Imaging apparatus, and control method thereof |
US10839203B1 (en) * | 2016-12-27 | 2020-11-17 | Amazon Technologies, Inc. | Recognizing and tracking poses using digital imagery captured from multiple fields of view |
US11665308B2 (en) | 2017-01-31 | 2023-05-30 | Tetavi, Ltd. | System and method for rendering free viewpoint video for sport applications |
GB2560177A (en) | 2017-03-01 | 2018-09-05 | Thirdeye Labs Ltd | Training a computational neural network |
GB2560387B (en) | 2017-03-10 | 2022-03-09 | Standard Cognition Corp | Action identification using neural networks |
US10699421B1 (en) | 2017-03-29 | 2020-06-30 | Amazon Technologies, Inc. | Tracking objects in three-dimensional space using calibrated visual cameras and depth cameras |
US20180285438A1 (en) * | 2017-03-31 | 2018-10-04 | Change Healthcase Holdings, Llc | Database system and method for identifying a subset of related reports |
CN107038753B (en) * | 2017-04-14 | 2020-06-05 | 中国科学院深圳先进技术研究院 | Stereoscopic vision three-dimensional reconstruction system and method |
US10726273B2 (en) | 2017-05-01 | 2020-07-28 | Symbol Technologies, Llc | Method and apparatus for shelf feature and object placement detection from shelf images |
US10591918B2 (en) | 2017-05-01 | 2020-03-17 | Symbol Technologies, Llc | Fixed segmented lattice planning for a mobile automation apparatus |
US10663590B2 (en) | 2017-05-01 | 2020-05-26 | Symbol Technologies, Llc | Device and method for merging lidar data |
US11093896B2 (en) | 2017-05-01 | 2021-08-17 | Symbol Technologies, Llc | Product status detection system |
US11367092B2 (en) | 2017-05-01 | 2022-06-21 | Symbol Technologies, Llc | Method and apparatus for extracting and processing price text from an image set |
US10949798B2 (en) | 2017-05-01 | 2021-03-16 | Symbol Technologies, Llc | Multimodal localization and mapping for a mobile automation apparatus |
US11449059B2 (en) | 2017-05-01 | 2022-09-20 | Symbol Technologies, Llc | Obstacle detection for a mobile automation apparatus |
WO2018201423A1 (en) | 2017-05-05 | 2018-11-08 | Symbol Technologies, Llc | Method and apparatus for detecting and interpreting price label text |
CN107358200B (en) * | 2017-07-13 | 2020-09-18 | 常州大学 | Multi-camera non-overlapping vision field pedestrian matching method based on sparse learning |
JP6928499B2 (en) * | 2017-07-21 | 2021-09-01 | 株式会社タダノ | Guide information display device and work equipment |
US10055853B1 (en) | 2017-08-07 | 2018-08-21 | Standard Cognition, Corp | Subject identification and tracking using image recognition |
US11200692B2 (en) | 2017-08-07 | 2021-12-14 | Standard Cognition, Corp | Systems and methods to check-in shoppers in a cashier-less store |
US11250376B2 (en) | 2017-08-07 | 2022-02-15 | Standard Cognition, Corp | Product correlation analysis using deep learning |
US10474991B2 (en) | 2017-08-07 | 2019-11-12 | Standard Cognition, Corp. | Deep learning-based store realograms |
US10650545B2 (en) | 2017-08-07 | 2020-05-12 | Standard Cognition, Corp. | Systems and methods to check-in shoppers in a cashier-less store |
US10853965B2 (en) | 2017-08-07 | 2020-12-01 | Standard Cognition, Corp | Directional impression analysis using deep learning |
US11232687B2 (en) | 2017-08-07 | 2022-01-25 | Standard Cognition, Corp | Deep learning-based shopper statuses in a cashier-less store |
US10133933B1 (en) | 2017-08-07 | 2018-11-20 | Standard Cognition, Corp | Item put and take detection using image recognition |
US11023850B2 (en) | 2017-08-07 | 2021-06-01 | Standard Cognition, Corp. | Realtime inventory location management using deep learning |
US10445694B2 (en) | 2017-08-07 | 2019-10-15 | Standard Cognition, Corp. | Realtime inventory tracking using deep learning |
US10474988B2 (en) | 2017-08-07 | 2019-11-12 | Standard Cognition, Corp. | Predicting inventory events using foreground/background processing |
US10127438B1 (en) | 2017-08-07 | 2018-11-13 | Standard Cognition, Corp | Predicting inventory events using semantic diffing |
US10572763B2 (en) | 2017-09-07 | 2020-02-25 | Symbol Technologies, Llc | Method and apparatus for support surface edge detection |
US10521914B2 (en) | 2017-09-07 | 2019-12-31 | Symbol Technologies, Llc | Multi-sensor object recognition system and method |
US11232294B1 (en) | 2017-09-27 | 2022-01-25 | Amazon Technologies, Inc. | Generating tracklets from digital imagery |
US10110994B1 (en) | 2017-11-21 | 2018-10-23 | Nokia Technologies Oy | Method and apparatus for providing voice communication with spatial audio |
US11030442B1 (en) | 2017-12-13 | 2021-06-08 | Amazon Technologies, Inc. | Associating events with actors based on digital imagery |
US11284041B1 (en) | 2017-12-13 | 2022-03-22 | Amazon Technologies, Inc. | Associating items with actors based on digital imagery |
JP2019121069A (en) * | 2017-12-28 | 2019-07-22 | キヤノン株式会社 | Image processing device, image processing method, and program |
US10706505B2 (en) * | 2018-01-24 | 2020-07-07 | GM Global Technology Operations LLC | Method and system for generating a range image using sparse depth data |
US10809078B2 (en) | 2018-04-05 | 2020-10-20 | Symbol Technologies, Llc | Method, system and apparatus for dynamic path generation |
US10823572B2 (en) | 2018-04-05 | 2020-11-03 | Symbol Technologies, Llc | Method, system and apparatus for generating navigational data |
US11327504B2 (en) | 2018-04-05 | 2022-05-10 | Symbol Technologies, Llc | Method, system and apparatus for mobile automation apparatus localization |
US10832436B2 (en) | 2018-04-05 | 2020-11-10 | Symbol Technologies, Llc | Method, system and apparatus for recovering label positions |
US10740911B2 (en) | 2018-04-05 | 2020-08-11 | Symbol Technologies, Llc | Method, system and apparatus for correcting translucency artifacts in data representing a support structure |
US11126863B2 (en) | 2018-06-08 | 2021-09-21 | Southwest Airlines Co. | Detection system |
CN109035295B (en) * | 2018-06-25 | 2021-01-12 | 广州杰赛科技股份有限公司 | Multi-target tracking method, device, computer equipment and storage medium |
US11482045B1 (en) | 2018-06-28 | 2022-10-25 | Amazon Technologies, Inc. | Associating events with actors using digital imagery and machine learning |
US11468698B1 (en) | 2018-06-28 | 2022-10-11 | Amazon Technologies, Inc. | Associating events with actors using digital imagery and machine learning |
US11468681B1 (en) | 2018-06-28 | 2022-10-11 | Amazon Technologies, Inc. | Associating events with actors using digital imagery and machine learning |
US11366865B1 (en) * | 2018-09-05 | 2022-06-21 | Amazon Technologies, Inc. | Distributed querying of computing hubs |
US11010920B2 (en) | 2018-10-05 | 2021-05-18 | Zebra Technologies Corporation | Method, system and apparatus for object detection in point clouds |
US11506483B2 (en) | 2018-10-05 | 2022-11-22 | Zebra Technologies Corporation | Method, system and apparatus for support structure depth determination |
US11090811B2 (en) | 2018-11-13 | 2021-08-17 | Zebra Technologies Corporation | Method and apparatus for labeling of support structures |
US11003188B2 (en) | 2018-11-13 | 2021-05-11 | Zebra Technologies Corporation | Method, system and apparatus for obstacle handling in navigational path generation |
US11416000B2 (en) | 2018-12-07 | 2022-08-16 | Zebra Technologies Corporation | Method and apparatus for navigational ray tracing |
US11079240B2 (en) | 2018-12-07 | 2021-08-03 | Zebra Technologies Corporation | Method, system and apparatus for adaptive particle filter localization |
US11100303B2 (en) | 2018-12-10 | 2021-08-24 | Zebra Technologies Corporation | Method, system and apparatus for auxiliary label detection and association |
US11015938B2 (en) | 2018-12-12 | 2021-05-25 | Zebra Technologies Corporation | Method, system and apparatus for navigational assistance |
US10731970B2 (en) | 2018-12-13 | 2020-08-04 | Zebra Technologies Corporation | Method, system and apparatus for support structure detection |
CA3028708A1 (en) | 2018-12-28 | 2020-06-28 | Zih Corp. | Method, system and apparatus for dynamic loop closure in mapping trajectories |
WO2020161646A2 (en) * | 2019-02-05 | 2020-08-13 | Rey Focusing Ltd. | Focus tracking system |
US11232575B2 (en) | 2019-04-18 | 2022-01-25 | Standard Cognition, Corp | Systems and methods for deep learning-based subject persistence |
US11402846B2 (en) | 2019-06-03 | 2022-08-02 | Zebra Technologies Corporation | Method, system and apparatus for mitigating data capture light leakage |
US11662739B2 (en) | 2019-06-03 | 2023-05-30 | Zebra Technologies Corporation | Method, system and apparatus for adaptive ceiling-based localization |
US11960286B2 (en) | 2019-06-03 | 2024-04-16 | Zebra Technologies Corporation | Method, system and apparatus for dynamic task sequencing |
US11200677B2 (en) | 2019-06-03 | 2021-12-14 | Zebra Technologies Corporation | Method, system and apparatus for shelf edge detection |
US11341663B2 (en) | 2019-06-03 | 2022-05-24 | Zebra Technologies Corporation | Method, system and apparatus for detecting support structure obstructions |
US11080566B2 (en) | 2019-06-03 | 2021-08-03 | Zebra Technologies Corporation | Method, system and apparatus for gap detection in support structures with peg regions |
US11151743B2 (en) | 2019-06-03 | 2021-10-19 | Zebra Technologies Corporation | Method, system and apparatus for end of aisle detection |
US11222460B2 (en) * | 2019-07-22 | 2022-01-11 | Scale AI, Inc. | Visualization techniques for data labeling |
CN112788227B (en) * | 2019-11-07 | 2022-06-14 | 富泰华工业(深圳)有限公司 | Target tracking shooting method, target tracking shooting device, computer device and storage medium |
US11507103B2 (en) | 2019-12-04 | 2022-11-22 | Zebra Technologies Corporation | Method, system and apparatus for localization-based historical obstacle handling |
US11107238B2 (en) | 2019-12-13 | 2021-08-31 | Zebra Technologies Corporation | Method, system and apparatus for detecting item facings |
US11822333B2 (en) | 2020-03-30 | 2023-11-21 | Zebra Technologies Corporation | Method, system and apparatus for data capture illumination control |
US11398094B1 (en) | 2020-04-06 | 2022-07-26 | Amazon Technologies, Inc. | Locally and globally locating actors by digital cameras and machine learning |
US11443516B1 (en) | 2020-04-06 | 2022-09-13 | Amazon Technologies, Inc. | Locally and globally locating actors by digital cameras and machine learning |
US11303853B2 (en) | 2020-06-26 | 2022-04-12 | Standard Cognition, Corp. | Systems and methods for automated design of camera placement and cameras arrangements for autonomous checkout |
US11620900B2 (en) * | 2020-06-26 | 2023-04-04 | Intel Corporation | Object tracking technology based on cognitive representation of a location in space |
US11361468B2 (en) | 2020-06-26 | 2022-06-14 | Standard Cognition, Corp. | Systems and methods for automated recalibration of sensors for autonomous checkout |
US11450024B2 (en) | 2020-07-17 | 2022-09-20 | Zebra Technologies Corporation | Mixed depth object detection |
US20220101532A1 (en) * | 2020-09-29 | 2022-03-31 | Samsung Electronics Co., Ltd. | Method and device for performing plane detection |
US11593915B2 (en) | 2020-10-21 | 2023-02-28 | Zebra Technologies Corporation | Parallax-tolerant panoramic image generation |
US11392891B2 (en) | 2020-11-03 | 2022-07-19 | Zebra Technologies Corporation | Item placement detection and optimization in material handling systems |
US11847832B2 (en) | 2020-11-11 | 2023-12-19 | Zebra Technologies Corporation | Object classification for autonomous navigation systems |
CN112907528B (en) | 2021-02-09 | 2021-11-09 | 南京航空航天大学 | Point cloud-to-image-based composite material laying wire surface defect detection and identification method |
KR20240005790A (en) * | 2021-04-30 | 2024-01-12 | 나이앤틱, 인크. | Repeatability prediction of points of interest |
US11954882B2 (en) | 2021-06-17 | 2024-04-09 | Zebra Technologies Corporation | Feature-based georegistration for mobile computing devices |
CN116027269B (en) * | 2023-03-29 | 2023-06-06 | 成都量芯集成科技有限公司 | Plane scene positioning method |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020196976A1 (en) * | 2001-04-24 | 2002-12-26 | Mihcak M. Kivanc | Robust recognizer of perceptually similar content |
US20110235908A1 (en) * | 2010-03-23 | 2011-09-29 | Microsoft Corporation | Partition min-hash for partial-duplicate image determination |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5436672A (en) * | 1994-05-27 | 1995-07-25 | Symah Vision | Video processing system for modifying a zone in successive images |
EP0971242A1 (en) * | 1998-07-10 | 2000-01-12 | Cambridge Consultants Limited | Sensor signal processing |
US6567116B1 (en) * | 1998-11-20 | 2003-05-20 | James A. Aman | Multiple object tracking system |
US6441734B1 (en) * | 2000-12-12 | 2002-08-27 | Koninklijke Philips Electronics N.V. | Intruder detection through trajectory analysis in monitoring and surveillance systems |
US6847728B2 (en) * | 2002-12-09 | 2005-01-25 | Sarnoff Corporation | Dynamic depth recovery from multiple synchronized video streams |
JP2004198211A (en) * | 2002-12-18 | 2004-07-15 | Aisin Seiki Co Ltd | Apparatus for monitoring vicinity of mobile object |
US7007888B2 (en) * | 2003-11-25 | 2006-03-07 | The Boeing Company | Inertial position target measuring systems and methods |
US7421113B2 (en) | 2005-03-30 | 2008-09-02 | The Trustees Of The University Of Pennsylvania | System and method for localizing imaging devices |
US7489804B2 (en) * | 2005-09-26 | 2009-02-10 | Cognisign Llc | Apparatus and method for trajectory-based identification of digital data content |
US8325979B2 (en) * | 2006-10-30 | 2012-12-04 | Tomtom Global Content B.V. | Method and apparatus for detecting objects from terrestrial based mobile mapping data |
WO2008149925A1 (en) * | 2007-06-08 | 2008-12-11 | Nikon Corporation | Imaging device, image display device, and program |
US8116527B2 (en) * | 2009-10-07 | 2012-02-14 | The United States Of America As Represented By The Secretary Of The Army | Using video-based imagery for automated detection, tracking, and counting of moving objects, in particular those objects having image characteristics similar to background |
US9449233B2 (en) | 2010-12-01 | 2016-09-20 | The Trustees Of The University Of Pennsylvania | Distributed target tracking using self localizing smart camera networks |
- 2011
- 2011-12-01 US US13/309,543 patent/US9449233B2/en active Active
- 2011-12-01 US US13/309,558 patent/US8867793B2/en active Active
- 2011-12-01 US US13/309,551 patent/US20120250984A1/en not_active Abandoned
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020196976A1 (en) * | 2001-04-24 | 2002-12-26 | Mihcak M. Kivanc | Robust recognizer of perceptually similar content |
US20110235908A1 (en) * | 2010-03-23 | 2011-09-29 | Microsoft Corporation | Partition min-hash for partial-duplicate image determination |
Non-Patent Citations (1)
Title |
---|
Camillo J. Taylor and Anthony Cowley. Fast Segmentation via Randomized Hashing. In A. Cavallaro, S. Prince and D. Alexander, editors, Proceedings of the British Machine Vision Conference, pages 60.1-60.11. BMVA Press, September 2009. doi:10.5244/C.23.60 * |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8867793B2 (en) | 2010-12-01 | 2014-10-21 | The Trustees Of The University Of Pennsylvania | Scene analysis using image and range data |
US9449233B2 (en) | 2010-12-01 | 2016-09-20 | The Trustees Of The University Of Pennsylvania | Distributed target tracking using self localizing smart camera networks |
US8625889B2 (en) * | 2010-12-30 | 2014-01-07 | Samsung Electronics Co., Ltd. | System for food recognition method using portable devices having digital cameras |
US20120170801A1 (en) * | 2010-12-30 | 2012-07-05 | De Oliveira Luciano Reboucas | System for Food Recognition Method Using Portable Devices Having Digital Cameras |
US8495425B2 (en) * | 2011-03-01 | 2013-07-23 | International Business Machines Corporation | System and method to efficiently identify bad components in a multi-node system utilizing multiple node topologies |
US20130197859A1 (en) * | 2012-01-30 | 2013-08-01 | International Business Machines Corporation | Tracking Entities by Means of Hash Values |
US9600443B2 (en) * | 2012-01-30 | 2017-03-21 | International Business Machines Corporation | Tracking entities by means of hash values |
US10042818B2 (en) | 2012-01-30 | 2018-08-07 | International Business Machines Corporation | Tracking entities by means of hash values |
US20140040262A1 (en) * | 2012-08-03 | 2014-02-06 | Adobe Systems Incorporated | Techniques for cloud-based similarity searches |
US9165068B2 (en) * | 2012-08-03 | 2015-10-20 | Adobe Systems Incorporated | Techniques for cloud-based similarity searches |
US10810456B2 (en) * | 2014-09-19 | 2020-10-20 | Brain Corporation | Apparatus and methods for saliency detection based on color occurrence analysis |
US20180293742A1 (en) * | 2014-09-19 | 2018-10-11 | Brain Corporation | Apparatus and methods for saliency detection based on color occurrence analysis |
US10298970B2 (en) * | 2014-12-12 | 2019-05-21 | Huawei Technologies Co., Ltd. | Image transmission method and apparatus |
CN104574440A (en) * | 2014-12-30 | 2015-04-29 | 安科智慧城市技术(中国)有限公司 | Video moving-target tracking method and device |
CN104637052A (en) * | 2015-01-22 | 2015-05-20 | 西南交通大学 | Object tracking method based on target-guided saliency detection |
CN105989611B (en) * | 2015-02-05 | 2019-01-18 | 南京理工大学 | Block-based perceptual hash tracking method with shadow removal |
CN105989611A (en) * | 2015-02-05 | 2016-10-05 | 南京理工大学 | Block-based perceptual hash tracking method with shadow removal |
US11184604B2 (en) * | 2016-04-04 | 2021-11-23 | Compound Eye, Inc. | Passive stereo depth sensing |
CN109844807A (en) * | 2016-08-19 | 2019-06-04 | 讯宝科技有限责任公司 | Methods, systems, and apparatus for segmenting and dimensioning objects |
US20190180086A1 (en) * | 2017-06-30 | 2019-06-13 | Beijing Didi Infinity Technology And Development Co. Ltd. | Systems and methods for verifying authenticity of id photo |
US11003895B2 (en) * | 2017-06-30 | 2021-05-11 | Beijing Didi Infinity Technology And Development Co., Ltd. | Systems and methods for verifying authenticity of ID photo |
US10484659B2 (en) * | 2017-08-31 | 2019-11-19 | Disney Enterprises, Inc. | Large-scale environmental mapping in real-time by a robotic system |
US20190068940A1 (en) * | 2017-08-31 | 2019-02-28 | Disney Enterprises Inc. | Large-Scale Environmental Mapping In Real-Time By A Robotic System |
CN109598726A (en) * | 2018-10-26 | 2019-04-09 | 哈尔滨理工大学 | Adaptive image target-region segmentation method based on SLIC |
WO2021086721A1 (en) * | 2019-10-31 | 2021-05-06 | Siemens Healthcare Diagnostics Inc. | Methods and apparatus for hashing and retrieval of training images used in hiln determinations of specimens in automated diagnostic analysis systems |
US11651581B2 (en) | 2019-11-27 | 2023-05-16 | Compound Eye, Inc. | System and method for correspondence map determination |
US11869218B2 (en) | 2020-01-21 | 2024-01-09 | Compound Eye, Inc. | System and method for camera calibration |
US11270467B2 (en) | 2020-01-21 | 2022-03-08 | Compound Eye, Inc. | System and method for camera calibration |
US11935249B2 (en) | 2020-01-21 | 2024-03-19 | Compound Eye, Inc. | System and method for egomotion estimation |
CN111680176A (en) * | 2020-04-20 | 2020-09-18 | 武汉大学 | Remote sensing image retrieval method and system based on attention and bidirectional feature fusion |
CN115578694A (en) * | 2022-11-18 | 2023-01-06 | 合肥英特灵达信息技术有限公司 | Video analysis computing power scheduling method, system, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
US9449233B2 (en) | 2016-09-20 |
US8867793B2 (en) | 2014-10-21 |
US20120249802A1 (en) | 2012-10-04 |
US20120250978A1 (en) | 2012-10-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20120250984A1 (en) | Image segmentation for distributed target tracking and scene analysis | |
Lim et al. | Real-time image-based 6-dof localization in large-scale environments | |
Hartmann et al. | Recent developments in large-scale tie-point matching | |
Lai et al. | RGB-D object recognition: Features, algorithms, and a large scale benchmark | |
WO2016119117A1 (en) | Localization and mapping method | |
JP5261501B2 (en) | Permanent visual scene and object recognition | |
KR20130122662A (en) | Method and system for comparing images | |
Shahbazi et al. | Application of locality sensitive hashing to realtime loop closure detection | |
US10943098B2 (en) | Automated and unsupervised curation of image datasets | |
US9014486B2 (en) | Systems and methods for tracking with discrete texture traces | |
An et al. | Optimal colour‐based mean shift algorithm for tracking objects | |
Qin et al. | Loop closure detection in SLAM by combining visual CNN features and submaps | |
Li et al. | Salient object detection based on meanshift filtering and fusion of colour information | |
Carvalho et al. | Analysis of object description methods in a video object tracking environment | |
Gad et al. | Crowd density estimation using multiple features categories and multiple regression models | |
Gu et al. | Automatic searching of fish from underwater images via shape matching | |
Tal et al. | An accurate method for line detection and manhattan frame estimation | |
Lowry et al. | Logos: Local geometric support for high-outlier spatial verification | |
Essmaeel et al. | A new 3D descriptor for human classification: Application for human detection in a multi-kinect system | |
Arnfred et al. | Mirror match: Reliable feature point matching without geometric constraints | |
Sliti et al. | Efficient visual tracking via sparse representation and back-projection histogram | |
Taylor et al. | Fast Segmentation via Randomized Hashing. | |
Zhang et al. | A New Inlier Identification Scheme for Robust Estimation Problems. | |
Thinh et al. | Depth-aware salient object segmentation | |
Xu et al. | Label transfer for joint recognition and segmentation of 3D object |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: THE TRUSTEES OF THE UNIVERSITY OF PENNSYLVANIA, PE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TAYLOR, CAMILLO JOSE;REEL/FRAME:029887/0863 Effective date: 20120525 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |