US20120250984A1 - Image segmentation for distributed target tracking and scene analysis - Google Patents

Image segmentation for distributed target tracking and scene analysis

Info

Publication number
US20120250984A1
Authority
US
United States
Prior art keywords
segmentation
image
planes
feature
pixels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/309,551
Inventor
Camillo Jose Taylor
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Pennsylvania
Original Assignee
University of Pennsylvania
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Pennsylvania
Priority to US13/309,551
Publication of US20120250984A1
Assigned to THE TRUSTEES OF THE UNIVERSITY OF PENNSYLVANIA. Assignors: TAYLOR, CAMILLO JOSE
Status: Abandoned

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/11 - Region-based segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/292 - Multi-camera tracking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30241 - Trajectory
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 - Television systems
    • H04N7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources

Definitions

  • FIG. 3 shows histograms of the distributions of the GCE and Rand Index metrics over the entire data set.
  • The first graph on the left of FIG. 3 indicates the distribution of the GCE values over all of the segmentations in the database, while the graph on the right denotes the distribution of the Rand Index values.
  • FIG. 3 also shows a few of the images from the data set along with the human segmentation and the machine segmentation. This figure compares the output of the automated segmentation procedure to human labeled segmentations.
  • The first and fourth rows contain the input imagery, the second and fifth rows contain human segmentations, while the third and sixth rows contain machine segmentations.
  • FIG. 4 provides a direct comparison of segmentations produced by the methods of the present invention with those produced by the Mean Shift procedure for a few randomly chosen images in the data set.
  • This figure compares the output of the proposed segmentation scheme with the results obtained using the Edison segmentation tool.
  • The first row corresponds to the input image, the second to a human segmentation, the third to the mean shift result, and the fourth to the randomized hash result.
  • One advantage of the proposed segmentation scheme is that the computational effort required scales linearly in the number of pixels, and the operations required are simple and regular.
  • A real time version of the scheme was implemented on a MacBook Pro laptop computer.
  • This implementation was used to segment 640 by 480 video frames at a rate of 10 frames per second using a single core of an Intel Core 2 Duo processor running at 2.33 GHz.
  • This rate includes the time taken for all phases of the algorithm: image acquisition, randomized hashing, local maxima detection and connected components processing. Since almost all of the steps in the procedure are embarrassingly parallel, the algorithm is well suited to implementation on modern multi-core processors and GPUs and should be amenable to further acceleration.
  • The proposed algorithm can be highly parallelizable and can be implemented in real time on modest hardware. This is an advantage since it means that the method could be used as a cheap preprocessing step in a variety of image interpretation applications, much as edge detection is used today.
  • In some embodiments, the segmentation algorithm is used instead of edge detection in image processing in a real time environment. This can allow objects to be identified when coupled with pattern matching or relative motion to a background.
  • The method can be used on a mobile robot to produce a fast, rough segmentation of the scene into sky, ground, road and tree regions.
  • This mobile robot can use image segmentation as described herein, as well as a ranging mechanism, to model its surrounding environment as described in the concurrently filed application titled “Scene Analysis Using Image And Range Data,” by C. J. Taylor, which is incorporated herein by reference.
  • The segmentation scheme could be used as part of the loop in real time tracking applications, where it would allow the system to automatically delineate targets.
  • The lower processing requirements of hashing could make object detection faster and cheaper for real-time tracking.
  • The segmentation scheme could be performed by CPUs on the cameras to detect an object and track it.
  • A suitable camera network with tracking ability is described in the concurrently filed application titled “Distributed Target Tracking Using Self Localizing Smart Camera Networks,” by C. J. Taylor, which is incorporated herein by reference.
  • The real time segmentation scheme can be used as a pre-processing step which would suggest possible groupings in the image to higher level interpretation algorithms.
  • Such systems could use this segmentation scheme as a preprocessing stage for any preferred image processing technique suitable for the application.
  • The system could, for instance, focus its attention on regions based on their size, shape, texture or position in the image.
  • Suitable processing environments can include Intel, PowerPC, ARM, or other CPU-based systems having memory and a processor, but can also include any suitable embedded systems, DSP, GPU, APU, or other multi-core processing environment including related hardware and memory.
  • The algorithms taught herein can be implemented by dedicated logic.
  • Execution of these algorithms and techniques is not limited to a single processor environment, and can, in some contemplated embodiments, be performed in a client-server environment, a cloud computing environment, a multicore environment, a multithreaded environment, a mobile device or devices, etc.

Abstract

Pixels in a feature space can be divided using a hashing method. Random planes are selected within a feature space, and a pixel's relationship to each plane determines one bit of a hash code. Clusters of pixels can be identified by local maxima in the hash cells in the feature space. Nearby pixels in the feature space can be further assigned to these local maxima based on Hamming distance. An image can be segmented by observing adjacent pixels sharing a common hash code.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • The present application claims priority to provisional patent applications 61/418,789, 61/418,805, and 61/418,799, which are incorporated by reference in their entirety.
  • The present application relates to co-pending patent applications entitled “Scene Analysis Using Image and Range Data” and “Distributed Target Tracking Using Self Localizing Smart Camera Networks,” both of which are incorporated by reference in their entirety and filed on the same day as the present application entitled “Image Segmentation for Distributed Target Tracking and Scene Analysis.”
  • TECHNOLOGY FIELD
  • The present invention relates generally to machine vision systems and methods, and specifically to image segmentation to determine salient features in an image, and to object tracking and scene analysis for distributed or mobile applications.
  • BACKGROUND
  • Segmentation is a method of breaking an image into coherent regions and is a common problem in Computer Vision. Many known methods of segmentation are computationally intensive, making them unsuitable for fast, low power, or low cost applications, such as for use with distributed or mobile devices. Accordingly, there is a need for an algorithm that is more amenable to real-time implementation.
  • To date, most of the approaches that have been developed to tackle the segmentation problem can be broadly divided into two groups. The first group consists of algorithms that view the image as a graph and use various metrics to measure the difference in appearance between neighboring pixels or regions. Once the problem has been formulated in this way, the algorithms center on the problem of dividing this graph into pieces so as to maximize coherence. The Normalized Cut algorithm developed by Shi and Malik proceeds by recasting the graph segmentation problem in terms of a spectral analysis. (See Jianbo Shi and Jitendra Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888-905, 2000.) This approach involves computing the distance between the pixels in the image and then solving a series of large but sparse eigenvector problems.
  • Felzenszwalb and Huttenlocher proposed an efficient approach to grouping pixels in an image by making use of a spanning tree and showed that locally greedy grouping decisions can yield plausible results. (Pedro F. Felzenszwalb and Daniel P. Huttenlocher, “Efficient graph-based image segmentation,” Int. J. Comput. Vision, 59(2):167-181, 2004. ISSN 0920-5691.) This approach also revolves around the computation of multiple pairwise distance values. There remains a need for avoiding the computational costs associated with distance computation.
  • Another broad class of segmentation schemes are termed feature based methods, because these proceed by associating a feature vector with each pixel in the image. (See Dorin Comaniciu and Peter Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 24:603-619, 2002; W Y Ma and B. S. Manjunath, “Texture features and learning similarity,” Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, 0:425, 1996, ISSN 1063-6919. doi: http://doi.ieeecomputersociety.org/10.1109/CVPR.1996.517107; and Eduard Vazquez, Joost Weijer and Ramon Baldrich, “Image segmentation in the presence of shadows and highlights,” In ECCV '08: Proceedings of the 10th European Conference on Computer Vision, pages 1-14, Berlin, Heidelberg, 2008, Springer-Verlag. ISBN 978-3-540-88692-1. doi: http://dx.doi.org/10.1007/978-3-540-88693-81.)
  • Another clustering method is the k-means algorithm, which seeks to divide the population into k clusters using an Expectation Maximization approach. (See Morten Rufus Blas, Motilal Agrawal, Aravind Sundaresan, and Kurt Konolige, “Fast color/texture segmentation for outdoor robots,” In IROS, pages 4078-4085, 2008.) K-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. One issue that needs to be addressed in applying this algorithm to segmentation problems is the question of choosing an appropriate value for k, which is typically not known beforehand. A second issue is the fact that the k-means scheme involves repeated rounds of distance computations. This means that the computational complexity grows with the number of pixels, the dimension of the feature space and the number of clusters. K-means clustering is typically considered an NP-hard problem. While some approaches have been proposed to mitigate this problem, including the method developed by Elkan, which seeks to accelerate the process by invoking the triangle inequality, and Locality Sensitive Hashing schemes that search for near neighbors in the feature space, these have been unable to fully eliminate the distance computations required. (See Charles Elkan, “Using the triangle inequality to accelerate k-means,” In International Conference on Machine Learning, 2003; Piotr Indyk and Rajeev Motwani, “Approximate nearest neighbors: towards removing the curse of dimensionality,” In STOC '98: Proceedings of the thirtieth annual ACM symposium on Theory of computing, pages 604-613, New York, N.Y., USA, 1998. ACM. ISBN 0-89791-962-9. doi: http://doi.acm.org/10.1145/276698.276876.)
  • There have also been attempts to apply the Mean Shift segmentation algorithm to subdivide color images into regions. (Dorin Comaniciu and Peter Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 24:603-619, 2002.) This feature based approach proceeds by searching for modes of the distribution in the feature space using a Parzen Window based approach (sometimes referred to as the Parzen-Rosenblatt window, which provides a well known non-parametric way of estimating the probability density function of a random variable). The method involves tracing the paths of various feature vectors as they evolve under the mean shift rule. This non-parametric estimation scheme can be very time consuming, which makes it less useful in situations where real time response is desired. The Parzen Window density estimation scheme employed in this approach also limits the dimension of the feature spaces to which it can be applied effectively. In contrast, the method proposed in this application can be applied to arbitrary feature spaces and has been implemented in real time on modest hardware.
  • When labeled image data is available, algorithms that learn how to classify pixels and segment images have also been proposed. (See J. Shotton, M. Johnson, and R. Cipolla, “Semantic text on forests for image categorization and segmentation,” In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1-8, 2008; Michael Maire, Pablo Arbelaez, Charles Fowlkes, and Jitendra Malik, “Using contours to detect and localize junctions in natural images,” In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1-8, June 2008.) These approaches rely on leveraging the training data to associate semantic labels with pixels and segments.
  • With the emergence of low-cost, ubiquitous cameras, there exists a need for developing image segmentation methods that can be reliably used with low-cost computation and that can handle large volumes of image data. Similarly, it is desirable for a segmentation method to be operated without the need for structured training data, which may not be readily available for use with low-cost or large volume cameras and processors.
  • SUMMARY OF THE INVENTION
  • Embodiments of the present invention address and overcome one or more of the above shortcomings and drawbacks, by providing devices, systems, and methods for segmenting images. This technology is particularly well-suited for, but by no means limited to, real-time segmentation of images for identifying salient features of an image.
  • Embodiments of the present invention are directed to a method for identifying salient segments in an image comprising the steps of choosing a plurality of planes that divide a feature space into a plurality of cells, generating a first set of hash codes for a first set of pixels in an image based on a location in the feature space of a feature vector associated with each pixel in the first set, whereby the location of each feature vector relative to each of the plurality of planes contributes a binary value to each hash code, determining a second set of hash codes selected from the first set of hash codes, wherein the second set of hash codes indicates local maxima of clusters of pixels, assigning each of the first set of pixels to one of the second set of hash codes based on the Hamming distance between a first hash code from the first set of hash codes assigned to each pixel and each of the second set of hash codes, and identifying segments in an image based on groups of adjacent pixels sharing common hash codes. The step of choosing a plurality of planes may further comprise selecting a predetermined number of random planes in the feature space. The feature space may be a three-dimensional color space. The step of choosing a plurality of planes may further comprise training the plurality of planes based on human interpretations of images.
  • Additional features and advantages of the invention will be made apparent from the following detailed description of illustrative embodiments that proceeds with reference to the accompanying drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
  • The foregoing and other aspects of the present invention are best understood from the following detailed description when read in connection with the accompanying drawings. For the purpose of illustrating the invention, there is shown in the drawings embodiments that are presently preferred, it being understood, however, that the invention is not limited to the specific instrumentalities disclosed. Included in the drawings are the following Figures:
  • FIG. 1A is a two dimensional view of a simplified 2D version of the randomized hashing scheme in a feature space fractured into regions by a set of randomly chosen splitting planes. Each region is associated with a hash code indicating where it falls with respect to the splitting planes;
  • FIG. 1B is a four-dimensional view of the relationship between hash codes and the vertices of a hypercube in an exemplary embodiment;
  • FIG. 2 is a flow chart showing operation of an exemplary algorithm for use with segmenting images in accordance with some embodiments of the random hashing scheme;
  • FIG. 3 shows two histograms of the distributions of the GCE and Rand Index metrics over a sample data set as a result of the application of certain embodiments of the random hash scheme;
  • FIG. 3 also includes sample images along with the human segmentation and the machine segmentation; and
  • FIG. 4 shows a plurality of sample images, along with the human segmentation and the machine segmentation, including both prior art and an embodiment of the random hashing scheme.
  • DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
  • The segmentation scheme described in this application employs a feature based approach. Each pixel in the image is described by a feature vector which encodes a set of properties used to describe that pixel. Embodiments of the present invention can employ a simple color descriptor vector, which is an example of a feature vector, but some embodiments also use more sophisticated feature vectors such as a histogram of color values or a vector of texture coefficients. Some embodiments can employ an approach to segmenting natural images which leverages the idea of randomized hashing. The procedure aims to replace the problem of finding clusters in the feature space with the problem of finding local maxima in a graph whose topology approximates the geometry of the underlying feature space. In so doing the method can bypass the computational effort associated with computing distances between feature vectors which can comprise a significant fraction of the effort in other techniques such as k-means clustering and mean shift segmentation.
  • The method can be controlled by a few parameters, namely the number of random splitting planes, n, the Hamming distance threshold, k, and the window size that is used to average the color vectors, w. By adjusting these parameters the algorithm can be made to produce over-segmentations or under-segmentations of the input imagery. Importantly, the number of segments that are produced is implicitly controlled by these parameters rather than explicitly provided as an input to the algorithm.
  • These techniques can be employed in a computing environment available to a person of ordinary skill in the art, including performing the prescribed calculations on a PC, embedded processor, mobile device, cloud computing environment, client-server environment, DSP, or dedicated hardware circuit capable of performing the methods disclosed herein.
  • Given a set of feature vectors (e.g. pixels having predetermined properties, where each selected property is a dimension in a feature space), the goal of the segmentation procedure is to divide them into a set of clusters which capture the most salient groupings in the distribution. An example of a feature space for use with some embodiments includes a 3D space with orthogonal axes for each color: red, green, and blue (RGB). In some embodiments, the feature space uses axes for Y, Pb, and Pr. In some embodiments, more than three dimensions can be used. For example, other dimensions can include motion vectors, depth, or other information that can be obtained related to each pixel, so that each pixel (or group of pixels) can be mapped to the feature space.
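  • By way of illustration only (a sketch, not language from the original filing; the array shapes and function names here are assumptions), the mapping from pixels to feature vectors can be as simple as reshaping the image array, with any additional per-pixel channels stacked on as extra feature-space dimensions:

        import numpy as np

        def pixel_features(image_rgb):
            """Map each pixel of an H x W x 3 RGB image to a point in a
            3-D color feature space: an (H*W) x 3 array of feature vectors."""
            h, w, c = image_rgb.shape
            return image_rgb.reshape(h * w, c).astype(np.float64)

        def pixel_features_with_depth(image_rgb, depth):
            """Same mapping, with a per-pixel depth map appended as a
            fourth feature-space dimension."""
            h, w, _ = image_rgb.shape
            rgb = image_rgb.reshape(h * w, 3).astype(np.float64)
            return np.hstack([rgb, depth.reshape(h * w, 1).astype(np.float64)])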
  • Entries in this feature vector characterize salient properties of the region surrounding that pixel such as color (e.g. RGB, YPbPr, or multispectral properties, such as IR, UV, or information from a FLIR camera), texture, frequency content, and/or motion properties. Once all of the pixels have been mapped to the feature space, the segmentation process is treated as a clustering problem where the goal can be to identify salient clusters in the population of feature vectors.
  • FIG. 1A depicts a simplified 2D version of the randomized hashing scheme in a feature space fractured into regions by a set of randomly chosen splitting planes. Each region is associated with a hash code indicating where it falls with respect to the splitting planes. The set of all hash codes can be associated with the vertices of a hypercube as shown in FIG. 1B, where the shading of the nodes indicates how many feature vectors are hashed to that code. The segmentation scheme proceeds by identifying local maxima in this hash code space.
  • This scheme employs a series of randomly chosen splitting planes. FIGS. 1A and 1B show a simplified view of this procedure in two dimensions. Here the random splitting planes 0, 1, 2, and 3 are used to hash the feature vectors into a set of disjoint cells based on their location.
  • One can hash a set of vectors into a set of discrete bins in order to accelerate the search for nearest neighbors. One can further leverage the fact that this randomized hashing procedure tends to preserve locality, so points that are near to each other in the feature space are hashed to the same bin with high probability. The proposed segmentation scheme leverages these phenomena to cluster the feature vectors into groups. In FIG. 1A, the n=4 splitting planes fracture the feature space into a set of up to 2^n disjoint convex cells, each of which corresponds to an n-bit hash code. It should be understood that each splitting plane can also be considered a normal vector and a decision point (such as a mean) in the feature space. By projecting each pixel onto that normal vector, and determining its position relative to the decision point, a bit associated with each normal vector can be assigned.
  • More specifically, each sample vector (e.g. pixel or object) in the feature space, v_j, is assigned an n-bit hash code where the ith bit in the code, b_ij, is derived from the ith splitting plane as follows: b_ij = (v_j · u_i) > s_i, where u_i denotes the normal associated with the ith splitting plane and s_i denotes the corresponding splitting value. Neighboring cells in the feature space differ by a single bit, so the Hamming distance between the codes provides some indication of the distance between vectors in the feature space. More generally, we can construct a correspondence between the set of all possible hash codes and the vertices of an n-dimensional hypercube. The topology of the hypercube 100 reflects the structure of the feature space, since neighboring cells in feature space will correspond to neighboring vertices in the hypercube. In this example, the shaded nodes, 1001, 0000, and 0111, are those bins that have local maxima clusters in FIG. 1A.
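  • As a hypothetical sketch of this bit-assignment rule (the function and argument names are assumptions, not part of the disclosure), the full n-bit code for every feature vector can be computed with two matrix operations:

        import numpy as np

        def hash_codes(features, normals, splits):
            """features: (N, m) feature vectors v_j; normals: (n, m) plane
            normals u_i; splits: (n,) splitting values s_i.
            Bit i of code j is 1 exactly when (v_j . u_i) > s_i."""
            bits = (features @ normals.T) > splits       # (N, n) booleans
            weights = 1 << np.arange(normals.shape[0])   # 1, 2, 4, ..., 2^(n-1)
            return bits.astype(np.int64) @ weights       # one integer code per vector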
  • For each of the hash codes the clustering procedure can record how many feature vectors are mapped to that code. In some embodiments, clusters in feature space will induce population maxima in the code space. That is, if we consider the hypercube as a graph, we would expect to observe that some of the hash codes have a greater population than their neighbors. This allows us to replace the original problem of clustering vectors in the feature space with the simpler problem of looking for population maxima in the code space graph.
  • For every populated code in the hypercube the algorithm interrogates all of the codes that differ from the current code by k bits or fewer. This parameter, k, is referred to as the Hamming Distance Threshold. If the code under consideration has a population greater than all of its neighbors, it is declared a local maximum and a cluster center. In this way the number of clusters recovered by the procedure is determined automatically based on the data, as opposed to being imposed a priori as in k-means. Note that this scheme can be used to distinguish up to 2^(n-k) local maxima.
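  • A minimal sketch of this local maxima search (assuming the population counts are kept in a dictionary keyed by hash code; all names here are illustrative, not the patented implementation) might look like:

        from itertools import combinations

        def local_maxima(pop, n, k):
            """pop: dict mapping occupied n-bit hash codes to populations.
            A code is a cluster center if its population exceeds that of
            every code within Hamming distance k of it."""
            centers = []
            for code, count in pop.items():
                is_max = True
                for d in range(1, k + 1):
                    for flips in combinations(range(n), d):
                        neighbor = code
                        for b in flips:
                            neighbor ^= 1 << b       # flip d of the n bits
                        if pop.get(neighbor, 0) >= count:
                            is_max = False
                            break
                    if not is_max:
                        break
                if is_max:
                    centers.append(code)
            return centers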
  • The normals associated with the splitting planes, u_i, can be chosen randomly or based on a priori knowledge. The splitting values, s_i, can be chosen by considering the distribution of the projected values, (v_j · u_i). In some embodiments, the mean of the distribution, which corresponds to casting all of the splitting planes through the centroid of the distribution, is used. In some embodiments, the median value, or the value midway between the maximum and minimum projected values, is used as the splitting value (i.e. the decision point for assigning bit values to pixels projected on the normal vector). These schemes tend to produce similar results in practice.
  • After the local maxima have been identified, each of the feature vectors is labeled with the hash code of the closest local maximum based on the Hamming distance. In the case where a feature vector is equidistant from two or more local maxima based on Hamming distance, the Euclidean distance between the feature vector and the mean cluster vector is used to break the tie and decide the label. Once each of the pixels has been labeled with the index of its local maximum, a connected components procedure is run to divide the image into coherent connected regions.
  • The entire scheme is outlined below in pseudo-code. This algorithm is elaborated in FIG. 2.
  • Algorithm 1 Segmentation via Randomized Hashing
  • 1: Hash each feature vector to an n-bit code using the n randomly chosen splitting planes
  • 2: Maintain a count of the number of feature vectors mapped to each hash code
  • 3: Identify local maxima in the code space; these are the cluster centers
  • 4: Assign each feature vector to the closest local maximum
  • 5: Run connected components on the labeled pixels to identify coherent connected regions.
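  • Pulling the five steps together, one plausible end-to-end sketch follows (illustrative only: the parameter defaults, the SciPy connected-components call, and the omission of the Euclidean tie-break are simplifying assumptions, and it reuses the local_maxima sketch above):

        import numpy as np
        from scipy import ndimage

        def segment(image_rgb, n=12, k=1, seed=0):
            rng = np.random.default_rng(seed)
            h, w, _ = image_rgb.shape
            feats = image_rgb.reshape(-1, 3).astype(np.float64)

            # Step 1: hash each feature vector with n random splitting
            # planes cast through the mean of the projected values.
            normals = rng.normal(size=(n, 3))
            normals /= np.linalg.norm(normals, axis=1, keepdims=True)
            proj = feats @ normals.T
            codes = ((proj > proj.mean(axis=0)).astype(np.int64)
                     @ (1 << np.arange(n)))

            # Step 2: population count for each occupied hash code.
            pop = dict(zip(*np.unique(codes, return_counts=True)))

            # Step 3: local maxima in code space (see local_maxima above).
            centers = np.array(local_maxima(pop, n, k))

            # Step 4: label each pixel with the Hamming-nearest center.
            xor = codes[:, None] ^ centers[None, :]
            ham = sum((xor >> b) & 1 for b in range(n))   # popcount per pair
            labels = ham.argmin(axis=1).reshape(h, w)

            # Step 5: connected components within each cluster label.
            segments = np.zeros((h, w), dtype=np.int32)
            next_id = 0
            for c in np.unique(labels):
                comp, m = ndimage.label(labels == c)
                segments[labels == c] = comp[labels == c] + next_id
                next_id += m
            return segments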
  • FIG. 2 shows method 300 for segmenting an image. At step 302 a feature space is divided by a predetermined number, n, of splitting planes. This step can include randomly assigning feature planes and using a calibration image or a first image to position the splitting plane for a given orientation. For example, the position of each plane can be chosen such that feature vectors (e.g. those feature vectors associated with each pixel in a test image) within the feature space are evenly divided on either side of the splitting plane. As discussed above, each splitting plane can be created by choosing a random normal vector and assigning a decision point along that vector such that the decision point is at the mean or median of the distribution of feature vectors (e.g. pixels) projected on that normal vector. This can be done using a calibration image, the image under test, or the previous image under test. In some embodiments, step 302 takes into account predetermined splitting planes that have been created via prior training or have been manually assigned. At step 302, the splitting planes can include a combination of random splitting planes and preassigned splitting planes. A sketch of one such plane-selection rule follows this paragraph.
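  • By way of a hedged illustration of the data-driven choice of decision points (the rule argument and function name are invented for this sketch):

        import numpy as np

        def make_planes(features, n, rule="mean", seed=None):
            """Choose n random unit normals u_i and set each decision
            point s_i from the distribution of the projections v . u_i."""
            rng = np.random.default_rng(seed)
            normals = rng.normal(size=(n, features.shape[1]))
            normals /= np.linalg.norm(normals, axis=1, keepdims=True)
            proj = features @ normals.T
            if rule == "mean":        # planes through the centroid
                splits = proj.mean(axis=0)
            elif rule == "median":    # half the vectors on each side
                splits = np.median(proj, axis=0)
            else:                     # midway between max and min projections
                splits = (proj.max(axis=0) + proj.min(axis=0)) / 2
            return normals, splits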
  • At step 304, each feature vector in the image (or a subset of the feature vectors in the image) is hashed using the splitting planes. This process can be computationally simple, as each bit in the hash simply indicates which side of the splitting plane the feature vector resides on. Because only a bit is used for this hash, the additional computational overhead needed for deriving the Euclidean distance from the splitting plane is not necessary. Step 304 can be an iterative loop whereby each feature vector is taken and compared in succession to each of the n splitting planes. It will be appreciated that massive parallelism may be possible using the right processor, such as a DSP or graphics processor, to perform this step.
  • Once each feature vector is hashed into the cells created by the splitting planes, the algorithm proceeds to step 306. At step 306, the number of feature vectors resident in each cell in the feature space is counted. At step 308, the population counts for each cell in the feature space are compared to choose a number of local maxima. In some embodiments the number of local maxima, M, is predetermined prior to image processing. In other embodiments the number of local maxima, M, is derived dynamically from the image based on the results of the residency counts of each cell in the feature space. As discussed, the maximum number of local maxima, M, can be determined based on the number of splitting planes used and the Hamming distance requirements for the clusters used for image segmentation. Once the local maxima are identified, these can be used as the center of each cluster used for image segmentation.
  • At step 310 each feature vector under test can be assigned to one of the cluster centers determined in step 308. In some embodiments, this assignment has low computational overhead because the hash of each feature vector is compared to each of the cluster centers, and the cluster having the nearest Hamming distance to the hash of the vector is selected as the cluster to which the feature vector will be assigned. In some embodiments, ties between competing clusters (i.e. clusters that are the same Hamming distance away from the hashed feature vector) can be resolved by estimating the Euclidean distance between the center of each cluster and the current feature vector. It will be appreciated that other techniques for resolving conflicts between equidistant clusters can be used, including assigning each feature vector to the equidistant cluster that has the least number of feature vectors currently assigned, or the most number.
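  • A possible sketch of this assignment with the Euclidean tie-break (again illustrative; center_means is assumed to hold the mean feature vector of each cluster):

        import numpy as np

        def assign_with_tiebreak(codes, feats, centers, center_means):
            """Assign each feature vector to the cluster center with the
            smallest Hamming distance; break exact ties by Euclidean
            distance to the cluster's mean feature vector."""
            labels = np.empty(len(codes), dtype=np.int64)
            for j, (code, v) in enumerate(zip(codes, feats)):
                ham = [bin(int(code) ^ int(c)).count("1") for c in centers]
                best = min(ham)
                tied = [i for i, hd in enumerate(ham) if hd == best]
                if len(tied) == 1:
                    labels[j] = tied[0]
                else:
                    dists = [np.linalg.norm(v - center_means[i]) for i in tied]
                    labels[j] = tied[int(np.argmin(dists))]
            return labels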
  • Once each of the feature vectors is assigned to a cluster, image segments can be identified within the image at step 312. For example, adjacent (i.e. connected) pixels in the image plane that have had their feature vectors assigned to the same cluster can be considered part of the same image segment. Any known technique can be used, such as minimum threshold sizes for segments of adjacent pixels. In this way, image segments can be rapidly deduced from the image by scanning pixels in the image plane and determining whether they share the same cluster in the feature space.
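  • One way to realize this connected-components pass is a plain flood fill, sketched below with 4-connectivity assumed (library routines such as scipy.ndimage.label would serve equally well):

        from collections import deque

        def connected_segments(cluster_map):
            """cluster_map: 2-D list/array of per-pixel cluster ids.
            Returns per-pixel segment ids, where a segment is a
            4-connected region of pixels sharing one cluster id."""
            h, w = len(cluster_map), len(cluster_map[0])
            seg = [[-1] * w for _ in range(h)]
            next_id = 0
            for y in range(h):
                for x in range(w):
                    if seg[y][x] != -1:
                        continue
                    queue = deque([(y, x)])
                    seg[y][x] = next_id
                    while queue:
                        cy, cx = queue.popleft()
                        for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                       (cy, cx - 1), (cy, cx + 1)):
                            if (0 <= ny < h and 0 <= nx < w
                                    and seg[ny][nx] == -1
                                    and cluster_map[ny][nx] == cluster_map[cy][cx]):
                                seg[ny][nx] = next_id
                                queue.append((ny, nx))
                    next_id += 1
            return seg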
  • In some embodiments, method 300 repeats after step 312 with the next captured image. In some embodiments, step 314 may optionally adjust the splitting planes based on the results of step 312. For example, in a training scheme the results of step 312 can be compared via a consistency score with a desired segmentation result. Based on this result, the splitting planes can be adjusted to improve this consistency score. In some embodiments, incremental improvement of the splitting planes is done in real time based on other criteria, such as reducing the number of identified segments in step 312, or increasing the number of identified segments in step 312, to reach a desired range of segments. Once the splitting planes are adjusted the algorithm can return to step 302.
  • The proposed scheme can be similar in spirit to the Mean Shift segmentation algorithm, which also seeks to identify modes in the distribution of feature vectors. Where the mean shift scheme uses a Parzen Window based scheme to estimate density in feature space, the proposed scheme uses hashing, which can be randomized or tailored to a priori information, to identify salient groupings of feature vectors.
  • Like Locality Sensitive Hashing, the segmentation scheme can make implicit use of the Johnson-Lindenstrauss theorem, which justifies the use of random projection by bounding the distortion of the relative distances between the feature vectors induced by the projection process. Additionally or alternatively, the hashing scheme may use planes that are predetermined, selected from a bounded group of planes, or estimated or calculated based on some selected criteria. For example, if a particular color is known to be important, the hashing scheme can take this into account by using at least some predetermined feature planes to create the hash, or by introducing a bias into the selection process, which might otherwise be random. The hashing scheme may also include an adaptive selection algorithm to learn how to select more appropriate feature planes in future video frames. This learning process can be by any scheme known to a person of ordinary skill in the art, including using human feedback to help train parameters, genetic algorithms, decision trees, pruning, beam searching, or the like.
  • From a computational perspective the principal effort revolves around computing the hash codes, which involves O(nmN) operations, where n denotes the number of projection directions, m denotes the dimension of the feature space, and N denotes the total number of pixels or feature vectors. Note that the scheme can avoid the explicit distance computations between the feature vectors that one uses in most agglomerative and k-means segmentation schemes in favor of randomized hashing.
  • In searching for the local maxima in the code space one can simply store the hash code populations in an array with 2^n entries. For each populated hash code the procedure involves interrogating on the order of (n choose k) neighboring codes. For example, to run the local maxima detection algorithm with n=12 dimensions and a Hamming distance threshold k=2, one can construct a table with 2^12 = 4096 entries, and each hash code would have (12 choose 1) + (12 choose 2) = 12 + 66 = 78 neighbors. Typically many of the hash bins are empty, which further simplifies processing. For larger values of n one could employ a binary tree data structure to store and query the contents of the hash table efficiently.
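  • This neighbor count is just a sum of binomial coefficients, which is easy to verify (illustrative snippet):

        from math import comb

        n, k = 12, 2
        table_size = 2 ** n                                    # 4096 code-space entries
        neighbors = sum(comb(n, d) for d in range(1, k + 1))   # 12 + 66
        print(table_size, neighbors)                           # -> 4096 78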
  • In some embodiments, the proposed segmentation scheme can be carried out using training. For example, the Berkeley Segmentation Database, which contains 1633 manual segmentations of 300 color images, can be used for this purpose. By comparing the results of random hash segmentation of one or more of the images to one or more of the manual segmentation results, the selected splitting planes can be improved. For example, training can be used to assist in selecting the appropriate number of splitting planes or in selecting specific splitting planes to include. The manual segmentations provided by the users can be compared with the segmentations produced by the algorithm using two different measures, the Global Consistency Error and the Rand Index. The Global Consistency Error (GCE) developed by Martin, Fowlkes, Tal and Malik can capture the difference between two segmentations in a single number between 0 and 1, where lower numbers indicate lower error. (See David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics," In Proc. 8th Int'l Conf. Computer Vision, pages 416-423, 2001.) The measure can be designed such that if one segmentation is a refinement of the other the score will be zero. This is a useful feature since it accounts for the fact that human subjects often choose to segment scenes to various levels of detail; however, it also implies that machine segmentations that are strongly over- or under-segmented can yield very low GCE scores, which can be misleading.
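  • For reference, the GCE can be computed from the contingency table of the two label images, as in the following sketch; this follows the published definition of Martin et al., though the vectorized form shown here is an illustrative implementation choice.

      import numpy as np

      def gce(seg1, seg2):
          """Global Consistency Error between two segmentations (lower is better).

          seg1, seg2 : integer label images of the same shape.
          The score is 0 whenever one segmentation is a refinement of the other.
          """
          s1, s2 = seg1.ravel(), seg2.ravel()
          n = s1.size
          # Contingency counts: pixels with label a in seg1 and label b in seg2.
          _, inv1 = np.unique(s1, return_inverse=True)
          _, inv2 = np.unique(s2, return_inverse=True)
          table = np.zeros((inv1.max() + 1, inv2.max() + 1))
          np.add.at(table, (inv1, inv2), 1)
          size1 = table.sum(axis=1, keepdims=True)  # segment sizes in seg1
          size2 = table.sum(axis=0, keepdims=True)  # segment sizes in seg2
          e12 = (table * (size1 - table) / size1).sum()  # refinement error of seg1 vs seg2
          e21 = (table * (size2 - table) / size2).sum()  # refinement error of seg2 vs seg1
          return min(e12, e21) / n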
  • To provide a different but related perspective on the algorithm one can record and report the Rand Index for each segmentation. This measure is commonly used in statistics to measure the quality of clustering algorithms. (See Ranjith Unnikrishnan, Caroline Pantofaru and Martial Hebert, "Toward objective evaluation of image segmentation algorithms," IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6):929-944, 2007.) In order to compute the Rand Index one can consider every pair of pixels in the image and determine whether they are labeled consistently in the human and machine segmentations. That is, if two pixels have the same label in the human segmentation they should have the same label in the machine segmentation, and vice versa. The Rand Index represents the fraction of the pixel pairs that are labeled consistently in the two segmentations; values closer to 1 indicate better segmentations. Unlike the GCE, the Rand Index will suffer if the machine segmentation is over- or under-segmented with respect to the human segmentation.
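  • The pair counting just described can be evaluated without enumerating pixel pairs explicitly by using the same contingency table; the sketch below is one standard way to do so.

      import numpy as np
      from scipy.special import comb

      def rand_index(seg1, seg2):
          """Fraction of pixel pairs labeled consistently by two segmentations.

          A pair agrees if it is grouped together in both segmentations or
          separated in both; values closer to 1 indicate better agreement.
          """
          s1, s2 = seg1.ravel(), seg2.ravel()
          n = s1.size
          _, inv1 = np.unique(s1, return_inverse=True)
          _, inv2 = np.unique(s2, return_inverse=True)
          table = np.zeros((inv1.max() + 1, inv2.max() + 1))
          np.add.at(table, (inv1, inv2), 1)
          together_both = comb(table, 2).sum()          # pairs joined in both
          together1 = comb(table.sum(axis=1), 2).sum()  # pairs joined in seg1
          together2 = comb(table.sum(axis=0), 2).sum()  # pairs joined in seg2
          total = comb(n, 2)
          # Agreements: pairs joined in both plus pairs separated in both.
          return (total - together1 - together2 + 2 * together_both) / total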
  • A series of segmentation experiments can be carried out using feature spaces based on color information. In some embodiments, the color values can be averaged over a square window of width w centered around each pixel. Effectively, this averaging is a preprocessing step that can aid in smoothing noise out of the image. Increasing the size of this window increases the level of smoothing and leads to a coarser segmentation. It will be appreciated that other filtering schemes can be used to smooth or preprocess the image before performing image segmentation.
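  • The window averaging can be implemented as a simple box filter applied channel-wise; the use of scipy's uniform_filter below is an illustrative choice among many equivalent smoothing filters.

      from scipy import ndimage

      def smooth_features(image, w=3):
          """Average color values over a w-by-w window centered on each pixel.

          image : (H, W, C) float array. Larger w smooths more aggressively
          and tends to coarsen the resulting segmentation.
          """
          # Box filter over the spatial axes; channels remain independent.
          return ndimage.uniform_filter(image, size=(w, w, 1))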
  • A first set of experiments was designed to determine how the performance of the segmentation scheme varied with the choice of color space. Experiments were carried out using the standard RGB values, the HSV color space, the LAB color space, and a color vector that concatenated the RGB and HSV values into a six dimensional color vector (a sketch of constructing this six-dimensional vector follows Table 1). These experiments were carried out using a randomly chosen subset of 150 of the segmentations in the database, and the average GCE and Rand Index values are reported. In all of these experiments the value of n was fixed at 12, the value of k was fixed at 1, and the value of w was fixed at 3. Table 1 summarizes the results of these experiments and indicates that the RGBHSV color space offers the best performance with respect to the two metrics.
  • TABLE 1
    Results of running the segmentation
    procedure using various color spaces
    Color space GCE Rand Index
    RGB 0.2805 0.7327
    HSV 0.2421 0.7527
    LAB 0.2578 0.7351
    RGBHSV 0.2014 0.7614
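  • A sketch of constructing the six-dimensional RGBHSV color vector is given below; the use of matplotlib's rgb_to_hsv and the [0, 1] value range are assumptions made for concreteness.

      import numpy as np
      from matplotlib.colors import rgb_to_hsv

      def rgbhsv_features(image):
          """Concatenate RGB and HSV values into a six-dimensional color vector.

          image : (H, W, 3) RGB array with values in [0, 1].
          Returns an (H*W, 6) feature matrix, one vector per pixel.
          """
          hsv = rgb_to_hsv(image)  # per-pixel HSV conversion
          return np.concatenate([image, hsv], axis=-1).reshape(-1, 6)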
  • A second set of experiments explored how the performance of the scheme varied with the number of splitting planes, n, and the Hamming distance threshold used to find local maxima, k. The experiments were carried out using the HSV color space on the same subset of 150 segmentations from the database. The mean GCE and Rand Index values were recorded for every combination of parameters and the results are summarized in Table 2. In practice, increasing values of the n parameter provide more ways to distinguish between feature vectors and can lead to over-segmentation, while increasing the k parameter decreases the number of local maxima detected in the code space and leads to under-segmentation.
  • TABLE 2
    Results of running the segmentation procedure
    using various values for the n and k parameters
    n k GCE Rand Index
    8 1 0.1952 0.7632
    8 2 0.2511 0.7443
    8 3 0.2450 0.7104
    12 1 0.1652 0.7535
    12 2 0.2250 0.7640
    12 3 0.2482 0.7417
    16 1 0.1005 0.7438
    16 2 0.1670 0.7583
    16 3 0.2236 0.7527
  • A third set of experiments was carried out to investigate how the performance of the scheme varied as the window size parameter, w, was varied. The experiments were carried out using the HSV color space with the n and k parameters fixed at 12 and 1 respectively. Table 3 contains the results of these trials. Increasing the value of w increases the level of smoothing, which typically leads to under-segmentation.
  • TABLE 3
    Results of running the segmentation procedure using various
    values for the size of the smoothing window, w in pixels
    w GCE Rand Index
    3 0.1219 0.7479
    5 0.1350 0.7494
    7 0.1477 0.7513
    11 0.1660 0.7520
    21 0.1992 0.7537
  • A fourth experiment was run to compare the results of the automated segmentation procedure to each of the 1633 human segmentations in the database. For this experiment the HSV color space was employed, the number of splitting planes, n, was 12, the Hamming distance threshold, k, was 2, and the window size, w, was 3. These parameter values were chosen to produce a visually pleasing over-segmentation of the images rather than to optimize the GCE or Rand Index values. Over the entire database the mean GCE value was 0.2235 and the median GCE value was 0.2157; the mean Rand Index value was 0.7370 and the median Rand Index value was 0.7833.
  • FIG. 3 shows histograms of the distributions of the GCE and Rand Index metrics over the entire data set. The graph on the left of FIG. 3 indicates the distribution of the GCE values over all of the segmentations in the database, while the graph on the right shows the distribution of the Rand Index values.
  • FIG. 4 shows a few of the images from the data set along with the corresponding human and machine segmentations. This figure compares the output of the automated segmentation procedure to human labeled segmentations. The first and fourth rows contain the input imagery, the second and fifth rows contain human segmentations, and the third and sixth rows contain machine segmentations.
  • FIG. 5 provides a direct comparison of segmentations produced by the methods of the present invention with those produced by the Mean Shift procedure for a few randomly chosen images in the data set. This figure compares the output of the proposed segmentation scheme with the results obtained using the Edison segmentation tool. The first row corresponds to the input image, the second to a human segmentation, the third to the mean shift result and the fourth to the randomized hash result. The parameters used for the Edison tool were (hs,hr,M)=(7,6.5,20) and the parameters used for the randomized method were (n,k,w)=(12,2,3).
  • One advantage of the proposed segmentation scheme is that the computational effort required scales linearly in the number of pixels and the operations required are simple and regular. To demonstrate this, a real time version of the scheme was implemented on a Macbook Pro laptop computer. This implementation was used to segment 640 by 480 video frames at a rate of 10 frames per second using a single core of an Intel Core 2 Duo processor running at 2.33 GHz. This rate includes the time taken for all phases of the algorithm: image acquisition, randomized hashing, local maxima detection and connected components processing. Since almost all of the steps in the procedure are embarrassingly parallel, the algorithm is well suited to implementation on modern multi-core processors and GPUs and should be amenable to further acceleration.
  • Applications
  • The proposed algorithm can be highly parallelizable and can be implemented in real time on modest hardware. This is an advantage since it means that the method could be used as a cheap preprocessing step in a variety of image interpretation applications, much as edge detection is used today. In some embodiments, the segmentation algorithm is used instead of edge detection in image processing in a real time environment. This can allow objects to be identified when coupled with pattern matching or relative motion to a background.
  • The method can be used on a mobile robot to produce a fast, rough segmentation of the scene into sky, ground, road and tree regions. This mobile robot can use image segmentation as described herein, as well as a ranging mechanism, to model its surrounding environment as described in concurrently filed application titled "Scene Analysis Using Image And Range Data," by C. J. Taylor, which is incorporated herein by reference.
  • Similarly, the segmentation scheme could be used as part of the loop in real time tracking applications where it would allow the system to automatically delineate targets. The lower processing requirements of hashing could make object detection faster and cheaper for real-time tracking. For example, if used in a distributed camera network, the segmentation scheme could be performed by CPUs on the cameras to detect an object and track it. A suitable camera network with tracking ability is described in concurrently filed application titled “Distributed Target Tracking Using Self Localizing Smart Camera Networks,” by C. J. Taylor, which is incorporated herein by reference.
  • In any of these applications, the real time segmentation scheme can be used as a pre-processing step that suggests possible groupings in the image to higher level interpretation algorithms. In this way a person of ordinary skill in the art could use this segmentation scheme as a preprocessing stage to any preferred image processing technique suitable for the application. The system could, for instance, focus its attention on regions based on their size, shape, texture or position in the image.
  • While the segmentation algorithm has been discussed and implemented in the context of color descriptors, it could equally easily be applied to feature spaces with higher dimension and can involve any combination of features, including both color and texture values and/or motion vectors.
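  • As one illustration of such a higher dimensional feature space, the sketch below concatenates color with a simple gradient-energy texture measure and, when a previous frame is available, a crude frame-difference motion cue; these particular feature choices are assumptions, not requirements of the method.

      import numpy as np
      from scipy import ndimage

      def extended_features(image, prev_gray=None):
          """Build a higher-dimensional per-pixel feature vector.

          image     : (H, W, 3) float color image.
          prev_gray : optional (H, W) grayscale version of the previous frame.
          """
          gray = image.mean(axis=-1)
          gx = ndimage.sobel(gray, axis=1)  # horizontal gradient
          gy = ndimage.sobel(gray, axis=0)  # vertical gradient
          texture = np.sqrt(gx ** 2 + gy ** 2)[..., None]
          channels = [image, texture]
          if prev_gray is not None:
              channels.append(np.abs(gray - prev_gray)[..., None])  # motion cue
          features = np.concatenate(channels, axis=-1)
          return features.reshape(-1, features.shape[-1])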
  • It should be readily apparent that the image processing techniques taught herein are suitable for execution in a computing environment that includes at least one processor. Suitable processing environments can include Intel, PowerPC, ARM, or other CPU-based systems having memory and a processor, but can also include any suitable embedded system, DSP, GPU, APU, or other multi-core processing environment including related hardware and memory. Similarly, the algorithms taught herein can be implemented by dedicated logic. Execution of these algorithms and techniques is not limited to a single processor environment, and can, in some contemplated embodiments, be performed in a client server environment, a cloud computing environment, a multicore environment, a multithreaded environment, a mobile device or devices, etc.
  • Although the invention has been described with reference to exemplary embodiments, it is not limited thereto. Those skilled in the art will appreciate that numerous changes and modifications may be made to the preferred embodiments of the invention and that such changes and modifications may be made without departing from the true spirit of the invention. It is therefore intended that the appended claims be construed to cover all such equivalent variations as fall within the true spirit and scope of the invention.

Claims (4)

1. A method for identifying salient segments in an image comprising the steps of:
choosing a plurality of planes that divide a feature space into a plurality of cells;
generating a first set of hash codes for a first set of pixels in an image based on a location in the feature space of a feature vector associated with each pixel in the first set, whereby the location of each feature vector relative to each of the plurality of planes contributes a binary value to each hash code;
determining a second set of hash codes selected from the first set of hash codes, wherein the second set of hash codes indicates local maxima of clusters of pixels;
assigning each of the first set of pixels to one of the second set of hash codes based on the Hamming distance between a first hash code from the first set of hash codes assigned to each pixel and each of the second set of hash codes; and
identifying segments in an image based on groups of adjacent pixels sharing common hash codes.
2. The method of claim 1, wherein the step of choosing a plurality of planes further comprises selecting a predetermined number of random planes in the feature space.
3. The method of claim 1, wherein the feature space is a three-dimensional color space.
4. The method of claim 1, wherein the step of choosing a plurality of planes further comprises training the plurality of planes based on human interpretations of images.