US20020154833A1 - Computation of intrinsic perceptual saliency in visual environments, and applications - Google Patents


Info

Publication number: US20020154833A1 (application US09/912,225)
Authority: US (United States)
Prior art keywords: image, analyzing, different, information, comparing
Legal status: Abandoned (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: US09/912,225
Inventors: Christof Koch, Laurent Itti
Current Assignee: California Institute of Technology (CalTech) (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: California Institute of Technology (CalTech)
Application events:
    • Application filed by California Institute of Technology (CalTech)
    • Priority to US09/912,225 (US20020154833A1)
    • Assigned to California Institute of Technology; assignors: Itti, Laurent; Koch, Christof
    • Publication of US20020154833A1
    • Priority to US11/430,684 (US8098886B2)
    • Priority to US13/324,352 (US8515131B2)
    • Status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/12 Edge-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components, by matching or filtering
    • G06V 10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V 10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters, with interaction between the filter responses, e.g. cortical complex cells
    • G06V 10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence

Definitions

  • the result is a single saliency map which contains not only small, localized salient objects as detected with the basic technique described with reference to FIG. 1, but also extended contours if those are salient.
  • A motion extraction module is applied to the luminance (Y) and chrominance (C) channels of the image at several spatial scales, yielding one “motion map” for each orientation, velocity and scale.
  • At 1210, non-linear spatial competition for salience, as described previously, is carried out on each resulting motion map. That is, the motion saliency of multiple objects moving in roughly the same direction and at roughly the same speed is evaluated by the competitive and iterative process described above. This step is crucial for evaluating the saliency of more than one object that moves in a similar direction and at a similar speed.
  • This system is used for detecting saliency in the motion channel.
  • a nonlinear within-feature competition scheme is used to detect motion in luminance and also in chrominance in a multiscale manner. This provides one motion map for each of orientation, velocity and scale for each of luminance and chrominance.
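  • For illustration only, the following sketch builds such motion maps with a simple Reichardt-style shift-and-correlate detector applied to Gaussian pyramids of two consecutive frames (luminance or chrominance). It is a stand-in for the spatio-temporal motion filters discussed here; the function names, direction set and speeds are assumptions rather than the patent's implementation.

```python
import numpy as np

def motion_maps(prev_pyr, curr_pyr,
                directions=((0, 1), (0, -1), (1, 0), (-1, 0)), speeds=(1, 2)):
    """Reichardt-style correlation stand-in for spatio-temporal motion filters:
    for each direction, speed and pyramid level, the current frame is compared
    with the previous frame shifted along that direction."""
    def shift(img, dy, dx):
        # circular shift is used for brevity; borders would normally be zero-padded
        return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

    maps = {}
    for lvl, (prev, curr) in enumerate(zip(prev_pyr, curr_pyr)):
        for (dy, dx) in directions:
            for v in speeds:
                shifted_prev = shift(prev, v * dy, v * dx)
                shifted_curr = shift(curr, v * dy, v * dx)
                # opponent correlation, rectified at zero: large where the scene
                # actually moved by (v*dy, v*dx) between the two frames
                maps[(lvl, (dy, dx), v)] = np.maximum(
                    shifted_prev * curr - prev * shifted_curr, 0.0)
    return maps
```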
  • Another embodiment recognizes that the Adelson-Bergen or spatio-temporal image filters are specialized to pick up motion.
  • Classic motion detectors do not respond to flicker in the image since nothing is moving in any direction.
  • an additional filter may be added which provides a temporal derivative channel to pick up the flicker.
  • This embodiment looks at flicker in animated sequences. This may be of particular relevance for evaluating the saliency of web pages, marquee advertising, or electronic displays with flashing LEDs.
  • This absolute difference value is compared against a threshold: if the change in image intensity is too small, it is not considered, since it might be produced by noise.
  • Other temporal information may be calculated at 1310 , such as taking the derivative of colors, e.g. the red-green or blue-yellow color channels, with respect to time. Again, the absolute value of the temporal derivative in the red-green and in the blue-yellow color channels can be considered.
  • a test is made to determine if the change is over the whole image. If so, then the process stops. This is based on the recognition that flickering of the entire image may not be very salient.
  • the image portion that flickers is identified as salient, or increased in salience according to results of the iterative competition process applied to the flicker map.
  • a preferred embodiment for a flicker saliency channel hence may include:
  • a basic rectified flicker extraction module based on taking the absolute value of the difference between two successive frames.
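  • A minimal sketch of such a rectified flicker channel is shown below, assuming grayscale frames scaled to [0, 1]; the threshold value, the "whole image" fraction, and the function name are illustrative assumptions.

```python
import numpy as np

def flicker_map(prev_frame, curr_frame, threshold=0.03, global_fraction=0.75):
    """Rectified flicker extraction (FIG. 13): absolute frame difference, small
    changes discarded as noise, and the map zeroed when nearly the whole image
    changes, since whole-image flicker is not considered very salient."""
    diff = np.abs(curr_frame.astype(float) - prev_frame.astype(float))
    diff[diff < threshold] = 0.0                 # discard changes likely due to noise
    if np.mean(diff > 0) > global_fraction:      # change covers (almost) the whole image
        return np.zeros_like(diff)
    return diff
```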
  • The above system evaluates saliency based on images obtained using a classical red-green-blue representation. This gives rise to two opponency channels (red-green and blue-yellow), an intensity channel, and four orientation channels. These seven channels are processed in separate computational streams. This can be extended to many more channels.
  • Such multi-spectral or hyper-spectral image sensors may include near and far infra-red cameras, visible light cameras, synthetic aperture radar and so on. With images comprising large numbers of spectral bands, e.g., up to hundreds of channels in some futuristic military scenarios, significant redundancies will exist across different spectral bands. The saliency system can therefore be used to model more sophisticated interactions between spectral channels.
  • One extension is connections across channels, whereby each feature map at a given scale can receive multiplicative excitatory or inhibitory input from another feature map at the same or a different spatial scale, as sketched below.
  • These connections extend the interactive spatial competition for salience already implemented in the saliency model: at each time step, spatial interactions within each map may be iterated, followed by one iteration of interactions across maps.
  • Supervised training algorithms can be applied to train the weights by which the different channels interact. The resulting system may be able to exploit multi-spectral imagery in a much more sophisticated manner than is currently possible.
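  • The following hypothetical sketch shows one way such multiplicative cross-channel connections could be wired, with a weight matrix coupling each map to the others (hand-specified here, learned in the multi-spectral setting); the gain formulation and all names are assumptions, not the patent's specification.

```python
import numpy as np
from scipy.ndimage import zoom

def cross_channel_gating(maps, weights):
    """Each feature map receives multiplicative excitatory (w > 0) or inhibitory
    (w < 0) input from the other maps; weights[i][j] couples map j into map i."""
    gated = []
    for i in range(len(maps)):
        target = maps[i].astype(float)
        gain = np.ones_like(target)
        for j in range(len(maps)):
            if i == j or weights[i][j] == 0.0:
                continue
            source = maps[j].astype(float)
            # resample the source map to the target map's spatial scale
            factors = (target.shape[0] / source.shape[0],
                       target.shape[1] / source.shape[1])
            gain += weights[i][j] * zoom(source, factors, order=1)
        gated.append(np.maximum(target * gain, 0.0))   # rectify after gating
    return gated
```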
  • This may be used to calculate saliency/conspicuity of items which are being displayed, for example, in an advertising context.
  • This may include advertisements, visual art and text in print (magazines, newspapers, journals, books); posters, billboards and other outside, environmental displays; advertisements, visual art and text in electronic format on the world-wide web or on computers; as well as the saliency/conspicuity of dynamic advertisements, visual art and clips in movies, TV film, videos, dynamic display boards or graphical user interfaces.
  • It may also be used for the saliency/conspicuity of displays of products placed in shop windows, department stores, aisles and shelves, printed ads and so on for product placement. That is, given a particular product (e.g. a soda brand, wine bottle, candy bar), the software evaluates its saliency within the entire display by taking account of the entire view as would be seen by a casual observer or shopper.
  • the software can also determine how to change the visual appearance of the product, including its shape and its label, in order to increase its saliency. It can do so by providing specific information to the user on which features, at which spatial scales, are more or less salient than the object or location that the user wishes to draw the attention of the viewer to. For instance, say the user wishes to draw the eye of the viewer to a specific brand of candy bars in an array of candy bars, chocolates and other sweets. By inspecting the conspicuity maps for color, orientation and intensity (see FIG. 1), the user can get a first impression of which objects in the scene are salient because of an intensity difference, because of a color difference or because of their spatial orientation relative to the background.
  • Each parameter may be varied in each direction to determine whether that part of the image becomes more salient or less salient.
  • A part of the image, for example, could be made a little redder.
  • An evaluation is then made of whether the saliency increases. If the saliency does increase from that change, then the image can be made redder still. This can be continued until the maximum saliency obtainable from that parameter is reached.
  • The search process can be carried out through feature channels, including any of the feature channels noted above, and through different scales.
  • The parameter is changed systematically through each of these values to determine the effect on saliency, allowing the saliency of different parts of the image to be manipulated. A sketch of such a search over a single parameter follows.
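  • As a purely illustrative sketch of the search described above, the following greedy loop nudges one parameter (the redness of a masked product region) and keeps the change only while the region's saliency increases; compute_saliency stands for the full model described earlier, and the step size and stopping rule are assumptions.

```python
import numpy as np

def optimize_redness(image, region_mask, compute_saliency, step=0.05, max_steps=10):
    """Greedy search over one parameter (redness of the target region): nudge the
    parameter, keep the change while the region's saliency keeps increasing, and
    stop at the first step that no longer helps. Assumes an RGB image in [0, 1]."""
    def region_saliency(img):
        return compute_saliency(img)[region_mask].mean()

    best_img = image.astype(float)
    best_score = region_saliency(best_img)
    for _ in range(max_steps):
        candidate = best_img.copy()
        # make the masked region a little redder
        candidate[..., 0] = np.clip(
            candidate[..., 0] + step * np.where(region_mask, 1.0, 0.0), 0.0, 1.0)
        score = region_saliency(candidate)
        if score <= best_score:
            break                      # saliency no longer increases for this parameter
        best_img, best_score = candidate, score
    return best_img, best_score
```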
  • An additional aspect learns from the way in which images are made more salient. From this “experience”, the system may use a conventional learning system to write rules which say, in a certain kind of the image/background/space, do a certain operation in order to increase the salience of the image portion. This automated system hence provides rules or preferences which can increase the salience.
  • The software can alert a (semi-)autonomously moving robotic device to salient locations in the environment that need to be further inspected by a high-resolution sensory system or by a human observer.
  • This model may predict where casual observers will place their attention. For example, this could be offered as a service, where advertisers send their ad to the service, and the service analyzes it and sends it back with an analysis of its saliency.
  • Another paradigm is a web-based service where people submit images and the software automatically determines the first, second, third etc. most salient locations.
  • the paradigm can also be carried out on a computer such as a PDA with attached camera.
  • the software runs on this hand-held device as a sort of “saliency meter” for determining the saliency of, for example, a product display.

Abstract

Detection of image salience in a visual display of an image. The image is analyzed at multiple spatial scales and over multiple feature channels to determine the likely salience of different portions of the image. One application for the system is in an advertising context. The detection may be improved by second-order statistics, e.g., the mean and standard deviation of different image portions relative to other portions. Different edges may be considered as being extended edges by looking at the edges over multiple spatial scales. One set of feature channels can be optimized for use in moving images, and can detect motion or flicker. The images can be obtained over multiple spectral ranges, and the user can be instructed about how to maximize the saliency. This can be applied to automatically evaluate and optimize sales or advertisement displays.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application claims priority from provisional application Nos. 60/274,674, filed Mar. 8, 2001, and 60/288,724, filed May 4, 2001.
  • STATEMENT AS TO FEDERALLY-SPONSORED RESEARCH
  • This work was supported by the Engineering Research Centers Program of the National Science Foundation under Award Number EEC-9402726 and by the Office of Naval Research under Award Number N00014-95-1-0600. The US Government may have certain rights in this invention.
  • BACKGROUND
  • Different systems for analysis of vision components are known. Many of these systems, broadly categorized under machine vision, attempt to use the machine, usually a digital computer running dedicated software, to actually identify parts of the image.
  • However, vision algorithms frequently fail when confronted with real-life images. These real-life images may be of extremely high resolution, e.g., on the order of 6000 by 4000 pixels, and may be very cluttered with information that might not necessarily be relevant to the visual task at hand. For instance, many images may have partially occluding objects such as foliage, vehicles, people and so on.
  • It is believed that biological vision systems use a different approach. The mammalian visual system is believed to use a computational strategy of identifying interesting parts of the image without extensively analyzing the content of the image. The entire image may be analyzed in parallel for simple features. Portions of the image are then selected, based either on their behavioral relevance or based on local image cues. The local image cues may include brightness, motion, and/or color and others. The mammalian brain evolved in this manner to handle the enormous amount of information that is received from a scene. This information has been estimated as being on the order of up to 10^8 bits per second along the optic nerve, the axonal fibers that constitute the output of the retina. This may exceed what the brain is capable of fully processing and assimilating into its conscious experience.
  • Because of this processing strategy, only a small fraction of the information that is actually registered by the human visual system actually influences behavior. Different studies have demonstrated this in different ways. In some studies (“change blindness”) (Rensink, R. A., O'Regan, J. K., and Clark, J. J. “To see or not to see: The need for attention to perceive changes in scenes,” Psychological Sci. 8:368-373, 1997) significant image changes are not actually perceived under natural viewing conditions. However, once the attention of the person is directed to these changes, they can be easily perceived. This implies that even though a part of an image might be registered by the brain, the conscious mind might not be visually aware of that part or any other in the image.
  • Those parts of an image which elicit a strong, rapid and automatic response from viewers, independent of the task they are trying to solve, can be referred to as being “visually salient”. Two examples of such salient locations are a green object among red ones, or a vertical line among horizontal ones. The mind can direct its attention to other parts of the image, although that may require voluntary effort.
  • SUMMARY
  • The present invention describes a computer-based implementation that allows automatic detection of salient parts of image information. This may use a model which is based on the way the primate's visual system is believed to process the retinal image stream.
  • The application discloses the basic model, and applications of the model to various practical uses. One such use includes detection of the effectiveness of an image or temporal sequence of images in displaying their content, e.g., in an advertising context. Some specific model attributes are also disclosed. A first model attribute describes higher-order statistical analysis of image information to compute saliency. Another model attribute discloses detection of extended but interrupted contours within the image information that can contribute to image saliency. In another model attribute, the computation of saliency specific to moving objects in a video sequence or constantly changing image sequences is described. Another aspect relates to the improvement of computing saliency for video sequence detection, by detecting portions of the video sequence which flicker. Another relates to the usage of multiple spectral images acquired of the same scene. Another relates to the ability of the model to provide specific feedback on how to improve the saliency of specific objects or locations in the scene.
  • At the basis of the invention was the original concept of a “saliency map” proposed by Koch and Ullman (Koch, C. and Ullman, S. Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology, 4:219-227, 1985), and two detailed computer implementations: Itti, L., Koch, C. and Niebur, E. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Analysis & Machine Intell. (PAMI) 20:1254-1259, 1998, and Itti, L. and Koch, C. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research 40:1489-1506, 2000.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • These and other aspects of the invention will be described in detail with reference to the accompanying drawings, wherein:
  • FIG. 1 shows a flow diagram of a model of saliency-based attention;
  • FIG. 2 shows a block diagram of the nonlinear filtering using an iterated difference-of-Gaussians filter;
  • FIG. 3 shows a diagram of waveforms obtained at different spatial resolutions or scales;
  • FIGS. 4A-4H show results of different numbers of iterations of the iterative equation to converge to salient elements;
  • FIG. 5 shows an exemplary field with a background and an internal elliptical area;
  • FIG. 6 shows a block diagram of a statistical measure of pixel distribution using higher-order statistics;
  • FIG. 7 shows a flowchart of operation of obtaining the different image pyramids;
  • FIG. 8 shows a diagram of the different pyramids obtained;
  • FIG. 9 shows a flowchart of finding extended image contours;
  • FIGS. 10A-10C show additional information in finding the extended contours;
  • FIG. 11 shows some notion of the different image contour operations;
  • FIG. 12 shows a flowchart of motion in an extended image sequence; and
  • FIG. 13 shows a flowchart of thresholding.
  • DETAILED DESCRIPTION
  • FIG. 1 shows a system for determining a saliency map, which may be a two-dimensional map that encodes salient objects in a visual environment. The map of the scene expresses the saliency of all locations in this image. This map is the result of competitive interactions among feature maps for image features including color, orientation, texture, motion, depth and so on, that interact within and across each map. At any time, the currently strongest location in the saliency map corresponds to the most salient object. The value in the map represents the local saliency of any one location with respect to its neighborhood. By default, the system directs attention towards the most salient location.
  • A second most salient location may be found by inhibiting the most salient location, causing the system to automatically shift to the next most salient location.
  • The techniques described herein are based on the bottom-up control of attention, i.e., control that is based on the properties of the visual stimulus. This compares with a top-down component, which may be based not only on the content of the image but also on additional high-level features that may depend on a specific visual task at hand. An example of a top-down component would include, for example, storing an image of a face of a person one is searching for, followed by correlating that image across the entire scene.
  • A task of the saliency map is to compute a scalar quantity representing the salience at every location in the visual field, and to guide the subsequent selection of attended locations. The “feature maps” provide the input to the saliency map, which is modeled as a neural network receiving its input at a particular spatial scale (here scale 4).
  • The input image 100 may be a digitized image from a variety of sources. In one embodiment, the digitized image may be from an NTSC video camera.
  • At 105, linear filtering is carried out at different spatial scales, here nine spatial scales. The spatial scales may be created using Gaussian pyramid filters of the Burt and Adelson type. These pyramid filters may include progressively low-pass filtering and sub-sampling of the input image. The spatial processing pyramids can have an arbitrary number of spatial scales. In the example provided, nine spatial scales provide horizontal and vertical image reduction factors ranging from 1:1 (level 0, representing the original input image) to 1:256 (level 8) in powers of 2. This may be used to detect differences in the image between fine and coarse scales.
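  • A minimal sketch of such a nine-level dyadic Gaussian pyramid is shown below, assuming a grayscale image in a NumPy array; the smoothing width and the function name are illustrative choices rather than the Burt and Adelson filter taps.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels=9):
    """Dyadic Gaussian pyramid: level 0 is the input image, each following level
    is low-pass filtered and subsampled by 2 (level 8 is reduced by 1:256)."""
    pyramid = [image.astype(float)]
    for _ in range(1, levels):
        blurred = gaussian_filter(pyramid[-1], sigma=1.0)  # low-pass before decimation
        pyramid.append(blurred[::2, ::2])                  # keep every other row/column
    return pyramid
```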
  • Each portion of the image is analyzed by comparing the “center” portion of the image with the “surround” portion of the image. Each comparison, called a “center-surround” difference, may be carried out at multiple spatial scales indexed by the scale of the center, c, where, for example, c=2, 3 or 4 in the pyramid scheme. Each one of those is compared to the scale of the surround s=c+d, where, for example, d is 3 or 4. This example would yield 6 feature maps for each feature, at the scale pairs 2-5, 2-6, 3-6, 3-7, 4-7 and 4-8 (for instance, in the last case, the image at spatial scale 8 is subtracted, after suitable normalization, from the image at spatial scale 4). One feature type encodes for intensity contrast, e.g., “on” and “off” intensity contrast shown as 115. This may encode for the modulus of image luminance contrast, which shows the absolute value of the difference between center intensity and surround intensity. The differences between two images at different scales may be obtained by oversampling the image at the coarser scale to the resolution of the image at the finer scale. In principle, any number of scales in the pyramids, of center scales, and of surround scales may be used.
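  • The center-surround scheme just described can be sketched as follows, with the coarser surround level interpolated up to the center level's resolution before the absolute difference is taken; the helper name and the interpolation order are assumptions.

```python
import numpy as np
from scipy.ndimage import zoom

def center_surround_maps(pyramid, centers=(2, 3, 4), deltas=(3, 4)):
    """Six |center - surround| maps per feature, following the 2-5 ... 4-8 scheme."""
    maps = {}
    for c in centers:
        for d in deltas:
            s = c + d
            surround = pyramid[s]
            # oversample the coarser surround level to the center level's resolution
            factors = (pyramid[c].shape[0] / surround.shape[0],
                       pyramid[c].shape[1] / surround.shape[1])
            surround_up = zoom(surround, factors, order=1)
            maps[(c, s)] = np.abs(pyramid[c] - surround_up)
    return maps
```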
  • Another feature 110 encodes for colors. With r, g and b respectively representing the red, green and blue channels of the input image, an intensity image I is obtained as I=(r+g+b)/3. A Gaussian pyramid I(s) is created from I, where s is the scale. The r, g and b channels are normalized by I at 131, at the locations where the intensity is at least 10% of its maximum, in order to decorrelate hue from intensity.
  • Four broadly tuned color channels may be created, for example as: R=r−(g+b)/2 for red, G=g−(r+b)/2 for green, B=b−(r+g)/2 for blue, and Y=(r+g)/2−|r−g|/2−b for yellow (negative values are set to zero). At 130, center-surround differences are computed across scales. Two different feature maps may be used for color, a first encoding red-green opponency, and a second encoding blue-yellow opponency. Four Gaussian pyramids R(s), G(s), B(s) and Y(s) are created from these color channels. Depending on the input image, many more color channels could be evaluated in this manner.
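  • A sketch of these broadly tuned color channels is shown below, assuming an RGB image scaled to [0, 1]; the red-green and blue-yellow opponency maps would then be formed from these channels and passed through the same center-surround differencing as the other features.

```python
import numpy as np

def broadly_tuned_colors(rgb):
    """R, G, B, Y channels as defined above; hue is decorrelated from intensity by
    normalizing r, g, b with I wherever I exceeds 10% of its maximum."""
    r, g, b = (rgb[..., 0].astype(float),
               rgb[..., 1].astype(float),
               rgb[..., 2].astype(float))
    I = (r + g + b) / 3.0
    mask = I > 0.1 * I.max()
    r, g, b = [np.where(mask, ch / np.maximum(I, 1e-9), 0.0) for ch in (r, g, b)]
    R = np.clip(r - (g + b) / 2.0, 0, None)                       # red
    G = np.clip(g - (r + b) / 2.0, 0, None)                       # green
    B = np.clip(b - (r + g) / 2.0, 0, None)                       # blue
    Y = np.clip((r + g) / 2.0 - np.abs(r - g) / 2.0 - b, 0, None) # yellow
    return I, R, G, B, Y
```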
  • In one embodiment, the image sensor 99 that obtains the image of a particular scene is a multi-spectral image sensor. This image sensor may obtain different spectra of the same scene. For example, the image sensor may sample a scene in the infra-red as well as in the visible part of the spectrum. These two images may then be evaluated in a similar manner to that described above.
  • Another feature type may encode for local orientation contrast 120. This may use the creation of oriented Gabor pyramids as known in the art. Four orientation-selective pyramids may thus be created from I using Gabor filtering at 0, 45, 90 and 135 degrees, operating as the four features. The maps encode, as a group, the difference in average local orientation between the center and surround scales. In a more general implementation, many more than four orientation channels could be used.
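  • A sketch of such orientation maps using small hand-built Gabor kernels follows; the kernel size, wavelength and envelope width are illustrative assumptions, and in the model these maps would subsequently go through the same center-surround differencing as the other features.

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_kernel(theta_deg, sigma=2.0, wavelength=7.0, size=9):
    """Odd-sized, zero-mean Gabor patch at the given orientation (cosine phase)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    theta = np.deg2rad(theta_deg)
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)
    return g - g.mean()

def orientation_maps(intensity_pyramid, angles=(0, 45, 90, 135)):
    """One rectified Gabor-filtered map per orientation and pyramid level
    (very small pyramid levels are skipped)."""
    return {(a, lvl): np.abs(convolve(img, gabor_kernel(a), mode='nearest'))
            for a in angles
            for lvl, img in enumerate(intensity_pyramid)
            if min(img.shape) >= 9}
```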
  • In summary, differences between a “center” fine scale c and a “surround” coarser scale s yield six feature maps for each of intensity contrast, red-green double opponency, blue-yellow double opponency, and the four orientations. A total of 42 feature maps is thus created, using six pairs of center-surround scales in seven types of features, following the example above. A different number of feature maps may be obtained using a different number of pyramid scales, center scales, surround scales, or features.
  • 130 shows normalizing the features to extract salient image locations from the raw center-surround maps, and to discard inconspicuous locations. This process may be critical to the operation of the system. This operation follows the flowchart of FIG. 2. At 200, each feature map is first normalized to a fixed dynamic range, such as between 0 and 1. This may eliminate feature-dependent amplitude differences that may be due to different feature extraction mechanisms.
  • At each step of the iteration, the map is convolved with a large difference-of-Gaussians kernel at 215 and the results are added to the center contents of the map at 210. The additional input implements the short-range excitation processes and the long-range inhibitory processes between neighboring visual locations. The map is then half-wave rectified at 220, which may remove negative results. This makes the iterative process nonlinear, which may improve the results.
  • Specifically, the filter carries out

    DoG(x, y) = (c_ex^2 / (2π σ_ex^2)) exp(−(x^2 + y^2) / (2σ_ex^2)) − (c_inh^2 / (2π σ_inh^2)) exp(−(x^2 + y^2) / (2σ_inh^2))    (1)
  • where c_ex and c_inh are positive numbers that denote the strength of the excitatory center response and the strength of the inhibitory surround response, respectively, and σ_ex and σ_inh denote the width, spatial extent or size of the associated excitatory central Gaussian and inhibitory surround Gaussian. In Eq. 1, the inhibitory surround Gaussian is subtracted from the excitatory central Gaussian to obtain a so-called “Mexican-Hat” operator or “Difference-of-Gaussians”, hence the name ‘DoG’. This can also be seen in the central box 215 of FIG. 2.
  • At each iteration, the feature map M goes through the following transformation:

    M ← |M + M ∗ DoG − C_inh|≥0    (2)

  • Eq. 2 obtains the new value of the map M by taking the current map M, filtering it through the DoG filter of Eq. 1, adding the result to the existing map M, and subtracting an inhibitory constant C_inh. Positive results are kept; negative results are set to zero (the |·|≥0 rectification).
  • Each feature map is iterated 10 times using this equation. Different numbers of iterations may be carried out, based on experience and the application domain. The local excitation is counteracted by broad inhibition from neighboring locations. This spatial interaction across the entire map may be crucial for resolving competition among salient items.
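  • The iterative normalization of Eqs. 1-2 can be sketched as below; separable Gaussian blurs stand in for convolution with the DoG kernel, and the particular strengths, widths and inhibitory constant are illustrative assumptions rather than the patent's parameter values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def normalize_iterative(feature_map, n_iter=10, c_ex=0.5, c_inh=1.5,
                        sigma_ex=2.0, sigma_inh=25.0, c_global_inh=0.02):
    """Iterative within-map competition of Eqs. 1-2:
    M <- |M + M*DoG - C_inh| rectified at zero, repeated n_iter times."""
    M = feature_map.astype(float)
    M = (M - M.min()) / (M.max() - M.min() + 1e-12)   # step 200: normalize to [0, 1]
    for _ in range(n_iter):
        # two Gaussian blurs approximate convolution with the DoG kernel of Eq. 1
        excitation = c_ex * gaussian_filter(M, sigma_ex)
        inhibition = c_inh * gaussian_filter(M, sigma_inh)
        M = M + excitation - inhibition - c_global_inh
        M[M < 0] = 0.0                                 # half-wave rectification (220)
    return M
```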
  • FIG. 3 shows two different examples of the six center-surround receptive field types. The left part of the figure shows Gaussian pixel widths, numbered 0-8, for the 9 spatial scales used in the model example of FIG. 1. Scale 0 corresponds to the original image, and each subsequent scale is coarser by a factor 2. At the coarsest scale, sigma=8, almost the entire image is blurred and only very coarse objects are visible as blobs. 300 and 302 show two examples of the six center-surround receptive field types. 300 shows the scale pair 2-5, representing the image filtered at sigma=5 subtracted from the image filtered at sigma=2. 302 shows the scale pair 4-8. The spatial competition for salience may be implemented within each of the feature maps. Each map receives input from the filtering and center-surround stages.
  • An example of results is shown in FIGS. 4A-4H. FIG. 4A shows the actual image, with iteration 0 (FIG. 4B) showing the items that are present in FIG. 4A. FIG. 4C shows two iterations of the type illustrated in FIG. 2, showing that the salient features begin to emerge. This is shown in further detail in FIG. 4D (iteration 4), 4E (iteration 6), 4F (iteration 8), 4G (iteration 10) and 4H (iteration 12). FIG. 4G, representing iteration 10, clearly shows which features are most salient, and this only becomes more evident in FIG. 4H showing the result of iteration 12. Since there is not that much difference between iterations 10 and 12, in this situation it is evident that the iteration can be stopped at 10. The net effect of the iterative process in this example was to reinforce the brightest object while suppressing the darker objects, which may embody the fact that the brightest object may be perceived as visually salient by human observers.
  • After normalization at 130, the feature maps for intensity, color, and orientation are summed across scales into three separate “conspicuity maps,” 133 for intensity, 134 for color and 136 for orientation. Conspicuity maps for other features, such as motion or flicker, can easily be added here.
  • Each conspicuity map is then subjected to another 10 iterations of the iterative normalization process shown in FIG. 2. The motivation for the creation of three separate channels, and for their individual normalization, is the hypothesis that similar features compete strongly for salience, while different modalities contribute independently to the saliency map.
  • This “within-feature” competition globally promotes the most salient portions of each feature map, both within a feature and over the whole map.
  • After this, at 150, linear combinations of these maps are taken to form the unique saliency map shown as 155. At any given time, the maximum of the saliency map may correspond to the most salient stimulus, and represents the item to which the focus of attention should next be directed. Hence, at any given time, the most salient location may be determined from the maximum of the saliency map. This may be effected at 160 using a “winner-take-all” technique.
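  • The across-scale summation into conspicuity maps and their linear combination can be sketched as follows; normalize stands for the iterative competition shown earlier, out_shape is the common map resolution (e.g., that of scale 4), and the equal 1/3 weighting is an assumption.

```python
import numpy as np
from scipy.ndimage import zoom

def conspicuity_and_saliency(intensity_maps, color_maps, orientation_maps,
                             out_shape, normalize):
    """Sum normalized feature maps across scales into three conspicuity maps,
    normalize each again, then average them into a single saliency map."""
    def across_scale_sum(maps):
        acc = np.zeros(out_shape)
        for m in maps:
            factors = (out_shape[0] / m.shape[0], out_shape[1] / m.shape[1])
            acc += zoom(normalize(m), factors, order=1)   # resample to common scale
        return acc

    C_int = normalize(across_scale_sum(intensity_maps))    # 133
    C_col = normalize(across_scale_sum(color_maps))        # 134
    C_ori = normalize(across_scale_sum(orientation_maps))  # 136
    return (C_int + C_col + C_ori) / 3.0                   # 155: saliency map
```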
  • Different “winner-take-all” techniques are known. However, this system may use a two-dimensional layer of integrate-and-fire neurons with strong global inhibition.
  • The system as described might direct its focus of attention constantly to one location, since the same winner would always be selected. Accordingly, feedback shown as 165 is provided from the “winner-take-all” array 160 to the saliency map 155. That is, after some period of variable delay, the saliency of the winning location may be transiently inhibited. This assures that the “winner-take-all” circuit automatically selects the next most salient location. As a consequence, attention then switches to the next most conspicuous location. This inhibition prevents a previously attended location from being attended to again within a short interval and endows the entire algorithm with a dynamic element.
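  • In place of the integrate-and-fire network, the following simplified sketch selects successive winners by repeatedly taking the maximum of the saliency map and transiently zeroing a disc around it (inhibition of return); the disc radius and number of shifts are arbitrary illustrative values.

```python
import numpy as np

def attend(saliency, n_shifts=5, inhibition_radius=20):
    """Greedy stand-in for the winner-take-all network: pick the maximum of the
    saliency map, then suppress a disc around it so attention shifts to the
    next most salient location."""
    S = saliency.copy()
    yy, xx = np.mgrid[0:S.shape[0], 0:S.shape[1]]
    fixations = []
    for _ in range(n_shifts):
        y, x = np.unravel_index(np.argmax(S), S.shape)   # current winner (160)
        fixations.append((y, x))
        # inhibition of return (165): transiently zero the winning neighborhood
        S[(yy - y)**2 + (xx - x)**2 <= inhibition_radius**2] = 0.0
    return fixations
```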
  • The above embodiment has described comparisons between different image parts at 130 which rely on simple center-surround mechanisms. These are implemented as differences between mean pixel values across the different spatial scales, as described. However, in some situations this might not detect a dissimilarity between the center and surround regions that is present only in the higher-order spatial statistics of the input.
  • Another embodiment determines higher-order, e.g., second-order, statistics in the image. This may be done for any of the previously described purposes. For example, consider the case as shown in FIG. 6, where the center and surround are two different textures with similar means but different higher-order statistics (for instance, different variances). A simple comparison of the mean pixel values between the center and surround regions would show a low saliency, while the two textures may appear quite dissimilar to human observers.
  • An alternative embodiment described herein takes into account not only mean value differences between center and surround, but also the statistical distribution of the information. [0054]
  • An embodiment describes the use of second-order statistics, here the variance of pixel distribution. This technique may be used when a simple comparison of mean pixel values between center and surround regions shows a low saliency. Alternatively, this may be used for all applications of the invention.
  • This system may provide a statistical measure of a difference of distributions of pixel values between the center and surrounding regions.
  • This embodiment may assume that the pixels should be distributed in a Gaussian format. While this assumption holds for only certain kinds of images, it may still represent a better approximation than the first embodiment. However, more general statistical assumptions could also be used.
  • An example is shown in FIG. 5. An image is shown having a textured background area, and an elliptical area within that background. An observer can easily see the elliptical area within the background in FIG. 5, even though the average pixel values of the two regions are more or less the same.
  • FIG. 6 shows a block diagram of a center-surround neuronal “unit” of this embodiment. This unit compares two different parts 600, 605 with different textures. The unit compares the distribution of pixel values between the center 605 and surround 600 regions. In the example shown, the mean pixel values are substantially identical over the center and concentric surround regions. Therefore, an operator that only considers the mean intensity in the center and subtracts it from the average intensity in the surround would obtain a value close to zero and would not find the center portion to be salient. Note that the means of the two Gaussian distributions in the middle plot are identical.
  • This embodiment takes the variance into account, as shown. The variance of the center region, 610, is higher than the variance 615 of the surround. The distributions of pixel values in the center and surround are approximated by two Gaussian functions. A statistical measure of similarity between those distributions (such as the Kullback divergence) may then be used to compute the response of the neuron at 620, such that identical distributions yield no neuronal response while very different distributions yield a strong response.
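  • Under the Gaussian assumption, the comparison at 620 can be sketched with the symmetric Kullback-Leibler (J) divergence between the two fitted Gaussians, as below; the example numbers are purely illustrative.

```python
import numpy as np

def gaussian_kl_j_divergence(mu_c, var_c, mu_s, var_s):
    """Symmetric (J) Kullback-Leibler divergence between two 1-D Gaussians fitted
    to the center and surround pixel distributions; identical distributions give
    0, very different ones give a large response."""
    kl_cs = 0.5 * (var_c / var_s + (mu_s - mu_c)**2 / var_s - 1.0 + np.log(var_s / var_c))
    kl_sc = 0.5 * (var_s / var_c + (mu_c - mu_s)**2 / var_c - 1.0 + np.log(var_c / var_s))
    return kl_cs + kl_sc

# Example: same mean, different variances -> non-zero "neuronal response" at 620
response = gaussian_kl_j_divergence(mu_c=0.5, var_c=0.04, mu_s=0.5, var_s=0.01)
```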
  • The mean and standard deviation may be calculated as follows, and as shown in the flowchart of FIG. 7. The pixel distribution is taken in a region represented by a pixel at a given level in a multiscale image pyramid.
  • At 700, two different image “pyramids” are created, that is, two sets of images filtered at different resolutions (the sigmas of FIG. 3). These pyramids accumulate, respectively, the sum and the sum of the squares of all the pixels up to the chosen level of the pyramid. That is, at a given level n in the sum pyramid, each pixel is the sum of the pixel values x_i of the (d^n)^2 corresponding pixels at the base level of the pyramid, where d is the scaling between levels in the pyramid. In the specific implementation, d=2.
  • The sum-of-squares pyramid is similar, except that an image of the sum of the squares of the pixel values in the original image is used as the base of the pyramid.
  • This data is already calculated and stored in the two pyramids. Therefore, at 705, the mean and standard deviation for any pixel at level n in the pyramid can be easily calculated as
    μ = (1/N) Σ_i x_i,  σ² = 1/(N−1) · [ Σ_i x_i² − (1/N)(Σ_j x_j)² ],  with N = (d^n)²,
    where N is the number of base-level pixels represented by that pixel. [0064]
  • At 710, saliency is then derived from a comparison between the mean and standard deviation of the center region and those of the surrounding region. The comparison may also use other similar measures, including the Euclidean distance between the mean-standard deviation pairs, ideal-observer discrimination, and the Kullback J-divergence. [0065]
  • This higher order comparison may be applied not only to the intensity channel, but also to the color-opponency and orientation-selective channels, or to any other channel. [0066]
  • FIG. 8 graphically illustrates the computation of the mean and variance of the pixel distribution within increasingly larger square regions, using an image pyramid architecture. From the original input image 800, two dyadic image pyramids are created. In the sum pyramid on the left, each pixel at a given level “n” contains the sum of all corresponding pixels at level 0 (the original image). In the second pyramid (right), each pixel at level “n” contains the sum of the squares of all corresponding pixels at level 0. [0067]
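  • The pyramid bookkeeping of FIG. 7 and FIG. 8 can be sketched as follows (an illustrative NumPy sketch only, assuming d=2, a grayscale image whose sides are divisible by 2 at every level used, and the variance formula given above):

```python
import numpy as np

def build_sum_pyramids(image, levels):
    """Dyadic pyramids: at level n each pixel holds the sum (respectively the sum
    of squares) of the (2**n)**2 corresponding pixels of the base image (level 0)."""
    sums = [image.astype(float)]
    sq_sums = [image.astype(float) ** 2]
    for _ in range(levels):
        for pyr in (sums, sq_sums):
            a = pyr[-1]
            h, w = a.shape
            pyr.append(a.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3)))  # 2x2 block sums, d = 2
    return sums, sq_sums

def mean_and_std(sums, sq_sums, level):
    """Mean and (unbiased) standard deviation of the base-level pixels
    represented by each pixel at the given pyramid level."""
    N = (2 ** level) ** 2
    s, s2 = sums[level], sq_sums[level]
    mu = s / N
    var = (s2 - (s ** 2) / N) / (N - 1)
    return mu, np.sqrt(np.maximum(var, 0.0))

img = np.random.default_rng(1).random((256, 256))
sums, sq_sums = build_sum_pyramids(img, levels=4)
mu, sigma = mean_and_std(sums, sq_sums, level=3)   # 32x32 maps of local mean and std
```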
  • Another improvement may include improved detection of center-surround differences by contour identification. Detection of salient contours involves the elaboration of a subset of elongated contours in the image. Image contour detection can be done with standard image processing techniques, such as by using Canny edge-detection filtering. Several techniques have been proposed for the detection of salient contours. The present technique uses a multiscale approach which is flowcharted in FIG. 9, and shown graphically in FIG. 10. [0068]
  • At 900, contours and edges of the image are detected at multiple spatial scales using oriented Gabor filters, which may be set to take account of contours both in local neighborhoods and across the entire image. This reflects the observation that a longer contour or edge, even if interrupted, may represent a more salient image feature than shorter image segments, even if the latter are continuous and uninterrupted. [0069]
  • In this embodiment, at any given spatial scale, neighboring locations interact such that edge elements at a given orientation which appear to form a contour reinforce each other. This provides the raw map “M” containing the Gabor edge-detection results at a given spatial scale, with values scaled between 0 and 1. These values are iterated as follows. At 910, the map M is convolved with an excitatory filter mask, yielding a new map “E”. [0070]
  • At 915, the value 1 is added to E. [0071]
  • At 920, values greater than 1.25 are saturated to 1.25 to avoid explosion. [0072]
  • At 925, the raw map M is multiplied by E. [0073]
  • At 930, M is convolved with a difference-of-Gaussians filter, yielding a map I. [0074]
  • At 940, a small constant k, which implements a global inhibitory bias, is added to I. [0075]
  • At 945, I is added to M. [0076]
  • At 950, negative values in M are eliminated by setting them to zero. [0077]
  • Note that this is a non-linear process, since saturation is applied at one end and negative values are eliminated at the other. At 955, this non-linear process 910-950 is iterated a few times (on the order of 10 iterations), hence implementing a recurrent non-linear scheme with early termination; a code sketch of these steps is given below. [0078]
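  • The iteration of steps 910-950 may be sketched as follows (an illustrative SciPy/NumPy sketch; the excitatory mask, the difference-of-Gaussians widths and the constant k are placeholders whose values the description above does not fix, and k is taken negative here so that adding it acts as the global inhibitory bias):

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

def iterate_contour_map(M, excitatory_mask, k=-0.02, n_iter=10):
    """Recurrent non-linear competition applied to one oriented map M in [0, 1]."""
    M = M.copy()
    for _ in range(n_iter):                                    # 955: about 10 iterations
        E = convolve(M, excitatory_mask, mode="constant")      # 910: excitatory interactions
        E = E + 1.0                                            # 915
        E = np.minimum(E, 1.25)                                # 920: saturate to avoid explosion
        M = M * E                                              # 925
        I = gaussian_filter(M, 2.0) - gaussian_filter(M, 8.0)  # 930: difference-of-Gaussians
        I = I + k                                              # 940: global inhibitory bias
        M = M + I                                              # 945
        M = np.maximum(M, 0.0)                                 # 950: discard negative values
    return M

# Example excitatory mask: elongated along the map's preferred (here horizontal) orientation.
y, x = np.mgrid[-6:7, -6:7]
excitatory_mask = np.exp(-(x ** 2 / (2 * 6.0 ** 2) + y ** 2 / (2 * 1.5 ** 2)))
excitatory_mask /= excitatory_mask.sum()
```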
  • FIG. 10A shows parameters defining the field of influence between two nearby visual neurons, as found in typical single-spatial-scale models of contour integration. The actual image 1000 is filtered by banks of orientation-selective filters 1010. These filters may approximate neuronal responses for several orientations and at several spatial scales 1020, not taking into account any interaction. FIG. 10B shows how interactions between distant filters are characterized according to their separating distance and angles. In typical models, this may yield a “field of influence” which defines the location, preferred orientation and connection strength between a central neuron of interest and its neighbors. FIG. 10C shows this field of influence. Results obtained with this technique for each map M are then combined at 960, first across spatial scales for one orientation, and then across orientations, as shown in FIG. 11. [0079]
  • Local oriented features are first extracted at multiple spatial scales and for multiple orientations (here four orientations at 0, 45, 90 and 135 degrees). The iterative competition for salience and contour integration process described in the previous figure is then applied to each resulting feature map (here represented only for one map, at the right). [0080]
  • The result is a single saliency map which contains not only small, localized salient objects as detected with the basic technique described with reference to FIG. 1, but also extended contours if those are salient. [0081]
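  • The combination at 960 and FIG. 11 may be sketched as follows (illustrative only; maps are resampled to a common resolution with a generic resampler, which the description does not prescribe, and dyadic map sizes are assumed):

```python
import numpy as np
from scipy.ndimage import zoom

def combine_maps(maps_by_orientation, out_shape):
    """maps_by_orientation: {orientation: [map at scale 0, map at scale 1, ...]}.
    Sum across spatial scales for each orientation first, then across orientations."""
    per_orientation = []
    for scale_maps in maps_by_orientation.values():
        resized = [zoom(m, (out_shape[0] / m.shape[0], out_shape[1] / m.shape[1]), order=1)
                   for m in scale_maps]
        per_orientation.append(np.sum(resized, axis=0))   # across scales, one orientation
    return np.sum(per_orientation, axis=0)                # across orientations
```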
  • The above has described operation on static images. An important extension provides a new feature channel for detecting salience in the moving images of a video sequence. The operation may follow the flowchart shown in FIG. 12. [0082]
  • At 1200, visual motion is extracted from a sequence of images (acquired, for instance, via a video camera). This may use the spatio-temporal energy model previously described by Adelson and Bergen; however, many other motion algorithms could also be used here. Briefly, this may apply three-dimensional (x, y, t) band-pass filters to the sequence of frames. Each filter detects motion in a given direction and at a given speed. Note that such a filter is a type of orientation filter, but oriented in space-time rather than in the two-dimensional spatial plane. A bank of such filters is provided, tuned to motion in different directions (such as up, down, left and right) and at different velocities, i.e., x pixels per frame. If we assume four directions and three speeds, then 12 filters per image location are required; a simplified code sketch of such a filter bank is given after the steps below. [0083]
  • At 1205, this motion extraction module is applied to the luminance (Y) and chrominance (C) channels of the image at several spatial scales, yielding one “motion map” for each orientation, velocity and scale. [0084]
  • At 1210, non-linear spatial competition for salience, as described previously, is carried out within each resulting motion map. That is, the motion saliency of multiple objects moving in roughly the same direction and at roughly the same speed is evaluated by the competitive and iterative process described above. This step is crucial for evaluating the saliency of more than one object moving in a similar direction and at a similar speed. [0085]
  • At 1215, all the maps for a given orientation and velocity (across the several spatial scales) are summed into one summary map for that orientation and velocity. [0086]
  • At 1220, the non-linear spatial competition process is applied to those summary maps; at 1225, all the summary maps are summed; and at 1230, the non-linear spatial competition process is applied to the final result. [0087]
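  • As a deliberately simplified stand-in for the Adelson-Bergen filters referred to at 1200 (the true model uses quadrature pairs of oriented space-time band-pass filters; the sketch below only conveys the idea of a bank tuned to four directions and three speeds, i.e. twelve detectors per location, and its names and constants are illustrative):

```python
import numpy as np
from scipy.ndimage import shift as nd_shift

DIRECTIONS = {"right": (0, 1), "left": (0, -1), "down": (1, 0), "up": (-1, 0)}
SPEEDS = (1, 2, 4)   # pixels per frame

def motion_maps(prev_frame, cur_frame):
    """Crude directional motion responses for one luminance or chrominance channel:
    each detector shifts the previous frame along its preferred direction and speed
    and responds where that shift explains the current frame better than no shift."""
    maps = {}
    no_motion_error = np.abs(cur_frame - prev_frame)
    for name, (dy, dx) in DIRECTIONS.items():
        for speed in SPEEDS:
            shifted = nd_shift(prev_frame, (dy * speed, dx * speed), order=1, mode="nearest")
            improvement = no_motion_error - np.abs(cur_frame - shifted)
            maps[(name, speed)] = np.maximum(improvement, 0.0)   # rectified response
    return maps   # 4 directions x 3 speeds = 12 maps, one per detector
```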
  • This system is used for detecting saliency in the motion channel. A nonlinear within-feature competition scheme is used to detect motion in luminance and also in chrominance in a multiscale manner. This provides one motion map for each of orientation, velocity and scale for each of luminance and chrominance. [0088]
  • The nonlinear spatial competition system then is used for each resulting motion map. [0089]
  • Another embodiment recognizes that the Adelson-Bergen or other spatio-temporal image filters are specialized to pick up motion. Classic motion detectors do not respond to flicker in the image, since nothing is moving in any direction. Hence, an additional filter may be added which provides a temporal derivative channel to pick up the flicker; this embodiment thus looks at flicker in animated sequences. This may be of particular relevance for evaluating the saliency of web pages, marquee advertising, or electronic displays with flashing LEDs. [0090]
  • Take an example of a light turning on and off, without moving, just flashing. This most certainly attracts attention. Yet Adelson-Bergen motion detectors do not respond to flicker, since nothing is moving in any one particular direction. A temporal derivative channel may be used to pick up flicker and integrate the derivative into saliency. An embodiment is shown in the flowchart of FIG. 13. [0091]
  • At 1300, the absolute value of the temporal derivative of the image intensity is computed. Since an increase in light intensity should be as salient as a decrease, any change, whether positive or negative, is relevant. [0092]
  • At 1305, this absolute difference value is compared against a threshold. That is, if the change in image intensity is too small, it is not considered, since it might be produced by noise. Other temporal information may be calculated at 1310, such as the derivative of colors, e.g. the red-green or blue-yellow color channels, with respect to time; again, the absolute value of the temporal derivative in the red-green and in the blue-yellow color channels can be considered. At 1315, a test is made to determine whether the change extends over the whole image. If so, the process stops; this is based on the recognition that flickering of the entire image may not be very salient. For example, simply turning room lights quickly on and off might not be very salient. This test can be carried out using spatial competitive interactions, as in the other channels. At 1320, the image portion that flickers is identified as salient, or is increased in salience according to the results of the iterative competition process applied to the flicker map. [0093]
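  • A minimal sketch of the flicker computation of FIG. 13 (illustrative only: the threshold and whole-image test below are simple placeholders for the spatial competitive interactions described above, and the same function can be applied to the red-green and blue-yellow channels at 1310):

```python
import numpy as np

def flicker_map(prev_frame, cur_frame, threshold=0.05, global_fraction=0.8):
    """Rectified temporal derivative of one image modality (e.g. intensity)."""
    d = np.abs(cur_frame - prev_frame)     # 1300: |temporal derivative|, sign ignored
    d[d < threshold] = 0.0                 # 1305: changes too small are treated as noise
    if np.mean(d > 0) > global_fraction:   # 1315: does the change cover the whole image?
        return np.zeros_like(d)            # whole-field flicker (e.g. room lights) is not salient
    return d                               # 1320: remaining local flicker is salient
```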
  • A preferred embodiment for a flicker saliency channel hence may include: [0094]
  • a basic rectified flicker extraction module based on taking the absolute value of the difference between two successive frames. [0095]
  • its application to several spatial scales and in several image modalities (luminance, chrominance, etc). [0096]
  • the application of a non-linear spatial competition for salience mechanism within each resulting flicker map. [0097]
  • the summation of all the maps for a given modality (and several scales) into one summary map for that modality. [0098]
  • the application on those summary maps of the non-linear spatial competition process. [0099]
  • the summation of all summary maps. [0100]
  • the application on the final result of the non-linear spatial competition process. [0101]
  • The above system evaluates saliency based on images obtained using a classical red-green-blue representation. This gives rise to two opponency channels (red-green and blue-yellow), an intensity channel, and four orientation channels. These seven channels are processed in separate computational streams. This can be extended to many more channels. Such multi-spectral or hyper-spectral image sensors may include near and far infra-red cameras, visible-light cameras, synthetic aperture radar, and so on. With images comprising large numbers of spectral bands, e.g., up to hundreds of channels in some futuristic military scenarios, significant redundancies will exist across the different spectral bands. The saliency system can therefore be used to model more sophisticated interactions between spectral channels. [0102]
  • This may be achieved by implementing connections across channels whereby each feature map at a given scale can receive multiplicative excitatory or inhibitory input from another feature map at the same or different spatial scale. These connections extend the interactive spatial competition for salience already implemented in the saliency model: at each time step, spatial interactions within each map may be iterated, followed by one iteration of interactions across maps. Supervised training algorithms can be applied to include training of the weights by which the different channels interact. The resulting system may be able to exploit multi-spectral imagery in a much more sophisticated manner than is currently possible. [0103]
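  • The description above leaves the exact form of the cross-channel connections and of their supervised training open; one plausible reading is sketched below (illustrative only), in which every feature map receives a multiplicative gain driven by a signed, weighted sum of the other maps, with the weight matrix being the quantity that a supervised training algorithm would adjust:

```python
import numpy as np

def cross_channel_step(feature_maps, weights):
    """One iteration of interactions across maps (applied after the within-map
    spatial competition).  feature_maps: (C, H, W) registered maps at one scale.
    weights[i, j]: strength with which map j modulates map i
    (positive = excitatory, negative = inhibitory)."""
    C = feature_maps.shape[0]
    updated = np.empty_like(feature_maps)
    for i in range(C):
        gain = np.ones_like(feature_maps[i])   # gain of 1.0 means "no modulation"
        for j in range(C):
            if j != i:
                gain += weights[i, j] * feature_maps[j]
        updated[i] = np.maximum(feature_maps[i] * gain, 0.0)
    return updated
```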
  • The above has described different ways of carrying out detection of image saliency. The important point is that when exposure to an image or a sequence of images is short, the attention of the (casual) observer is deployed primarily and autonomously onto the locations which are most perceptually salient. A close approximation to the perceptual salience at every visual location therefore allows a designer to optimize their work for notice by users. One application of such detection is in the field of advertising. It has been noted by the inventors that users do not perceive all components of a visual environment to be equally interesting. This may be used to evaluate the effectiveness of an advertising model. Hence, an embodiment is described which uses a computer to provide an automatic, objective, and quantitative tool by which the impact of advertising designs can be tested. This may be used on any image, moving or static, including, for example, web pages, billboards, magazine covers, TV commercials, or any medium to which the target audience may be briefly exposed. [0104]
  • This may be used to calculate the saliency/conspicuity of items which are being displayed, for example, in an advertising context. This may include advertisements, visual art and text in print (magazines, newspapers, journals, books); posters, billboards and other outdoor, environmental displays; advertisements, visual art and text in electronic format on the world-wide-web or on computers; as well as the saliency/conspicuity of dynamic advertisements, visual art and clips in movies, TV film, videos, dynamic display boards or graphical user interfaces. It may also be used for the saliency/conspicuity of displays of products placed in shop windows, department stores, aisles and shelves, printed ads and so on, for product placement. That is, given a particular product (e.g. a soda brand, wine bottle, candy bar), the software evaluates its saliency within the entire display, taking account of the entire view as it would be seen by a casual observer or shopper. [0105]
  • The software can also determine how to change the visual appearance of the product, including its shape and its label, in order to increase its saliency. It can do so by providing specific information to the user on which features, at which spatial scales, are more or less salient than the object or location that the user wishes to draw the attention of the viewer to. For instance, say the user wishes to draw the eye of the viewer to a specific brand of candy bars in an array of candy bars, chocolates and other sweets. By inspecting the conspicuity maps for color, orientation and intensity (see FIG. 1), the user can get a first impression of which objects in the scene are salient because of an intensity difference, because of a color difference or because of their spatial orientation relative to the background. Further information can be provided by having the user inspect the entire pyramid for the different color and orientation maps. Now the user can discover at what particular spatial scale any one object or location in the image is most salient. This can then guide how the user should rearrange the candy display (or the ad) in order to maximize the desired object's saliency. [0106]
  • The above techniques have taught multiple ways of determining which part of the many maps representing the image has maximum salience. This can be done from features, feature dimensions, and evaluation of the features at multiple spatial scales. The technique of increasing the salience effectively uses a search process through parameter space. For example, each parameter may be varied in each direction to determine whether that part of the image becomes more salient or less salient. A part of the image, for example, could be made a little redder. After doing so, an evaluation of whether the saliency increases is made. If the saliency does increase from that change, then the image can be made redder still. This can be continued until the maximum saliency obtainable from that parameter is reached. By carrying out such a search process through parameter space, different parts of the image can be made more or less salient. The search process can be carried out through feature channels, including any of the feature channels noted above, and through different scales. The parameter is changed systematically throughout each of these values to determine the effect on saliency, allowing the saliency of different parts of the image to be manipulated. [0107]
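  • The search through parameter space can be sketched as a simple greedy loop (illustrative only; saliency_score and adjust are hypothetical callbacks standing for the full saliency model and for whichever parameter, such as the redness of a region, is being manipulated):

```python
import numpy as np

def increase_salience(image, region_mask, saliency_score, adjust, step=0.05, max_steps=20):
    """Nudge one parameter in whichever direction raises the saliency of the masked
    region, and stop when neither direction improves it any further."""
    best = saliency_score(image, region_mask)
    for _ in range(max_steps):
        improved = False
        for delta in (+step, -step):
            candidate = adjust(image, region_mask, delta)
            score = saliency_score(candidate, region_mask)
            if score > best:
                image, best, improved = candidate, score, True
                break
        if not improved:
            break
    return image, best

def adjust_redness(image, region_mask, delta):
    """Example parameter: shift the red channel of the masked region (image in [0, 1])."""
    out = image.copy()
    out[..., 0][region_mask] = np.clip(out[..., 0][region_mask] + delta, 0.0, 1.0)
    return out
```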
  • An additional aspect learns from the way in which images are made more salient. From this “experience”, the system may use a conventional learning system to write rules which say: in a certain kind of image/background/space, perform a certain operation in order to increase the salience of the image portion. This automated system hence provides rules or preferences which can increase the salience. [0108]
  • Also envisioned are applications within the military, intelligence and security communities which require (semi-)automatic evaluation of salient images, for example to detect construction activities, evidence of burials, missile tests, or intruders, or to detect humans in surveillance applications who behave “conspicuously”. In particular, such applications may include multi-spectral images (where not only three color channels but possibly hundreds of different spectral images are combined in a similar manner to yield saliency) as well as moving imagery. [0109]
  • For applications in the robotic domain, the software can alert a (semi-)autonomously moving robotic device to salient locations in the environment that need to be further inspected by a high-resolution sensory system or by a human observer. [0110]
  • This model may predict where casual observers will place their attention. For example, this could be offered as a service, where advertisers send their ad to the service, and the service analyzes it and sends it back with an analysis of its saliency. Another paradigm is a web-based service where people submit images and the software automatically determines the first, second, third, etc., most salient locations. The paradigm can also be carried out on a computer such as a PDA with an attached camera; the software runs on this hand-held device as a sort of “saliency meter” for determining the saliency of, for example, a product display. [0111]
  • Other embodiments are within the disclosed invention. [0112]

Claims (59)

What is claimed is:
1. A method, comprising:
analyzing an image to determine salient parts of an image representation without analyzing the actual content of the image, and
using said salient parts to determine an effectiveness of said image in displaying its content.
2. A method as in claim 1, wherein said analyzing comprises analyzing pixels of said image using mean pixel values.
3. A method as in claim 1, wherein said analyzing comprises analyzing pixels on said image using higher order statistical variations.
4. A method as in claim 1, wherein said image representation includes a single image at a single time.
5. A method as in claim 1, wherein said image representation includes a sequence of images over time.
6. A method as in claim 1, wherein said using comprises evaluating an effectiveness of said image in an advertising context.
7. A method as in claim 1, wherein said using comprises evaluating a display showing one or more items for sale.
8. A method, comprising:
obtaining an electronic file indicative of image content;
forming at least a plurality of feature maps, each feature map representing information about a saliency measure in some area of the image content, said forming comprising detecting differences between a current portion of the image and a surrounding portion of the image using first order, second order or higher order statistics.
9. A method as in claim 8, wherein said second order statistics includes standard deviation.
10. A method as in claim 9, further comprising calculating information indicating a sum of pixels, and second information indicative of a sum of square of pixels at a plurality of different spatial resolution levels.
11. A method as in claim 10, wherein said different spatial resolution levels include different resolution levels within a pyramid scheme.
12. A method as in claim 8 further comprising using both information about mean values and information about standard deviation values.
13. A method as in claim 8, wherein said feature maps include information on intensity.
14. A method as in claim 8, wherein said feature maps include information on color.
15. A method as in claim 8, wherein said feature maps include information about a plurality of different spectral components.
16. A method as in claim 15, further comprising using redundancies between the different spectral components to evaluate said images.
17. A method as in claim 8, wherein said image content includes information about a sequence of moving images.
18. A method as in claim 9, further comprising calculating an image pyramid, where for each of a plurality of different resolutions, said image pyramid stores a sum of all corresponding pixels for current level and lower levels, and a sum of squares of all corresponding pixels for current level and lower levels.
19. A method, comprising:
comparing one portion of an image to another portion of an image to detect salient portions of the image, said comparing comprises determining extended contours in the image which are not complete edges, and rating said contours as part of a saliency detection.
20. A method as in claim 19, wherein said comparing comprises comparing a plurality of different resolution versions of the image to detect said extended contours.
21. A method as in claim 19, wherein said comparing comprises carrying out a nonlinear detection of salient contours.
22. A method as in claim 19, wherein said comparing comprises comparing mean value differences between each part of an image and a surrounding part of an image.
23. A method as in claim 19, wherein said comparing comprises comparing higher order statistical information about each part of an image and a surrounding part of said image.
24. A method as in claim 20, further comprising calculating a plurality of reduced resolution versions of the image at multiple spatial scales, and analyzing said versions of the image.
25. A method as in claim 20, further comprising using edge elements at specified spatial scales to reinforce other edge elements at other spatial scales.
26. A method as in claim 19, wherein said comparing comprises forming a filter mask, saturating said filter mask according to a specified value to form a nonlinearly filtered value, and using said nonlinearly filtered value to detect said contours.
27. A method as in claim 26, further comprising filtering values indicative of said image using a difference of Gaussian filter.
28. A method as in claim 19, wherein said comparing comprises comparing a plurality of different orientation versions of said image to detect said extended contours.
29. A method as in claim 28, wherein said comparing comprises determining a field of influence of contours based on location, and preferred orientation among the contours.
30. A method as in claim 19, wherein said comparing comprises finding interaction among contours across multiple spatial scales.
31. A method as in claim 19, wherein said comparing comprises finding an interaction among contours over a global detection of the image and over a local detection of the image.
32. A method comprising:
analyzing a sequence of temporally changing images, using an automated computer program; and
automatically finding salient portions in said images, based on said analyzing.
33. A method as in claim 32, wherein said automatically finding comprises extracting motion in said images, and using said motion as a feature channel to detect said salient portions.
34. A method as in claim 32, wherein said extracting motion comprises applying three-dimensional spatio-temporal filters to a sequence of images, and using said filters to detect motion having specified characteristics.
35. A method as in claim 32, wherein said applying comprises applying a plurality of spatio-temporal three-dimensional filters, and wherein each of said three-dimensional filters detects specified motion at a specified speed in a specified direction, and each of said filters detects said different speeds and different directions.
36. A method as in claim 35, wherein said filters detect motion across luminance.
37. A method as in claim 35, wherein said filters detect motion across chrominance.
38. A method as in claim 32, further comprising computing an absolute value of a temporal derivative of image intensity, and detecting a change in said image intensity over time greater than a predetermined amount, to detect flicker in the image or a portion thereof.
39. A method as in claim 38, wherein said computing comprises detecting an absolute value of temporal derivatives of color channels that are greater than a predetermined threshold.
40. A method as in claim 38, wherein said computing comprises detecting an absolute value of temporal derivatives of luminance channels that are greater than a predetermined threshold.
41. A method as in claim 32, further comprising using said automatically finding to evaluate an advertisement.
42. A method, comprising:
analyzing an image to determine salient parts of the image representation by obtaining information about the image in at least two different spectral ranges; and
correlating said information about the image to determine salient portions of the image, without looking for specific content of the image.
43. A method as in claim 42, wherein said analyzing comprises using said salient portions to determine an effectiveness of said image in displaying a product.
44. A method as in claim 43, wherein said analyzing comprises analyzing pixels of said image using mean pixel values.
45. A method as in claim 43, wherein said analyzing comprises analyzing pixels on said image using second or higher order statistical variations.
46. A method as in claim 43, wherein said image representation is a single image at a single time.
47. A method as in claim 43, wherein said image representation is a sequence of images in time representing a moving scene.
48. A method as in claim 35, further comprising forming a composite map from outputs of said plurality of filters.
49. A method as in claim 35, wherein said filters operate nonlinearly.
50. A method as in claim 39, wherein said operating nonlinearly comprises defining a maximum value and a minimum value.
51. A method as in claim 32, wherein said automatically finding comprises detecting flicker in portions of the image.
52. A method as in claim 51, wherein said detecting flicker comprises detecting flicker only in a portion of the image, but not in the entire image.
53. A method as in claim 32, further comprising using said automatically finding to optimize a display of visual information.
54. A method as in claim 42, further comprising using said analyzing to optimize a display of visual information.
55. A method, comprising:
analyzing an image representing a display of visual information to determine salient parts of the image representation; and
automatically increasing a salience of a specified part of the image.
56. A method as in claim 55 wherein said automatically increasing comprises systematically changing a value of a parameter and determining the effect of said parameter on said salience.
57. A method as in claim 55, wherein said automatically increasing comprises determining rules for salience increase, and using said rules to increase a salience of the specified part of the image.
58. A method as in claim 55, wherein said display of visual information is an advertisement.
59. A method as in claim 55, wherein said automatically increasing comprises changing a shape of the specified part.
US09/912,225 2001-03-08 2001-07-23 Computation of intrinsic perceptual saliency in visual environments, and applications Abandoned US20020154833A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US09/912,225 US20020154833A1 (en) 2001-03-08 2001-07-23 Computation of intrinsic perceptual saliency in visual environments, and applications
US11/430,684 US8098886B2 (en) 2001-03-08 2006-05-08 Computation of intrinsic perceptual saliency in visual environments, and applications
US13/324,352 US8515131B2 (en) 2001-03-08 2011-12-13 Computation of intrinsic perceptual saliency in visual environments, and applications

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US27467401P 2001-03-08 2001-03-08
US28872401P 2001-05-04 2001-05-04
US09/912,225 US20020154833A1 (en) 2001-03-08 2001-07-23 Computation of intrinsic perceptual saliency in visual environments, and applications

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US11/430,684 Continuation US8098886B2 (en) 2001-03-08 2006-05-08 Computation of intrinsic perceptual saliency in visual environments, and applications

Publications (1)

Publication Number Publication Date
US20020154833A1 true US20020154833A1 (en) 2002-10-24

Family

ID=27402668

Family Applications (3)

Application Number Title Priority Date Filing Date
US09/912,225 Abandoned US20020154833A1 (en) 2001-03-08 2001-07-23 Computation of intrinsic perceptual saliency in visual environments, and applications
US11/430,684 Active 2024-05-28 US8098886B2 (en) 2001-03-08 2006-05-08 Computation of intrinsic perceptual saliency in visual environments, and applications
US13/324,352 Expired - Lifetime US8515131B2 (en) 2001-03-08 2011-12-13 Computation of intrinsic perceptual saliency in visual environments, and applications

Family Applications After (2)

Application Number Title Priority Date Filing Date
US11/430,684 Active 2024-05-28 US8098886B2 (en) 2001-03-08 2006-05-08 Computation of intrinsic perceptual saliency in visual environments, and applications
US13/324,352 Expired - Lifetime US8515131B2 (en) 2001-03-08 2011-12-13 Computation of intrinsic perceptual saliency in visual environments, and applications

Country Status (1)

Country Link
US (3) US20020154833A1 (en)

Families Citing this family (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7826668B1 (en) 2004-02-17 2010-11-02 Corel Corporation Adaptive region editing tool
US7782338B1 (en) * 2004-02-17 2010-08-24 Krzysztof Antoni Zaklika Assisted adaptive region editing tool
US7609894B2 (en) 2004-02-17 2009-10-27 Corel Corporation Adaptive sampling region for a region editing tool
US20070156382A1 (en) * 2005-12-29 2007-07-05 Graham James L Ii Systems and methods for designing experiments
US8233094B2 (en) * 2007-05-24 2012-07-31 Aptina Imaging Corporation Methods, systems and apparatuses for motion detection using auto-focus statistics
US20090012847A1 (en) * 2007-07-03 2009-01-08 3M Innovative Properties Company System and method for assessing effectiveness of communication content
CA2692409A1 (en) 2007-07-03 2009-01-08 3M Innovative Properties Company System and method for assigning pieces of content to time-slots samples for measuring effects of the assigned content
JP2010532539A (en) * 2007-07-03 2010-10-07 スリーエム イノベイティブ プロパティズ カンパニー System and method for generating time slot samples to which content can be assigned and measuring the effect of the assigned content
US20090046907A1 (en) * 2007-08-17 2009-02-19 Siemens Medical Solutions Usa, Inc. Parallel Execution Of All Image Processing Workflow Features
US8369652B1 (en) * 2008-06-16 2013-02-05 Hrl Laboratories, Llc Visual attention system for salient regions in imagery
JP5486006B2 (en) 2008-10-03 2014-05-07 スリーエム イノベイティブ プロパティズ カンパニー System and method for evaluating robustness
WO2010039966A1 (en) * 2008-10-03 2010-04-08 3M Innovative Properties Company Systems and methods for optimizing a scene
US8442328B2 (en) * 2008-10-03 2013-05-14 3M Innovative Properties Company Systems and methods for evaluating robustness of saliency predictions of regions in a scene
JP5334771B2 (en) * 2008-10-07 2013-11-06 トムソン ライセンシング Method for inserting an advertisement clip into a video sequence and corresponding device
US8374462B2 (en) * 2008-11-14 2013-02-12 Seiko Epson Corporation Content-aware image and video resizing by anchor point sampling and mapping
AU2010203781B9 (en) 2009-01-07 2013-12-05 3M Innovative Properties Company System and method for concurrently conducting cause-and-effect experiments on content effectiveness and adjusting content distribution to optimize business objectives
JP4862930B2 (en) * 2009-09-04 2012-01-25 カシオ計算機株式会社 Image processing apparatus, image processing method, and program
CN102271262B (en) * 2010-06-04 2015-05-13 三星电子株式会社 Multithread-based video processing method for 3D (Three-Dimensional) display
EP2509044B1 (en) 2011-04-08 2018-10-03 Dolby Laboratories Licensing Corporation Local definition of global image transformations
CN102184557B (en) * 2011-06-17 2012-09-12 电子科技大学 Salient region detection method for complex scene
WO2013042766A1 (en) * 2011-09-22 2013-03-28 オリンパス株式会社 Image processing device, image processing system, and image readout device
KR101913336B1 (en) * 2011-10-06 2018-10-31 삼성전자주식회사 Mobile apparatus and method for controlling the same
EP2783329A4 (en) * 2011-11-22 2016-08-24 Nokia Corp Method for image processing and an apparatus
AU2011254040B2 (en) * 2011-12-14 2015-03-12 Canon Kabushiki Kaisha Method, apparatus and system for determining a saliency map for an input image
US9025880B2 (en) * 2012-08-29 2015-05-05 Disney Enterprises, Inc. Visual saliency estimation for images and video
US9116926B2 (en) 2012-12-20 2015-08-25 Google Inc. Sharing photos
US9571726B2 (en) 2012-12-20 2017-02-14 Google Inc. Generating attention information from photos
US9384538B2 (en) 2013-07-26 2016-07-05 Li-Cor, Inc. Adaptive noise filter
US9218652B2 (en) 2013-07-26 2015-12-22 Li-Cor, Inc. Systems and methods for setting initial display settings
US10395350B2 (en) 2013-07-26 2019-08-27 Li-Cor, Inc. Adaptive background detection and signal quantification systems and methods
WO2015060897A1 (en) * 2013-10-22 2015-04-30 Eyenuk, Inc. Systems and methods for automated analysis of retinal images
JP6250819B2 (en) 2013-10-30 2017-12-20 インテル コーポレイション Image capture feedback
CN104166986A (en) * 2014-07-07 2014-11-26 广东工业大学 Strip-shaped article surface defect on-line visual attention detection method
US9454712B2 (en) * 2014-10-08 2016-09-27 Adobe Systems Incorporated Saliency map computation
US9626584B2 (en) * 2014-10-09 2017-04-18 Adobe Systems Incorporated Image cropping suggestion using multiple saliency maps
US10331982B2 (en) * 2016-03-11 2019-06-25 Irvine Sensors Corp. Real time signal processor for analyzing, labeling and exploiting data in real time from hyperspectral sensor suites (Hy-ALERT)
US10055652B2 (en) * 2016-03-21 2018-08-21 Ford Global Technologies, Llc Pedestrian detection and motion prediction with rear-facing camera
US10133944B2 (en) 2016-12-21 2018-11-20 Volkswagen Ag Digital neuromorphic (NM) sensor array, detector, engine and methodologies
US10282615B2 (en) 2016-12-21 2019-05-07 Volkswagen Ag System and method for root association in image data
US10235565B2 (en) * 2016-12-21 2019-03-19 Volkswagen Ag System and methodologies for occupant monitoring utilizing digital neuromorphic (NM) data and fovea tracking
US10789495B2 (en) 2016-12-21 2020-09-29 Volkswagen Ag System and method for 1D root association providing sparsity guarantee in image data
US10229341B2 (en) 2016-12-21 2019-03-12 Volkswagen Ag Vector engine and methodologies using digital neuromorphic (NM) data
US10970622B2 (en) 2017-01-13 2021-04-06 International Business Machines Corporation Dynamic gating using neuromorphic hardware
US11361214B2 (en) 2017-01-13 2022-06-14 International Business Machines Corporation Dynamic multiscale routing on networks of neurosynaptic cores
GB201701919D0 (en) 2017-02-06 2017-03-22 Univ London Queen Mary Method of image analysis
FR3081591B1 (en) * 2018-05-23 2020-07-31 Idemia Identity & Security France PROCESS FOR PROCESSING A STREAM OF VIDEO IMAGES
CN110084210B (en) * 2019-04-30 2022-03-29 电子科技大学 SAR image multi-scale ship detection method based on attention pyramid network
KR20210001324A (en) 2019-06-27 2021-01-06 삼성전자주식회사 Artificial neural network model and electronic device comprising thereof
KR20210004229A (en) 2019-07-03 2021-01-13 삼성전자주식회사 An image processing device including a neural network processor and operating method thereof
US10922824B1 (en) 2019-08-13 2021-02-16 Volkswagen Ag Object tracking using contour filters and scalers

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0668317A1 (en) * 1994-02-15 1995-08-23 Rohm And Haas Company Impact modified polyacetal compositions
US5983218A (en) 1997-06-30 1999-11-09 Xerox Corporation Multimedia database for use over networks
US6320976B1 (en) 1999-04-01 2001-11-20 Siemens Corporate Research, Inc. Computer-assisted diagnosis method and system for automatically determining diagnostic saliency of digital images
US6442203B1 (en) 1999-11-05 2002-08-27 Demografx System and method for motion compensation and frame rate conversion
JP4732660B2 (en) 2000-02-17 2011-07-27 ブリティッシュ・テレコミュニケーションズ・パブリック・リミテッド・カンパニー Visual attention system
US6670963B2 (en) 2001-01-17 2003-12-30 Tektronix, Inc. Visual attention model
US20020154833A1 (en) 2001-03-08 2002-10-24 Christof Koch Computation of intrinsic perceptual saliency in visual environments, and applications
US20050047647A1 (en) * 2003-06-10 2005-03-03 Ueli Rutishauser System and method for attentional selection
US7400761B2 (en) 2003-09-30 2008-07-15 Microsoft Corporation Contrast-based image attention analysis framework
US7471827B2 (en) 2003-10-16 2008-12-30 Microsoft Corporation Automatic browsing path generation to present image areas with high attention value as a function of space and time
JP4396430B2 (en) 2003-11-25 2010-01-13 セイコーエプソン株式会社 Gaze guidance information generation system, gaze guidance information generation program, and gaze guidance information generation method
FR2888375A1 (en) 2005-07-06 2007-01-12 Thomson Licensing Sa METHOD FOR OBTAINING A SOUND MAP FROM A PLURALITY OF SAME CARDS BASED ON DIFFERENT VISUAL SIZES
FR2897183A1 (en) 2006-02-03 2007-08-10 Thomson Licensing Sas METHOD FOR VERIFYING THE SAVING AREAS OF A MULTIMEDIA DOCUMENT, METHOD FOR CREATING AN ADVERTISING DOCUMENT, AND COMPUTER PROGRAM PRODUCT
GB0619817D0 (en) 2006-10-06 2006-11-15 Imp Innovations Ltd A method of identifying a measure of feature saliency in a sequence of images
CN101601070B (en) 2006-10-10 2012-06-27 汤姆逊许可公司 Device and method for generating a saliency map of a picture
AU2008222789B2 (en) 2007-03-08 2013-08-22 Doheny Eye Institute Saliency-based apparatus and methods for visual prostheses
US8243068B2 (en) 2007-05-17 2012-08-14 University Of Maryland Method, system and apparatus for determining and modifying saliency of a visual medium
EP2034439A1 (en) 2007-09-07 2009-03-11 Thomson Licensing Method for establishing the saliency map of an image
KR100952749B1 (en) 2008-07-31 2010-04-13 중앙대학교 산학협력단 Method for saliency-based lighting for feature emphasis
CN101489139B (en) 2009-01-21 2010-11-10 北京大学 Video advertisement correlation method and system based on visual saliency
US8582881B2 (en) 2009-03-26 2013-11-12 Tp Vision Holding B.V. Method and apparatus for modifying an image by using a saliency map based on color frequency
CN101697593B (en) 2009-09-08 2012-10-10 武汉大学 Time domain prediction-based saliency extraction method

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5210799A (en) * 1991-03-28 1993-05-11 Texas Instruments Incorporated System and method for ranking and extracting salient contours for target recognition
US5566246A (en) * 1991-03-28 1996-10-15 Texas Instruments Incorporated System and method for ranking and extracting salient contours for target recognition
US5357194A (en) * 1993-06-28 1994-10-18 Orbotech Ltd. Testing bed and apparatus including same for testing printed circuit boards and other like articles
US5598355A (en) * 1994-05-02 1997-01-28 Commissariat A L'energie Atomique Process for the trajectography of objects and device for performing this process
US6045226A (en) * 1996-04-12 2000-04-04 Eyelight Research N.V. Device for measuring the visual attention of subjects for a visible object
US5929849A (en) * 1996-05-02 1999-07-27 Phoenix Technologies, Ltd. Integration of dynamic universal resource locators with television presentations
US5801810A (en) * 1996-10-11 1998-09-01 Visual Resources, Inc. Method and apparatus for testing visual attention capabilities of a subject
US5984475A (en) * 1997-12-05 1999-11-16 Mcgill University Stereoscopic gaze controller

Cited By (138)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060215922A1 (en) * 2001-03-08 2006-09-28 Christof Koch Computation of intrinsic perceptual saliency in visual environments, and applications
US8098886B2 (en) 2001-03-08 2012-01-17 California Institute Of Technology Computation of intrinsic perceptual saliency in visual environments, and applications
US7298866B2 (en) * 2001-10-15 2007-11-20 Lockheed Martin Corporation Two dimensional autonomous isotropic detection technique
US7742622B2 (en) 2001-10-15 2010-06-22 Lockheed Martin Corporation Two dimension autonomous isotropic detection technique
US20030072470A1 (en) * 2001-10-15 2003-04-17 Lee Henry C. Two dimensional autonomous isotropic detection technique
US20060165178A1 (en) * 2002-11-01 2006-07-27 Microsoft Corporation Generating a Motion Attention Model
US8098730B2 (en) * 2002-11-01 2012-01-17 Microsoft Corporation Generating a motion attention model
WO2004111931A2 (en) * 2003-06-10 2004-12-23 California Institute Of Technology A system and method for attentional selection
WO2004111931A3 (en) * 2003-06-10 2005-02-24 California Inst Of Techn A system and method for attentional selection
US20050045224A1 (en) * 2003-08-29 2005-03-03 Lyden Robert M. Solar cell, module, array, network, and power grid
US9053754B2 (en) 2004-07-28 2015-06-09 Microsoft Technology Licensing, Llc Thumbnail generation and presentation for recorded TV programs
US9355684B2 (en) 2004-07-28 2016-05-31 Microsoft Technology Licensing, Llc Thumbnail generation and presentation for recorded TV programs
US20060153459A1 (en) * 2005-01-10 2006-07-13 Yan Zhang Object classification method for a collision warning system
US8343067B2 (en) 2005-02-24 2013-01-01 Warren Jones System and method for quantifying and mapping visual salience
US8551015B2 (en) 2005-02-24 2013-10-08 Warren Jones System and method for evaluating and diagnosing patients based on ocular responses
US20060189886A1 (en) * 2005-02-24 2006-08-24 Warren Jones System and method for quantifying and mapping visual salience
US20110172556A1 (en) * 2005-02-24 2011-07-14 Warren Jones System And Method For Quantifying And Mapping Visual Salience
US7922670B2 (en) * 2005-02-24 2011-04-12 Warren Jones System and method for quantifying and mapping visual salience
US20060257834A1 (en) * 2005-05-10 2006-11-16 Lee Linda M Quantitative EEG as an identifier of learning modality
US10506941B2 (en) 2005-08-09 2019-12-17 The Nielsen Company (Us), Llc Device and method for sensing electrical activity in tissue
US11638547B2 (en) 2005-08-09 2023-05-02 Nielsen Consumer Llc Device and method for sensing electrical activity in tissue
US9351658B2 (en) 2005-09-02 2016-05-31 The Nielsen Company (Us), Llc Device and method for sensing electrical activity in tissue
US8180826B2 (en) 2005-10-31 2012-05-15 Microsoft Corporation Media sharing and authoring on the web
US20070101387A1 (en) * 2005-10-31 2007-05-03 Microsoft Corporation Media Sharing And Authoring On The Web
US8196032B2 (en) 2005-11-01 2012-06-05 Microsoft Corporation Template-based multimedia authoring and sharing
US20070101271A1 (en) * 2005-11-01 2007-05-03 Microsoft Corporation Template-based multimedia authoring and sharing
US20070183680A1 (en) * 2005-11-15 2007-08-09 Mario Aguilar Multi-scale image fusion
US7940994B2 (en) * 2005-11-15 2011-05-10 Teledyne Licensing, Llc Multi-scale image fusion
US9215996B2 (en) 2007-03-02 2015-12-22 The Nielsen Company (Us), Llc Apparatus and method for objectively determining human response to media
US20090070798A1 (en) * 2007-03-02 2009-03-12 Lee Hans C System and Method for Detecting Viewer Attention to Media Delivery Devices
US8973022B2 (en) 2007-03-07 2015-03-03 The Nielsen Company (Us), Llc Method and system for using coherence of biological responses as a measure of performance of a media
US20080222670A1 (en) * 2007-03-07 2008-09-11 Lee Hans C Method and system for using coherence of biological responses as a measure of performance of a media
US8473044B2 (en) 2007-03-07 2013-06-25 The Nielsen Company (Us), Llc Method and system for measuring and ranking a positive or negative response to audiovisual or interactive media, products or activities using physiological signals
US8230457B2 (en) 2007-03-07 2012-07-24 The Nielsen Company (Us), Llc. Method and system for using coherence of biological responses as a measure of performance of a media
WO2008108814A1 (en) * 2007-03-07 2008-09-12 Emsense Corporation Method and system for measuring and ranking a positive or negative response to audiovisual or interactive media, products or activities using physiological signals
WO2008109771A3 (en) * 2007-03-08 2009-01-15 Second Sight Medical Prod Inc Saliency-based apparatus and methods for visual prostheses
US9795786B2 (en) 2007-03-08 2017-10-24 Second Sight Medical Products, Inc. Saliency-based apparatus and methods for visual prostheses
WO2008109771A2 (en) * 2007-03-08 2008-09-12 Second Sight Medical Products, Inc. Saliency-based apparatus and methods for visual prostheses
US8764652B2 (en) 2007-03-08 2014-07-01 The Nielson Company (US), LLC. Method and system for measuring and ranking an “engagement” response to audiovisual or interactive media, products, or activities using physiological signals
US8782681B2 (en) 2007-03-08 2014-07-15 The Nielsen Company (Us), Llc Method and system for rating media and events in media based on physiological data
US9061150B2 (en) 2007-03-08 2015-06-23 Second Sight Medical Products, Inc. Saliency-based apparatus and methods for visual prostheses
US20090092314A1 (en) * 2007-05-17 2009-04-09 Amitabh Varshney Method, system and apparatus for determining and modifying saliency of a visual medium
US8243068B2 (en) * 2007-05-17 2012-08-14 University Of Maryland Method, system and apparatus for determining and modifying saliency of a visual medium
US7940985B2 (en) * 2007-06-06 2011-05-10 Microsoft Corporation Salient object detection
US20080304740A1 (en) * 2007-06-06 2008-12-11 Microsoft Corporation Salient Object Detection
JP2009003615A (en) * 2007-06-20 2009-01-08 Nippon Telegr & Teleph Corp <Ntt> Attention region extraction method, attention region extraction device, computer program, and recording medium
US8376952B2 (en) 2007-09-07 2013-02-19 The Nielsen Company (Us), Llc. Method and apparatus for sensing blood oxygen
US20090069652A1 (en) * 2007-09-07 2009-03-12 Lee Hans C Method and Apparatus for Sensing Blood Oxygen
US8300985B2 (en) * 2007-09-11 2012-10-30 Samsung Electronics Co., Ltd. Image-registration method, medium, and apparatus
US20090067752A1 (en) * 2007-09-11 2009-03-12 Samsung Electronics Co., Ltd. Image-registration method, medium, and apparatus
US20090094629A1 (en) * 2007-10-02 2009-04-09 Lee Hans C Providing Actionable Insights Based on Physiological Responses From Viewers of Media
US20090094627A1 (en) * 2007-10-02 2009-04-09 Lee Hans C Providing Remote Access to Media, and Reaction and Survey Data From Viewers of the Media
US9021515B2 (en) 2007-10-02 2015-04-28 The Nielsen Company (Us), Llc Systems and methods to determine media effectiveness
US8151292B2 (en) 2007-10-02 2012-04-03 Emsense Corporation System for remote access to media, and reaction and survey data from viewers of the media
US9894399B2 (en) 2007-10-02 2018-02-13 The Nielsen Company (Us), Llc Systems and methods to determine media effectiveness
US8332883B2 (en) 2007-10-02 2012-12-11 The Nielsen Company (Us), Llc Providing actionable insights based on physiological responses from viewers of media
US20090094286A1 (en) * 2007-10-02 2009-04-09 Lee Hans C System for Remote Access to Media, and Reaction and Survey Data From Viewers of the Media
US8327395B2 (en) 2007-10-02 2012-12-04 The Nielsen Company (Us), Llc System providing actionable insights based on physiological responses from viewers of media
US9571877B2 (en) 2007-10-02 2017-02-14 The Nielsen Company (Us), Llc Systems and methods to determine media effectiveness
US9521960B2 (en) 2007-10-31 2016-12-20 The Nielsen Company (Us), Llc Systems and methods providing en mass collection and centralized processing of physiological responses from viewers
US11250447B2 (en) 2007-10-31 2022-02-15 Nielsen Consumer Llc Systems and methods providing en mass collection and centralized processing of physiological responses from viewers
US10580018B2 (en) 2007-10-31 2020-03-03 The Nielsen Company (Us), Llc Systems and methods providing EN mass collection and centralized processing of physiological responses from viewers
US20090133047A1 (en) * 2007-10-31 2009-05-21 Lee Hans C Systems and Methods Providing Distributed Collection and Centralized Processing of Physiological Responses from Viewers
US20090150919A1 (en) * 2007-11-30 2009-06-11 Lee Michael J Correlating Media Instance Information With Physiological Responses From Participating Subjects
US8347326B2 (en) 2007-12-18 2013-01-01 The Nielsen Company (US) Identifying key media events and modeling causal relationships between key events and reported feelings
US8793715B1 (en) 2007-12-18 2014-07-29 The Nielsen Company (Us), Llc Identifying key media events and modeling causal relationships between key events and reported feelings
US20120121173A1 (en) * 2009-05-08 2012-05-17 Kazuki Aisaka Image processing apparatus and method, and program
US8577137B2 (en) * 2009-05-08 2013-11-05 Sony Corporation Image processing apparatus and method, and program
US20120162528A1 (en) * 2010-01-13 2012-06-28 Shinya Kiuchi Video processing device and video display device
US20110229025A1 (en) * 2010-02-10 2011-09-22 Qi Zhao Methods and systems for generating saliency models through linear and/or nonlinear integration
US8649606B2 (en) * 2010-02-10 2014-02-11 California Institute Of Technology Methods and systems for generating saliency models through linear and/or nonlinear integration
US20110249886A1 (en) * 2010-04-12 2011-10-13 Samsung Electronics Co., Ltd. Image converting device and three-dimensional image display device including the same
CN101916379A (en) * 2010-09-03 2010-12-15 华中科技大学 Target search and recognition method based on object accumulation visual attention mechanism
US20120328161A1 (en) * 2011-06-22 2012-12-27 Palenychka Roman Method and multi-scale attention system for spatiotemporal change determination and object detection
US20130084013A1 (en) * 2011-09-29 2013-04-04 Hao Tang System and method for saliency map generation
US8675966B2 (en) * 2011-09-29 2014-03-18 Hewlett-Packard Development Company, L.P. System and method for saliency map generation
US8897578B2 (en) * 2011-11-02 2014-11-25 Panasonic Intellectual Property Corporation Of America Image recognition device, image recognition method, and integrated circuit
US20140193074A1 (en) * 2011-11-02 2014-07-10 Zhongyang Huang Image recognition device, image recognition method, and integrated circuit
US20130166394A1 (en) * 2011-12-22 2013-06-27 Yahoo! Inc. Saliency-based evaluation of webpage designs and layouts
US9576214B1 (en) 2012-01-23 2017-02-21 Hrl Laboratories, Llc Robust object recognition from moving platforms by combining form and motion detection with bio-inspired classification
DE102012002321B4 (en) 2012-02-06 2022-04-28 Airbus Defence and Space GmbH Method for recognizing a given pattern in an image data set
US20130342758A1 (en) * 2012-06-20 2013-12-26 Disney Enterprises, Inc. Video retargeting using content-dependent scaling vectors
US9202258B2 (en) * 2012-06-20 2015-12-01 Disney Enterprises, Inc. Video retargeting using content-dependent scaling vectors
US20140016859A1 (en) * 2012-06-29 2014-01-16 Arizona Board of Regents, a body corporate of the State of Arizona, acting for and on behalf of Arizona State University Systems, methods, and media for optical recognition
US9501710B2 (en) * 2012-06-29 2016-11-22 Arizona Board Of Regents, A Body Corporate Of The State Of Arizona, Acting For And On Behalf Of Arizona State University Systems, methods, and media for identifying object characteristics based on fixation points
US10842403B2 (en) 2012-08-17 2020-11-24 The Nielsen Company (Us), Llc Systems and methods to gather and analyze electroencephalographic data
US9907482B2 (en) 2012-08-17 2018-03-06 The Nielsen Company (Us), Llc Systems and methods to gather and analyze electroencephalographic data
US9215978B2 (en) 2012-08-17 2015-12-22 The Nielsen Company (Us), Llc Systems and methods to gather and analyze electroencephalographic data
US10779745B2 (en) 2012-08-17 2020-09-22 The Nielsen Company (Us), Llc Systems and methods to gather and analyze electroencephalographic data
US8989835B2 (en) 2012-08-17 2015-03-24 The Nielsen Company (Us), Llc Systems and methods to gather and analyze electroencephalographic data
US9060671B2 (en) 2012-08-17 2015-06-23 The Nielsen Company (Us), Llc Systems and methods to gather and analyze electroencephalographic data
US9510752B2 (en) 2012-12-11 2016-12-06 Children's Healthcare Of Atlanta, Inc. Systems and methods for detecting blink inhibition as a marker of engagement and perceived stimulus salience
US9861307B2 (en) 2012-12-11 2018-01-09 Children's Healthcare Of Atlanta, Inc. Systems and methods for detecting blink inhibition as a marker of engagement and perceived stimulus salience
US10052057B2 (en) 2012-12-11 2018-08-21 Children's Healthcare Of Atlanta, Inc. Systems and methods for detecting blink inhibition as a marker of engagement and perceived stimulus salience
US10016156B2 (en) 2012-12-11 2018-07-10 Children's Healthcare Of Atlanta, Inc. Systems and methods for detecting blink inhibition as a marker of engagement and perceived stimulus salience
US11759135B2 (en) 2012-12-11 2023-09-19 Children's Healthcare Of Atlanta, Inc. Systems and methods for detecting blink inhibition as a marker of engagement and perceived stimulus salience
US10987043B2 (en) 2012-12-11 2021-04-27 Children's Healthcare Of Atlanta, Inc. Systems and methods for detecting blink inhibition as a marker of engagement and perceived stimulus salience
US9317776B1 (en) * 2013-03-13 2016-04-19 Hrl Laboratories, Llc Robust static and moving object detection system via attentional mechanisms
US9668694B2 (en) 2013-03-14 2017-06-06 The Nielsen Company (Us), Llc Methods and apparatus to gather and analyze electroencephalographic data
US11076807B2 (en) 2013-03-14 2021-08-03 Nielsen Consumer Llc Methods and apparatus to gather and analyze electroencephalographic data
US9320450B2 (en) 2013-03-14 2016-04-26 The Nielsen Company (Us), Llc Methods and apparatus to gather and analyze electroencephalographic data
US10617295B2 (en) 2013-10-17 2020-04-14 Children's Healthcare Of Atlanta, Inc. Systems and methods for assessing infant and child development via eye tracking
US11864832B2 (en) 2013-10-17 2024-01-09 Children's Healthcare Of Atlanta, Inc. Systems and methods for assessing infant and child development via eye tracking
CN103810503A (en) * 2013-12-26 2014-05-21 西北工业大学 Deep-learning-based method for detecting salient regions in natural images
US11141108B2 (en) 2014-04-03 2021-10-12 Nielsen Consumer Llc Methods and apparatus to gather and analyze electroencephalographic data
US9622702B2 (en) 2014-04-03 2017-04-18 The Nielsen Company (Us), Llc Methods and apparatus to gather and analyze electroencephalographic data
US9622703B2 (en) 2014-04-03 2017-04-18 The Nielsen Company (Us), Llc Methods and apparatus to gather and analyze electroencephalographic data
US20180107893A1 (en) * 2014-04-29 2018-04-19 International Business Machines Corporation Extracting motion saliency features from video using a neurosynaptic system
US9195903B2 (en) * 2014-04-29 2015-11-24 International Business Machines Corporation Extracting salient features from video using a neurosynaptic system
US9922266B2 (en) 2014-04-29 2018-03-20 International Business Machines Corporation Extracting salient features from video using a neurosynaptic system
US9355331B2 (en) 2014-04-29 2016-05-31 International Business Machines Corporation Extracting salient features from video using a neurosynaptic system
US10528843B2 (en) * 2014-04-29 2020-01-07 International Business Machines Corporation Extracting motion saliency features from video using a neurosynaptic system
US11227180B2 (en) * 2014-04-29 2022-01-18 International Business Machines Corporation Extracting motion saliency features from video using a neurosynaptic system
US9373058B2 (en) * 2014-05-29 2016-06-21 International Business Machines Corporation Scene understanding using a neurosynaptic system
US10846567B2 (en) 2014-05-29 2020-11-24 International Business Machines Corporation Scene understanding using a neurosynaptic system
US10140551B2 (en) 2014-05-29 2018-11-27 International Business Machines Corporation Scene understanding using a neurosynaptic system
US10043110B2 (en) 2014-05-29 2018-08-07 International Business Machines Corporation Scene understanding using a neurosynaptic system
US10558892B2 (en) 2014-05-29 2020-02-11 International Business Machines Corporation Scene understanding using a neurosynaptic system
US9536179B2 (en) 2014-05-29 2017-01-03 International Business Machines Corporation Scene understanding using a neurosynaptic system
US20160004962A1 (en) * 2014-07-02 2016-01-07 International Business Machines Corporation Classifying features using a neurosynaptic system
US10115054B2 (en) * 2014-07-02 2018-10-30 International Business Machines Corporation Classifying features using a neurosynaptic system
US11138495B2 (en) 2014-07-02 2021-10-05 International Business Machines Corporation Classifying features using a neurosynaptic system
US9798972B2 (en) 2014-07-02 2017-10-24 International Business Machines Corporation Feature extraction using a neurosynaptic system for object classification
US20160026880A1 (en) * 2014-07-28 2016-01-28 Hyundai Mobis Co., Ltd. Driving assist system for vehicle and method thereof
US9940527B2 (en) * 2014-07-28 2018-04-10 Hyundai Mobis Co., Ltd. Driving assist system for vehicle and method thereof
CN105989367A (en) * 2015-02-04 2016-10-05 阿里巴巴集团控股有限公司 Target acquisition method and equipment
EP3254236A4 (en) * 2015-02-04 2018-10-03 Alibaba Group Holding Limited Method and apparatus for target acquisition
CN104966286A (en) * 2015-06-04 2015-10-07 电子科技大学 3D video saliency detection method
US20180174329A1 (en) * 2015-06-18 2018-06-21 Nec Solution Innovators, Ltd. Image processing device, image processing method, and computer-readable recording medium
US10475210B2 (en) * 2015-06-18 2019-11-12 Nec Solution Innovators, Ltd. Image processing device, image processing method, and computer-readable recording medium
CN105405132A (en) * 2015-11-04 2016-03-16 河海大学 SAR image man-made target detection method based on visual contrast and information entropy
CN105427292A (en) * 2015-11-11 2016-03-23 南京邮电大学 Salient object detection method based on video
WO2018078806A1 (en) * 2016-10-28 2018-05-03 Olympus Corporation Image processing device, image processing method, and image processing program
US11030749B2 (en) 2016-10-28 2021-06-08 Olympus Corporation Image-processing apparatus, image-processing method, and storage medium storing image-processing program
CN107767387A (en) * 2017-11-09 2018-03-06 广西科技大学 Contour detection method based on global modulation of variable receptive-field scale
CN108090492A (en) * 2017-11-09 2018-05-29 广西科技大学 Contour detection method based on scale-cue suppression
CN111626306A (en) * 2019-03-25 2020-09-04 北京联合大学 Saliency map fusion method and system
CN110853050A (en) * 2019-10-21 2020-02-28 中国电子科技集团公司第二十九研究所 SAR image river segmentation method, device and medium

Also Published As

Publication number Publication date
US20120106850A1 (en) 2012-05-03
US8515131B2 (en) 2013-08-20
US8098886B2 (en) 2012-01-17
US20060215922A1 (en) 2006-09-28

Similar Documents

Publication Publication Date Title
US8098886B2 (en) Computation of intrinsic perceptual saliency in visual environments, and applications
Han et al. Fast saliency-aware multi-modality image fusion
Narihira et al. Learning lightness from human judgement on relative reflectance
CN101601287B (en) Apparatus and methods of producing photorealistic image thumbnails
CN112154451A (en) Method, apparatus and computer program for extracting representative features of objects in an image
CN109102521B (en) Video target tracking method based on parallel attention-dependent filtering
EP2074557B1 (en) Method and system for learning spatio-spectral features in an image
WO2016190814A1 (en) Method and system for facial recognition
Espinal et al. Wavelet-based fractal signature analysis for automatic target recognition
Cutzu et al. Estimating the photorealism of images: Distinguishing paintings from photographs
CN104732200A (en) Skin type and skin problem recognition method
Shopa et al. Traffic sign detection and recognition using OpenCV
Drelie Gelasca et al. Which colors best catch your eyes: a subjective study of color saliency
Reina et al. Adaptive traffic road sign panels text extraction
Islam Traffic sign detection and recognition based on convolutional neural networks
Cutzu et al. Distinguishing paintings from photographs
US20230177801A1 (en) Method of salient object detection in images
Harbas et al. Detection of roadside vegetation using features from the visible spectrum
WO2019171574A1 (en) Product analysis system, product analysis method, and product analysis program
Buhrmester et al. Evaluating the impact of color information in deep neural networks
Swapna et al. Deep learning based road traffic sign detection and recognition
Khosla et al. Optimal detection of objects in images and videos using electroencephalography (EEG)
Wan et al. Autonomous facial recognition based on the human visual system
Gupta et al. Techniques of Tracking and Recognition of Sports Players Jersey and Vehicle Licence Number Plate: A Review
Rajaram et al. Machine Learning Enabled Traffic Sign Detection System

Legal Events

Date Code Title Description
AS Assignment

Owner name: CALIFORNIA INSTITUTE OF TECHNOLOGY, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KOCH, CHRISTOF;ITTI, LAURENT;REEL/FRAME:012640/0771;SIGNING DATES FROM 20011112 TO 20011127

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE