US20080247610A1 - Apparatus, Method and Computer Program for Processing Information


Info

Publication number
US20080247610A1
Authority
US
United States
Prior art keywords
character, feature quantity, model, generating, characters
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US12/046,322
Other versions
US8107689B2 (en)
Inventor
Tomohiro Tsunoda
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Original Assignee
Sony Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Sony Corp
Assigned to SONY CORPORATION. Assignment of assignors interest (see document for details). Assignors: TSUNODA, TOMOHIRO
Publication of US20080247610A1
Application granted
Publication of US8107689B2
Legal status: Expired - Fee Related

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 - Information retrieval of video data; Database structures therefor; File system structures therefor

Definitions

  • The mapping section 29 determines a character to whom the operator wants to map the feature quantity model (hereinafter referred to as a target character).
  • The mapping section 29 identifies a plurality of video contents in which the target character performs, compares the character lists 31 and the representative model groups 32 corresponding to the identified video contents for mapping, generates the feature quantity model (center model) corresponding to the target character, and outputs the center model to the character and feature quantity model DB 30 for storage.
  • Video contents of the same series may be handled as a single video content.
  • The character and the representative model group 32 may also be mapped to each other based on the representative model group 32 corresponding to only a single video content.
  • The character and feature quantity model DB 30 stores the character ID and the feature quantity model mapped to the character ID.
  • FIG. 6 illustrates a data structure of the character and feature quantity model DB 30.
  • Each character ID is recorded together with a face angle at which the face image looks in the video content (for example, full face or turned 45 degrees to the right), a photograph date (year and date), a type indicating special makeup or disguise (normal, makeup 1, makeup 2, etc.), a feature quantity model ID as identification information of a feature quantity model, a probability indicating the accuracy of the mapping between the character and the feature quantity model, and a manual updating history indicating a history of manual correction and update of each item, all mapped to each other. If the change in the feature quantity model across different photographing dates is small, the data may be merged. In this way, an excessive increase in the data size of the character and feature quantity model DB 30 is avoided.
  • The character and feature quantity model DB 30 stores a plurality of feature quantity models mapped to a single character. More specifically, entries having the same character ID but different values in the other items may be recorded. In this way, if a character in the video content changes in face with age, makeup or disguise, the feature quantity model of each state is mapped to the same character ID and recognized as belonging to the same character.
  • For example, if two characters A and B always appear together and cannot be distinguished, the mapping probabilities of character A to feature quantity model a, character A to feature quantity model b, character B to feature quantity model a and character B to feature quantity model b are each 50%.
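  • For illustration only, a minimal sketch of such a record layout follows (Python). All field and variable names here are assumptions made for the example, not the patent's actual schema; the 0.5 probabilities mirror the 50% case above.

```python
from dataclasses import dataclass, field

@dataclass
class FeatureModelEntry:
    """One feature quantity model entry per character ID (items of FIG. 6)."""
    feature_model_id: str        # identification information of the model
    face_angle: str              # e.g. "full face", "45 degrees to the right"
    photograph_date: str         # date of the source face images
    type: str                    # "normal", "makeup 1", "makeup 2", ...
    probability: float           # accuracy of the character-to-model mapping
    manual_update_history: list[str] = field(default_factory=list)

# A single character ID may carry several entries, so a face changed by
# age, makeup or disguise is still recognized as the same character.
character_and_feature_model_db: dict[str, list[FeatureModelEntry]] = {
    "character_A": [
        FeatureModelEntry("model_a", "full face", "2007-04", "normal", 0.5),
        FeatureModelEntry("model_b", "full face", "2007-04", "normal", 0.5),
    ],
    "character_B": [
        FeatureModelEntry("model_a", "full face", "2007-04", "normal", 0.5),
        FeatureModelEntry("model_b", "full face", "2007-04", "normal", 0.5),
    ],
}
```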
  • FIG. 7 illustrates the structure of the mapping section 29.
  • The mapping section 29 includes a target character determiner 41, a target character searcher 42, a representative model group retriever 43, a correlation determiner 44, a center model generator 45 and a recorder 46.
  • The target character determiner 41 determines a target character. Based on the character list 31 on the metadata DB 22, the target character searcher 42 identifies a plurality of video contents in which the target character performs.
  • The representative model group retriever 43 retrieves from the feature quantity model DB 27 the representative model groups 32 corresponding to the plurality of identified video contents.
  • The correlation determiner 44 selects a plurality of representative models corresponding to the target character based on the correlation of the representative models contained in the plurality of representative model groups 32.
  • The center model generator 45 generates a center model from the plurality of selected representative models.
  • The recorder 46 causes the character and feature quantity model DB 30 to store the generated center model with the target character mapped thereto.
  • The target character determiner 41 determines the target character by sequentially selecting the characters appearing in the video content.
  • The target character searcher 42 identifies a plurality of video contents in which the target character appears, excluding any video content in which another character appears together with the target character throughout.
  • The correlation determiner 44 calculates the correlation of the representative models among the plurality of retrieved representative model groups 32 and selects the combination of representative models having the highest correlation among the representative model groups. Instead of selecting the representative models having the highest correlation, representative models having a correlation above a threshold value may be selected. If the correlations of all representative models of all representative model groups were calculated, the amount of calculation would become extremely large. In such a case, the correlation may be calculated only for several characters ranked high in the display order of the character list 31. With this arrangement, the representative models are selected quickly, and the amount of correlation calculation is reduced.
  • The center model generator 45 generates as a center model a feature quantity model having an approximately equal correlation to each of the plurality of selected representative models.
  • The character list 31 and the representative model group 32 are generated for each video content. For example, when a new video content is added to the content DB 21, the preparatory process is performed on the added video content.
  • In step S1, the face image detector 24 retrieves from the content DB 21 a video content to be processed, detects the characters' faces in the video of the retrieved video content, and outputs the detected faces to the feature quantity model extractor 25.
  • In step S2, the feature quantity model extractor 25 generates a feature quantity model indicating the feature of each detected character face.
  • The feature quantity model extractor 25 also detects the face angle of each detected face and outputs the feature quantity model and the face angle to each of the feature quantity model classifier 26 and the feature quantity model DB 27.
  • When face detection over the entire video content is completed and the feature quantity model of each detected face has been generated and stored on the feature quantity model DB 27, processing proceeds to step S3.
  • In step S3, the feature quantity model classifier 26 calculates the similarity of the plurality of feature quantity models at the same face angle generated from the video content to be processed.
  • The feature quantity model classifier 26 classifies similar feature quantity models into the same feature quantity model group.
  • In step S4, the feature quantity model classifier 26 generates a representative model representing each feature quantity model group and outputs to the feature quantity model DB 27 the representative model group 32 composed of the plurality of generated representative models.
  • The feature quantity model DB 27 stores the input representative model group 32 with the content ID mapped thereto.
  • In step S5, the character list generator 28 retrieves from the metadata DB 22 the metadata of the video content to be processed. Based on the retrieved metadata, the character list generator 28 generates the character list 31 of the characters related to the video content to be processed and outputs the generated character list 31 to the metadata DB 22.
  • The metadata DB 22 stores the input character list 31 with the content ID mapped thereto.
  • The process in steps S1 through S4 of generating the representative model group 32 and the process in step S5 of generating the character list 31 may be carried out in reverse order or concurrently.
  • A character and feature quantity model DB generation process of generating the character and feature quantity model DB 30 is described below with reference to the flowchart of FIG. 9.
  • The character and feature quantity model DB generation process is performed after a certain number of video contents, each with the character list 31 and the representative model group 32, have been accumulated. More specifically, it is performed after at least two video contents have been accumulated in which the character to be mapped to a feature quantity model (the target character) performs without any other character appearing together with the target character throughout.
  • In step S11, the target character determiner 41 retrieves from the metadata DB 22 the character list 31 of a new video content CA added to the content DB 21 and sequentially selects the characters listed in the character list 31.
  • The target character determiner 41 thus determines the target character α.
  • In step S12, the target character searcher 42 references the character lists 31 on the metadata DB 22 to identify video contents in which the target character α performs with no other character appearing together throughout.
  • The character lists 31 corresponding to the identified video contents are retrieved from the metadata DB 22.
  • Video contents CB and CC may now be identified.
  • The representative model group 32 corresponding to each of the video contents CA, CB and CC contains a representative model indicating the feature of the face of the target character α. The following process is performed on the assumption that these models have a high correlation to one another.
  • In step S13, the representative model group retriever 43 retrieves from the feature quantity model DB 27 the representative model group 32 corresponding to each of the video contents CA, CB and CC and outputs the representative model groups 32 to the correlation determiner 44.
  • In step S14, the correlation determiner 44 calculates the correlation of the representative models among the plurality of retrieved representative model groups 32, selects the combination of the representative models having the highest correlation among them, and outputs the selected combination of representative models to the center model generator 45.
  • Let Aα represent the representative model selected from the representative model group of the video content CA, Bα the representative model selected from the representative model group of the video content CB, and Cα the representative model selected from the representative model group of the video content CC.
  • In step S15, the center model generator 45 generates a center model Mα having an approximately equal correlation to each of the selected representative models Aα, Bα and Cα and then outputs the center model Mα to the recorder 46.
  • In step S16, the recorder 46 attaches a feature quantity model ID to the input center model Mα and then records the center model Mα onto the character and feature quantity model DB 30.
  • The recorder 46 causes the character and feature quantity model DB 30 to record the character ID of the target character α with the feature quantity model ID of the center model mapped thereto.
  • Information containing the face angle, the photographing date, the type and the probability is also recorded.
  • The character and feature quantity model DB generation process is thus completed.
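  • The steps above can be made concrete with a small runnable toy (Python with NumPy). Everything in it is an assumption made for illustration: feature quantity models are stand-in vectors, "correlation" is cosine similarity, and the video contents CA, CB and CC are taken as already identified in steps S11 through S13.

```python
import numpy as np
from itertools import product

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Representative model group 32 of each video content in which the target
# character alpha appears with no constant co-performer (S11-S13).
groups = {
    "C_A": [np.array([1.0, 0.1, 0.0]), np.array([0.0, 1.0, 0.2])],
    "C_B": [np.array([0.9, 0.2, 0.1]), np.array([0.2, 0.1, 1.0])],
    "C_C": [np.array([1.0, 0.0, 0.1]), np.array([0.1, 0.9, 0.9])],
}

# S14: choose the combination (one representative model per content) whose
# members correlate most strongly with one another. To bound the search, a
# real system might only try characters ranked high in the display order.
def mutual_correlation(models):
    pairs = [(i, j) for i in range(len(models)) for j in range(i + 1, len(models))]
    return sum(cosine(models[i], models[j]) for i, j in pairs) / len(pairs)

best = max(product(*groups.values()), key=mutual_correlation)

# S15: the center model M_alpha should correlate roughly equally with every
# selected model; the mean vector is one simple way to achieve that.
center_model = np.mean(best, axis=0)

# S16: record the center model against the target character's ID.
print("A_alpha, B_alpha, C_alpha:", [m.tolist() for m in best])
print("M_alpha:", center_model.round(3).tolist())
```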
  • As the process described above is repeated, the accuracy of the feature quantity models of each character on the character and feature quantity model DB 30 increases, and the number of feature quantity models also increases.
  • The character and feature quantity model DB 30 thus constructed may be corrected, updated and modified.
  • For example, the character and feature quantity model DB 30 may be publicly disclosed on the Internet in the hope that any error is pointed out by viewers. If the same error is pointed out by more viewers than a predetermined threshold number, the character and feature quantity model DB 30 may be corrected.
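  • A minimal sketch of that crowd-correction rule, assuming a simple report format and an arbitrary threshold value (both invented for the example):

```python
from collections import Counter

THRESHOLD = 100  # assumed value; the text only says "a predetermined threshold"
error_reports: Counter = Counter()

def report_error(feature_model_id: str, correct_character_id: str) -> bool:
    """Count a viewer's report; return True once the same correction has
    been reported by more viewers than the threshold allows to ignore."""
    error_reports[(feature_model_id, correct_character_id)] += 1
    return error_reports[(feature_model_id, correct_character_id)] > THRESHOLD
```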
  • The information providing system 10 of FIG. 1, including the generated character and feature quantity model DB 30, receives face images and facial composite drawings from the operator and outputs the character-related information of the corresponding character.
  • The information providing system 10 may also display a web page from which a user can purchase products related to the character (such as compact discs (CDs), digital versatile discs (DVDs) or books) or products publicized by the character.
  • The information providing system 10 may find other applications. For example, by inputting the face image of any person, an actress having a similar face may be searched for, and a makeup technique of the actress may be learned.
  • A scene in which an identified character performs in the video content may also be output.
  • In this case, the content ID of the corresponding video content and time information (a time stamp) of the video scene may be output.
  • The series of process steps described above may be performed by hardware such as that of FIG. 2 or by software. If the process steps are performed by software, a program forming the software is installed from a program recording medium onto a computer built into dedicated hardware or onto a general-purpose computer that can perform a variety of functions with a variety of programs installed thereon.
  • FIG. 10 is a block diagram illustrating a hardware structure of a computer that executes the above-referenced process steps.
  • In the computer, a central processing unit (CPU) 101, a read-only memory (ROM) 102 and a random-access memory (RAM) 103 are interconnected to each other via a bus 104.
  • The bus 104 also connects to an input-output interface 105.
  • The input-output interface 105 connects to an input unit 106 including a keyboard, a mouse and a microphone, an output unit 107 including a display and a loudspeaker, a storage 108 including a hard disk and a non-volatile memory, a communication unit 109 including a network interface, and a drive 110 driving a recording medium 111 such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory.
  • The computer thus constructed performs the above-referenced process steps when the CPU 101 loads the program stored on the storage 108 into the RAM 103 via the input-output interface 105 and the bus 104 and executes the loaded program.
  • The program may be executed in the order of the process steps described above. Alternatively, the process steps of the program may be performed in parallel or at the timing at which each step is called.
  • The program may be executed by a single computer or by a plurality of computers.
  • The program may also be transferred to a remote computer for execution.
  • The term "system" in this specification may refer to a system including a plurality of apparatuses.

Abstract

An information processing apparatus for generating a database indicating mapping between characters and the characters' face images, includes a list generating unit for generating a list of characters, appearing in a video content, based on metadata of the video content, a detecting unit for detecting a character's face image from the video content, a model generating unit for generating a feature quantity model indicating a feature of the detected character's face image and a mapping unit for mapping the feature quantity model generated based on the video content to a character contained in the character list.

Description

    CROSS REFERENCES TO RELATED APPLICATIONS
  • The present invention contains subject matter related to Japanese Patent Application JP 2007-098567 filed in the Japanese Patent Office on Apr. 4, 2007, the entire contents of which are incorporated herein by reference.
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates to an information processing apparatus, an information processing method and a computer program and, in particular, to an information processing apparatus, an information processing method and a computer program for generating a database indicating a mapping between characters and respective face images based on a video content such as a television program.
  • 2. Description of the Related Art
  • For example, Japanese Unexamined Patent Application Publication No. 2002-189724 discloses a technique of detecting a person's face in a moving picture or a still picture and identifying whose face it is.
  • In the related art, a database (hereinafter referred to as DB) having recorded characters (persons) and feature quantity models indicating the features of the characters mapped to the characters is referenced, a feature quantity model of a detected face image is compared with the feature quantity models on the database, and a character having the highest correlation is identified as the character having the detected face image.
  • SUMMARY OF THE INVENTION
  • The DB in the related art, having recorded the characters and the feature quantity models of the faces mapped to the characters, is manually constructed. If the DB is instead constructed automatically using a computer, the amount of data stored on the DB (the number of characters and of the feature quantity models of the characters' faces) can be increased far more rapidly than by manual input. More characters are thus expected to be recognized.
  • Moreover, the DB of the related art cannot cope with a change in a person's face due to the aging process or a change in facial features resulting from makeup or disguise. In such cases, the DB also needs to be manually updated.
  • It is thus desirable to construct automatically a database indicating mapping between characters and feature quantity models of faces based on a video content.
  • In accordance with one embodiment of the present invention, an information processing apparatus for generating a database indicating mapping between characters and the characters' face images, includes a list generating unit for generating a list of characters, appearing in a video content, based on metadata of the video content, a detecting unit for detecting a character's face image from the video content, a model generating unit for generating a feature quantity model indicating a feature of the detected character's face image, and a mapping unit for mapping the feature quantity model generated based on the video content to a character contained in the character list.
  • The information processing apparatus may further include a classifying unit for classifying into feature quantity model groups a plurality of feature quantity models, generated from the video content, according to a similarity and generating a representative model representing a plurality of feature quantity models classified in each feature quantity model group. The mapping unit maps the representative model to the character contained in the character list.
  • The mapping unit may include a determining unit for determining a target character, a retrieval unit for searching, in accordance with the character list, the video content in which the target character appears and retrieving the feature quantity model generated from the searched video content, a determining unit for determining a plurality of feature quantity models having a high correlation to each other from among the retrieved feature quantity models, and a map generating unit for generating a center model serving as a center of the plurality of feature quantity models determined as having the high correlation to each other and mapping the center model to the target character.
  • The list generating unit may generate the character list including a group composed of a plurality of characters based on the metadata of the video content.
  • The detecting unit may detect the character's face image regardless of a looking face angle thereof from the video content, and the mapping unit may map to the same character a plurality of feature quantity models generated based on the face images detected at different looking face angles.
  • In accordance with one embodiment of the present invention, an information processing method of an information processing apparatus for generating a database indicating mapping between characters and the characters' face images, includes steps of generating a list of characters, appearing in a video content, based on metadata of the video content, detecting a character's face image from the video content, generating a feature quantity model indicating a feature of the detected character's face image and mapping the feature quantity model generated based on the video content to a character contained in the character list.
  • In accordance with one embodiment of the present invention, a computer program for causing a computer to generate a database indicating mapping between characters and the characters' face images, includes steps of generating a list of characters, appearing in a video content, based on metadata of the video content, detecting a character's face image from the video content, generating a feature quantity model indicating a feature of the detected character's face image and mapping the feature quantity model generated based on the video content to a character contained in the character list.
  • In accordance with one embodiment of the present invention, the list of characters, appearing in the video content, is generated based on the metadata of the video content, the character's face image is detected from the video content, the feature quantity model indicating the feature of the detected character's face image is generated and the feature quantity model generated from the video content is mapped to a character contained in the character list.
  • In accordance with embodiments of the present invention, the database indicating the mapping between the characters and the characters' face images is automatically constructed based on the video content.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram illustrating an information providing system in accordance with one embodiment of the present invention;
  • FIG. 2 is a block diagram illustrating a character and feature quantity model database (DB) generator in accordance with one embodiment of the present invention;
  • FIG. 3 illustrates a content DB of FIG. 2;
  • FIG. 4 illustrates a character-related information DB of FIG. 2;
  • FIG. 5 illustrates a character list of FIG. 2;
  • FIG. 6 illustrates a character and feature quantity model DB of FIG. 2;
  • FIG. 7 is a block diagram illustrating a mapping section of FIG. 2;
  • FIG. 8 is a flowchart illustrating a preparatory process;
  • FIG. 9 is a flowchart illustrating a character and feature quantity model DB generation process; and
  • FIG. 10 is a block diagram illustrating a computer.
  • DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • Before describing an embodiment of the present invention, the correspondence between the features of the present invention and an embodiment disclosed in the specification or the drawings of the invention is discussed below. This statement is intended to assure that embodiments supporting the claimed invention are described in this specification or the drawings. Thus, even if an embodiment is described in the specification or the drawings, but not described as relating to a feature of the invention herein, that does not necessarily mean that the embodiment does not relate to that feature of the invention. Conversely, even if an embodiment is described herein as relating to a certain feature of the invention, that does not necessarily mean that the embodiment does not relate to other features of the invention.
  • In accordance with one embodiment of the present invention, an information processing apparatus (for example, character and feature quantity model DB generator 20 of FIG. 2) for generating a database indicating mapping between characters and the characters' face images, includes a list generating unit (for example, character list generator 28 of FIG. 2) for generating a list of characters, appearing in a video content, based on metadata of the video content, a detecting unit (for example, face image detector 24 of FIG. 2) for detecting a character's face image from the video content, a model generating unit (for example, feature quantity model extractor 25 of FIG. 2) for generating a feature quantity model indicating a feature of the detected character's face image, and a mapping unit (for example, mapping section 29 of FIG. 2) for mapping the feature quantity model generated based on the video content to a character contained in the character list.
  • The information processing apparatus may further include a classifying unit (for example, feature quantity model classifier 26 of FIG. 2) for classifying into feature quantity model groups a plurality of feature quantity models, generated from the video content, according to a similarity and generating a representative model representing a plurality of feature quantity models classified in each feature quantity model group.
  • The mapping unit may include a determining unit (for example, target character determiner 41 of FIG. 7) for determining a target character, a retrieval unit (for example, target character searcher 42 and representative model group retriever 43 of FIG. 7) for searching, in accordance with the character list, the video content in which the target character appears and retrieving the feature quantity model generated based on the searched video content, a determining unit (for example, correlation determiner 44 of FIG. 7) for determining a plurality of feature quantity models having a high correlation to each other from among the retrieved feature quantity models, and a map generating unit (for example, center model generator 45 of FIG. 7) for generating a center model serving as a center of a plurality of feature quantity models determined as having a high correlation to each other and mapping the center model to the target character.
  • In accordance with one embodiment of the present invention, one of an information processing method and a computer program of an information processing apparatus for generating a database indicating mapping between characters and the characters' face images, includes steps of generating a list of characters, appearing in a video content, based on metadata of the video content (for example, step S5 of FIG. 8), detecting a character's face image from the video content (for example, step S2 of FIG. 8), generating a feature quantity model indicating a feature of the detected character's face image (for example, step S3 of FIG. 8) and mapping the feature quantity model generated based on the video content to a character contained in the character list (for example, step S16 of FIG. 8).
  • The embodiments of the present invention are described below with reference to the drawings.
  • An information providing system 10 of one embodiment of the present invention is described below with reference to FIG. 1. The information providing system 10 includes a character and feature quantity model database (DB) generator. The character and feature quantity model DB is used in the information providing system 10.
  • The information providing system 10 includes an information providing apparatus 11, a character-related information DB 23 and a character and feature quantity model DB 30. The information providing apparatus 11 receives facial photos and facial composite drawings and provides a user with information related to the character (person) corresponding to the input facial photo. The character-related information DB 23 stores character information related to characters appearing in a video content, such as show business people, intellectuals, athletes, and politicians, and character-related information mapped to the characters. The character and feature quantity model DB 30 stores the character information and feature quantity models indicating the feature of each character mapped to the character information.
  • The information providing apparatus 11 generates a feature quantity model indicating a feature of the facial photo or facial composite drawing input by an operator, searches for the feature quantity model having the highest correlation with the generated feature quantity model by referencing the character and feature quantity model DB 30, and identifies the character matching the search result. The information providing apparatus 11 then retrieves the character-related information of the identified character from the character-related information DB 23 and supplies the retrieved information to the operator.
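  • As a rough sketch of this lookup (not the patent's implementation), with feature quantity models reduced to vectors and correlation to cosine similarity, both of which are assumptions:

```python
import numpy as np

def identify_character(photo_model: np.ndarray,
                       model_db: dict[str, np.ndarray],
                       related_info_db: dict[str, dict]) -> dict:
    """Return the character-related information of the stored model having
    the highest correlation with the model built from the input photo."""
    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    best_id = max(model_db, key=lambda cid: cosine(photo_model, model_db[cid]))
    return related_info_db[best_id]

# Hypothetical two-character database for demonstration.
models = {"ch_01": np.array([1.0, 0.0]), "ch_02": np.array([0.0, 1.0])}
info = {"ch_01": {"name": "Character One"}, "ch_02": {"name": "Character Two"}}
print(identify_character(np.array([0.9, 0.2]), models, info))  # Character One
```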
  • The character-related information DB 23 is prepared beforehand (as described in detail later). The character and feature quantity model DB 30 is constructed by a character and feature quantity model DB generator 20 of FIG. 2.
  • The character and feature quantity model DB generator 20 in accordance with one embodiment of the present invention is described below with reference to FIG. 2. The character and feature quantity model DB generator 20 constructs the character and feature quantity model DB 30.
  • The character and feature quantity model DB generator 20 includes a content DB 21, a metadata DB 22, a character-related information DB 23, a face image detector 24, a feature quantity model extractor 25, a feature quantity model classifier 26, a feature quantity model DB 27, a character list generator 28 and a mapping section 29. The content DB 21 stores the video content. The metadata DB 22 stores metadata corresponding to the video content stored on the content DB 21. The character-related information DB 23 stores the character-related information. The face image detector 24 detects a face image from the video content. The feature quantity model extractor 25 generates a feature quantity model indicating a feature of the detected face image. The feature quantity model classifier 26 classifies numerous generated feature quantity models according to similarity. The feature quantity model DB 27 stores the feature quantity models. The character list generator 28 generates a character list 31 of characters appearing in the video content based on the metadata of the video content. The mapping section 29 maps the feature quantity model to the character contained in the character list 31.
  • The video content may be a moving image such as a television program or a still image such as a news photo or a gravure picture. As shown in FIG. 3, the content DB 21 stores a content ID (identification), namely unique identification information identifying the video content, and location information indicating the storage location of the actual data of the video content (moving image data, still image data, etc.), with the content ID mapped to the location information. The content DB 21 retrieves the actual data of the video content in accordance with the stored location information and outputs the actual data to each of the face image detector 24 and the mapping section 29. Alternatively, the content DB 21 may store the actual data of the video content itself.
  • The metadata DB 22 stores the metadata of the video content stored on the content DB 21 with the content ID mapped to the metadata. If the video content is a television program, the metadata includes EPG (electronic program guide) information such as the program name, broadcasting date and time, channel, casting and program content. For contents other than television programs, the metadata is general attribute information containing the names of the characters appearing in the video content. The metadata DB 22 also stores the character list 31 with the content ID mapped to the character list 31. The character list 31 is generated by the character list generator 28.
  • The character-related information DB 23 stores character information related to characters appearing in a video content, such as show business people, intellectuals, athletes, and politicians, and character-related information mapped to the characters.
  • FIG. 4 illustrates items in the character-related information stored on the character-related information DB 23. The character-related information includes the name of a character, a pronunciation guide for the name, an alphabetical representation of the name, a home town, a birthday, a debut date, related character IDs, the URL (uniform resource locator) of the character's official home page, and other information, each item mapped to the character ID.
  • A character ID may be assigned to a group composed of a plurality of characters. If a member of the group also appears separately in video contents, the member is assigned the member's own character ID. The related character IDs mapped to the character-related information of a group member contain the character ID of the group and the character IDs of the other members of the group. Each group is thus recorded with its members mutually mapped to one another. Even if only a group name is described in the character list, the face of each individual group member can therefore be mapped, as shown in the sketch below.
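  • A small sketch of how such mutual group mappings might look (all IDs and names are invented for the example):

```python
# Each member's related character IDs contain both the group's own
# character ID and the IDs of the other members, so that a group name in
# a character list can still be resolved to each member's face.
character_related_info = {
    "grp_01": {"name": "Some Group", "related_ids": ["mem_01", "mem_02"]},
    "mem_01": {"name": "Member One", "related_ids": ["grp_01", "mem_02"]},
    "mem_02": {"name": "Member Two", "related_ids": ["grp_01", "mem_01"]},
}
```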
  • The character-related information DB 23 may further contain a content ID of the video content and time information of time at which the character appears in the video content.
  • Referring to FIG. 2, the face image detector 24 retrieves the video content from the content DB 21 and detects the characters' faces in the video of the retrieved video content. The face image detector 24 detects a character's face image not only in full face but also when the face looks away at various angles (for example, turned 10 degrees to the right of the frontward direction or 45 degrees to the left of the frontward direction). The character as a subject may also be photographed from a variety of angles. The face image detector 24 outputs the detection results to the feature quantity model extractor 25. If the video content is a moving image, a plurality of face images is detected even if only a single character performs in the video content. Face images detected at different angles in consecutive scenes are in many cases from the same person. The feature quantity model classifier 26 as a subsequent element therefore stores information indicating consecutive detections so that the plurality of detected face images can be recognized as being from the same person.
  • The feature quantity model extractor 25 generates a feature quantity model indicating the features of each detected character face image. The feature quantity model extractor 25 also detects the face angle of the detected face and outputs the feature quantity model and the face angle to each of the feature quantity model classifier 26 and the feature quantity model DB 27. The feature quantity model may be generated on a per-face-angle basis. Alternatively, a feature quantity model may be generated for the detected full-face image, and feature quantity models for other face angles may be derived from that full-face feature quantity model.
  • For example, techniques disclosed in Japanese Unexamined Patent Application Publication No. 2002-189724 may be applied to the face image detector 24 and the feature quantity model extractor 25.
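  • The detection technique itself is outside the scope of this description, but the bookkeeping of consecutive detections can be sketched as follows. This is a deliberately naive sketch: detect_faces stands in for a detector such as the one of the cited publication, and the one-face-per-frame assumption is ours:

```python
def scan_content(frames, detect_faces):
    """Record 'consecutive detection' information for a moving image.

    detect_faces: frame -> list of (face_image, face_angle) pairs,
    with face_angle in degrees away from full face (0).  When exactly
    one face is found in consecutive frames, the detections share a
    track ID so the classifier can later treat them as one person even
    if the face angle changes between scenes."""
    detections = []                 # (track_id, face_image, face_angle)
    track_id = -1
    in_run = False                  # currently inside a consecutive run?
    for frame in frames:
        faces = detect_faces(frame)
        if len(faces) == 1:
            if not in_run:
                track_id += 1       # start of a new consecutive run
            image, angle = faces[0]
            detections.append((track_id, image, angle))
            in_run = True
        else:
            in_run = False          # run broken: zero or several faces
    return detections
```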
  • The feature quantity model classifier 26 calculates the similarity among the plurality of feature quantity models having the same face angle generated from a single video content and classifies them into feature quantity model groups so that each group is composed of mutually similar feature quantity models. The feature quantity models classified into the same group are considered to correspond to the same person. The feature quantity model classifier 26 generates an average model of each feature quantity model group (hereinafter referred to as a representative model) and outputs the representative model to the feature quantity model DB 27. Unless different characters closely resemble each other in face, the number of representative models generated is equal to or larger than the number of characters; if feature quantity models of different face angles are generated, representative models at different angles are generated for the same character.
  • The plurality of representative models generated from a single video content is referred to as a representative model group 32. More specifically, the representative model group 32 contains a representative model for each of the characters performing in that video content. Instead of generating representative models, the feature quantity model groups resulting from the classification may be output to the feature quantity model DB 27 as they are. Generating representative models, however, reduces the amount of calculation required in later elements.
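  • A sketch of the classification and of representative model generation, under assumptions the patent leaves open (cosine similarity as the similarity measure, a greedy grouping with a fixed threshold, and feature quantity models represented as 1-D numpy vectors):

```python
import numpy as np

def classify_models(models, threshold=0.9):
    """Greedy sketch of the feature quantity model classifier 26.

    models: list of 1-D feature vectors, all at the same face angle,
    from a single video content.  A vector joins the first group whose
    current average it resembles (cosine similarity >= threshold);
    otherwise a new group is opened.  Each group is assumed to collect
    the face images of one person."""
    groups = []                                  # list of lists of vectors
    for m in models:
        for g in groups:
            rep = np.mean(g, axis=0)
            sim = np.dot(m, rep) / (np.linalg.norm(m) * np.linalg.norm(rep))
            if sim >= threshold:
                g.append(m)
                break
        else:
            groups.append([m])
    # the representative model of each group is its average model;
    # together these form the representative model group 32
    return [np.mean(g, axis=0) for g in groups]
```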
  • The feature quantity model DB 27 stores the feature quantity model generated by the feature quantity model extractor 25 and the representative model generated by the feature quantity model classifier 26. The feature quantity model DB 27 may also store the feature quantity model group classified by the feature quantity model classifier 26.
  • Based on the metadata of the video content, the character list generator 28 generates the character list 31 of the characters performing in the video content and outputs the generated character list 31 to the metadata DB 22. As shown in FIG. 5, the character list 31 contains, mapped to the content ID, the character IDs of the characters described in the metadata (retrieved from the character-related information DB 23), the role of each character (actor, actress, producer, writer), and the order in which each character is listed in the metadata (in particular, in the EPG). The mapping section 29, a later element, uses this listing order as information related to the length of each character's performance and the importance of the character.
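  • A sketch of the character list generation, assuming the casting names have already been parsed out of the EPG metadata and that name_to_character_id is a hypothetical lookup backed by the character-related information DB 23:

```python
def generate_character_list(content_id, casting, name_to_character_id):
    """Sketch of the character list generator 28.

    casting: ordered list of names from the EPG metadata.
    name_to_character_id: resolves a name to a character ID.
    The listing order is kept because the mapping section 29 reads it
    as a rough measure of performance length and importance."""
    characters = []
    for order, name in enumerate(casting):
        character_id = name_to_character_id.get(name)
        if character_id is not None:            # skip names not in DB 23
            characters.append({"character_id": character_id,
                               "display_order": order})
    return {"content_id": content_id, "characters": characters}

# example
print(generate_character_list(
    "CT001", ["A", "B", "Unknown"], {"A": "C001", "B": "C002"}))
```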
  • The mapping section 29 determines the character to whom a feature quantity model is to be mapped (hereinafter referred to as a target character). The mapping section 29 identifies a plurality of video contents in which the target character performs, compares the character lists 31 and the representative model groups 32 corresponding to the identified video contents, generates the feature quantity model (center model) corresponding to the target character, and outputs the center model to the character and feature quantity model DB 30 for storage. When a plurality of video contents is identified, video contents of the same series may be handled as a single video content. The character may also be mapped based on the representative model group 32 of only a single video content.
  • In response to the output from the mapping section 29, the character and feature quantity model DB 30 stores the character ID and the feature quantity model mapped to the character ID. FIG. 6 illustrates the data structure of the character and feature quantity model DB 30. Mapped to each character ID are the face angle at which the face image appears in the video content (for example, full face or 45 degrees turned to the right), the photographing date, a type indicating special makeup or disguise (normal, makeup 1, makeup 2, etc.), a feature quantity model ID as identification information of the feature quantity model, a probability indicating the accuracy of the mapping between the character and the feature quantity model, and a manual updating history recording manual corrections and updates of each item. If the change in the feature quantity model across different photographing dates is small, those entries may be merged. In this way, an excessive increase in the data size of the character and feature quantity model DB 30 is suppressed.
  • The character and feature quantity model DB 30 can store a plurality of feature quantity models mapped to a single character. More specifically, records having the same character ID but different values in the other items may be stored. In this way, even if a character's face in the video content changes with age, makeup, or disguise, the feature quantity model in each state is mapped to the same character ID and recognized as representing the same character.
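  • One row of FIG. 6 might be modeled as below (illustrative field names only); note how three entries share the character ID C001 for different face angles and makeup states:

```python
from dataclasses import dataclass, field

@dataclass
class FaceModelEntry:
    # one row of the character and feature quantity model DB 30 (FIG. 6)
    character_id: str
    face_angle: str          # e.g. "full face", "45 deg right"
    photographing_date: str
    type: str                # "normal", "makeup 1", "makeup 2", ...
    feature_model_id: str
    probability: float       # accuracy of the character-to-model mapping
    manual_update_history: list[str] = field(default_factory=list)

# several entries may share one character ID: the same person at
# different angles, ages, or makeup/disguise states
db30: list[FaceModelEntry] = [
    FaceModelEntry("C001", "full face", "2007-04", "normal", "F0001", 0.9),
    FaceModelEntry("C001", "45 deg right", "2007-04", "normal", "F0002", 0.9),
    FaceModelEntry("C001", "full face", "2007-04", "makeup 1", "F0003", 0.8),
]
```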
  • As a character appears in more video contents, the character's face image is detected more frequently and the probability becomes higher. For example, if a duo of characters A and B always performs together in every video content, feature quantity models a and b are both mapped to each of characters A and B. Each of the mapping probabilities of character A to model a, character A to model b, character B to model a, and character B to model b is then 50%.
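  • The 50% figure of the duo example can be reproduced with a simple co-occurrence count. The normalization used here is an assumption; the patent does not give a formula:

```python
from collections import Counter, defaultdict

def mapping_probabilities(contents):
    """Illustrative mapping probability: count how often each character
    ID co-occurs with each face model across video contents, then
    normalise the counts per character."""
    co = defaultdict(Counter)
    for character_ids, model_ids in contents:
        for c in character_ids:
            for m in model_ids:
                co[c][m] += 1
    return {c: {m: n / sum(cnt.values()) for m, n in cnt.items()}
            for c, cnt in co.items()}

# a duo that always performs together: every mapping stays at 50%
contents = [(["A", "B"], ["a", "b"])] * 3
print(mapping_probabilities(contents))
# {'A': {'a': 0.5, 'b': 0.5}, 'B': {'a': 0.5, 'b': 0.5}}
```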
  • FIG. 7 illustrates a structure of the mapping section 29.
  • The mapping section 29 includes a target character determiner 41, a target character searcher 42, a representative model group retriever 43, a correlation determiner 44, a center model generator 45 and a recorder 46. The target character determiner 41 determines a target character. Based on the character list 31 on the metadata DB 22, the target character searcher 42 identifies a plurality of video contents in which the target character performs. The representative model group retriever 43 retrieves from the feature quantity model DB 27 the representative model group 32 corresponding to the plurality of identified video contents. The correlation determiner 44 selects a plurality of representative models corresponding to the target character based on a correlation of the representative model contained in the plurality of representative model groups 32. The center model generator 45 generates a center model from the plurality of selected representative models. The recorder 46 causes the character and feature quantity model DB 30 to store the generated center model with the target character mapped thereto.
  • When a preparatory process (to be discussed later) ends for a new video content added to the content DB 21, the target character determiner 41 determines the target character by sequentially selecting the characters in that video content. The target character searcher 42 identifies a plurality of video contents in which the target character performs, excluding any video content in which another character appears together with the target character throughout.
  • The correlation determiner 44 calculates the correlations of the representative models among the plurality of retrieved representative model groups 32 and selects the combination of representative models having the highest correlation across the groups. Instead of selecting the combination having the highest correlation, representative models having a correlation above a threshold value may be selected. If the correlations of all representative models in all representative model groups were calculated, the amount of calculation would become extremely large. In such a case, the correlation may be calculated only for the several characters listed highest in the character list 31. With this arrangement, the representative models are selected quickly and the amount of correlation calculation is reduced.
  • The center model generator 45 generates as a center model a feature quantity model having an approximately equal correlation to each of the plurality of selected representative models.
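  • A sketch of the correlation determiner 44 and the center model generator 45 together, assuming cosine correlation and vector-valued representative models. The exhaustive search over combinations is shown for clarity; as noted above, in practice only a few top-listed candidates per group would be tried:

```python
import numpy as np
from itertools import product

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def select_and_center(groups):
    """groups: a list of representative model groups 32 (one per
    identified video content), each a list of 1-D vectors.  Pick the
    one-model-per-group combination with the highest total pairwise
    correlation (correlation determiner 44), then take the mean of the
    selected models as the center model (center model generator 45),
    which has roughly equal correlation to every selected model."""
    best, best_score = None, -np.inf
    for combo in product(*groups):
        score = sum(cosine(u, v)
                    for i, u in enumerate(combo)
                    for v in combo[i + 1:])
        if score > best_score:
            best, best_score = combo, score
    center_model = np.mean(best, axis=0)
    return best, center_model
```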
  • The preparatory process for generating the character and feature quantity model DB 30 is described below with reference to a flowchart of FIG. 8.
  • In the preparatory process, the character list 31 and the representative model group 32 are generated for each video content. For example, when a new video content is added to the content DB 21, the preparatory process is performed on the added video content.
  • In step S1, the face image detector 24 retrieves from the content DB 21 a video content to be processed, detects a character face in the video of the retrieved video content, and outputs the character face to the feature quantity model extractor 25. In step S2, the feature quantity model extractor 25 generates a feature quantity model indicating the feature of the detected character face. The feature quantity model extractor 25 detects the face angle of the detected face and outputs the feature quantity model and the face angle to each of the feature quantity model classifier 26 and the feature quantity model DB 27.
  • Once face detection has been performed over the entire video content and the feature quantity model of each detected face has been generated and stored on the feature quantity model DB 27, processing proceeds to step S3.
  • In step S3, the feature quantity model classifier 26 calculates the similarity of the plurality of feature quantity models at the same face angle generated from the video content to be processed and classifies similar feature quantity models into the same feature quantity model group. In step S4, the feature quantity model classifier 26 generates a representative model representing each feature quantity model group and outputs to the feature quantity model DB 27 the representative model group 32 composed of the plurality of generated representative models. The feature quantity model DB 27 stores the input representative model group 32 with the content ID mapped thereto.
  • In step S5, the character list generator 28 retrieves from the metadata DB 22 the metadata of the video content to be processed. Based on the retrieved metadata, the character list generator 28 generates the character list 31 of the characters related to the video content to be processed and outputs the generated character list 31 to the metadata DB 22. The metadata DB 22 stores the input character list 31 with the content ID mapped thereto.
  • The process in steps S1 through S4 of generating the representative model group 32 and the process in step S5 of generating the character list 31 may be carried out in reverse order or concurrently.
  • The preparatory process of the video content to be processed has been described.
  • A character and feature quantity model generation process of generating the character and feature quantity model DB 30 is described below with reference to a flowchart of FIG. 9.
  • The character and feature quantity model generation process is performed after a certain number of video contents, each with its character list 31 and representative model group 32, have been accumulated. More specifically, at least two video contents must be accumulated in which the character to be mapped to a feature quantity model (the target character) performs without any other character continuously accompanying the target character throughout.
  • In step S11, the target character determiner 41 retrieves from the metadata DB 22 the character list 31 corresponding to a new video content CA added to the content DB 21 and sequentially selects the characters listed in the character list 31. The target character determiner 41 thus determines the target character α.
  • In step S12, the target character searcher 42 references the character lists 31 on the metadata DB 22 to identify video contents in which the target character α performs with no other character appearing together throughout. The character lists 31 corresponding to the identified video contents are retrieved from the metadata DB 22.
  • Suppose that, in addition to the video content CA, video contents CB and CC are identified. The representative model group 32 corresponding to each of the video contents CA, CB and CC then contains a representative model indicating the features of the face of the target character α. The following process is performed on the assumption that these models have a high correlation with one another.
  • In step S13, the representative model group retriever 43 retrieves from the feature quantity model DB 27 the representative model group 32 corresponding to each of the video contents CA, CB and CC and outputs the representative model group 32 to the correlation determiner 44.
  • In step S14, the correlation determiner 44 calculates the correlations of the representative models among the plurality of retrieved representative model groups 32, selects the combination of representative models having the highest correlation, and outputs the selected combination to the center model generator 45. Let Aα represent the representative model selected from the representative model group of the video content CA, Bα the representative model selected from that of the video content CB, and Cα the representative model selected from that of the video content CC.
  • In step S15, the center model generator 45 generates a center model Mα having an approximately equal correlation to each of the selected representative models Aα, Bα and Cα and then outputs the center model Mα to the recorder 46. In step S16, the recorder 46 attaches a feature quantity model ID to the input center model Mα and then records the center model Mα onto the character and feature quantity model DB 30. The recorder 46 causes the character and feature quantity model DB 30 to record the character ID of the target character α with the feature quantity model ID of the center model mapped thereto. In addition to the feature quantity model ID of the center model Mα, information containing the face angle, the photographing date, the type and the probability is also recorded.
  • The character and feature quantity model generation process is thus completed. By repeating the character and feature quantity model generation process, the accuracy of the feature quantity models of each character stored on the character and feature quantity model DB 30 increases, as does the number of feature quantity models.
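  • Tying steps S11 through S16 together, reusing select_and_center() from the sketch above. The inputs and the record() callback are hypothetical, and the S12 exclusion of contents where another character accompanies the target throughout is omitted for brevity:

```python
def build_db30(new_content_id, character_lists, representative_groups,
               record):
    """Compressed sketch of steps S11 through S16.

    character_lists: content ID -> ordered list of character IDs
        (the character list 31 of each video content).
    representative_groups: content ID -> representative model group 32.
    record(character_id, center_model): stores one row into the
        character and feature quantity model DB 30."""
    for target in character_lists[new_content_id]:                 # S11
        contents = [cid for cid, chars in character_lists.items()  # S12
                    if target in chars]
        if len(contents) < 2:
            continue                      # not enough material yet
        groups = [representative_groups[cid] for cid in contents]  # S13
        selected, center = select_and_center(groups)               # S14, S15
        record(target, center)                                     # S16
```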
  • The character and feature quantity model DB 30 thus constructed may be corrected, updated and modified. For example, the character and feature quantity model DB 30 may be publicly disclosed on the Internet so that viewers can point out errors. If the same error is pointed out by more than a predetermined threshold number of viewers, the character and feature quantity model DB 30 may be corrected.
  • The information providing system 10 of FIG. 1, including the character and feature quantity model DB 30 thus generated, receives face images or facial composite drawings from the operator and outputs the character-related information of the corresponding character. The information providing system 10 may also display a web page from which a user can purchase products related to the character (such as compact disks (CDs), digital versatile disks (DVDs) or books) or products publicized by the character. The information providing system 10 may find other applications. For example, by inputting the face image of any person, an actress having a similar face may be searched for, and the makeup technique of that actress may be learned. By inputting a video content and a character, the scenes in which the character performs in the video content may be output. By inputting a single video scene, the content ID of the corresponding video content and time information (a time stamp) of the video scene may be output.
  • The series of process steps described above may be performed by the hardware of FIG. 2 or by software. If the process steps are performed by software, a program forming the software is installed from a program recording medium onto a computer built into dedicated hardware or onto a general-purpose computer that performs a variety of functions with a variety of programs installed thereon.
  • FIG. 10 is a block diagram illustrating a hardware structure of a computer that executes the above-referenced process steps.
  • In the computer, a central processing unit (CPU) 101, a read-only memory (ROM) 102 and a random-access memory (RAM) 103 are interconnected via a bus 104.
  • The bus 104 also connects to an input-output interface 105. The input-output interface 105 connects to an input unit 106 including a keyboard, a mouse and a microphone; an output unit 107 including a display and a loudspeaker; a storage 108 including a hard disk and a non-volatile memory; a communication unit 109 including a network interface; and a drive 110 driving a recording medium 111 such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory.
  • The computer thus constructed performs the above-referenced process steps when the CPU 101 loads the program stored on the storage 108 to the RAM 103 via the input-output interface 105 and the bus 104 and executes the loaded program.
  • The program may be executed in the order of the process steps described above. Alternatively, the process steps of the program may be performed in parallel or at the timing at which a call takes place.
  • The program may be executed by a single computer or a plurality of computers. The program may be transferred to a remote computer for execution.
  • The term system in the specification may refer to a system including a plurality of apparatuses.
  • It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.

Claims (8)

1. An information processing apparatus for generating a database indicating mapping between characters and the characters' face images, comprising:
list generating means for generating a list of characters, appearing in a contents data, based on metadata of the contents data;
detecting means for detecting a character's face image from the contents data;
model generating means for generating a feature quantity model indicating a feature of the detected character's face image; and
mapping means for mapping the feature quantity model generated based on the contents data to a character contained in the character list.
2. The information processing apparatus according to claim 1, further comprising classifying means for classifying into feature quantity model groups a plurality of feature quantity models, generated from the contents data, according to a similarity and generating a representative model representing a plurality of feature quantity models classified in each feature quantity model group,
wherein the mapping means maps the representative model to a character contained in the character list.
3. The information processing apparatus according to claim 1, wherein the mapping means comprises:
determining means for determining a target character;
retrieval means for searching, in accordance with the character list, the contents data in which the target character appears and retrieving the feature quantity model generated from the searched contents data;
determining means for determining a plurality of feature quantity models having a high correlation to each other from among the retrieved feature quantity models; and
map generating means for generating a center model serving as a center of the plurality of feature quantity models determined as having the high correlation to each other and mapping the center model to the target character.
4. The information processing apparatus according to claim 1, wherein the list generating means generates the character list including a group composed of a plurality of characters based on the metadata of the contents data.
5. The information processing apparatus according to claim 1, wherein the detecting means detects the character's face image regardless of a looking face angle thereof from the contents data, and
wherein the mapping means maps to the same character a plurality of feature quantity models generated from the face images detected at different looking face angles.
6. An information processing method of an information processing apparatus for generating a database indicating mapping between characters and the characters' face images, comprising steps of:
generating a list of characters, appearing in a contents data, based on metadata of the contents data;
detecting a character's face image from the contents data;
generating a feature quantity model indicating a feature of the detected character's face image; and
mapping the feature quantity model generated based on the contents data to a character contained in the character list.
7. A computer program for causing a computer to generate a database indicating mapping between characters and the characters' face images, comprising steps of:
generating a list of characters, appearing in a contents data, based on metadata of the contents data;
detecting a character's face image from the contents data;
generating a feature quantity model indicating a feature of the detected character's face image; and
mapping the feature quantity model generated based on the contents data to a character contained in the character list.
8. An information processing apparatus for generating a database indicating mapping between characters and the characters' face images, comprising:
a list generating unit generating a list of characters, appearing in a contents data, based on metadata of the contents data;
a detecting unit detecting a character's face image from the contents data;
a model generating unit generating a feature quantity model indicating a feature of the detected character's face image; and
a mapping unit mapping the feature quantity model generated based on the contents data to a character contained in the character list.
US12/046,322 2007-04-04 2008-03-11 Apparatus, method and computer program for processing information Expired - Fee Related US8107689B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JPP2007-098567 2007-04-04
JP2007098567A JP4337064B2 (en) 2007-04-04 2007-04-04 Information processing apparatus, information processing method, and program

Publications (2)

Publication Number Publication Date
US20080247610A1 (en) 2008-10-09
US8107689B2 (en) 2012-01-31

Family

ID=39826936

Family Applications (1)

Application Number Title Priority Date Filing Date
US12/046,322 Expired - Fee Related US8107689B2 (en) 2007-04-04 2008-03-11 Apparatus, method and computer program for processing information

Country Status (3)

Country Link
US (1) US8107689B2 (en)
JP (1) JP4337064B2 (en)
CN (1) CN101281540B (en)


Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5224360B2 (en) * 2008-11-10 2013-07-03 日本電気株式会社 Electronic advertising device, electronic advertising method and program
JP2010152744A (en) * 2008-12-25 2010-07-08 Toshiba Corp Reproducing device
CN103984931B (en) * 2014-05-27 2017-11-07 联想(北京)有限公司 A kind of information processing method and the first electronic equipment
CN105426829B (en) * 2015-11-10 2018-11-16 深圳Tcl新技术有限公司 Video classification methods and device based on facial image


Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10326278A (en) * 1997-03-27 1998-12-08 Minolta Co Ltd Processor and method for information processing and recording medium for information processing program
JP2001167110A (en) * 1999-12-08 2001-06-22 Matsushita Electric Ind Co Ltd Picture retrieving method and its device
JP2002189724A (en) 2000-12-20 2002-07-05 Victor Co Of Japan Ltd Image data retrieval device
CN1137662C (en) * 2001-10-19 2004-02-11 清华大学 Main unit component analysis based multimode human face identification method
CN1313962C (en) * 2004-07-05 2007-05-02 南京大学 Digital human face image recognition method based on selective multi-eigen space integration
JP4586446B2 (en) * 2004-07-21 2010-11-24 ソニー株式会社 Content recording / playback apparatus, content recording / playback method, and program thereof
JP4973188B2 (en) * 2004-09-01 2012-07-11 日本電気株式会社 Video classification device, video classification program, video search device, and video search program
JP2006080803A (en) * 2004-09-08 2006-03-23 Toshiba Corp Program recording apparatus and performer list generating method
JP4591215B2 (en) * 2005-06-07 2010-12-01 株式会社日立製作所 Facial image database creation method and apparatus
JP4595750B2 (en) * 2005-08-29 2010-12-08 ソニー株式会社 Image processing apparatus and method, and program
JP4334545B2 (en) * 2006-01-31 2009-09-30 シャープ株式会社 Storage device and computer-readable recording medium
JP4337064B2 (en) * 2007-04-04 2009-09-30 ソニー株式会社 Information processing apparatus, information processing method, and program

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6661906B1 (en) * 1996-12-19 2003-12-09 Omron Corporation Image creating apparatus
US20010031073A1 (en) * 2000-03-31 2001-10-18 Johji Tajima Face recognition method, recording medium thereof and face recognition device
US6671391B1 (en) * 2000-05-26 2003-12-30 Microsoft Corp. Pose-adaptive face detection system and process
US20020135618A1 (en) * 2001-02-05 2002-09-26 International Business Machines Corporation System and method for multi-modal focus detection, referential ambiguity resolution and mood classification using multi-modal input
US20040190775A1 (en) * 2003-03-06 2004-09-30 Animetrics, Inc. Viewpoint-invariant detection and identification of a three-dimensional object from two-dimensional imagery
US20060195475A1 (en) * 2005-02-28 2006-08-31 Microsoft Corporation Automatic digital image grouping using criteria based on image metadata and spatial information
US20100149177A1 (en) * 2005-10-11 2010-06-17 Animetrics Inc. Generation of normalized 2d imagery and id systems via 2d to 3d lifting of multifeatured objects
US20070098231A1 (en) * 2005-11-02 2007-05-03 Yoshihisa Minato Face identification device

Cited By (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8107689B2 (en) * 2007-04-04 2012-01-31 Sony Corporation Apparatus, method and computer program for processing information
EP2345980A3 (en) * 2010-01-13 2014-06-04 Hitachi, Ltd. Classifier learning image production program, method, and system
US9030536B2 (en) 2010-06-04 2015-05-12 At&T Intellectual Property I, Lp Apparatus and method for presenting media content
US9774845B2 (en) 2010-06-04 2017-09-26 At&T Intellectual Property I, L.P. Apparatus and method for presenting media content
US10567742B2 (en) 2010-06-04 2020-02-18 At&T Intellectual Property I, L.P. Apparatus and method for presenting media content
US9380294B2 (en) 2010-06-04 2016-06-28 At&T Intellectual Property I, Lp Apparatus and method for presenting media content
GB2481298A (en) * 2010-06-15 2011-12-21 Apple Inc Generating, transmitting and receiving object detection metadata
GB2481298B (en) * 2010-06-15 2014-06-11 Apple Inc Object detection metadata
US8509540B2 (en) 2010-06-15 2013-08-13 Apple Inc. Object detection metadata
US9100630B2 (en) 2010-06-15 2015-08-04 Apple Inc. Object detection metadata
US8744195B2 (en) 2010-06-15 2014-06-03 Apple Inc. Object detection metadata
US9787974B2 (en) 2010-06-30 2017-10-10 At&T Intellectual Property I, L.P. Method and apparatus for delivering media content
US9781469B2 (en) 2010-07-06 2017-10-03 At&T Intellectual Property I, Lp Method and apparatus for managing a presentation of media content
US11290701B2 (en) 2010-07-07 2022-03-29 At&T Intellectual Property I, L.P. Apparatus and method for distributing three dimensional media content
US10237533B2 (en) 2010-07-07 2019-03-19 At&T Intellectual Property I, L.P. Apparatus and method for distributing three dimensional media content
US9049426B2 (en) 2010-07-07 2015-06-02 At&T Intellectual Property I, Lp Apparatus and method for distributing three dimensional media content
US9830680B2 (en) 2010-07-20 2017-11-28 At&T Intellectual Property I, L.P. Apparatus for adapting a presentation of media content according to a position of a viewing apparatus
US10070196B2 (en) 2010-07-20 2018-09-04 At&T Intellectual Property I, L.P. Apparatus for adapting a presentation of media content to a requesting device
US9560406B2 (en) 2010-07-20 2017-01-31 At&T Intellectual Property I, L.P. Method and apparatus for adapting a presentation of media content
US10602233B2 (en) 2010-07-20 2020-03-24 At&T Intellectual Property I, L.P. Apparatus for adapting a presentation of media content to a requesting device
US9232274B2 (en) 2010-07-20 2016-01-05 At&T Intellectual Property I, L.P. Apparatus for adapting a presentation of media content to a requesting device
US10489883B2 (en) 2010-07-20 2019-11-26 At&T Intellectual Property I, L.P. Apparatus for adapting a presentation of media content according to a position of a viewing apparatus
US9032470B2 (en) 2010-07-20 2015-05-12 At&T Intellectual Property I, Lp Apparatus for adapting a presentation of media content according to a position of a viewing apparatus
US9668004B2 (en) 2010-07-20 2017-05-30 At&T Intellectual Property I, L.P. Apparatus for adapting a presentation of media content to a requesting device
US8994716B2 (en) * 2010-08-02 2015-03-31 At&T Intellectual Property I, Lp Apparatus and method for providing media content
US9247228B2 (en) 2010-08-02 2016-01-26 At&T Intellectual Property I, Lp Apparatus and method for providing media content
US20120030727A1 (en) * 2010-08-02 2012-02-02 At&T Intellectual Property I, L.P. Apparatus and method for providing media content
US9352231B2 (en) 2010-08-25 2016-05-31 At&T Intellectual Property I, Lp Apparatus for controlling three-dimensional images
US9086778B2 (en) 2010-08-25 2015-07-21 At&T Intellectual Property I, Lp Apparatus for controlling three-dimensional images
US9700794B2 (en) 2010-08-25 2017-07-11 At&T Intellectual Property I, L.P. Apparatus for controlling three-dimensional images
US8947511B2 (en) 2010-10-01 2015-02-03 At&T Intellectual Property I, L.P. Apparatus and method for presenting three-dimensional media content
US9736457B2 (en) 2011-06-24 2017-08-15 At&T Intellectual Property I, L.P. Apparatus and method for providing media content
US10033964B2 (en) 2011-06-24 2018-07-24 At&T Intellectual Property I, L.P. Apparatus and method for presenting three dimensional objects with telepresence
US9681098B2 (en) 2011-06-24 2017-06-13 At&T Intellectual Property I, L.P. Apparatus and method for managing telepresence sessions
US9445046B2 (en) 2011-06-24 2016-09-13 At&T Intellectual Property I, L.P. Apparatus and method for presenting media content with telepresence
US9160968B2 (en) 2011-06-24 2015-10-13 At&T Intellectual Property I, Lp Apparatus and method for managing telepresence sessions
US9407872B2 (en) 2011-06-24 2016-08-02 At&T Intellectual Property I, Lp Apparatus and method for managing telepresence sessions
US9030522B2 (en) 2011-06-24 2015-05-12 At&T Intellectual Property I, Lp Apparatus and method for providing media content
US8947497B2 (en) 2011-06-24 2015-02-03 At&T Intellectual Property I, Lp Apparatus and method for managing telepresence sessions
US10484646B2 (en) 2011-06-24 2019-11-19 At&T Intellectual Property I, L.P. Apparatus and method for presenting three dimensional objects with telepresence
US10200651B2 (en) 2011-06-24 2019-02-05 At&T Intellectual Property I, L.P. Apparatus and method for presenting media content with telepresence
US10200669B2 (en) 2011-06-24 2019-02-05 At&T Intellectual Property I, L.P. Apparatus and method for providing media content
US9602766B2 (en) 2011-06-24 2017-03-21 At&T Intellectual Property I, L.P. Apparatus and method for presenting three dimensional objects with telepresence
US9270973B2 (en) 2011-06-24 2016-02-23 At&T Intellectual Property I, Lp Apparatus and method for providing media content
US9807344B2 (en) 2011-07-15 2017-10-31 At&T Intellectual Property I, L.P. Apparatus and method for providing media services with telepresence
US9167205B2 (en) 2011-07-15 2015-10-20 At&T Intellectual Property I, Lp Apparatus and method for providing media services with telepresence
US9414017B2 (en) 2011-07-15 2016-08-09 At&T Intellectual Property I, Lp Apparatus and method for providing media services with telepresence
US9304992B2 (en) * 2012-07-11 2016-04-05 Cellco Partnership Story element indexing and uses thereof
US20140019893A1 (en) * 2012-07-11 2014-01-16 Cellco Partnership D/B/A Verizon Wireless Story element indexing and uses thereof
CN103258012A (en) * 2013-04-16 2013-08-21 广东欧珀移动通信有限公司 Method and device for acquiring picture information
US20160006921A1 (en) * 2014-07-03 2016-01-07 Adience Ser Ltd. System and method of predicting whether a person in an image is an operator of an imager capturing the image
US9922239B2 (en) 2016-07-19 2018-03-20 Optim Corporation System, method, and program for identifying person in portrait
US10657417B2 (en) 2016-12-28 2020-05-19 Ambass Inc. Person information display apparatus, a person information display method, and a person information display program

Also Published As

Publication number Publication date
JP2008257460A (en) 2008-10-23
JP4337064B2 (en) 2009-09-30
CN101281540A (en) 2008-10-08
CN101281540B (en) 2011-05-18
US8107689B2 (en) 2012-01-31


Legal Events

Date Code Title Description
AS Assignment

Owner name: SONY CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TSUNODA, TOMOHIRO;REEL/FRAME:020644/0252

Effective date: 20080301

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20160131