WO2016026064A1 - A method and a system for estimating facial landmarks for face image - Google Patents

A method and a system for estimating facial landmarks for face image

Info

Publication number
WO2016026064A1
WO2016026064A1 · PCT/CN2014/000785 · CN2014000785W
Authority
WO
WIPO (PCT)
Prior art keywords
face image
image dataset
annotations
landmark
type
Prior art date
Application number
PCT/CN2014/000785
Other languages
French (fr)
Inventor
Xiaoou Tang
Shizhan ZHU
Cheng Li
Chen Change Loy
Original Assignee
Xiaoou Tang
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiaoou Tang
Priority to PCT/CN2014/000785 (WO2016026064A1)
Priority to CN201480082760.XA (CN107004136B)
Publication of WO2016026064A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/165 Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/755 Deformable models or variational models, e.g. snakes or active contours
    • G06V10/7553 Deformable models or variational models, e.g. snakes or active contours, based on shape, e.g. active shape models [ASM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757 Matching configurations of points or features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships


Abstract

Disclosed are a method for estimating facial landmarks for a face image, and a system for estimating facial landmarks for a face image. The method may comprise: retrieving a first face image dataset with first type landmark annotations and a second face image dataset with second type landmark annotations; transferring the first type landmark annotations from the first face image dataset to the second face image dataset to obtain pseudo first type annotations for the second face image dataset; and combining the second face image dataset with the pseudo first type landmark annotations and the first face image dataset to make the second face image dataset have the first type landmark annotations.

Description

A Method and a System for Estimating Facial Landmarks for Face Image
Technical Field
[0001] The present application relates to a method for estimating facial landmarks for a face image, and a system for estimating facial landmarks for a face image.
Background
[0002] Face alignment is a critical component of various face analyses, such as face verification and expression classification. Various benchmark datasets have been released, each containing large quantities of labeled images. Although these databases were collected with the goal of being as rich and diverse as possible, inherent bias across datasets is unavoidable in practice.
[0003] The bias manifests itself as different characteristics and distributions across datasets. For instance, one set mainly contains white Caucasian males with mostly frontal faces, while another set consists of challenging samples with various poses or severe occlusions. In addition, the proportion of profile views can differ by more than 10% across datasets. Clearly, forcing a model to train on one dataset would easily lead to over-fitting and cause poor performance on unseen domains. To improve generalization, it is of practical interest to combine different databases so as to leverage the characteristics and distributions of multiple sources. This approach, however, is hindered by annotation gaps, which require huge effort to standardize before database fusion is possible.
Summary of invention
[0004] In one aspect of the present application, there is disclosed a method for estimating facial landmarks for a face image, comprising:
retrieving a first face image dataset with first type landmark annotations and a second face image dataset with second type landmark annotations;
transferring the first type landmark annotations from the first face image dataset to the second face image dataset to obtain pseudo first type annotations for the second face image dataset; and
combining the second face image dataset with the pseudo first type landmark annotations and the first face image dataset to make the second face image dataset have the first type landmark annotations.
[0005] In another aspect of the present application, there is disclosed a system for estimating facial landmarks for a face image, comprising:
a transductive alignment device configured to retrieve a first face image dataset with first type landmark annotations and a second face image dataset with second type landmark annotations, and transfer the first type landmark annotations from the first face image dataset to the second face image dataset to obtain pseudo first type annotations for the second face image dataset; and
a data augmentation device configured to combine the second face image dataset with the pseudo first type landmark annotations and the first face image dataset to make the second face image dataset have the first type landmark annotations.
Brief Description of the Drawing
[0006] Exemplary non-limiting embodiments of the present invention are described below with reference to the attached drawings. The drawings are illustrative and generally not to an exact scale. The same or similar elements on different figures are referenced with the same reference numbers.
[0007] Fig. 1 is a schematic diagram illustrating an exemplary system 100 for transferring face landmark annotations according to one embodiment of the present application.
[0008] Fig. 2 is a schematic diagram illustrating an exemplary block diagram of the transductive alignment device 10 according to one embodiment of the present application.
[0009] Fig. 3 illustrates a flow chart of the process 300 showing how the units 101-106 cooperate to obtain pseudo S-type annotations for the new training set.
[0010] Fig. 4 is a schematic flowchart illustrating a detailed process of the transductive model training unit consistent with some disclosed embodiments of the present application.
[0011] Fig. 5 illustrates a flow chart of the process of the data augmentation device consistent with another disclosed embodiment of the present application.
[0012] Fig. 6 is a schematic diagram illustrating an exemplary system for determining face landmarks according to one embodiment of the present application.
[0013] Fig. 7 illustrates a flow chart of the process by which the training device trains the predicting device according to one embodiment of the present application.
[0014] Fig. 8 illustrates a flow chart of the detailed process of the predicting device according to one embodiment of the present application.
Detailed Description
[0015] Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When appropriate, the same reference numbers are used throughout the drawings to refer to the same or like parts.
[0016] Fig. 1 is a schematic diagram illustrating an exemplary system 100 for transferring face landmark annotations according to one embodiment of the present application. As illustrated in Fig. 1, the system 100 for transferring face landmark annotations may comprise a transductive alignment device 10 and a data augmentation device 20.
[0017] The transductive alignment device 10 is configured to retrieve a first (original) training set for a first face image with S-type landmark annotations (hereinafter also referred to as "Set 1") and a second (new) training set with T-type landmark annotations (hereinafter also referred to as "Set 2"), and to transfer the S-type landmark annotations from the original face image dataset (training dataset) to the new training dataset so as to obtain pseudo S-type annotations for the new training set. In the embodiments of the present application, the landmark annotations may comprise facial landmark points on a given face image, such as eyes, nose, and mouth corners. The data augmentation device 20 is then configured to combine the new training set with the pseudo S-type landmark annotations and the original training set into an augmented training dataset, i.e., to provide the new training set with S-type landmark annotations. According to some embodiments of the present application, S-type annotations might be denser, with a plurality of (for example, 194 or more) landmarks and even the outer face contour annotated, while T-type annotations might be sparse, with only a few (for example, 5) landmarks on the eyes and mouth corners.
[0018] The transductive alignment device 10 can predict S-type annotations on the new training dataset only when T-type annotations on the new training set are provided. The goal of the present application, however, is to predict S-type annotations for an arbitrary input face image, such that no T-type annotations are needed to predict the landmark annotations. Since more diversified training samples from the new training dataset are included, a more robust model for predicting S-type landmarks of facial images can be obtained.
[0019] In one embodiment of the present application, the transductive alignment device is further configured to determine a transductive model {M_PCA,k, M_reg,k} from a common landmark index between the first type landmark annotations and the second type landmark annotations, initial first-type annotations, and the first face image dataset; and to transfer, based on the transductive model, the first type landmark annotations from the first face image dataset to the second face image dataset to obtain pseudo first type annotations for the second face image dataset. Fig. 2 is a schematic diagram illustrating an exemplary block diagram of the transductive alignment device 10 according to one embodiment of the present application. As shown in Fig. 2, the transductive alignment device 10 may comprise a common landmarks determination unit 101, a mapping unit 102, a first annotation estimated unit 103, a transductive model training unit 104, a second annotation estimated unit 105 and a pseudo annotation determination unit 106.
[0020] Fig. 3 illustrates a flow chart of the process 300 showing how the units 101-106 cooperate to obtain pseudo S-type landmark annotations for the new training dataset.
[0021] At step S301, the common landmark determination unit 101 operates to retrieve a first training dataset {I_1, x_s, B_1} for the first face image with S-type landmark annotations (Set 1) and a second training set {I_2, x_T, B_2} with T-type landmark annotations (Set 2), wherein the first and the second training datasets include the bounding boxes B_1 and B_2 of each face in the images I_1 and I_2, respectively, where I_i represents the face images from the training image set with index i, x represents landmark locations (in x-y coordinates), and B_1 and B_2 represent the bounding boxes of the images I_1 and I_2, respectively. The common landmarks determination unit 101 then determines a plurality of common landmark indexes (x_s)_common for the two types of annotations, i.e. the S-type landmark annotations in data Set 1 and the T-type landmark annotations in data Set 2. In the embodiment, the common landmarks (x_s)_common exist across data Set 1 and data Set 2. Common landmark annotations are defined as facial landmarks that are well labeled with a decisive semantic definition across different datasets, such as left and right eye corners, mouth corners and pupil centers.
[0022] At step S302, the mapping unit 102 operates to learn a mapping matrix T from the common landmark annotations (x_s)_common to the S-type landmarks x_s in the original training dataset, i.e., Set 1. In order to learn the mapping, simple linear regression may be used, and the general learning scheme is T = (x_sc^T x_sc)^(-1) x_sc^T x_s, in which x_sc is short for (x_s)_common, and '*' in '(x_s)_common * T' means matrix multiplication, not convolution.
[0023] At step S303, the first annotation estimated unit 103 operates to compute the initial or estimated S-type annotations x on data Set 1 based on the common landmarks (x_s)_common obtained from step S301 and the mapping T obtained from step S302 by rule of
x = (x_s)_common * T.   (1)
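For illustration only, a minimal numpy sketch of steps S302-S303 follows; the function names, array shapes and the use of np.linalg.lstsq are assumptions rather than the literal implementation described in the patent.

```python
import numpy as np

# Assumed shapes: x_common is (N, 2*C) and x_s is (N, 2*S), holding the flattened
# x-y coordinates of the C common landmarks and the S S-type landmarks for the
# N images of data Set 1.

def learn_mapping(x_common, x_s):
    """Step S302: least-squares mapping T = (x_sc^T x_sc)^-1 x_sc^T x_s."""
    T, _, _, _ = np.linalg.lstsq(x_common, x_s, rcond=None)
    return T                      # shape (2*C, 2*S)

def initial_annotation(x_common, T):
    """Equation (1): x = x_common * T (matrix multiplication)."""
    return x_common @ T           # shape (N, 2*S)
```

Applying initial_annotation to the common landmarks taken from the T-type annotations of Set 2 yields the initialization of equation (2) below.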
[0024] At step S304, the transductive model training unit 104 operates to determine a transductive model from the common landmark indexes (x_s)_common from step S301, the initial S-type annotations x, and the first training dataset {I_1, x_s, B_1} with S-type landmark annotations (i.e., data Set 1), which will be discussed later in reference to Fig. 4.
[0025] At step S305, the second annotation estimated unit 105 receives the new training dataset (i.e. Set 2, with T-type annotations {I_2, x_T, B_2}) and uses the mapping T obtained from step S302 and the common landmark indexes (x_T)_common obtained from step S301 to get the initialized/estimated annotation x for the new training dataset (data Set 2) by rule of
x = (x_T)_common * T.   (2)
[0026] At step S306, for each of the K iterations, the pseudo annotation determination unit 106 operates to extract local appearance information φ(x) for data Set 2, and the feature Jacobian φ(x*) − φ(x) only for the common landmarks (x_s)_common, and then concatenates the local appearance information φ(x) and the feature Jacobian as features f by rule of
f(x) = [(φ(x*) − φ(x))_common, φ(x)_private]   (3)
where [] means matrix concatenation, and
φ(x) extracts local SIFT (Scale Invariant Feature Transform) features according to the coordinates x; SIFT is treated as a black box.
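As a hedged sketch of equation (3), the snippet below extracts SIFT descriptors at the landmark coordinates with OpenCV (SIFT treated as a black box, as stated above) and concatenates the common-landmark feature Jacobian with the appearance of the remaining (private) landmarks. The names phi, build_features and patch_size are illustrative assumptions; the ground-truth coordinates x* are only needed at the common landmarks, where they are available from the T-type annotations of Set 2.

```python
import numpy as np
import cv2  # OpenCV; SIFT is used here as a black-box local descriptor

def phi(image, coords, patch_size=32.0):
    """Extract one SIFT descriptor per (x, y) landmark coordinate."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) if image.ndim == 3 else image
    kps = [cv2.KeyPoint(float(x), float(y), patch_size) for x, y in coords]
    _, desc = cv2.SIFT_create().compute(gray, kps)
    return desc                                   # shape (num_landmarks, 128)

def build_features(image, x_est, x_common_gt, common_idx):
    """Equation (3): f = [(phi(x*) - phi(x))_common, phi(x)_private]."""
    private_idx = [i for i in range(len(x_est)) if i not in set(common_idx)]
    jacobian = phi(image, x_common_gt) - phi(image, x_est[common_idx])
    appearance = phi(image, x_est[private_idx])
    return np.concatenate([jacobian.ravel(), appearance.ravel()])
```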
[0027] The pseudo annotation determination unit 106 then operates to calculate an estimated annotation error Δx based on the transductive model by rule of:
Δx = M_reg(M_PCA(f))   (4)
where M_PCA transforms the original features into PCA (Principal Component Analysis) features, and M_reg transforms the PCA features into a regression displacement target.
[0028] The pseudo annotation determination unit 106 then updates the current estimated annotation x by rule of formula (5) and outputs x from the last iteration, i.e. the pseudo annotation {I_2, x_s, B_2}:
x = x + Δx   (5)
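Putting equations (4) and (5) together, one iteration of the transfer could be sketched as below; mean, P, W and b are assumed to be the mean vector, projection coefficients, regression coefficient and bias of the current iteration of the transductive model (see the training sketch after paragraph [0035]).

```python
def transfer_step(image, x_est, x_common_gt, common_idx, mean, P, W, b):
    """One iteration of step S306: delta_x = M_reg(M_PCA(f)); x = x + delta_x."""
    f = build_features(image, x_est, x_common_gt, common_idx)
    f_pca = (f - mean) @ P            # M_PCA: subtract the mean, then project
    delta_x = f_pca @ W + b           # M_reg: linear map to a displacement
    return x_est + delta_x.reshape(x_est.shape)
```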
[0029] Hereinafter, the detailed process for the transductive model training unit 104 will be further discussed in reference to Fig. 4.
[0030] At step S3041, the training dataset is prepared by the transductive model training unit 104. To be specific, the transductive model training unit 104 receives the first training dataset {I_1, x_s} for a first face image with S-type landmark annotations (data Set 1), prepares the following data, and then begins to train for k iterations:
1) common landmark index (x_s)_common;
2) face images I = I_1;
3) initialized/estimated annotation x;
4) ground truth annotation x* = x_s.
[0031] At step S3042, the transductive model training unit 104 operates to extract: (1) local appearance information φ(x) for data Set 1, and (2) the feature Jacobian φ(x*) − φ(x) only for the common landmarks (x_s)_common, and then concatenates these two parts (1) and (2) as features f by rule of formula (3) stated above.
[0032] At step S3043, the transductive model training unit 104 computes the dissimilarity between the estimated current shape x and the ground truth shape x* by rule of Δx = x* − x.
[0033] At step S3044, the transductive model training unit 104 gets a PCA projection model M_PCA by performing PCA analysis on the features f, and gets a mapping M_reg via ridge regression from the PCA-projected features to the dissimilarity. In one embodiment of the present application, for the purpose of training, principal component analysis (PCA) is conducted using singular value decomposition, which outputs a PCA projection model M_PCA containing a mean vector and projection coefficients. In the testing stage, the PCA-projected features are obtained by first subtracting the mean vector from the original features, and then performing matrix multiplication with the projection coefficients. Ridge regression yields a mapping function containing a coefficient and a bias, which will be used to obtain Δx as shown in Equation (4).
[0034] At step S3045, the transductive model training unit 104 operates to determine whether the estimated shape converges to the ground truth shape. If yes, at step S3046, the transductive model training unit 104 determines the transductive model M_T (containing the PCA (principal component analysis) projection model and the mapping function for each iteration) by rule of
M_T = {M_PCA,k, M_reg,k}, ∀k = 1, 2, ....   (6)
[0035] Otherwise, at step S3047, the estimated annotation is updated as x = x + M_reg(M_PCA(f)) and fed back to step S3041.
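A minimal numpy sketch of the PCA and ridge-regression fitting described in paragraphs [0033]-[0035] is given below; fit_pca, fit_ridge, n_components and lam are illustrative assumptions. The transductive model M_T of equation (6) is then simply the list of the (mean, P) and (W, b) pairs collected over the k iterations, with the estimate updated by x = x + M_reg(M_PCA(f)) between iterations.

```python
import numpy as np

def fit_pca(F, n_components):
    """M_PCA: PCA via SVD, returning the mean vector and projection coefficients."""
    mean = F.mean(axis=0)
    _, _, Vt = np.linalg.svd(F - mean, full_matrices=False)
    P = Vt[:n_components].T                      # projection coefficients
    return mean, P

def fit_ridge(F_pca, delta_x, lam=1.0):
    """M_reg: ridge regression from PCA-projected features to the dissimilarity
    delta_x = x* - x, returning a coefficient W and a bias b."""
    f_mean, y_mean = F_pca.mean(axis=0), delta_x.mean(axis=0)
    Fc, Yc = F_pca - f_mean, delta_x - y_mean
    W = np.linalg.solve(Fc.T @ Fc + lam * np.eye(Fc.shape[1]), Fc.T @ Yc)
    b = y_mean - f_mean @ W
    return W, b
```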
[0036] Hereinafter, the data augmentation device 20 will be discussed in detail. As mentioned, the data augmentation device 20 is configured to combine the new training set with the pseudo S-type landmark annotations and the original training set into an augmented training dataset. The S-type landmark annotations for the new training set might be inaccurate, which is why they are called "pseudo S-type annotations", and the subsequent data augmentation process is thus necessary to remove the error introduced by the pseudo S-type annotations.
[0037] Fig. 5 illustrates a flow chart 500 of the process of the data augmentation device 20. In particular, at step S501, the data augmentation device 20 operates to filter erroneously transferred annotations from the pseudo S-type landmark annotations in the new training dataset by comparing the estimated common landmarks x_s and the ground truth common landmarks, so as to get a cleaned training set {I_2', x_s', B_2'}. At step S502, the data augmentation device 20 receives the original training set (data Set 1, with S-type landmark annotations {I_1, x_s, B_1}) and then combines the cleaned new training set with the original training set to obtain {I_A, x_s, B}.
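A sketch of the filtering and merging of Fig. 5 follows; the per-sample dictionary keys, the (up, down, left, right) bounding-box format and the 0.1 threshold are assumptions made only for illustration.

```python
import numpy as np

def augment(set1, set2, threshold=0.1):
    """Steps S501-S502: drop Set-2 samples whose transferred common landmarks
    deviate too much from the ground-truth common landmarks (error normalized
    by the bounding-box size), then merge the cleaned set with Set 1."""
    cleaned = []
    for sample in set2:
        err = np.linalg.norm(sample["pseudo_common"] - sample["gt_common"], axis=1)
        up, down, left, right = sample["box"]
        if err.mean() / max(down - up, right - left) < threshold:
            cleaned.append(sample)
    return list(set1) + cleaned        # the augmented training set {I_A, x_s, B}
```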
[0038] Fig. 6 is a schematic diagram illustrating an exemplary system 1000 for determining face landmarks according to one embodiment of the present application. As shown in Fig. 6, besides the transductive alignment device 10 and the data augmentation device 20, the system 1000 may further comprise a training device 30 and a predicting device 40. The operations of the transductive alignment device 10 and the data augmentation device 20 in the system 1000 are the same as those in the system 100, and thus a detailed description thereof is omitted hereinafter.
[0039] The combined dataset generated by the data augmentation device 20 may be treated as the predetermined training set for the training device 30 to train the predicting device 40.
[0040] Fig. 7 illustrates a flow chart 700 of the process by which the training device 30 trains the predicting device 40. At step S701, the training device 30 receives from the data augmentation device 20 the augmented training set with bounding boxes of images {I_A, x_s, B} and then learns an initializing function init(B) that estimates the relation between the initial landmarks and the bounding box B, so as to get initialized landmarks x according to the bounding box B and the learned init(B). The function init may be determined intuitively. For example, it may generate initial landmarks relative to the bounding box: to locate the initial left eye center, the relative position is averaged over all training samples, and it may be found that the left eye center lies at 0.25 of the box height from the top and 0.3 of the box width from the left. If the bounding box of a testing sample is up: 100, down: 200, left: 500, right: 600, then the initial coordinates of the left eye center would be x = 530, y = 125. The present application always uses 0.25 and 0.3 for all samples with respect to the left eye center, and other landmarks are handled in the same way.
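The initializing function init(B) could be sketched as below, assuming the bounding box is given as (up, down, left, right); learn_init and init_landmarks are hypothetical names. The worked example from the paragraph above (offsets 0.3 and 0.25, box up=100, down=200, left=500, right=600) indeed gives x = 500 + 0.3·100 = 530 and y = 100 + 0.25·100 = 125.

```python
import numpy as np

def learn_init(boxes, landmarks):
    """Average each landmark's position relative to its bounding box.
    boxes: list of (up, down, left, right); landmarks: list of (L, 2) arrays."""
    rel = []
    for (up, down, left, right), pts in zip(boxes, landmarks):
        h, w = down - up, right - left
        rel.append(np.column_stack([(pts[:, 0] - left) / w, (pts[:, 1] - up) / h]))
    return np.mean(rel, axis=0)                  # learned relative offsets, (L, 2)

def init_landmarks(box, rel):
    """init(B): place the averaged relative offsets inside a new bounding box."""
    up, down, left, right = box
    h, w = down - up, right - left
    return np.column_stack([left + rel[:, 0] * w, up + rel[:, 1] * h])
```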
[0041] At step S702, the training dataset is prepared. To be specific, the training device 30 receives the first training set {I_1, x_s} for the first face image with S-type landmark annotations (data Set 1), prepares the following data, and then begins to train for k iterations:
face images I = I_A,
initialized/estimated annotation x,
ground truth annotation x* = x_s.
[0042] At step S703, the training device 30 operates to extract local appearance information φ(x) for the augmented training set {I_A, x_s, B} and represents the extracted local appearance information as features f.
[0043] At step S704, the training device 30 operates to compute the dissimilarity Δx between the estimated current shape x and the ground truth shape x* by rule of Δx = x* − x.
[0044] At step S705, the training device 30 gets a PCA (principal component analysis) projection model M_PCA,k by performing PCA analysis on the features f, and gets a mapping M_reg,k via ridge regression from the PCA-projected features to the dissimilarity.
[0045] At step S706, the training device 30 operates to determine whether the estimated shape converges to the ground truth shape. If yes, at step S707, the training device 30 determines a model M = {M_PCA,k, M_reg,k}, ∀k = 1, 2, .... (containing the PCA projection model and the mapping function for each iteration).
[0046] Otherwise, at step S708, the estimated annotation is updated as x = x + M_reg(M_PCA(f)) and fed back to step S702 to repeat steps S703-S708, so as to obtain a robust trained model M and the initializing function init(B).
[0047] Referring to Fig. 6 again, the predicting device 40 is configured to receive a face image with a pre-detected bounding box B and predict the facial landmark positions, i.e. the estimated 2D coordinates (x and y) of the facial landmarks of the received face image. The detailed process of the predicting device 40 will be further discussed with reference to Fig. 8.
[0048] At step S801, the predicting device 40 gets the initializing function init(B) from the training device 30 and gets initialized landmarks x according to the bounding box B and init(B) for the received face image. At step S802, the predicting device 40 gets the robust trained model M from the training device 30, and then, for each iteration, the predicting device 40 computes the local appearance information φ(x) as features f and calculates the estimated Δx by rule of Δx = M_reg(M_PCA(f)). The predicting device 40 then updates the landmarks x by rule of x = x + Δx. Finally, the predicting device 40 outputs x from the last of the K iterations.
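The prediction stage of Fig. 8 could then be sketched as follows, reusing the phi, init_landmarks, fit_pca and fit_ridge sketches above; model is assumed to be the per-iteration list of ((mean, P), (W, b)) pairs produced during training. Unlike the transfer stage, only φ(x) is used as the feature here, since no ground-truth landmarks are available at test time.

```python
def predict_landmarks(image, box, rel, model):
    """Steps S801-S802: initialize from the bounding box, then run the cascade."""
    x = init_landmarks(box, rel)                 # step S801: x = init(B)
    for (mean, P), (W, b) in model:              # step S802: K iterations
        f = phi(image, x).ravel()                # features f = phi(x)
        delta_x = ((f - mean) @ P) @ W + b       # delta_x = M_reg(M_PCA(f))
        x = x + delta_x.reshape(x.shape)         # x = x + delta_x
    return x                                     # estimated 2-D landmark coordinates
```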
[0049] In the above, the systems 10 and 100 have been discussed in the case where they are implemented using certain hardware or a combination of hardware and software. It shall be appreciated that the systems 10 and 100 may also be implemented using software. In addition, the embodiments of the present invention may be adapted to a computer program product embodied on one or more computer readable storage media (comprising but not limited to disk storage, CD-ROM, optical memory and the like) containing computer program codes.
[0050] In the case that the systems 10 and 100 are implemented with software, these systems may run on a general purpose computer, a computer cluster, a mainstream computer, a computing device dedicated for providing online contents, or a computer network comprising a group of computers operating in a centralized or distributed fashion.
[0051] Although the preferred examples of the present invention have been described, those skilled in the art can make variations or modifications to these examples upon knowing the basic inventive concept. The appended claims are intended to be construed as comprising the preferred examples and all variations or modifications falling within the scope of the present invention.
[0052] Obviously, those skilled in the art can make variations or modifications to the present invention without departing from the spirit and scope of the present invention. As such, if these variations or modifications belong to the scope of the claims and the equivalent technique, they may also fall within the scope of the present invention.

Claims

What is claimed is:
1. A method for estimating facial landmarks for a face image, comprising:
retrieving a first face image dataset with first type landmark annotations and a second face image dataset with second type landmark annotations;
transferring the first type landmark annotations from the first face image dataset to the second face image dataset to obtain pseudo first type annotations for the second face image dataset; and
combining the second face image dataset with the pseudo first type landmark annotations and the first face image dataset to make the second face image dataset have the first type landmark annotations.
2. A method according to claim 1, wherein the first type landmark annotations comprise S-type landmark annotations; and the second type landmark annotations comprise T-type landmark annotations.
3. A method according to claim 1, wherein the transferring further comprises: determining a transductive model from a common landmark index between the first type landmark annotations and the second type landmark annotations, initial first-type annotations, and the first face image dataset; and
transferring, based on the transductive model, the first type landmark annotations from the first face image dataset to the second face image dataset to obtain pseudo first type landmark annotations for the second face image dataset.
4. A method according to claim 3, wherein the determining further comprises:
1) determining a plurality of common landmark indexes for the first type landmark annotations and the second type landmark annotations;
2) learning a mapping matrix from the determined common landmark indexes (x_s)_common to the first type landmark annotations;
3) determining initial/estimated first-type annotations for the second face image dataset based on the common landmark indexes and the mapping matrix; and
4) determining the transductive model {M_PCA,k, M_reg,k} from the common landmark indexes, the initial first type annotations, and the first face image dataset.
5. A method according to claim 4, wherein the transferring further comprises:
5) determining an estimated annotation x for the second face image dataset from the mapping matrix and the common landmark indexes;
6) determining an estimated error Δx based on the transductive model, local appearance information φ(x) for the first face image dataset, and the feature Jacobian φ(x*) − φ(x) for the common landmark indexes (x_s)_common; and
7) updating the current estimated annotation x by rule of x = x + Δx so as to obtain the pseudo landmark annotations,
where x* represents a ground truth annotation for x, and
B_1 and B_2 represent a bounding box of an image for the first face image dataset and the second face image dataset, respectively.
6. A method according to claim 5, wherein the step 6) further comprises:
extracting local appearance information φ(x) for the first face image dataset, and the feature Jacobian φ(x*) − φ(x) for the common landmark indexes (x_s)_common;
concatenating the local appearance information and the feature Jacobian; and
determining, from the concatenation of the local appearance information and the feature Jacobian, an estimated error Δx based on the transductive model.
7. A method according to claim 5, wherein the step 4) further comprises:
a) extracting local appearance information for the first face image dataset, and the feature Jacobian for the common landmark indexes;
b) concatenating the local appearance information and the feature Jacobian;
c) computing a dissimilarity Δx between an estimated current shape x and a ground truth shape x*;
d) getting a PCA projection model M_PCA by performing PCA analysis on features f, where f represents the concatenation of the local appearance information and the feature Jacobian;
e) getting a mapping model M_reg via ridge regression from the PCA-projected features to the dissimilarity; and
f) determining whether the estimated shape converges to the ground truth shape;
if yes, determining the transductive model {M_PCA, M_reg};
otherwise, updating the estimated annotation by rule of x = x + M_reg(M_PCA(f)) and then repeating the above steps a)-f) with the updated annotation.
8. A method according to claim 1, wherein the combining further comprises:
comparing the estimated common landmark indexes x_s and the ground truth common landmark indexes to identify erroneously transferred annotations among the pseudo first type landmark annotations in the second face image dataset;
filtering out the erroneously transferred annotations so as to get a cleaned face image dataset {I_2', x_s', B_2'};
receiving the first face image dataset {I_1, x_s, B_1}; and
combining the cleaned new face image dataset with the first face image dataset to obtain an augmented face image dataset {I_A, x_s, B}.
9. A method according to claim 8, further comprising:
receiving the augmented face image dataset with bounding boxes of images {I_A, x_s, B}, where B represents a bounding box of an image in the augmented face image dataset, x_s represents landmark annotations and I_A represents the face images of the augmented face image dataset, and estimating the relation between initial landmarks and the bounding box B, so as to get initialized landmarks x according to the bounding box B.
10. A method according to claim 9, further comprising:
receiving the first face image dataset {I_1, x_s}, preparing the following data, and then beginning to train for k iterations:
face images I = I_A,
initialized/estimated annotation x,
ground truth annotation x* = x_s,
extracting local appearance information φ(x) for the augmented face image dataset {I_A, x_s, B} and representing the extracted local appearance information as features f;
computing a dissimilarity Δx between the estimated current shape x and the ground truth shape x*;
determining a PCA projection model M_PCA,k by performing PCA analysis on the features f;
determining a mapping M_reg,k via ridge regression from the PCA-projected features to the dissimilarity;
determining whether the estimated shape converges to the ground truth shape;
if yes, determining a model M = {M_PCA,k, M_reg,k}, ∀k = 1, 2, ....;
otherwise, updating the estimated annotation as x = x + M_reg(M_PCA(f)) and repeating the above steps so as to obtain a robust trained model M.
11. A method according to claim 10, further comprising:
receiving a face image with a pre-detected bounding box B; and
predicting facial landmarks positions of facial landmarks of the received face image.
12. A method according to claim 11, wherein the predicting further comprises: getting initialized landmarks x according to the bounding box B for the received face image;
computing local appearance information for the received face image;
calculating an estimated error Δx by rule of Δx = M_reg(M_PCA(f)), where f represents the local appearance information; and
updating the landmarks x by rule of x = x + Δx.
13. A system for estimating facial landmarks for a face image, comprising:
a transductive alignment device configured to retrieve a first face image dataset with first type landmark annotations and a second face image dataset with second type landmark annotations, and transfer the first type landmark annotations from the first face image dataset to the second face image dataset to obtain pseudo first type annotations for the second face image dataset; and
a data augmentation device configured to combine the second face image dataset with the pseudo first type landmark annotations and the first face image dataset to make the second face image dataset have the first type landmark annotations.
14. A system according to claim 13, wherein the first type landmark annotations comprise S-type landmark annotations; and the second type landmark annotations comprise T-Type landmark annotations.
15. A system according to claim 13, wherein the transductive alignment device is further configured to determine a transductive model from common landmark indexes between the first type landmark annotations and the second type landmark annotations, initial first-type annotations, and the first face image dataset, and to transfer, based on the transductive model, the first type landmark annotations from the first face image dataset to the second face image dataset to obtain pseudo first type landmark annotations for the second face image dataset.
16. A system according to claim 13, wherein the transductive alignment device further comprises:
a common landmarks determination unit configured to determine a plurality of common landmark indexes for the first type landmark annotations and the second type landmark annotations;
a mapping unit configured to learn a mapping matrix from the determined common landmark indexes to the first type landmark annotations;
a first annotation estimated unit configured to determine initial/estimated first-type annotations for the second face image dataset based on the common landmark indexes and the mapping matrix;
a transductive model training unit configured to determine the transductive model from the common landmark indexes, the initial first type annotations, and the first face image dataset.
17. A system according to claim 16, wherein the transductive alignment device further comprises:
a second annotation estimated unit configured to determine an estimated annotation x for the second face image dataset from the mapping matrix and the common landmark indexes;
a pseudo annotation determination unit configured to determine an estimated error Δx based on the transductive model, local appearance information φ(x) for the first face image dataset, and the feature Jacobian φ(x*) − φ(x) for the common landmark indexes, and then to update the current estimated annotation x by rule of x = x + Δx so as to obtain the pseudo annotations,
where x* represents a ground truth annotation for x, and B_1 and B_2 represent a bounding box of an image for the first face image dataset and the second face image dataset, respectively.
18. A system according to claim 17, wherein the pseudo annotation determination unit is further configured to determine the estimated error Δx by:
extracting local appearance information φ(x) for the first face image dataset, and the feature Jacobian φ(x*) − φ(x) for the common landmark indexes (x_s)_common;
concatenating the local appearance information and the feature Jacobian; and
determining, from the concatenation of the local appearance information and the feature Jacobian, an estimated error Δx based on the transductive model.
19. A system according to claim 17, wherein the pseudo annotation determination unit is further configured to obtain the pseudo annotation by:
a) extracting local appearance information for the first face image dataset, and the feature Jacobian for the common landmark indexes;
b) concatenating the local appearance information and the feature Jacobian;
c) computing a dissimilarity Δx between an estimated current shape x and a ground truth shape x*;
d) getting a PCA projection model M_PCA,k by performing PCA analysis on features f, where f represents the concatenation of the local appearance information and the feature Jacobian;
e) getting a mapping model M_reg,k via ridge regression from the PCA-projected features to the dissimilarity; and
f) determining whether the estimated shape converges to the ground truth shape;
if yes, determining the transductive model {M_PCA,k, M_reg,k}, ∀k = 1, 2, ....;
otherwise, updating the estimated annotation by rule of x = x + M_reg(M_PCA(f)) and then repeating the above steps a)-f) with the updated annotation.
20. A system according to claim 13, wherein the data augmentation device is further configured to:
compare the estimated common landmark indexes x_s and the ground truth common landmark indexes to identify the erroneously transferred annotations among the pseudo first type landmark annotations in the second face image dataset;
filter out the erroneously transferred annotations so as to get a cleaned face image dataset {I_2', x_s', B_2'};
receive the first face image dataset {I_1, x_s, B_1}; and
combine the cleaned new face image dataset with the first face image dataset to obtain an augmented face image dataset {I_A, x_s, B}.
21. A system according to claim 20, further comprising:
a training device configured to receive the augmented face image dataset with bounding boxes of images {I_A, x_s, B}, where B represents a bounding box of an image in the augmented face image dataset, x_s represents landmark annotations and I_A represents the face images of the augmented face image dataset, and
wherein the training device estimates the relation between initial landmarks and the bounding box B, so as to get initialized landmarks x according to the bounding box B.
22. A system according to claim 21, wherein the training device is further configured to train a robust trained model by:
receiving the first face image dataset {I_1, x_s}, preparing the following data, and then beginning to train for k iterations:
face images I = I_A,
initialized/estimated annotation x,
ground truth annotation x* = x_s,
extracting local appearance information φ(x) for the augmented face image dataset {I_A, x_s, B} and representing the extracted local appearance information as features f;
computing a dissimilarity Δx between the estimated current shape x and the ground truth shape x*;
determining a PCA projection model M_PCA,k by performing PCA analysis on the features f;
determining a mapping M_reg,k via ridge regression from the PCA-projected features to the dissimilarity;
determining whether the estimated shape converges to the ground truth shape;
if yes, determining a model M = {M_PCA,k, M_reg,k}, ∀k = 1, 2, ....;
otherwise, updating the estimated annotation as x = x + M_reg(M_PCA(f)) and repeating the above steps so as to obtain a robust trained model.
23. A system according to claim 21, further comprising:
a predicting device configured to receive a face image with a pre-detected bounding box B, and to predict the facial landmark positions of the received face image.
24. A system according to claim 23, wherein the predicting device is further configured to predict the facial landmark positions by:
getting initialized landmarks x according to the bounding box B and init(B) for the received face image;
computing local appearance information for the received face image;
calculating an estimated error Δx by rule of Δx = M_reg(M_PCA(f)), where f represents the local appearance information; and
updating the landmarks x by rule of x = x + Δx.
PCT/CN2014/000785 2014-08-20 2014-08-20 A method and a system for estimating facial landmarks for face image WO2016026064A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2014/000785 WO2016026064A1 (en) 2014-08-20 2014-08-20 A method and a system for estimating facial landmarks for face image
CN201480082760.XA CN107004136B (en) 2014-08-20 2014-08-20 Method and system for the face key point for estimating facial image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/000785 WO2016026064A1 (en) 2014-08-20 2014-08-20 A method and a system for estimating facial landmarks for face image

Publications (1)

Publication Number Publication Date
WO2016026064A1 true WO2016026064A1 (en) 2016-02-25

Family

ID=55350057

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/000785 WO2016026064A1 (en) 2014-08-20 2014-08-20 A method and a system for estimating facial landmarks for face image

Country Status (2)

Country Link
CN (1) CN107004136B (en)
WO (1) WO2016026064A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109858382A (en) * 2019-01-04 2019-06-07 广东智媒云图科技股份有限公司 A method of portrait is drawn according to dictation
WO2021246821A1 (en) * 2020-06-05 2021-12-09 주식회사 픽스트리 Method and device for improving facial image
JP7445785B2 (en) 2020-07-24 2024-03-07 深▲チェン▼市富途網絡科技有限公司 Information processing methods, devices, electronic devices and storage media

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113192162B (en) * 2021-04-22 2022-12-02 清华珠三角研究院 Method, system, device and storage medium for driving image by voice

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1701339A (en) * 2002-09-19 2005-11-23 汤晓鸥 Portrait-photo recognition
US20060008149A1 (en) * 2004-07-12 2006-01-12 The Board Of Trustees Of The University Of Illinois Method of performing shape localization
CN103268623A (en) * 2013-06-18 2013-08-28 西安电子科技大学 Static human face expression synthesizing method based on frequency domain analysis
US20130287294A1 (en) * 2012-04-30 2013-10-31 Cywee Group Limited Methods for Generating Personalized 3D Models Using 2D Images and Generic 3D Models, and Related Personalized 3D Model Generating System

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009067560A1 (en) * 2007-11-20 2009-05-28 Big Stage Entertainment, Inc. Systems and methods for generating 3d head models and for using the same
CN102436668A (en) * 2011-09-05 2012-05-02 上海大学 Automatic Beijing Opera facial mask making-up method
US8977012B2 (en) * 2012-10-31 2015-03-10 Google Inc. Image denoising system and method
US20140185924A1 (en) * 2012-12-27 2014-07-03 Microsoft Corporation Face Alignment by Explicit Shape Regression
CN103390282B (en) * 2013-07-30 2016-04-13 百度在线网络技术(北京)有限公司 Image labeling method and device thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1701339A (en) * 2002-09-19 2005-11-23 汤晓鸥 Portrait-photo recognition
US20060008149A1 (en) * 2004-07-12 2006-01-12 The Board Of Trustees Of The University Of Illinois Method of performing shape localization
US20130287294A1 (en) * 2012-04-30 2013-10-31 Cywee Group Limited Methods for Generating Personalized 3D Models Using 2D Images and Generic 3D Models, and Related Personalized 3D Model Generating System
CN103268623A (en) * 2013-06-18 2013-08-28 西安电子科技大学 Static human face expression synthesizing method based on frequency domain analysis

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109858382A (en) * 2019-01-04 2019-06-07 广东智媒云图科技股份有限公司 A method of portrait is drawn according to dictation
WO2021246821A1 (en) * 2020-06-05 2021-12-09 주식회사 픽스트리 Method and device for improving facial image
JP7445785B2 (en) 2020-07-24 2024-03-07 深▲チェン▼市富途網絡科技有限公司 Information processing methods, devices, electronic devices and storage media

Also Published As

Publication number Publication date
CN107004136A (en) 2017-08-01
CN107004136B (en) 2018-04-17

Similar Documents

Publication Publication Date Title
CN110969250B (en) Neural network training method and device
CN108027878B (en) Method for face alignment
CN113449857B (en) Data processing method and data processing equipment
JP7360497B2 (en) Cross-modal feature extraction method, extraction device, and program
US20190301861A1 (en) Method and apparatus for binocular ranging
WO2016026063A1 (en) A method and a system for facial landmark detection based on multi-task
WO2016026135A1 (en) Face alignment with shape regression
JPWO2005119507A1 (en) High-speed and high-precision singular value decomposition method, program and apparatus for matrix
CN104679818A (en) Video keyframe extracting method and video keyframe extracting system
Conroy et al. Fast, exact model selection and permutation testing for l2-regularized logistic regression
WO2016026064A1 (en) A method and a system for estimating facial landmarks for face image
Steedly et al. Spectral Partitioning for Structure from Motion.
WO2023151237A1 (en) Face pose estimation method and apparatus, electronic device, and storage medium
CN112767230A (en) GPU graph neural network optimization method and device
Kumar et al. Modeling latent variable uncertainty for loss-based learning
CN114913330B (en) Point cloud component segmentation method and device, electronic equipment and storage medium
WO2015035593A1 (en) Information extraction
CN104463864B (en) Multistage parallel key frame cloud extracting method and system
WO2012091539A1 (en) A semantic similarity matching system and a method thereof
CN112650869B (en) Image retrieval reordering method and device, electronic equipment and storage medium
Jo et al. Ransac versus cs-ransac
CN113935387A (en) Text similarity determination method and device and computer readable storage medium
JP5754306B2 (en) Image identification information addition program and image identification information addition device
Qin et al. Larger receptive field based RGB visual relocalization method using convolutional network
CN106157289B (en) Line detecting method and equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14900018

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14900018

Country of ref document: EP

Kind code of ref document: A1