US6347156B1 - Device, method and storage medium for recognizing a document image - Google Patents

Device, method and storage medium for recognizing a document image

Info

Publication number
US6347156B1
US6347156B1 (application US09/216,712)
Authority
US
United States
Prior art keywords
image
gray
subpixel
value
threshold value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US09/216,712
Inventor
Hiroshi Kamada
Katsuhito Fujimoto
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Assigned to FUJITSU LIMITED. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: FUJIMOTO, KATSUHITO; KAMADA, HIROSHI
Application granted granted Critical
Publication of US6347156B1 publication Critical patent/US6347156B1/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N 1/40 Picture signal circuits
    • H04N 1/40068 Modification of image resolution, i.e. determining the values of picture elements at new relative positions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/16 Image preprocessing
    • G06V 30/162 Quantising the image signal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N 1/40 Picture signal circuits
    • H04N 1/40012 Conversion of colour to monochrome
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition

Definitions

  • the present invention relates to an image recognizing device.
  • the document image recognizing device is a device which uses a document image as an input, and performs a coding process by recognizing characters, etc. included in the document image.
  • the binary document image is a document image where image data of a character is represented, for example, by “1”, while the image data of the background is represented by “0”.
  • an image filing device has become popular, and also the demand for inputting an image filed by the image filing device to a document image recognizing device, and for recognizing the image has been increasing. Especially, the number of gray-scale or color documents including photographs has been growing. Therefore, the demand for recognizing not only binary documents but also gray-scale or color documents has been on the rise.
  • A color document image recognizing device also recognizes a color or gray-scale document image.
  • If an input document image is not a binary image but a gray-scale or color image, a conventional color document image recognizing device obtains a binary image by binarizing each brightness component with a predetermined threshold value, and recognizes the obtained binary image.
  • FIGS. 1A and 1B respectively show the configuration of the conventional color document image recognizing device and an enlarged color text image.
  • a document image inputting unit 170 is a unit for inputting a document image, and is typically implemented as a scanner, etc.
  • For a color document, parameters such as a color parameter, a brightness parameter, etc. are assigned to the respective pixels by illuminating the document, receiving the reflected light, and analyzing the received light. If the document to be scanned is a gray-scale document, the light reflected from the document is analyzed, the information about the level of brightness is obtained, and this information is assigned to each pixel. At this time, all of the pixels of the gray-scale document are detected to be black-and-white, and black-and-white is set as the color parameter of each of the pixels.
  • A brightness image extracting unit 171 extracts a brightness component for each pixel, and outputs a brightness image, which is a gray-scale image, to a predetermined threshold value binarizing unit. If the input image is a gray-scale image, the color parameters of all of the pixels are set to black-and-white. Therefore, the gray-scale image resulting from the process of the brightness image extracting unit 171 will in principle have the same brightness data as the input gray-scale image, because only the parameters related to hue are removed.
  • a predetermined threshold value binarizing unit 172 obtains a binary image by binarizing the gray-scale image with a predetermined threshold value.
  • This threshold value is a value which is externally determined and input.
  • Here, a gray-scale image fundamentally means not the image whose color parameters are set to black-and-white as scanned by the document image inputting unit 170 , but the brightness image resulting from the process performed by the brightness image extracting unit 171 . That is, the gray-scale image is defined to have not color but brightness parameters. Even if a gray-scale image has color parameters, only the brightness parameters are substantially valid for recognizing the image when the color parameters are set to black-and-white for all of the pixels. Accordingly, image recognition can also be made by using such a gray-scale image.
  • a binary image recognizing unit 173 recognizes a binary image. That is, this unit recognizes characters by obtaining the features of the document image which is binarized by the predetermined threshold value binarizing unit 172 , and replaces the characters with the codes which are internally used by a computer and correspond to the recognized characters.
  • a recognition result outputting unit 174 outputs the result of the character recognition made by the binary image recognizing unit 173 , that is, the file which is restructured into a code sequence representing the characters of a document image.
  • the conventional color document image recognizing device has a disadvantage that recognition accuracy is low.
  • Typically, a color document image is input to a color document image recognizing device at a resolution lower than that of a binary image, in order to reduce the processing time of an image input device such as a scanner, etc., and the capacity of the memory used for filing the image.
  • FIG. 1B illustrates an enlarged low-resolution color text image.
  • This figure shows, in monochrome, a text image that was scanned in full color at 150 dpi.
  • On a color display, many colors can be identified on the periphery of the characters, and it is difficult to identify an area by extracting pixels of the same color.
  • The conventional document image recognizing device is fundamentally designed to take as an input a binary image with a small amount of data, and assumes a document image of a standard resolution of approximately 400 dpi. Accordingly, if a color document image of 150 dpi or less is input, the conventional device converts the document image into a binary image with a resolution equal to or less than 150 dpi, and recognizes the converted image. Therefore, the device cannot recognize the image with sufficient accuracy.
  • An object of the present invention is to provide a high-speed document image recognizing device which implements high recognition accuracy.
  • An image recognizing device comprises: an image converting unit for converting an input document image into a gray-scale image if the input document image is a color image, and for outputting a gray-scale image unchanged if the input document image is already a gray-scale image; a variable resolution binarizing unit for converting the gray-scale image into a binary image with a higher resolution according to the resolution of the gray-scale image; and a unit for recognizing the binarized image.
  • An image recognizing method comprises the steps of: (a) converting an input document image into a gray-scale image if the input document image is a color image, and outputting a gray-scale image unchanged if the input document image is already a gray-scale image; (b) converting the gray-scale image into a binary image with a higher resolution according to the resolution of the gray-scale image; and (c) recognizing the binarized image.
  • a gray-scale image obtained by converting a color document image or an input gray-scale image is converted into the image data having a suitably higher resolution according to the resolution of the gray-scale image, thereby binarizing the image without losing the information about brightness levels of a gray scale. Therefore, characters appearing in a binary image can be prevented from being defaced, thereby implementing character recognition with higher accuracy.
  • a drawing area including characters, etc. is roughly extracted from a document image, and a binarization process according to the present invention is performed only for the extracted area, thereby improving the processing speed.
  • FIG. 1A shows the configuration of a conventional color document image recognizing device
  • FIG. 1B shows an enlarged color text image
  • FIG. 2 is a block diagram showing the principle of a color document image recognizing device according to a preferred embodiment
  • FIGS. 3A and 3B exemplify the configurations of a variable resolution binarizing unit
  • FIG. 4 is a schematic diagram explaining the principle of the process for generating subpixels
  • FIG. 5 exemplifies the table for specifying the number and the resolution of subpixels to be generated according to the resolution of an input gray-scale image
  • FIG. 6 exemplifies an additional configuration of the variable resolution binarizing unit
  • FIG. 7 is a schematic diagram explaining the method for setting a global threshold value used for a rough extraction process
  • FIGS. 8A and 8B exemplify the further configurations of the variable resolution binarizing unit
  • FIG. 9 exemplifies the details of a first configuration of a locally binarizing unit
  • FIG. 10 exemplifies the details of a second configuration of the locally binarizing unit
  • FIG. 11 exemplifies a third configuration of the locally binarizing unit
  • FIG. 12 exemplifies a fourth configuration of the locally binarizing unit
  • FIG. 13 exemplifies a fifth configuration of the locally binarizing unit
  • FIG. 14 is a schematic diagram explaining the method for calculating a local average value and a local square average value from original pixel values
  • FIG. 15 exemplifies a sixth configuration of the locally binarizing unit
  • FIG. 16 exemplifies a seventh configuration of the locally binarizing unit
  • FIG. 17 is a schematic diagram exemplifying the processing up to the process for binarizing a color or gray-scale image with the process according to the preferred embodiment (No. 1);
  • FIG. 18 is a schematic diagram exemplifying the processing up to the process for binarizing a color or gray-scale image with the process according to the preferred embodiment (No. 2);
  • FIG. 19 is a block diagram explaining the configuration of the hardware required for implementing the preferred embodiment as software.
  • an input image is converted into a binary image with a higher resolution according to the resolution of the input image in order to overcome the above described problems of the conventional device.
  • an input document image is a color image, it is converted into a gray-scale image and further converted into a binary image with the resolution according to that of the input image.
  • the converted binary image is recognized and the characters are converted into electronic codes.
  • To this end, a method is considered in which a binary image is obtained after performing a subpixel generation process, which increases the number of pixels included in a gray-scale image by interpolating the values of the pixels included in the gray-scale image.
  • As a specific method of the subpixel generation process, linear interpolation between pixel values can be cited.
  • Additionally, the method of extracting a character portion and its periphery from an entire image, suitably generating a threshold value within the extracted partial image, and performing character recognition can be cited as a method for improving the recognition ratio of a color or gray-scale image.
  • In this case, the result of the binarization process shapes a character clearly by suppressing the light and shade of brighter areas in the background, that is, noise components, rather than by preventing the information of brightness components from being lost.
  • FIG. 2 is a block diagram showing the principle of a color document image recognizing device according to a preferred embodiment.
  • a document image inputting unit 10 is a unit corresponding to a scanner in a similar manner as in the above described conventional technique, and is intended to illuminate a color or gray-scale document (including parameters related to hue), and to capture the document as an image.
  • a brightness image extracting unit 11 is a unit for extracting only brightness components from the image input by the document image inputting unit 10 , and for generating a gray-scale image (which does not include parameters related to hue).
  • the gray-scale image which is the output of the brightness image extracting unit 11 is input to a variable resolution binarizing unit 12 to be described next.
  • The variable resolution binarizing unit 12 roughly extracts a partial area including characters from the input gray-scale image having a low resolution, and provides a binary image where characters are easily recognized to a binary image recognizing unit 13 at a succeeding stage by generating subpixels between original pixels within the gray-scale image and interpolating the information about brightness components.
  • the binary image recognizing unit 13 performs character recognition based on the binary image passed from the variable resolution binarizing unit 12 , and performs the process for replacing characters of an image with electronic codes.
  • a recognition result outputting unit 14 receives the document file as an electronic code sequence from the binary image recognizing unit 13 , stores the document file onto a storage medium such as a hard disk, etc., and outputs the document file as a recognition result to a display monitor.
  • the configurations of the document image inputting unit 10 , the brightness image extracting unit 11 , the binary image recognizing unit 13 , and the recognition result outputting unit 14 are similar to those of the conventional technique, their detailed explanations are omitted here. That is, even if they are omitted, a skilled artisan can easily understand the configurations of the document image inputting unit 10 , the binary image recognizing unit 13 , and the recognition result outputting unit 14 , and can actually use them in a current situation where devices and software for recognizing a black-and-white document are commercialized. Additionally, the skilled artisan can easily understand also the configuration of the brightness image extracting unit 11 , because this unit is intended to convert a color image, etc. into a gray-scale image, and, at present, a color image is converted into a black-and-white image and telecast. Therefore, its explanation is omitted here.
  • The variable resolution binarizing unit 12 converts a gray-scale image into a binary image having a resolution according to that of the input image, or binarizes a gray-scale image after roughly extracting a portion including characters from it, so that the image can be used for character recognition. Additionally, the resolution conversion and the rough extraction may be performed at the same time.
  • FIGS. 3A and 3B are block diagrams exemplifying the configurations of the variable resolution binarizing unit.
  • In the variable resolution binarizing unit, subpixels are generated between the original pixels of a gray-scale image by a subpixel processing unit 20 , the resolution is increased according to the resolution of the gray-scale image which is the input image, and then the conventional process for binarizing an image with a predetermined threshold value is performed by a predetermined threshold value binarizing unit 21 , as shown in FIG. 3A.
  • a predetermined threshold value is defined to be used for a single document image. For example, only one threshold value is used for binarizing a 1-page document image.
  • the process for generating subpixels is a process for subdividing the space between original pixels of a gray-scale image whose recognition ratio is not likely to be improved if it is binarized unchanged, and for generating virtual pixel data in the original data.
  • the pixel value (the level of brightness) of a subpixel is obtained by interpolating the levels of brightness of original pixels of an input gray-scale image.
  • A typical interpolation method is linear interpolation.
  • Alternatively, a locally binarizing unit 23 may be arranged instead of the predetermined threshold value binarizing unit 21 shown in FIG. 3A, as illustrated in FIG. 3B.
  • the locally binarizing unit 23 sets the local area where the process is to be performed for each pixel included in a drawing area within an entire gray-scale document image, and obtains a binary image by binarizing the image with the threshold value generated by using the pixel data within the local area.
  • The threshold value of the level of brightness is obtained in a local range (a local area, such as a regular square area centering around a target pixel), and the binarization process is performed. That is, the binarization process is performed so that a pixel whose level of brightness is lower than or equal to the threshold value becomes black (its value is set to, for example, “1”), and a pixel whose level of brightness is higher than the threshold value becomes white (its value is set to, for example, “0”).
  • the threshold value of a local area is defined, for example, with the method using the linear combination of an average pixel value, a standard deviation value, and a variance.
  • The threshold value is defined, for example, as follows (a hedged reconstruction is sketched below, since the equation itself is not reproduced in this text). In that equation, a local binarization parameter is a constant, and an optimum threshold value is obtained by suitably setting this parameter. Note that the gray-scale value is used in almost the same sense as the brightness in the equation.
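  • The equation referred to above was printed as an image in the original patent and is not reproduced here. A common form consistent with the description, a linear combination of the local average and local standard deviation with the local binarization parameter k as the constant, would be the following; this is an assumed reconstruction, not the patent's verbatim equation:

$$ T_{\mathrm{local}} = \mu_{\mathrm{local}} + k\,\sigma_{\mathrm{local}} $$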
  • FIG. 4 is a schematic diagram explaining the principle of the process for generating subpixels.
  • black circles indicate original pixels of a gray-scale image, while white circles indicate subpixels.
  • IA through ID indicate the levels of the gray scale of the original pixels A through D.
  • I1 through I5 indicate the levels of gray scale of the subpixels 1 through 5 to be obtained from the levels of gray scale of the original pixels with the interpolation process.
  • When subpixels are generated with linear interpolation within an area enclosed by four original pixels, it is first determined how many subpixels will be arranged between the original pixels. Next, subpixels are positioned at regular intervals according to that number. Then, the levels of gray scale are assigned to the respective subpixels by interpolating the levels of gray scale of the original pixels.
  • For example, a subpixel 1 is arranged at the position that divides the straight line AB linking the original pixels A and B in the ratio p:(1−p).
  • The level of gray scale I1 of the subpixel 1 is obtained from the levels of gray scale IA and IB of the original pixels A and B with linear interpolation according to the following equation:
  • I1 = p*IB + (1−p)*IA
  • The levels of gray scale of the subpixels 2 through 4 are obtained in the same manner.
  • The level of gray scale I5 of the subpixel 5 included in the area enclosed by the original pixels A through D can be obtained according to the following equation, where q:(1−q) is the dividing ratio in the direction perpendicular to AB:
  • I5 = p*q*ID + p*(1−q)*IB + q*(1−p)*IC + (1−p)*(1−q)*IA
  • the above described calculation is made for all of the subpixels arranged between the original pixels, so that the process for generating subpixels is completed.
  • the obtained levels of gray scale are stored with a method similar to that of original pixel data, along with the corresponding positions of the subpixels.
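  • To make the calculation above concrete, the following Python sketch performs the subpixel generation by bilinear interpolation. The function name, the uniform subpixel spacing, and the edge handling are choices of this sketch, not details specified by the patent.

```python
import numpy as np

def generate_subpixels(gray: np.ndarray, n: int) -> np.ndarray:
    """Insert subpixels between original pixels by bilinear interpolation,
    dividing each original pixel spacing into n steps, so an H x W image
    becomes ((H-1)*n + 1) x ((W-1)*n + 1).  Implements
    I5 = p*q*ID + p*(1-q)*IB + q*(1-p)*IC + (1-p)*(1-q)*IA, with A, B, C, D
    the top-left, top-right, bottom-left, bottom-right original pixels of
    the enclosing unit cell."""
    h, w = gray.shape
    out = np.empty(((h - 1) * n + 1, (w - 1) * n + 1), dtype=float)
    for y in range(out.shape[0]):
        i, q = y // n, (y % n) / n        # cell row and vertical ratio q
        i2 = min(i + 1, h - 1)
        for x in range(out.shape[1]):
            j, p = x // n, (x % n) / n    # cell column and horizontal ratio p
            j2 = min(j + 1, w - 1)
            ia, ib = gray[i, j], gray[i, j2]       # top-left, top-right
            ic, id_ = gray[i2, j], gray[i2, j2]    # bottom-left, bottom-right
            out[y, x] = (p * q * id_ + p * (1 - q) * ib
                         + q * (1 - p) * ic + (1 - p) * (1 - q) * ia)
    return out
```

  With n = 3, each unit cell bounded by four original pixels is filled with a 3×3 block of interpolated levels, matching the I1 through I5 examples above.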
  • FIG. 5 exemplifies a table for specifying the number of subpixels to be generated, according to the resolution of a gray-scale image which is an input image.
  • the process for generating subpixels between original pixels must be performed at the beginning of the calculation process explained by referring to FIG. 4 .
  • Although the number of subpixels to be generated is arbitrary, it must be set so that sufficient character recognition accuracy can be obtained from the image for which subpixels are generated and which is binarized. Accordingly, there is no equation for uniquely determining the number of subpixels to be generated according to the resolution of an input image; the number must be determined based on experimental data to some extent.
  • The resolution of an input image is stored in the left column in correspondence with a subpixel generation parameter in the right column. For example, if the resolution of an input image is 100 dpi, subpixels are generated to form a grid of 6×6 for each original pixel. If the resolution of an input image is 150 dpi, subpixels are generated to form a grid of 3×3 for each original pixel.
  • the above described subpixel processing unit stores the table shown in FIG. 5 .
  • the subpixel processing unit obtains the number of subpixels to be generated according to the resolution of the input image, and determines the positions of the subpixels.
  • the subpixel processing unit obtains the levels of gray scale of the respective subpixels from the levels of gray scale of the original pixels with the interpolation process by using the calculation method explained by referring to FIG. 4, and generates the image data having a high resolution according to the resolution of the input image.
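  • As an illustration, such a table can be held as a simple mapping. In this Python sketch only the 100-dpi and 150-dpi entries come from the description above; the nearest-entry fallback for other resolutions is an assumption of the sketch, since the patent determines the parameter experimentally.

```python
# Subpixel generation parameter per input resolution:
# dpi -> n, meaning an n x n grid of subpixels per original pixel.
SUBPIXEL_TABLE = {
    100: 6,   # 100-dpi input: 6 x 6 grid (from the description)
    150: 3,   # 150-dpi input: 3 x 3 grid (from the description)
}

def subpixel_parameter(dpi: int) -> int:
    # Fall back to the nearest tabulated resolution for other inputs.
    nearest = min(SUBPIXEL_TABLE, key=lambda r: abs(r - dpi))
    return SUBPIXEL_TABLE[nearest]
```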
  • FIG. 6 is a block diagram exemplifying a further configuration of the variable resolution binarizing unit.
  • Before the local binarization, a drawing area roughly extracting unit may perform a global process that recognizes the image portion whose pixel values are lower than a global threshold value, set for the pixel values of a gray-scale image, as characters and their periphery, thereby roughly extracting the drawing area. Then, a local binarization process may be performed.
  • the drawing area roughly extracting unit 50 extracts the portion including characters and its periphery from a document image, and the locally binarizing unit binarizes the gray-scale image within the extracted area.
  • The rough extraction is performed by counting, for each level of gray scale or brightness, the number of pixels having that level among all of the pixels of a document image, as will be described later. If the count is large in a portion where the level of brightness is high, that portion indicates the background of a document image. If the count is large in a portion where the level of brightness is low, that portion indicates the part of the document image including characters.
  • a threshold value is set to a value in the portion where the number of pixels is small, which corresponds to the middle of the two peaks.
  • This threshold value setting is similar to that of the binarization process; note, however, that the position at which the threshold value is set is slightly shifted toward a higher level of brightness. If the pixels included in a document image are extracted with such a threshold value (the global threshold value), the extracted portion will include the characters and their periphery. The locally binarizing unit 51 then binarizes the image with another threshold value within the roughly extracted area, so that noise components caused by the light and shade of the background of the document image can be removed. Consequently, a binary image which is easier to recognize can be obtained.
  • The global threshold value is determined with a linear sum of an average pixel value, a standard deviation value, and a variance; alternatively, it can be determined as described below.
  • The global process parameter appearing in this determination is a constant.
  • The position of the global threshold value is adjusted, by tuning the global process parameter, to the position where the rough extraction can be made most effectively.
  • FIG. 7 is a schematic diagram explaining how to set a global threshold value used for the rough extraction process.
  • the levels of brightness are obtained from all of the pixels structuring an entire document image, and the statistics of the frequency at which a pixel having a particular level of brightness appears in the document is collected in a similar manner as in the case where the predetermined threshold value is determined for the entire document image.
  • the appearance frequency of the pixel against the level of brightness forms a gentle curve.
  • As for the threshold value setting method: if the statistical process representing the frequency at which a pixel having a certain level of brightness appears is actually performed with a device, the result will be a histogram.
  • frequency peaks are formed in the respective portions where the levels of brightness are low and high.
  • the portion where the level of brightness is high is the background of a document image, and this level of brightness is the level of brightness of the paper on which the document is created.
  • the portion where the level of brightness is low is an area where a character or a graphic, etc. is drawn. If the frequencies at which pixels appear are classified depending on the levels of brightness, the pixels are grouped into two major groups such as the pixels included in the drawing area and the pixels included in a background. Therefore, the rough extraction process can be performed by setting the threshold value of the level of brightness to almost the middle of the two peaks, and extracting the pixels having the levels of brightness which are lower than this threshold value from the document image.
  • If a document is structured as the reverse video of a black-and-white image, the drawing area and the background of the two peaks are reversed. Accordingly, the pixels having levels of brightness higher than the preset threshold value are output as the drawing area.
  • the threshold value is set to a value corresponding to almost the middle of the two peaks, that is, the value at the bottom of the frequency curves. This is because the character recognition cannot be made with high accuracy if the pixels of the background are also captured.
  • Therefore, the threshold value is set to a value slightly shifted from the bottom toward the peak of the background. In this way, the area where a character, etc. is drawn, together with its periphery, is extracted, as sketched below.
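  • The following Python sketch illustrates this histogram-based selection of the global threshold. The smoothing, the half-range peak search, and the fixed shift toward the background peak are simplifying assumptions of this sketch, not the patent's exact procedure (which may instead use the linear sum of average, standard deviation, and variance mentioned above).

```python
import numpy as np

def global_threshold(gray: np.ndarray, shift: int = 8) -> int:
    """Pick a rough-extraction threshold between the two brightness peaks.

    Builds the brightness histogram, finds the valley between the dark
    (drawing) peak and the bright (background) peak, then shifts the
    threshold slightly toward the background peak so that the periphery
    of characters is also extracted.  `shift` is an illustrative constant,
    and the dark peak is assumed to lie in the lower half of the range.
    """
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    hist = np.convolve(hist, np.ones(5) / 5, mode="same")  # smooth histogram
    dark_peak = int(np.argmax(hist[:128]))           # peak at low brightness
    bright_peak = 128 + int(np.argmax(hist[128:]))   # peak at high brightness
    valley = dark_peak + int(np.argmin(hist[dark_peak:bright_peak + 1]))
    return min(valley + shift, bright_peak - 1)      # shift toward background
```

  Pixels whose brightness falls below the returned value are then extracted as the drawing area together with its periphery.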
  • FIGS. 8A and 8B are block diagrams exemplifying further configurations of the variable resolution binarizing unit.
  • the global process for roughly extracting a drawing area is performed after the subpixel generation process is performed for a gray-scale image. Then, the local binarization process is performed for each of the pixels included in the drawing area. As a result, the processes can be performed at a higher speed than that implemented by the configuration shown in FIG. 3A, and at the same time, the recognition accuracy is improved more than that available from the configuration shown in FIG. 6 .
  • the subpixel generation process is initially performed by a subpixel processing unit 70 , and then a document image is roughly extracted by a drawing area roughly extracting unit 71 .
  • As a result, the information about the level of brightness of a gray-scale image is prevented from being lost; rather, the brightness information is carried over into the binarization.
  • the accuracy of the binarization process can be improved.
  • performing the rough extraction process eliminates the need to target the whole of a document image, thereby reducing the number of pixels to be processed and improving the processing speed of the locally binarizing unit.
  • the subpixel processing unit 70 performs the subpixel generation process for the whole of a document image.
  • the drawing area roughly extracting unit 71 performs the rough extraction process by also targeting subpixels. Accordingly, the processing speed and the accuracy can be improved further than that implemented in the case where only the rough extraction process is performed.
  • the subpixel generation process must be performed for the whole of the document image in this case. Therefore, the amounts of data handled by the subpixel processing unit 70 and the drawing area roughly extracting unit 71 increase, which leads to the slowdown of the processing speed and the insufficiency of the capacity of a memory storing data, although the processing speed of the locally binarizing unit 72 can be improved.
  • FIG. 8B is a block diagram exemplifying the configuration which improves the processing speed implemented by the configuration shown in FIG. 8 A.
  • the configuration shown in FIG. 8B is obtained by reversing the processing order of the subpixel processing unit 70 and the drawing area roughly extracting unit 71 , which are included in the configuration shown in FIG. 8 A. Because the process of the drawing area roughly extracting unit 73 is performed for the original gray-scale image whose number of pixels is not increased with the subpixel generation process, the amount of data handled by the drawing area roughly extracting unit 73 is reduced and the processing speed is made faster.
  • It is sufficient for the subpixel processing unit 74 to perform the subpixel generation process not for the whole of an input gray-scale image, but only for the area extracted by the drawing area roughly extracting unit 73 . Accordingly, the number of pixels to be handled is reduced much more than in the case where the subpixel generation process is performed for the whole of the input gray-scale image, as in the configuration shown in FIG. 8A. Consequently, the capacity of the memory storing data can be decreased, and at the same time, the processing speed can be improved.
  • Although the throughput of the locally binarizing unit 75 is approximately the same as that of the corresponding unit shown in FIG. 8A, the processing speed on the whole can be improved, and the amount of required hardware resources such as memory, etc. can also be reduced.
  • In the configurations above, the threshold value of the local binarization process is calculated from the subpixel values at the interpolation points (subpixels).
  • Since the subpixel value at an interpolation point is itself obtained from the original pixel values,
  • the local binarization process can be performed according to the equation obtained by substituting the equation for obtaining a subpixel value at an interpolation point from original pixel values into the equation for obtaining a local threshold value from the subpixel values at the interpolation points.
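  • To see why this substitution works: with linear interpolation, each subpixel value $I_s$ is a weighted sum $\sum_k w_{s,k} I_k$ of original pixel values, and many local threshold formulas are in turn linear in the subpixel values. Taking a local average over the $N$ subpixels of a local area as an illustrative (assumed) threshold form:

$$ T = \frac{1}{N}\sum_{s} I_s = \frac{1}{N}\sum_{s}\sum_{k} w_{s,k}\, I_k = \sum_{k}\Big(\frac{1}{N}\sum_{s} w_{s,k}\Big) I_k $$

  so the weights in parentheses can be precomputed once, and the local threshold evaluated directly from the original pixel values without materializing the subpixels.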
  • FIG. 9 is a block diagram exemplifying the details of the configuration of the locally binarizing unit.
  • a locally binarizing unit 81 is composed of a threshold value calculating unit 82 and a comparing unit 83 .
  • the threshold value calculating unit 82 directly obtains the threshold value of the local binarization process from a gray-scale image.
  • the comparing unit 83 makes a comparison between the threshold value and the value at the point where a subpixel is generated, which is obtained by the subpixel processing unit 80 , and outputs the result of the binarization process.
  • the threshold value calculating unit 82 sets a local area by targeting a particular pixel point, statistically classifies the level of gray scale (the level of brightness) at each pixel point within the local area as explained by referring to FIG. 7, and sets a threshold value which allows the distinction between the pixel having the level of brightness which represents a background, and the pixel having the level of brightness which represents a drawing area, to be made as definite as possible.
  • The rough extraction process was explained by referring to FIG. 7, where a threshold value shifted toward the level of brightness representing the background is used.
  • For the local binarization, in contrast, the threshold value is set to the value corresponding to the bottom of the frequency curves shown in the middle of FIG. 7 .
  • In this way, the distinction between a pixel structuring a character and a pixel structuring the background can be made definitely,
  • and a binary image which is easier to recognize can be obtained.
  • the threshold value calculated by the threshold value calculating unit 82 is input to the comparing unit 83 .
  • the subpixel processing unit 80 generates the information about each pixel point of the gray-scale image for which the subpixel generation process is performed, and about the level of brightness at each pixel point, and inputs to the comparing unit 83 the information about the level of brightness of the pixel point targeted by the threshold value calculating unit 82 .
  • the comparing unit 83 then makes a comparison between the level of brightness at each pixel point and the threshold value.
  • The binary value at a pixel point whose level of brightness is higher than the threshold value is set to “0”, while the binary value at a pixel point whose level of brightness is lower than or equal to the threshold value is set to “1”.
  • the pixel point to be targeted is sequentially changed, and the above described process is repeatedly performed, so that a binary image corresponding to the input gray-scale image can be obtained.
  • the pixel point referred to in the above explanation includes both an original pixel of an input gray-scale image and a subpixel generated by performing the subpixel generation process. If the pixel point is used to indicate an original pixel of an input gray-scale image, it is hereinafter referred to as an original pixel point.
  • the binarization process is performed for the whole of image data whose number of pixels is increased by performing the subpixel generation process for the whole of the input gray-scale image.
  • FIG. 10 shows the details of the second configuration of the locally binarizing unit.
  • the local threshold value after the subpixel generation process is obtained only at each original pixel point of a gray-scale image.
  • the local threshold value at an interpolation point (a subpixel point) is obtained by interpolating the local threshold values at the original pixel points of the gray-scale image. Namely, with the configuration shown in FIG. 9, both subpixel points and original pixel points of a gray-scale image are handled in a similar manner, and the threshold value for the local binarization process is calculated. However, with the configuration shown in FIG. 10, only the threshold value at an original pixel point is calculated, and the threshold value at a subpixel point is obtained with the interpolation process. Actually, the interpolation process explained by referring to FIG. 4 is not performed for the level of brightness at each original pixel point. The threshold value at a subpixel point is obtained by interpolating threshold values obtained at original pixel points, similar to the level of brightness.
  • A pixel point threshold value calculating unit 91 obtains the local threshold value after the subpixel generation process only at each original pixel point of a gray-scale image.
  • An interpolating unit 92 obtains the local threshold value at an interpolation point (subpixel) by interpolating the local threshold values at the pixel points of the gray-scale image.
  • the comparing unit 83 makes a comparison between the pixel value at a pixel point, which is obtained by the subpixel processing unit 80 , and the threshold value obtained by the interpolating unit 92 , and outputs the result of the binarization process. Such a binarizing process is performed for all of the pixel points obtained by the subpixel processing unit 80 , thereby obtaining a binary image.
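  • A minimal Python sketch of this configuration follows, reusing the hypothetical generate_subpixels from the earlier sketch for both the pixel values and the thresholds. The threshold rule mu + k*sigma, the window size, and the helper names are assumptions of the sketch, as discussed above.

```python
import numpy as np

def local_threshold_at_originals(gray: np.ndarray, half: int = 8,
                                 k: float = -0.2) -> np.ndarray:
    """Local threshold mu + k*sigma at each ORIGINAL pixel only, over a
    (2*half + 1)-square window; `half` and `k` are illustrative values."""
    h, w = gray.shape
    t = np.empty((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            win = gray[max(0, i - half):i + half + 1,
                       max(0, j - half):j + half + 1].astype(float)
            t[i, j] = win.mean() + k * win.std()
    return t

def binarize_fig10(gray: np.ndarray, n: int) -> np.ndarray:
    values = generate_subpixels(gray, n)            # interpolate pixel values
    thresholds = generate_subpixels(                # interpolate thresholds too
        local_threshold_at_originals(gray), n)
    return (values <= thresholds).astype(np.uint8)  # 1 = black (drawing)
```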
  • the method for obtaining the local threshold value at an original pixel point of a gray-scale image which is executed by the pixel point threshold value calculating unit 91 , includes the following methods.
  • the configuration implementing the method (1) is shown in FIG. 10 .
  • the configuration implementing the method (2) is shown in FIG. 11 .
  • In the method (2), the pixel point threshold value calculating unit uses only the pixel values of a gray-scale image.
  • the method (3) is applicable if the local threshold value is the linear combination of an average pixel value, a standard deviation value, and a variance within a local area.
  • the configuration implementing the method (3) is shown in FIG. 12 .
  • the variance is obtained from the average pixel value and the square average value.
  • the standard deviation value is obtained from the variance. The specific example of the equation will be provided next.
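  • The specific equations were likewise rendered as images in the original; the standard identities they describe are

$$ \sigma^{2} = \overline{I^{2}} - \big(\overline{I}\big)^{2}, \qquad \sigma = \sqrt{\sigma^{2}} $$

  where $\overline{I}$ and $\overline{I^{2}}$ denote the local average and the local square average of the pixel values within the local area.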
  • If the local area has original pixels on its boundary, the equations for obtaining the local threshold value and the average values are simplified, which leads to a speeding-up of the processing. Therefore, the method using these simplified expressions may be used.
  • The locally binarizing unit may comprise a local area specifying unit, by which the process may be performed by making a distinction between the case where a local area does not have original pixels on its boundary and the case where it does (FIG. 13 ).
  • The local area specifying unit is a unit that uses the number of subpixels generated with the interpolation process and the size of a local area as its specification data.
  • A local area original pixel boundary determining unit is a unit for automatically determining whether or not original pixels lie on the boundary of the local area.
  • FIG. 11 exemplifies the third configuration of the locally binarizing unit.
  • the locally binarizing unit 100 is composed of a pixel point threshold calculating unit 101 , an interpolating unit 102 , and a comparing unit 83 .
  • The data of the original pixels of the gray-scale image are input to the subpixel processing unit 80 and the pixel point threshold value calculating unit 101 .
  • the subpixel processing unit 80 generates a predetermined number of subpixels between the original pixels by performing the above described process.
  • the pixel point threshold value calculating unit 101 obtains the threshold value at an original pixel point only from the values of the original pixels.
  • the threshold value is obtained by generating the data shown in FIG. 7, and using the average value, the variance, etc.
  • The threshold value obtained at each original pixel point is used by the interpolating unit 102 in order to obtain the threshold value of each subpixel with a method such as linear interpolation.
  • the comparing unit 83 makes a comparison between the value of an original pixel or the value of a subpixel generated by the subpixel processing unit, and the threshold value at each pixel point, which is obtained by the pixel point threshold value calculating unit 101 and the interpolating unit 102 , and binarizes the pixel value.
  • a binary image is obtained by performing such a process for all of the pixel points.
  • FIG. 12 is a block diagram exemplifying the fourth configuration of the locally binarizing unit.
  • a locally binarizing unit 110 is composed of a pixel point threshold value calculating unit 111 , an interpolating unit 117 , and a comparing unit 118 .
  • the contents of the process performed by the subpixel processing unit 80 are similar to those described above.
  • the data of an input gray-scale image is input to the subpixel processing unit 80 , and the pixel point threshold value calculating unit 111 included in the locally binarizing unit 110 .
  • the pixel point threshold value calculating unit 111 obtains the values at the pixel points within the local area centering around a targeted pixel point.
  • the pixel point average value calculating unit 112 calculates an average of the pixel values.
  • a pixel point square average value calculating unit 113 calculates an average of the squares of the pixel values within the local area.
  • the average of the pixel values from the pixel point average value calculating unit 112 and the average of the squares of the pixel values from the pixel point square average value calculating unit 113 are input to a pixel point variance calculating unit 114 , which calculates the variance of the distribution of the pixel values.
  • The calculated variance is input to a pixel point standard deviation value calculating unit 115 , which obtains the standard deviation value of the distribution of the pixel values.
  • the average, variance, and standard deviation values of the pixel values are input to a linear combination calculating unit 116 , which calculates a threshold value.
  • The expression for calculating a threshold value must be suitably set by a skilled artisan. According to this preferred embodiment, however, the linear combination of the average and standard deviation values is used as described above, and the threshold value is adjusted by multiplying the standard deviation value by a parameter in order to control to what degree the standard deviation value affects the threshold value.
  • this threshold value calculation process is repeatedly performed for the respective original pixel points of an input gray-scale image until the threshold values are obtained for all of the original pixels.
  • These threshold values are transmitted to the interpolating unit 117 , which then obtains the threshold values of subpixels by performing the interpolation process.
  • the threshold values of the original pixels and the subpixels are obtained.
  • the threshold values are compared with the pixel values transmitted from the subpixel processing unit 80 in the comparing unit 118 , and the pixel values are then binarized and output.
  • FIG. 13 is a block diagram exemplifying the fifth configuration of the locally binarizing unit.
  • a locally binarizing unit 120 is composed of a local area original pixel boundary determining unit 121 , an original pixel boundary pixel point threshold value calculating unit 122 , an original pixel non-boundary pixel point threshold value calculating unit 123 , an interpolating unit 124 , and a comparing unit 125 .
  • the data of an input gray-scale image are input to the subpixel processing unit 80 , and the above described process is performed. Then, the data after the subpixel generation process is input to the comparing unit 125 .
  • the data of the input gray-scale image is input also to the local area original pixel boundary determining unit 121 included in the locally binarizing unit 120 .
  • a local area specifying unit 126 is an input unit for specifying where to set a local area for a particular original pixel.
  • The specification may be made either manually or automatically. In particular, if the local area is contained within the line structuring a character, the local binarization process may fail in some cases. It is therefore desirable that the local area include the portion of the line structuring a character and the portion of the background in a ratio of roughly 1 to 1.
  • the local area original pixel boundary determining unit 121 determines whether or not original pixels exist on the boundary of the local area. If the original pixels exist on the boundary of the local area, data is transmitted to the original pixel boundary pixel point threshold value calculating unit 122 , which is made to calculate the threshold value for each of the original pixels. If the original pixels do not exist on the boundary of the local area, the data is transmitted to the original pixel non-boundary pixel point threshold value calculating unit 123 , which is made to calculate the threshold value for each of the original pixels.
  • the methods for calculating a threshold value which are executed by the original pixel boundary pixel point threshold value calculating unit 122 and the original pixel non-boundary pixel point threshold value calculating unit 123 , will be described later.
  • Because the threshold values calculated by the original pixel boundary pixel point threshold value calculating unit 122 or the original pixel non-boundary pixel point threshold value calculating unit 123 are intended only for the original pixels, the threshold values are transmitted to the interpolating unit 124 , which obtains the threshold value for each subpixel by performing the interpolation process.
  • the threshold values obtained for the original pixels and the subpixels in this way are transmitted to the comparing unit 125 , which makes a comparison between the threshold values and the pixel values transmitted from the subpixel processing unit 80 .
  • the pixels are then binarized and output.
  • a local variance and a local standard deviation value are obtained with fundamental arithmetic operations by calculating the local average value and the local square average value, and the linear combination of the local variance and the local standard deviation value is used, whereby a threshold value is obtained according to the above described expressions referred to in the explanation of FIGS. 3A and 3B.
  • FIG. 14 is a schematic diagram explaining the method for calculating a local average value and a local square average value from original pixel values.
  • a local area centering around an original pixel is illustrated in FIG. 14 .
  • The coordinate of the targeted original pixel is assumed to be (0, 0) for ease of explanation.
  • The coordinates of any other original pixel are obtained by parallel translation of the coordinate system.
  • The degree of the subpixel generation is defined to be n (≥ 1), and (n − 1) new subpixels are inserted between two original pixels.
  • The local area is a square area centering around a targeted original pixel.
  • The upper left point of the regular square (that is, the coordinate of the upper left original pixel, which is the original pixel farthest from the targeted original pixel among the original pixels included in the local area) is assumed to be (−M, −M), with the condition 1 ≤ M imposed.
  • The number of subpixels which exist outside the regular square but within the local area in one direction after the subpixel generation process is assumed to be “r”, where “r” must satisfy 0 ≤ r < n.
  • the original pixel value at a coordinate (i, j) is represented as I(i, j).
  • Here, “L” (the side of the local area counted in subpixel points) is an odd number larger than “2n”. Conversely, if a positive number “n” and an odd number “L” larger than “2n” are given, the values “M” and “r” which satisfy the above relation are uniquely determined. That is, if (L − 1)/2 is divided by “n” and its quotient and remainder are obtained, the quotient and the remainder respectively correspond to “M” and “r”, as sketched below.
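  • In other words, the relation implied above is L = 2(M·n + r) + 1; under that reading, M and r follow from L and n by integer division, as in this small Python sketch:

```python
def decompose_local_area(side: int, n: int) -> tuple:
    """Recover M (half-size of the enclosing original-pixel square) and
    r (extra subpixels beyond it in one direction) from an odd local-area
    side (in subpixel points, side > 2n) and the subpixel degree n, so
    that side == 2 * (M * n + r) + 1 and 0 <= r < n."""
    assert side % 2 == 1 and side > 2 * n
    M, r = divmod((side - 1) // 2, n)
    return M, r

# Example: side = 23, n = 3 -> (23 - 1) // 2 = 11 = 3 * 3 + 2 -> M = 3, r = 2
```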
  • the variance and standard deviation values can be obtained by using the average value and the square average value of the pixel values within a local area, which are obtained with the above described expressions, thereby obtaining the expression for determining a threshold value.
  • FIG. 15 is a block diagram exemplifying the sixth configuration of the locally binarizing unit.
  • With this configuration, the difference between the pixel value at an original pixel point of a gray-scale image and the local threshold value is interpolated at each interpolation point (subpixel), and the value of the binary image at the interpolation point is determined by the sign of the interpolated value. Because the interpolation process is thereby reduced from twice to once, the processing speed can be improved. Especially, if the subpixel generation process is performed with the linear interpolation method, the result is identical to the case where the pixel values and the threshold values are interpolated separately and compared.
  • the subpixel value of {original pixel value − binarization threshold value} = {the value obtained by performing the subpixel generation process for the original pixel values} − {the value obtained by performing the subpixel generation process for the binarization threshold values}
  • The locally binarizing unit 140 is composed of a pixel point threshold value calculating unit 141 , a difference calculating unit 142 , an interpolating unit 143 , and a sign determining unit 144 .
  • the data of an input gray-scale image are directly input to the difference calculating unit 142 , and to the pixel point threshold value calculating unit 141 .
  • the pixel point threshold value calculating unit 141 calculates the threshold value within a local area only from the original pixel values included in the local area for the original pixel point of the input gray-scale image, and generates the threshold value for a targeted original pixel.
  • the targeted original pixel value and the generated threshold value are input to the difference calculating unit 142 , which calculates the difference between them.
  • The value at each subpixel is then obtained by interpolating this difference, in a similar manner as in the above described process for interpolating the level of brightness (gray scale). Since the value obtained in this way equals the difference between the value of the subpixel and its threshold value, the differences at the original pixels and at the subpixels are input to the sign determining unit 144 , which examines their signs.
  • If the pixel value is larger than the threshold value, the above described difference is positive, and, for example, the value “0” is assigned to this pixel when it is binarized. If the pixel value is smaller than the threshold value, the difference is negative, and, for example, the value “1” is assigned to the pixel when it is binarized.
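  • A Python sketch of this sixth configuration, under the same assumptions and hypothetical helpers (generate_subpixels, local_threshold_at_originals) as the earlier sketches:

```python
import numpy as np

def binarize_fig15(gray: np.ndarray, n: int) -> np.ndarray:
    # Difference between each original pixel and its local threshold.
    diff = gray.astype(float) - local_threshold_at_originals(gray)
    # Single interpolation pass: because linear interpolation distributes
    # over subtraction, interpolating the difference gives the same result
    # as interpolating values and thresholds separately and subtracting.
    diff_up = generate_subpixels(diff, n)
    # The sign decides the binary value: not above threshold -> 1 (black).
    return (diff_up <= 0).astype(np.uint8)
```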
  • FIG. 16 is a block diagram exemplifying the seventh configuration of the local binarizing unit.
  • With this configuration, a binary image is obtained by obtaining the local threshold value after the subpixel generation process at each original pixel point of a gray-scale image, and by using a table to which the values at 4 pixel points of the gray-scale image and the local threshold value are input, and from which the binary image of the area enclosed by the 4 pixel points is output as data.
  • a locally binarizing unit 150 is composed of a pixel point threshold value calculating unit 151 , a binary image searching unit 152 , and a memory 153 .
  • the pixel point threshold value calculating unit 151 obtains the threshold value in a local area from the pixel values of the input gray-scale image for each of the pixels, and provides it to the binary image searching unit 152 .
  • the binary image searching unit 152 receives the threshold values from the pixel point threshold value calculating unit 151 and the original pixel values of the gray-scale image, and selects 4 original pixels forming the regular square which is a minimum unit of the grid formed by original pixel points.
  • a binary image is obtained by referencing the table stored in the memory 153 based on the pixel values and the threshold values of the 4 original pixels.
  • The table stored in the memory 153 is a table in which binary image data is registered for each combination of the pixel values and the threshold values of 4 original pixels.
  • the binary image searching unit 152 obtains the binary image data for all of unit grids (minimum regular squares forming a grid) structured by the original pixels of the input gray-scale image according to this table, generates an entire binary image by combining the data, and outputs the generated image.
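  • A heavily hedged Python sketch of the table idea: the extract does not say how the table is indexed, so this sketch assumes the 4 corner values and the threshold are quantized integers used as a key, with the binary block memoized per key; this is only one plausible realization.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def unit_cell_pattern(ia: int, ib: int, ic: int, id_: int,
                      t: int, n: int) -> tuple:
    """Binary n x n block for one unit grid cell, computed once per
    distinct (4 corner values, threshold) key and then reused.  Corner
    values and threshold are assumed to be small integers (for example,
    quantized brightness levels) so they can serve as table keys."""
    block = []
    for y in range(n):
        q = y / n
        row = []
        for x in range(n):
            p = x / n
            v = (p * q * id_ + p * (1 - q) * ib
                 + q * (1 - p) * ic + (1 - p) * (1 - q) * ia)
            row.append(1 if v <= t else 0)   # 1 = black (drawing)
        block.append(tuple(row))
    return tuple(block)
```

  The binary image searching unit would then tile these cached blocks over all unit grids of the image.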
  • Only the configurations of the variable resolution binarizing unit were explained above. However, as shown in FIG. 6, the rough extraction process for roughly extracting a drawing area may initially be performed for a gray-scale image, and the above described processes may be performed only for the extracted drawing area, thereby reducing the processing time.
  • FIG. 17 shows the processing examples up to the process for binarizing a color image
  • FIG. 18 shows a gray-scale image according to this preferred embodiment.
  • FIG. 17 shows the processing example of a 150-dpi color image.
  • the top image is a color image document with a resolution of 150 dpi.
  • various colors appear around a black color representing characters.
  • The middle image shows the top color image converted into a gray-scale image.
  • The binary image at the bottom right shows the gray-scale image binarized with the conventional method. In this image, the detailed portions of the characters are defaced and the characters are difficult to recognize.
  • the binary image obtained by performing the processing according to this preferred embodiment is the image on the left at the bottom. Since subpixels are generated according to this preferred embodiment, the resolution of the binary image is substantially higher than that of the original color image. In this case, the amount of information increases.
  • Since the binary image data is used, as it is, not for printing by a printer, etc., but for character recognition, the clearer representation of the characters facilitates the character recognition.
  • FIG. 18 shows the processing example of 150- and 100-dpi gray-scale images.
  • FIG. 19 is a block diagram explaining the configuration of the hardware required for implementing this preferred embodiment as software.
  • the subpixel generation process, the interpolation process, the threshold value calculation process, etc. can be implemented as a program running on a computer.
  • a CPU 181 for performing the above described processes, a RAM 183 for storing the program for performing these processes in an executable form, etc. must be interconnected by a bus 180 , so that they can communicate with each other.
  • a ROM 182 for storing the BIOS required for running the CPU 181 , a storage device 189 for storing the program, etc. are arranged.
  • the storage device 189 is implemented, for example, as a hard disk, etc.
  • a storage medium reading device 187 is required when the program is stored onto a portable storage medium 188 such as a floppy disk, a CD-ROM, etc. and used.
  • the program read from the storage device 189 or the portable storage medium 188 is expanded and stored in the RAM 183 so that the CPU 181 can execute it.
  • an input/output device 186 composed of a monitor, a keyboard, a mouse, etc. is arranged in order to transmit to the CPU 181 the commands issued by a user who operates the device, and to display the results of the processes performed by the CPU 181 for the user.
  • The program need not be stored in the computer used by a user. It may be downloaded as needed from a database possessed by a program provider 185 . Alternatively, the program may be executed over a network by connecting the user and the program provider 185 via a LAN; in this case, only command input and result display are performed by the computer possessed by the user.
  • a color or gray-scale document image can be quickly binarized with high accuracy, thereby recognizing the image accurately and rapidly.

Abstract

A color image input from a document image inputting unit is converted into a gray-scale image by a brightness image extracting unit. The gray-scale image is then converted into an image having a higher resolution according to the resolution of the original gray-scale image. When this conversion is performed, subpixels are generated between the original pixels, and the values of the subpixels are obtained with an interpolation method. Furthermore, a threshold value for a binarization process is generated by using an original pixel value and a subpixel value. The characters included in the binarized image are recognized by a binary image recognizing unit, and a recognition result is output from a recognition result outputting unit.

Description

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an image recognizing device.
2. Description of the Related Art
With the popularization of personal computers and the arrangements of networks, the number of electronic documents has been growing in recent years. However, the main medium of information distribution is still a paper document, and an enormous number of paper documents currently exist. Accordingly, a document image recognizing device intended to convert a paper document into an electronic document, and to edit a conversion result has been increasingly demanded. The document image recognizing device is a device which uses a document image as an input, and performs a coding process by recognizing characters, etc. included in the document image. Currently, there are products which use a binary document image as an input. The binary document image is a document image where image data of a character is represented, for example, by “1”, while the image data of the background is represented by “0”.
Recently, an image filing device has become popular, and also the demand for inputting an image filed by the image filing device to a document image recognizing device, and for recognizing the image has been increasing. Especially, the number of gray-scale or color documents including photographs has been growing. Therefore, the demand for recognizing not only binary documents but also gray-scale or color documents has been on the rise.
A color document image recognizing device recognizes also a color or gray-scale document image. A conventional color document image recognizing device obtains a binary image by binarizing each brightness component with a predetermined threshold value, and recognizes the obtained binary image, if an input document image is not a binary image but a gray-scale or color image.
FIGS. 1A and 1B respectively show the configuration of the conventional color document image recognizing device and an extended color text image.
In FIG. 1A, a document image inputting unit 170 is a unit for inputting a document image, and is typically implemented as a scanner, etc. For a color document, parameters such as a color parameter, a brightness parameter, etc. are assigned to respective pixels by illuminating the document, receiving a reflected light, and analyzing the received light. If the document to be scanned is a gray-scale document, the light reflected from the document is analyzed, the information about the level of brightness is obtained, and this information is assigned to each pixel. At this time, all of the pixels of the gray-scale document are detected to be black-and-white, which is set as the color parameters of each of the pixels.
If an input image is a color image, a brightness image extracting unit 171 extracts a brightness component for each pixel, and outputs a brightness image which is a gray-scale image to a predetermined threshold value binarizing unit. If the input image is a gray-scale image, the color parameters of all of the pixels are set to be black-and-white. Therefore, the gray-scale image resultant from the process of the brightness image extracting unit 171 will become the image data having the same brightness data as that of the input gray-scale image, in principle. This is because only the parameters related to the hue are removed from all the color parameters of the processed gray-scale image.
If a gray-scale image which is a brightness image is input, a predetermined threshold value binarizing unit 172 obtains a binary image by binarizing the gray-scale image with a predetermined threshold value. This threshold value is a value which is externally determined and input. Hereinafter, a gray-scale image fundamentally indicates not the gray-scale image with color parameters set to black-and-white, which is scanned by the document image inputting unit 170, but the brightness image resultant from the process performed by the brightness image extracting unit 171. That is, the gray-scale image is defined to have not color but brightness parameters. Even if the gray-scale image has color parameters, only brightness parameters are substantially valid for recognizing the image if the color parameters are set to black-and-white for all of the pixels. Accordingly, image recognition can be made also by using such a gray-scale image.
A binary image recognizing unit 173 recognizes a binary image. That is, this unit recognizes characters by obtaining the features of the document image which is binarized by the predetermined threshold value binarizing unit 172, and replaces the characters with the codes which are internally used by a computer and correspond to the recognized characters.
A recognition result outputting unit 174 outputs the result of the character recognition made by the binary image recognizing unit 173, that is, the file which is restructured into a code sequence representing the characters of a document image.
The conventional color document image recognizing device has a disadvantage that recognition accuracy is low.
Since the amount of data of each pixel of a color document image is 8 times that of a binary image, the color document image is input to a color document image recognizing device at a resolution lower than that of the binary image in order to reduce the amount of processing time of an image input device such as a scanner, etc., and the capacity of the memory used for filing an image, etc.
FIG. 1B illustrates an expanded low-resolution color text image.
This figure shows, in monochrome, a 150-dpi text image that was captured in full color. With a color display, many colors can be identified on the periphery of the characters, and it seems difficult to identify an area by extracting the same color.
The conventional document image recognizing device is fundamentally designed to have an input of a binary image with a small amount of data, and assumes the document image of a standard resolution of approximately 400 dpi. Accordingly, if a color document image of 150 or less dpi is input, the conventional device converts the document image into a binary image with the resolution equal to or less than 150 dpi, and recognizes the converted image. Therefore, the device cannot recognize the image with sufficient accuracy.
SUMMARY OF THE INVENTION
An object of the present invention is to provide a high-speed document image recognizing device which implements high recognition accuracy.
An image recognizing device according to the present invention comprises: an image converting unit for converting an input document image into a gray-scale image if the input document image is a color image, and for newly outputting a gray-scale image if the input document image is a gray-scale image; a variable resolution binarizing unit for converting the gray-scale image into a binary image with a higher resolution according to the resolution of the gray-scale image; and a unit for recognizing the binarized image.
An image recognizing method according to the present invention comprises the steps of: (a) converting an input document image into a gray-scale image if the input document image is a color image, and newly outputting a gray-scale image if the input document image is a gray-scale image; (b) converting the gray-scale image into a binary image with a higher resolution according to the resolution of the gray-scale image; and (c) recognizing the binarized image.
According to the present invention, a gray-scale image obtained by converting a color document image or an input gray-scale image is converted into the image data having a suitably higher resolution according to the resolution of the gray-scale image, thereby binarizing the image without losing the information about brightness levels of a gray scale. Therefore, characters appearing in a binary image can be prevented from being defaced, thereby implementing character recognition with higher accuracy.
Furthermore, a drawing area including characters, etc. is roughly extracted from a document image, and a binarization process according to the present invention is performed only for the extracted area, thereby improving the processing speed.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1A shows the configuration of a conventional color document image recognizing device;
FIG. 1B shows an expanded color text image;
FIG. 2 is a block diagram showing the principle of a color document image recognizing device according to a preferred embodiment;
FIGS. 3A and 3B exemplify the configurations of a variable resolution binarizing unit;
FIG. 4 is a schematic diagram explaining the principle of the process for generating subpixels;
FIG. 5 exemplifies the table for specifying the number and the resolution of subpixels to be generated according to the resolution of an input gray-scale image;
FIG. 6 exemplifies an additional configuration of the variable resolution binarizing unit;
FIG. 7 is a schematic diagram explaining the method for setting a global threshold value used for a rough extraction process;
FIGS. 8A and 8B exemplify the further configurations of the variable resolution binarizing unit;
FIG. 9 exemplifies the details of a first configuration of a locally binarizing unit;
FIG. 10 exemplifies the details of a second configuration of the locally binarizing unit;
FIG. 11 exemplifies a third configuration of the locally binarizing unit;
FIG. 12 exemplifies a fourth configuration of the locally binarizing unit;
FIG. 13 exemplifies a fifth configuration of the locally binarizing unit;
FIG. 14 is a schematic diagram explaining the method for calculating a local average value and a local square average value from original pixel values;
FIG. 15 exemplifies a sixth configuration of the locally binarizing unit;
FIG. 16 exemplifies a seventh configuration of the locally binarizing unit;
FIG. 17 is a schematic diagram exemplifying the processing up to the process for binarizing a color or gray-scale image with the process according to the preferred embodiment (No. 1);
FIG. 18 is a schematic diagram exemplifying the processing up to the process for binarizing a color or gray-scale image with the process according to the preferred embodiment (No. 2); and
FIG. 19 is a block diagram explaining the configuration of the hardware required for implementing the preferred embodiment as software.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
According to the present invention, an input image is converted into a binary image with a higher resolution according to the resolution of the input image in order to overcome the above described problems of the conventional device.
Even if the resolution of an input image is low, a human being can recognize the image if it is a color or gray-scale image. However, even a human being has difficulty in recognizing an image which is binarized by the conventional device. This is because the information about the brightness components of the original image is lost. To facilitate the recognition of the binarized image, the information about the brightness components of the image must be reflected in the binarized image. For the implementation of the reflection, it is effective to increase the resolution of the binarized image.
If an input document image is a color image, it is converted into a gray-scale image and further converted into a binary image with the resolution according to that of the input image. The converted binary image is recognized and the characters are converted into electronic codes.
As a specific means for converting a gray-scale image into a binary image having the resolution according to that of an input image, a method for obtaining a binary image after performing the subpixel generation process which increases the number of pixels included in a gray-scale image by interpolating the values of the pixels included in the gray-scale image, is considered. As a specific method of the subpixel generation process, a linear interpolation method between pixel values can be cited.
Additionally, the method for extracting a character portion and its periphery from an entire image, for suitably generating a threshold value within the extracted partial image, and for performing character recognition, can be cited as a method for improving the recognition ratio of a color or gray-scale image. With this method, the result of the binarization process allows a character to be clearly shaped by reducing the light and shade of an area having a higher level of brightness in the background, that is, noise components rather than by preventing the information of brightness components from being lost.
FIG. 2 is a block diagram showing the principle of a color document image recognizing device according to a preferred embodiment.
A document image inputting unit 10 is a unit corresponding to a scanner in a similar manner as in the above described conventional technique, and is intended to illuminate a color or gray-scale document (including parameters related to hue), and to capture the document as an image.
A brightness image extracting unit 11 is a unit for extracting only brightness components from the image input by the document image inputting unit 10, and for generating a gray-scale image (which does not include parameters related to hue). The gray-scale image which is the output of the brightness image extracting unit 11 is input to a variable resolution binarizing unit 12 to be described next.
The variable resolution binarizing unit 12 (roughly) extracts a partial area including characters from the input gray-scale image having a low resolution, and provides a binary image where characters are easily recognized to a binary image recognizing unit 13 at a succeeding stage by generating subpixels between original pixels within the gray-scale image and interpolating the information about brightness components.
The binary image recognizing unit 13 performs character recognition based on the binary image passed from the variable resolution binarizing unit 12, and performs the process for replacing characters of an image with electronic codes.
A recognition result outputting unit 14 receives the document file as an electronic code sequence from the binary image recognizing unit 13, stores the document file onto a storage medium such as a hard disk, etc., and outputs the document file as a recognition result to a display monitor.
Since the configurations of the document image inputting unit 10, the brightness image extracting unit 11, the binary image recognizing unit 13, and the recognition result outputting unit 14 are similar to those of the conventional technique, their detailed explanations are omitted here. That is, even if they are omitted, a skilled artisan can easily understand the configurations of the document image inputting unit 10, the binary image recognizing unit 13, and the recognition result outputting unit 14, and can actually use them in a current situation where devices and software for recognizing a black-and-white document are commercialized. Additionally, the skilled artisan can easily understand also the configuration of the brightness image extracting unit 11, because this unit is intended to convert a color image, etc. into a gray-scale image, and, at present, a color image is converted into a black-and-white image and telecast. Therefore, its explanation is omitted here.
Accordingly, the present invention is characterized in that a predetermined threshold value binarizing unit 172 is replaced with a variable resolution binarizing unit 12. The explanation to be provided below will refer to the details of the variable resolution binarizing unit 12. The variable resolution binarizing unit 12 converts a gray-scale image into the binary image having the resolution according to that of an input image, or binarizes a gray-scale image after roughly extracting a portion including characters from a gray-scale image, so that the image can be used for character recognition. Additionally, the resolution conversion and the rough extraction may be performed at the same time.
FIGS. 3A and 3B are block diagrams exemplifying the configurations of the variable resolution binarizing unit.
As a first example of the configuration of the variable resolution binarizing unit, subpixels are generated between original pixels of a gray-scale image by a subpixel processing unit 20, the resolution is increased according to the resolution of the gray-scale image which is an input image, and then the conventional process for binarizing an image with a predetermined threshold value is performed by a predetermined threshold value binarizing unit 21, as shown in FIG. 3A.
Here, a predetermined threshold value is defined to be used for a single document image. For example, only one threshold value is used for binarizing a 1-page document image.
The process for generating subpixels is a process for subdividing the space between original pixels of a gray-scale image whose recognition ratio is not likely to be improved if it is binarized unchanged, and for generating virtual pixel data in the original data. Although its details will be described later, in short, the pixel value (the level of brightness) of a subpixel is obtained by interpolating the levels of brightness of original pixels of an input gray-scale image. Typical of the interpolation method is a linear interpolation method.
As a second example of the configuration of the variable resolution binarizing unit, a locally binarizing unit 23 shown in FIG. 3B may be arranged instead of the predetermined threshold value binarizing unit 21 shown in FIG. 3A, as illustrated in FIG. 3B. The locally binarizing unit 23 sets the local area where the process is to be performed for each pixel included in a drawing area within an entire gray-scale document image, and obtains a binary image by binarizing the image with the threshold value generated by using the pixel data within the local area.
That is, the threshold value of the level of brightness is obtained in a local range (a local area such as a regular square area centering around a target pixel), and the binarization process is performed so that a pixel with a level of brightness equal to or lower than the threshold value becomes black (the level of brightness is set to, for example, “1”), and a pixel with a level of brightness higher than the threshold value becomes white (the level of brightness is set to, for example, “0”). The threshold value of a local area is defined, for example, with the method using the linear combination of an average pixel value, a standard deviation value, and a variance. The threshold value is defined, for example, as follows. In the following equation, the local binarization parameter is a constant, and an optimum threshold value is obtained by suitably setting this parameter. Note that the gray scale is used in almost the same sense as the brightness in the following equation.
(threshold value)=(average gray scale)+(local binarization parameter)×(gray-scale standard deviation value)
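As a minimal sketch of this local binarization rule, the following Python function computes the threshold of a square local area centering around a target pixel from the linear combination above. The window half-size `half` and the local binarization parameter `k` (and their default values) are illustrative assumptions, not values from the patent.

```python
import numpy as np

def local_threshold(gray, i, j, half=5, k=-0.2):
    """Threshold for the pixel at (i, j) from a square local area.

    Implements (threshold) = (average gray scale)
                             + (local binarization parameter) * (std),
    per the formula above. `half` and `k` are illustrative assumptions.
    """
    area = gray[max(i - half, 0):i + half + 1,
                max(j - half, 0):j + half + 1].astype(float)
    return area.mean() + k * area.std()

# A pixel at or below the threshold is binarized to black ("1").
gray = np.random.randint(0, 256, (32, 32))
bit = int(gray[10, 10] <= local_threshold(gray, 10, 10))
```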
FIG. 4 is a schematic diagram explaining the principle of the process for generating subpixels.
In this figure, black circles indicate original pixels of a gray-scale image, while white circles indicate subpixels. Additionally, IA through ID indicate the levels of the gray scale of the original pixels A through D. Furthermore, I1 through I5 indicate the levels of gray scale of the subpixels 1 through 5 to be obtained from the levels of gray scale of the original pixels with the interpolation process.
As shown in this figure, if subpixels are generated with the linear interpolation within an area enclosed by the four original pixels, it is first determined how many subpixels will be arranged between the original pixels. Next, subpixels are positioned at regular intervals according to the number of subpixels to be arranged. Then, the levels of gray scale are assigned to the respective subpixels by interpolating the levels of gray scale of the original pixels.
Considered below is the case where the subpixels are arranged in the area enclosed by the original pixels A through D, based on the assumption that “p” and “q” are numbers larger than “0” and smaller than “1”. A subpixel 1 is arranged at the position that divides the straight line AB linking the original pixels A and B in the ratio “p:1−p”. In this case, the level of gray scale I1 of the subpixel 1 is obtained from the levels of gray scale IA and IB of the original pixels A and B with the linear interpolation according to the following equation.
I1 = p*IB + (1−p)*IA
Similarly, the levels of gray scale of the subpixels 2 through 4 are obtained as follows.
I2 = q*IC + (1−q)*IA
I3 = q*ID + (1−q)*IB
I4 = p*ID + (1−p)*IC
Additionally, the level of gray scale I5 of the subpixel 5 included in the area enclosed by the original pixels A through D can be obtained according to the following equation.
I5 = p*q*ID + p*(1−q)*IB + q*(1−p)*IC + (1−p)*(1−q)*IA
The above described calculation is made for all of the subpixels arranged between the original pixels, so that the process for generating subpixels is completed. The obtained levels of gray scale are stored with a method similar to that of original pixel data, along with the corresponding positions of the subpixels.
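A sketch of this subpixel generation process, assuming the same degree n is used in both directions so that n−1 subpixels are inserted between adjacent original pixels; the function name and looping strategy are illustrative, not from the patent.

```python
import numpy as np

def generate_subpixels(gray, n):
    """Insert n-1 subpixels between adjacent original pixels by the
    linear (bilinear) interpolation of FIG. 4."""
    h, w = gray.shape
    out = np.empty(((h - 1) * n + 1, (w - 1) * n + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            i, ky = divmod(y, n)               # enclosing original row
            j, kx = divmod(x, n)               # enclosing original column
            q, p = ky / n, kx / n              # interpolation ratios
            i2, j2 = min(i + 1, h - 1), min(j + 1, w - 1)
            ia, ib = gray[i, j], gray[i, j2]       # A, B (top edge)
            ic, id_ = gray[i2, j], gray[i2, j2]    # C, D (bottom edge)
            # I5 = p*q*ID + p*(1-q)*IB + q*(1-p)*IC + (1-p)*(1-q)*IA
            out[y, x] = (p * q * id_ + p * (1 - q) * ib
                         + q * (1 - p) * ic + (1 - p) * (1 - q) * ia)
    return out
```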
FIG. 5 exemplifies a table for specifying the number of subpixels to be generated, according to the resolution of a gray-scale image which is an input image.
The process for generating subpixels between original pixels must be performed at the beginning of the calculation process explained by referring to FIG. 4. Although the number of subpixels to be generated is arbitrary, it must be set so that sufficient character recognition accuracy can be obtained from the image for which subpixels are generated and which is binarized. Accordingly, there is no equation for uniquely determining the number of subpixels to be generated according to the resolution of an input image. The number of subpixels to be generated must be determined based on experimental data to some extent.
In FIG. 5, the resolution of an input image is stored in the left column in correspondence with a subpixel generation parameter in the right column. For example, if the resolution of an input image is 100 dpi, subpixels are generated to form a grid of 6×6 for each original pixel. If the resolution of an input image is 150 dpi, subpixels are generated in order to form a grid of 3×3 for each original pixel.
In this way, the above described subpixel processing unit stores the table shown in FIG. 5. When a gray-scale image is input, the subpixel processing unit obtains the number of subpixels to be generated according to the resolution of the input image, and determines the positions of the subpixels. Next, the subpixel processing unit obtains the levels of gray scale of the respective subpixels from the levels of gray scale of the original pixels with the interpolation process by using the calculation method explained by referring to FIG. 4, and generates the image data having a high resolution according to the resolution of the input image.
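The table lookup itself can be pictured as follows. The dictionary values mirror the two rows cited from FIG. 5, while the fallback for unlisted resolutions is purely an assumption; the snippet reuses generate_subpixels() from the sketch above.

```python
import numpy as np

# Hypothetical rendering of the FIG. 5 table: input resolution (dpi)
# mapped to the per-pixel subpixel grid.
SUBPIXEL_TABLE = {100: 6, 150: 3}   # 100 dpi -> 6x6, 150 dpi -> 3x3

def subpixel_factor(dpi):
    return SUBPIXEL_TABLE.get(dpi, 1)  # 1 means no subpixels generated

high_res = generate_subpixels(np.random.rand(4, 4), subpixel_factor(150))
```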
FIG. 6 is a block diagram exemplifying a further configuration of the variable resolution binarizing unit.
The configuration shown in FIG. 3B requires a relatively considerable amount of time in order to locally binarize an entire image area. In the meantime, with the configuration shown in FIG. 6, a drawing area roughly extracting unit may perform the global process for recognizing, as characters and their periphery, the image portion whose pixel values are lower than a global threshold value set for the pixel values of a gray-scale image, and for roughly extracting it as the drawing area. Then, a local binarization process may be performed.
That is, the drawing area roughly extracting unit 50 extracts the portion including characters and its periphery from a document image, and the locally binarizing unit 51 binarizes the gray-scale image within the extracted area. The rough extraction is performed by totaling, for each level of gray scale or brightness, the number of pixels having that level among all of the pixels of a document image, as will be described later. If the number of pixels is large in a portion where the level of brightness is high, that portion indicates the background of a document image. If the number of pixels is large in a portion where the level of brightness is low, that portion indicates the portion including a character of the document image. A threshold value is set to a value in the portion where the number of pixels is small, which corresponds to the middle of the two peaks. This threshold value setting is similar to that of the binarization process. Note, however, that the position at which the threshold value is set is slightly shifted toward a higher level of brightness. If the pixels included in a document image are extracted with such a threshold value (global threshold value), the extracted portion will include a character and its periphery. The locally binarizing unit 51 binarizes the image with another threshold value for the roughly extracted area, so that noise components caused by the light and shade of the background of the document image can be removed. Consequently, a binary image which is easier to recognize can be obtained.
Specifically, the global threshold value is determined with the linear sum of an average pixel value, a standard deviation value, and a variance. Or, it can be determined as follows. A global process parameter which will appear in the following equation is a constant.
(global threshold value)=(an average of all pixel values)+(global process parameter)×(standard deviation value of all pixel values)
The position of the global threshold value is adjusted to the position where the rough extraction can be most effectively made by making an adjustment with the global process parameter.
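A sketch of the rough extraction with such a global threshold; the parameter value is an arbitrary illustration, not a patent value.

```python
import numpy as np

def rough_extract(gray, global_param=0.5):
    """Rough extraction of the drawing area with the global threshold
        (average of all pixel values)
        + (global process parameter) * (standard deviation).
    Shifting the parameter upward keeps some bright pixels around the
    characters, so the extracted area includes a character and its
    periphery."""
    g = gray.astype(float)
    threshold = g.mean() + global_param * g.std()
    return g < threshold          # True where the drawing area lies
```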
FIG. 7 is a schematic diagram explaining how to set a global threshold value used for the rough extraction process.
When the rough extraction process is performed, as shown in this figure, the levels of brightness are obtained from all of the pixels structuring an entire document image, and the statistics of the frequency at which a pixel having a particular level of brightness appears in the document is collected in a similar manner as in the case where the predetermined threshold value is determined for the entire document image. In this figure, the appearance frequency of the pixel against the level of brightness forms a gentle curve. However, this is illustrated for ease of explanation about the threshold value setting method. If the statistical process representing the frequency at which a pixel having a certain level of brightness appears is actually performed with a device, the result will be a histogram.
In FIG. 7, frequency peaks are formed in the respective portions where the levels of brightness are low and high. The portion where the level of brightness is high is the background of a document image, and this level of brightness is the level of brightness of the paper on which the document is created. In the meantime, the portion where the level of brightness is low is an area where a character or a graphic, etc. is drawn. If the frequencies at which pixels appear are classified depending on the levels of brightness, the pixels are grouped into two major groups such as the pixels included in the drawing area and the pixels included in a background. Therefore, the rough extraction process can be performed by setting the threshold value of the level of brightness to almost the middle of the two peaks, and extracting the pixels having the levels of brightness which are lower than this threshold value from the document image.
If a document is structured by the reverse video of a black-and-white image, the correspondence between the two peaks and the drawing area/background is reversed. Accordingly, the pixels having levels of brightness which are higher than a preset threshold value are output as a drawing area.
If an entire document is binarized with a predetermined threshold value, the threshold value is set to a value corresponding to almost the middle of the two peaks, that is, the value at the bottom of the frequency curves. This is because the character recognition cannot be made with high accuracy if the pixels of the background are also captured. When the rough extraction process is performed, however, it is more convenient for setting a threshold value in the binarization process if the extracted area includes the periphery of a character. Therefore, the threshold value is set to the value which is slightly shifted from the bottom toward the peak of the background. In this way, the area where a character, etc. is drawn and its periphery are extracted.
FIGS. 8A and 8B are block diagrams exemplifying further configurations of the variable resolution binarizing unit.
If the configurations shown in FIGS. 3A and 6 are combined, the global process for roughly extracting a drawing area is performed after the subpixel generation process is performed for a gray-scale image. Then, the local binarization process is performed for each of the pixels included in the drawing area. As a result, the processes can be performed at a higher speed than that implemented by the configuration shown in FIG. 3A, and at the same time, the recognition accuracy is improved more than that available from the configuration shown in FIG. 6.
With the configuration shown in FIG. 8A, the subpixel generation process is initially performed by a subpixel processing unit 70, and then a document image is roughly extracted by a drawing area roughly extracting unit 71. With the subpixel generation process, the information about the level of brightness of a gray-scale image can be prevented from being lost, or rather, the information about the level of brightness is included. As a result, the accuracy of the binarization process can be improved. Furthermore, performing the rough extraction process eliminates the need to target the whole of a document image, thereby reducing the number of pixels to be processed and improving the processing speed of the locally binarizing unit.
The subpixel processing unit 70 performs the subpixel generation process for the whole of a document image. The drawing area roughly extracting unit 71 performs the rough extraction process by also targeting subpixels. Accordingly, the processing speed and the accuracy can be improved further than that implemented in the case where only the rough extraction process is performed. However, the subpixel generation process must be performed for the whole of the document image in this case. Therefore, the amounts of data handled by the subpixel processing unit 70 and the drawing area roughly extracting unit 71 increase, which leads to the slowdown of the processing speed and the insufficiency of the capacity of a memory storing data, although the processing speed of the locally binarizing unit 72 can be improved.
FIG. 8B is a block diagram exemplifying the configuration which improves the processing speed implemented by the configuration shown in FIG. 8A.
The configuration shown in FIG. 8B is obtained by reversing the processing order of the subpixel processing unit 70 and the drawing area roughly extracting unit 71, which are included in the configuration shown in FIG. 8A. Because the process of the drawing area roughly extracting unit 73 is performed for the original gray-scale image whose number of pixels is not increased with the subpixel generation process, the amount of data handled by the drawing area roughly extracting unit 73 is reduced and the processing speed is made faster.
Additionally, it is sufficient for the subpixel processing unit 74 to also perform the subpixel generation process, not for the whole of an input gray-scale image, but only for the area extracted by the drawing area roughly extracting unit 73. Accordingly, the number of pixels to be handled is reduced much more than that in the case where the subpixel generation process is performed for the whole of the input gray-scale image, in the configuration shown in FIG. 8A. Consequently, the capacity of the memory storing data can be decreased, and at the same time, the processing speed can be improved.
Furthermore, since the amount of throughput of the locally binarizing unit 75 is approximately the same as that of the corresponding unit shown in FIG. 8A, the processing speed on the whole can be improved, and also the amount of required hardware resources such as a memory, etc. can be reduced.
Generally, with the configurations (shown in FIGS. 3A, 8A, and 8B) for performing the local binarization process after the subpixel generation process, the threshold value of the local binarization process is calculated from the subpixel value at the interpolation point (subpixel). However, since the subpixel value at the interpolation point is obtained from the original pixel values, the local binarization process can be performed according to the equation obtained by substituting the equation for obtaining a subpixel value at an interpolation point from original pixel values into the equation for obtaining a local threshold value from the subpixel value at the interpolation point.
FIG. 9 is a block diagram exemplifying the details of the configuration of the locally binarizing unit.
Because the contents of the subpixel generation process performed by a subpixel processing unit 80 are the same as those explained by referring to FIG. 4, the explanation is omitted here. A locally binarizing unit 81 is composed of a threshold value calculating unit 82 and a comparing unit 83. The threshold value calculating unit 82 directly obtains the threshold value of the local binarization process from a gray-scale image. The comparing unit 83 makes a comparison between the threshold value and the value at the point where a subpixel is generated, which is obtained by the subpixel processing unit 80, and outputs the result of the binarization process.
That is, the information about a pixel point of an input gray-scale image is directly input to the threshold value calculating unit 82. Furthermore, the threshold value calculating unit 82 sets a local area by targeting a particular pixel point, statistically classifies the level of gray scale (the level of brightness) at each pixel point within the local area as explained by referring to FIG. 7, and sets a threshold value which allows the distinction between the pixel having the level of brightness which represents a background, and the pixel having the level of brightness which represents a drawing area, to be made as definite as possible. In the explanation of the rough extraction process referring to FIG. 7, the threshold value was described as being shifted toward the level of brightness representing a background. If the binarization process is performed, however, the threshold value is set to the value corresponding to the bottom of the frequency curves shown in the middle of FIG. 7. With such a threshold value setting, the distinction between the pixel structuring a character and the pixel structuring a background can be definitely made. As a result, a binary image which is easier to recognize can be obtained.
The threshold value calculated by the threshold value calculating unit 82 is input to the comparing unit 83. The subpixel processing unit 80 generates the information about each pixel point of the gray-scale image for which the subpixel generation process is performed, and about the level of brightness at each pixel point, and inputs to the comparing unit 83 the information about the level of brightness of the pixel point targeted by the threshold value calculating unit 82. The comparing unit 83 then makes a comparison between the level of brightness at each pixel point and the threshold value. Here, assume that the level of brightness at a pixel point, which is higher than the threshold value, is set to “0”, while the level of brightness at the pixel point, which is lower than the threshold value, is set to “1”. The pixel point to be targeted is sequentially changed, and the above described process is repeatedly performed, so that a binary image corresponding to the input gray-scale image can be obtained. The pixel point referred to in the above explanation includes both an original pixel of an input gray-scale image and a subpixel generated by performing the subpixel generation process. If the pixel point is used to indicate an original pixel of an input gray-scale image, it is hereinafter referred to as an original pixel point. With the configuration shown in FIG. 9, the binarization process is performed for the whole of image data whose number of pixels is increased by performing the subpixel generation process for the whole of the input gray-scale image.
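Under the reading that this FIG. 9 configuration calculates a local threshold at subpixel points and original pixel points alike (as contrasted with the FIG. 10 configuration described next), the flow might be sketched as follows. It reuses local_threshold() and generate_subpixels() from the earlier sketches; scaling the window by n is an assumption.

```python
import numpy as np

def binarize_fig9(gray, n, half=5, k=-0.2):
    """Sketch: generate subpixels for the whole image, then compute a
    local threshold at every pixel point of the high-resolution image
    (original points and subpixels alike) and compare."""
    high = generate_subpixels(gray.astype(float), n)
    h, w = high.shape
    out = np.empty((h, w), dtype=np.uint8)
    for i in range(h):
        for j in range(w):
            out[i, j] = high[i, j] <= local_threshold(high, i, j, half * n, k)
    return out   # 1 = black, 0 = white
```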
FIG. 10 shows the details of the second configuration of the locally binarizing unit.
In this figure, the local threshold value after the subpixel generation process is obtained only at each original pixel point of a gray-scale image. The local threshold value at an interpolation point (a subpixel point) is obtained by interpolating the local threshold values at the original pixel points of the gray-scale image. Namely, with the configuration shown in FIG. 9, both subpixel points and original pixel points of a gray-scale image are handled in a similar manner, and the threshold value for the local binarization process is calculated. However, with the configuration shown in FIG. 10, only the threshold value at an original pixel point is calculated, and the threshold value at a subpixel point is obtained with the interpolation process. Actually, the interpolation process explained by referring to FIG. 4 is not performed for the level of brightness at each original pixel point. The threshold value at a subpixel point is obtained by interpolating threshold values obtained at original pixel points, similar to the level of brightness.
That is, a pixel point threshold value calculating unit 91 obtains the local threshold value after the subpixel generation process only at an original pixel point of a gray-scale image. An interpolating unit 92 obtains the local threshold value at an interpolation point (subpixel) by interpolating the local threshold values at the original pixel points of the gray-scale image. The comparing unit 83 makes a comparison between the pixel value at a pixel point, which is obtained by the subpixel processing unit 80, and the threshold value obtained by the interpolating unit 92, and outputs the result of the binarization process. Such a binarization process is performed for all of the pixel points obtained by the subpixel processing unit 80, thereby obtaining a binary image.
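A sketch of this FIG. 10 flow (thresholds computed at original pixel points only, then interpolated to subpixels), again reusing the illustrative helpers local_threshold() and generate_subpixels() defined above:

```python
import numpy as np

def binarize_fig10(gray, n, half=5, k=-0.2):
    """Sketch: thresholds at original pixels are interpolated to the
    subpixel grid with the same bilinear scheme used for pixel values."""
    h, w = gray.shape
    thresh = np.array([[local_threshold(gray, i, j, half, k)
                        for j in range(w)] for i in range(h)])
    high_res = generate_subpixels(gray.astype(float), n)
    high_thresh = generate_subpixels(thresh, n)   # interpolated thresholds
    return (high_res <= high_thresh).astype(np.uint8)  # 1 = black
```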
The method for obtaining the local threshold value at an original pixel point of a gray-scale image, which is executed by the pixel point threshold value calculating unit 91, includes the following methods.
(1) Obtaining the local threshold value at an original pixel point of a gray-scale image by using the value of the subpixel obtained by interpolating original pixel points of the gray-scale image.
(2) Obtaining the local threshold value by substituting the equation for obtaining the subpixel value at the interpolation points from the original pixel values into the equation for obtaining the local threshold value from the subpixel value at the interpolation points.
(3) Obtaining the local threshold value as the linear combination of an average value, a standard deviation value, and a variance, by obtaining the standard deviation value and the variance after obtaining the average value and the square average value of the values at the pixel points of the gray-scale image for which the subpixel generation process is performed, from an equation obtained by substituting the equation for obtaining a subpixel value from original pixel values into the equation for obtaining the average value and the square average value from subpixel values.
The configuration implementing the method (1) is shown in FIG. 10.
The configuration implementing the method (2) is shown in FIG. 11. The pixel point threshold value calculating unit uses only the pixel values of a gray-scale image.
The method (3) is applicable if the local threshold value is the linear combination of an average pixel value, a standard deviation value, and a variance within a local area. The configuration implementing the method (3) is shown in FIG. 12. The variance is obtained from the average pixel value and the square average value. The standard deviation value is obtained from the variance. The specific example of the equation will be provided next.
(variance) = (average of square values) − (square of average value)
(standard deviation) = √(variance)
If original pixels lie on the boundary of the local area, the equations for obtaining the local threshold value and the average values are simplified, which leads to a speeding-up of the processing. Therefore, the method using these simplified expressions may be adopted.
Additionally, the locally binarizing unit may comprise a local area specifying unit, by which the process may be performed by making a distinction between the case where original pixels do not lie on the boundary of a local area and the case where they do (FIG. 13). The local area specifying unit is a unit for using the number of subpixels generated with the interpolation process and the size of a local area as its specification data, and a local area original pixel boundary determining unit is a unit for automatically determining whether or not original pixels lie on the boundary of the local area.
Provided below are the explanations about the configurations shown in FIGS. 11 through 13.
FIG. 11 exemplifies the third configuration of the locally binarizing unit.
The locally binarizing unit 100 is composed of a pixel point threshold value calculating unit 101, an interpolating unit 102, and a comparing unit 83.
When a gray-scale image is input, the data of the original pixels of the gray-scale image are input to the subpixel processing unit 80 and the pixel point threshold value calculating unit 101. The subpixel processing unit 80 generates a predetermined number of subpixels between the original pixels by performing the above described process. The pixel point threshold value calculating unit 101 obtains the threshold value at an original pixel point only from the values of the original pixels. The threshold value is obtained by generating the data shown in FIG. 7, and using the average value, the variance, etc. The threshold value obtained at each original pixel point is used by the interpolating unit 102 in order to obtain the threshold value of a subpixel with a method such as the linear interpolation method, etc. The comparing unit 83 makes a comparison between the value of an original pixel or the value of a subpixel generated by the subpixel processing unit 80, and the threshold value at each pixel point, which is obtained by the pixel point threshold value calculating unit 101 and the interpolating unit 102, and binarizes the pixel value. A binary image is obtained by performing such a process for all of the pixel points.
FIG. 12 is a block diagram exemplifying the fourth configuration of the locally binarizing unit.
A locally binarizing unit 110 is composed of a pixel point threshold value calculating unit 111, an interpolating unit 117, and a comparing unit 118. The contents of the process performed by the subpixel processing unit 80 are similar to those described above. The data of an input gray-scale image is input to the subpixel processing unit 80, and to the pixel point threshold value calculating unit 111 included in the locally binarizing unit 110. The pixel point threshold value calculating unit 111 obtains the values at the pixel points within the local area centering around a targeted pixel point. A pixel point average value calculating unit 112 calculates an average of the pixel values. A pixel point square average value calculating unit 113 calculates an average of the squares of the pixel values within the local area. The average of the pixel values from the pixel point average value calculating unit 112 and the average of the squares of the pixel values from the pixel point square average value calculating unit 113 are input to a pixel point variance calculating unit 114, which calculates the variance of the distribution of the pixel values. The calculated variance is input to a pixel point standard deviation value calculating unit 115, which obtains the standard deviation value of the distribution of the pixel values.
The average, variance, and standard deviation values of the pixel values are input to a linear combination calculating unit 116, which calculates a threshold value. The expression for calculating a threshold value must be suitably set by a skilled artisan. According to this preferred embodiment, however, the linear combination of the average and standard deviation values is used as described above, and the threshold value is adjusted by multiplying the standard deviation value by a parameter in order to adjust to what degree the standard deviation value affects the threshold value to be set.
After the threshold value is obtained for a certain targeted pixel point in this way, this threshold value calculation process is repeatedly performed for the respective original pixel points of an input gray-scale image until the threshold values are obtained for all of the original pixels. These threshold values are transmitted to the interpolating unit 117, which then obtains the threshold values of subpixels by performing the interpolation process. As a result of the above described processes, the threshold values of the original pixels and the subpixels are obtained. The threshold values are compared with the pixel values transmitted from the subpixel processing unit 80 in the comparing unit 118, and the pixel values are then binarized and output.
FIG. 13 is a block diagram exemplifying the fifth configuration of the locally binarizing unit.
In this example, a locally binarizing unit 120 is composed of a local area original pixel boundary determining unit 121, an original pixel boundary pixel point threshold value calculating unit 122, an original pixel non-boundary pixel point threshold value calculating unit 123, an interpolating unit 124, and a comparing unit 125.
The data of an input gray-scale image are input to the subpixel processing unit 80, and the above described process is performed. Then, the data after the subpixel generation process is input to the comparing unit 125.
The data of the input gray-scale image is input also to the local area original pixel boundary determining unit 121 included in the locally binarizing unit 120. A local area specifying unit 126 is an input unit for specifying where to set a local area for a particular original pixel. The specification may be made either manually or automatically. Especially, if the local area is contained within the line forming a character, the local binarization process may fail in some cases. It is therefore desirable that the local area include the portion of the line forming a character and the portion of a background in a ratio of 1 to 1.
When the range of the local area is set by the local area specifying unit 126 for the target pixel, the local area original pixel boundary determining unit 121 determines whether or not original pixels exist on the boundary of the local area. If the original pixels exist on the boundary of the local area, data is transmitted to the original pixel boundary pixel point threshold value calculating unit 122, which is made to calculate the threshold value for each of the original pixels. If the original pixels do not exist on the boundary of the local area, the data is transmitted to the original pixel non-boundary pixel point threshold value calculating unit 123, which is made to calculate the threshold value for each of the original pixels. The methods for calculating a threshold value, which are executed by the original pixel boundary pixel point threshold value calculating unit 122 and the original pixel non-boundary pixel point threshold value calculating unit 123, will be described later.
Since the threshold values calculated by the original pixel boundary pixel point threshold value calculating unit 122 or the original pixel non-boundary pixel point threshold value calculating unit 123 are intended only for the original pixels, the threshold values are transmitted to the interpolating unit 124, which obtains the threshold value for a subpixel by performing the interpolation process.
The threshold values obtained for the original pixels and the subpixels in this way are transmitted to the comparing unit 125, which makes a comparison between the threshold values and the pixel values transmitted from the subpixel processing unit 80. The pixels are then binarized and output.
Here, the expressions for directly calculating a local average value and a local square average value from original pixel values will be provided. A local variance and a local standard deviation value are obtained with fundamental arithmetic operations by calculating the local average value and the local square average value, and the linear combination of the local variance and the local standard deviation value is used, whereby a threshold value is obtained according to the above described expressions referred to in the explanation of FIGS. 3A and 3B.
FIG. 14 is a schematic diagram explaining the method for calculating a local average value and a local square average value from original pixel values.
First of all, symbols will be defined. A local area centering around an original pixel is illustrated in FIG. 14. The coordinate of the targeted original pixel is assumed to be (0, 0) for ease of explanation. The coordinate description for an arbitrary original pixel is obtained by parallel translating this coordinate. The degree of the subpixel generation is defined to be n (≧1), and (n−1) new subpixels are inserted between two original pixels. The local area is a square area centering around a targeted original pixel. The upper left original pixel of the regular square formed by the original pixels included in the local area (that is, the original pixel farthest from the targeted original pixel) is assumed to be at (−M, −M), with the condition 1≦M imposed. The number of subpixels which exist outside the regular square but within the local area in one direction after the subpixel generation process is assumed to be “r”. “r” must satisfy 0≦r<n. The original pixel value at a coordinate (i, j) is represented as I(i, j).
Assuming that one side of the local area is “L”, the following equation is satisfied.
L=2(M*n+r)+1 (0≦r<n)
That is, “M” original pixels exist on one side of the targeted pixel point, for example, in a vertical direction, “n−1” subpixels exist between adjacent original pixels, and “r” subpixels exist outside the “M”-th original pixel. Since the pixels exist in both the top and the bottom of the local area in a similar manner, 2(M*n+r) is obtained. Additionally, the single target point is added, so that the above described equation is obtained.
“L” is an odd number larger than “2n”. Conversely, if a positive number “n” and an odd number “L” larger than “2n” are given, “M” and “r” which satisfy the above described equation are uniquely determined. That is, if (L−1)/2 is divided by “n” and its quotient and remainder are obtained, the quotient and the remainder respectively correspond to “M” and “r”.
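In code, M and r follow directly from integer division, as in this small check; the function name is illustrative.

```python
def area_parameters(L, n):
    """Recover M and r from an odd local-area side L > 2n."""
    M, r = divmod((L - 1) // 2, n)
    return M, r

# Example: n = 3, M = 3, r = 1 gives L = 2*(3*3 + 1) + 1 = 21,
# and dividing (21 - 1)/2 = 10 by 3 recovers quotient 3, remainder 1.
assert area_parameters(21, 3) == (3, 1)
```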
Represented below are the equations for obtaining a local average value and a local square average value, which are respectively used in the case (1) where original pixels lie on the boundary of a local area (r=0), and in a normal case (2). The expressions used in the case (1) are simpler than those used in the case (2). The expressions are represented as follows.
local average value = E(I)
local square average value = E(I²)
Ci (i=0, 1, 2, 3, . . . ) is a coefficient which will appear in the following expressions. Even if identical symbols are used, their definitions differ depending on the expressions. Since E(I) and E(I²) are multiplied by C0 in the following expressions, the right side of each expression is divided by C0 in order to obtain E(I) and E(I²).
(1) In the case where original pixels exist on the boundary of a local area (r=0)

$$C_0 E(I) = C_1 \sum_{i=-M+1}^{M-1}\sum_{j=-M+1}^{M-1} I(i,j) + C_2 \sum_{k=-M+1}^{M-1}\bigl\{I(-M,k)+I(M,k)+I(k,-M)+I(k,M)\bigr\} + C_3 \bigl\{I(-M,-M)+I(-M,M)+I(M,-M)+I(M,M)\bigr\}$$

Note that the coefficients are defined by the following equations.

$$C_0 = 4(2Mn+1)^2,\qquad C_1 = 4n^2,\qquad C_2 = 2n(n+1),\qquad C_3 = (n+1)^2$$

$$\begin{aligned}
C_0 E(I^2) ={}& \Bigl[\,C_1 \sum_{i=-M+1}^{M-1}\sum_{j=-M+1}^{M-1} I(i,j)^2 + C_2 \sum_{k=-M+1}^{M-1}\bigl\{I(-M,k)^2+I(M,k)^2+I(k,-M)^2+I(k,M)^2\bigr\}\\
&\quad + C_3 \bigl\{I(-M,-M)^2+I(-M,M)^2+I(M,-M)^2+I(M,M)^2\bigr\}\Bigr]\\
&+ \Bigl[\,C_4 \sum_{i=-M+1}^{M-1}\sum_{j=-M}^{M-1} I(i,j)\,I(i,j+1) + C_5 \sum_{j=-M}^{M-1}\bigl\{I(-M,j)I(-M,j+1)+I(M,j)I(M,j+1)\bigr\}\Bigr]\\
&+ \Bigl[\,C_4 \sum_{i=-M}^{M-1}\sum_{j=-M+1}^{M-1} I(i,j)\,I(i+1,j) + C_5 \sum_{i=-M}^{M-1}\bigl\{I(i,-M)I(i+1,-M)+I(i,M)I(i+1,M)\bigr\}\Bigr]\\
&+ \Bigl[\,C_6 \sum_{i=-M}^{M-1}\sum_{j=-M}^{M-1}\bigl\{I(i,j)I(i+1,j+1)+I(i,j+1)I(i+1,j)\bigr\}\Bigr]
\end{aligned}$$

Note that the coefficients are defined by the following equations.

$$\begin{aligned}
C_0 &= 36(2Mn+1)^2 n^2\\
C_1 &= 4(2n^2+1)^2\\
C_2 &= 2(n+1)(2n+1)(2n^2+1)\\
C_3 &= (n+1)^2(2n+1)^2\\
C_4 &= 4(n^2-1)(2n^2+1)\\
C_5 &= 2(n+1)(n^2-1)(2n+1)\\
C_6 &= 2(n^2-1)^2
\end{aligned}$$

(2) In a normal case

$$\begin{aligned}
C_0 E(I) ={}& C_1 \sum_{i=-M+1}^{M-1}\sum_{j=-M+1}^{M-1} I(i,j) + C_2 \sum_{k=-M+1}^{M-1}\bigl\{I(-M,k)+I(M,k)+I(k,-M)+I(k,M)\bigr\}\\
&+ C_3 \bigl\{I(-M,-M)+I(-M,M)+I(M,-M)+I(M,M)\bigr\}\\
&+ C_4 \sum_{k=-M+1}^{M-1}\bigl\{I(-M-1,k)+I(M+1,k)+I(k,-M-1)+I(k,M+1)\bigr\}\\
&+ C_5 \bigl\{I(-M-1,-M)+I(-M-1,M)+I(M+1,-M)+I(M+1,M)\\
&\qquad +I(-M,-M-1)+I(-M,M+1)+I(M,-M-1)+I(M,M+1)\bigr\}\\
&+ C_6 \bigl\{I(-M-1,-M-1)+I(-M-1,M+1)+I(M+1,-M-1)+I(M+1,M+1)\bigr\}
\end{aligned}$$

Note that the coefficients are defined by the following equations.

$$\begin{aligned}
C_0 &= 4n^2(2Mn+2r+1)^2\\
C_1 &= 4n^4\\
C_2 &= 2n^2\{n^2+(2r+1)n-r(r+1)\}\\
C_3 &= n^4+2(2r+1)n^3+(2r^2+2r+1)n^2-2r(r+1)(2r+1)n+r^2(r+1)^2\\
C_4 &= 2r(r+1)n^2\\
C_5 &= r(r+1)\{n^2+(2r+1)n-r(r+1)\}\\
C_6 &= r^2(r+1)^2
\end{aligned}$$

The expression below for the local square average value in the normal case (2) is used in the case where “M” is equal to or larger than “1”.

$$\begin{aligned}
C_0 E(I^2) ={}& \Bigl[\,C_1 \sum_{i=-M+1}^{M-1}\sum_{j=-M+1}^{M-1} I(i,j)^2
+ C_2 \sum_{k=-M+1}^{M-1}\bigl\{I(-M,k)^2+I(M,k)^2+I(k,-M)^2+I(k,M)^2\bigr\}\\
&\quad + C_3 \bigl\{I(-M,-M)^2+I(-M,M)^2+I(M,-M)^2+I(M,M)^2\bigr\}\\
&\quad + C_4 \sum_{k=-M+1}^{M-1}\bigl\{I(-M-1,k)^2+I(M+1,k)^2+I(k,-M-1)^2+I(k,M+1)^2\bigr\}\\
&\quad + C_5 \bigl\{I(-M-1,-M)^2+I(-M-1,M)^2+I(M+1,-M)^2+I(M+1,M)^2\\
&\qquad\quad +I(-M,-M-1)^2+I(-M,M+1)^2+I(M,-M-1)^2+I(M,M+1)^2\bigr\}\\
&\quad + C_6 \bigl\{I(-M-1,-M-1)^2+I(-M-1,M+1)^2+I(M+1,-M-1)^2+I(M+1,M+1)^2\bigr\}\Bigr]\\
&+ \Bigl[\,C_7 \sum_{i=-M+1}^{M-1}\sum_{j=-M}^{M-1} I(i,j)\,I(i,j+1)
+ C_8 \sum_{j=-M}^{M-1}\bigl\{I(-M,j)I(-M,j+1)+I(M,j)I(M,j+1)\bigr\}\\
&\quad + C_9 \sum_{j=-M}^{M-1}\bigl\{I(-M-1,j)I(-M-1,j+1)+I(M+1,j)I(M+1,j+1)\bigr\}\\
&\quad + C_{10} \sum_{i=-M+1}^{M-1}\bigl\{I(i,-M-1)I(i,-M)+I(i,M)I(i,M+1)\bigr\}\\
&\quad + C_{11} \bigl\{I(-M,-M-1)I(-M,-M)+I(-M,M)I(-M,M+1)+I(M,-M-1)I(M,-M)+I(M,M)I(M,M+1)\bigr\}\\
&\quad + C_{12} \bigl\{I(-M-1,-M-1)I(-M-1,-M)+I(-M-1,M)I(-M-1,M+1)\\
&\qquad\quad +I(M+1,-M-1)I(M+1,-M)+I(M+1,M)I(M+1,M+1)\bigr\}\Bigr]\\
&+ \Bigl[\,C_7 \sum_{i=-M}^{M-1}\sum_{j=-M+1}^{M-1} I(i,j)\,I(i+1,j)
+ C_8 \sum_{i=-M}^{M-1}\bigl\{I(i,-M)I(i+1,-M)+I(i,M)I(i+1,M)\bigr\}\\
&\quad + C_9 \sum_{i=-M}^{M-1}\bigl\{I(i,-M-1)I(i+1,-M-1)+I(i,M+1)I(i+1,M+1)\bigr\}\\
&\quad + C_{10} \sum_{j=-M+1}^{M-1}\bigl\{I(-M-1,j)I(-M,j)+I(M,j)I(M+1,j)\bigr\}\\
&\quad + C_{11} \bigl\{I(-M-1,-M)I(-M,-M)+I(M,-M)I(M+1,-M)+I(-M-1,M)I(-M,M)+I(M,M)I(M+1,M)\bigr\}\\
&\quad + C_{12} \bigl\{I(-M-1,-M-1)I(-M,-M-1)+I(M,-M-1)I(M+1,-M-1)\\
&\qquad\quad +I(-M-1,M+1)I(-M,M+1)+I(M,M+1)I(M+1,M+1)\bigr\}\Bigr]\\
&+ \Bigl[\,C_{13} \sum_{i=-M}^{M-1}\sum_{j=-M}^{M-1}\bigl\{I(i,j)I(i+1,j+1)+I(i,j+1)I(i+1,j)\bigr\}\\
&\quad + C_{14} \sum_{k=-M}^{M-1}\Bigl(\bigl\{I(-M-1,k)I(-M,k+1)+I(-M-1,k+1)I(-M,k)\bigr\}
+\bigl\{I(M,k)I(M+1,k+1)+I(M,k+1)I(M+1,k)\bigr\}\\
&\qquad\quad +\bigl\{I(k,-M-1)I(k+1,-M)+I(k+1,-M-1)I(k,-M)\bigr\}
+\bigl\{I(k,M)I(k+1,M+1)+I(k+1,M)I(k,M+1)\bigr\}\Bigr)\\
&\quad + C_{15} \Bigl(\bigl\{I(-M-1,-M-1)I(-M,-M)+I(-M-1,-M)I(-M,-M-1)\bigr\}
+\bigl\{I(-M-1,M)I(-M,M+1)+I(-M-1,M+1)I(-M,M)\bigr\}\\
&\qquad\quad +\bigl\{I(M,-M-1)I(M+1,-M)+I(M,-M)I(M+1,-M-1)\bigr\}
+\bigl\{I(M,M)I(M+1,M+1)+I(M,M+1)I(M+1,M)\bigr\}\Bigr)\Bigr]
\end{aligned}$$

Note that the coefficients are defined by the following equations.

$$\begin{aligned}
C_0 &= 36n^4(2Mn+2r+1)^2\\
C_1 &= 4n^2(2n^2+1)^2\\
C_2 &= 2n\{4n^5+6(2r+1)n^4+4(-3r^2-3r+1)n^3+(4r^3+6r^2+8r+3)n^2+(-6r^2-6r+1)n+r(r+1)(2r+1)\}\\
C_3 &= 4n^6+12(2r+1)n^5+(12r^2+12r+13)n^4-2(32r^3+48r^2+10r-3)n^3\\
&\qquad +(60r^4+120r^3+54r^2-6r+1)n^2-2r(12r^4+30r^3+22r^2+3r-1)n+r^2(r+1)^2(2r+1)^2\\
C_4 &= 2r(r+1)(2r+1)n(2n^2+1)\\
C_5 &= r(r+1)(2r+1)\{2n^3+3(2r+1)n^2+(-6r^2-6r+1)n+r(r+1)(2r+1)\}\\
C_6 &= r^2(r+1)^2(2r+1)^2\\
C_7 &= 4n^2(n^2-1)(2n^2+1)\\
C_8 &= 2n(n^2-1)\{2n^3+3(2r+1)n^2+(-6r^2-6r+1)n+r(r+1)(2r+1)\}\\
C_9 &= 2r(r+1)(2r+1)n(n^2-1)\\
C_{10} &= 4r(r+1)n\{3n-(2r+1)\}(2n^2+1)\\
C_{11} &= 2r(r+1)\{6n^4+7(2r+1)n^3-30r(r+1)n^2+(18r^3+27r^2+7r-1)n-r(4r^3+8r^2+5r+1)\}\\
C_{12} &= 2r^2(r+1)^2(2r+1)\{3n-(2r+1)\}\\
C_{13} &= 2n^2(n^2-1)^2\\
C_{14} &= 2r(r+1)n(n^2-1)\{3n-(2r+1)\}\\
C_{15} &= 2r^2(r+1)^2\{3n-(2r+1)\}^2
\end{aligned}$$

The expression which is used in the case where M=0 can be obtained in the following three steps.
(1) Setting “M” to “1”.
(2) Replacing “n” with “r”.
(3) Replacing the value of “I” with the boundary value of a local area
The variance and the standard deviation can then be obtained from the average value and the square average value of the pixel values within the local area, both of which are computed with the above expressions; the expression for determining a threshold value follows from their linear combination.
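As a minimal sketch of this step (Python with NumPy; the weights a, b and c are placeholder assumptions, since the text fixes only that the threshold is a linear combination of the average, the standard deviation and the variance):

import numpy as np

def local_threshold(values, a=1.0, b=-0.2, c=0.0):
    # values: pixel (or subpixel) values within one local area.
    v = np.asarray(values, dtype=np.float64)
    mean = v.mean()                        # average value
    sq_mean = (v**2).mean()                # square average value
    variance = max(sq_mean - mean**2, 0.0)
    std = variance**0.5                    # standard deviation
    return a*mean + b*std + c*variance     # linear combination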
FIG. 15 is a block diagram exemplifying the sixth configuration of the locally binarizing unit.
With this configuration, the difference between the pixel value at an original pixel point of the gray-scale image and the local threshold value is interpolated at each interpolation point (subpixel), and the value of the binary image at the interpolation point is determined by the sign of the interpolated value. Because the interpolation process is reduced from twice to once with this configuration, the processing speed can be improved. Especially, if the subpixel generation process is performed with the linear interpolation method, the result is exactly the same as when the pixel values and the threshold values are interpolated separately and then compared, because of the following identity.
subpixel value of {original pixel value - binarization threshold value} = {value obtained by performing the subpixel generation process for the original pixel values} - {value obtained by performing the subpixel generation process for the binarization threshold values}
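In symbols, if the subpixel generation process is a linear interpolation with weights w_i(x) over the neighboring original pixel points (our notation, matching the linear case singled out above), the identity follows directly:

\tilde{D}(x) = \sum_i w_i(x)\,(I_i - \theta_i) = \sum_i w_i(x)\,I_i - \sum_i w_i(x)\,\theta_i = \tilde{I}(x) - \tilde{\theta}(x)

where I_i and \theta_i denote the pixel value and the binarization threshold value at original pixel point i, and the tilde denotes the value interpolated at subpixel position x.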
That is, the locally binarizing unit 140 is composed of a pixel point threshold value calculating unit 141, a difference calculating unit 142, an interpolating unit 143, and a sign determining unit 144.
The data of an input gray-scale image are input directly to the difference calculating unit 142 and to the pixel point threshold value calculating unit 141. The pixel point threshold value calculating unit 141 calculates, for each original pixel point of the input gray-scale image, the threshold value within the local area from the original pixel values included in that area alone, and generates the threshold value for the targeted original pixel. Next, the targeted original pixel value and the generated threshold value are input to the difference calculating unit 142, which calculates the difference between them. The interpolating unit 143 then obtains the value at each subpixel by interpolating this difference, in the same manner as the above described interpolation of the levels of brightness (gray scale). Since the interpolated value is identical to the difference between the subpixel value and the threshold value, the differences at the original pixels and at the subpixels are input to the sign determining unit 144, which examines their signs.
For example, if the pixel value is larger than the threshold value, the above described difference becomes positive, and, for example, the value “0” is assigned to this pixel when being binarized. If the pixel value is smaller than the threshold value, the difference becomes negative, and, for example, the value “1” is assigned to the pixel when being binarized. By performing such a process for all of the pixel points, a binary image can be obtained.
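A sketch of this FIG. 15 flow under the conventions just described (Python with NumPy; binarize_by_difference and the callable local_thresh are our names, and bilinear interpolation stands in for the subpixel generation process):

import numpy as np

def binarize_by_difference(gray, local_thresh, n=2):
    # Difference at original pixel points: pixel value minus threshold.
    diff = gray.astype(np.float64) - local_thresh(gray)
    h, w = diff.shape
    out = np.zeros(((h - 1)*n + 1, (w - 1)*n + 1), dtype=np.uint8)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            fy, fx = y / n, x / n
            y0, x0 = int(fy), int(fx)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = fy - y0, fx - x0
            # Single bilinear interpolation of the difference.
            d = (diff[y0, x0]*(1 - dy)*(1 - dx) + diff[y0, x1]*(1 - dy)*dx
                 + diff[y1, x0]*dy*(1 - dx) + diff[y1, x1]*dy*dx)
            out[y, x] = 0 if d >= 0 else 1  # sign decides: 0 background, 1 character
    return out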
FIG. 16 is a block diagram exemplifying the seventh configuration of the locally binarizing unit.
With this configuration, the local threshold value after the subpixel generation process is obtained at each pixel point of a gray-scale image, and a binary image is obtained by using a table to which the values at 4 pixel points of the gray-scale image and the local threshold values are input, and from which the binary image enclosed by the 4 pixel points is output as data.
Namely, a locally binarizing unit 150 is composed of a pixel point threshold value calculating unit 151, a binary image searching unit 152, and a memory 153. The pixel point threshold value calculating unit 151 obtains the threshold value in a local area from the pixel values of the input gray-scale image for each of the pixels, and provides it to the binary image searching unit 152. The binary image searching unit 152 receives the threshold values from the pixel point threshold value calculating unit 151 and the original pixel values of the gray-scale image, and selects the 4 original pixels forming the regular square which is a minimum unit of the grid formed by the original pixel points. A binary image is then obtained by referencing the table stored in the memory 153 with the pixel values and the threshold values of the 4 original pixels. The table stored in the memory 153 is a table in which the binary image data is registered for each combination of the pixel values and the threshold values of 4 original pixels. The binary image searching unit 152 obtains the binary image data for all of the unit grids (the minimum regular squares forming the grid) structured by the original pixels of the input gray-scale image according to this table, generates the entire binary image by combining the data, and outputs the generated image.
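The table can be emulated, for illustration, by memoizing the binary patch computed from the four corner differences (Python sketch; keying on integer pixel-minus-threshold differences is our assumption, since the patent does not specify the table layout):

from functools import lru_cache

@lru_cache(maxsize=None)
def unit_square_patch(d00, d01, d10, d11, n=2):
    # Binary patch for one unit grid square, determined entirely by the
    # four integer corner differences (pixel value minus threshold value).
    patch = []
    for i in range(n + 1):
        row = []
        for j in range(n + 1):
            dy, dx = i / n, j / n
            d = (d00*(1 - dy)*(1 - dx) + d01*(1 - dy)*dx
                 + d10*dy*(1 - dx) + d11*dy*dx)
            row.append(0 if d >= 0 else 1)
        patch.append(tuple(row))
    return tuple(patch)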
Only the configurations of the variable resolution binarizing unit were explained above. However, as shown in FIG. 6, the rough extraction process for roughly extracting a drawing area can initially be performed on the gray-scale image, and the above described processing performed only for the extracted drawing area, thereby reducing the processing time.
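One plausible reading of this rough extraction step, as a sketch (Python with NumPy; the weight k and the bounding-box form of the extracted area are our assumptions — the text only states that a global threshold over the pixel values is used):

import numpy as np

def rough_drawing_area(gray, k=-0.5):
    g = gray.astype(np.float64)
    thresh = g.mean() + k * g.std()    # global linear combination
    ys, xs = np.nonzero(g < thresh)    # dark pixels as drawing candidates
    if ys.size == 0:
        return None                    # nothing darker than the threshold
    return ys.min(), ys.max(), xs.min(), xs.max()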
FIG. 17 shows processing examples up to the binarization of a color image, and FIG. 18 shows the processing of gray-scale images according to this preferred embodiment.
FIG. 17 shows the processing example of a 150-dpi color image.
The top image is a color document image with a resolution of 150 dpi. In the actual color image, various colors appear around the black color representing the characters. The middle image shows the top color image converted into a gray-scale image. The binary image on the lower right is the gray-scale image binarized with the conventional method; in this image, the detailed portions of the characters are defaced and the characters are difficult to recognize. By contrast, the image on the lower left is the binary image obtained by performing the processing according to this preferred embodiment. Since subpixels are generated according to this preferred embodiment, the resolution of the binary image is substantially higher than that of the original color image, and the amount of information increases. However, since the binary image data is used not for printing by a printer, etc., but for character recognition as it is, the clearer representation of the characters facilitates the character recognition.
FIG. 18 shows the processing example of 150- and 100-dpi gray-scale images.
If the 150- and 100-dpi gray-scale images are binarized by the conventional process, an erroneous character recognition result is often obtained due to the defacement of the characters. By contrast, according to this preferred embodiment, subpixels are generated and the information about the levels of brightness is prevented from being lost, so that the characters are rendered more clearly and a higher recognition rate can be realized when the character recognition process is performed.
FIG. 19 is a block diagram explaining the configuration of the hardware required for implementing this preferred embodiment as software.
In this preferred embodiment, the subpixel generation process, the interpolation process, the threshold value calculation process, etc. can be implemented as a program running on a computer. In this case, the hardware configuration of the computer requires a CPU 181 for performing the above described processes and a RAM 183 for storing the program for these processes in an executable form, interconnected by a bus 180 so that they can communicate with each other. Furthermore, a ROM 182 for storing the BIOS required for running the CPU 181 and a storage device 189 for storing the program are arranged. The storage device 189 is implemented, for example, as a hard disk. In addition, a storage medium reading device 187 is required when the program is stored on a portable storage medium 188 such as a floppy disk or a CD-ROM. The program read from the storage device 189 or the portable storage medium 188 is loaded into the RAM 183 so that the CPU 181 can execute it. Furthermore, an input/output device 186 composed of a monitor, a keyboard, a mouse, etc. is arranged in order to transmit to the CPU 181 the commands issued by the user operating the device, and to display the results of the processes performed by the CPU 181 for the user.
Furthermore, the program need not be stored on the computer used by the user; it may be downloaded on demand from a database maintained by a program provider 185. Alternatively, the program may be executed over a network by connecting the user and the program provider 185 via a LAN; in this case, the computer possessed by the user performs only command input and result display.
As described above in detail, according to the present invention, a color or gray-scale document image can be quickly binarized with high accuracy, thereby recognizing the image accurately and rapidly.

Claims (54)

What is claimed is:
1. A document image recognizing device, comprising:
image converting means for converting an input document image into a gray-scale image if the input document image is a color image, and for newly outputting a gray-scale image if the input image is a gray-scale image;
variable resolution binarizing means for converting the input document image into a binary image having a higher resolution according to a resolution of the gray-scale image; wherein said variable resolution binarizing means performs a sub-pixel generation process for increasing a number of pixels included in an image by interpolating pixel values of a gray-scale image, sets a local threshold value within a local area centering around a particular pixel, and obtains a binary image by using the local threshold value; and
recognizing means for recognizing the binarized image.
2. The device according to claim 1, wherein:
said recognizing means recognizes a converted or input binary image, and converts the binary image into electronic codes.
3. The device according to claim 1, further comprising:
drawing area roughly extracting means for roughly extracting a drawing area according to a global threshold value for pixel values of a gray-scale image, wherein
said recognizing means recognizes a binary image in an area extracted by said drawing area roughly extracting means.
4. The device according to claim 3, wherein:
the global threshold value uses a linear combination of an average pixel value, a standard deviation value, and a variance.
5. The device according to claim 3, wherein
said variable resolution binarizing means performs the subpixel generation process for an entire input gray-scale image; and
said drawing area roughly extracting means roughly extracts a drawing area from gray-scale image data for which the subpixel generation process is performed.
6. The device according to claim 1, wherein:
the subpixel generation process is performed by linearly interpolating original pixel values of the gray-scale image.
7. The device according to claim 1, wherein:
said variable resolution binarizing means obtains a binary image by binarizing pixel values by using the local threshold value obtained from a distribution of the pixel values within a local area including pixels; and
the local threshold value uses a linear combination of an average pixel value, a standard deviation value, and a variance.
8. The device according to claim 1, further comprising:
drawing area roughly extracting means for performing a global process which roughly extracts a drawing area according to a global threshold value for pixel values of a gray-scale image, wherein
said variable resolution binarizing means performs the subpixel generation process for increasing a number of pixels included in an image by interpolating pixel values of a gray-scale image, for a drawing area which is roughly extracted with the global process, and performs a local binarization process by using the local threshold value for each pixel included in the roughly extracted drawing area.
9. The device according to claim 1, wherein the local threshold value at a pixel point after the subpixel generation process is based on obtaining a value of a subpixel generated with the subpixel generation process from original pixel values, and obtaining a local threshold value from subpixel values.
10. The device according to claim 1, wherein a local threshold value after the subpixel generation process is obtained at a pixel point of a gray-scale image, and a local threshold value of a subpixel is obtained by interpolating local threshold values at pixel points of the gray-scale image.
11. The device according to claim 10, wherein the local threshold value at the pixel point of the gray-scale image is obtained by using a value of a subpixel obtained by performing the subpixel generation process for pixel points of the gray-scale image.
12. The device according to claim 10, wherein the local threshold value at the pixel point of the gray-scale image after the subpixel generation process is based on obtaining a subpixel value from original pixel values, and obtaining a local threshold value from subpixel values.
13. The device according to claim 12, wherein the local threshold value is obtained, using subpixel values, from a local threshold value within a local area that recognizes original pixels as its boundary.
14. The device according to claim 10, wherein an interpolation value at a subpixel point is obtained by interpolating a difference between a value at a pixel point of a gray-scale image and a local threshold value after the subpixel generation process, and a value of a binary image at the subpixel point is determined with a sign of the interpolation value.
15. The device according to claim 1, wherein a local threshold value at an original pixel point of a gray-scale image after the subpixel generation process is obtained from a linear combination of an average value, a standard deviation value, and a variance, by obtaining the standard deviation value and the variance after obtaining the average value and a square average value of values at pixel points of the gray-scale image, for which the subpixel generation process is performed, based on obtaining a subpixel value from original pixel values, and obtaining the average value and the square average value from subpixel values.
16. The device according to claim 1, wherein:
the local area recognizes original pixels as its boundary; and
a local threshold value at an original pixel point of a gray-scale image after the subpixel generation process is obtained from a linear combination of an average value, a standard deviation value, and a variance, by obtaining the standard deviation value and the variance after obtaining the average value and a square average value of values at pixel points of the gray-scale image, for which the subpixel generation process is performed, based on obtaining a subpixel value from original pixel values, and obtaining the average value and the square average value from subpixel values.
17. The device according to claim 1, further comprising:
specifying means for specifying a range of a local area, wherein
whether or not the local area recognizes original pixels as its boundary is determined by using as specification data a number of subpixels generated with the subpixel generation process and a size of the local area.
18. The device according to claim 1, wherein:
a local threshold value after the subpixel generation process is obtained at a pixel point of a gray-scale image;
a table to which pixel values and local threshold values at four pixel points of a gray-scale image are input and from which a binary image enclosed by the four pixel points is output as data, is included; and
a binary image is obtained by using said table.
19. A document image recognizing method, comprising:
converting an input document image into a gray-scale image if the input document image is a color image, and newly outputting a gray-scale image if the input image is a gray-scale image;
converting the input document image into a binary image having a higher resolution according to a resolution of the gray-scale image; wherein said converting the input document image into a binary image performs a sub-pixel generation process for increasing a number of pixels included in an image by interpolating pixel values of a gray-scale image, sets a local threshold value within a local area centering around a particular pixel, and obtains a binary image by using the local threshold value; and
recognizing the binary image.
20. The method according to claim 19, wherein:
said recognizing the binary image recognizes a converted or input binary image, and converts the binary image into electronic codes.
21. The method according to claim 20, further comprising:
roughly extracting a drawing area according to a global threshold value for pixel values of a gray-scale image, wherein
said recognizing the binary image recognizes a binary image in an area extracted by said roughly extracting a drawing area.
22. The method according to claim 21, wherein the global threshold value uses a linear combination of an average pixel value, a standard deviation value, and a variance.
23. The method according to claim 21, wherein:
said converting the input document image into a binary image performs the subpixel generation process for an entire input gray-scale image; and
said roughly extracting a drawing area roughly extracts a drawing area from gray-scale image data for which the subpixel generation process is performed.
24. The method according to claim 19, wherein:
the subpixel generation process is performed by linearly interpolating original pixel values of the gray-scale image.
25. The method according to claim 19, wherein:
said converting the input document image into a binary image obtains a binary image by binarizing pixel values by using the local threshold value obtained from a distribution of the pixel values within a local area including pixels; and
the local threshold value uses a linear combination of an average pixel value, a standard deviation value, and a variance.
26. The method according to claim 19, further comprising:
performing a global process which roughly extracts a drawing area according to a global threshold value for pixel values of a gray-scale image, wherein
said converting the input document image into a binary image performs the subpixel generation process for increasing a number of pixels included in an image by interpolating pixel values of a gray-scale image, for a drawing area which is roughly extracted with the global process, and performs a local binarization process by using the local threshold value for each pixel included in the roughly extracted drawing area.
27. The method according to claim 19, wherein the local threshold value at a pixel point after the subpixel generation process is based on obtaining a value of a subpixel generated with the subpixel generation process from original pixel values, and obtaining a local threshold value from subpixel values.
28. The method according to claim 19, wherein a local threshold value after the subpixel generation process is obtained at a pixel point of a gray-scale image, and a local threshold value of a subpixel is obtained by interpolating local threshold values at pixel points of the gray-scale image.
29. The method according to claim 28, wherein the local threshold value at the pixel point of the gray-scale image after the subpixel generation process is based on obtaining a subpixel value from original pixel values, and obtaining a local threshold value from subpixel values.
30. The method according to claim 29, wherein the local threshold value is obtained, using subpixel values, from a local threshold value within a local area that recognizes original pixels as its boundary.
31. The method according to claim 28, wherein an interpolation value at a subpixel point is obtained by interpolating a difference between a value at a pixel point of a gray-scale image and a local threshold value after the subpixel generation process, and a value of a binary image at the subpixel point is determined by a sign of the interpolation value.
32. The method according to claim 19, wherein the local threshold value at the pixel point of the gray-scale image is obtained by using a value of a subpixel obtained by performing the subpixel generation process for pixel points of the gray-scale image.
33. The method according to claim 19, wherein a local threshold value at an original pixel point of a gray-scale image after the subpixel generation process is obtained from a linear combination of an average value, a standard deviation value, and a variance, by obtaining the standard deviation value and the variance after obtaining the average value and a square average value of values at pixel points of the gray-scale image, for which the subpixel generation process is performed, based on obtaining a subpixel value from original pixel values, and obtaining the average value and the square average value from subpixel values.
34. The method according to claim 19, wherein:
the local area recognizes original pixels as its boundary; and
a local threshold value at an original pixel point of a gray-scale image after the subpixel generation process is obtained from a linear combination of an average value, a standard deviation value, and a variance, by obtaining the standard deviation value and the variance after obtaining the average value and a square average value of values at pixel points of the gray-scale image, for which the subpixel generation process is performed, based on obtaining a subpixel value from original pixel values, and obtaining the average value and the square average value from subpixel values.
35. The method according to claim 19, further comprising:
specifying a range of a local area, wherein whether or not the local area recognizes original pixels as its boundary is determined by using as specification data a number of subpixels generated with the subpixel generation process and a size of the local area.
36. The method according to claim 19, wherein:
a local threshold value after the subpixel generation process is obtained at a pixel point of a gray-scale image;
a table to which pixel values and local threshold values at four pixel points of a gray-scale image are input and from which a binary image enclosed by the four pixel points is output as data, is included; and
a binary image is obtained by using said table.
37. A computer-readable storage medium for directing a computer to execute a process comprising:
converting an input document image into a gray-scale image if the input document image is a color image, and newly outputting a gray-scale image if the input image is a gray-scale image;
converting the input document image into a binary image having a higher resolution according to a resolution of a gray-scale image; wherein said converting the input document image into a binary image performs a sub-pixel generation process for increasing a number of pixels included in an image by interpolating pixel values of a gray-scale image, sets a local threshold value within a local area centering around a particular pixel, and obtains a binary image by using the local threshold value; and
recognizing the binary image.
38. The storage medium according to claim 37, wherein:
said recognizing the binary image recognizes a converted or input binary image, and converts the binary image into electronic codes.
39. The storage medium according to claim 37, wherein the process further comprises:
roughly extracting a drawing area according to a global threshold value for pixel values of a gray-scale image, and wherein
said recognizing the binary image recognizes a binary image in an area extracted by said roughly extracting a drawing area.
40. The storage medium according to claim 39, wherein the global threshold value uses a linear combination of an average pixel value, a standard deviation value, and a variance.
41. The storage medium according to claim 39, wherein:
said converting the input document image into a binary image performs the subpixel generation process for an entire input gray-scale image; and
said roughly extracting a drawing area roughly extracts a drawing area from gray-scale image data for which the subpixel generation process is performed.
42. The storage medium according to claim 39, wherein the subpixel generation process is performed by linearly interpolating original pixel values of the gray-scale image.
43. The storage medium according to claim 37, wherein:
said converting the input document image into a binary image obtains a binary image by binarizing pixel values by using the local threshold value obtained from a distribution of the pixel values within a local area including pixels; and
the local threshold value uses a linear combination of an average pixel value, a standard deviation value, and a variance.
44. The storage medium according to claim 37, further comprising:
performing a global process which roughly extracts a drawing area according to a global threshold value for pixel values of a gray-scale image, and wherein
said converting the input document image into a binary image performs the subpixel generation process for increasing a number of pixels included in an image by interpolating pixel values of a gray-scale image, for a drawing area which is roughly extracted with the global process, and performs a local binarization process by using the local threshold value for each pixel included in the roughly extracted drawing area.
45. The storage medium according to claim 37, wherein the local threshold value at a pixel point after the subpixel generation process is based on obtaining a value of a subpixel generated with the subpixel generation process from original pixel values, and obtaining a local threshold value from subpixel values.
46. The storage medium according to claim 37, wherein a local threshold value after the subpixel generation process is obtained at a pixel point of a gray-scale image, and a local threshold value of a subpixel is obtained by interpolating local threshold values at pixel points of the gray-scale image.
47. The storage medium according to claim 46, wherein the local threshold value at the pixel point of the gray-scale image is obtained by using a value of a subpixel obtained by performing the subpixel generation process for pixel points of the gray-scale image.
48. The storage medium according to claim 46, wherein the local threshold value at the pixel point of the gray-scale image after the subpixel generation process is based on obtaining a subpixel value from original pixel values, and obtaining a local threshold value from subpixel values.
49. The storage medium according to claim 48, wherein the local threshold value is obtained, using subpixel values, from a local threshold value within a local area that recognizes original pixels as its boundary.
50. The storage medium according to claim 46, wherein an interpolation value at a subpixel point is obtained by interpolating a difference between a value at a pixel point of a gray-scale image and a local threshold value after the subpixel generation process, and a value of a binary image at the subpixel point is determined with a sign of the interpolation value.
51. The storage medium according to claim 37, wherein a local threshold value at an original pixel point of a gray-scale image after the subpixel generation process is obtained from a linear combination of an average value, a standard deviation value, and a variance, by obtaining the standard deviation value and the variance after obtaining the average value and a square average value of values at pixel points of the gray-scale image, for which the subpixel generation process is performed, based on obtaining a subpixel value from original pixel values, and obtaining the average value and the square average value from subpixel values.
52. The storage medium according to claim 37, wherein:
the local area recognizes original pixels as its boundary; and
a local threshold value at an original pixel point of a gray-scale image after the subpixel generation process is obtained from a linear combination of an average value, a standard deviation value, and a variance, by obtaining the standard deviation value and the variance after obtaining the average value and a square average value of values at pixel points of the gray-scale image, for which the subpixel generation process is performed, based on obtaining a subpixel value from original pixel values, and obtaining the average value and the square average value from subpixel values.
53. The storage medium according to claim 37, further comprising:
specifying a range of a local area, and wherein
whether or not the local area recognizes original pixels as its boundary is determined by using as specification data a number of subpixels generated with the subpixel generation process and a size of the local area.
54. The storage medium according to claim 37, wherein:
a local threshold value after the subpixel generation process is obtained at a pixel point of a gray-scale image;
a table to which pixel values and local threshold values at four pixel points of a gray-scale image are input and from which a binary image enclosed by the four pixel points is output as data, is included; and
a binary image is obtained by using said table.
US09/216,712 1998-05-27 1998-12-21 Device, method and storage medium for recognizing a document image Expired - Lifetime US6347156B1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP14532298A JP3345350B2 (en) 1998-05-27 1998-05-27 Document image recognition apparatus, method thereof, and recording medium
JP10-145322 1998-05-27

Publications (1)

Publication Number Publication Date
US6347156B1 true US6347156B1 (en) 2002-02-12

Family

ID=15382488

Family Applications (1)

Application Number Title Priority Date Filing Date
US09/216,712 Expired - Lifetime US6347156B1 (en) 1998-05-27 1998-12-21 Device, method and storage medium for recognizing a document image

Country Status (2)

Country Link
US (1) US6347156B1 (en)
JP (1) JP3345350B2 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4571758B2 (en) * 2000-04-03 2010-10-27 株式会社リコー Character recognition device, character recognition method, image processing device, image processing method, and computer-readable recording medium
JP4484183B2 (en) * 2000-06-13 2010-06-16 株式会社パスコ Forest information processing system
JP4756436B2 (en) * 2001-07-03 2011-08-24 日本電気株式会社 Pattern recognition apparatus, pattern recognition method, and pattern recognition program
CN101986711B (en) * 2010-11-24 2013-01-16 清华大学 1/8 pixel precision interpolation method and interpolation device
JP5867683B2 (en) * 2011-09-09 2016-02-24 富士ゼロックス株式会社 Image processing apparatus and image processing program
KR101595719B1 (en) * 2014-03-13 2016-02-19 (주)에이텍티앤 Apparatus for image preprocessing in identification recognizer

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH01243668A (en) 1988-03-24 1989-09-28 Toshiba Corp Picture processing device
JPH029268A (en) 1988-06-27 1990-01-12 Canon Inc Picture processor
US5128748A (en) * 1989-02-15 1992-07-07 Hitachi, Ltd. Image processing system and apparatus for processing color documents
US5274471A (en) * 1990-11-28 1993-12-28 Samsung Electronics Co., Ltd. Apparatus for converting resolution and gray scale of document image data
US5875268A (en) * 1993-09-27 1999-02-23 Canon Kabushiki Kaisha Image processing with low-resolution to high-resolution conversion
US5781658A (en) * 1994-04-07 1998-07-14 Lucent Technologies, Inc. Method of thresholding document images
US5809167A (en) * 1994-04-15 1998-09-15 Canon Kabushiki Kaisha Page segmentation and character recognition system
US5583659A (en) * 1994-11-10 1996-12-10 Eastman Kodak Company Multi-windowing technique for thresholding an image using local image properties
US5850466A (en) * 1995-02-22 1998-12-15 Cognex Corporation Golden template comparison for rotated and/or scaled images
US5742703A (en) * 1995-10-11 1998-04-21 Xerox Corporation Method and apparatus for the resolution enhancement of gray-scale images that include text and line art
US6055336A (en) * 1996-11-18 2000-04-25 Canon Kabushiki Kaisha Image processing system which converts multi-value image data into binary image data

Cited By (87)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7280688B2 (en) * 1998-12-09 2007-10-09 Fujitsu Limited Image processing apparatus and pattern extraction apparatus
US20040175031A1 (en) * 1998-12-09 2004-09-09 Fujitsu Limited Image processing apparatus and pattern extraction apparatus
US7221799B2 (en) * 1999-09-17 2007-05-22 Ricoh Company, Ltd. Image processing based on degree of white-background likeliness
US20060078220A1 (en) * 1999-09-17 2006-04-13 Hiromi Okubo Image processing based on degree of white-background likeliness
US20040042660A1 (en) * 1999-12-22 2004-03-04 Hitachi, Ltd. Sheet handling system
US7003157B2 (en) * 1999-12-22 2006-02-21 Hitachi, Ltd. Sheet handling system
US20020008715A1 (en) * 2000-02-03 2002-01-24 Noam Sorek Image resolution improvement using a color mosaic sensor
US6876763B2 (en) * 2000-02-03 2005-04-05 Alst Technical Excellence Center Image resolution improvement using a color mosaic sensor
US7123768B2 (en) * 2000-04-26 2006-10-17 Minolta Co., Ltd. Apparatus and method for detecting a pattern
US7263228B2 (en) 2000-04-26 2007-08-28 Minolta Co., Ltd. Apparatus and method for detecting a pattern
US20010036317A1 (en) * 2000-04-26 2001-11-01 Toshihiro Mori Apparatus and method for detecting a pattern
US20050089228A1 (en) * 2000-04-26 2005-04-28 Minolta Co., Ltd Apparatus and method for detecting a pattern
US20020016750A1 (en) * 2000-06-20 2002-02-07 Olivier Attia System and method for scan-based input, storage and retrieval of information over an interactive communication network
US6677941B2 (en) * 2000-08-05 2004-01-13 American Gnc Corporation Three-dimensional relative positioning and tracking using LDRI
US20020033818A1 (en) * 2000-08-05 2002-03-21 Ching-Fang Lin Three-dimensional relative positioning and tracking using LDRI
US6792142B1 (en) * 2001-01-16 2004-09-14 Micron Technology, Inc. Image sensing system with histogram modification
US7013044B2 (en) 2001-01-16 2006-03-14 Micron Technology, Inc. Image sensing system with histogram modification
US7206447B2 (en) 2001-01-16 2007-04-17 Micron Technology, Inc. Image sensing system with histogram modification
US20060093214A1 (en) * 2001-01-16 2006-05-04 Wang Yibing Michelle Image sensing system with histogram modification
US20040151373A1 (en) * 2001-01-16 2004-08-05 Wang Yibing (Michelle) Image sensing system with histogram modification
WO2002063868A3 (en) * 2001-02-05 2002-10-03 Hewlett Packard Co System and method for scaling and enhancing color text images
US20020136458A1 (en) * 2001-03-22 2002-09-26 Akio Nagasaka Method and apparatus for character string search in image
US7403657B2 (en) * 2001-03-22 2008-07-22 Hitachi, Ltd. Method and apparatus for character string search in image
US20030031366A1 (en) * 2001-07-31 2003-02-13 Yulin Li Image processing method and apparatus using self-adaptive binarization
US7062099B2 (en) * 2001-07-31 2006-06-13 Canon Kabushiki Kaisha Image processing method and apparatus using self-adaptive binarization
US7162098B1 (en) 2002-09-20 2007-01-09 Lockheed Martin Corporation System and method for increasing temporal and spatial capacity of systems that amplitude quantize data prior to processing
US7787705B2 (en) * 2002-12-26 2010-08-31 Fujitsu Limited Video text processing apparatus
US20050201619A1 (en) * 2002-12-26 2005-09-15 Fujitsu Limited Video text processing apparatus
US7929765B2 (en) 2002-12-26 2011-04-19 Fujitsu Limited Video text processing apparatus
US7156311B2 (en) 2003-07-16 2007-01-02 Scanbuy, Inc. System and method for decoding and analyzing barcodes using a mobile device
US7287696B2 (en) 2003-07-16 2007-10-30 Scanbuy, Inc. System and method for decoding and analyzing barcodes using a mobile device
US20050011957A1 (en) * 2003-07-16 2005-01-20 Olivier Attia System and method for decoding and analyzing barcodes using a mobile device
US20070063050A1 (en) * 2003-07-16 2007-03-22 Scanbuy, Inc. System and method for decoding and analyzing barcodes using a mobile device
US20050031220A1 (en) * 2003-08-08 2005-02-10 Hirobumi Nishida Method, apparatus, system, and program for image processing capable of recognizing, reproducing, and enhancing an image, and a medium storing the program
US7508987B2 (en) * 2003-08-08 2009-03-24 Ricoh Company, Ltd. Method, apparatus, system, and program for image processing capable of recognizing, reproducing, and enhancing an image, and a medium storing the program
US20060193530A1 (en) * 2003-08-11 2006-08-31 Scanbuy, Inc. Group average filter algorithm for digital image processing
US7242816B2 (en) * 2003-08-11 2007-07-10 Scanbuy, Inc. Group average filter algorithm for digital image processing
US7245780B2 (en) * 2003-08-11 2007-07-17 Scanbuy, Inc. Group average filter algorithm for digital image processing
US20050035206A1 (en) * 2003-08-11 2005-02-17 Olivier Attia Group average filter algorithm for digital image processing
US7760943B2 (en) * 2003-10-02 2010-07-20 Hewlett-Packard Development Company, L.P. Method to speed-up Retinex-type algorithms
US20050074163A1 (en) * 2003-10-02 2005-04-07 Doron Shaked Method to speed-up Retinex-type algorithms
US7168621B2 (en) 2003-12-04 2007-01-30 Scanbury, Inc. Section based algorithm for image enhancement
US20050121521A1 (en) * 2003-12-04 2005-06-09 Rashmi Ghai Section based algorithm for image enhancement
US20050125301A1 (en) * 2003-12-04 2005-06-09 Ashish Muni System and method for on the spot purchasing by scanning barcodes from screens with a mobile device
US7387250B2 (en) 2003-12-04 2008-06-17 Scanbuy, Inc. System and method for on the spot purchasing by scanning barcodes from screens with a mobile device
US20050207641A1 (en) * 2004-03-16 2005-09-22 Xerox Corporation Color to grayscale conversion method and apparatus
US7382915B2 (en) * 2004-03-16 2008-06-03 Xerox Corporation Color to grayscale conversion method and apparatus
US7760934B2 (en) * 2004-03-16 2010-07-20 Xerox Corporation Color to grayscale conversion method and apparatus utilizing a high pass filtered chrominance component
US20080181491A1 (en) * 2004-03-16 2008-07-31 Xerox Corporation Color to grayscale conversion method and apparatus
US20050242189A1 (en) * 2004-04-20 2005-11-03 Michael Rohs Visual code system for camera-equipped mobile devices and applications thereof
US7946492B2 (en) 2004-04-20 2011-05-24 Michael Rohs Methods, media, and mobile devices for providing information associated with a visual code
US7296747B2 (en) 2004-04-20 2007-11-20 Michael Rohs Visual code system for camera-equipped mobile devices and applications thereof
US20050246196A1 (en) * 2004-04-28 2005-11-03 Didier Frantz Real-time behavior monitoring system
US7309015B2 (en) 2004-07-14 2007-12-18 Scanbuy, Inc. Mobile device gateway providing access to instant information
US20060011728A1 (en) * 2004-07-14 2006-01-19 Didier Frantz Mobile device gateway providing access to instant information
US20080093460A1 (en) * 2004-07-14 2008-04-24 Scanbuy, Inc. Systems, methods, and media for providing and/or obtaining information associated with a barcode
EP1646221A2 (en) 2004-08-10 2006-04-12 Ricoh Company, Ltd. Image processing device, image processing method, image processing program and recording medium
CN100362530C (en) * 2004-08-10 2008-01-16 株式会社理光 Image processing device, image processing method, image processing program and recording medium
US7525694B2 (en) 2004-08-10 2009-04-28 Ricoh Company, Ltd. Image processing device, image processing method, image processing program, and recording medium
US20080089565A1 (en) * 2004-10-15 2008-04-17 Chui Kui M Pattern Matching
US20060147113A1 (en) * 2004-12-28 2006-07-06 Han Seung-Hoon Apparatus to detect homogeneous region of image using adaptive threshold
US7970208B2 (en) * 2004-12-28 2011-06-28 Samsung Electronics Co., Ltd. Apparatus to detect homogeneous region of image using adaptive threshold
US20070153024A1 (en) * 2005-12-29 2007-07-05 Samsung Electronics Co., Ltd. Multi-mode pixelated displays
US8016187B2 (en) 2006-02-21 2011-09-13 Scanbury, Inc. Mobile payment system using barcode capture
US20070194123A1 (en) * 2006-02-21 2007-08-23 Didler Frantz Mobile payment system using barcode capture
US8150163B2 (en) 2006-04-12 2012-04-03 Scanbuy, Inc. System and method for recovering image detail from multiple image frames in real-time
US20070242883A1 (en) * 2006-04-12 2007-10-18 Hannes Martin Kruppa System And Method For Recovering Image Detail From Multiple Image Frames In Real-Time
WO2009021996A3 (en) * 2007-08-15 2009-06-18 Iris Sa Method for fast up-scaling of color images and method for interpretation of digitally acquired documents
WO2009021996A2 (en) * 2007-08-15 2009-02-19 I.R.I.S. S.A. Method for fast up-scaling of color images and method for interpretation of digitally acquired documents
US20110206281A1 (en) * 2007-08-15 2011-08-25 I. R. I. S. Method for fast up-scaling of color images and method for interpretation of digitally acquired documents
US8411940B2 (en) 2007-08-15 2013-04-02 I.R.I.S. Method for fast up-scaling of color images and method for interpretation of digitally acquired documents
US8165400B2 (en) * 2007-09-25 2012-04-24 Kabushiki Kaisha Toshiba Image data processing system and image data processing method for generating arrangement pattern representing arrangement of representative value in pixel block including pixel in image
US20090080776A1 (en) * 2007-09-25 2009-03-26 Kabushiki Kaisha Toshiba Image data processing system and image data processing method
US20090201541A1 (en) * 2008-01-10 2009-08-13 Copanion, Inc. System for optimal document scanning
WO2009089451A1 (en) * 2008-01-10 2009-07-16 Copanion, Inc. System for optimal document scanning
CN101826159B (en) * 2009-03-07 2013-01-09 鸿富锦精密工业(深圳)有限公司 Method for realizing partitioned binarization of gray scale image and data processing equipment
US20160275378A1 (en) * 2015-03-20 2016-09-22 Pfu Limited Date identification apparatus
US9594985B2 (en) * 2015-03-20 2017-03-14 Pfu Limited Date identification apparatus
US20170286797A1 (en) * 2016-03-29 2017-10-05 Brother Kogyo Kabushiki Kaisha Image processing apparatus and image processing method
US10621459B2 (en) * 2016-03-29 2020-04-14 Brother Kogyo Kabushiki Kaisha Image processing apparatus and method for binarization of image data according to adjusted histogram threshold index values
CN105913032A (en) * 2016-04-15 2016-08-31 天地(常州)自动化股份有限公司 Detection method and system for working state of mining belt
CN106341926A (en) * 2016-10-08 2017-01-18 南华大学 LED digital driving power source for controlling lighting based on image recognition and control method of LED digital driving power source
CN108572520A (en) * 2017-03-10 2018-09-25 株式会社东芝 Image forming apparatus and image forming method
CN108572520B (en) * 2017-03-10 2022-09-20 株式会社东芝 Image forming apparatus and image forming method
US20190279022A1 (en) * 2018-03-08 2019-09-12 Chunghwa Picture Tubes, Ltd. Object recognition method and device thereof
CN111369923A (en) * 2020-02-26 2020-07-03 歌尔股份有限公司 Display screen abnormal point detection method, detection device and readable storage medium
CN111369923B (en) * 2020-02-26 2023-09-29 歌尔光学科技有限公司 Display screen outlier detection method, detection apparatus, and readable storage medium

Also Published As

Publication number Publication date
JPH11338976A (en) 1999-12-10
JP3345350B2 (en) 2002-11-18

Similar Documents

Publication Publication Date Title
US6347156B1 (en) Device, method and storage medium for recognizing a document image
EP1999688B1 (en) Converting digital images containing text to token-based files for rendering
US5563403A (en) Method and apparatus for detection of a skew angle of a document image using a regression coefficient
US7016552B2 (en) Image processing device, image processing method, and recording medium storing image processing program
US8200012B2 (en) Image determination apparatus, image search apparatus and computer readable recording medium storing an image search program
EP1173003B1 (en) Image processing method and image processing apparatus
JP3950777B2 (en) Image processing method, image processing apparatus, and image processing program
US20100073735A1 (en) Camera-based document imaging
JP6743092B2 (en) Image processing apparatus, image processing control method, and program
JP4077094B2 (en) Color document image recognition device
US6055336A (en) Image processing system which converts multi-value image data into binary image data
JP2009302758A (en) Image processing device, image conversion method, and computer program
GB2366108A (en) Vectorization of raster images
US8081188B2 (en) Image delivering apparatus and image delivery method
US5778105A (en) Method of and apparatus for removing artifacts from a reproduction
JP2004199622A (en) Apparatus and method for image processing, recording media, and program
JP2002199179A (en) Inclination detector
JP2845107B2 (en) Image processing device
JPH08123901A (en) Character extraction device and character recognition device using this device
US20020164087A1 (en) System and method for fast rotation of binary images using block matching method
EP0655703A2 (en) Method for scanning small fonts in an optical character recognition system
JPH08315155A (en) Graphic preprocessing unit
JPH02166583A (en) Character recognizing device
JPH0535914A (en) Picture inclination detection method
JPH05282489A (en) Method for deciding attribute of document image

Legal Events

Date Code Title Description
AS Assignment

Owner name: FUJITSU LIMITED, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KAMADA, HIROSHI;FUJIMOTO, KATSUHITO;REEL/FRAME:009683/0758

Effective date: 19981120

STCF Information on status: patent grant

Free format text: PATENTED CASE

CC Certificate of correction
FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FPAY Fee payment

Year of fee payment: 12