Towards Identification Of Very Low Resolution, Anti-Aliased Characters
Type of publication: | Inproceedings |
Citation: | eins07:isspa |
Booktitle: | IEEE International Symposium on Signal Processing and its Applications (ISSPA'07), Sharjah, United Arab Emirates |
Year: | 2007 |
Abstract: | Current Web indexing technologies suffer from a severe drawback: web documents often present textual information that is encapsulated in digital images and is therefore not available as actual coded text. Moreover, such images are not suited to processing by existing OCR software, which is generally designed to recognize binary document images produced by scanners at resolutions between 200 and 600 dpi, whereas text embedded in web images is often anti-aliased and generally has a resolution between 72 and 90 dpi. This paper describes two preliminary studies of character identification at very low resolution (72 dpi) and small font sizes (3-12 pt). The proposed character identification system delivers identification rates of up to 99.93 percent for 12,600 isolated character samples and up to 99.89 percent for 300,000 character samples in context. |
Keywords: | image analysis, OCR, pattern recognition |