Impact of Character Models Choice on Arabic Text Recognition Performance - HES SO Valais Publications

Publication type:	Article in conference proceedings
Citation:	fouad10:icfhr
Book title:	12th International Conference on Frontiers in Handwriting Recognition (ICFHR 2010)
Year:	2010
Month:	November
Location:	Kolkata (India)
URL:	http://www.hennebert.org/downl...
Abstract:	In this paper we analyze the impact of the choice of sub-models on automatic recognition of printed Arabic text based on Hidden Markov Models (HMMs). In our approach, sub-models correspond to character shapes, which are assembled to compose word models. One peculiarity of Arabic writing is that a character takes various shapes according to its position in the word: with 28 basic characters, there are over 120 different shapes. Ideally, there should be one sub-model for each shape. However, some shapes are less frequent than others and, because training databases are finite, the learning process yields less reliable models for the infrequent shapes. We show that an optimal set of models must therefore be found by trading off between having more models, which capture the intricacies of individual shapes, and grouping the models of similar shapes together. We propose several sets of sub-models and evaluate them on the Arabic Printed Text Image (APTI) database, which is freely available to the scientific community.
Keywords:	Arabic, HMM, machine learning, OCR
Authors:	Slimane, Fouad; Ingold, Rolf; Kanoun, Slim; Alimi, Adel; Hennebert, Jean
Topics
Institute of Informatics (II) 0/1381
