The document summarizes the status of the Khmer OCR project. It discusses optical character recognition (OCR) in general and gives an overview of the Khmer OCR system being developed, which consists of pre-processing, segmentation, recognition, and post-processing steps. It also reviews the current status of the project, noting which components are complete and which still need development, such as skew detection and correction and automatic post-processing. Potential future work includes handling joined characters and low-quality images.
4. Khmer OCR Project
• 2011
• Team
– Dr. SENG Sopheap, ITC
– Mr. LONG Seangmeng, ITC 5th year
– Mr. EN Sovann (doing master)
– Ms. PRUM Sophea (doing PhD)
– Mr. HAO Jeudi (year)
• Develop a Khmer OCR system
– Font independent
– Size independent
5. State of the Art
Author | Limitation | Result
CHEY Chanoeurn, KOSIN Chamnongthai and PINIT Kumhom | 10 characters (បបបបប បបបប ប) | 92%
CHEY Chanoeurn, KOSIN Chamnongthai and PINIT Kumhom | 20 fonts | 92.85% (size 22), 91.66% (size 18), 89.27% (size 12)
ING Leng Ieng and MUAZ Ahmed | Limon R1 22 | 98.88%
KRUY Vanna | Font and size independent (manual preparation for new fonts) | 97%
EN Sovann | Font and size independent (manual preparation for new fonts) | 96%
6. Khmer OCR System
Text Image → Pre-processing → Segmentation → Recognition → Post-processing → Editable Text
7. Khmer OCR System (cont.)
• Pre-processing
– Binarization
– Noise removal
– Skew detection and correction
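The first two pre-processing steps can be illustrated with a minimal sketch. The slides do not say which binarization or noise-removal method the project uses, so everything here is an assumption: Otsu's global threshold stands in for binarization, deleting isolated ink pixels stands in for noise removal, and the function names are hypothetical. Skew detection and correction are left out.

```python
def otsu_threshold(gray):
    """Return Otsu's threshold for a 2-D list of gray levels in 0..255."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of gray levels at or below the candidate threshold
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(gray, threshold=None):
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0."""
    if threshold is None:
        threshold = otsu_threshold(gray)
    return [[1 if value <= threshold else 0 for value in row] for row in gray]


def remove_isolated_pixels(binary):
    """Clear ink pixels that have no ink among their 8 neighbours."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            has_neighbour = any(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                out[y][x] = 0
    return out
```

A global threshold is the simplest choice and tends to work on clean, evenly lit scans. Unevenly lit pages would more likely need a local (adaptive) threshold.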
8. Khmer OCR System (cont.)
• Segmentation
– Page → Line (Line 1, Line 2) → Vertical Symbol → Blob
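The page-to-blob hierarchy can be sketched as follows. This is one common way to do it, not necessarily the project's: projection profiles cut the page into lines and each line into vertical slices, and 8-connected component labelling finds the blobs inside a slice. All function names are hypothetical, and the input is assumed to be a binary image with 1 for ink.

```python
def ink_runs(profile):
    """Return (start, end) pairs for each run of non-zero profile entries."""
    runs, start = [], None
    for i, count in enumerate(profile):
        if count and start is None:
            start = i
        elif not count and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def split_lines(page):
    """Cut a binary page into text lines at blank rows."""
    row_profile = [sum(row) for row in page]
    return [page[top:bottom] for top, bottom in ink_runs(row_profile)]


def split_columns(line):
    """Cut a text line into vertical slices at blank columns."""
    col_profile = [sum(col) for col in zip(*line)]
    return [[row[left:right] for row in line]
            for left, right in ink_runs(col_profile)]


def find_blobs(region):
    """Return the 8-connected ink components as sorted lists of (y, x)."""
    h, w = len(region), len(region[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if region[y][x] != 1 or seen[y][x]:
                continue
            stack, blob = [(y, x)], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                blob.append((cy, cx))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if region[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            blobs.append(sorted(blob))
    return blobs
```

A vertical slice can hold more than one blob, because Khmer stacks marks above and below a base character. That is why the blob search runs inside each slice instead of treating the slice as a single symbol.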
9. Khmer OCR System (cont.)
• Recognition
– Training images (sample images), each with a label
– Blob to be recognized → search for the closest match among the training images
– Output: the closest match's image and label (Label: ស)
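The closest-match search amounts to a nearest-neighbour lookup over the labelled training images. The sketch below is an assumption about its shape, not the project's code: `closest_match` and `pixel_difference` are hypothetical names, and the pixel-count distance is only a placeholder for whatever feature distance the system uses.

```python
def pixel_difference(a, b):
    """Placeholder distance: count positions where two equal-size binary images differ."""
    return sum(pa != pb for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def closest_match(blob, training_set, distance=pixel_difference):
    """Return (label, distance) for the training sample nearest to blob.

    training_set is a list of (image, label) pairs.
    """
    if not training_set:
        raise ValueError("training set is empty")
    best_label, best_dist = None, None
    for image, label in training_set:
        d = distance(blob, image)
        if best_dist is None or d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


# Tiny illustrative training set; the images and labels are made up.
training_set = [
    ([[1, 1], [1, 0]], "ក"),
    ([[0, 1], [0, 1]], "ខ"),
]
label, dist = closest_match([[1, 1], [0, 0]], training_set)
```

Passing the distance in as a parameter keeps the search independent of how blobs are represented, so a different feature representation can be swapped in without changing the lookup.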
10. Khmer OCR System (cont.)
• Recognition (cont.)
– How to find closest match?
– How to represent the blob image?
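• Fourier transform: Any function f(t) with period T can be written as a Fourier series, for example in complex form:
$$f(t) = \sum_{n=-\infty}^{\infty} c_n \, e^{i 2\pi n t / T}, \qquad c_n = \frac{1}{T} \int_0^T f(t)\, e^{-i 2\pi n t / T}\, dt$$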
Blob image => 2-D Fourier transform
The blob image (B) represented by Fourier coefficients:
B[0], B[1], B[2], …
City block distance between two blobs B and B’:
Distance = |B[0] – B’[0]| + |B[1] – B’[1]| + |B[2] – B’[2]| + …
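The feature representation and distance above can be sketched in plain Python. Several details are assumptions, because the slides do not give them: only the lowest `size` × `size` coefficients are kept, their magnitudes (not the complex values) are used, and they are divided by the image area. The function names are hypothetical.

```python
import cmath


def dft2_coefficients(image, size=2):
    """Return magnitudes of the lowest size x size 2-D DFT coefficients.

    Magnitudes are divided by the pixel count (an assumption made here
    so that images of different sizes give comparable values).
    """
    h, w = len(image), len(image[0])
    coeffs = []
    for u in range(size):
        for v in range(size):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2 * cmath.pi * (u * y / h + v * x / w)
                    total += image[y][x] * cmath.exp(1j * angle)
            coeffs.append(abs(total) / (h * w))
    return coeffs


def city_block(b1, b2):
    """City block (L1) distance between two coefficient lists."""
    return sum(abs(p - q) for p, q in zip(b1, b2))


a = [[1, 0], [0, 0]]
shifted = [[0, 1], [0, 0]]  # same single pixel, moved one column
full = [[1, 1], [1, 1]]
d_shift = city_block(dft2_coefficients(a), dft2_coefficients(shifted))
d_full = city_block(dft2_coefficients(a), dft2_coefficients(full))
```

Using magnitudes makes the features insensitive to circular shifts of the blob, so `d_shift` comes out at essentially zero while `d_full` does not. The direct quadruple loop is only suitable for small blobs; a real system would use an FFT.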
12. Project status
• Pre-processing
– Binarization and noise removal √
– Skew detection and correction X
• Segmentation √
• Recognition
– Feature extraction √
– Automatic generation of training data for new fonts √
• Post-processing
– Assembling and reordering rules
• Manual √
• Automatic X
– Spell checking X
• Performance evaluation X
13. Perspectives
• Joined characters
• Text layout
• Low-quality text images
• Curved text lines