Optical Character Recognition (OCR) System
Abstract: The Futan 789 (复旦789型) optical character recognition system was developed to meet a specialized need: reading machine-printed English letters and special symbols directly from original documents. The design target is a substitution (misrecognition) rate and a rejection rate of 10^-4 each at a reading speed of 100 characters per second, where the substitution rate also counts errors in distinguishing red from black print. Because the documents use a general-purpose lowercase English typeface and are produced by continuous automatic printing, print quality varies considerably and the characters are hard to separate reliably, which makes recognition difficult. This paper describes the development of the system and the selection of its recognition scheme, and on that basis proposes a two-stage recognition scheme that combines hardware and software. Experimental results and directions for improvement are also given.