Development of an Image Text Recognition App Based on Google Tesseract (Graduation Thesis)

2021-11-10 11:11

Total thesis length: 22,759 characters

Abstract (translated from Chinese)

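Since the start of the 21st century, OCR technology has developed rapidly. With the rise of the internet, large internet companies have one after another built their own OCR systems, yet development of the technology on mobile platforms has only recently begun. Bringing OCR to mobile devices is a considerable challenge, but it greatly broadens the applications of OCR and makes everyday life more convenient, so it is worth studying. Drawing on established techniques and following software development methods, this thesis uses Tesseract and Android to implement an image text recognition app.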

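First, to master the technologies involved, we studied the relevant background and gained a working grasp of system development. The system is developed in Android Studio, uses Tesseract-OCR as its main framework, and relies on JNI and the NDK for image processing. Testing was carried out on several Android emulators. Functionally, users can manipulate images and run OCR, and the system recognizes most Chinese and English characters fairly accurately. Image processing includes cropping, grayscale conversion, enhancement, sharpening, black-and-white conversion, and rotation. The OCR module can also recognize and translate text in different languages. In addition, MD5 hashing is used to protect data. A JSON-based format is designed for passing parameters, and online translation is implemented by calling the Baidu API. Finally, the system's functions are verified with purpose-built test cases and standard software testing methods.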

Keywords: Android; Tesseract-OCR framework; JNI and NDK; OCR recognition system

Abstract

OCR technology has developed rapidly since the beginning of the 21st century. With the rise of the internet, many large internet companies have developed their own OCR systems. On mobile platforms, however, OCR development has only just begun. Building OCR for mobile devices is a considerable challenge, but it can greatly expand the applications of the technology and make everyday life more convenient, so it is worth studying. Following established software development methods, this thesis uses Tesseract and Android to implement an image text recognition app.

To master the technologies involved, we first studied the relevant background and gained a working grasp of system development. The system uses Android Studio as the development environment and Tesseract-OCR as the main framework, and combines JNI and the NDK for image processing. Several Android emulators were used for testing. In terms of functionality, users can manipulate images and run OCR, and the system recognizes most Chinese and English characters fairly accurately. Image processing includes cropping, grayscale conversion, enhancement, sharpening, black-and-white conversion, and rotation. The OCR module can also recognize and translate text in different languages. In addition, MD5 hashing is used to protect data. A parameter-passing format is designed according to the JSON specification, and online translation is implemented by calling the Baidu API. Finally, the system's functions are tested with the designed test cases and standard software testing methods.

Keywords: Android; Tesseract-OCR framework; JNI and NDK; OCR system

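Contents

Abstract (Chinese)

Abstract (English)

Chapter 1 Introduction

1.1 Research Background

1.1.1 Overview

1.1.2 Research Purpose and Significance

1.2 Research Status at Home and Abroad

1.3 Main Research Content

Chapter 2 Analysis of Key Technologies

2.1 Tesseract-OCR Framework

2.2 Android Development Platform

2.3 JNI and NDK

Chapter 3 System Analysis and Design

3.1 Requirements Analysis

3.1.1 Functional Requirements

3.1.2 Non-functional Requirements

3.2 Feasibility Analysis

3.3 Overall System Design

3.3.1 Functional Structure Design

3.3.2 Functional Module Design

Chapter 4 System Implementation and Testing

4.1 Development Environment

4.2 Implementation of System Functions

4.2.1 Project Structure

4.2.2 Image Preprocessing

4.2.2.1 Grayscale Conversion

4.2.2.2 Binarization of Grayscale Images

4.2.2.3 Skew Correction

4.2.3 Generating traineddata with jTessBoxEditor

4.2.4 Feature Overview and Implementation

4.3 System Testing

4.3.1 Black-box Testing

4.3.2 White-box Testing

4.3.3 Test Content

Chapter 5 Summary and Future Work

5.1 Summary

5.2 Future Work

References

Acknowledgements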

Chapter 1 Introduction

1.1 Research Background

1.1.1 Overview

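Image text recognition is generally known abroad as OCR (optical character recognition). The technology reached China only a little over thirty years ago, yet domestic research on OCR systems already has a long history, beginning with early products such as Hanvon (汉王) OCR. Over the past decade, the spread of mobile phones drove a boom in app development, which in turn set off a new wave of OCR work in China. Today most major internet companies have developed their own OCR systems, for example Baidu OCR, Alibaba OCR, Tencent OCR, and Youdao Cloud OCR. These are essentially commercial products; of the engines considered here, only Tesseract is truly free and open source. Because most of this software is closed source, we can neither integrate it into our own programs nor modify it. Even so, the field has strong research value for both academic study and practical application. How to improve the recognition rate for Chinese, and how to bring newer techniques into traditional programs, are questions of real practical significance. It was against this background that I developed the image text recognition system described in this thesis.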

1.1.2 Research Purpose and Significance

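With the growth and spread of smartphones, mobile learning has become the norm, and combining smartphones with image text recognition software has become a new application need. Recognizing text in images on a mobile device avoids the complicated operating steps and poor portability of traditional large scanners, which makes mobile study and office work more convenient. Compared with traditional OCR systems, development on mobile platforms is a considerable challenge, but it greatly broadens the applications of OCR and brings convenience to everyday life. In addition, most OCR systems in China are commercial, which hinders wide adoption. Against this background, an image text recognition system built on Tesseract yields an app that meets people's basic needs and is far more convenient to use than a traditional large scanner.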
