Applications of Deep Learning for Handwritten Chinese Character Recognition: A Review
-
Abstract: Handwritten Chinese character recognition (HCCR) is an important research field of pattern recognition, which has attracted extensive study during the past decades. With the emergence of deep learning, breakthrough progress in HCCR has been made in recent years, in both methods and performance. In this paper, we review the applications of deep learning models in the field of HCCR. First, the research background and current state-of-the-art HCCR technologies are introduced. Then, we provide a brief overview of several typical deep learning models and introduce some widely used open-source tools for deep learning. On this basis, the approaches to online and offline HCCR based on deep learning are surveyed in detail, with summaries of the related methods, their principles, technical details, and performance. Finally, we conclude with a discussion of the open problems that remain in HCCR and of further research directions.
-
1. Introduction
In order to test a software system completely, we should perform functional testing for all combinations of system element values. If a software system under test (SUT) has $k$ elements and each element has $v_i$ $(i=1, 2, \ldots, k)$ different values, then the number of test cases needed to cover all element-value combinations of this SUT is $\prod_{i=1}^k v_i$. For example, Table Ⅰ shows a SUT with 3 test parameters, and the number of test cases needed to test this SUT completely is $3^3=27$. Since $\prod_{i=1}^k v_i$ can be a huge number, the cost of exhaustively testing such a SUT is likely to be tremendous. So, it is infeasible to test a SUT with a large input space completely.
Table Ⅰ: A SUT with three test parameters

| Value | $A$ | $B$ | $C$ |
|-------|-----|-----|-----|
| 0 | a0 | b0 | c0 |
| 1 | a1 | b1 | c1 |
| 2 | a2 | b2 | c2 |

Fortunately, some studies show that about 70% of software bugs are caused by interactions between two elements, while interactions among three elements account for about 90% of software bugs [1]. That means we can generate a smaller test case set, which covers only the interactions among a few system elements, such as pairwise or triple combinations, and still capture most of the bugs in a SUT [2], [3]. Therefore, combinatorial testing can be an effective technique for discovering interaction faults in a SUT; its key task is to generate a test suite that is as small as possible while covering all given system element combinations.
In this study, we propose a novel test suite generation and global optimization approach. In the first step, the test suite generation problem has been transformed into a binary sequence optimization problem by a binary encoding mechanism. Based on it, a novel evolutionary algorithm, cluster searching algorithm (CSA), is presented to optimize the binary sequence in solution space so as to construct the optimal test suite. In Section 2, we analyze the related research works. Section 3 introduces the problem transformation. Section 4 shows the CSA. Section 5 provides the simulations. Section 6 gives the conclusion.
2. Related Work
Generally, a test case subset that satisfies all covering requirements is known as a representative set. Assuming that the cost of generating and managing each test case is the same, a representative set with a minimum number of test cases is desirable and is called the optimal test suite. The optimal combinatorial test suite generation problem is defined as: given a SUT and a set of combinatorial covering criteria, find a minimal representative set from the complete test case set. For example, Table Ⅱ is an optimal pairwise representative set, which covers all combinations between any two system elements of the SUT in Table Ⅰ. In general, this problem can be expressed as a set-covering problem [4], and it is also an NP-complete problem [5].
Table Ⅱ: An optimal test suite covering all pairwise combinations

| Number | $A$ | $B$ | $C$ |
|--------|-----|-----|-----|
| 1 | a0 | b0 | c0 |
| 2 | a0 | b2 | c2 |
| 3 | a0 | b1 | c1 |
| 4 | a1 | b0 | c2 |
| 5 | a1 | b1 | c0 |
| 6 | a1 | b2 | c1 |
| 7 | a2 | b0 | c1 |
| 8 | a2 | b2 | c0 |
| 9 | a2 | b1 | c2 |

In traditional studies, mathematical methods such as orthogonal arrays [6] have been used to construct the representative set. However, the orthogonal array approach has unsolved problems; for example, a suitable orthogonal array cannot be generated for every kind of SUT. Another mathematical method is based on a matrix recursive construction process [7]. For some special instances this method gives excellent results, but it cannot be applied to all kinds of SUT either. Moreover, theoretical research on combinatorial testing is still unable to give an explicit optimal covering number for most problems; the major conclusions of such studies only establish logical relations between the optimal covering numbers of different problems [8].
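To make the covering requirement concrete, the following small C++ check (our illustration, not code from the paper) verifies that the 9 test cases of Table Ⅱ cover all $v^tC_k^t = 3^2\cdot3 = 27$ pairwise combinations of the SUT in Table Ⅰ:

```cpp
// Verify that the suite of Table II covers every pairwise combination
// of the 3-parameter, 3-value SUT of Table I (values encoded as 0..2).
#include <iostream>
#include <set>
#include <vector>
using namespace std;

int main() {
    vector<vector<int>> suite = {            // Table II, a0 -> 0, b1 -> 1, ...
        {0,0,0},{0,2,2},{0,1,1},{1,0,2},{1,1,0},
        {1,2,1},{2,0,1},{2,2,0},{2,1,2}};
    int k = 3;
    set<vector<int>> covered;    // (param p, param q, value of p, value of q)
    for (auto& t : suite)
        for (int p = 0; p < k; ++p)
            for (int q = p + 1; q < k; ++q)
                covered.insert({p, q, t[p], t[q]});
    cout << "covered " << covered.size() << " of 27 pairs\n";  // prints 27
}
```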
In recent years, the evolutionary algorithm (EA) has developed into a powerful global optimization tool for many complex engineering optimization problems [9]$-$[12]. In combinatorial testing research, the EA is usually coupled with a one-test-at-a-time mechanism [13]. In such studies, an EA searches the complete test case set for the best test case $t_i$, i.e., the one covering the most combinations in the uncovered combination set (CS), in each computation. Then $t_i$ joins the test suite (TS) and the combinations covered by $t_i$ are deleted from CS. These operations repeat until all combinations in CS have been covered. Based on this iterative construction process, the one-test-at-a-time mechanism can solve most large-size SUTs very well [14], [15]. However, for a small-size SUT, it does not show the desired performance. Firstly, it is likely to generate an approximate representative set. Secondly, it always takes a much longer time and more matrix transformations to generate the whole representative set. These characteristics make the one-test-at-a-time mechanism unsuitable for small-size SUTs. For example, if we use this mechanism to generate the TS for the SUT in Table Ⅰ, the test cases a0b0c0, a1b1c1 and a2b2c2 are likely to be selected first from the complete set to join TS one by one, because each of them covers 3 uncovered pairwise combinations in CS, which is optimal for the test case selection. However, if this happens, then no matter which of the remaining 24 cases is chosen to join TS in the next computation cycle, the generated representative set will be an approximate solution, because at least one of its pairwise combinations repeats one of the 9 pairwise combinations already covered by the above 3 test cases.
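The following C++ sketch (ours, not the paper's GA/OTAT implementation, and using exhaustive scanning instead of an EA) illustrates the plain greedy one-test-at-a-time mechanism on the SUT of Table Ⅰ:

```cpp
// Greedy one-test-at-a-time construction: repeatedly add the test case
// that covers the most still-uncovered pairwise combinations.
#include <iostream>
#include <set>
#include <vector>
using namespace std;

// Decode test-case id j into its k base-v digits (cf. Section 3.2).
vector<int> decode(long j, int k, int v) {
    vector<int> t(k);
    for (int m = 0; m < k; ++m) { t[m] = j % v; j /= v; }
    return t;
}

set<vector<int>> pairsOf(const vector<int>& t) {
    set<vector<int>> ps;
    for (int p = 0; p < (int)t.size(); ++p)
        for (int q = p + 1; q < (int)t.size(); ++q)
            ps.insert({p, q, t[p], t[q]});
    return ps;
}

int main() {
    int k = 3, v = 3;
    long total = 1;
    for (int i = 0; i < k; ++i) total *= v;  // |Phi| = v^k = 27
    set<vector<int>> CS;                     // uncovered strength-2 combinations
    for (long j = 0; j < total; ++j)
        for (auto& p : pairsOf(decode(j, k, v))) CS.insert(p);
    vector<vector<int>> TS;
    while (!CS.empty()) {
        long best = 0; size_t bestGain = 0;
        for (long j = 0; j < total; ++j) {   // scan the complete set
            size_t gain = 0;
            for (auto& p : pairsOf(decode(j, k, v))) gain += CS.count(p);
            if (gain > bestGain) { bestGain = gain; best = j; }
        }
        for (auto& p : pairsOf(decode(best, k, v))) CS.erase(p);
        TS.push_back(decode(best, k, v));    // one test at a time
    }
    cout << "greedy suite size: " << TS.size() << "\n";
}
```

Depending on how ties between equally good candidates are broken, such a greedy run may end up with the kind of approximate suite described above, for instance when the three "diagonal" cases are preferred first.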
Recently, efforts have focused on meta-heuristic algorithms as part of the computational approach for test suite generation. Generally, meta-heuristic algorithms start with a random set of test cases, for instance constructed by a simple one-test-at-a-time greedy procedure [16]. The initial test case set then undergoes a series of transformations in an attempt to improve itself. Each transformation can be an independent meta-heuristic algorithm, such as teaching-learning-based optimization, the global neighborhood algorithm, particle swarm optimization, or the cuckoo search algorithm [17]. In this process, one best candidate is selected at each iteration until all the required interactions are covered. Such studies also show good performance on large-size constrained CA problems. However, these algorithms always combine several search algorithms and must balance them during the iterative computation, so these approaches often have a relatively heavy and complicated structure and are more suitable for large-size CA problems.
By using the one-test-at-a-time mechanism or meta-heuristic algorithms, it is possible to perform large combinatorial testing that was not possible before. But for small-size CA problems, such approaches still imply a high cost and often yield an approximate result. In practical applications, a considerable portion of software testing work consists of small-size problems. So, finding a simple method that improves the probability of generating the optimal test suite for small-size combinatorial testing is a worthy goal. In order to optimize the test suite more effectively, some researchers have translated the small-size CA problem into another kind of problem, such as the satisfiability problem (SAT) [18] or an integer programming problem [19]. However, these approaches also meet difficult challenges. For example, even translating a small-size SUT into a SAT problem produces a very large clause set, and standard solvers cannot handle such a big optimization problem efficiently. The above research suggests that the combinatorial test suite generation problem should be translated into a simple and concise data structure for global optimization, and that an effective global optimization algorithm is needed to improve the solution quality.
3. Proposed Method
Fig. 1 is the flowchart of the combinatorial test suite global optimization mechanism. As it shows, a one-to-one correspondence is created between a test case in the complete set and a gene of a binary code string. Based on this, we can create a mapping between a test case subset of the complete set and a vertex in the binary code space. This means we can translate a combinatorial test suite generation problem into a binary-code-based global optimization problem. The binary string is a simpler and more compact data structure, and it is more suitable for global optimization computation. In this section, we first introduce the encoding and decoding procedure.
3.1 Binary Code Based Problem Transformation
Generally, we can use a covering array (CA) or a mixed level covering array (MCA) to describe a SUT [4]. The difference between a CA and an MCA is that each test parameter of a CA has the same value range, while in an MCA the ranges can differ. Actually, the CA can be regarded as a special case of the MCA, and the processing methods for them do not differ. To facilitate the computation process, we restrict this study to CA problems. The following is the definition of a covering array given by Cohen [20].
Definition 1 (Covering array): Let $N$, $k$, $t$, and $v$ be positive integers. A covering array, $CA(N; t, k, v)$, is an $N\times k$ array on $v$ symbols such that every $N\times t$ sub-array contains all ordered subsets of size $t$ from the $v$ symbols at least once. In such an array, $t$ is called the strength, $k$ the degree, and $v$ the order.
From this definition, the number of test cases in the complete set of $CA(N;t, k, v)$ is $v^k$, and the total number of combinations with covering strength $t$ is $v^tC_k^t$. To describe the problem transformation mechanism, further definitions are given in the following.
Definition 2: Let $\Phi$ be the complete test case set of $CA(N;t, k, v)$, then $|\Phi|=v^k$. Sort $\Phi$ in ascending order by the value of each test parameter. The id $j$ of test case $t_j$ is labeled from 0 to $|\Phi|-1$.
By this definition, the complete test case set $\Phi$ has an orderly structure. Furthermore, we can use the id $j$ to access a specific test case $t_j$ in $\Phi$. For example, Table Ⅲ is the complete test case set of the SUT in Table Ⅰ.
Table Ⅲ: An ordered complete test case set

| $A$ | $B$ | $C$ | $t_j$ | id $j$ |
|-----|-----|-----|-------|--------|
| a0 | b0 | c0 | $t_0$ | 0 |
| a0 | b0 | c1 | $t_1$ | 1 |
| a0 | b0 | c2 | $t_2$ | 2 |
| a0 | b1 | c0 | $t_3$ | 3 |
| $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
| a2 | b2 | c2 | $t_{26}$ | 26 |

Definition 3: The binary code sequence $I$ is a string of length $L$ whose genes $\alpha_j\in\{0, 1\}$, $j=0, 1, \ldots, L-1$.
As Fig. 1 shows, the binary code space is an $L$-dimensional hypercube. The vertex in this hypercube is a unique binary coding string.
Definition 4: Let $\gamma_i$ be a test case subset of $\Phi$ and $\Gamma=\{\gamma_i\mid\gamma_i\subseteq\Phi\}$. Then $\Gamma$ is the complete set of subsets of $\Phi$, and for $\forall\gamma_i$ and $\forall\gamma_j$ $(\gamma_i, \gamma_j \in \Gamma$ and $i\neq j)$, $\gamma_i\neq\gamma_j$. Let $\lambda_i$ be a vertex in the binary code space and $\Lambda=\{\lambda_0, \lambda_1, \ldots\}$. Then $\Lambda$ is the complete set of vertices in the binary code space.
Based on the above definitions, we first derive two characteristic theorems about the sets $\Gamma$ and $\Lambda$.
Theorem 1: Let $L=|\Phi|$. Then the number of elements of $\Gamma$ equals that of $\Lambda$; that is, $|\Gamma|=|\Lambda|=2^L$.
Proof: By Definition 3, there are $2^L$ unique vertices in the $L$-dimensional hypercube. By Definition 1, there are $|\Phi|=v^k$ distinct test cases in $\Phi$. Meanwhile, each test case has two statuses: it either belongs to a given test case subset or it does not. That means the number of elements of $\Gamma$ is $|\Gamma|=2^{|\Phi|}$. So, if $L=|\Phi|$, we get $|\Gamma|=|\Lambda|=2^L$.
Theorem 2: Make test case $t_j$ join the TS only when the gene $\alpha_j=1$ $(j=0, 1, \ldots, L-1)$. Then, both $\Gamma\rightarrow\Lambda$ and $\Lambda$ $\rightarrow$ $\Gamma$ are bijections.
Proof: First, we prove that both $\Gamma\rightarrow\Lambda$ and $\Lambda\rightarrow\Gamma$ are surjections under the above assumption. By Theorem 1, the sets $\Gamma$ and $\Lambda$ have the same number of elements when $L=|\Phi|$. Besides, both a test case in $\Phi$ and a gene in a binary string have two statuses: in or out, and 1 or 0. Meanwhile, each $t_j$ corresponds to exactly one $\alpha_j$ through the same id $j$. So, under the given condition, each element in $\Gamma$ corresponds to exactly one object in $\Lambda$, and each element in $\Lambda$ is mapped by a unique object in $\Gamma$, and vice versa. Hence both $\Gamma\rightarrow\Lambda$ and $\Lambda\rightarrow\Gamma$ are surjections. On the other hand, by Theorem 1, every element in $\Gamma$ and in $\Lambda$ is unique, i.e., different from all others. Therefore, both $\Gamma\rightarrow\Lambda$ and $\Lambda\rightarrow\Gamma$ are injections. Since $\Gamma\rightarrow\Lambda$ and $\Lambda\rightarrow\Gamma$ are both surjective and injective, both are bijections.
The bijection relationships between set $\Gamma$ and set $\Lambda$ show there is a one-to-one correspondence between the test case subset and the vertex in binary code space, which is a necessary condition for the problem transformation.
3.2 Decoding Mechanism
The decoding mechanism is used to parse the binary code sequence so as to construct the corresponding combinatorial test case subset. To facilitate this process, we use a non-negative integer from 0 to $v-1$ to mark each parameter symbol of the SUT. Then we can use a base-$v$ number to represent the details of a test parameter.
Definition 5: For a covering array $CA(N;t, k, v)$, a test case can be expressed as a sequence of $k$ digits, each of which is an integer from 0 to $v-1$ (a base-$v$ digit).
Let $x_i$ $(x_i\in\{0, 1, \ldots, v-1\})$ express the value of the $i$th parameter ($i\in\{0, 1, \ldots, k-1\}$) of test case $t_j$. Then an equation can be created to recover the details of test case $t_j$ by parsing the id $j$. The formula is

$ \begin{align} j=x_{k-1}\ast v^{k-1}+ \cdots + x_1\ast v^1+x_0\ast v^0. \end{align} $ (1)

For example, for the CA in Table Ⅰ, if the value of gene $\alpha_3$ in $I_i$ is 1, then $j=3$, and we can create the equation $3=0\times3^2+1\times3^1+0\times3^0$. From this equation we obtain the digit sequence 010, which is translated into the test case $t_3=$ a0b1c0. Repeating this process for each valuable gene in $I_i$, i.e., each gene whose value is 1, we can generate the whole test suite. The procedure is shown as follows.
Algorithm 1 Decoding
Input: positive integers $k$ and $v$, binary code of $I_i$
Output: test suite TS
1: for $j$ from 0 to $L-1$ do
2:   if $\alpha_j$ equals 1 then
3:     $dnum=j$;
4:     for $m$ from 0 to $k-1$ do
5:       $x = dnum\,\%\,v$; $dnum\,/\!=v$;
6:       $t_j[m]=x$;
7:     end for
8:     put $t_j$ into TS;
9:   end if
10: end for

In the above procedure, TS $=\{t_j \mid j=0, 1, \ldots, L-1\}$ with $t_j=x_{k-1}\cdots x_1x_0$, where each $x_i$ is a base-$v$ digit, and TS is initialized to $\emptyset$.
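A direct C++ rendering of Algorithm 1 might look as follows (our sketch; variable names mirror the pseudocode):

```cpp
// Decode a binary string I into a test suite: every gene with value 1 is
// expanded into the k base-v digits of its test-case id j via formula (1).
#include <iostream>
#include <vector>
using namespace std;

vector<vector<int>> decodeSuite(const vector<int>& I, int k, int v) {
    vector<vector<int>> TS;                  // TS starts empty
    for (size_t j = 0; j < I.size(); ++j) {
        if (I[j] != 1) continue;             // only genes with alpha_j == 1
        vector<int> t(k);
        long dnum = (long)j;
        for (int m = 0; m < k; ++m) {        // x = dnum % v; dnum /= v;
            t[m] = dnum % v;
            dnum /= v;
        }
        TS.push_back(t);                     // put t_j into TS; t[m] holds x_m
    }
    return TS;
}

int main() {
    vector<int> I(27, 0);
    I[3] = 1;                                // alpha_3 = 1  ->  t_3 = a0 b1 c0
    for (auto& t : decodeSuite(I, 3, 3))     // print digits x_2 x_1 x_0 as A B C
        cout << "a" << t[2] << " b" << t[1] << " c" << t[0] << "\n";
}
```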
3.3 Fitness Calculation and Constraint Handling
In the proposed approach, the CA problem is regarded as an optimization problem. So, we can formulate $\lambda_i$'s fitness as
$ \begin{align} \max f(\lambda_i), \quad \lambda_i\in\Lambda. \end{align} $ (2)

By Theorem 2, we know $\lambda_i\rightarrow\gamma_i$ is a one-to-one mapping. So, we can give the following definition.
Definition 6: The fitness of $\lambda_i$ and $\gamma_i$ is

$ \begin{align} f(\lambda_i)=f(\gamma_i)=f_i. \end{align} $ (3)

Since we expect to use as few test cases as possible to cover all combinations of strength $t$, the fitness of $\lambda_i$ is evaluated from two aspects. The first is the covering degree for the strength-$t$ combinations in CS. The second is the number of test cases encoded by $\lambda_i$. In order to balance the two, we use the following formula to calculate $f_i$:

$ \begin{align} f_i =10\ast\frac{\omega}{|CS|}+1-\frac{|TS|}{|\Phi|} \end{align} $ (4)

where $\omega$ is the number of covered combinations in CS and $|CS|$ is the number of all strength-$t$ combinations. Obviously, $0<\omega\leq|CS|$ and $0<|TS|\leq|\Phi|$. Based on our experience, the term $10\ast\omega/|CS|$, which lies between 0 and 10, measures the coverage of the combination set, while $1-|TS|/|\Phi|$, which lies between 0 and 1, rewards smaller test suites.
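Formula (4) is straightforward to implement. The sketch below (ours) evaluates it for the optimal suite of Table Ⅱ, where $\omega=|CS|=27$, $|TS|=9$ and $|\Phi|=27$:

```cpp
#include <iostream>
using namespace std;

// Formula (4): f_i = 10 * omega/|CS| + 1 - |TS|/|Phi|.
double fitness(double omega, double csSize, double tsSize, double phiSize) {
    return 10.0 * omega / csSize + 1.0 - tsSize / phiSize;
}

int main() {
    // Full coverage (omega = |CS| = 27) with 9 of the 27 possible test cases.
    cout << fitness(27, 27, 9, 27) << "\n";  // 10 + 1 - 1/3 = 10.6667
}
```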
Besides, for handling constraints, a specific calculation mechanism is used to filter all invalid genes out of $\lambda_i$ before fitness evaluation.
Definition 7: The constraint set is $CO=\{co_i \mid i=1, 2, \ldots \}$. For each available constraint $co_i$, we generate a corresponding binary string $I_i=\{\alpha_0\alpha_1 \cdots \alpha_{L-1}\}$, in which $\alpha_j=0$ if its corresponding test case $t_j$ is infeasible under $co_i$; otherwise, $\alpha_j=1$. Based on $\{I_1, I_2, \ldots\}$, we can generate an invalid-gene filtering string $I_{co}$:

$ \begin{align} I_{co} = I_1\wedge I_2\wedge \cdots \end{align} $ (5)

where $\wedge$ is the logical AND operation.
Using $I_{co}$, we can filter all invalid genes out of $\lambda_i$ easily by the following formula:

$ \begin{align} \lambda_i =\lambda_i \wedge I_{co}. \end{align} $ (6)

After this calculation, the fitness of $\lambda_i$ is evaluated by (4).
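With the binary strings held as bitsets, formulas (5) and (6) reduce to bitwise AND operations; a minimal sketch of ours, with two hypothetical constraints:

```cpp
#include <bitset>
#include <iostream>
using namespace std;

int main() {
    const int L = 27;                  // |Phi| = 3^3 for the SUT of Table I
    bitset<L> co1, co2;
    co1.set(); co2.set();              // start with every test case feasible
    co1.reset(0);                      // suppose constraint co_1 forbids t_0
    co2.reset(13);                     // suppose constraint co_2 forbids t_13
    bitset<L> Ico = co1 & co2;         // formula (5): I_co = I_1 AND I_2 AND ...
    bitset<L> lambda;
    lambda.set(0); lambda.set(3); lambda.set(13);
    lambda &= Ico;                     // formula (6): invalid genes filtered out
    cout << lambda << "\n";            // only gene 3 survives
}
```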
Based on the fitness function, the values of $\lambda_i$ and $\gamma_i$ can be evaluated, and the two always have the same fitness value. Consequently, if we sort the sets $\Gamma$ and $\Lambda$ by the fitness of their individuals, corresponding individuals occupy the same position in each fitness sequence. So we obtain Theorem 3.
Theorem 3: $\gamma_i$ and $\lambda_i$ have the same relative position in the fitness orderings of the sets $\Gamma$ and $\Lambda$, respectively.
This theorem gives a sufficient condition for the problem transformation. In summary, we conclude that a combinatorial test suite generation problem can be translated into a binary-code-based global optimization problem.
4. Cluster Searching Algorithm
As we know, the solution of an EA is likely to be affected by premature convergence and search stagnation. The adaptive mechanism is the most commonly used self-adjusting method in EAs. Nevertheless, in practical applications it is difficult to design a reliable and accurate self-adjusting strategy, so the adaptive mechanism can only make a limited impact on the performance of an EA. Recently, a new research trend aims at creating hybrid algorithm models by merging various search mechanisms and operators in order to improve the performance of the optimization system [21], [22]. Meanwhile, a new problem has emerged: how to balance diversity and complexity in a hybrid evolution system, and how to harmonize the specificity and coordination among multiple operators. We observe that organizations with a cluster-like structure play a particularly important role in the emergence and development of large and complicated evolutionary groups. Inspired by this, we propose to create a cluster organization in the population and use it to control and adjust the population's search process. Fig. 2 is the model of the cluster searching algorithm (CSA).
In CSA, the cluster $C$ is a connection relationship among individuals, generated by a clustering process. This structure divides the population search procedure into two parts: the searching-among-group process and the searching-inside-group process. Through the interactions among multiple clusters, the searching-among-group process explores the code space. In contrast, the searching-inside-group process refines the individuals in each group. Such a division of labor not only helps keep low coupling between the global and local search computations in the evolution system, but also provides an environment and foundation for merging various search mechanisms and operators and for adjusting the interrelation between the global and local search processes. Before describing CSA, we first give some basic definitions.
Definition 8: A population Pop of CSA consists of an $n$-tuple of strings $I_i$ $(i=0, 1, \ldots, n-1)$ of length $L$, where the genes $\alpha_j\in\{0, 1\}$, $j=0, 1, \ldots, L-1$. $I_i$ is called an individual. The fitness of $I_i$ is $f_i >0$.
Definition 9: The cluster $C$ is a virtual group organization, and $A_i^j$ is a member of group $C_i$ that maps to an individual $I_k$ in $Pop$. We can express this as

$ \begin{align} C=\{C_1, C_2, \ldots\}, \quad C_i=\{A_i^1, A_i^2, \ldots\}, \quad A_i^j=k, \quad i, j, k=1, 2, \ldots \end{align} $ (7)

$ \begin{align} C_i\cap C_j=\emptyset, \quad A_i^r\cap A_j^s=\emptyset \ \ (i\neq j), \quad Pop=\bigcup\nolimits_i C_i, \quad i, j, r, s=1, 2, \ldots \end{align} $ (8)

Definition 10: In cluster $C$, members of the same group are more similar, and members of different groups are more diverse. That is

$ \begin{align} Distance(A_i^r, A_j^q) > Distance(A_i^r, A_i^s), \quad i\neq j, \ \ i, j, r, s, q=1, 2, \ldots \end{align} $ (9)

Definition 11: The core member $A_i^c$ of $C_i$ is the member with the best fitness in group $C_i$. That is

$ \begin{align} f_{A_i^r}\leq f_{A_i^c} \quad \forall\, A_i^r \in C_i, \ \ i, r=1, 2, \ldots \end{align} $ (10)

In the iteration process of CSA, the cluster created in the $k$th generation is written as $C^k$, and the children generated by the search process of $C^k$ are written as $Ch^k$. The following are the main steps of CSA.
1. $C^0 = initializing (Pop^0)$;
2. While (the termination criteria are not reached) do
3. $Ch^k = searching among group (Pop^{k-1}, C^{k-1})$;
4. $C^k= clustering (Pop^{k-1}, Ch^k)$;
5. $Ch^k = searching inside group (Pop^{k-1}, Ch^k, C^k)$;
6. $Pop^k = cluster~selection (Pop^{k-1}, Ch^k, C^k)$;
7. End while.
In the first step, CSA initializes the population $Pop^0$ in the binary code space randomly. After that, each $I_i$ in $Pop^0$ is set to be a group $C_i$ $(i=1, 2, \ldots, n)$, and it is also the only member $A_i^1$ and the core member $A_i^c$ of $C_i$. Then the cluster $C^0$ is created. The following is the iterative search process of CSA.
4.1 Searching Among Group Process
In the searching-among-group process, we choose individuals from different groups and use the multi-point crossover operation $\oplus$ and the mutation operation $\odot$ to explore the code space. This step generates $n$ offspring individuals. The number of crossover points $\xi$ in the $\oplus$ operation is randomly generated between $a$ and $b$. The default parameters $a$ and $b$ satisfy $1<a<b<L/2$. In the $\odot$ operation, the gene mutation probability is $\rho\in(0, 0.05)$.
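For concreteness, the following C++ sketch (our interpretation of the $\oplus$ and $\odot$ operators used by Algorithm 2 below) implements multi-point crossover with a random number of cut points in $[a, b]$ and bitwise mutation with rate $\rho$:

```cpp
#include <algorithm>
#include <random>
#include <utility>
#include <vector>
using namespace std;

using Bits = vector<int>;
mt19937 rng(42);

// Multi-point crossover: xi cut points, segments swapped alternately.
pair<Bits, Bits> crossover(Bits p1, Bits p2, int a, int b) {
    int L = (int)p1.size();
    int xi = uniform_int_distribution<int>(a, b)(rng);
    vector<int> cuts(xi);
    for (int& c : cuts) c = uniform_int_distribution<int>(1, L - 1)(rng);
    sort(cuts.begin(), cuts.end());
    bool swapping = false;
    for (int j = 0, c = 0; j < L; ++j) {
        while (c < xi && cuts[c] == j) { swapping = !swapping; ++c; }
        if (swapping) swap(p1[j], p2[j]);
    }
    return {p1, p2};
}

// Mutation: flip each gene independently with probability rho.
void mutate(Bits& I, double rho) {
    bernoulli_distribution flip(rho);
    for (int& g : I)
        if (flip(rng)) g ^= 1;
}
```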
Algorithm 2 Searching among group
Input: population $Pop^{k-1}$ and cluster $C^{k-1}$
Output: $n$ offspring individuals in $Ch^k$
1: Randomly select two different groups $C_i$ and $C_j$ ($i\neq j$) from the current cluster $C^{k-1}$, then randomly select members $A_i^r$ and $A_j^s$ from $C_i$ and $C_j$, respectively.
2: Perform the $\oplus$ operation between $A_i^r$ and $A_j^s$ to generate binary strings $ch_1$ and $ch_2$, then perform the $\odot$ operation on $ch_1$ and $ch_2$, respectively. That is

$ \begin{align} ch_1, ch_2=\odot (A_i^r\oplus A_j^s). \end{align} $ (11)

3: The number of offspring individuals $p=p+2$. If $p<n$, return to step 1. Otherwise, exit.

4.2 Clustering Process
After the searching-among-group process, $2n$ individuals are dispersed widely in the code space. We then use a clustering process to assign them to different groups, so that individuals within a group have a high degree of similarity while different groups remain distinct. This process helps CSA analyze the spatial distribution of the population in the code space. The clustering process consists of two parts: a technique for calculating the distance between binary strings, and a grouping technique that minimizes the distance between individuals in the same group while maximizing the distance among groups.
4.2.1 Distance Calculation by PAD
From research on bounds for the optimal number of test cases, such as symbol fusing and the lower and upper bounds [23], [8], we can see that the valuable genes, i.e., those whose value is 1, occupy only a small proportion of the binary code. In order to emphasize the importance of these valuable genes in the clustering process, we propose to use the positive attribute distance (PAD) [24] instead of the traditional Hamming distance to calculate the distance between individuals.
The PAD of two binary sequences is defined as follows:

$ \begin{align} PAD(I_i, I_j) = \frac{2\Psi_{ij}}{\Psi_i+\Psi_j} \end{align} $ (12)

where $\Psi_i$ is the number of 1s in the $i$th binary sequence, $\Psi_j$ is the number of 1s in the $j$th binary sequence, and $\Psi_{ij}$ is the number of positions holding a 1 in both sequences. The result of PAD lies in the interval between 0 and 1, where 1 expresses absolute similarity and 0 expresses absolute diversity.
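The PAD of formula (12), together with the average-PAD group distance that formula (15) later builds from it, can be implemented directly (our sketch):

```cpp
#include <vector>
using namespace std;

// Formula (12): Psi_i, Psi_j count the 1s in each string; Psi_ij counts
// the positions holding a 1 in both strings.
double pad(const vector<int>& a, const vector<int>& b) {
    double psiA = 0, psiB = 0, psiAB = 0;
    for (size_t j = 0; j < a.size(); ++j) {
        psiA += a[j];
        psiB += b[j];
        psiAB += a[j] & b[j];
    }
    return psiA + psiB == 0 ? 0 : 2.0 * psiAB / (psiA + psiB);
}

// Formula (15): average pairwise PAD between the members of two groups.
double groupDistance(const vector<vector<int>>& Ci,
                     const vector<vector<int>>& Cj) {
    double sum = 0;
    for (auto& r : Ci)
        for (auto& s : Cj) sum += pad(r, s);
    return sum / (Ci.size() * Cj.size());
}
```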
Furthermore, we use the average PAD of the population as the indicator of population diversity $\gamma\in(0, 1)$. It is calculated by the following formula:

$ \begin{align} \gamma=\frac{2}{n(n-1)}\sum\limits_{i=2}^{n}\sum\limits_{j=1}^{i-1}PAD(I_i, I_j). \end{align} $ (13)

4.2.2 Clustering Based on PAD
Algorithm 3 gives the main steps of a hierarchical clustering process. In this process, we use two matrices, IDM and GDM, to store the individual distances $idm_{ij}\in IDM$ and group distances $gdm_{ij}\in GDM$. That is

$ idm_{ij}=\begin{cases} 0, &{\rm if} ~i\leq j, \ \, i, j=1, 2, \ldots \\ PAD(I_i, I_j), &{\rm else} \end{cases} $ (14)

$ gdm_{ij}= \begin{cases} 0, &{\rm if} \ i\leq j, \ \, i, j, r, s=1, 2, \ldots \\ \dfrac{\sum\limits_{A_i^r\in C_i} \sum\limits_{A_j^s\in C_j}PAD(A_i^r, A_j^s)} {|C_i||C_j|}, &{\rm else}. \end{cases} $ (15)

Meanwhile, the current population $Pop^{k-1}$ and its offspring individuals $Ch^k$ are mixed together to undergo the clustering process.
Algorithm 3 Clustering
Input: population $Pop^{k-1}$ and offspring individuals $Ch^k$
Output: new cluster $C^k$
1: Set each $I_i$ in the population to be a temporary group $C_i$. Initialize the IDM by (14), then initialize the GDM by (15) based on the IDM. Calculate the population diversity $\gamma$ by (13), and set the average group distance $\xi=\gamma$.
2: Find the minimum group distance $gdm_{pq}$ in the GDM and merge the groups $C_p$ and $C_q$ into a new group $C_{pq}$. Then update the GDM.
3: Calculate the new average group distance $\xi'$:

$ \begin{align} \xi'=\frac{2}{|C|(|C|-1)}\sum\limits_{i=2}^{|C|}\sum\limits_{j=1}^{i-1}gdm_{ij} \end{align} $ (16)

where $|C|$ is the number of groups after step 2. If $\xi'<\xi$, go to step 4; otherwise, set $\xi=\xi'$ and return to step 2.
4: Restore the groups $C_p$, $C_q$ and the GDM. After that, update the core member of each group and output $C$.

4.3 Searching Inside Group Process
The searching-inside-group process consists of two operations, SIG1 and SIG2, which are used to reduce the number of valuable genes in a binary string while keeping it able to satisfy all covering requirements. In this process, a parameter $m$ limits the number of executions of SIG1 and SIG2 for each group. In this paper, $m$ is adjusted by the following formula:
$ \begin{align} m = {\rm round}\left(\frac{m_2-m_1}{G-1}g+\frac{m_1G-m_2}{G-1}\right) \end{align} $ (17)

where $0\leq m_1<m_2$, $g=1, 2, \ldots, G$ is the generation number, $G$ is the maximum number of generations, and round($\cdot$) is the rounding operation. During the search process, (17) makes $m$ increase gradually from $m_1$ to $m_2$. Besides, a parameter $\lambda\in(0, 1)$ is used to adjust the operation probability between SIG1 and SIG2.
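A small helper (ours) implementing schedule (17); it assumes $G>1$ and yields $m_1$ at $g=1$ and $m_2$ at $g=G$:

```cpp
#include <cmath>

// Formula (17): m grows linearly from m1 to m2 over G generations.
int mSchedule(int g, int G, int m1, int m2) {
    return (int)lround((double)(m2 - m1) / (G - 1) * g
                       + (double)(m1 * G - m2) / (G - 1));
}
```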
Algorithm 4 Searching inside group
Input: $Pop^{k-1}$, $Ch^k$, $C^k$ and $\lambda$
Output: $m|C|$ offspring individuals in $Ch^k$
1: Run this operation for each $C_i$, $i=1, 2, \ldots, |C|$, and set $t=0$.
2: Let $UR(0, 1)$ be a uniformly distributed random number between 0 and 1. If $UR(0, 1)<\lambda$, go to step 2.1 to execute SIG1; otherwise, go to step 2.2 to execute SIG2.
2.1: SIG1: Select two members $A_i^r$ and $A_i^s$ from $C_i$ randomly, and execute the logical AND operation $\wedge$ between them. Then perform the multi-point crossover operation $\oplus$ between the string output by $A_i^r\wedge A_i^s$ and the core member $A_i^c$. This process can be expressed as

$ \begin{align} ch_1, ch_2=A_i^c\oplus (A_i^r\wedge A_i^s). \end{align} $ (18)

After that, put the generated individuals $ch_1$ and $ch_2$ into $C_i$, set $t=t+2$ and go to step 3.
2.2: SIG2: Randomly select three members $A_i^r$, $A_i^s$ and $A_i^t$ from $C_i$. Then perform the logical AND operation $\wedge$ among them and output a binary string $ch$:

$ \begin{align} ch=A_i^r\wedge A_i^s\wedge A_i^t. \end{align} $ (19)

If $ch$ can cover all strength-$t$ combinations in CS, go to step 2.2.1. Otherwise, go to step 2.2.2.
2.2.1: Randomly select a valuable gene in $ch$ (one whose value is 1) and change it to 0. If the updated $ch$ can still cover all strength-$t$ combinations in CS, repeat this step. Otherwise, recover the value of the last chosen gene to 1 and go to step 2.2.3.
2.2.2: Randomly select a gene in $ch$ whose value is 0 and change its value to 1. Repeat this step until $ch$ covers all strength-$t$ combinations in CS, then go to step 2.2.3.
2.2.3: Add the generated individual $ch$ to $C_i$, set $t=t+1$ and go to step 3.
3: If $t<m$, update the $A_i^c$ of $C_i$ and return to step 2. Otherwise, set $i=i+1$ and return to step 1.

Fig. 3 shows the practical calculation process of searching inside a group. In Fig. 3(a), two binary strings, $ch_1$ and $ch_2$, are generated by (18). In Fig. 3(b), a temporary individual $ch$ is generated by (19); then one gene of $ch$ is selected to change its value from 0 to 1, and a solution is obtained.
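The reduce/repair loop of SIG2 (steps 2.2.1 and 2.2.2) can be sketched as follows. This is our illustration; coversAll is an assumed predicate that tests whether a string still covers every strength-$t$ combination in CS:

```cpp
#include <algorithm>
#include <functional>
#include <random>
#include <vector>
using namespace std;

void sig2ReduceRepair(vector<int>& ch,
                      const function<bool(const vector<int>&)>& coversAll,
                      mt19937& rng) {
    uniform_int_distribution<size_t> pick(0, ch.size() - 1);
    if (coversAll(ch)) {
        // Step 2.2.1: greedily drop valuable genes while coverage still holds.
        int ones = (int)count(ch.begin(), ch.end(), 1);
        while (ones > 0) {
            size_t j = pick(rng);
            if (ch[j] == 0) continue;
            ch[j] = 0;
            if (coversAll(ch)) --ones;
            else { ch[j] = 1; break; }       // recover the last gene and stop
        }
    } else {
        // Step 2.2.2: set genes to 1 until all combinations are covered.
        while (!coversAll(ch)) ch[pick(rng)] = 1;
    }
}
```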
4.4 Cluster Selection Process
After the cluster searching process, the scale of the current population has increased to $2n+m|C|$. In order to satisfy the computation requirement for the next generation, we need to select $n$ individuals from the current population to form the next population. In the cluster selection process, the surviving individuals are taken from the groups in turn.
Algorithm 5 Cluster selection
Input: $Pop^{k-1}$, $Ch^k$, $C^k$
Output: $Pop^k$
1: For each $C_i$, sort its members in descending order of fitness. After that, set $i=1$, $k=1$, $j=1$.
2: Take member $A_i^k$ from $C_i$ to be a surviving individual $I_j$ and join the next population. This can be expressed as

$ \begin{align} I_j=A_i^k, \quad j=1, 2, \ldots, n, \quad i=j\,\%\,|C|+1, \quad k={\rm round}\left(\frac{j}{|C|}\right)+1. \end{align} $ (20)

3: If $j<n$, set $j=j+1$ and return to step 2; otherwise, exit.

5. Simulation Experiments
In this section, we implemented 3 algorithms in C++: a real-coded genetic algorithm with the one-test-at-a-time mechanism (GA/OTAT), a binary-coded GA with the global optimization approach (GA/BCGO), and CSA. Meanwhile, 32 CA problems were chosen to test the above 3 algorithms and TCA [16], whose source code is implemented in C++ and available online1, on a 2.1 GHz AMD Phenom PC with 2 GB memory.
1https://github.com/leiatpku/TCA
5.1 Experimental Settings
An individual $x_k, \ldots, x_2, x_1$ in GA/OTAT is a $k$-dimensional real vector. The gene value of each dimension is initialized randomly between 0 and 1. In the decoding process, the gene value of each dimension is translated into an integer by the formula $\lfloor x_i\cdot v\rfloor$, and this integer gives the corresponding parameter symbol of the SUT. For example, if the gene $x_1=0.6$ in an individual, we get $1=\lfloor 0.6\cdot3\rfloor$. For the CA in Table Ⅰ, the integer 1 means $x_1\rightarrow a_1$. Besides, the population size $n$ of GA/OTAT is 100. The search operations of GA/OTAT include algebraic crossover, non-uniform mutation and linear sort selection. The following is the main computation procedure of GA/OTAT.
1. Initialize test suite TS $=\emptyset$;
2. Initialize combination set CS;
3. While (CS $\neq\emptyset$) do
4. Initialize population of GA/OTAT randomly;
5. While(the termination criteria are not reached)do
6. search a best test case $t_i$;
7. End while;
8. Make $t_i$ join TS;
9. Delete the covered combinations in CS;
10. End while;
11. Output TS.
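Returning to the gene decoding described above, a minimal sketch (ours) of the real-gene-to-symbol mapping $\lfloor x_i\cdot v\rfloor$:

```cpp
#include <cmath>
#include <iostream>
#include <vector>
using namespace std;

int main() {
    int v = 3;
    vector<double> individual = {0.6, 0.1, 0.9};   // one real gene per parameter
    for (double x : individual)
        cout << (int)floor(x * v) << " ";          // 0.6 -> 1, i.e. symbol a1
    cout << "\n";                                  // prints: 1 0 2
}
```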
Both GA/BCGO and CSA use (4) to evaluate the quality of a test suite. Based on our experience, we set the population size $n=60$ in CSA. GA/BCGO and CSA use the same maximum number of iterations in the following trials, and CSA performs more search operations than GA/BCGO in each iteration. To be fair, we set a larger population size of 100 for GA/BCGO, so that its number of search operations per iteration is not less than CSA's. Besides, GA/BCGO and CSA use the same crossover and mutation operations. In the multi-point crossover operation, the parameters $a$ and $b$, which set the lower and upper limits on the number of crossover points, are set to $a={\rm round}(L/10)$ and $b={\rm round}(L/4)$. The gene mutation probability is $\rho=0.01$. GA/BCGO, however, uses the linear sort selection operation to generate the next population. Besides, the parameters $m_1$ and $m_2$ of CSA, which limit the computation times $m$ in the searching-inside-group process, are set to $m_1={\rm round}(n/10)$ and $m_2 ={\rm round}(n/4)$. The parameter $\lambda$, which controls the execution probability of SIG1 and SIG2, is set to $\lambda=0.4$. The computation procedure of CSA and GA/BCGO is
1. Encode the binary code space;
2. Initialize population of CSA or GA/BCGO;
3. While(the termination criteria are not reached)do
4. search the best genetic string $I_i$ in code space;
5. End while;
6. Decode $I_i$ and output TS.
At the beginning of TCA, an initialization step constructs a CA set covering all valid $t$-tuples, using a simple one-test-at-a-time greedy strategy. After that, TCA executes search steps to adjust the CA set until the time budget is reached. During the search process, TCA switches between two modes, the greedy mode and the random mode. With probability $p=0.001$, TCA works in the random mode; otherwise (with probability $1-p$), TCA works in the greedy mode. In each test run, we set a cutoff time for TCA in advance.
5.2 Results and Discussions
We ran GA/OTAT, GA/BCGO, CSA and TCA on the 32 CA problems over 30 independent trials. The experimental results of these 4 algorithms are shown in Table Ⅳ and Table Ⅴ, together with the results of HHH [17] and the integer programming method [19] for part of the test problems.
Table Ⅳ: Experimental statistical results for 16 CA problems with strength-2. For each method, "Best" and "Ave." are the best and average $|TS|$, and "Time" is the average running time in seconds; "IP" denotes the integer programming method [19].

| $v$ | $k$ | $\lvert\Phi\rvert$ | $N$ | GA/OTAT Best | Ave. | Time (s) | GA/BCGO Best | Ave. | Time (s) | CSA Best | Ave. | Time (s) | TCA Best | Ave. | HHH Best | Ave. | IP Best | Ave. | Time (s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 5 | 32 | 6 | 7 | 7.7 | 5.97 | 6 | 6 | 0.27 | 6 | 6 | 0.36 | 6 | 6.5 | - | - | 6 | - | 0.70 |
| 2 | 6 | 64 | - | 7 | 8.1 | 6.54 | 6 | 6.2 | 0.68 | 6 | 6 | 1.10 | 6 | 6.7 | - | - | 6 | - | 16.57 |
| 2 | 7 | 128 | - | 7 | 8.4 | 10.89 | 6 | 9.4 | 1.43 | 6 | 6 | 1.99 | 6 | 7.3 | - | - | 6 | - | 441.2 |
| 2 | 8 | 256 | - | 8 | 10.7 | 12.71 | 6 | 11.3 | 3.91 | 6 | 6 | 4.70 | 6 | 7.0 | - | - | - | - | - |
| 2 | 9 | 512 | - | 8 | 11.2 | 14.5 | 40 | 51.6 | 7.49 | 6 | 6 | 9.60 | 7 | 7.9 | - | - | - | - | - |
| 2 | 10 | 1024 | - | 8 | 12.9 | 25.3 | 60 | 70.8 | 19.5 | 6 | 8.4 | 23.1 | 7 | 8.2 | - | - | - | - | - |
| 2 | 11 | 2048 | 7 | 9 | 14.2 | 28.8 | 141 | 167 | 40.3 | 7 | 9.5 | 49.0 | 8 | 8.7 | - | - | - | - | - |
| 2 | 12 | 4096 | - | 9 | 16.7 | 32.9 | 196 | 228 | 112.4 | 20 | 33.9 | 145 | 8 | 8.8 | - | - | - | - | - |
| 3 | 4 | 81 | 9 | 9 | 10.6 | 8.8 | 9 | 9 | 1.29 | 9 | 9 | 1.70 | 9 | 9.6 | 9 | 9 | 9 | - | 0.08 |
| 3 | 5 | 243 | 11 | 11 | 12.8 | 17.2 | 11 | 17.2 | 3.73 | 11 | 11 | 5.31 | 11 | 11.3 | 11 | 11.35 | 13 | - | * |
| 3 | 6 | 729 | 12 | 16 | 18.1 | 19.6 | 44 | 57.8 | 11.5 | 12 | 12 | 14.9 | 13 | 14.1 | 13 | 14.2 | - | - | - |
| 3 | 7 | 2187 | - | 19 | 21.7 | 36.9 | 157 | 165 | 39.1 | 12 | 16.9 | 50.2 | 13 | 14.5 | 14 | 15 | - | - | - |
| 3 | 8 | 6561 | 13 | 20 | 27.9 | 40.3 | 224 | 277 | 133 | 88 | 97.5 | 148 | 14 | 14.7 | 15 | 15.6 | - | - | - |
| 4 | 5 | 1024 | 16 | 21 | 29.5 | 68.1 | 56 | 71.5 | 22.4 | 16 | 16.9 | 25.6 | 18 | 19.8 | - | - | - | - | - |
| 4 | 6 | 4096 | 19 | 25 | 36.7 | 74.4 | 197 | 227 | 194 | 22 | 34.2 | 223 | 21 | 22.7 | - | - | - | - | - |
| 4 | 7 | 16384 | 21 | 28 | 40.9 | 93 | 320 | 342 | 417 | 116 | 135 | 508 | 24 | 25.2 | - | - | - | - | - |

Table Ⅴ: Experimental statistical results of test case numbers for 16 CA problems with strength-3

| $v$ | $k$ | $\lvert\Phi\rvert$ | $N$ | GA/OTAT Best | Ave. | Time (s) | GA/BCGO Best | Ave. | Time (s) | CSA Best | Ave. | Time (s) | TCA Best | Ave. | HHH Best | Ave. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 5 | 32 | 10 | 11 | 13.6 | 11.1 | 10 | 11.2 | 0.41 | 10 | 10 | 0.42 | 10 | 10.6 | - | - |
| 2 | 6 | 64 | - | 13 | 14.2 | 17.6 | 10 | 12.6 | 1.29 | 10 | 10 | 1.34 | 10 | 10.7 | - | - |
| 2 | 7 | 128 | - | 14 | 15.6 | 24.4 | 12 | 15.1 | 2.09 | 10 | 10 | 2.15 | 10 | 11.2 | - | - |
| 2 | 8 | 256 | - | 17 | 19.8 | 31.8 | 15 | 19.6 | 5.22 | 10 | 10 | 5.62 | 11 | 12.0 | - | - |
| 2 | 9 | 512 | - | 19 | 21.9 | 41.0 | 17 | 23.9 | 9.85 | 10 | 10 | 11.0 | 11 | 11.9 | - | - |
| 2 | 10 | 1024 | - | 22 | 25.2 | 52.3 | 41 | 54.7 | 25.9 | 10 | 11.8 | 25.3 | 13 | 15.8 | - | - |
| 2 | 11 | 2048 | 12 | 24 | 27.7 | 69.9 | 80 | 92.8 | 56.8 | 12 | 19.4 | 55.5 | 14 | 15.5 | - | - |
| 2 | 12 | 4096 | 15 | 28 | 30.9 | 82.1 | 239 | 257 | 131 | 69 | 85.7 | 133 | 17 | 18.1 | - | - |
| 3 | 4 | 81 | 27 | 29 | 34.2 | 68.1 | 27 | 27 | 1.62 | 27 | 27 | 1.91 | 27 | 28.9 | 27 | 29.45 |
| 3 | 5 | 243 | - | 30 | 36.1 | 71.3 | 27 | 34.2 | 3.93 | 27 | 27 | 5.53 | 29 | 31.6 | 39 | 41.25 |
| 3 | 6 | 729 | 33 | 36 | 42.5 | 80.5 | 55 | 62.9 | 15.5 | 33 | 35.5 | 15.4 | 34 | 36.1 | 33 | 39 |
| 3 | 7 | 2187 | 39 | 48 | 54.4 | 89.9 | 195 | 217 | 87.9 | 49 | 54.9 | 93.2 | 46 | 47.7 | 49 | 50.8 |
| 3 | 8 | 6561 | 42 | 54 | 58.9 | 99.6 | 251 | 275 | 145 | 156 | 170 | 157 | 51 | 52.2 | 52 | 53.65 |
| 4 | 5 | 1024 | 64 | 72 | 81 | 148 | 60 | 78.2 | 22.7 | 64 | 67.9 | 27.6 | 66 | 68.7 | - | - |
| 4 | 6 | 4096 | 88 | 93 | 105 | 198 | 229 | 249 | 212 | 194 | 216 | 231 | 94 | 96.6 | - | - |
| 4 | 7 | 16384 | - | 99 | 117 | 153 | 370 | 379 | 499 | 233 | 247 | 527 | 96 | 98.4 | - | - |

In the first round of experiments, 16 CA problems with strength-2 were tested. In the first 8 tests, we set $v=2$ and $k=5, 6, \ldots, 12$, respectively. The maximum numbers of running iterations of GA/OTAT, GA/BCGO and CSA were set to 200, 500 and 1000 for $k$ less than or equal to 6, 9 and 12, respectively. In the next 5 tests, $v=3$ and $k=4, 5, \ldots, 8$, respectively; the maximum numbers of iterations of the 3 algorithms were set to 200, 500 and 1000 for $k$ less than or equal to 4, 6 and 8, respectively. In the last 3 tests, $v=4$ and $k=5, 6, 7$, respectively; the maximum numbers of iterations were set to 1000 and 2000 for $k$ less than or equal to 6 and 7, respectively. The statistical results of these 16 tests are shown in Table Ⅳ.
In the second round of experiments, the covering criterion of the above 16 CA problems was changed to strength-3, while the parameters $v$, $k$ and the maximum numbers of iterations kept the same values as in the first round. Table Ⅴ shows the statistical results for these 16 CA problems. For a fair comparison, the cutoff time of TCA for each CA problem was set to an integer obtained by rounding up CSA's running time. HHH does not report running times and only gives the best and average $|TS|$ for $CA(N; t, 3^k)$, $4\leq k\leq 8$, with strength-2 and strength-3. The integer programming method only provides results for 8 small-size CA problems with strength-2. The $\ast$ in Table Ⅳ indicates the best solution found at the point when the process was stopped, after running for about 6.5 hours. Besides, the number $N$ is the known minimal $|TS|$ for the 32 CA problems, which can be found on Colbourn's website2.
2http://www.public.asu.edu/~ccolbou/src/tabby/catable.html
From Table Ⅳ, we can see that CSA obtained the best solutions for 11 of the 16 CA problems and GA/BCGO for 6, while GA/OTAT only obtained approximate solutions for all 16 problems. The results in Table Ⅴ show a similar pattern. Taking a holistic view, GA/BCGO can find the optimal solution with high probability when the problem scale $|\Phi|$ is less than 250. It is worth noting that CSA improves the performance of the global optimization mechanism immensely: it can find the optimal solution with high probability when $|\Phi|$ is less than 2000. In contrast, GA/OTAT only obtains approximate solutions even when $|\Phi|$ is very small. However, when $|\Phi|$ exceeds 4000, the solution quality and average CPU time of GA/OTAT begin to clearly surpass those of CSA. Moreover, even when $|\Phi|$ exceeds 16 000, the solutions and average CPU times of GA/OTAT still remain acceptable approximations.
Comparing the experimental results of GA/OTAT, TCA and HHH, we can see that the meta-heuristic algorithms remarkably improve the average quality of $|TS|$ relative to the traditional one-test-at-a-time algorithm, because their search steps can adjust the generated CA set dynamically. However, the heuristic algorithms must spend many array transformation operations adjusting the CA set and combination set in their greedy and random search processes, which clearly increases their running time as the covering strength grows. In [16], TCA sets its cutoff time up to 1000 seconds in order to obtain good performance. In the above experiments, we can see that an extended cutoff time is very helpful for improving the stability of TCA's solutions. It is worth noting that the global optimization ability of TCA is limited, even though it can adjust the generated CA set constantly; so, in most experimental trials, TCA is likely to generate a high-quality approximate solution even when the problem scale $|\Phi|$ is not very large. Under the same runtime, CSA shows a better average quality of $|TS|$ than TCA when $|\Phi|$ is less than 2000. Meanwhile, the integer programming results show that it is difficult for an integer programming method to solve problems with more than about 100 integer variables, whereas CSA performs well when $|\Phi|$ is less than 2000. This means the proposed approach acquires a more powerful global optimization ability through CSA than the others. Moreover, the problem translation mechanism makes CSA less sensitive to the strength $t$ than TCA and HHH, which helps CSA keep a high performance when the covering strength is increased.
On the other hand, comparing the experimental data of CSA and GA/BCGO in Table Ⅳ and Table Ⅴ, we also find that the best solutions of GA/BCGO are very close to CSA's when $|\Phi|$ is less than 250. But as $|\Phi|$ increases, the solution quality of GA/BCGO begins to decline clearly. We believe the traditional search operations, such as crossover and mutation, are good at rearranging genetic structure and finding new code patterns, but not so good at refining existing genetic structure, especially when the gene sequence is long. In CSA, the searching-among-group process explores promising gene codes, while the searching-inside-group process reduces the multiple gene patterns held in different groups. Through the collaboration between these two processes, CSA improves the performance of the proposed approach remarkably.
Above all, we believe the proposed global optimization mechanism is a more efficient calculation approach for small-size SUTs across multiple covering strengths.
5.3 Parameter Analysis
In this section, we discuss how to adjust and control the coordination between the searching-among-group process and the searching-inside-group process.
In CSA, the parameter $m$ controls the number of searching-inside-group computations in each group, which affects the proportion between global and local searching. To explore the proper range and adjustment rule for $m$, the CA problem ($v=2$, $k=10$, strength-2) is used as a test. In this experiment, $m$ is assigned 3 different settings: $m=m_1$ ($m_1=n/10$), $m=m_2$ ($m_2=n/4$), and $m$ increased linearly from $m_1$ to $m_2$ by (17). The population size $n$ is 60, and the other parameters of CSA are the same as in the previous section.
Fig. 4 shows the statistics of this experiment over 30 independent trials. The boxplot shows the dispersion of the 30 resulting values of $|TS|$. We can see that too small an $m$ tends to degrade the solution quality, while too large an $m$ makes the distribution of solutions much more scattered. From the mean-value curve of $\gamma$, which shows the changing trend of population diversity, we find that a larger $m$ makes the curve fall rapidly in the early stage of the search process, while a smaller $m$ makes it fall slowly. A quick loss of diversity in the early search stage increases the probability of premature convergence; conversely, too slow a convergence affects the solution accuracy. Formula (17) makes $m$ increase gradually during the search. This test shows that it is a feasible balancing mechanism: the population keeps a higher diversity against premature convergence in the early search stage, while converging faster to improve the accuracy of the solution in the later stage.
The parameter $\lambda$ is another important control parameter, which adjusts the execution probability between SIG1 and SIG2. To explore the proper range of $\lambda$, the CA problem ($v=3$, $k=6$, strength-3) is used for the test. In this experiment, $\lambda$ is set to 3 different values: 0, 0.4 and 1. The values of the other parameters of CSA are the same as in the previous section.
Fig. 5 shows the statistics of this experiment over 30 independent trials. Obviously, this parameter has a great influence on the gene sequence reduction process. If $\lambda=0$, only the SIG2 operation is executed in the searching-inside-group process; the boxplot shows that this makes the distribution of solutions much more scattered and the solution quality more unstable. Conversely, if $\lambda=1$, only the SIG1 operation is executed, and the gene sequence cannot be reduced effectively. The mean-value curve of $\gamma$ shows the same trends: when $\lambda=0$, the curve falls more rapidly, while when $\lambda=1$, the falling speed of $\gamma$ is too slow to ensure the quality of population convergence. Therefore, the efficiency of the searching-inside-group process depends on a reasonable coordination between SIG1 and SIG2. From this experiment, 0.4 is a feasible empirical value for this parameter.
6. Conclusion
In this paper, we propose a cluster-searching-driven global optimization and generation method for combinatorial test data. A program based on the proposed method has been implemented and can be executed on a compatible PC. The experimental results show that the proposed method performs well on small-size CA problems: within a reasonable time, the optimal test suite can be obtained with high probability when the scale of the complete test case set is less than 2000. But the quality of its solutions declines clearly when the number of test cases exceeds 4000, so the proposed method is not ideal for large-size CA problems. Furthermore, we discussed the 2 main control parameters of CSA and gave a feasible approach to adjusting them. In future work, we will try to make this approach applicable to MCA problems.
-
Table 1: Some mainstream open-source deep learning tools and their download addresses

| Tool | Description and remarks | Download address |
|------|-------------------------|------------------|
| Caffe[112] | Open-source deep learning tool released by the BVLC lab at UC Berkeley; one of the most widely used deep learning experiment platforms | https://github.com/BVLC/caffe |
| Theano[113-114] | Python-based open-source deep learning simulation tool | https://github.com/Theano/Theano |
| Torch[115] | Tool based on the Lua scripting language; supports embedded platforms such as iOS and Android | http://torch.ch/ |
| Purine[116] | Supports multiple GPUs and provides linear acceleration | https://github.com/purine/purine2 |
| MXNet[117] | C++ deep learning library released by the Deep Machine Learning Community (DMLC), organized under Baidu's lead | https://github.com/dmlc/mxnet |
| DIGITS[118] | Web-based visual deep learning tool developed and released by NVIDIA; supports Caffe and Torch project code | https://github.com/NVIDIA/DIGITS |
| ConvNet[119] | One of the earliest GPU-enabled open-source CNN tools; the code provided by the winner of ILSVRC 2012 | https://code.google.com/p/cuda-convnet/ |
| Cuda-ConvNet2[109] | ConvNet with multi-GPU support | https://github.com/akrizhevsky/cuda-convnet2 |
| DeepCNet[120] | Open-source CNN simulation tool released by Prof. Graham of the University of Warwick, UK; won first place in the ICDAR 2013 online handwritten Chinese character recognition competition | https://github.com/btgraham/SparseConvNet |
| Petuum[121] | Open-source machine learning platform released by CMU, parallelized and distributed over multi-CPU/GPU clusters; besides common deep learning algorithms, it also implements many traditional machine learning algorithms and can be deployed on cloud computing platforms | https://github.com/petuum/bosen/wiki |
| CURRENNT[122] | GPU-enabled recurrent neural network library | http://sourceforge.net/projects/currennt/ |
| Minerva[123] | Deep learning tool with multi-GPU parallelization released by the Deep Machine Learning Community (DMLC) | https://github.com/dmlc/minerva |
| TensorFlow[124] | Machine learning development tool with visualization released by Google; supports multi-CPU and multi-GPU parallel simulation and deep learning models such as CNNs and RNNs | https://github.com/tensorflow/tensorflow |
| DMTK[125] | A general distributed open-source deep learning toolkit released by Microsoft | https://github.com/Microsoft/DMTK |

Table 2: Comparison of different methods on the CASIA-OLHWDB1.1 online handwritten Chinese character dataset
| Method | Accuracy (%) | Pseudo-sample distortion | Model ensemble (number of models) |
|--------|--------------|--------------------------|-----------------------------------|
| Best traditional method: DFE+DLQDF[10] | 94.85 | × | × |
| HDNN-SSM-MCE[66] | 89.39 | × | × |
| MCDNN[127] | 94.39 | √ | √ (35) |
| DeepCNet[40] | 96.42 | √ | × |
| DeepCNet + 8-direction histogram features[40] | 96.18 | √ | × |
| DCNN (fusion of 4 kinds of domain knowledge)[60] | 96.35 | √ | × |
| HSP-DCNN (ensemble over 4 kinds of domain knowledge)[64] | 96.87 | √ | √ (8) |
| DeepCNet-FMP (single test)[132] | 96.74 | √ | × |
| DeepCNet-FMP (multiple tests)[132] | 97.03 | √ | √ (12 tests) |
| DropSample-DCNN[61] | 96.55 | √ | × |
| DropSample-DCNN (ensemble)[61] | 97.06 | √ | √ (9) |
Table 3: Comparison of different deep learning methods on CASIA-OLHWDB1.0-1.1 and the ICDAR 2013 Online CompetitionDB (%)
Table 4: Comparison of different traditional and deep-learning-based methods on the ICDAR 2013 Offline CompetitionDB
| Method | Top-1 (%) | Top-5 (%) | Top-10 (%) | Model size |
|--------|-----------|-----------|------------|------------|
| HCCR-Gradient-GoogLeNet[77] | 96.28 | 99.56 | 99.80 | 27.77 MB |
| HCCR-Gabor-GoogLeNet[77] | 96.35 | 99.60 | 99.80 | 27.77 MB |
| HCCR-Ensemble-GoogLeNet[77] (average of 4 models) | 96.64 | 99.64 | 99.83 | 110.91 MB |
| HCCR-Ensemble-GoogLeNet[77] (average of 10 models) | 96.74 | 99.65 | 99.83 | 277.25 MB |
| CNN-Fujitsu[39] | 94.77 | - | 99.59 | 2460 MB |
| MCDNN-INSIA[74] | 95.79 | - | 99.54 | 349 MB |
| MQDF-HIT[39] | 92.61 | - | 98.99 | 120 MB |
| MQDF-THU[39] | 92.56 | - | 99.13 | 198 MB |
| DLQDF[39] | 92.72 | - | - | - |
| ART-CNN[76] | 95.04 | - | - | 51.64 MB² |
| R-CNN Voting[76] | 95.55 | - | - | 51.64 MB² |
| ATR-CNN Voting[76] | 96.06 | - | - | 206.56 MB² |
| MQDF-CNN[78] | 94.44 | - | - | - |
| Multi-CNN Voting[129] | 96.79 | - | - | - |

² Estimated from the model parameters reported in [76] (number of CNN layers, kernel sizes and counts per layer, pooling layer sizes and counts, number of fully connected units), assuming each parameter is stored as a 4-byte floating-point number.
Table 5: Comparison of different methods on the ICDAR 2013 Offline Text CompetitionDB (%)
-
[1] Hildebrandt T H, Liu W T. Optical recognition of handwritten Chinese characters: advances since 1980. Pattern Recognition, 1993, 26(2): 205-225. doi: 10.1016/0031-3203(93)90030-Z
[2] Suen C Y, Berthod M, Mori S. Automatic recognition of handprinted characters: the state of the art. Proceedings of the IEEE, 1980, 68(4): 469-487. doi: 10.1109/PROC.1980.11675
[3] Tai J W. Some research achievements on Chinese character recognition in China. International Journal of Pattern Recognition and Artificial Intelligence, 1991, 5(01n02): 199-206. doi: 10.1142/S0218001491000132
[4] Liu C L, Jaeger S, Nakagawa M. Online recognition of Chinese characters: the state-of-the-art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(2): 198-213. doi: 10.1109/TPAMI.2004.1262182
[5] Cheriet M, Kharma N, Liu C L, Suen C Y. Character Recognition Systems: A Guide for Students and Practitioners. USA: John Wiley & Sons, 2007
[6] Plamondon R, Srihari S N. Online and off-line handwriting recognition: a comprehensive survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(1): 63-84. doi: 10.1109/34.824821
[7] Dai R W, Liu C L, Xiao B H. Chinese character recognition: history, status and prospects. Frontiers of Computer Science in China, 2007, 1(2): 126-136. doi: 10.1007/s11704-007-0012-5
[8] Liu C L. High accuracy handwritten Chinese character recognition using quadratic classifiers with discriminative feature extraction. In: Proceedings of the 18th International Conference on Pattern Recognition. Hong Kong, China: IEEE, 2006. 942-945
[9] Long T, Jin L W. Building compact MQDF classifier for large character set recognition by subspace distribution sharing. Pattern Recognition, 2008, 41(9): 2916-2925. doi: 10.1016/j.patcog.2008.02.009
[10] Liu C L, Yin F, Wang D H, Wang Q F. Online and offline handwritten Chinese character recognition: benchmarking on new databases. Pattern Recognition, 2013, 46(1): 155-162. doi: 10.1016/j.patcog.2012.06.021
[11] Zhang H G, Guo J, Chen G, Li C G. HCL2000: a large-scale handwritten Chinese character database for handwritten character recognition. In: Proceedings of the 10th International Conference on Document Analysis and Recognition. Barcelona, Spain: IEEE, 2009. 286-290
[12] Qian Yue-Liang, Lin Shou-Xun, Liu Qun, Liu Yang, Liu Hong, Xie Ying. Design and construction of HTRDP corpora resources for Chinese language processing and intelligent human-machine interaction. Chinese High Technology Letters, 2005, 15(1): 107-110 (in Chinese)
[13] Jin L W, Gao Y, Liu G, Liu G Y, Li Y Y, Ding K. SCUT-COUCH2009: a comprehensive online unconstrained Chinese handwriting database and benchmark evaluation. International Journal on Document Analysis and Recognition, 2011, 14(1): 53-64. doi: 10.1007/s10032-010-0116-6
[14] Liu C L, Sako H, Fujisawa H. Handwritten Chinese character recognition: alternatives to nonlinear normalization. In: Proceedings of the 7th International Conference on Document Analysis and Recognition. Edinburgh, UK: IEEE, 2003. 524-528
[15] Liu C L, Marukawa K. Pseudo two-dimensional shape normalization methods for handwritten Chinese character recognition. Pattern Recognition, 2005, 38(12): 2242-2255. doi: 10.1016/j.patcog.2005.04.019
[16] Jin L W, Huang J C, Yin J X, He Q H. Deformation transformation for handwritten Chinese character shape correction. In: Proceedings of the 3rd International Conference on Advances in Multimodal Interfaces. Beijing, China: Springer, 2000. 450-457
[17] Miyao H, Maruyama M. Virtual example synthesis based on PCA for off-line handwritten character recognition. In: Proceedings of the 7th International Workshop on Document Analysis Systems VII. Nelson, New Zealand: Springer, 2006. 96-105
[18] Chen G, Zhang H G, Guo J. Learning pattern generation for handwritten Chinese character using pattern transform method with cosine function. In: Proceedings of the 2006 International Conference on Machine Learning and Cybernetics. Dalian, China: IEEE, 2006. 3329-3333
[19] Leung K C, Leung C H. Recognition of handwritten Chinese characters by combining regularization, Fisher's discriminant and distorted sample generation. In: Proceedings of the 10th International Conference on Document Analysis and Recognition. Barcelona, Spain: IEEE, 2009. 1026-1030
[20] Okamoto M, Nakamura A, Yamamoto K. Direction-change features of imaginary strokes for on-line handwriting character recognition. In: Proceedings of the 14th International Conference on Pattern Recognition. Brisbane, QLD: IEEE, 1998. 1747-1751
[21] Okamoto M, Yamamoto K. On-line handwriting character recognition using direction-change features that consider imaginary strokes. Pattern Recognition, 1999, 32(7): 1115-1128. doi: 10.1016/S0031-3203(98)00153-8
[22] Ding K, Deng G Q, Jin L W. An investigation of imaginary stroke technique for cursive online handwriting Chinese character recognition. In: Proceedings of the 10th International Conference on Document Analysis and Recognition. Barcelona, Spain: IEEE, 2009. 531-535
[23] Jin L W, Wei G. Handwritten Chinese character recognition with directional decomposition cellular features. Journal of Circuits, Systems, and Computers, 1998, 8(4): 517-524. doi: 10.1142/S0218126698000316
[24] Bai Z L, Huo Q. A study on the use of 8-directional features for online handwritten Chinese character recognition. In: Proceedings of the 8th International Conference on Document Analysis and Recognition. Seoul, Korea: IEEE, 2005. 262-266
[25] Liu C L, Zhou X D. Online Japanese character recognition using trajectory-based normalization and direction feature extraction. In: Proceedings of the 10th International Workshop on Frontiers in Handwriting Recognition. La Baule, France: IEEE, 2006
[26] Ge Y, Huo Q, Feng Z D. Offline recognition of handwritten Chinese characters using Gabor features, CDHMM modeling and MCE training. In: Proceedings of the 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing. Orlando, FL, USA: IEEE, 2002. I-1053-I-1056
[27] Liu C L. Normalization-cooperated gradient feature extraction for handwritten character recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(8): 1465-1469. doi: 10.1109/TPAMI.2007.1090
[28] Kimura F, Takashina K, Tsuruoka S, Miyake Y. Modified quadratic discriminant functions and the application to Chinese character recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1987, PAMI-9(1): 149-153
[29] Mangasarian O L, Musicant D R. Data discrimination via nonlinear generalized support vector machines. Complementarity: Applications, Algorithms and Extensions. US: Springer, 2001. 233-251
[30] Kim H J, Kim K H, Kim S K, Lee J K. On-line recognition of handwritten Chinese characters based on hidden Markov models. Pattern Recognition, 1997, 30(9): 1489-1500. doi: 10.1016/S0031-3203(96)00161-6
[31] Liu C L, Sako H, Fujisawa H. Discriminative learning quadratic discriminant function for handwriting recognition. IEEE Transactions on Neural Networks, 2004, 15(2): 430-444. doi: 10.1109/TNN.2004.824263
[32] Jin X B, Liu C L, Hou X W. Regularized margin-based conditional log-likelihood loss for prototype learning. Pattern Recognition, 2010, 43(7): 2428-2438. doi: 10.1016/j.patcog.2010.01.013
[33] Srihari S N, Yang X S, Ball G R. Offline Chinese handwriting recognition: an assessment of current technology. Frontiers of Computer Science in China, 2007, 1(2): 137-155. doi: 10.1007/s11704-007-0015-2
[34] Su T H, Zhang T W, Guan D J, Huang H J. Off-line recognition of realistic Chinese handwriting using segmentation-free strategy. Pattern Recognition, 2009, 42(1): 167-182. doi: 10.1016/j.patcog.2008.05.012
[35] Wang Q F, Yin F, Liu C L. Handwritten Chinese text recognition by integrating multiple contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(8): 1469-1481. doi: 10.1109/TPAMI.2011.264
[36] Zhou X D, Wang D H, Tian F, Liu C L, Nakagawa M. Handwritten Chinese/Japanese text recognition using semi-Markov conditional random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(10): 2413-2426. doi: 10.1109/TPAMI.2013.49
[37] Qiu L Q, Jin L W, Dai R F, Zhang Y X, Li L. An open source testing tool for evaluating handwriting input methods. In: Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis: IEEE, 2015. 136-140
[38] Liu C L, Yin F, Wang Q F, Wang D H. ICDAR 2011 Chinese handwriting recognition competition. In: Proceedings of the 11th International Conference on Document Analysis and Recognition. Beijing, China: IEEE, 2011. 1464-1469
[39] Yin F, Wang Q F, Zhang X Y, Liu C L. ICDAR 2013 Chinese handwriting recognition competition. In: Proceedings of the 12th International Conference on Document Analysis and Recognition. Washington, DC, USA: IEEE, 2013. 1464-1470
[40] Graham B. Spatially-sparse convolutional neural networks. arXiv:1409.6070, 2014
[41] Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks. Science, 2006, 313(5786): 504-507. doi: 10.1126/science.1127647
[42] Bengio Y, Courville A, Vincent P. Representation learning: a review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(8): 1798-1828. doi: 10.1109/TPAMI.2013.50
[43] Schmidhuber J. Deep learning in neural networks: an overview. Neural Networks, 2015, 61: 85-117. doi: 10.1016/j.neunet.2014.09.003
[44] LeCun Y, Boser B, Denker J S, Howard R E, Habbard W, Jackel L D, Henderson D. Handwritten digit recognition with a back-propagation network. In: Proceedings of Advances in Neural Information Processing Systems 2. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1990. 396-404
[45] LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11): 2278-2324. doi: 10.1109/5.726791
[46] Ranzato M A, Poultney C, Chopra S, LeCun Y. Efficient learning of sparse representations with an energy-based model. In: Proceedings of the 2007 Advances in Neural Information Processing Systems. USA: MIT Press, 2007. 1137-1144
[47] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation, 1997, 9(8): 1735-1780. doi: 10.1162/neco.1997.9.8.1735
[48] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 2012 Advances in Neural Information Processing Systems 25. Lake Tahoe, Nevada, USA: Curran Associates, Inc., 2012. 1097-1105
[49] Ouyang W L, Wang X G, Zeng X Y, Qiu S, Luo P, Tian Y L, Li H S, Yang S, Wang Z, Loy C C, Tang X O. DeepID-Net: deformable deep convolutional neural networks for object detection. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015. 2403-2412
[50] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556, 2014
[51] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. arXiv:1409.0473, 2014
[52] Graves A, Mohamed A, Hinton G. Speech recognition with deep recurrent neural networks. In: Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Vancouver, BC, Canada: IEEE, 2013. 6645-6649
[53] Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhutdinov R, Zemel R, Bengio Y. Show, attend and tell: neural image caption generation with visual attention. arXiv:1502.03044, 2015
[54] Vinyals O, Toshev A, Bengio S, Erhan D. Show and tell: a neural image caption generator. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 3156-3164
[55] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553): 436-444. doi: 10.1038/nature14539
[56] Tang Y C, Mohamed A R. Multiresolution deep belief networks. In: Proceedings of the 15th International Conference on Artificial Intelligence and Statistics. La Palma, Canary Islands, Spain: Microtome Publishing, 2012. 1203-1211
[57] Srivastava N, Salakhutdinov R. Multimodal learning with deep Boltzmann machines. In: Proceedings of the 2012 Advances in Neural Information Processing Systems. Tahoe, Nevada, USA: Curran Associates, Inc., 2012. 2222-2230
[58] Shao J, Kang K, Loy C C, Wang X G. Deeply learned attributes for crowded scene understanding. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA: IEEE, 2015. 4657-4666
[59] Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE, 2014. 1717-1724
[60] Yang W X, Jin L W, Xie Z C, Feng Z Y. Improved deep convolutional neural network for online handwritten Chinese character recognition using domain-specific knowledge. In: Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis: IEEE, 2015. 551-555
[61] Yang W X, Jin L W, Tao D C, Xie Z C, Feng Z Y.
DropSample:a new training method to enhance deep convolutional neural networks for large-scale unconstrained handwritten Chinese character recognition. arXiv:1505.05354, 2015. http://arxiv.org/pdf/1606.05763v1.pdf [62] Yang W X, Jin L W, Liu M F. Character-level Chinese writer identification using path signature feature, dropstroke and deep CNN. arXiv:1505.04922, 2015. [63] Yang W X, Jin L W, Liu M F. DeepWriterID:an end-to-end online text-independent writer identification system. arXiv:1508.04945, 2015. [64] Su T H, Liu C L, Zhang X Y. Perceptron learning of modified quadratic discriminant function. In:Proceedings of the 2011 International Conference on Document Analysis and Recognition. Beijing, China:IEEE, 2011.1007-1011 [65] Du J, Hu J S, Zhu B, Wei S, Dai L R. A study of designing compact classifiers using deep neural networks for online handwritten Chinese character recognition. In:Proceedings of the 22nd International Conference on Pattern Recognition. Stockholm, Sweden:IEEE, 2014.2950-2955 [66] Du J. Irrelevant variability normalization via hierarchical deep neural networks for online handwritten Chinese character recognition. In:Proceedings of the 14th International Conference on Frontiers in Handwriting Recognition. Heraklion, Greece:IEEE, 2014.303-308 [67] Du J, Huo Q, Chen K. Designing compact classifiers for rotation-free recognition of large vocabulary online handwritten Chinese characters. In:Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Kyoto, Japan:IEEE, 2012.1721-1724 [68] Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets. Neural Computation, 2006, 18(7):1527-1554 doi: 10.1162/neco.2006.18.7.1527 [69] Du J, Hu J S, Zhu B, Wei S, Dai L R. Writer adaptation using bottleneck features and discriminative linear regression for online handwritten Chinese character recognition. In:Proceedings of the 14th International Conference on Frontiers in Handwriting Recognition. Heraklion, Greece:IEEE, 2014.311-316 [70] Liwicki M, Graves A, Bunke H. A novel approach to on-line handwriting recognition based on bidirectional long short-term memory networks. In:Proceedings of the 9th International Conference on Document Analysis and Recognition. Curitiba, Paraná, Brazil, 2007.367-371 [71] Frinken V, Bhattacharya N, Uchida S, Pal U. Improved BLSTM neural networks for recognition of on-line Bangla complex words. Structural, Syntactic, and Statistical Pattern Recognition. Berlin Heidelberg, German:Springer, 2014.404-413 [72] Wu W, Gao G L. Online cursive handwriting Mongolia words recognition with recurrent neural networks. International Journal of Information Processing and Management, 2011, 2(3):20-26 doi: 10.4156/ijipm [73] Graves A. Generating sequences with recurrent neural networks. arXiv:1308.0850, 2013. http://arxiv.org/pdf/1605.00064.pdf [74] Cireçsan D, Meier U. Multi-column deep neural networks for offline handwritten Chinese character classification. In:Proceedings of the 2015 International Joint Conference on Neural Networks. Killarney, Ireland:IEEE, 2015.1-6 [75] Cireçsan D C, Meier U, Gambardella L M, Schmidhuber J. Convolutional neural network committees for handwritten character classification. In:Proceedings of the 2011 International Conference on Document Analysis and Recognition. Beijing, China:IEEE, 2011.1135-1139 [76] Wu C P, Fan W, He Y, Sun J, Naoi S. Handwritten character recognition by alternately trained relaxation convolutional neural network. 
In:Proceedings of the 14th International Conference on Frontiers in Handwriting Recognition. Crete, Greece:IEEE, 2014.291-296 [77] Zhong Z Y, Jin L W, Xie Z C. High performance offline handwritten Chinese character recognition using GoogLeNet and directional feature maps. In:Proceedings of the 13th International Conference on Document Analysis and Recognition (ICDAR). Tunis:IEEE, 2015.846-850 http://dl.acm.org/citation.cfm?id=2880878 [78] Wang Y W, Li X, Liu C S, Ding X Q, Chen Y X. An MQDF-CNN hybrid model for offline handwritten Chinese character recognition. In:Proceedings of the 14th International Conference on Frontiers in Handwriting Recognition. Heraklion, Greece:IEEE, 2014.246-249 [79] 高学, 王有旺. 基于CNN和随机弹性形变的相似手写汉字识别. 华南理工大学学报:自然科学版, 2014, 42(1):72-76 http://www.cnki.com.cn/Article/CJFDTOTAL-HNLG201401016.htmGao Xue, Wang You-Wang. Recognition of similar handwritten Chinese characters based on CNN and random elastic deformation. Journal of South China University of Technology:Natural Science Edition, 2014, 42(1):72-76 http://www.cnki.com.cn/Article/CJFDTOTAL-HNLG201401016.htm [80] 杨钊, 陶大鹏, 张树业, 金连文. 大数据下的基于深度神经网的相似汉字识别. 通信学报, 2014, 35(9):184-189 http://www.cnki.com.cn/Article/CJFDTOTAL-TXXB201409019.htmYang Zhao, Tao Da-Peng, Zhang Shu-Ye, Jin Lian-Wen. Similar handwritten Chinese character recognition based on deep neural networks with big data. Journal on Communications, 2014, 35(9):184-189 http://www.cnki.com.cn/Article/CJFDTOTAL-TXXB201409019.htm [81] Feng B Y, Ren M W, Zhang X Y, Suen C Y. Automatic recognition of serial numbers in bank notes. Pattern Recognition, 2014, 47(8):2621-2634 doi: 10.1016/j.patcog.2014.02.011 [82] He M J, Zhang S Y, Mao H Y, Jin L W. Recognition confidence analysis of handwritten Chinese character with CNN. In:Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis:IEEE, 2015.61-65 http://dl.acm.org/citation.cfm?id=2880731 [83] Bengio Y, Goodfellow I J, Courville A. Deep learning[Online], available:http://www.deeplearningbook.org,May11,2016 [84] LeCun Y, Boser B, Denker J S, Henderson D, Howard R E, Hubbard W, Jackel L D. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1989, 1(4):541-551 doi: 10.1162/neco.1989.1.4.541 [85] Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. In:Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA:IEEE, 2015.1-9 http://www.mdpi.com/2072-4292/8/6/483/htm [86] Lin M, Chen Q, Yan S C. Network in network. arXiv:1312.4400, 2013. http://cn.bing.com/academic/profile?id=2293132816&encoded=0&v=paper_preview&mkt=zh-cn [87] Orr G B, Müller K R. Neural Networks:Tricks of the Trade. German:Springer, 1998. [88] Hinton G E, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov R R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv:1207.0580, 2012. http://cn.bing.com/academic/profile?id=2195273494&encoded=0&v=paper_preview&mkt=zh-cn [89] Wan L, Zeiler M, Zhang S X, LeCun Y, Fergus R. Regularization of neural networks using dropConnect. In:Proceedings of the 30th International Conference on Machine Learning. Atlanta, USA, 2013.1058-1066 https://arxiv.org/pdf/1505.00393.pdf [90] Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S A, Huang Z H, Karpathy A, Khosla A, Bernstein M, Berg A C, Li F F. ImageNet large scale visual recognition challenge. 
International Journal of Computer Vision, 2015, 115(3):211-252 doi: 10.1007/s11263-015-0816-y [91] Sun Y, Chen Y H, Wang X G, Tang X O. Deep learning face representation by joint identification-verification. In:Proceedings of Advances in Neural Information Processing Systems 27. Montréal, Canada:MIT, 2014.1988-1996 [92] Taigman Y, Yang M, Ranzato M A, Wolf L. DeepFace:closing the gap to human-level performance in face verification. In:Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA:IEEE, 2014.1701-1708 http://europepmc.org/articles/PMC4373928 [93] Toshev A, Szegedy C. Deeppose:Human pose estimation via deep neural networks. In:Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA:IEEE, 2014.1653-1660 https://www.computer.org/csdl/proceedings/cvpr/2014/5118/00/index.html [94] Williams R J, Zipser D. A learning algorithm for continually running fully recurrent neural networks. Neural Computation, 1989, 1(2):270-280 doi: 10.1162/neco.1989.1.2.270 [95] Graham B. Sparse arrays of signatures for online character recognition. arXiv:1308.0371, 2013. http://cn.bing.com/academic/profile?id=2360228825&encoded=0&v=paper_preview&mkt=zh-cn [96] Jaderberg M, Simonyan K, Vedaldi A, Zisserman A. Synthetic data and artificial neural networks for natural scene text recognition. arXiv:1406.2227, 2014. http://arxiv.org/abs/1406.2227?context=cs [97] Jaderberg M, Vedaldi A, Zisserman A. Deep features for text spotting. In:Proceedings of the 13th European Conference Computer Vision. Zurich, Switzerland:Springer, 2014.512-528 http://cn.bing.com/academic/profile?id=70975097&encoded=0&v=paper_preview&mkt=zh-cn [98] Wu Y C, Yin F, Liu C L. Evaluation of neural network language models in handwritten Chinese text recognition. In:Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis:IEEE, 2015.166-170 [99] Bengio Y, Schwenk H, Senécal J S, Morin F, Gauvain J L. Neural probabilistic language models. Innovations in Machine Learning. Berlin Heidelberg, Germany:Springer, 2006.137-186 [100] Chen X, Tan T, Liu X, Lanchantin P, Wan M, Gales MJF, Woodland PC. Recurrent neural network language model adaptation for multi-genre broadcast speech recognition. In:Proceedings of the 2015 International Speech Communication Association Interspeech. Dresden, Germany, 2015.3511-3515 [101] Sak H, Senior A, Rao K,ÌIrsoy O, Graves A, Beaufays F, Schalkwyk J. Learning acoustic frame labeling for speech recognition with recurrent neural networks. In:Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing. South Brisbane, QLD:IEEE, 2015.4280-4284 [102] De Mulder W, Bethard S, Moens M F. A survey on the application of recurrent neural networks to statistical language modeling. Computer Speech & Language, 2015, 30(1):61-98 http://cn.bing.com/academic/profile?id=2154137718&encoded=0&v=paper_preview&mkt=zh-cn [103] He K M, Zhang X Y, Ren S Q, Sun J. Delving deep into rectifiers:surpassing human-level performance on imagenet classification. In:Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago, Chile:IEEE, 2015.1026-1034 [104] Ioffe S, Szegedy C. Batch normalization:accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167, 2015. http://cn.bing.com/academic/profile?id=2397299141&encoded=0&v=paper_preview&mkt=zh-cn [105] Fukushima K. 
Neocognitron:a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 1980, 36(4):193-202 doi: 10.1007/BF00344251 [106] Werbos P J. Backpropagation through time:what it does and how to do it. Proceedings of the IEEE, 1990, 78(10):1550-1560 doi: 10.1109/5.58337 [107] Littman M L. Reinforcement learning improves behaviour from evaluative feedback. Nature, 2015, 521(7553):445-451 doi: 10.1038/nature14540 [108] Mnih V, Kavukcuoglu K, Silver D, Rusu A A, Veness J, Bellemare M G, Graves A, Riedmiller M, Fidjeland A K, Ostrovski G, Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran D, Wierstra D, Legg S, Hassabis D. Human-level control through deep reinforcement learning. Nature, 2015, 518(7540):529-533 doi: 10.1038/nature14236 [109] Cuda-ConvNet2 [Online], available:https://github.com/akrizhevsky/cuda-convnet2, May 11, 2016 [110] Bengio Y, LeCun Y, Nohl C, Burges C. LeRec:a NN/HMM hybrid for on-line handwriting recognition. Neural Computation, 1995, 7(6):1289-1303 doi: 10.1162/neco.1995.7.6.1289 [111] Simard P Y, Steinkraus D, Platt J C. Best practices for convolutional neural networks applied to visual document analysis. In:Proceedings of the 7th International Conference on Document Analysis and Recognition. Edinburgh, UK:IEEE, 2003.958-963 [112] Caffe[Online], available:http://caffe.berkeleyvision.org/, May 11, 2016 [113] Bastien F, Lamblin P, Pascanu R, Bergstra J, Goodfellow I, Bergeron A, Bouchard N, Warde-Farley D, Bengio Y. Theano:new features and speed improvements. arXiv:1211.5590, 2012. http://cn.bing.com/academic/profile?id=2166015963&encoded=0&v=paper_preview&mkt=zh-cn [114] Bergstra J, Breuleux O, Bastien F, Lamblin P, Pascanu R, Desjardins G, Turian J, Warde-Farley D, Bengio Y. Theano:a CPU and GPU math expression compiler. In:Proceedings of the 9th Python for Scientific Computing Conference. Austin, TX, USA, 2010.1-7 http://dl.acm.org/citation.cfm?id=2912118 [115] Torch[Online], available:http://torch.ch/, May 11, 2016 [116] Lin M, Li S, Luo X, Yan S C. Purine:a bi-graph based deep learning framework. arXiv:1412.6249, 2014. [117] MXNet[Online], available:https://github.com/dmlc/mx-net,May11,2016 [118] DIGITS[Online], available:https://developer.nvidia.com/digits, May 11, 2016 [119] ConvNet[Online], available:https://code.google.com/p/cuda-convnet/, May 11, 2016 [120] DeepCNet[Online], available:http://www2.warwick.ac.u-k/fac/sci/statistics/staff/academic-research/graham/,May11,2016 [121] Xing E P, Ho Q R, Dai W, Kim J K, Wei J L, Lee S, Zheng X, Xie P T, Kumar A, Yu Y L. Petuum:a new platform for distributed machine learning on big data. IEEE Transactions on Big Data, 2015, 1(2):49-67 doi: 10.1109/TBDATA.2015.2472014 [122] Weninger F, Bergmann J, Schuller B. Introducing CURRENNT:the Munich open-source CUDA recurrent neural network toolkit. The Journal of Machine Learning Research, 2015, 16(1):547-551 [123] Minerva[Online], available:https://github.com/dmlc/min-erva,May11,2016 [124] TensorFlow[Online], available:https://github.com/tensor-flow/tensorflow,May11,2016 [125] DMTK[Online], available:https://github.com/Microsoft/DMTK,May3,2016 [126] Cireçsan D C, Meier U, Schmidhuber J. Transfer learning for Latin and Chinese characters with deep neural networks. In:Proceedings of the 2012 International Joint Conference on Neural Networks. Brisbane, QLD:IEEE, 2012.1-6 [127] Ciresan D, Meier U, Schmidhuber J. Multi-column deep neural networks for image classification. 
In:Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, Rhode Island:IEEE, 2012.3642-3649 [128] Bastien F, Bengio Y, Bergeron A, Boulanger-Lewandowski N, Breuel T, Chherawala Y, Cisse M, Côté M, Erhan D, Eustache J, Glorot X, Muller X, Lebeuf S P, Pascanu R, Rifai S, Savard F, Sicard G. Deep self-taught learning for handwritten character recognition. arXiv:1009.3589, 2010. [129] Chen L, Wang S, Fan W, Sun J, Naoi S. Beyond human recognition:a CNN-based framework for handwritten character recognition. In:Proceedings of the 3rd IAPR Asian Conference on Pattern Recognition. Kuala Lumpur, Malaysia:IEEE, 2015.695-699 [130] Chen K T. Integration of paths-A faithful representation of paths by noncommutative formal power series. Transactions of the American Mathematical Society, 1958, 89(2):395-407 http://cn.bing.com/academic/profile?id=2074884967&encoded=0&v=paper_preview&mkt=zh-cn [131] Lyons T. Rough paths, Signatures and the modelling of functions on streams. arXiv:1405.4537, 2014. http://econpapers.repec.org/RePEc:arx:papers:1405.4537 [132] Graham B. Fractional max-pooling. arXiv:1412.6071, 2014. http://arxiv.org/abs/1412.6071 [133] Graves A, Fernández S, Gomez F, Schmidhuber J. Connectionist temporal classification:labelling unsegmented sequence data with recurrent neural networks. In:Proceedings of the 23rd International Conference on Machine Learning. Pittsburgh, Pennsylvania, USA:ACM, 2006.369-376 http://cn.bing.com/academic/profile?id=2168772685&encoded=0&v=paper_preview&mkt=zh-cn [134] Graves A, Schmidhuber J. Offline handwriting recognition with multidimensional recurrent neural networks. In:Proceedings of the 2009 Advances in Neural Information Processing Systems 21. Vancouver, B.C., Canada:Curran Associates, Inc., 2009.545-552 [135] Zhang X, Wang M, Wang L J, Huo Q, Li H F. Building handwriting recognizers by leveraging skeletons of both offline and online samples. In:Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis:IEEE, 2015.406-410 [136] Simistira F, Ul-Hassan A, Papavassiliou V, Gatos B, Katsouros V, Liwicki M. Recognition of historical Greek polytonic scripts using LSTM networks. In:Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis:IEEE, 2015.766-770 http://dl.acm.org/citation.cfm?id=2880878 [137] Frinken V, Uchida S. Deep BLSTM neural networks for unconstrained continuous handwritten text recognition. In:Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis:IEEE, 2015.911-915 http://dl.acm.org/citation.cfm?id=2880731 [138] Messina R, Louradour J. Segmentation-free handwritten Chinese text recognition with LSTM-RNN. In:Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis:IEEE, 2015.171-175 http://dl.acm.org/citation.cfm?id=2880731 [139] Mioulet L, Garain U, Chatelain C, Barlas P, Paquet T. Language identification from handwritten documents. In:Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis:IEEE, 2015.676-680 [140] Huang S M, Jin L W, Lv J. A novel approach for rotation free online handwritten Chinese character recognition. In:Proceedings of the 10th International Conference on Document Analysis and Recognition. Barcelona, Spain:IEEE, 2009.1136-1140 [141] Moysset B, Kermorvant C, Wolf C, Louradour J. Paragraph text segmentation into lines with recurrent neural networks. 
In:Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis:IEEE, 2015, 456-460 http://dl.acm.org/citation.cfm?id=2880731 [142] He P, Huang W L, Qiao Y, Loy C C, Tang X O. Reading scene text in deep convolutional sequences. arXiv:1506.04395, 2015. http://cn.bing.com/academic/profile?id=2338605913&encoded=0&v=paper_preview&mkt=zh-cn [143] Shi B G, Bai X, Yao C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. arXiv:1507.05717, 2015. http://arxiv.org/abs/1507.05717 [144] iiMedia Research. 2015Q2 Report of input methods for mobile phone in China market[Online], available:http://www.iimedia.com.cn/,May11,2016 [145] 中华人民共和国国家质量监督检验检疫总局, 中国国家标准化管理委员会. GB/T18790-2010联机手写汉字识别系统技术要求与测试规程. 2011General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China, Standardization Administration of the People's Republic of China. GB/T18790-2010 Requirements and test procedure of on-line handwriting Chinese character recognition system. 2011 [146] Long T, Jin L W. A novel orientation free method for online unconstrained cursive handwritten Chinese word recognition. In:Proceedings of the 19th International Conference on Pattern Recognition. Tampa, FL, USA:IEEE, 2008.1-4 [147] He T T, Huo Q. A character-structure-guided approach to estimating possible orientations of a rotated isolated online handwritten Chinese character. In:Proceedings of the 10th International Conference on Document Analysis and Recognition. Barcelona, Spain:IEEE, 2009.536-540 [148] 黄盛明.联机手写汉字的旋转无关识别研究[硕士学位论文].华南理工大学, 2010 http://cdmd.cnki.com.cn/article/cdmd-10561-1014063919.htmHuang S. A Study on Recognition for Rotated Isolated Online Handwritten Chinese Character[Master dissertation], South China University of Technology, China, 2010 http://cdmd.cnki.com.cn/article/cdmd-10561-1014063919.htm [149] Karatzas D, Gomez-Bigorda L, Nicolaou A, Ghosh S, Bagdanov A, Iwamura M, Matas J, Neumann L, Chandrasekhar V R, Lu S J, Shafait F, Uchida S, Valveny E. ICDAR 2015 competition on robust reading. In:Proceedings of the 13th International Conference on Document Analysis and Recognition. Tunis:IEEE, 2015.1156-1160 期刊类型引用(9)