On the other hand, some classification methods do not depend on the manually extraction of features. In some classification tools, features are manually extracted and then fed to the classifier. Every classification depends on the features extraction. The more important part of classification is the proper selection of classification methods. So, a comparison of the publicly available dataset is required to observe which of the dataset contains more versatility irrespective of dataset dimension. It depends on how much varieties in shape, size, resolution, writing styles, papers quality, etc have been added in the samples of the dataset. Training with larger dataset might not always means better training of a classifier. The limitation of small datasets of Bengali handwritten numerals was recently resolved by the publication of a large dataset called NUMTADB by Alam et al. In general, many samples have diversity which implies improved training of the classifier. The number of samples of the Bengali datasets is small compared to the English dataset. On the other hand, MNIST, the largest dataset of Hindu–Arabic numerals (i.e., the so-called English numerals set) consists of 60,000 samples, and the best accuracy reported in the literature on this dataset is 99.79%. Among them, the most commonly used datasets are CMATERdb3.1.1 and ISI, consisting of 602 handwritten samples, respectively. There are very few handwritten Bengali numeral datasets available in the literature. To get optimum accuracy of a classifier, it should be trained properly with sufficiently large dataset. Handwritten recognition of numerals is a classical problem of pattern recognition and machine learning, where proper choice of classification tool and datasets play an important role in this kind of problem. Therefore, efforts with new tools and methods for better HBNR are a timely demand. But more importantly, there are Bengali numerals whose shapes are very similar and this shape similarity makes the recognition more challenging. Recognizing Bengali handwritten numerals is challenging as for Arabic numerals, because of their varied sizes and critical shapes. The automatic recognition of printed Bengali numerals is also very high however, the progress of handwritten Bengali numeral recognition (HBNR) is far behind these languages. Research on handwritten numerals has made impressive progress in some languages such as Arabic, Chinese, and English. It is used as the official language of Bangladesh and several Indian states including Assam, Jharkhand, Tripura, and West Bengal. Bengali is the 7th most widely spoken language in the World and the mother language of Bangladesh. Bengal is a region of eastern South Asia, which comprises Bangladesh and the West Bengal of India. Some major applications include automatic bank cheque processing, form data entry, and postal sorting. Handwritten recognition has gained much attention to the researchers because of its numerous potential applications in real life. The proposed VGG-11M outperformed the existing architectures of CNN on HBNR. The highest accuracy 99.80%, 99.66%, and 99.25% was obtained using the proposed architecture on the test set of ISI, CMATERDB, and NUMTADB dataset, respectively, at resolution \(32\times \). Finally the performance of the model was compared with four other architectures (LeNet-5, ResNet-50, VGG-11, and VGG-16). The recognition accuracy of the developed system was tested on both training and test sets of three publicly available handwritten Bengali numerals database at different resolutions. Then, the images were used to train the proposed VGG-11M model. The normalized and rescaled images of each numeral were augmented by different transformation operations to increase the training samples and to add diversity in the dataset. We proposed a new CNN architecture, VGG-11M, which improves an existing one (VGG-11). The main purpose of this study is to provide an architecture of a CNN to improve the accuracy of handwritten Bengali numerals recognition (HBNR) and compare its performance with the existing ones. Recently, the use of convolutional neural network (CNN) has been reported with high accuracy in pattern recognition and computer vision problems. Research on unconstrained handwritten recognition in some languages has achieved attractive advancement, but it lags behind for Bengali even though it is the major language spoken by about 230 million people in the Indian subcontinent, and even the first and official language of Bangladesh. Handwritten recognition has drawn profound attention since decades ago due to its numerous potential applications in real life.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |