Publications

Permanent URI for this collection: https://ir.daiict.ac.in/handle/123456789/32

Search Results

Now showing 1 - 10 of 32
  • Publication
    A contour-based thinning algorithm with post-processing using auto-encoder
    (Inderscience Online, 18-04-2023) Goswami, Mukesh M; Panara, Bhavika B; Vyas, Archana N; Mitra, Suman; DA-IICT, Gandhinagar
    One of the critical pre-processing phases in a pattern recognition system is thinning. The process of finding a one-pixel-wide representation of a binary object while preserving its original shape is known as "thinning". Thinning makes recognition easier and more efficient because the thinned image of the object is less complex than the original one. It also makes it easy to find topological features like endpoints, junction points, lines, curves, etc. The general problem with thinning methods is that they often produce unwanted edges or hairs that deform the original shape of the object and ultimately decrease the system's accuracy. The current proposal presents a contour-based thinning algorithm to address this issue. The fundamental part of the suggested algorithm is to extract a skeleton from the initial binary object. The potential distribution of pixels in the junction region is estimated in the second phase using an auto-encoder trained on similar thinned images in order to provide more accurate thinned images. The third stage employs post-processing to produce an image that is one pixel wide. The proposed method was tested on a handwritten Gujarati numeral database with 99.89% accuracy.
  • Publication
    Noise level estimation using locality preserving natural image statistics
    (Elsevier, 01-07-2024) Shikkenawis, Gitam; Saxena, Ashutosh; Mitra, Suman; DA-IICT, Gandhinagar
    Natural images are known to have certain regular statistical properties. These properties are modified by any artificial change or distortion in natural images. The most common form of image degradation is noise. The amount of degradation in noisy images is measured by estimating the noise level. Many image processing applications such as denoising, restoration, segmentation, compression etc. use noise level information as a prior, and an inaccurate estimate may degrade their performance. In this article, we explore natural image statistics in a locality preserving transform domain. The locality preserving property groups structurally similar images/image patches when projected in the transform domain. Image patches corrupted with similar noise levels are projected close together in the locality preserving domain and show consistent coefficient behaviour. In particular, we use Two Dimensional Orthogonal Locality Preserving Projection (2DOLPP) as the domain transformation technique. The 2DOLPP bases, representing natural images, are learnt in advance from a set of clean images, thereby reducing the computational time significantly. Features based on natural image statistics are extracted from the 2DOLPP domain representation of input image patches. Mapping from feature space to noise level is carried out using support vector regression. The proposed noise estimation approach is on par with or surpasses the state-of-the-art techniques with much less computational time. Performance of this approach is stable across a wide range of noise levels and independent of the image structure.
  • Publication
    HML-RF: Hybrid Multi-Label Random Forest
    (IEEE, 24-02-2022) Jain, Vikas; Phophalia, Ashish; Mitra, Suman; DA-IICT, Gandhinagar
    Multi-label classification is the supervised learning problem in which an instance is associated with a set of labels. In this setting, labels are correlated, and hence label dependency information plays a vital role. It has always been a research question how to decide the order of labels to exploit their inter-dependency. To this end, many research works have been done that, in general, can be categorized as problem transformation and algorithm adaptation techniques. Problem transformation reconstructs the multi-label problem as multiple single-class problems. Algorithm adaptation modifies existing well-known machine learning approaches to solve the multi-label classification problem. However, these two techniques have their pros and cons. In this paper, we propose a novel approach that combines the merits of both techniques, hence named Hybrid Multi-Label Random Forest (HML-RF). Multi-label decision trees are used as base classifiers in the proposed approach to construct the HML-RF model. Each base classifier is constructed over a randomly selected subset of labels to exploit the label dependency. We also formulate a way to compute the tree strength of a multi-label decision tree, which is used to construct the HML-RF with strength (HML-RFws). The efficacy of the proposed approach is tested over ten well-known and publicly available datasets. Experimental results show that HML-RF performs better on at least six datasets, and HML-RFws performs better on at least nine datasets, in comparison to state-of-the-art approaches in terms of accuracy, hamming loss, and zero-one loss. Finally, a statistical test also validates all the experimental results.
  • Publication
    Multi-Script Text Detection from Image Using FRCNN
    (World Scientific, 01-06-2021) Goswami, Mukesh M; Dadiya, Nidhi J; Goswami, Tanvi; Mitra, Suman; DA-IICT, Gandhinagar
    Textual information is the most common means by which we can determine what text we are looking for. In order to retrieve text from images, the first and foremost step is text detection from the image. Text detection has a wide range of applications such as translation, smart car driving systems, information retrieval, indexing of multimedia archives, sign board reading, and countless others. Multilingual text detection from images adds an extra complication to a computer vision problem. India is a multilingual country, and therefore multi-script texts can be found almost everywhere. Multi-script text differs in terms of formats, strokes, width, and height. Also, universal features for such an environment are unknown and difficult to determine. Therefore, detecting multi-script text from images is an important yet unsolved problem. In this work, we propose a Faster RCNN-based method for detecting English, Hindi, and Gujarati text from images. Faster RCNN is the state-of-the-art approach for object detection. Because it works for objects of large size whereas texts are of smaller size, the parameters are tuned to meet the objective of multi-script text detection. The dataset is created by collecting images, as there is no standard dataset available in the public domain that includes English, Gujarati, and Hindi texts.
  • Publication
    Discrimination of multi-crop scenarios with polarimetric SAR data using Wishart mixture model
    (SPIE, 20-08-2021) Chaudhari, Nilam; Mitra, Suman; Chirakkal, Sanid; Mandal, Srimanta; Putrevu, Deepak; Misra, Arundhati; DA-IICT, Gandhinagar; Chaudhari, Nilam (201911063)
    Discrimination of crop varieties spread over heterogeneous agricultural land is a vital application of polarimetric SAR images for agriculture monitoring and assessment. The covariance matrix of polarimetric SAR images is observed to follow a complex Wishart distribution for major classification tasks. This holds for homogeneous regions, but for heterogeneous regions, the covariance matrix follows a mixture of multiple Wishart distributions. We aim to improve the classification accuracy when the terrain under observation is heterogeneous. For this purpose, a Wishart mixture model is employed along with the expectation-maximization (EM) algorithm for parameter estimation. The elbow method is used to determine the number of mixtures. The convergence of the EM algorithm depends on the choice of initial points. So, to improve the robustness of the model, different initialization approaches, such as random, K-means, and global K-means, are embedded in the EM algorithm. Further, the degrees of freedom is one of the crucial parameters of the Wishart distribution. Therefore, the impact of different degrees of freedom on classification accuracy is analyzed. The method, equipped with an initialization technique along with the optimum degrees of freedom, is assessed using three full polarimetric SAR data sets of agricultural lands. The first two are benchmark data sets of the Flevoland, Netherlands, region acquired by the AIRSAR sensor, and the third is our study area of Mysore, India, acquired by the RADARSAT-2 sensor.
  • Publication
    Edge-Preserving classification of polarimetric SAR images using Wishart distribution and conditional random field
    (Taylor & Francis, 08-04-2022) Chaudhari, Nilam; Mitra, Suman; Mandal, Srimanta; Chirakkal, Sanid; Putrevu, Deepak; Misra, Arundhati; DA-IICT, Gandhinagar; Chaudhari, Nilam (201911063)
    Classification of polarimetric SAR images into different ground covers has important applications in fields such as land mapping, agriculture monitoring, and assessment. The Wishart supervised classifier is one of the most widely used general-purpose classifiers for polarimetric SAR data. However, it is a pixel-based classifier, so the performance is greatly affected by inherent speckle noise. The impact of speckle noise can be reduced by considering the spatial information from neighbouring pixels for classification tasks. In this paper, we aim to improve classification results by incorporating spatial-contextual information along with preservation of significant details such as edges and micro-regions. For this purpose, a conditional random field (CRF) based model is proposed for polarimetric SAR data along with Wishart and Wishart mixture model (WMM) classifiers, namely Wishart-CRF and WMM-CRF, to perform the classification. The model is compared with the Markov random field (MRF) based model as well as neural network-based models. The results are analysed in terms of accuracy and preservation of details such as edges and micro-regions. The model is assessed using three full polarimetric SAR benchmark data sets. The CRF model exhibits better classification results by significantly reducing the noise and preserving the finer details of edges and small regions.
  • Publication
    Image denoising using orthogonal locality preserving projections
    (SPIE, 01-08-2015) Shikkenawis, Gitam; Mitra, Suman; Rajwade, Ajit; DA-IICT, Gandhinagar; Shikkenawis, Gitam (201221004)
    Image denoising approaches that learn spatially adaptive dictionaries from the observed noisy image have gathered a lot of attention in the past decade. These methods rely on the hypothesis that patches from the underlying clean image can be expressed as sparse linear combinations of these dictionary vectors (bases). We present a framework for inferring an orthonormal set of dictionary vectors using orthogonal locality preserving projection (OLPP). This ensures that patches that are similar in the noisy image produce similar coefficients when projected in the OLPP domain. Unlike other projection methods, the locality preserving property of OLPP automatically groups similar patches together during inference of the basis. Hence, only one global orthonormal basis suffices to sparsely represent patches from a large subimage or a large portion of the image. The proposed amalgamation of sparsity and a global dictionary makes the current approach more suitable for an image denoising task with reduced computational complexity. Experiments on several benchmark datasets show that the proposed method is capable of preserving fine textures while denoising an image, on par with or surpassing several state-of-the-art methods for gray-scale and color images.
  • Publication
    On reconnection of broken ridges and binarization for fingerprint images
    (01-01-2014) Munshi, Paridhi; Mitra, Suman; DA-IICT, Gandhinagar; Munshi, Paridhi (201011042)
  • Publication
    Offline handwritten Gujarati numeral recognition using low-level strokes
    (InderScience, 01-10-2015) Goswami, Mukesh M; Mitra, Suman; DA-IICT, Gandhinagar
    This paper focuses on the development of an offline handwritten Gujarati numeral database of reasonable size and its recognition using low-level stroke features. The database consists of 14,000 samples collected from 140 people of different age groups, educational backgrounds, and work cultures. A novel technique for the extraction of various low-level stroke features, like endpoints, junction points, line segments, and curve segments, is proposed, and the block-wise histogram of low-level stroke features is used for the recognition of offline handwritten numerals from two of the popular Indian scripts, namely Gujarati and Devanagari. The baseline experiments were performed using a k-nearest neighbour (k-NN) classifier, and the results were further improved by using the statistically advanced support vector machine (SVM) classifier with a radial basis function (RBF) kernel. The average test accuracies obtained on the Gujarati and Devanagari databases were 98.46% and 98.65%, respectively, which are comparable to other existing work. The experiments were also performed on mixed numeral recognition from Gujarati-Devanagari and Gujarati-English, considering the multi-script scenarios in Indian documents.
  • Publication
    L1-norm orthogonal neighbourhood preserving projection and its applications
    (Springer, 01-11-2019) Koringa, Purvi A; Mitra, Suman; DA-IICT, Gandhinagar; Koringa, Purvi A (201321010)
    Dimensionality reduction techniques based on manifold learning are becoming very popular for computer vision tasks like image recognition and image classification. Generally, most of these techniques involve optimizing a cost function in L2-norm and thus they are susceptible to outliers. However, recently, owing to its capability of handling outliers, L1-norm optimization is drawing the attention of researchers. The work documented here is the first attempt towards the same goal, where the orthogonal neighbourhood preserving projection (ONPP) technique is performed using optimization in terms of L1-norm to handle data having outliers. In particular, the relationship between ONPP and PCA is established theoretically in the light of L2-norm and then ONPP is optimized using an already proposed mechanism of PCA-L1. Extensive experiments are performed on synthetic as well as real data for applications like classification and recognition. It has been observed that when a larger amount of training data is available, L1-ONPP outperforms its counterpart L2-ONPP.