Theses and Dissertations

Permanent URI for this collection: http://ir.daiict.ac.in/handle/123456789/1


Search Results

Now showing 1 - 8 of 8
  • Item (Open Access)
    On designing DNA codes and their applications
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2019) Limbachiya, Dixita; Gupta, Manish K.
    Bio-computing uses complexes of biomolecules such as DNA (deoxyribonucleic acid), RNA (ribonucleic acid) and proteins to perform computational processes for encoding and processing data. In 1994, L. Adleman introduced the field of DNA computing by solving an instance of the Hamiltonian path problem using a set of DNA sequences and biotechnology lab methods. The idea of DNA hybridization was used to perform this experiment. DNA hybridization is the backbone of any computation using DNA sequences; however, it is also a source of errors. To use DNA for computing, a specific set of DNA sequences (a DNA code) satisfying particular properties (DNA code constraints) that avoid cross-hybridization is designed to perform a particular task. The contributions of this dissertation can be broadly divided into two parts: 1) designing DNA codes using algebraic coding theory, and 2) codes for DNA data storage systems that encode data in DNA. The main research objective in designing DNA codes over the quaternary alphabet {A, C, G, T} is to find the largest possible set of M codewords, each of length n, such that any two codewords are at distance at least d and satisfy the desired constraints that are feasible with respect to practical implementation. In the literature, various computational and theoretical approaches have been used to design sets of DNA codewords that are sufficiently dissimilar. Furthermore, DNA codes have been constructed using coding-theoretic approaches over fields and rings. In this dissertation, one such approach is used to generate DNA codes from the ring R = Z4 + wZ4, where w^2 = 2 + 2w. Some algebraic properties of the ring R are explored. In order to define an isometry from the elements of the ring R to DNA, a new distance called the Gau distance is defined. The Gau distance motivates a distance-preserving map called the Gau map f. Linear and closure properties of the Gau map are obtained. General conditions on the generator matrix over the ring R to satisfy the reverse and reverse-complement constraints on the DNA code are derived. Using this map, several new classes of DNA codes that satisfy the Hamming distance, reverse and reverse-complement constraints are given. Families of DNA codes via Simplex type codes, first-order and rth-order Reed-Muller type codes, and Octa type codes are developed. Some general results on the generator matrix satisfying the reverse and reverse-complement constraints are given. Some of the constructed DNA codes are optimal with respect to bounds on M, the size of the code. These DNA codes can be used for a myriad of applications, one of which is data storage. DNA is stable, robust and reliable; theoretically, it is estimated that one gram of DNA can store 455 EB (1 exabyte = 10^18 bytes). These properties make DNA a potential candidate for data storage. However, there are various practical constraints for a DNA data storage system. In this work, we construct DNA codes with some of these constraints to design efficient codes for storing data in DNA. One of the practical constraints in designing DNA codes for storage is repeated bases (runlengths) of the same DNA nucleotide; hence, it is essential that each DNA codeword avoid long runlengths. In this thesis, codes that disallow runlengths of any base are proposed for data storage, towards error-free DNA data storage codes.
    A fixed GC-weight u (the number of occurrences of G and C nucleotides in a DNA codeword) is another requirement for DNA codewords used in DNA storage. DNA codewords with large GC-weight lead to insertion and deletion (indel) errors during DNA reading and amplification; thus, it is crucial to consider a fixed GC-weight for a DNA code. In this work, we propose methods that generate families of codes for DNA data storage systems that satisfy the no-runlength and fixed GC-weight constraints on the DNA codewords used for data storage. The first method uses constrained codes over the quaternary alphabet, and the second uses DNA Golay subcodes based on a ternary encoding. Constrained quaternary coding is presented to generate DNA codes for data storage. We give a construction algorithm for finding families of DNA codes with the no-runlength and fixed GC-weight constraints. The number of DNA codewords of fixed GC-weight with the no-runlength constraint is enumerated. We note that prior work only gave bounds on the number of such codewords, while in this work we count these DNA codewords exactly. We observe that the bound given in previous work does not take into account the distance of the code, which is essential for data reliability; thus, we consider distance to obtain a lower bound on the number of codewords along with the fixed GC-weight and no-runlength constraints. In the second method, we demonstrate the Golay subcode method to encode data in a variable-chunk architecture of DNA using ternary encoding. N. Goldman et al. introduced the first proof of concept of DNA data storage in 2013 by encoding data in DNA without error correction, which motivated us to implement this method. While implementing it, a bottleneck was identified that limits the amount of data that can be encoded, owing to the fixed-length chunk architecture used for data encoding. In this work, we propose a modified scheme using a non-linear family of ternary codes based on the Golay subcode that allows a flexible-length chunk architecture for data encoding in DNA. By using the ternary Golay subcode, two substitution errors can be corrected. In a nutshell, the significant contributions of this thesis are DNA codes designed with specific constraints. First, DNA codes from a ring are proposed using algebraic coding, by defining a new type of distance (the Gau distance) and map (the Gau map); these DNA codes satisfy the reverse, reverse-complement and complement constraints with a minimum Hamming distance, and several families of these codes and their properties are studied. Second, DNA codes using constrained coding and the Golay subcode method are developed that satisfy the no-runlength and GC-weight constraints for a DNA data storage system.
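    As a rough companion to the storage constraints described above, the following Python sketch checks whether a small set of DNA codewords satisfies a fixed GC-weight, the no-runlength constraint, and a pairwise minimum Hamming distance together with a reverse-complement check. It is a minimal illustration only, not the thesis construction; the helper names and the toy codewords are hypothetical.

```python
# Minimal sketch (not the thesis construction): checking DNA code constraints.
from itertools import combinations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def gc_weight(word: str) -> int:
    """Number of G and C bases in the codeword."""
    return sum(base in "GC" for base in word)

def has_no_runlength(word: str) -> bool:
    """True if no base is immediately repeated (no runs of length >= 2)."""
    return all(a != b for a, b in zip(word, word[1:]))

def reverse_complement(word: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(word))

def hamming(u: str, v: str) -> int:
    return sum(a != b for a, b in zip(u, v))

def check_code(code, d_min: int, gc: int) -> bool:
    """Check fixed GC-weight, no-runlength, and pairwise minimum distance,
    including distance to reverse complements of other codewords
    (self reverse-complement checks are omitted for brevity)."""
    if not all(gc_weight(w) == gc and has_no_runlength(w) for w in code):
        return False
    for u, v in combinations(code, 2):
        if hamming(u, v) < d_min or hamming(u, reverse_complement(v)) < d_min:
            return False
    return True

# Hypothetical toy example: length-8 codewords with GC-weight 4.
example = ["ACGTACGT", "AGCTAGCT", "CATGCATG"]
print(check_code(example, d_min=4, gc=4))
```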
  • Item (Open Access)
    Learning based approach for image compression
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2015) Kumar, Dheeraj; Joshi, Manjunath V.
    Data compression is the process of storing the same information using less data or space in computer memory. Many image compression techniques are available for storing images with less storage space; minimizing storage space also minimizes the bandwidth required for transmission. In the proposed algorithm, a first-level Discrete Wavelet Transform (DWT, with the Daubechies db4 wavelet as the mother wavelet) is applied to the original image, after which only the low-resolution coefficients are retained. The Embedded Zerotree Wavelet (EZW) algorithm [10] is then applied for the best image quality at the given bit rate. A set of images is used as a database. For every input image, a Content Based Image Retrieval (CBIR) [7] technique is applied to the database, which returns images having similar content. At the receiver, a learning based approach is used to decompress the image from the retrieved database images. The Structural Similarity Index Measure (SSIM) [15], an image quality assessment metric, is used for the similarity check. The inverse DWT is applied to obtain an estimate of the original image. This is a lossy compression scheme, and the results are compared with JPEG [13] and JPEG2000 [8] compression.
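    As a rough illustration of the wavelet step described above, the sketch below applies a one-level db4 DWT, keeps only the low-resolution (approximation) coefficients, reconstructs the image, and scores it with SSIM. It assumes the pywt and scikit-image libraries and a stock test image; the EZW and CBIR stages of the thesis are not reproduced.

```python
# Minimal sketch (not the thesis pipeline): one-level db4 DWT, keep only the
# low-resolution (approximation) band, reconstruct, and score with SSIM.
import numpy as np
import pywt
from skimage import data
from skimage.metrics import structural_similarity as ssim

image = data.camera().astype(float)            # example grayscale image

# First-level DWT with the Daubechies db4 mother wavelet.
cA, (cH, cV, cD) = pywt.dwt2(image, "db4")

# Retain only the approximation coefficients (detail bands set to zero),
# mimicking the "retain low-resolution coefficients" step of the abstract.
zeros = np.zeros_like(cH)
approx_only = pywt.idwt2((cA, (zeros, zeros, zeros)), "db4")
approx_only = approx_only[: image.shape[0], : image.shape[1]]

# SSIM between the original and the coarse reconstruction.
print("SSIM:", ssim(image, approx_only, data_range=image.max() - image.min()))
```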
  • Item (Open Access)
    Binarizing degraded document image for text extraction
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2015) Patel, Radhika; Mitra, Suman K.
    The recent era of digitization is expected to digitize many old and important documents that have degraded for various reasons. Binarizing a degraded document image for text extraction is the conversion of a color document image into a binary image. Document images mostly contain two classes: background and text. Binarization can also be considered a text retrieval procedure, as it extracts text from a degraded document. Degraded document image binarization has many challenges, such as large text intensity variation, background contrast variation, bleed-through, text size or stroke width variation within a single image, and highly overlapping background and foreground intensity ranges. Many approaches are available for document image binarization, but none can handle all kinds of degradation at the same time. Mostly, a combination of global and/or local thresholding along with various preprocessing and postprocessing techniques is used to handle most of the challenges. The approach proposed in this thesis is divided into three stages: preprocessing, text-area detection, and post-processing. Preprocessing employs PCA to convert the image from RGB to gray, followed by gamma correction to enhance the contrast of the image. The contrast-enhanced image is filtered with a DoG (Difference of Gaussians) filter to boost local features of the text, followed by equalization. The next stage identifies the text area. A rough set based edge detection technique is used to find closed boundaries around text, which locates the text area along with some non-text area detected as text. Text is detected by applying logical operators on the preprocessed image and the edge-detected image. The post-processing stage takes care of false positives and false negatives based on intensity values of the preprocessed and gray images. The algorithm is also expected to be independent of the script; to demonstrate this, it is tested on degraded Gujarati document images. Performance is evaluated using various quantitative measures such as Distance Reciprocal Distortion (DRD), Peak Signal-to-Noise Ratio (PSNR), F-measure and pseudo F-measure, and is compared with state-of-the-art (SOTA) methods. The proposed approach performs close to the SOTA methods and is able to binarize without losing text in some very challenging images where state-of-the-art methods lose the text.
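    The following sketch loosely mirrors the preprocessing chain described above (PCA-based RGB-to-gray conversion, gamma correction and a Difference-of-Gaussians filter), ending with a simple Otsu threshold as a stand-in for the rough-set text-area detection and post-processing stages. Parameter values and the example image are arbitrary assumptions, not taken from the thesis.

```python
# Minimal sketch (illustrative only) of a binarization preprocessing chain.
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage import data, exposure
from skimage.filters import threshold_otsu

rgb = data.astronaut().astype(float)           # example RGB image

# PCA to gray: project pixels onto the first principal component of the colors.
pixels = rgb.reshape(-1, 3)
pixels -= pixels.mean(axis=0)
_, _, vt = np.linalg.svd(pixels, full_matrices=False)
gray = (pixels @ vt[0]).reshape(rgb.shape[:2])
gray = (gray - gray.min()) / (gray.max() - gray.min())

# Gamma correction to enhance contrast (gamma value chosen arbitrarily here).
gray = exposure.adjust_gamma(gray, gamma=0.8)

# Difference of Gaussians to boost local (stroke-scale) features.
dog = gaussian_filter(gray, sigma=1.0) - gaussian_filter(gray, sigma=3.0)

# A global Otsu threshold as a stand-in for the later stages of the thesis.
binary = dog > threshold_otsu(dog)
print(binary.shape, binary.dtype)
```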
  • Item (Open Access)
    Techniques for denoising brain magnetic resonance images
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2016) Phophalia, Ashish; Mitra, Suman K.
    Advances in computational science have joined the medical imaging domain to help humanity. They offer great support in clinical practice, where automatic Computer-Aided Diagnosis (CAD) systems help in the identification and localization of abnormal tissues. In recent decades, a lot of research has been devoted to non-invasive techniques that serve mankind. One of them is Magnetic Resonance Imaging (MRI), which provides structural information at higher resolution even in the presence of bone structures in the body. Although it is free from ionizing radiation, factors such as the electronic circuitry and patient movement introduce artifacts into the imaging system that are considered noise. One needs to remove these artifacts by means of software processing to enhance the performance of the diagnostic process. This thesis is an attempt to deal with the noisy part of MRI while preserving image structures such as boundary details and preventing over-smoothing. It has been observed that, in the case of MR data, the noise follows a Rician distribution. As opposed to additive Gaussian noise, Rician noise is signal dependent in nature due to the MR image acquisition process. The thesis establishes a relationship between MRI denoising and the uncertainty model defined by Rough Set Theory (RST). RST has already shown promising outcomes in image processing problems including segmentation and clustering, whereas not much attention has been paid to the image restoration task. The first part of the thesis proposes a novel method for object based segmentation and edge derivation given the noisy MR image. The derived edges are closed and continuous, and the segmentation accuracy turns out to be better than that of well-known methods. This prior information is used as cues in various image denoising frameworks. In the bilateral filter framework, along with the spatial and intensity cues, a new weighting factor is derived using the prior segmentation and edge information. This is further extended to a non-local framework, where relaxing the spatial relation allows similar information to be accessed from far-off neighbors. Under the non-locality paradigm, a clustering based method is proposed which groups similar patches based on a similarity criterion. The proposed clustering method uniquely defines clusters of patches under a multiple-class setup. These clusters are then used to define basis vectors using Principal Component Analysis (PCA) and Kernel Principal Component Analysis (KPCA), followed by a hard thresholding shrinkage procedure. Afterwards, the multiple estimates of a pixel are averaged by the number of estimates. In total, the number of PCA or KPCA operations is far less than in other contemporary methods, which repeat the same process over chunks of patches in the image space. The concept is then extended to 3D MRI data. 3D imaging provides a better view of objects from three directions, as compared to 2D imaging where only one face of an object can be viewed. It involves a more complex relationship than 2D imaging and hence is computationally expensive, but it also includes more information, which helps in visualizing the object, its shape and boundary, similar to a real-world phenomenon. We extend the segmentation and edge derivation mechanism to 3D data in the last part of the thesis. The clustering process is also extended by converting each voxel to a one-dimensional vector. This part explores various kernels over Rician-noise-distributed MR data. The results are promising in terms of structure measures, even with some simple kernels.
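    As a loose illustration of the patch-based part of the approach above, the sketch below simulates Rician noise on a toy image, clusters its patches, and applies PCA hard-thresholding shrinkage within each cluster. The k-means clustering, patch size, and threshold value are assumptions for illustration; the thesis' rough-set segmentation cues and KPCA variant are not reproduced.

```python
# Minimal sketch (illustrative, not the thesis method): Rician-noise simulation,
# clustering of image patches, and PCA hard-thresholding denoising per cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
clean = np.kron(rng.random((16, 16)), np.ones((8, 8)))   # toy 128x128 image

# Rician noise: magnitude of a signal with Gaussian noise on each complex channel.
sigma = 0.05
noisy = np.hypot(clean + sigma * rng.standard_normal(clean.shape),
                 sigma * rng.standard_normal(clean.shape))

# Extract non-overlapping 8x8 patches and flatten them.
P = 8
patches = noisy.reshape(noisy.shape[0] // P, P, noisy.shape[1] // P, P)
patches = patches.transpose(0, 2, 1, 3).reshape(-1, P * P)

# Cluster similar patches, then shrink each cluster in its own PCA basis.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(patches)
denoised_patches = np.empty_like(patches)
for k in np.unique(labels):
    group = patches[labels == k]
    pca = PCA().fit(group)
    coeffs = pca.transform(group)
    coeffs[np.abs(coeffs) < 3 * sigma] = 0.0               # hard thresholding
    denoised_patches[labels == k] = pca.inverse_transform(coeffs)

# Reassemble patches into the denoised image and compare errors.
side = noisy.shape[0] // P
denoised = denoised_patches.reshape(side, side, P, P).transpose(0, 2, 1, 3)
denoised = denoised.reshape(noisy.shape)
print("RMSE noisy:", np.sqrt(np.mean((noisy - clean) ** 2)),
      "RMSE denoised:", np.sqrt(np.mean((denoised - clean) ** 2)))
```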
  • Item (Open Access)
    Practical approach for depth estimation and image restoration using defocus cue
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2011) Ranipa, Keyur R.; Joshi, Manjunath V.
    Reconstruction of depth from 2D images is an important research issue in computer vision. The depth from defocus (DFD) technique uses the space-varying blurring of an image as a cue for reconstructing the 3D structure of a scene. In this thesis we explore a regularization based approach for simultaneous depth estimation and image restoration from defocused observations. We are given two defocused observations of a scene captured with different camera parameters. Our method consists of two steps. First we obtain initial estimates of the depth as well as of the focused image. In the second step we refine the solution by using a fast optimization technique. Here we use the classic depth recovery method due to Subbarao for obtaining the initial depth map and a Wiener filter approach for the initial image restoration. Since the problem we are solving is ill-posed and does not yield a unique solution, it is necessary to regularize it by imposing additional constraints to restrict the solution space. The regularization is performed by imposing a smoothness constraint only. However, to preserve the depth and image intensity discontinuities, they are identified prior to the minimization process from the initial estimates of the depth map and the restored image. The final solution is obtained using a computationally efficient gradient descent algorithm, thus avoiding the need for computationally taxing algorithms. The depth as well as the intensity edge details of the final solution correspond to those obtained from the initial estimates. The experimental results indicate that the quality of the restored image is satisfactory even under severe space-varying blur conditions.
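    The sketch below illustrates, in a much simplified form, the second step described above: gradient-descent refinement of an initial estimate under a smoothness regularization term. The quadratic data term, weights, and toy depth map are assumptions; discontinuity preservation and the Subbarao/Wiener initialization are omitted.

```python
# Minimal sketch (illustrative only): gradient-descent refinement of an estimate
# under a smoothness (regularization) constraint. The data term and weights are
# toy choices, not the thesis cost function.
import numpy as np

def refine(initial, observed, lam=0.5, step=0.1, iters=200):
    """Minimize ||x - observed||^2 + lam * ||grad x||^2 by gradient descent."""
    x = initial.copy()
    for _ in range(iters):
        # Gradient of the data term.
        grad = 2.0 * (x - observed)
        # Gradient of the smoothness term (via the discrete Laplacian).
        lap = (-4.0 * x
               + np.roll(x, 1, 0) + np.roll(x, -1, 0)
               + np.roll(x, 1, 1) + np.roll(x, -1, 1))
        grad -= 2.0 * lam * lap
        x -= step * grad
    return x

rng = np.random.default_rng(1)
true_depth = np.outer(np.linspace(0, 1, 64), np.ones(64))   # toy depth ramp
noisy_depth = true_depth + 0.1 * rng.standard_normal(true_depth.shape)
refined = refine(noisy_depth, noisy_depth)
print("RMSE before:", np.sqrt(np.mean((noisy_depth - true_depth) ** 2)),
      "after:", np.sqrt(np.mean((refined - true_depth) ** 2)))
```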
  • Item (Open Access)
    Depth from defocus
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2010) Khatri, Nilay; Banerjee, Asim
    With the recent innovations in 3D technology, accurate estimation of depth is a fascinating and challenging problem. In this thesis, a depth estimation algorithm that uses Singular Value Decomposition to compute orthogonal operators has been implemented and tested on a variety of image databases. Due to the difficulty of obtaining such databases, an algorithm is also implemented that generates synthetic image databases of a scene, consisting of two defocused images obtained by varying the camera parameters, thus providing researchers with more databases to work on.
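    The following sketch shows one way such a synthetic database entry could be generated: a focused image is blurred with two different Gaussian point-spread functions standing in for two camera settings. The blur model and parameter values are assumptions for illustration, not the algorithm implemented in the thesis.

```python
# Minimal sketch (illustrative only): generating a synthetic pair of defocused
# observations of a scene by blurring a focused image with two different
# Gaussian point-spread functions.
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage import data

focused = data.camera().astype(float)

# Two hypothetical camera settings produce two different amounts of defocus.
defocused_near = gaussian_filter(focused, sigma=1.0)
defocused_far = gaussian_filter(focused, sigma=2.5)

# Stack the pair as one entry of a synthetic database.
pair = np.stack([defocused_near, defocused_far])
print(pair.shape)
```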
  • Item (Open Access)
    Classification of 3D volume data using finite mixture models
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2010) Phophalia, Ashish; Mitra, Suman K.
    3D imaging provides a better view of objects from three directions, as compared to 2D imaging where only the front face of an object can be viewed. It involves a more complex relationship than 2D imaging and hence is also computationally expensive, but it includes more information, which helps in visualizing the object, its shape and boundary, similar to a real-world phenomenon. A segmentation method should take care of the 3D relationship that exists between voxels. Multi-channel 3D imaging provides flexibility in changing the voxel size by changing the echo pulse signals, which helps in the analysis of soft tissues. The application of 3D imaging to MRI brain images helps in understanding brain anatomy and function more clearly. Mixture model based image segmentation methods provide a platform for many real-life segmentation problems. Finite Mixture Model (FMM) segmentation techniques have been applied successfully in 2D imaging, but these methods do not involve the spatial relationship among neighboring pixels. To overcome this drawback, the Spatially Variant Finite Mixture Model (SVFMM) was proposed for classification purposes. In medical imaging, the probability of noise is high due to the environment, the technician's expertise level, etc., so a robust method is required that can reduce the effect of noise in the images. The Gaussian distribution is preferred in the literature, but it is not robust against noisy data. The Student's t distribution uses the Mahalanobis squared distance to reduce the effect of outlier data. A comparative study is presented between these two distribution functions. In medical imaging, segmentation procedures make it possible to separate out different types of tissues, instead of manual processing which requires time and effort; segmentation methods automate this classification procedure. To reduce the computation time in 3D medical imaging, a sampling based approach called column sampling is used, with the variance of a column taken as the measure for sample selection. A comparison of the time taken for sample selection from the whole volume with random sampling is presented. The selected samples are provided to the estimation technique. The parameters of the mixture model are estimated using Maximum Likelihood Estimation and Bayesian learning estimation in the presented work, and a method for estimating the parameters of the SVFMM using Bayesian learning is proposed. The Misclassification Rate (MCR) is used as a quantitative measure to compare these methods. This work analyzes the FMM and SVFMM models with different probability distributions over the two estimation techniques; the MCR and computational time are considered as quantitative measures for performance evaluation. Different sampling percentages are tried out to estimate the parameters, and their MCR and computational time are presented. In conclusion, Bayesian learning estimation of the SVFMM using the Student's t distribution gives comparatively better results.
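    As a simplified illustration of the sampling and estimation steps described above, the sketch below performs variance-based column sampling on a toy volume and fits a plain Gaussian mixture to the sampled intensities by EM. It uses scikit-learn's GaussianMixture; the SVFMM, the Student's t distribution, and the Bayesian learning estimation of the thesis are not reproduced.

```python
# Minimal sketch (illustrative only): variance-based column sampling of a 3D
# volume followed by a finite (Gaussian) mixture fit to the sampled intensities.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy volume: two intensity classes plus noise, shape (depth, height, width).
volume = (rng.choice([0.2, 0.8], size=(32, 64, 64))
          + 0.05 * rng.standard_normal((32, 64, 64)))

# Column sampling: treat each (height, width) location as a column along depth
# and keep the columns with the highest intensity variance.
columns = volume.reshape(volume.shape[0], -1)            # depth x (H*W)
col_variance = columns.var(axis=0)
keep = np.argsort(col_variance)[-int(0.1 * columns.shape[1]):]   # top 10%
samples = columns[:, keep].reshape(-1, 1)

# Fit a two-component Gaussian mixture by maximum likelihood (EM),
# then classify every voxel of the volume.
gmm = GaussianMixture(n_components=2, random_state=0).fit(samples)
labels = gmm.predict(volume.reshape(-1, 1)).reshape(volume.shape)
print("Means:", gmm.means_.ravel(), "Label volume shape:", labels.shape)
```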
  • Item (Open Access)
    Multiresolution fusion of satellite images and super-resolution of hyper-spectral images
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2010) Shripat, Abhishek Kumar; Joshi, Manjunath V.
    This thesis presents a model based approach for multi-resolution fusion of satellite images. Given a high resolution panchromatic (Pan) image and a low spatial but high spectral resolution multispectral (MS) image acquired over the same geographical area, the objective is to obtain a high spatial resolution MS image. To solve this problem, a maximum a posteriori (MAP) - Markov random field (MRF) based approach is used. Each of the low spatial resolution MS images is modeled as an aliased and noisy version of its high resolution counterpart. The high spatial resolution MS images to be estimated are modeled separately as discontinuity preserving MRFs that serve as prior information. The MRF parameters are estimated from the available high resolution Pan image using a homotopy continuation method. The proposed approach has the advantage of minimal spectral distortion in the fused image, as it does not directly operate on the Pan digital numbers. The method does not require registration of the MS and Pan images. Also, the number of MRF parameters to be estimated from the Pan image is limited, as a homogeneous MRF is used. The time complexity of the approach is reduced by using particle swarm optimization (PSO) to minimize the final cost function. The effectiveness of the approach is demonstrated by conducting experiments on real images captured by the Landsat-7 ETM+ satellite.

    This thesis also presents super-resolution of hyper-spectral satellite images using Discrete Wavelet Transform (DWT) based learning. Given low resolution hyper-spectral images and a database consisting of sets of LR and HR textured images and satellite images, the super-resolution of the hyper-spectral image is obtained. Four hyper-spectral test images are selected from the 224 bands of the hyper-spectral image using the principal component analysis (PCA) technique. Using the minimum absolute difference (MAD) criterion, the best-matching wavelet coefficients are obtained, and the finer details of the test image are learned from the high resolution wavelet coefficients of the training data set. The inverse wavelet transform gives the super-resolved image corresponding to the test image. The effectiveness of the above approach is demonstrated by conducting experiments on real hyper-spectral images captured by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS).
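    The sketch below illustrates the learning idea of the second part in a much simplified form: for an LR test image (treated as a wavelet approximation band), the best-matching training example is chosen by the minimum absolute difference (MAD) criterion, its HR detail coefficients are borrowed, and the inverse DWT is applied. The Haar wavelet, the toy database, and image-level (rather than coefficient-level) matching are assumptions for illustration.

```python
# Minimal sketch (illustrative only): learning missing wavelet detail bands from
# a small training database using the minimum absolute difference criterion.
import numpy as np
import pywt

rng = np.random.default_rng(0)

# Hypothetical training database: HR images and their one-level Haar DWTs.
train_hr = [np.kron(rng.random((16, 16)), np.ones((2, 2))) for _ in range(5)]
train_dwt = [pywt.dwt2(img, "haar") for img in train_hr]
train_lr = [cA for cA, _ in train_dwt]              # approximation bands
train_details = [details for _, details in train_dwt]  # (cH, cV, cD) bands

# LR test image: treated as the approximation band of an unknown HR image.
test_lr = pywt.dwt2(np.kron(rng.random((16, 16)), np.ones((2, 2))), "haar")[0]

# MAD criterion: pick the training example whose LR band is closest to the test.
mad = [np.mean(np.abs(test_lr - lr)) for lr in train_lr]
best = int(np.argmin(mad))

# Learn the missing detail bands from the best match and invert the DWT.
super_resolved = pywt.idwt2((test_lr, train_details[best]), "haar")
print("Best match:", best, "SR image shape:", super_resolved.shape)
```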