M Tech Dissertations

Now showing 1 - 4 of 4

Open Access
Imbalanced bioassay data classification for drug discovery
(Dhirubhai Ambani Institute of Information and Communication Technology, 2018) Shah, Jeni Snehal; Joshi, Manjunath V.
All the methods developed for pattern recognition will show inferior performance if the dataset presented to it is imbalanced, i.e. if the samples belonging to one class are much more in number compared to the samples from the other class/es. Due to this, imbalanced dataset classification has been an active area of research in machine learning. In this thesis, a novel approach to classifying imbalanced bioassay data is presented. Bioassay data classification is an important task in drug discovery. Bioassay data consists of feature descriptors of various compounds and the corresponding label which denotes its potency as a drug: active or inactive. This data is highly imbalanced, with the percentage of active compounds ranging from 0.1% to 1.4%, leading to inaccuracies in classification for the minority class. An approach for classification in which separate models are trained by using different features derived by training stacked autoencoders (SAE) is proposed. After learning the features using SAEs, feed-forward neural networks (FNN) are used for classification, which are trained to minimize a class sensitive cost function. Before learning the features, data cleaning is performed using Synthetic Minority Oversampling Technique (SMOTE) and removing Tomek links. Different levels of features can be obtained using SAE. While some active samples may not be correctly classified by a trained network on a certain feature space, it is assumed that it can be classified correctly in another feature space. This is the underlying assumption behind learning hierarchical feature vectors and learning separate classifiers for each feature space. vi
Open Access
Locality preserving projection: a study and applications
(Dhirubhai Ambani Institute of Information and Communication Technology, 2012) Shikkenawis, Gitam; Mitra, Suman K
Locality Preserving Projection (LPP) is a recently proposed approach for dimensionality reduction that preserves the neighbourhood information and obtains a subspace that best detects the essential data manifold structure. Currently it is widely used for finding the intrinsic dimensionality of the data which is usually of high dimension. This characteristic of LPP has made it popular among other available dimensionality reduction approaches such as Principal Component Analysis (PCA). A study on LPP reveals that it tries to preserve the information about nearest neighbours of data points, thus may lead to misclassification in the overlapping regions of two or more classes while performing data analysis. It has also been observed that the dimension reducibility capacity of conventional LPP is much less than that of PCA. A new proposal called Extended LPP (ELPP) which amicably resolves two issues mentioned above is introduced. In particular, a new weighing scheme is designed that pays importance to the data points which are at a moderate distance, in addition to the nearest points. This helps to resolve the ambiguity occurring at the overlapping regions as well as increase the reducibility capacity. LPP is used for a variety of applications for reducing the dimensions one of which is Face Recognition. Face Recognition is one of the most widely used biometric technology for person identification. Face images are represented as highdimensional pixel arrays and due to high correlation between the neighbouring pixel values; they often belong to an intrinsically low dimensional manifold. The distribution of data in a high dimensional space is non-uniform and is generally concentrated around some kind of low dimensional structures. Hence, one of the ways of performing Face Recognition is by reducing the dimensionality of the data and finding the subspace of the manifold in which face images reside. Both LPP and ELPP are used for Face and Expression Recognition tasks. As the aim is to separate the clusters in the embedded space, class membership information may add more discriminating power. With this in mind, the proposal is further extended to the supervised version of LPP (SLPP) that uses the known class labels of data points to enhance the discriminating power along with inheriting the properties of ELPP
Open Access
Fingerprint image preprocessing for robust recognition
(Dhirubhai Ambani Institute of Information and Communication Technology, 2012) Munshi, Paridhi; Mitra, Suman K
Fingerprint is the oldest and most widely used form of biometric identification. Since they are mainly used in forensic science, accuracy in the fingerprint identification is highly important. This accuracy is dependent on the quality of image. Most of the fingerprint identification systems are based on minutiae matching and a critical step in correct matching of fingerprint minutiae is to reliably extract minutiae from the fingerprint images. However, fingerprint images may not be of good quality. They may be degraded and corrupted due to variations in skin, pressure and impression conditions. Most of the feature extraction algorithms work on binary images instead of the gray scale image and results of the feature extraction depends upon the quality of binary image used. Keeping these points in mind, image preprocessing including enhancement and binarization is proposed in this work. This preprocessing is employed prior to minutiae extraction to obtain a more reliable estimation of minutiae locations and hence to get a robust matching performance. In this dissertation, we give an introduction to the ngerprint structure and identification system . A discussion on the proposed methodology and implementation of technique for fingerprint image enhancement is given. Then a rough-set based method for binarization is proposed followed by the discussion on the methods for minutiae extraction. Experiments are conducted on real fingerprint images to evaluate the performance of the implemented techniques.
Open Access
Fractal based approach for image segmentation
(Dhirubhai Ambani Institute of Information and Communication Technology, 2004) Londhe, Tushar; Banerjee, Asim
In this thesis, we have proposed an algorithm for image segmentation, using the fractal codes. The basic idea behind this algorithm is to use fractal codes for the image segmentation. This method uses compressed codes instead of the gray levels of the image. Therefore it is cost effective in the sense of storage space and time as no decoding is performed before using the segmentation algorithm. Moreover, the proposed scheme can directly use on the images accessed from the image database where images are kept in fractal-compressed code.

M Tech Dissertations

Browse

Filters

Settings

Sort By

Results per page

Search Results