M Tech Dissertations
Permanent URI for this collection: http://ir.daiict.ac.in/handle/123456789/3
6 results
Search Results
Item Open Access
Biomedical information retrieval (Dhirubhai Ambani Institute of Information and Communication Technology, 2018) Purabia, Pooja R.; Majumder, Prasenjit
It is well known that the volume of biomedical literature is growing exponentially and that scientists are overwhelmed when they sift through the scope and diversity of this unstructured knowledge to find relevant information. TREC Precision Medicine 2017 is a track focusing on retrieving relevant scientific abstracts and clinical trials from PubMed and ClinicalTrials.gov for cancer patients given their medical case. This report describes the system architecture for the TREC 2017 Precision Medicine Track. I explored query expansion techniques using well-known broad knowledge sources such as MetaMap and the Entrez database. I used different pseudo-relevance feedback techniques, such as TF-IDF, BO1 and Local Context Analysis, to retrieve relevant medical abstracts. I used hidden aspects of the topics, such as the precision medicine and treatment aspects, to improve the scores. I report infNDCG, R-Prec and P@10 scores.

Item Open Access
Collaborative filtering approach with decision tree technique (Dhirubhai Ambani Institute of Information and Communication Technology, 2008) Srivastava, Anit; Jotwani, Naresh D.
Rapid advances in data collection and storage technology have enabled organizations (especially in e-commerce) to accumulate vast amounts of data. The amount of data kept in computer files and databases is growing at a phenomenal rate because customers are increasingly using e-commerce services. Processing large numbers of customers' past purchase records is therefore becoming a new challenge in e-commerce. The primary goal of e-commerce services is to build systems in which customers receive product recommendations relevant to their past purchases. We have implemented collaborative filtering with supervised learning techniques. One such supervised learning technique is the decision tree.
We have used a decision tree to cluster similar customers according to the active customer's preferences (or tastes). In our new approach, a collaborative-filtering-based recommender system recommends the Top-k likely products according to a customer's preferences (or tastes) by considering the past purchase records (or implicit ratings) of the customers in the same cluster. The system also recommends or predicts the Top-k likely products for particular customers by considering cases in which clustered customers have given explicit ratings (or votes) to their previously purchased products.

Item Open Access
Application of BTrees in data mining (Dhirubhai Ambani Institute of Information and Communication Technology, 2008) Srivastava, Amit; Jotwani, Naresh D.
As massive amounts of information become available electronically, techniques for making decisions by analyzing statistics on large datasets are tending to become very complex. Making such a decision requires more disk accesses. There is therefore a need for techniques that require the fewest disk accesses and less running time to perform operations in main memory. To build such a strategic, goal-oriented decision, the information must be classified into different classes using some given properties of the information, which led us to build two BTrees that run simultaneously. One BTree is used as a classifier for making the decision, and the other BTree maintains the organization of the dataset from which we make the strategic decisions. Our research centers on the learning, implementation and usage of an advanced data structure (i.e. the BTree).
In our thesis work we have used binary search instead of the linear search, which takes running time O(T); this has enhanced the performance of the BTree during execution of operations on it.

Item Open Access
Mining effective association rules using support-conviction framework (Dhirubhai Ambani Institute of Information and Communication Technology, 2007) Sharma, Adarsh; Jotwani, Naresh D.
Discovering association rules is one of the most important tasks in data mining. Most research on association rule mining has used the support-confidence framework. In this thesis, we point out some drawbacks of the support-confidence framework for mining association rules. In order to avoid the limitations in the rule selection criterion, we replace confidence with conviction, which is a more reliable measure of implication rules. We have generated the test data synthetically with the Hierarchical Synthetic Data Generator, which appropriately models customer behaviour in the retailing environment. Experimental results show that there is a higher correlation between the antecedent and consequent of the rules produced by the support-conviction framework than of the rules produced by the support-confidence framework. Although the support-conviction framework mines effective associations, the association rules generated are so numerous that they are difficult to deal with. To overcome this problem, we propose an association rule pruning algorithm, which produces non-redundant and significant rules.
Results obtained with synthetic data show that the proposed approach for mining association rules is quite effective and generates meaningful associations among the sets of data items.

Item Open Access
Web content outlier detection using latent semantic indexing (Dhirubhai Ambani Institute of Information and Communication Technology, 2007) Paluri, Santosh Kumar; Jotwani, Naresh D.
Outliers are data elements different from the other elements in the category from which they are mined. Finding outliers in web data is referred to as web outlier mining. This thesis explores web content outlier mining, which finds applications in electronic commerce, finding novelty in text, etc. Web content outliers are text documents whose contents differ from the rest of the documents taken from the same domain. Existing approaches to this problem use lexical match techniques such as n-grams, which are prone to problems like synonymy (expressing the same concept with different words), which leads to poor recall (an important measure for evaluating a search strategy). In this thesis we use Latent Semantic Indexing (LSI) to represent the documents and terms as vectors in a reduced-dimensional space and thereby separate the outlying documents from the rest of the corpus. Experimental results using embedded outliers in chapter four indicate that the proposed idea is successful and also better than the existing approaches to mining web content outliers.

Item Open Access
Efficient algorithms for hierarchical online rule mining (Dhirubhai Ambani Institute of Information and Communication Technology, 2006) Banda, Kishore Kumar; Jotwani, Naresh D.
Association rule mining, one of the techniques of data mining, deals with the challenge of mining informative associations from fast-accumulating data. Over the past decade, the research community has been progressing steadily on the task of rule mining.
Hierarchical online rule mining opens a new trend towards achieving an online approach in the real sense. In this thesis, we further develop the theory of Hierarchical Association Rules. Notably, we propose a new algorithm that improves on the efficiency of previously proposed work in three respects. In phase 1 of the rule-mining problem, we introduce the Hierarchy Aware Counting and Transaction Reduction concepts, which reduce the computational complexity by a considerable factor. We also propose a Redundancy Check while generating rules in phase 2 of the problem. We propose a modified version of a Synthetic Data Generator that deals with hierarchical data and use it to evaluate the performance of the proposed algorithm. Finally, we discuss the issues that can form future directions for the proposed approach.