M Tech Dissertations
Permanent URI for this collection: http://ir.daiict.ac.in/handle/123456789/3
4 results
Search Results
Item (Open Access): Semantic web data management: data partitioning and query execution (Dhirubhai Ambani Institute of Information and Communication Technology, 2012) Padiya, Trupti; Bhise, Minal
A Semantic Web database is an RDF database. With the increased use of the Semantic Web in real-life applications, the use of RDF databases has grown immensely. As the volume of RDF data increases, efficient management of this data at scale and query performance become two major concerns. RDF data can be stored using various storage techniques. The RDF data used for these experiments is the FOAF dataset, which is social network data. We study and evaluate query performance for various storage techniques, in terms of query execution time and scalability, using the FOAF dataset. The thesis demonstrates the effect of data partitioning techniques on query performance. For our experiments, we used a Triple Store, Property Tables, and vertically and horizontally partitioned data stores to store the FOAF data, and analyzed query execution time for each of these data stores. Partitioning techniques were observed to make queries 168 times faster than Triple Stores. Materialized views are used to further improve performance for queries that occur frequently on social web data; they gave execution times 8 times faster than the partitioned data.

Item (Open Access): Collaborative filtering approach with decision tree technique (Dhirubhai Ambani Institute of Information and Communication Technology, 2008) Srivastava, Anit; Jotwani, Naresh D.
Rapid advances in data collection and storage technology have enabled organizations (especially in e-commerce) to accumulate vast amounts of data. The amount of data kept in computer files and databases is growing at a phenomenal rate because customers are increasingly using e-commerce services.
Processing the large number of customers' past purchase records is therefore becoming a new challenge in e-commerce. The primary goal of e-commerce services is to build systems in which customers receive recommendations for products relevant to their past purchases. We have implemented collaborative filtering with supervised learning techniques, one of which is the Decision Tree. We use a Decision Tree to group similar customers according to the active customer's preferences (or tastes). In our new approach, a collaborative-filtering-based recommender system recommends the Top-k likely products according to a customer's preferences (or tastes) by considering the past purchase records (or implicit ratings) of the customers in the same group. The system also recommends or predicts the Top-k likely products for a particular customer in cases where the grouped customers have given explicit ratings (or votes) to their previously purchased products.

Item (Open Access): Application of BTrees in data mining (Dhirubhai Ambani Institute of Information and Communication Technology, 2008) Srivastava, Amit; Jotwani, Naresh D.
As massive amounts of information become available electronically, techniques for making decisions based on statistics over large datasets are becoming very complex. Making such a decision requires a large number of disk accesses. There is therefore a need for techniques that use as few disk accesses and as little running time as possible to perform operations in main memory. To build such a strategic, goal-oriented decision, the information must be classified into different classes using given properties of the information, which led us to build two BTrees that run simultaneously. One BTree is used as a classifier for making the decision, and the other BTree maintains the organization of the dataset from which the strategic decisions are made.
Our research centers on the study, implementation, and use of an advanced data structure, the BTree. In this thesis we use binary search instead of linear search, which takes O(T) running time; this improves the performance of operations on the BTree.

Item (Open Access): Web content outlier detection using latent semantic indexing (Dhirubhai Ambani Institute of Information and Communication Technology, 2007) Paluri, Santosh Kumar; Jotwani, Naresh D.
Outliers are data elements that differ from the other elements in the category from which they are mined. Finding outliers in web data is known as web outlier mining. This thesis explores web content outlier mining, which has applications in electronic commerce, novelty detection in text, and similar areas. Web content outliers are text documents whose contents differ from those of the other documents taken from the same domain. Existing approaches to this problem use lexical match techniques such as n-grams, which are prone to problems like synonymy (expressing the same concept with different words); this leads to poor recall, an important measure for evaluating a search strategy. In this thesis we use Latent Semantic Indexing (LSI) to represent the documents and terms as vectors in a reduced-dimensional space and thereby separate the outlying documents from the rest of the corpus. Experimental results using embedded outliers, reported in chapter four, indicate that the proposed idea is successful and performs better than the existing approaches to mining web content outliers.
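
Illustrative note (not taken from the thesis): the LSI idea described in the last abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the author's method. The function names, the toy corpus, the choice of k = 2, and the outlier score (one minus the mean cosine similarity to the other documents) are all hypothetical choices made for illustration. The truncated SVD is obtained from the eigenpairs of the document Gram matrix by power iteration with deflation, which is adequate only for tiny examples.

```python
import math


def term_document_matrix(docs):
    """Rows are terms, columns are documents; entries are raw term counts."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    return [[d.lower().split().count(t) for d in docs] for t in vocab]


def gram(a):
    """Return A^T A, the document-by-document inner-product matrix."""
    n = len(a[0])
    return [[sum(r[i] * r[j] for r in a) for j in range(n)] for i in range(n)]


def top_eigenpairs(g, k, iters=200):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(g)
    g = [[float(x) for x in row] for row in g]  # work on a copy
    pairs = []
    for _rank in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _step in range(iters):
            w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * g[i][j] * v[j] for i in range(n) for j in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next pass.
        for i in range(n):
            for j in range(n):
                g[i][j] -= lam * v[i] * v[j]
    return pairs


def cosine(x, y):
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 or ny == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / (nx * ny)


def lsi_outlier_scores(docs, k=2):
    """Score each document by 1 - mean cosine similarity in the k-dim LSI space."""
    pairs = top_eigenpairs(gram(term_document_matrix(docs)), k)
    # Document j has coordinates sigma_i * V[j][i], with sigma_i = sqrt(eigenvalue_i).
    coords = [[math.sqrt(max(lam, 0.0)) * v[j] for lam, v in pairs]
              for j in range(len(docs))]
    scores = []
    for i, ci in enumerate(coords):
        sims = [cosine(ci, cj) for j, cj in enumerate(coords) if j != i]
        scores.append(1.0 - sum(sims) / len(sims))
    return scores


if __name__ == "__main__":
    # Hypothetical toy corpus: three travel documents and one embedded outlier.
    corpus = [
        "cheap flights hotel booking travel deals",
        "travel deals cheap hotel flights",
        "hotel booking travel flights cheap",
        "football match score goal league",
    ]
    for text, score in zip(corpus, lsi_outlier_scores(corpus, k=2)):
        print(f"{score:.3f}  {text}")
```

In this toy corpus the football document shares no vocabulary with the others, so it only demonstrates the mechanics (project, compare, rank). It does not demonstrate the synonymy advantage that the abstract attributes to LSI over lexical matching.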