M Tech Dissertations
Permanent URI for this collection: http://ir.daiict.ac.in/handle/123456789/3
9 results
Search Results
Item Open Access
Integrating semantics into biomedical information retrieval (Dhirubhai Ambani Institute of Information and Communication Technology, 2015)
Thakrar, Fenny; Majumder, Prasenjit
Integrating semantics into biomedical information retrieval is concerned with studying the meaning of concepts and the relationships among them. We use a semantic document representation approach to apply domain-specific knowledge in the information retrieval system. Single-word and multi-word concepts are extracted from each document using an external semantic structure, the UMLS Metathesaurus. Word sense disambiguation is performed on the extracted concepts to resolve their different senses, and the document is then represented in the form of UMLS concepts. The documents and queries are represented in semantic space and fed to an information retrieval system, which ranks the documents according to the given query. We performed experiments on the TREC 2014 CDS Task data and its 30 queries. Two retrieval techniques, single-word and multi-word retrieval, were tested. The results obtained using conceptual information retrieval are compared with those obtained using traditional term-based retrieval. The conceptual IR approach performed better than the term-based IR system on the evaluation metrics MAP, P10 and RPrec. Single-word retrieval performed better than multi-word retrieval for conceptual IR. Query expansion in the conceptual IR system also performed better than the conceptual IR system without query expansion.

Item Open Access
Schema based indexing for namespace mapping of raw sparql and summarization of lod (Dhirubhai Ambani Institute of Information and Communication Technology, 2015)
Hapani, Hitesh; Jat, P. M.
Linked open data (LOD) in the Semantic Web is growing day by day. Datasets are available that can be used in different applications.
However, identifying a useful dataset in the cloud, determining its quality, and obtaining inductive information from it are all tasks that need to be addressed. As traffic on LOD increases, it will become more difficult to identify useful datasets. The reason behind this problem is that no useful summary of the datasets is available. When querying a dataset through an endpoint, the most cumbersome part is remembering the URIs for resources. There is no known interface that provides URIs for user terms. Some standards exist for providing summaries and metadata about datasets, but so far none is universally accepted. The index structure proposed in this thesis gives schema-level information about any dataset and provides URI information for the dataset. This index structure has been successfully implemented on a local dataset server and a remote dataset server in this thesis.

Item Open Access
Automated sparql generation (Dhirubhai Ambani Institute of Information and Communication Technology, 2015)
Ladhar, Kanahiya Lal; Jat, P. M.
Over the last decade, semantic data on the web has grown exponentially, and it has become difficult for amateur users to retrieve information. The semantic web is an enhancement of the existing web in which semantics is attached to information, allowing machines and users to work in cooperation. RDF is a standard model for data interchange on the web, developed to interlink large amounts of Linked Data for the semantic web and to extract data efficiently. SPARQL is the standard query language for retrieving data from an RDF database. Using SPARQL requires knowledge of the URIs, which are unique for each resource in DBpedia, but figuring out the URIs for all resources is not feasible, and each predicate is restricted to a domain and range.
In this work we propose an interface that maps keywords to URIs, a major step towards automated SPARQL generation. Our system takes keyword-SPARQL as input and produces SPARQL as output, which can be executed on a SPARQL endpoint. Having studied the structure of DBpedia, we create an interface that provides an auto-suggestion technique to help users with the general problems caused by the DBpedia structure and with commonly occurring typing errors. Concept and property mapping functions are used to map instances to concepts in DBpedia, and WSD is used for predicate disambiguation. The advantage of this approach is that users who are unaware of the schema of DBpedia and the complexity of SPARQL can retrieve information.

Item Open Access
Ontology learning from relational database and reuse it with popular vocabularies (Dhirubhai Ambani Institute of Information and Communication Technology, 2015)
Malodiya, Prakasha B.; Jat, P. M.
Ontologies are web documents, written in web ontology languages, that are used to develop the semantic web. The semantic web requires data that is either created manually or converted from existing data. Building ontologies manually is a complex and time-consuming process. It also requires domain experts to understand the syntax and semantics of ontology development languages, and it can produce error-prone ontologies. Structured (relational) data is used for ontology learning because it is the most valuable data source available on the web. Research on automatically creating ontologies from structured databases is not new, and many tools and methods have been created for this problem. The primary limitation of existing tools and methods for learning an ontology from a relational database is that the generated ontology is simply a copy of the input database schema.
Such an ontology conveys the information in the database schema but contains no information about the data. In this thesis we propose a tool for the automatic creation of an ontology that describes both the relational schema model and the data stored in the database. Our aim is to analyze existing tools for automatically creating ontologies from relational databases and to identify their advantages and disadvantages, so that an effective and valuable tool can be proposed. We give a detailed analysis of existing tools for automatically creating ontologies from relational databases, based on the database schema and the data stored in the database, and we also present a comparative analysis of these tools against our proposed tool.

Item Open Access
Ontology alignment based support for searching LOD-datasets using SPARQL query (Dhirubhai Ambani Institute of Information and Communication Technology, 2015)
Vijayvergiya, Nishant; Jat, P. M.
The Linked Open Data (LOD) cloud is rapidly becoming the largest interconnected source of structured data on the web. Due to the distributed nature of LOD and the growing number of ontologies and vocabularies used in LOD datasets, querying over multiple datasets and retrieving data from the LOD cloud remains a challenging task. The data on the LOD cloud is stored in the Resource Description Framework (RDF), and SPARQL is the standard language for searching over RDF data. Systems such as LOQUS [1] and ALOQUS [2] have been designed to formulate SPARQL queries that can obtain data from the LOD cloud effectively and efficiently. Our work focusses on implementing a system like LOQUS and ALOQUS that uses SPARQL queries to obtain data from different LOD datasets. Before the datasets can be queried, however, the dataset-specific ontologies must be mapped to an upper-level ontology so that the SPARQL queries can be formulated.
In our work, we demonstrate the "Mapping Approach" used to implement our system: how two ontologies are mapped, how SPARQL queries are created using these mappings, and how results are obtained after executing these queries.

Item Open Access
Semantic web data management: data partitioning and query execution (Dhirubhai Ambani Institute of Information and Communication Technology, 2012)
Padiya, Trupti; Bhise, Minal
A Semantic Web database is an RDF database. With the increased use of the Semantic Web in real-life applications, the use of RDF databases has grown immensely. As the volume of RDF data increases tremendously, efficient management of this data at a larger scale and query performance are two major concerns. RDF data can be stored using various storage techniques. The RDF data used for this experiment is the FOAF dataset, which is social network data. We study and evaluate query performance for various storage techniques in terms of query execution time and scalability using the FOAF dataset. The thesis demonstrates the effect of data partitioning techniques on query performance. For our experiments, we used a Triple Store, Property Tables, and vertically and horizontally partitioned data stores to store the FOAF data. Experiments were performed to analyze query execution time for all these data stores. Partitioning techniques were observed to make queries 168 times faster compared to Triple Stores. Materialized views are used to further improve performance for queries that occur frequently on social web data. Materialized views showed better query performance, with execution 8 times faster than on the partitioned data.

Item Open Access
SPARQLGen: generation of SPARQL from pseudo BGP (Dhirubhai Ambani Institute of Information and Communication Technology, 2012)
Mandloi, Dipendra Singh; Chaudhary, Sanjay; Jat, Pokhar Mal
SPARQL is the query language and communication protocol for communicating with RDF data sources.
A SPARQL query requires knowledge of the URIs of the bound values in its triple patterns and of the ontological schema used by the dataset. Even a person expert in SPARQL finds it hard to figure out the URIs for the bound values to be used in a query. This requirement creates a gap between the end user and SPARQL query formation. In this work, we aim to facilitate semantic search over the web of data by converting keywords into URIs, and we present SPARQLGen. SPARQLGen provides an easy way of writing a SPARQL query for a given query over the Web of Data (RDF data). Semantic annotations of keywords are captured through an appropriate interface. We derive a Pseudo Basic Graph Pattern, which is similar to a SPARQL BGP except that it contains keywords rather than full resource URIs. We propose heuristics that discover URIs for the annotated keywords and build the corresponding SPARQL query. SPARQLGen uses the services of Falcons, a semantic search engine. Linked Open Data plays the major role in finding aliased URIs of an entity. The final set of results contains a list of URIs from different data sources. SPARQLGen bridges the gap between the end user and SPARQL query formation. The interface allows users to write their intended keywords instead of a highly syntactic SPARQL query, so that they need not worry about the URIs of entities while writing their queries.

Item Open Access
Service integration on social network (Dhirubhai Ambani Institute of Information and Communication Technology, 2011)
Patel, Mehul; Chaudhary, Sanjay; Bise, Minal
Microblogging services are part of social network platforms, which allow people to exchange short messages. Social networks enable people to play an active role in collecting, analyzing and reporting news and information. People can use a social network platform for marketing, buying and selling their products. A seller can tweet product information, including links to related photos, videos, etc. A buyer can show interest in the product by means of tweets.
A social network can be used as a mechanism to bring sellers and buyers closer. It provides a common platform for buyers and sellers to buy and sell products. Microblogs can be parsed and analyzed to generate useful suggestions, e.g. sellers can be informed about potential buyers to get higher profit. Such information can be used to generate classified information that helps users take decisions, e.g. the minimum price that sellers expect for a crop in a given region. Microblogs can be written in different regional languages. Agro-produce marketing information can be processed and then stored in an RDF/RDF(S) and OWL data store. SPARQL and conjunctive queries with a Pellet-like reasoner, or SPARQL-DL, can be used to generate classified, summarized information from the RDF/RDF(S) and OWL data store.

Item Open Access
SPARQL query optimization (Dhirubhai Ambani Institute of Information and Communication Technology, 2011)
Singh, Rohit Kumar; Chaudhary, Sanjay
Query optimization is the process of selecting the most efficient query evaluation plan from among the many strategies possible for processing a given query, especially if the query is complex. Users are not expected to write their queries in a way that allows efficient processing; rather, the system is expected to construct a query evaluation plan that minimizes the cost of query evaluation. In any query optimization, the goal is to find, without actually executing the query or its subparts, the execution plan that is expected to return the result set at optimal cost. Query engines for ontological data mostly execute user queries without considering any optimization. Especially for large ontologies, optimization techniques are required to ensure that query results are delivered within reasonable time. SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware.
Query optimization may therefore speed up SPARQL query answering through knowledge-intensive reformulation. In our research work, we propose a learning approach to this problem. In our approach, learning is triggered by user queries. The system then uses an inductive learning algorithm to generate semantic rules. This inductive learning algorithm can automatically select useful join paths and properties to construct rules from an ontology with many concepts. The learned semantic rules are effective for optimizing SPARQL queries because they match query patterns and reflect data regularities.