M Tech Dissertations

Permanent URI for this collection: http://ir.daiict.ac.in/handle/123456789/3

Search Results

Now showing 1 - 2 of 2
  • Item (Open Access)
    Query Processing in Different Domains
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2021) Mishra, Sonal; Majumder, Prasenjit
    In this modern era, digital content is exploding in every domain, and the biomedical domain is no exception. Finding potentially relevant medical documents that can help to diagnose a particular disease is a challenging problem as the volume of biomedical literature grows over time. Medical queries are usually short, often just three to four words, and typically contain a disease name, a genetic variant, and a treatment for the disease. Legal queries, in contrast, usually describe a situation, and the retrieved documents belong to the Prior Cases document collection. Various methods of pre-retrieval query expansion are explored, such as word embeddings; these embeddings are trained on the existing PubMed articles provided in the document collection. The experiments are performed on the TREC 2018 and TREC 2020 datasets. The thesis provides a detailed description of these experiments and retrieval systems, as well as the intuition behind building the models. In this thesis we propose a cross relevance language model that is effective in finding potentially relevant biomedical documents from a biomedical document collection. Experiments on the TREC 2018 and 2019 Precision Medicine tracks and the FIRE AILA 2019 track show that our proposed cross relevance language model is more effective than the existing standard relevance language model for medical document retrieval.
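The pre-retrieval query expansion described above appends, to each query term, its nearest neighbours in an embedding space. A minimal sketch of that step, using a toy embedding table with illustrative values (real vectors would come from embeddings trained on the PubMed collection; all terms and numbers here are hypothetical):

```python
import math

# Toy embedding table standing in for vectors trained on PubMed articles.
# (Hypothetical values chosen only to illustrate the expansion step.)
embeddings = {
    "melanoma":  [0.9, 0.1, 0.0],
    "carcinoma": [0.8, 0.2, 0.1],
    "braf":      [0.1, 0.9, 0.2],
    "mutation":  [0.2, 0.8, 0.3],
    "guitar":    [0.0, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def expand_query(terms, k=1):
    """Append the k nearest-neighbour vocabulary terms to each query term."""
    expanded = list(terms)
    for t in terms:
        if t not in embeddings:
            continue
        neighbours = sorted(
            (w for w in embeddings if w != t and w not in expanded),
            key=lambda w: cosine(embeddings[t], embeddings[w]),
            reverse=True,
        )
        expanded.extend(neighbours[:k])
    return expanded

print(expand_query(["melanoma", "braf"]))
# → ['melanoma', 'braf', 'carcinoma', 'mutation']
```

The expanded term list is then passed to the retrieval system in place of the original short query; the thesis itself evaluates this idea with embeddings built from the provided PubMed articles.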
  • Item (Open Access)
    What does BERT learn about questions
    (2020) Tyagi, Akansha; Majumder, Prasenjit
    Recent research in Question Answering (QA) is strongly motivated by the introduction of the BERT [5] model. This model has gained considerable attention since the researchers at Google AI Language claimed state-of-the-art results over various NLP tasks, including QA. While the introduction of end-to-end pipeline models consisting of an IR and an RC component has opened up research in two different areas, BERT representations alone show a significant improvement in the performance of a QA system. In this study, we cover several pipeline models, such as R3: Reinforced Ranker-Reader [15], the Re-Ranker Model [16], and the Interactive Retriever-Reader Model [4], along with the transformer-based QA system, i.e., BERT. The motivation of this work is to understand the black-box BERT model in depth and to identify what BERT learns about a question in order to predict the correct answer for it from a given context. We discuss all the experiments we performed to understand BERT's behavior from different perspectives. All experiments use the SQuAD dataset. We also use the LRP [3] technique for a better understanding and analysis of the experimental results. Along with studying what the model learns, we also try to find what the model does not learn; for this, we analyze various examples from the dataset to determine the types of questions for which the model predicts an incorrect answer. Finally, we present the overall findings on the BERT model in the conclusion section.
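In SQuAD-style extractive QA, a BERT-based reader emits a start logit and an end logit for every context token, and the predicted answer is the span maximising their sum. A minimal sketch of that span-selection step, with hypothetical tokens and logits standing in for real model output:

```python
# Span selection as done on top of BERT's start/end logits for SQuAD-style QA.
# Tokens and logit values below are hypothetical; a real system obtains them
# from the fine-tuned model's output layer.
tokens = ["bert", "was", "released", "by", "google", "in", "2018"]
start_logits = [0.1, 0.0, 0.2, 0.0, 2.5, 0.3, 1.0]
end_logits   = [0.0, 0.1, 0.0, 0.2, 2.0, 0.1, 2.8]

def best_span(start_logits, end_logits, max_len=5):
    """Return (start, end) maximising start+end score, with start <= end
    and a bounded span length, as in standard SQuAD decoding."""
    best, best_score = None, float("-inf")
    for i, s in enumerate(start_logits):
        for j in range(i, min(i + max_len, len(end_logits))):
            score = s + end_logits[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best

i, j = best_span(start_logits, end_logits)
print(" ".join(tokens[i:j + 1]))
# → google in 2018
```

Probing what the model has learned about the *question* then amounts to asking why these start/end distributions peak where they do, which is where techniques such as LRP come in.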