Theses and Dissertations

Image captioning is the task of generating a textual description for an image. It has applications in many fields, including image indexing for content-based image retrieval, self-driving cars, assistance for visually impaired persons, and smart surveillance systems. It connects two major research communities: computer vision and natural language processing. The main challenges in image captioning are to recognize the important objects in an image, their attributes, and the visual relationships between them, and then to generate syntactically and semantically correct sentences. Currently, most image captioning architectures are based on the encoder-decoder model, in which the image is first encoded with a CNN to obtain an abstract representation and then decoded with an RNN to produce a caption. I selected as my base paper a model that uses visual attention to attend to the most relevant region of the image while generating each word of the caption. However, that model misses one important factor when generating the caption: the visual relationships between the objects present in the image. I therefore added a relationship detector module to the model so that it takes the relationships between objects into account. Combining this module with the existing Show, Attend and Tell model yields captions that consider the relationships between objects, which ultimately improves caption quality. I performed experiments on publicly available standard datasets: Flickr8k, Flickr30k, and MSCOCO.
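The attention step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `soft_attention` and its inputs are hypothetical, and the per-region relevance scores are passed in directly, whereas in a trained captioning model they would presumably be computed from the decoder state by a learned function.

```python
import math


def soft_attention(region_features, scores):
    """Return (weights, context) for one decoding step.

    region_features: list of equal-length feature vectors, one per image region.
    scores: one relevance score per region (assumed given; a real model
            would learn to compute them).
    """
    # Softmax over the scores, shifted by the maximum for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The context vector is the attention-weighted average of the region features.
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0]]
    weights, context = soft_attention(features, [2.0, 0.0])
    print(weights, context)
```

The region with the higher score receives the larger weight, so the context vector passed to the decoder is dominated by that region's features.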

Research in the field of text summarisation has primarily been dominated by investigations of various sentence extraction techniques, with a significant focus on news articles. In this thesis, we intend to look beyond generic sentence extraction and instead focus on domain-specific summarisation, methods for creating ensembles of multiple extractive summarisation techniques, and using sentence compression as the first step towards abstractive summarisation.

We start by proposing two new datasets for domain-specific summarisation. The first corpus is a collection of court judgements with corresponding handwritten summaries, while the second is a collection of scientific articles from the ACL Anthology. The legal summaries are recall-oriented and semi-extractive, whereas the abstracts of ACL articles are more precision-oriented and abstractive. Both collections have a reasonable number of article-summary pairs, enabling us to use data-driven techniques. Excluding newswire corpora, where the summaries are usually article headlines, the proposed collections are amongst the largest openly available collections for document summarisation.

Next, we propose a completely data-driven technique for sentence extraction from legal and scientific articles. In both the legal and ACL corpora, the summaries have a predefined format. Hence, it is possible to identify summary-worthy sentences depending on whether they contain certain key phrases. Our proposed approach, based on an attention-based neural network, learns to automatically identify these key phrases from pseudo-labelled data, without requiring any annotation or handcrafted rules. The proposed model outperforms existing baselines and state-of-the-art systems by a large margin.

There are a large number of sentence extraction techniques, none of which guarantees better performance than the others. As a part of this thesis, we explore whether it is possible to leverage this variance in performance to generate an ensemble of several extractive techniques. In the first model, we study the effect of using multiple sentence similarity scores, ranking algorithms and text representation techniques. We demonstrate that such variations can be used for improving rank aggregation. Using several sentence similarity metrics, with any given ranking algorithm, always generates better abstracts. Next, we propose several content-based aggregation models. Given the variation in performance of extractive techniques across documents, a priori knowledge about which technique would give the best result for a given document would drastically improve the result. In such a case, an oracle ensemble system can be built that chooses the best possible summary for a given document. In the proposed content-based aggregation models, we estimate the probability of a summary being good by looking at the amount of content it shares with other candidate summaries. We present a hypothesis that a good summary will necessarily share more information with another good summary, but not with a bad summary. We build upon this argument to construct several content-based aggregation techniques, achieving a substantial improvement in the ROUGE scores.

We then propose another attention-based neural model for sentence compression. We use a novel context encoder, which helps the network to handle rare but informative terms better. We compare the proposed approach to some sentence compression and abstractive techniques that have been proposed in the past few years. We present our arguments for and against these techniques and build a further roadmap for abstractive summarisation. In the end, we present the results for an end-to-end system which performs sentence extraction using standalone summarisation systems as well as their ensembles, and then uses the sentence compression technique to generate the final abstractive summary.
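The content-based aggregation idea described above (a candidate summary is more likely to be good if it shares content with the other candidates) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's model: the function names are hypothetical, and word-level Jaccard overlap stands in for whatever content-similarity measure the thesis actually uses.

```python
def jaccard(a, b):
    """Word-level Jaccard overlap between two texts (an assumed stand-in measure)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def pick_by_shared_content(candidates):
    """Score each candidate by its mean overlap with all other candidates.

    Returns (index of the best-scoring candidate, list of scores).
    Ties are resolved in favour of the earliest candidate.
    """
    if len(candidates) < 2:
        raise ValueError("need at least two candidate summaries")
    scores = []
    for i, candidate in enumerate(candidates):
        overlaps = [
            jaccard(candidate, other)
            for j, other in enumerate(candidates)
            if j != i
        ]
        scores.append(sum(overlaps) / len(overlaps))
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Hypothetical outputs of three extractive systems for one document.
    summaries = [
        "the court dismissed the appeal",
        "the court dismissed the appeal with costs",
        "weather was sunny today",
    ]
    best, scores = pick_by_shared_content(summaries)
    print(best, scores)
```

In this toy input, the off-topic third candidate shares no words with the others and receives the lowest score, which mirrors the hypothesis that good summaries overlap with each other but not with bad ones.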
