Question Answering System using NLP models (Splinter and SpanBERT)
Question answering (QA) is a discipline within information retrieval and natural language processing (NLP) concerned with building systems that automatically answer questions posed by humans in natural language. It is a critical NLP problem and a longstanding artificial intelligence milestone. In this project we use the Splinter and SpanBERT models to tackle question answering in both the open-domain and closed-domain settings. For the open domain we use SQuAD 2.0; for the closed domain we generated a separate COVID dataset ourselves for the Splinter model.
Splinter is a model pre-trained for few-shot question answering in a self-supervised manner. That is, it was pre-trained on raw text only, with no human labelling (which is why it can use so much publicly available data), using an automatic process to build inputs and labels from that text.
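To make "building inputs and labels automatically from raw text" more concrete, here is a toy sketch loosely modeled on the idea of masking a span that recurs in a passage and asking the model to recover it from the other occurrence. It is not Splinter's actual pre-training code: the function name build_example, the fixed n-gram length, and whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter

QUESTION_TOKEN = "[QUESTION]"  # placeholder standing in for the masked span


def build_example(text, n=2):
    """Toy self-supervised example builder (illustrative, not Splinter's real pipeline).

    Finds the first n-gram that occurs again later in the text, replaces the
    later occurrence with QUESTION_TOKEN, and returns:
      (masked_text, answer_text, (start_token, end_token))
    where the token span points at the occurrence that was left in place.
    Returns None if no n-gram repeats without overlapping itself.
    """
    tokens = text.split()  # assumption: whitespace tokenization
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(ngrams)

    for i, gram in enumerate(ngrams):
        if counts[gram] < 2:
            continue
        # Look for a later, non-overlapping occurrence to mask out.
        for j in range(i + n, len(ngrams)):
            if ngrams[j] == gram:
                masked = tokens[:j] + [QUESTION_TOKEN] + tokens[j + n:]
                return " ".join(masked), " ".join(gram), (i, i + n - 1)
    return None


example = build_example("the virus spreads fast because the virus mutates")
print(example)
```

No human wrote a question or marked an answer here: both the masked input and the label (the span that was kept) come from the text itself.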



Code architecture for the BERT model.

Code architecture for SpanBERT QuestionAnswering system.

Representation of the Closed Domain, preprocessed data (COVID data).

Predicted output of the Splinter model.

Training process showing the training and validation loss.

In this project we tried to build a question-answering system from the data we acquired, pre-processing it so that it works with the Splinter and SpanBERT models. Our approach converts texts into a set of questions that need to be answered simultaneously.
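As a rough illustration of the kind of pre-processing involved, the sketch below turns a context, a question, and an answer string into a SQuAD-style record with a character offset for the answer. The helper name make_squad_record and the exact field names are assumptions for this example; the layout of our COVID data may differ.

```python
def make_squad_record(context, question, answer_text):
    """Build a SQuAD-style record (field names are illustrative assumptions).

    If the answer string is empty or does not appear in the context, the
    record is marked unanswerable, which SQuAD 2.0 allows.
    """
    start = context.find(answer_text) if answer_text else -1
    answerable = start != -1
    return {
        "context": context,
        "question": question,
        "answers": {
            "text": [answer_text] if answerable else [],
            "answer_start": [start] if answerable else [],
        },
        "is_impossible": not answerable,
    }


record = make_squad_record(
    context="COVID-19 is caused by the SARS-CoV-2 virus.",
    question="What causes COVID-19?",
    answer_text="SARS-CoV-2",
)
print(record)
```

Storing the character offset alongside the answer text lets a tokenizer later map the answer to start and end token positions, which is the form that extractive QA models are trained on.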
Thank You :)