The rapid growth of the Internet has made it possible for individuals to communicate online, sharing information and opinions. Reviews are an important source of information that helps individuals make informed purchase decisions, and manufacturers use this information to improve their designs and services. However, the large volume of uncontrolled user-generated reviews on the Internet has raised concerns about their quality and reliability. Automatically ranking and classifying reviews according to their quality is therefore gaining much attention from the research community. Prior studies on determining the quality of product reviews have focused on classifying complete documents as helpful or unhelpful. Little attention has been paid to a deep analysis of the helpful sentences within reviews. Our work aims to fill this gap by identifying useful sentences related only to the product and its features. This work extends the quality prediction of product reviews by examining each review at a fine-grained level, namely the sentence level. Additionally, we examine the feasibility of supervised text classification methods for classifying sentences as helpful or unhelpful. Other studies have applied supervised classification methods to classify sentences into predefined classes, for example in sentiment classification and helpdesk emails. This thesis describes experiments on the automatic prediction of helpful sentences in product reviews using supervised text classification methods and four machine learning algorithms. We introduce a novel three-phase framework to evaluate our research hypotheses and to answer the research questions. The first phase focuses on the analysis of the characteristics and specifications of helpful sentences in review text.
The other two phases of the framework examine the impact of existing feature sets on helpfulness classification performance, both individually and in combination. These feature sets have been used in helpfulness prediction at the document level; our aim is to examine their effect at the sentence level. We evaluate our results against a gold-standard helpfulness dataset that was generated for this thesis in the first phase of the framework. Results show that it is feasible to identify helpful sentences in online product reviews using supervised learning. Furthermore, our framework can be adapted to other review domains, because the novel specifications of helpful sentences that we introduce can be adopted in any domain.
|Date of Award||3 Jan 2020|
|Awarding Institution||The University of Manchester|
|Supervisor||Sophia Ananiadou|