Add/Update Thesis
Title*
Author's Name*
Supervisor's Name
Abstract
Time is a paramount aspect of Information Retrieval (IR): it profoundly influences interpretation as well as the user's intention and expectations. The temporal patterns in a document or collection of documents play a central role in the effectiveness of IR systems, and accurately discerning these patterns plays an immense role in satisfying the time-based intention of a user. The web holds a plethora of documents, most of which contain divergent temporal patterns. Incorporating these temporal patterns into IR is referred to as Temporal Information Retrieval (TIR). Understanding TIR systems is requisite to addressing the temporal intention of a user efficiently. For time-specific queries (i.e., queries about an event), a relevant document must relate to the time period of the event. To address this problem, an IR system must (1) determine whether a document is temporally specific (i.e., focuses on a single time period) and (2) determine the focus time of the document (the time to which its content refers). This thesis exploits the temporal features of news documents to improve the retrieval effectiveness of IR systems. To the best of our knowledge, this thesis is the first study to focus on the problem of temporal specificity in news documents. It defines and evaluates novel approaches for determining the temporal specificity of news documents. These approaches are then used to classify news documents into three novel temporal classes. The study considers 24 implicit temporal features of news documents to classify them into a) High Temporal Specificity (HTS), b) Medium Temporal Specificity (MTS), and c) Low Temporal Specificity (LTS) classes. For this classification, rule-based and Temporal Specificity Score (TSS) based approaches are proposed. In the former approach, news documents are classified using a proposed set of rules based on temporal features.
The latter approach classifies news documents based on a TSS computed from the temporal features. The results of the proposed approaches are compared with four machine learning classification algorithms: Bayes Net, Support Vector Machine (SVM), Random Forest, and Decision Tree. The outcomes of the study indicate that the proposed rule-based classifier outperforms the four algorithms by achieving 82% accuracy, whereas TSS-based classification achieves 77% accuracy.
In addition, to determine the focus time of news documents, the thesis considers the temporal nature of news documents. The type and structure of documents influence the performance of focus time detection methods. This thesis proposes different splitting methods that divide a news document into three logical sections by examining the inverted pyramid news paradigm. These methods include the Paragraph Based Method (PBM), the Words Based Method (WBM), the Sentence Based Method (SBM), and the Semantic Based Method (SeBM). Temporal expressions in each section are assigned weights using a linear regression model. Finally, a scoring function is used to calculate the temporal score of each time expression appearing in the document. These temporal expressions are then ranked by their temporal score, so that the most suitable expression appears on top. Two evaluation measures are used to evaluate the performance of the proposed framework: a) precision score (P@1, P@2) and b) average error years. Precision at position 1 (P@1) and position 2 (P@2) represents the correct estimation of the focus time at the top 2 positions in the ranked list of focus times, whereas average error years is the distance between the estimated year and the actual focus year of a news document.
The effectiveness of the proposed method is evaluated on a diverse dataset of news related to popular events. The results reveal that the proposed splitting methods achieved an average error of less than 5.6 years, whereas SeBM achieved precision scores of 0.35 and 0.77 at positions 1 and 2, respectively.
The overall findings presented in this thesis demonstrate that the valuable temporal insights of documents can be used to enhance the performance of IR systems. Time-aware information retrieval systems can adopt these findings to satisfy user expectations for temporal queries.
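The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: P@k is read as the fraction of documents whose actual focus year appears among the top k ranked candidate years, and average error years is read as the mean absolute difference between the top-ranked year and the actual focus year. The function names and the toy data are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def precision_at_k(ranked_years: Sequence[Sequence[int]],
                   true_years: Sequence[int], k: int) -> float:
    """Fraction of documents whose true focus year is in the top-k candidates.

    Assumption: this "hit within top k" reading of P@k is ours, not the thesis's.
    """
    if not true_years or len(ranked_years) != len(true_years):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for ranked, truth in zip(ranked_years, true_years)
               if truth in ranked[:k])
    return hits / len(true_years)


def average_error_years(ranked_years: Sequence[Sequence[int]],
                        true_years: Sequence[int]) -> float:
    """Mean absolute gap between the top-ranked year and the true focus year.

    Assumption: the top-ranked candidate is taken as the estimated focus year.
    """
    if not true_years or len(ranked_years) != len(true_years):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(ranked[0] - truth)
              for ranked, truth in zip(ranked_years, true_years)]
    return sum(errors) / len(errors)


# Hypothetical toy data: one ranked list of candidate years per document.
ranked = [[2001, 1999, 2005], [1990, 1989], [2010, 2012, 2008], [1969, 1945]]
truth = [2001, 1989, 2008, 1945]

print(precision_at_k(ranked, truth, 1))    # 0.25
print(precision_at_k(ranked, truth, 2))    # 0.75
print(average_error_years(ranked, truth))  # 6.75
```

Under this reading, P@2 can never be lower than P@1, which is consistent with the reported scores of 0.35 and 0.77.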
Subject/Specialization
Language
Program
Faculty/Department's Name
Institute Name
University Type
Public
Private
Campus (if any)
Institute Affiliation Information (if any)
City where institute is located
Province
Country
Degree Starting Year
Degree Completion Year
Year of Viva Voce Exam
Thesis Completion Year
Thesis Status
Completed
Incomplete
Number of Pages
Urdu Keywords
English Keywords
Link
Select Category
Religious Studies
Social Sciences & Humanities
Science
Technology
Any other information you want to share, such as Table of Contents or Conclusion.
Your email address*