Identification of Temporal Specificity and Focus Time Estimation in News Documents

Thesis Info

Access Option

External Link

Author

Khan, Shafiq Ur Rehman

Program

PhD

Institute

Capital University of Science & Technology

City

Islamabad

Province

Islamabad

Country

Pakistan

Thesis Completing Year

2019

Thesis Completion Status

Completed

Subject

Computer Science

Language

English

Link

http://prr.hec.gov.pk/jspui/bitstream/123456789/11804/1/Shafiq%20ur%20Rehman%20Khan%20CS%202019%20cust%20isb%20prr.pdf

Added

2021-02-17 19:49:13

Modified

2024-03-24 20:25:49

ARI ID

1676727770879

Time is a paramount aspect of Information Retrieval (IR): it profoundly influences the interpretation of content as well as the user's intention and expectation. The temporal patterns in a document or collection of documents play a central role in the effectiveness of IR systems, and discerning them accurately is essential to satisfying the time-based intention of a user. The web holds a plethora of documents, most of which contain divergent temporal patterns. The assimilation of these temporal patterns into IR is referred to as Temporal Information Retrieval (TIR). An understanding of TIR systems is requisite to address the temporal intention of a user efficiently. For time-specific queries (i.e., queries about an event), a relevant document must relate to the time period of the event. To mitigate this problem, an IR system must determine (1) whether a document is temporally specific (i.e., focused on a single time period) and (2) the focus time of the document (the time to which its content refers). This thesis exploits the temporal features of news documents to improve the retrieval effectiveness of IR systems.

To the best of our knowledge, this thesis is the first study to focus on the problem of temporal specificity in news documents. It defines and evaluates novel approaches to determine the temporal specificity of news documents, and uses these approaches to classify news documents into three novel temporal classes. The study considers 24 implicit temporal features of news documents to classify them into (a) High Temporal Specificity (HTS), (b) Medium Temporal Specificity (MTS), and (c) Low Temporal Specificity (LTS) classes. Two classification approaches are proposed: rule-based and Temporal Specificity Score (TSS) based. The former classifies news documents using a proposed set of rules based on the temporal features; the latter classifies them by a TSS computed from the temporal features. The results of the proposed approaches are compared with four machine learning classification algorithms: Bayes Net, Support Vector Machine (SVM), Random Forest, and Decision Tree. The outcomes indicate that the proposed rule-based classifier outperforms the four algorithms by achieving 82% accuracy, whereas TSS classification achieves 77% accuracy.

In addition, to determine the focus time of news documents, the thesis considers their temporal nature. The type and structure of documents influence the performance of focus time detection methods. This thesis proposes different splitting methods that divide a news document into three logical sections, following the inverted pyramid news paradigm. These methods are the Paragraph Based Method (PBM), the Words Based Method (WBM), the Sentence Based Method (SBM), and the Semantic Based Method (SeBM). Temporal expressions in each section are assigned weights using a linear regression model. A scoring function then calculates a temporal score for each time expression appearing in the document, and the expressions are ranked by temporal score so that the most suitable expression appears at the top. Two evaluation measures are used to evaluate the performance of the proposed framework: (a) precision score (P@1, P@2) and (b) average error years. Precision at position 1 (P@1) and position 2 (P@2) represent correct estimation of the focus time at the top two positions of the ranked list of focus times, whereas average error years is the distance between the estimated year and the actual focus year of a news document. The effectiveness of the proposed method is evaluated on a diverse dataset of news related to popular events. The results reveal that the proposed splitting methods achieve an average error of less than 5.6 years, whereas SeBM achieves precision scores of 0.35 and 0.77 at positions 1 and 2, respectively.

The overall findings presented in this thesis demonstrate that the temporal insights of documents can be used to enhance the performance of IR systems. Time-aware information retrieval systems can adopt these findings to satisfy user expectations for temporal queries.
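The ranking and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The section names (`lead`, `body`, `tail`), the weight values, the additive scoring rule, and the toy documents are all hypothetical placeholders. The thesis fits its weights with a linear regression model and does not give its scoring function here. P@k is read as "the true focus year appears among the top k ranked years", which is an assumption about the abstract's wording.

```python
from collections import defaultdict

# Hypothetical section weights (placeholders, not values from the thesis).
SECTION_WEIGHTS = {"lead": 0.6, "body": 0.3, "tail": 0.1}


def rank_focus_years(mentions, weights=SECTION_WEIGHTS):
    """Rank candidate years by summed section weight, best first.

    mentions: list of (year, section) pairs, one per temporal expression.
    Ties are broken by the earlier year (an arbitrary choice for this sketch).
    """
    scores = defaultdict(float)
    for year, section in mentions:
        scores[year] += weights[section]
    return sorted(scores, key=lambda y: (-scores[y], y))


def precision_at_k(ranked_lists, true_years, k):
    """Fraction of documents whose true focus year is in the top k ranks."""
    hits = sum(
        1 for ranked, truth in zip(ranked_lists, true_years) if truth in ranked[:k]
    )
    return hits / len(true_years)


def average_error_years(ranked_lists, true_years):
    """Mean absolute distance between the top-ranked year and the true year."""
    errors = [abs(ranked[0] - truth) for ranked, truth in zip(ranked_lists, true_years)]
    return sum(errors) / len(errors)


# Two toy documents with made-up temporal expressions.
docs = [
    [(2001, "lead"), (1999, "body"), (2001, "tail")],
    [(1990, "body"), (1995, "lead"), (1990, "tail")],
]
truth = [2001, 1990]

ranked = [rank_focus_years(d) for d in docs]
p_at_1 = precision_at_k(ranked, truth, 1)
p_at_2 = precision_at_k(ranked, truth, 2)
avg_err = average_error_years(ranked, truth)
```

In this toy run the second document's true year is ranked second, so it counts toward P@2 but not P@1, and it contributes a five-year error to the average.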
Similar

They Will Go Roaming the Heavens, They Will Search the Void


They will go roaming the heavens, they will search the void
Some people will seek God through their own flight

These dwellers of the city of Saba will find nothing
They will search long for the Lord in the rays of the sun

Sometimes they will worship stones, sometimes fire
They will seek the reward of loyalty from the faithless

The light of Hira has already spread; even so
They will seek radiance in the fire of the valley of Sinai

The pages lie scattered by the turning of time
What will they seek in such altered manuscripts

They will go to the silent city and call out
They will go looking in deserts for the fish of the water

Pulling free of the fragrance of the spring breeze
They will seek a fragrant air in the raging gale

The caravan of sincerity will be plundered on the way
The highwaymen will surely rob those who go seeking good

Every traveler's guides will be left behind on the road
When they seek a path apart from your (peace be upon him) paths

Irfan, the light of knowing God is the destiny of those who
"will seek radiance from your (peace be upon him) pure life"

Official Responsibilities and the Requirements of the Rights of the Prophet in the Context of the Blessed Sirah

The rights of the Holy Prophet Muḥammad (P.B.U.H) have been studied from various perspectives. This paper reviews the extant research on the subject and identifies the duties of government officials from that perspective. It concludes that Prophet Muḥammad (P.B.U.H) is the greatest benefactor and humanitarian to mankind in the world. In this context, only those government officials can be considered true in their claim of love for Prophet Muḥammad (P.B.U.H) who adhere to his teachings concerning fitness for one's position, piety, responsibility, morality, and uprightness; who refrain from being footloose and profligate; who free themselves from the hunger for wealth and status; who critically evaluate their own deeds; and who keep in view the life hereafter and accountability. Moreover, those who uphold justice and avoid dishonesty and bias are true according to the teachings of Islam. Without such qualities and characteristics, a claim of love is mere deceit and forgery.

Reconstruction of Qualitative Gene Regulatory Networks

Genes provide instructions for the synthesis of functional products such as proteins. Gene expression builds these functional products from the instructions encoded in the genes. Gene regulation controls gene expression: it can increase or decrease expression, resulting in the synthesis of specific functional products, and it determines when a particular gene is expressed to produce a particular protein. A collection of regulatory elements, such as genes, together with interconnections showing gene expression levels, is visualized as a Gene Regulatory Network (GRN). GRNs act as a tool for understanding the causal relationships between genes and proteins that represent complex cellular functionalities. Computational biology now focuses largely on the reverse engineering, or reconstruction, of GRNs from gene expression data to decode the complex mechanisms of cellular functionality. These efforts have resulted in improved and more precise diagnostics and therapeutics. Microarray technology measures the expression of thousands of genes simultaneously under different conditions, such as control or disease conditions. It helps in identifying over-expressed genes likely to be associated with a disease.

Existing approaches to reconstructing GRNs from gene expression data apply various techniques, such as distance measures, correlations, mutual information algorithms, and dynamic and quantitative probabilities. These approaches identify symmetric and diagonal gene-pair interactions. Symmetric gene-pair interactions cannot be modeled as direct activation and inhibition interactions. Moreover, the diagonal nature implies that a gene cannot regulate itself, which contradicts the true nature of gene-pair regulatory interactions. Compromising the true asymmetric and non-diagonal nature of actual gene-pair regulatory interactions can lead to incomplete and inferior predictions. To our knowledge, no complete model exists that generates a GRN representing all possible network motifs between gene pairs, such as activation, inhibition, and self-regulation.

The proposed approach, named Multivariate Covariance Network (MCNet), reconstructs GRNs by applying multivariate covariance analysis and Principal Component Analysis (PCA) to identify asymmetric and non-diagonal gene interactions. A GRN developed using MCNet holds all the possible network motifs, representing all kinds of gene-pair regulatory interactions (i.e., positive and negative feedback loops as well as self-loops). The asymmetry is achieved by computing the distance measure of the genes with respect to the eigenvalues of the related genes showing variable behaviors under different conditions. PCA in MCNet selects gene-pair interactions showing maximum variance in gene regulatory expression. Asymmetric gene regulatory interactions help in identifying the controlling regulatory agents, thus lowering the false positive rate of interacting genes by minimizing the connections between previously unlinked network components.

The performance of MCNet has been evaluated using a real data set as well as three synthetic and gold standard data sets. MCNet predicts regulatory interactions with higher precision and accuracy than some current state-of-the-art approaches. On the real time-series RTX therapy data set, MCNet identified self-regulatory interactions of the differentially expressed (DE) genes with 80.6% accuracy. On the time-series synthetic Arabidopsis thaliana circadian clock data set, it predicted gene regulatory interactions with 90.3% accuracy. The self-regulatory interactions identified in the RTX therapy and synthetic Arabidopsis thaliana data sets were further verified from the literature, because gold standards are not available for these data sets. The gold standard DREAM-3 and DREAM-8 in silico data sets are also used to evaluate the proposed approach against some existing approaches. The DREAM-3 in silico E. coli gold standard data set does not contain any self-regulations, while the DREAM-8 in silico phosphoproteins gold standard data set does. The results demonstrate the enhanced performance of MCNet in predicting self-regulations, on the DREAM-8 in silico phosphoproteins data set only, with 75.8% accuracy.

The MCNet approach to reconstructing GRNs identifies direct activation and inhibition interactions as well as self-regulatory interactions from microarray gene expression data sets. The generated GRN can contain positive and negative feedback loops as well as self-loops, demonstrating the true nature of gene-pair regulatory interactions. Future work aims to enhance MCNet by modeling the dynamics of GRNs, such as oscillations and bifurcations towards steady state.
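The limitation that the abstract attributes to correlation-style measures can be seen in a small Python sketch. This is not MCNet. It only computes a plain Pearson correlation matrix over hypothetical expression profiles (the gene names and values are made up) to show that such a matrix is symmetric and has a constant diagonal. It therefore cannot say which gene of a pair regulates the other, and it carries no information about self-regulation.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def correlation_matrix(profiles):
    """Nested dict: matrix[a][b] is the correlation of genes a and b."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names} for a in names}


# Hypothetical expression values across five conditions.
profiles = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 5, 4, 6],
    "geneC": [5, 3, 4, 1, 2],
}

matrix = correlation_matrix(profiles)

# The A->B entry always equals the B->A entry, so direction is lost.
symmetric = all(
    matrix[a][b] == matrix[b][a] for a in profiles for b in profiles
)
# Every diagonal entry is 1 (up to rounding), whatever the gene does to itself.
diagonal_is_one = all(abs(matrix[g][g] - 1.0) < 1e-9 for g in profiles)
```

Both checks hold for any input with non-constant profiles, which is why a method that aims to recover directed edges and self-loops needs something beyond a pairwise correlation score.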