Authorship Attribution for Urdu Newspapers Columns Using Text Mining Techniques

Thesis Info

Access Option

External Link

Author

Waheed Anwar

Program

PhD

Institute

COMSATS University Islamabad

City

Islamabad

Province

Islamabad

Country

Pakistan

Thesis Completing Year

2019

Thesis Completion Status

Completed

Subject

Computer Science

Language

English

Link

http://prr.hec.gov.pk/jspui/bitstream/123456789/12355/1/Waheed%20Anwar%20Computer%20Sci%202019%20iub%20prr.pdf

Added

2021-02-17 19:49:13

Modified

2024-03-24 20:25:49

ARI ID

1676727707210

With the emergence of big data analytics over the last decade, the importance of analyzing semi-structured and unstructured data (such as text) has also been highlighted. Since text (such as customer reviews, newspaper articles, etc.) contains significant business information, text analytics has become more significant for predicting, inferring, or analysing information that adds value to a business. In this research, we present a unified approach for intelligent association analysis of text, that is, determining how strongly a piece of text is related to a customer or a person. In this dissertation, an approach is presented for authorship attribution in Urdu text using an LDA model with word n-grams of the authors' texts and improved sqrt-cosine similarity, for the sake of forensic analysis. The proposed approach uses word n-grams to identify various learned representations of stylometric features and uses them to identify the writing style of a particular author. The LDA-based approach emphasizes instance-based and profile-based classification of an author’s text. Here, LDA suitably handles high-dimensional and sparse data by allowing a more expressive representation of text. The presented approach is an unsupervised computational methodology that can handle the heterogeneity of the dataset, the diversity in writing styles of authors, and the inherent ambiguity of the Urdu language. A large corpus has been collected for performance testing of the presented approach. The results of experiments show the superiority of the proposed approach over state-of-the-art representations and other algorithms used for authorship attribution.
The contributions of the presented work are: the use of improved sqrt-cosine similarity with LDA topics to measure similarity between vectors of text documents for forensic analysis; the construction of a large dataset of 6000 newspaper-column documents; and the achievement of 92% results on Urdu columns with fifteen authors and 78.57% results on the PAN12 English dataset with fourteen authors, without using any labels for the authorship attribution task.
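
The abstract does not give the formula for the similarity measure, so the following is only a minimal sketch. It assumes one commonly published form of improved sqrt-cosine similarity for non-negative vectors, ISC(x, y) = sum(sqrt(x_i * y_i)) / (sqrt(sum(x)) * sqrt(sum(y))), and applies it to profile-based attribution over topic distributions. The function names, the author labels, and the topic vectors are hypothetical illustrations, not values or code from the thesis; in practice the vectors would come from an LDA model fitted on word n-grams.

```python
import math


def improved_sqrt_cosine(x, y):
    """Assumed form: sum(sqrt(x_i*y_i)) / (sqrt(sum(x)) * sqrt(sum(y))).

    Both inputs must be non-negative vectors of equal length,
    for example LDA topic distributions.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("vectors must be non-negative")
    denominator = math.sqrt(sum(x)) * math.sqrt(sum(y))
    if denominator == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    numerator = sum(math.sqrt(a * b) for a, b in zip(x, y))
    return numerator / denominator


def attribute(doc_topics, author_profiles):
    """Return the author whose profile vector is most similar to the document."""
    return max(
        author_profiles,
        key=lambda author: improved_sqrt_cosine(doc_topics, author_profiles[author]),
    )


# Hypothetical topic distributions over three topics (illustrative only).
profiles = {
    "author_a": [0.7, 0.2, 0.1],
    "author_b": [0.1, 0.2, 0.7],
}
unknown_doc = [0.6, 0.3, 0.1]
predicted = attribute(unknown_doc, profiles)
```

In this sketch each author is represented by a single profile vector (the profile-based setting); an instance-based variant would instead compare the unknown document against the topic vector of each individual training document.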

Similar Thesis

Title	Author	Supervisor	Degree	Institute
Authorship Attribution for Urdu Newspapers Columns Using Text Mining Techniques	Waheed Anwar		PhD	COMSATS University Islamabad, Islamabad, Pakistan
Online Urdu Handwritten Text Recognition for Mobile Devices Using Intelligent Techniques	Fareeha Anwar		PhD	International Islamic University, Islamabad, Pakistan
Uncovering Interesting Relationships in Text Using Mining	Farrah Shafique		MSc	Quaid-i-Azam University, Islamabad, Pakistan
Text mining using statistical and hierarchical clustering	Rizwan Ahmed Rathore, Naeem Ashfaq		MS	International Islamic University, Islamabad, Pakistan
Generic Urdu Nlp Framework for Urdu Text Analysis: Hybridization of Heuristics and Machine Learning Techniques	Khan, Wahab	Ali Daud	PhD	International Islamic University, Islamabad, Pakistan
Outlier Detection Using Data Mining Techniques	Rubina Adnan	Manzoor Illahi Tamimy	MS	COMSATS University Islamabad, Islamabad, Pakistan
Using Text Processing Techniques for Linking News Stories for Digital Preservation	Khan, Muzammil		PhD	Preston University, Kohat, Pakistan
Text and image cryptography using various techniques	Hina Saeed	Humaira Ashraf	BS	International Islamic University, Islamabad, Pakistan
Offline Urdu Nastaliq OCR for Printed Text Using Analytical Approach	Danish Altaf Satti		Mphil	Quaid-i-Azam University, Islamabad, Pakistan
Urdu Text Editor	Hassan Raza	Ahmed Salman	MS	COMSATS University Islamabad, Islamabad, Pakistan
Predicting Stock Prices Using Data Mining Techniques [Ms Finance]	Muhammad Arsal Burhan		MS	University of Management and Technology, Lahore, Pakistan
Medical Blogs Text Mining: An Efficient Methodology for Knowledge Identification & Sharing	Mahmood, Sajid		PhD	University of Engineering and Technology, Lahore, Pakistan
Urdu Text Editor [Bcs Programme]	Imran Munir; Khurram Rafique; Sheheryaar Sheikh	Taimoor Tanveer	BSc	University of Management and Technology, Lahore, Pakistan
Urdu Text Editor [Bcs Programme]	Imran Munir; Khurram Rafique; Sheheryaar Sheikh	Taimoor Tanveer	BSc	University of Management and Technology, Lahore, Pakistan
Feature Selection for Agile Development Through Data Mining Techniques	Waqas Jawaid			Virtual University of Pakistan, Lahore, Pakistan
Evaluation Visualization Techniques for Data Mining Results on Mobile Devices	Muzammil Khan		Mphil	Quaid-i-Azam University, Islamabad, Pakistan
The Classification of Multispectral and Statistical Texture Data Using Data Mining Techniques	Qadri, Salman		PhD	The Islamia University of Bahawalpur, Bahawalpur, Pakistan
Exploration of Research Issues from Text of Certain Field Domain : Proposed Text Mining Framework	Hameed Hussain	Maqbool Uddin Shaikh	MS	COMSATS University Islamabad, Islamabad, Pakistan
An Evaluation Based Study on Performance Predictions of Schools Using Data Mining Techniques	Zahida Parveen			Riphah International University, Faisalabad, Pakistan
Descriptive Analysis of Pakistani Crime Data Using Data Mining Techniques [Ms Software Engineering]	Faria Ferooz		MS	University of Management and Technology, Lahore, Pakistan