Document Clustering Based on Semantic Notions

Thesis Info

Access Option

External Link

Author

Rafi, Muhammad

Program

PhD

Institute

National University of Computer and Emerging Sciences

City

Islamabad

Province

Islamabad

Country

Pakistan

Thesis Completion Year

2017

Thesis Completion Status

Completed

Subject

Computer Science

Language

English

Link

http://prr.hec.gov.pk/jspui/bitstream/123456789/9739/1/muhammad.rafi.phd.thesis.pdf

Added

2021-02-17 19:49:13

Modified

2024-03-24 20:25:49

ARI ID

1676727739904


The exponential growth of electronic documents, in both proprietary and public information systems, poses new challenges in finding relevant information in these large repositories. Document clustering is a specialized technique that has found its niche in effectively browsing, filtering, managing and summarizing such collections. The document clustering process has three distinct steps: (i) document representation, (ii) computation of pair-wise document similarity, and (iii) application of a clustering algorithm. Document clustering methods are very sensitive to the document representation scheme. Conventionally, document representations are based on extracting simple features, such as terms, n-grams, frequent words or sequences, that can be used as meta-descriptors for the documents. These features reduce the dimensionality of the problem but fail to capture the semantics of the text in the transformed, compact representation, and they ignore the order of and relationships among words and features. Documents written in human languages generally have a context, and the use of words depends mainly on that context. Motivated by this, a novel document representation scheme is proposed that first extracts lexical chains from the documents and then applies a topic map structure to those chains. The scheme takes advantage of the lexical cohesion structure along with topic map relationships to obtain a semantics-based representation of a document. Topic Maps (TM) is an international standard for the codification of knowledge. A good similarity measure is also essential for the clustering task. The similarity function should make use of the semantic relationships among features (lexical topics) to provide a viable clue to the relatedness of any pair of documents. A similarity function based on lexical chain similarity and on frequent common tree patterns extracted from the topic maps of documents is defined.
Because these patterns are hierarchical lexical topics of differing granularity, they inherently capture semantics in the similarity calculation. An extensive set of experiments is performed on four publicly available document datasets. Evaluation measures such as F-score, purity and entropy show that the proposed approach outperforms traditional document clustering approaches.
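The three steps named in the abstract, and two of its evaluation measures, can be illustrated with a minimal sketch. This is not the thesis's method: it uses the conventional bag-of-words representation that the abstract argues against, cosine similarity, and a simple threshold-based single-link grouping, all chosen here only for illustration. The function names, the threshold value and the unnormalized form of entropy are assumptions.

```python
import math
from collections import Counter


def represent(doc):
    """Step (i): bag-of-words term counts (conventional baseline, not lexical chains)."""
    return Counter(doc.lower().split())


def cosine(a, b):
    """Step (ii): cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Step (iii): single-link grouping; the threshold is an illustrative assumption."""
    vectors = [represent(d) for d in docs]
    labels = list(range(len(docs)))  # every document starts in its own cluster
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                old, new = labels[j], labels[i]
                # merge the two clusters by relabelling every member of the old one
                labels = [new if lab == old else lab for lab in labels]
    return labels


def purity(labels, truth):
    """Share of documents that belong to the majority true class of their cluster."""
    correct = 0
    for c in set(labels):
        members = [t for lab, t in zip(labels, truth) if lab == c]
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(labels)


def entropy(labels, truth):
    """Size-weighted average of per-cluster class entropy in bits (unnormalized)."""
    total = 0.0
    for c in set(labels):
        members = [t for lab, t in zip(labels, truth) if lab == c]
        n = len(members)
        h = -sum((k / n) * math.log2(k / n) for k in Counter(members).values())
        total += (n / len(labels)) * h
    return total
```

Higher purity and lower entropy indicate clusters that agree more closely with the reference classes. A semantic representation such as the one the thesis proposes would replace `represent` and `cosine` while leaving the evaluation functions unchanged.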

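The Guide of Jinn and Humankind (PBUH): Perfection in Form and Character


The guide of jinn and humankind (PBUH) is perfection in form and character
On the highway of life, his Sunnah is perfection

The sky's loftiness is lofty in name only
The grandeur and eminence of the one who split the moon (PBUH) are perfection

On the night of Isra, he alone stood as the imam of the prophets
Even among the multitude of the elect, his mark of uniqueness is perfection

Allah Himself summoned him to the Throne and comforted him
The love his Beloved bears for him is perfection

It brings comfort to heart and soul, to body and life
In remembrance of the Mercy to both worlds (PBUH), the solace is perfection

The wealth of beholding God's beloved (PBUH) is the summit of sight
The circle of the Companions holds this wealth in perfection

The perfect guide (PBUH) came and brought the religion to completion
The blessing of knowing the Lord was granted in perfection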

The Sources of the Qur'anic Narratives of Maryam (peace be upon her): A Critical Review of Orientalist Trends

Sources of Quranic Narrates of Syda Mariam (AS): A Critical Analysis of Orientalistic Approach. The origin of the Qur'ānic narratives about the Virgin Mariam (AS) has been widely debated in Orientalist studies. Orientalists have generally held that the Qur'ānic stories of the Virgin Mariam were not divine revelation but were plagiarized from Christian apocryphal sources and literature, such as the Arabic Gospel of the Infancy and the Protoevangelium of James. This paper examines two Orientalist claims: that the Qur'ānic stories of Mariam (AS) were plagiarized from apocryphal Christian literature, and that the Qur'an is mistaken about the names of Mariam (AS)'s father and brother. The research is historical and textual. It concludes that a sincere study of the dating of the apocryphal literature refutes the claim that the Qur'ānic narratives about the Virgin Mariam (AS) were drawn from Christian apocryphal sources, and it argues that Christians developed their apocryphal literature after the revelation of the Qur'an.

Extremism Tendencies, Personality Traits, Social Axioms, and Gender Role Beliefs

The present research aimed to assess extremism tendencies, personality traits, social axioms and gender role beliefs among graduating young adults. The research was completed in three independent studies. Study I aimed at the translation and cross-language validation of the Social Axioms Survey Scale (Leung et al., 2002) into Urdu. Study II, the pilot study, assessed the psychometric properties of the study variables and general trends in the data on a sample of 210 young adults. Results showed that the Urdu version of the Social Axioms Survey Scale, the Urdu version of the Gender Role Beliefs Scale (Khan, 2006), the Urdu version of the NEO PI-R (Chishti, 2002), and the Extremism Scale (Altaf, 2002) were internally consistent and could be used in the study. Study III, the main study, was carried out to achieve the overall objectives of the research. The sample (N = 1000) consisted of young adults with an age range of 18-24 years and a mean age of 21.40 years. Alpha reliability coefficients were established on a large data set of adults for the Urdu versions of the Social Axioms Survey Scale (α = .81 - .92), the Gender Role Beliefs Scale (α = .90), the NEO PI-R (α = .87 - .92), and the Extremism Scale (α = .76 - .88). The factorial structure of the study instruments was validated with first- and second-order confirmatory factor analyses. All the indices of model fit (GFI, AGFI, CFI, NFI) indicated a good fit for the Urdu versions of the Social Axioms Survey Scale (.90 - .96), the Gender Role Beliefs Scale (.95 - .97), the NEO PI-R (.93 - .96), and the Extremism Scale (.92 - .98), with acceptable factor loadings. Norms for the domain scales (neuroticism, extraversion, openness, agreeableness, and conscientiousness) of the Urdu version of the NEO PI-R (Chishti, 2002), based on data from adults in Pakistan, were reported as percentiles, Z scores and T scores. Results showed that an individual with a raw score of 120 on the extraversion domain has a percentile score of 3 in the present study, whereas the same raw score corresponds to a percentile score of 69 for an English respondent.
These findings supported the idea of having local norms for the Urdu version of the NEO PI-R. The effects of the personality domain scales on the subscales of extremism tendencies were explored, and it was found that neuroticism has a negative impact on submission to authority and that agreeableness has a negative impact on hostility/intolerance and rigidity. Among the social axioms subscales, social flexibility has a negative impact on submission to authority, fate control has a positive effect on rigidity, and religiosity has a significant positive impact on power and toughness. Gender role beliefs have no direct impact on extremism tendencies. Finally, the mediating role of gender role beliefs and social axioms in the relationship between the personality domain scales and extremism tendencies was tested through model fit indices. Results partially supported the mediating role of both variables. Gender role beliefs fully mediated the relationship between extraversion and power and toughness. Multivariate analyses revealed significant differences in hostility/intolerance, with men scoring significantly higher on average than women. Adults with high income were high in intolerance, while those with low income were high in submission to authority. Adults with a high level of education hold less traditional gender role beliefs than adults with a low level of education. Overall, the findings highlight the role of gender, age, monthly income, level of education, neuroticism, openness, agreeableness, social axioms, and gender role beliefs in predicting extremism tendencies.