Document Clustering Based on Semantic Notions

Thesis Info

Access Option

External Link

Author

Rafi, Muhammad

Program

PhD

Institute

National University of Computer and Emerging Sciences

City

Islamabad

Province

Islamabad

Country

Pakistan

Thesis Completing Year

2017

Thesis Completion Status

Completed

Subject

Computer Science

Language

English

Link

http://prr.hec.gov.pk/jspui/bitstream/123456789/9739/1/muhammad.rafi.phd.thesis.pdf

Added

2021-02-17 19:49:13

Modified

2024-03-24 20:25:49

ARI ID

1676727739904

The exponential growth of electronic documents, in both proprietary and public information systems, poses new challenges in finding relevant information in these large repositories. Document clustering is a specialized technique that has found its niche in effectively browsing, filtering, managing and summarizing these collections. The document clustering process has three distinct steps: (i) document representation, (ii) computation of pair-wise document similarity, and (iii) application of a clustering algorithm. Document clustering methods are very sensitive to the document representation scheme. Conventionally, document representations are based on extracting simple features, such as terms, n-grams, frequent words or sequences, that can be used as meta-descriptors for the documents. These features reduce the dimensionality of the problem but fail to capture the semantics of the text in the transformed compact representation, and they ignore the order of and relationships among words and features. Documents written in human languages generally carry a context, and the use of words depends mainly on that context. Motivated by this, a novel document representation scheme is proposed that first extracts lexical chains from the documents and then exploits the topic map structure for those lexical chains. The scheme takes advantage of the lexical cohesion structure along with topic map relationships to obtain a semantics-based representation of a document. Topic Maps (TM) is an international standard for the codification of knowledge. Moreover, a good similarity measure is essential for the clustering task. The similarity function should make use of the semantic relationships among features (lexical topics) to provide a viable clue to the relatedness of any pair of documents. A similarity function is defined that is based on lexical chain similarity and on frequent common tree patterns extracted from the topic maps of the documents. These patterns (hierarchical lexical topics of different granularity) inherently capture semantics in the similarity calculation. An extensive set of experiments on four publicly available document datasets is performed. Evaluation measures such as F-score, purity and entropy show that the proposed approach is better than traditional document clustering approaches.
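
The abstract names F-score, purity and entropy as evaluation measures but does not define them. The following is a minimal sketch, assuming the definitions commonly used for document clustering (the thesis may weight or normalize them differently). The function names and the toy labels are hypothetical and serve only as illustration.

```python
from collections import Counter, defaultdict
from math import log2


def contingency(true_labels, cluster_ids):
    """Map each cluster id to a Counter of the gold classes it contains."""
    table = defaultdict(Counter)
    for cls, cid in zip(true_labels, cluster_ids):
        table[cid][cls] += 1
    return table


def purity(true_labels, cluster_ids):
    """Fraction of documents that belong to the majority class of their cluster."""
    table = contingency(true_labels, cluster_ids)
    return sum(max(c.values()) for c in table.values()) / len(true_labels)


def entropy(true_labels, cluster_ids):
    """Size-weighted average of the class entropy (in bits) inside each cluster."""
    table = contingency(true_labels, cluster_ids)
    n = len(true_labels)
    total = 0.0
    for counts in table.values():
        size = sum(counts.values())
        h = -sum((k / size) * log2(k / size) for k in counts.values())
        total += (size / n) * h
    return total


def f_score(true_labels, cluster_ids):
    """Class-size-weighted best F1 of each gold class over all clusters."""
    table = contingency(true_labels, cluster_ids)
    n = len(true_labels)
    total = 0.0
    for cls, cls_size in Counter(true_labels).items():
        best = 0.0
        for counts in table.values():
            hit = counts.get(cls, 0)
            if hit == 0:
                continue
            precision = hit / sum(counts.values())
            recall = hit / cls_size
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (cls_size / n) * best
    return total


# Toy example (assumed data): six documents, two gold classes, two clusters.
gold = ["a", "a", "a", "b", "b", "b"]
clusters = [0, 0, 1, 1, 1, 1]
print(purity(gold, clusters), entropy(gold, clusters), f_score(gold, clusters))
```

For the toy labels, five of the six documents sit in a cluster dominated by their own class, so purity is 5/6. Higher purity and F-score and lower entropy indicate a clustering that agrees more closely with the gold classes.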
Similar

When Pain Crosses Its Limit, It Becomes the Cure (درد کا حد سے گزرنا ہے دوا ہو جانا)

We praise Him and send blessings upon His noble Messenger. To proceed: I seek refuge in Allah from Satan the accursed.
In the name of Allah, the Most Gracious, the Most Merciful.
Respected teachers and my fellow students! Peace be upon you! The topic I am to speak on today is: "When pain crosses its limit, it becomes the cure."
Mr. President!
Sorrow, pain and suffering are synonymous words. At some point in life every person encounters grief and sorrow, pain and suffering. Joys turn away, clouds of grief and anguish gather and begin to pour down, and the moon that spread the moonlight of glad tidings and happiness over the courtyard of the home is eclipsed.
Honorable President!
A person keeps meeting the ups and downs of life. The cycle of excess and deficiency goes on. The successful person is the one who remains steadfast in such circumstances, whose resolve does not falter amid these vicissitudes, and so the vehicle of his life keeps moving. In the words of Ghalib:
When a person grows accustomed to sorrow, sorrow fades away;
so many hardships befell me that they became easy.
Respected President!
When something exceeds its limit and reaches its extreme, its existence vanishes, its status changes, and the notion of its benefit and harm is transformed. Its harmful effects become corrective, it changes completely, and with the passing of days it becomes a thing of the past.
Respected audience!
When night reaches its extreme, it meets the life-giving gusts of the morning breeze. When day touches its heights, the cool moonlight becomes a source of freshness for heart and mind...

The Hijab Issue: French Muslim Women and Islamic Teachings

Human history is replete with preposterous and unjustifiable incidents of unearned suffering inflicted on women. Sometimes they were maltreated and harshly molested, and sometimes they were abused and bestially persecuted. At other times, by contrast, they were considered superior and super-angelic. Islam, however, has bestowed on them a dignified status with regard to their rights and responsibilities. In this regard Islam introduced a comprehensive manifesto, and as long as Muslims followed it, up to the height of Islamic rule, no woman in Islamic societies lodged a single complaint about the violation of her basic rights. Today, however, some European countries are holding discussions on imposing illegal sanctions against the veil worn by women and girls. The parliament of France has approved a discriminatory law against the veil of Muslim women and girls. It is striking that a Christian nun is at liberty to cover her head with a scarf or not, whereas Muslim women who consider themselves safe in the veil are scorned with derision and disdain. This article discusses the views of France and Islamic teachings on the matter.

Studies on Amylases from Locally Isolated Strains of Aspergillus and Their Characterization

Microbial amylases are as important in industrial processes as proteases. Among the microbes, fungi are gaining repute for the production of amylases. Keeping this in view, the present study was carried out to isolate, identify and characterize indigenous fungal strains and to explore their biotechnological applications. The study began by reviving fungal cultures from the stock collection in our lab, and six more fungi were isolated from contaminated starch-agar plates. The isolates, identified on the basis of cultural and morphological characteristics, belonged to the genera Aspergillus, Penicillium and Rhizopus. Preliminary screening was performed by the starch-agar plate method with minor modification. Amylase production from the fungal isolates was also carried out under submerged fermentation conditions using mineral-salt media supplemented with starch, and amylase production was quantitatively evaluated. Based on these quantitative results, 4 fungal isolates showing high amylase productivity (IU/ml) were selected for further studies. The amylases from these isolates were characterized on the basis of their activities at high temperatures, and 2 fungal strains, A. tubingensis SY 1 and A. niger MS 101, showing activities at 60 °C and 64 °C, respectively, were selected. Afterwards, the conditions for optimum production of amylases from A. tubingensis SY 1 and A. niger MS 101 were worked out. The fungal strains showed optimum amylase production at 30 °C with an initial pH of 5.9. Among the carbon sources, starch, glucose and maltose gave higher amylase production, along with the organic nitrogen source peptone. Amylase production was also optimized using a Plackett-Burman statistical design, and the results revealed peptone as the most important factor responsible for higher amylase titers.
The optimum pH for amylase activity was determined, along with the optimum substrate concentration and the effect of various metal ions and enzyme modulators. A pH of 5.6 was optimum for amylase activity from both fungal strains, while a starch concentration of 0.5% was found to be optimum for the enzyme-substrate reaction. Mn²⁺, K⁺ and NH₄⁺ ions enhanced amylase activities, while urea crystals and EDTA slightly inhibited the amylase activities of both fungal strains. Studies on solid-state fermentation (SSF) and submerged fermentation (SmF) for amylase production were also performed using a variety of natural substrates, including 2 halophytic plants, and the results were compared. Whenever crude natural carbon substrates were used, whether under solid-state or submerged fermentation conditions, the concentrations of other enzymes, such as xylanase, pectinase and the cellulase enzyme system (β-glucosidase, endoglucanase, filter paper assay), were determined along with the quantitative determination of amylase. Potato peels were found to be the most suitable substrate for amylase production by both fungal strains under SmF and SSF conditions. The Tm of amylase from A. niger MS 101 was 65 °C and that from A. tubingensis SY 1 was 67 °C, while the Ea values were 73.64 kJ/mol and 46.07 kJ/mol for the A. niger MS 101 and A. tubingensis SY 1 amylases, respectively. Because of the high Tm values and low activation energies (Ea), the industrial potential of the amylases was assessed. For this purpose, starch-sized fabric was treated with the fungal amylases at different temperatures for different time intervals to determine the de-sizing efficiency of the amylases. De-sizing of the fabric by A. niger MS 101 amylase resulted in a TEGEWA rating of 8, while with A. tubingensis SY 1 amylase a TEGEWA rating of 9 was observed at 54 °C in 12 h. The results are promising for the use of these amylases in de-sizing.
Co-culture studies for bioethanol production were carried out on potato peels under SmF and SSF conditions. When the fermentation medium was simultaneously inoculated with the fungal and yeast strains, ~4 g/kg and 6 g/kg ethanol was produced in 120 h of incubation at 30 °C. The yeast Pichia kudriavzevii SY 11 was also able to produce an almost similar amount of ethanol under SmF of potato peels, indicating no contribution of fungal amylase to bioethanol production. However, when co-culture studies were carried out on purified starch, 7- to 12-fold more ethanol production was noted (12 and 28 g/kg) compared to potato peels (1 and 4 g/kg). The amylases were subjected to purification using two techniques: affinity and gel-filtration chromatography. No fruitful results were obtained by affinity chromatography, while with the gel-filtration technique a band of ~116 kDa was observed for A. tubingensis SY 1 amylase.