





Semi-Automatic Domain Ontology Construction for Tamil Documents
An ontology is an explicit specification of a conceptualization; that is, a description of the concepts and relationships that can exist for an agent or a community of agents. Ontology construction is a challenging task, and this paper presents a new technique for the semi-automatic construction of an ontology. The technique involves two modules: ontological word selection and semantic relationship extraction. Ontological nodes and semantically related words are selected from a Tamil text corpus. The input to the system is a set of Tamil text documents. Each document is word-segmented and then morphologically analyzed to determine parts of speech, because ontological words are assumed to be nouns. The noun list is then narrowed using the TF-IDF technique. Semantically related words are identified based on the notion of serial clustering of words in text, exploring the value of such clustering as an indicator of whether a word bears content. This approach is flexible in the sense that it is sensitive to context: a term may be assessed as content-bearing within one collection but not in another. In this way, a domain ontology is constructed semi-automatically for Tamil text documents.
Keywords
Ontology, Semi-Automatic Ontology, Semantic Relationship Extraction, Content Bearing Words, TF-IDF, Morphological Analysis, Clustering.
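
The noun-list narrowing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each document has already been word-segmented and reduced to its nouns by a morphological analyzer, which is not shown here. The function names `tfidf_scores` and `select_candidates`, the `top_k` cutoff, and the placeholder tokens are hypothetical, and the weighting used (relative term frequency times the natural log of inverse document frequency) is one common TF-IDF variant, not necessarily the one used in the paper.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Compute TF-IDF per document.

    docs: list of documents, each a list of noun tokens (assumed to come
    from segmentation and morphological analysis, not shown here).
    Returns one {term: score} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        if total == 0:
            scores.append({})
            continue
        # Relative term frequency times log inverse document frequency.
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


def select_candidates(docs, top_k=2):
    """Keep the top_k highest-scoring nouns of each document (hypothetical cutoff)."""
    selected = []
    for doc_scores in tfidf_scores(docs):
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
        selected.append([term for term, _ in ranked[:top_k]])
    return selected


# Placeholder tokens standing in for Tamil nouns.
docs = [
    ["rice", "field", "rice", "water"],
    ["water", "bank", "loan"],
    ["water", "field", "crop"],
]
print(select_candidates(docs, top_k=2))
```

In this toy corpus the token that appears in every document receives a score of zero, which is the behavior the narrowing step relies on: nouns spread evenly across the collection are poor candidates for domain-specific ontology nodes.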