College Affiliation

College of Science and Engineering

Document Type

Research Paper

Publication Date

2008

Department or Program Affiliation

Computer Science Department

Keywords

text mining, data mining, information extraction, natural language processing, knowledge discovery

Abstract

Text mining’s goal, simply put, is to derive information from text. Using a multitude of technologies from overlapping fields such as Data Mining and Natural Language Processing, we can extract knowledge from text and facilitate further processing. Information Extraction (IE) plays a large part in text mining when we need to extract this data. In this survey we concern ourselves with general methods borrowed from other fields, with lower-level NLP techniques, IE methods, text representation models, and categorization techniques, and with specific implementations of some of these methods. Finally, with our new understanding of the field, we can discuss a proposal for a system that combines WordNet, Wikipedia, and extracted definitions and concepts from web pages into a user-friendly search engine designed for topic-specific knowledge.

Subject – LCSH

Data mining, Text processing (Computer science), Natural language processing (Computer science)

Publisher

Western Washington University

Genre/Form

term papers

Type

Text

Rights

Copying of this document in whole or in part is allowable only for scholarly purposes. It is understood, however, that any copying or publication of this document for commercial purposes, or for financial gain, shall not be allowed without the author’s written permission.

Language

English

Format

application/pdf
