Document Type

Research Paper

College Affiliation

College of Science and Engineering

Department or Program Affiliation

Computer Science Department

Abstract

The goal of text mining, simply put, is to derive information from text. Drawing on technologies from overlapping fields such as Data Mining and Natural Language Processing (NLP), we can extract knowledge from text and facilitate further processing. Information Extraction (IE) plays a large part in text mining whenever this data must be extracted. In this survey we cover general methods borrowed from other fields, lower-level NLP techniques, IE methods, text representation models, categorization techniques, and specific implementations of some of these methods. Finally, with this understanding of the field, we discuss a proposal for a system that combines WordNet, Wikipedia, and definitions and concepts extracted from web pages into a user-friendly search engine designed for topic-specific knowledge.

Subject – LCSH

Data mining, Text processing (Computer science), Natural language processing (Computer science)

Genre/Form

Term papers

Publisher

Western Washington University

Date

2008

Rights

Copying of this document in whole or in part is allowable only for scholarly purposes. It is understood, however, that any copying or publication of this document for commercial purposes, or for financial gain, shall not be allowed without the author’s written permission.

Type

Text

Format

application/pdf

Language

English

Language Code

eng
