The Influence of Preprocessing Parameters on Text Categorization
Creators
Description
Text categorization (the assignment of natural-language texts to predefined categories) is an important and extensively studied problem in machine learning. Popular current techniques for this task combine many preprocessing and learning algorithms, many of which in turn require tuning nontrivial internal parameters. Although partial studies are available, many authors fail to report the parameter values they use in their experiments, or the reasons why these values were chosen over others. The goal of this work is therefore to provide a more thorough comparison of preprocessing parameters and their mutual influence, and to report interesting observations and results.
Files
6976.pdf
Name | Size |
---|---|
6976.pdf (md5:f250a1ba1331854e71f84c29a6911c0d) | 1.4 MB |