Isolating Natural Problem Environments in Unconstrained Natural Language Processing: Corruption and Skew

Journal Title: Transactions on Machine Learning and Artificial Intelligence - Year 2017, Vol 5, Issue 3

Abstract

This work examines the behaviors of a full range of commonly available natural language processors in a natural, unconstrained, and unguided environment. While it is permissible for typical research to constrain the language environment and to use in-depth knowledge to guide the processor toward higher accuracy, this work purposefully avoids a clean laboratory in favor of a natural, chaotic, and uncontrollable environment. This shifts the focus toward how processors behave naturally in unknown environments. The work provides a standardized framework for comparing the theoretical strengths of each processor. It then examines empirical behaviors across a range of environments: commonly used baseline sample documents, actual raw natural texts used in an intent marketing business, and a series of increasingly corrupted and inconsistent sample documents that further differentiate processor behaviors. In all cases, the texts are unconstrained and the processors operate in their most naïve, default forms. The results complement and extend prior work. They add that accuracy-centric processors such as artificial neural networks and support vector machines require both highly constrained environments and in-depth knowledge of the processor to operate. Descriptive-centric processors such as k-nearest neighbors, Rocchio, and naïve Bayes require only highly constrained environments. An explanatory-centric neurocognitive processor such as Adaptive Resonance Theory can operate robustly with neither environmental constraints nor in-depth processing knowledge, but exposes operations to basic human temporal neurocognitive behaviors.

Authors and Affiliations

Charles Wong

Keywords

Natural language processing; skew; corruption; natural behavior; neural networks; intentigeria

EP ID EP275519
DOI 10.14738/tmlai.53.3229