Date of Award
2006
Degree Type
Dissertation
First Advisor
Scott J. Lloyd
Abstract
This dissertation comprises three related manuscripts. The first reviews the data mining literature in six areas: machine learning, rule induction, neural networks, case-based reasoning, genetic algorithms, and rough sets. For each area, it provides a brief description of the topic, examples of applications, current research, and directions for future research. The second manuscript presents an information criterion for choosing between decision trees exhibiting different characteristics of accuracy and complexity. The criterion allows decision-makers to choose between decision tree model subsets based on their preference for parsimony and their individual problem domain. The second manuscript also presents a metric to quantify opportunity losses between decision trees, thereby providing quantitative data to better enable decision-making. Together, the proposed decision tree information criterion and opportunity loss measure provide decision support for managerial decision-making. The third manuscript details an implementation of the decision tree information criterion and opportunity loss measure developed in the second manuscript. It outlines the construction of a program to automate the discretization process and decision tree analysis. The program analyzes a dataset containing insurance company call center statistics, and the results confirm that the measures developed in the second manuscript perform as predicted. Implications for managerial decision-making are then discussed.
Recommended Citation
Kyper, Eric, "An information criterion for use in predictive data mining" (2006). Open Access Dissertations. Paper 2095.
https://digitalcommons.uri.edu/oa_diss/2095
Terms of Use
All rights reserved under copyright.