public class JCHAID
extends weka.classifiers.trees.J48
implements weka.core.OptionHandler, weka.core.Drawable, weka.core.Matchable, weka.classifiers.Sourcable, weka.core.Summarizable, weka.core.AdditionalMeasureProducer, weka.core.TechnicalInformationHandler, weka.core.PartitionGenerator
@article{Kass1980,
  author = "G. V. Kass",
  title = "An Exploratory Technique for Investigating Large Quantities of Categorical Data",
  journal = "Journal of the Royal Statistical Society. Series C (Applied Statistics)",
  year = "1980",
  volume = "29 (2)",
  pages = "119-127",
  abstract = "The technique set out in the paper, CHAID, is an offshoot of AID (Automatic Interaction Detection) designed for a categorized dependent variable. Some important modifications which are relevant to standard AID include: built-in significance testing with the consequence of using the most significant predictor (rather than the most explanatory), multi-way splits (in contrast to binary) and a new type of predictor which is especially useful in handling missing information.",
  url = "http://www.jstor.org/stable/2986296"
}
Valid options are:
J48 options
-U Use unpruned tree.
-C <pruning confidence> Set confidence threshold for pruning. (default 0.25)
-M <minimum number of instances> Set minimum number of instances per leaf. (default 2)
-S Don't perform subtree raising.
-L Do not clean up after the tree has been built.
-A Laplace smoothing for predicted probabilities.
-Q <seed> Seed for random data shuffling (default 1).
CHAID options
-CH-A <attribute significance level> Set the significance level for the selection of the attribute to split a node. (default 0.05)
-CH-M <merge-split significance level> Set the significance level used when searching for the best combination of an attribute's values. (default 0.05)
-CH-S Look for the best binary split after merging 3 or more categories. This search can add considerable latency, which is why it is optional. (default true)
-CH-N <minimum number of instances to split a node> Set the minimum number of instances a node must contain to be considered for splitting. (default 3)
-CH-O <att1,att2-att4,...> Specifies list of attribute indexes to set as ordinal. 'First' and 'last' are valid indexes. Warning: The list of attributes includes the class! Warning: if XRFF file was used, this option will be ignored! (default none)
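As a minimal sketch of how the CHAID flags listed above could be assembled into the `String[]` that `setOptions(java.lang.String[])` expects: the helper class `JchaidOptionsSketch` and its method `buildOptions` are hypothetical and not part of JCHAID. The `-CH-S` flag is left out because this page does not say whether it takes an argument.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of JCHAID) that assembles an option array. */
final class JchaidOptionsSketch {

    /**
     * Builds an option array from the CHAID flags documented on this page.
     *
     * @param sigLevelAtt        value for -CH-A (attribute significance level)
     * @param sigLevelMergeSplit value for -CH-M (merge-split significance level)
     * @param minNumObjSplit     value for -CH-N (minimum instances to split a node)
     * @param ordinalAtts        1-based range list for -CH-O, or null to omit the flag
     */
    static String[] buildOptions(double sigLevelAtt, double sigLevelMergeSplit,
                                 int minNumObjSplit, String ordinalAtts) {
        List<String> opts = new ArrayList<>();
        opts.add("-CH-A");
        opts.add(Double.toString(sigLevelAtt));
        opts.add("-CH-M");
        opts.add(Double.toString(sigLevelMergeSplit));
        opts.add("-CH-N");
        opts.add(Integer.toString(minNumObjSplit));
        if (ordinalAtts != null && !ordinalAtts.isEmpty()) {
            opts.add("-CH-O");
            opts.add(ordinalAtts);
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] opts = buildOptions(0.05, 0.05, 3, "first,3-5");
        System.out.println(String.join(" ", opts));
        // With Weka and JCHAID on the classpath, the array could then be used as:
        //   JCHAID chaid = new JCHAID();
        //   chaid.setOptions(opts);
        //   chaid.buildClassifier(trainingData); // trainingData: weka.core.Instances
    }
}
```

The same values can presumably be set through the corresponding setters (`setCHsigLevelAtt`, `setCHsigLevelMergeSplit`, `setCHminNumObjSplit`, `setCHordinalAttributeIndices`) instead of an option array.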
Modifier and Type | Field and Description |
---|---|
protected int | m_CHminNumObjSplit: Minimum number of instances to split a node. |
protected weka.core.Range | m_CHordinalAtts: Stores which attributes are ordinal (monotonic predictors). Based on the member m_DiscretizeCols of the weka.filters.supervised.attribute.Discretize class. |
protected boolean | m_CHsearchBestSplit: Indicates whether the search for the best binary split is performed after merging 3 or more categories. This search can add considerable latency, which is why it is optional. |
protected double | m_CHsigLevelAtt: Significance level for the selection of the attribute to split a node. |
protected double | m_CHsigLevelMergeSplit: Significance level used when searching for the best combination of the categories of an attribute. |
protected boolean | m_XRFFUsed: Indicates whether the XRFF format was used. |
private static long | serialVersionUID: For serialization. |
Fields inherited from class weka.classifiers.trees.J48: m_binarySplits, m_CF, m_collapseTree, m_doNotMakeSplitPointActualValue, m_minNumObj, m_noCleanup, m_numFolds, m_reducedErrorPruning, m_root, m_Seed, m_subtreeRaising, m_unpruned, m_useLaplace, m_useMDLcorrection
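The two significance levels among the fields above (m_CHsigLevelAtt and m_CHsigLevelMergeSplit) relate to the built-in significance testing that the Kass reference describes for CHAID. As a rough illustration only, the sketch below computes a Pearson chi-square statistic for a table of category-by-class counts and compares it with a fixed critical value. This page does not state which test JCHAID applies or how it obtains p-values, so the class `ChiSquareSketch`, its methods, and the fixed critical value (3.841, for 1 degree of freedom at the 0.05 level) are assumptions made for the example.

```java
/** Illustrative only: a Pearson chi-square check, not JCHAID's actual code. */
final class ChiSquareSketch {

    /** Chi-square critical value for 1 degree of freedom at the 0.05 level. */
    static final double CRITICAL_DF1_ALPHA_005 = 3.841;

    /**
     * Pearson chi-square statistic for a contingency table of counts
     * (rows: attribute categories, columns: class values).
     */
    static double chiSquare(long[][] counts) {
        int rows = counts.length;
        int cols = counts[0].length;
        double[] rowTotals = new double[rows];
        double[] colTotals = new double[cols];
        double total = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                rowTotals[r] += counts[r][c];
                colTotals[c] += counts[r][c];
                total += counts[r][c];
            }
        }
        double stat = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double expected = rowTotals[r] * colTotals[c] / total;
                if (expected > 0) {
                    double diff = counts[r][c] - expected;
                    stat += diff * diff / expected;
                }
            }
        }
        return stat;
    }

    /** For a 2x2 table (1 degree of freedom): true when the rows differ at the 0.05 level. */
    static boolean significantAt005(long[][] twoByTwo) {
        return chiSquare(twoByTwo) > CRITICAL_DF1_ALPHA_005;
    }

    public static void main(String[] args) {
        long[][] distinct = {{10, 20}, {20, 10}}; // two categories with different class mixes
        long[][] similar = {{10, 10}, {11, 9}};   // two categories with near-identical class mixes
        System.out.println("distinct: keep separate = " + significantAt005(distinct));
        System.out.println("similar:  keep separate = " + significantAt005(similar));
    }
}
```

In a CHAID-style merge step, a pair of categories that does not differ significantly would be a candidate for merging. A real implementation would compare a p-value against the configured significance level rather than use one hard-coded critical value.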
Constructor and Description |
---|
JCHAID() |
Modifier and Type | Method and Description |
---|---|
java.lang.String | binarySplitsTipText(): Returns the tip text for this property (rewritten to indicate that this option is not implemented for JCHAID). |
void | buildClassifier(weka.core.Instances instances): Generates the classifier. |
java.lang.String | CHminNumObjSplitTipText(): Returns the tip text for this property. |
java.lang.String | CHordinalAttributeIndicesTipText(): Returns the tip text for this property. |
java.lang.String | CHsearchBestSplitTipText(): Returns the tip text for this property. |
java.lang.String | CHsigLevelAttTipText(): Returns the tip text for this property. |
java.lang.String | CHsigLevelMergeSplitTipText(): Returns the tip text for this property. |
weka.core.Capabilities | getCapabilities(): Returns default capabilities of the classifier. |
int | getCHminNumObjSplit(): Gets the value of m_CHminNumObjSplit. |
java.lang.String | getCHordinalAttributeIndices(): Gets the current range selection. |
boolean | getCHsearchBestSplit(): Gets the value of m_CHsearchBestSplit. |
double | getCHsigLevelAtt(): Gets the value of m_CHsigLevelAtt. |
double | getCHsigLevelMergeSplit(): Gets the value of m_CHsigLevelMergeSplit. |
java.lang.String[] | getOptions(): Gets the current settings of the Classifier. |
weka.core.TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
java.lang.String | globalInfo(): Returns a string describing the classifier. |
java.util.Enumeration<weka.core.Option> | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] argv): Main method for testing this class. |
java.lang.String | numFoldsTipText(): Returns the tip text for this property (rewritten to indicate that this option is not implemented for JCHAID). |
protected void | prepareOrdinalAtts(weka.core.Instances instances): Prepares the list of ordinal attributes, extracting metadata from the XRFF file, if one exists, or from the m_CHordinalAtts option. |
java.lang.String | reducedErrorPruningTipText(): Returns the tip text for this property (rewritten to indicate that this option is not implemented for JCHAID). |
void | setBinarySplits(boolean v): Sets the value of binarySplits. |
void | setCHminNumObjSplit(int v): Sets the value of m_CHminNumObjSplit. |
void | setCHordinalAttributeIndices(java.lang.String rangeList): Sets which attributes are to be considered ordinal. |
void | setCHsearchBestSplit(boolean searchBestSplit): Sets the value of m_CHsearchBestSplit. |
void | setCHsigLevelAtt(double sigLevel): Sets the value of m_CHsigLevelAtt. |
void | setCHsigLevelMergeSplit(double sigLevel): Sets the value of m_CHsigLevelMergeSplit. |
void | setNumFolds(int v): Sets the value of numFolds. |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
void | setReducedErrorPruning(boolean v): Sets the value of reducedErrorPruning. |
void | setUseMDLcorrection(boolean v): Sets the value of useMDLcorrection. |
java.lang.String | toString(): Returns a description of the classifier. |
java.lang.String | toStringOrdinalAttributesList(): Returns the list of attributes treated as ordinal. |
java.lang.String | useMDLcorrectionTipText(): Returns the tip text for this property (rewritten to indicate that this option is not implemented for JCHAID). |
Methods inherited from class weka.classifiers.trees.J48: classifyInstance, collapseTreeTipText, confidenceFactorTipText, distributionForInstance, doNotMakeSplitPointActualValueTipText, enumerateMeasures, generatePartition, getBinarySplits, getCollapseTree, getConfidenceFactor, getDoNotMakeSplitPointActualValue, getMeasure, getMembershipValues, getMinNumObj, getNumFolds, getReducedErrorPruning, getRevision, getSaveInstanceData, getSeed, getSubtreeRaising, getUnpruned, getUseLaplace, getUseMDLcorrection, graph, graphType, measureNumLeaves, measureNumRules, measureTreeSize, minNumObjTipText, numElements, prefix, saveInstanceDataTipText, seedTipText, setCollapseTree, setConfidenceFactor, setDoNotMakeSplitPointActualValue, setMinNumObj, setSaveInstanceData, setSeed, setSubtreeRaising, setUnpruned, setUseLaplace, subtreeRaisingTipText, toSource, toSummaryString, unprunedTipText, useLaplaceTipText
Methods inherited from class weka.classifiers.AbstractClassifier: batchSizeTipText, debugTipText, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, implementsMoreEfficientBatchPrediction, makeCopies, makeCopy, numDecimalPlacesTipText, postExecution, preExecution, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
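The range list accepted by the -CH-O option and by setCHordinalAttributeIndices(java.lang.String) uses 1-based indexes, allows ranges such as 2-4, and accepts 'first' and 'last'. The sketch below shows one way such a list can be expanded into 0-based indexes. It is a simplified stand-in for illustration: JCHAID stores the selection in a weka.core.Range, and the class `OrdinalRangeSketch` with its methods `expand` and `toIndex` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified range-list expansion; not the weka.core.Range implementation. */
final class OrdinalRangeSketch {

    /** Expands a 1-based range list such as "first,3-5,last" into 0-based indexes. */
    static List<Integer> expand(String rangeList, int numAttributes) {
        List<Integer> result = new ArrayList<>();
        for (String part : rangeList.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            int dash = token.indexOf('-');
            int from;
            int to;
            if (dash > 0) {
                // A range "a-b": both ends are inclusive.
                from = toIndex(token.substring(0, dash), numAttributes);
                to = toIndex(token.substring(dash + 1), numAttributes);
            } else {
                from = toIndex(token, numAttributes);
                to = from;
            }
            if (from > to) {
                throw new IllegalArgumentException("Invalid range: " + token);
            }
            for (int i = from; i <= to; i++) {
                if (!result.contains(i)) {
                    result.add(i);
                }
            }
        }
        return result;
    }

    /** Converts a single 1-based token ("first", "last", or a number) to a 0-based index. */
    static int toIndex(String token, int numAttributes) {
        String t = token.trim().toLowerCase();
        int oneBased;
        if (t.equals("first")) {
            oneBased = 1;
        } else if (t.equals("last")) {
            oneBased = numAttributes;
        } else {
            try {
                oneBased = Integer.parseInt(t);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Invalid index: " + token);
            }
        }
        if (oneBased < 1 || oneBased > numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + token);
        }
        return oneBased - 1;
    }

    public static void main(String[] args) {
        // Six attributes in total; the class attribute counts toward the indexes.
        System.out.println(expand("first,3-5,last", 6));
    }
}
```

Because the attribute list includes the class, an index given here counts the class attribute as well, as the option description warns.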
private static final long serialVersionUID
protected double m_CHsigLevelAtt
protected double m_CHsigLevelMergeSplit
protected boolean m_CHsearchBestSplit
protected int m_CHminNumObjSplit
protected weka.core.Range m_CHordinalAtts
protected boolean m_XRFFUsed
public java.lang.String globalInfo()
Overrides: globalInfo in class weka.classifiers.trees.J48
public weka.core.TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface weka.core.TechnicalInformationHandler
Overrides: getTechnicalInformation in class weka.classifiers.trees.J48
public weka.core.Capabilities getCapabilities()
Specified by: getCapabilities in interface weka.classifiers.Classifier
Specified by: getCapabilities in interface weka.core.CapabilitiesHandler
Overrides: getCapabilities in class weka.classifiers.trees.J48
See Also: Capabilities
public void buildClassifier(weka.core.Instances instances) throws java.lang.Exception
Specified by: buildClassifier in interface weka.classifiers.Classifier
Overrides: buildClassifier in class weka.classifiers.trees.J48
Parameters: instances - the data to train the classifier with
Throws: java.lang.Exception - if classifier can't be built successfully
protected void prepareOrdinalAtts(weka.core.Instances instances)
Parameters: instances - the data to train the classifier with
public java.util.Enumeration<weka.core.Option> listOptions()
Specified by: listOptions in interface weka.core.OptionHandler
Overrides: listOptions in class weka.classifiers.trees.J48
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Specified by: setOptions in interface weka.core.OptionHandler
Overrides: setOptions in class weka.classifiers.trees.J48
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by: getOptions in interface weka.core.OptionHandler
Overrides: getOptions in class weka.classifiers.trees.J48
public java.lang.String toString()
Overrides: toString in class weka.classifiers.trees.J48
public java.lang.String toStringOrdinalAttributesList()
public java.lang.String CHsigLevelAttTipText()
public double getCHsigLevelAtt()
public void setCHsigLevelAtt(double sigLevel)
Parameters: sigLevel - the value to set
public java.lang.String CHsigLevelMergeSplitTipText()
public double getCHsigLevelMergeSplit()
public void setCHsigLevelMergeSplit(double sigLevel)
Parameters: sigLevel - the value to set
public java.lang.String CHsearchBestSplitTipText()
public boolean getCHsearchBestSplit()
public void setCHsearchBestSplit(boolean searchBestSplit)
Parameters: searchBestSplit - the value to set
public java.lang.String CHminNumObjSplitTipText()
public int getCHminNumObjSplit()
public void setCHminNumObjSplit(int v)
Parameters: v - the value to set
public java.lang.String CHordinalAttributeIndicesTipText()
public java.lang.String getCHordinalAttributeIndices()
public void setCHordinalAttributeIndices(java.lang.String rangeList)
Parameters: rangeList - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.
Throws: java.lang.IllegalArgumentException - if an invalid range list is supplied
public java.lang.String reducedErrorPruningTipText()
Overrides: reducedErrorPruningTipText in class weka.classifiers.trees.J48
public void setReducedErrorPruning(boolean v)
Overrides: setReducedErrorPruning in class weka.classifiers.trees.J48
Parameters: v - Value to assign to reducedErrorPruning.
public java.lang.String numFoldsTipText()
Overrides: numFoldsTipText in class weka.classifiers.trees.J48
public void setNumFolds(int v)
Overrides: setNumFolds in class weka.classifiers.trees.J48
Parameters: v - Value to assign to numFolds.
public java.lang.String binarySplitsTipText()
Overrides: binarySplitsTipText in class weka.classifiers.trees.J48
public void setBinarySplits(boolean v)
Overrides: setBinarySplits in class weka.classifiers.trees.J48
Parameters: v - Value to assign to binarySplits.
public java.lang.String useMDLcorrectionTipText()
Overrides: useMDLcorrectionTipText in class weka.classifiers.trees.J48
public void setUseMDLcorrection(boolean v)
Overrides: setUseMDLcorrection in class weka.classifiers.trees.J48
Parameters: v - Value to assign to useMDLcorrection.
public static void main(java.lang.String[] argv)
Parameters: argv - the command line options