public class CHAIDSplit
extends weka.classifiers.trees.j48.ClassifierSplitModel
Modifier and Type | Field and Description |
---|---|
protected int | m_attIndex - Attribute to split on. |
protected double | m_chiSquaredProb - Chi-squared probability of the split. |
protected int | m_complexityIndex - Desired number of branches. |
protected int | m_index - Number of split points. |
protected int | m_minNoObj - Minimum number of objects in a split. |
protected int | m_missingIdx - Original position of the missing values on the attribute to split on. |
protected boolean | m_ordered - Indicates whether the categories are ordered. Default: false, that is, Attribute.ORDERING_SYMBOLIC. |
protected boolean | m_searchBestSplit - Indicates whether to search for the best binary split after merging three or more categories. The search can add considerable latency, which is why it is optional. |
protected double | m_sigLevelAtt - Significance level for selecting the attribute on which to split a node. |
protected double | m_sigLevelMergeSplit - Significance level for finding the best combination of an attribute's categories. |
private static long | serialVersionUID - For serialization. |
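The probability and significance-level fields above are easiest to read together. The following is a minimal sketch, assuming that m_chiSquaredProb holds the p-value of a chi-squared test on a category-by-class contingency table and that m_sigLevelAtt is the threshold it is compared against. It is not this class's implementation: the class name ChiSquaredSketch, both helper methods, and the example counts are hypothetical, and the p-value helper uses a closed form that is valid only for an even number of degrees of freedom.

```java
public class ChiSquaredSketch {

    /** Pearson chi-squared statistic of a table (rows: categories, columns: class values). */
    public static double chiSquaredStatistic(double[][] table) {
        int rows = table.length;
        int cols = table[0].length;
        double[] rowSum = new double[rows];
        double[] colSum = new double[cols];
        double total = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowSum[i] += table[i][j];
                colSum[j] += table[i][j];
                total += table[i][j];
            }
        }
        double chi = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double expected = rowSum[i] * colSum[j] / total;
                if (expected > 0.0) {
                    double diff = table[i][j] - expected;
                    chi += diff * diff / expected;
                }
            }
        }
        return chi;
    }

    /** Upper-tail chi-squared probability; this closed form holds only for even df. */
    public static double pValueEvenDf(double chi, int df) {
        if (df <= 0 || df % 2 != 0) {
            throw new IllegalArgumentException("df must be positive and even");
        }
        double half = chi / 2.0;
        double term = 1.0;
        double sum = 1.0;
        for (int k = 1; k < df / 2; k++) {
            term *= half / k;
            sum += term;
        }
        return Math.exp(-half) * sum;
    }

    public static void main(String[] args) {
        // Hypothetical counts: three categories by two class values.
        double[][] counts = {{10, 20}, {20, 10}, {15, 15}};
        double chi = chiSquaredStatistic(counts);
        int df = (counts.length - 1) * (counts[0].length - 1);
        double p = pValueEvenDf(chi, df);
        double sigLevelAtt = 0.05; // hypothetical threshold
        System.out.println("chi2=" + chi + ", p=" + p + ", significant=" + (p < sigLevelAtt));
    }
}
```

Under these assumptions, an attribute would be a candidate for splitting only when its p-value falls below the attribute significance level.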
Constructor and Description |
---|
CHAIDSplit(int attIndex, int minNoObj, double sumOfWeights, boolean useMDLcorrection, double sigLevelAtt, double sigLevelMergeSplit, boolean searchBestSplit, boolean ordered) - Initializes the split model. |
Modifier and Type | Method and Description |
---|---|
void | buildClassifier(weka.core.Instances trainInstances) - Creates a CHAID-type split on the given data. |
double | chiSquaredProb() - Returns the chi-squared probability for the generated split. |
CHAIDDistribution | getCHAIDDistribution() - Returns the distribution of class values. |
int | getMissingCurrentIndex() - Gets the current index of the missing values. |
java.lang.String | getRevision() - Returns the revision string. |
protected void | handleEnumeratedAttribute(weka.core.Instances trainInstances) - Creates a split on an enumerated attribute. |
boolean | hasMissingValues() - Returns whether the split has missing values. |
java.lang.String | leftSide(weka.core.Instances data) - Prints the left side of the condition. |
void | resetDistribution(weka.core.Instances data) - Sets the distribution associated with the model. |
java.lang.String | rightSide(int index, weka.core.Instances data) - Prints the condition satisfied by instances in a subset. |
java.lang.String | sourceExpression(int index, weka.core.Instances data) - Returns a string containing Java source code equivalent to the test made at this node. |
weka.core.Instances[] | split(weka.core.Instances data) - Splits the given set of instances into subsets. |
double[] | weights(weka.core.Instance instance) - Returns null, indicating that the instance is assigned to only one subset. |
int | whichSubset(weka.core.Instance instance) - Returns the index of the subset the instance is assigned to. |
private static final long serialVersionUID
protected int m_complexityIndex
protected final int m_attIndex
protected final int m_minNoObj
protected int m_index
protected int m_missingIdx
protected double m_chiSquaredProb
protected double m_sigLevelAtt
protected double m_sigLevelMergeSplit
protected boolean m_searchBestSplit
protected boolean m_ordered
public CHAIDSplit(int attIndex, int minNoObj, double sumOfWeights, boolean useMDLcorrection, double sigLevelAtt, double sigLevelMergeSplit, boolean searchBestSplit, boolean ordered)
attIndex - attribute to split on
minNoObj - minimum number of instances that have to occur in at least two subsets induced by the split
sumOfWeights - sum of the weights
useMDLcorrection - whether to use MDL adjustment when finding splits on numeric attributes
sigLevelAtt - significance level for the selection of attributes
sigLevelMergeSplit - significance level for the best combination of categories
minNoObjSplit - minimum number of instances to split a node
ordered - true if the nature of the categories is ordered
public void buildClassifier(weka.core.Instances trainInstances) throws java.lang.Exception
buildClassifier in class weka.classifiers.trees.j48.ClassifierSplitModel
java.lang.Exception - if something goes wrong
public final double chiSquaredProb()
protected void handleEnumeratedAttribute(weka.core.Instances trainInstances) throws java.lang.Exception
java.lang.Exception - if something goes wrong
public final CHAIDDistribution getCHAIDDistribution()
public boolean hasMissingValues()
public int getMissingCurrentIndex()
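How a stored missing-value position could feed into subset assignment can be sketched as follows. This is an illustration under stated assumptions, not the code of this class: it assumes that missing values are treated as one extra category that may be merged with regular categories, and the names MergedCategorySketch, categoryToBranch, and missingCategory are hypothetical.

```java
public class MergedCategorySketch {

    /** Maps each category index (including the missing-value slot, if any) to a branch index. */
    private final int[] categoryToBranch;

    /** Category slot reserved for missing values, or -1 if the data had none (assumption). */
    private final int missingCategory;

    public MergedCategorySketch(int[] categoryToBranch, int missingCategory) {
        this.categoryToBranch = categoryToBranch.clone();
        this.missingCategory = missingCategory;
    }

    /** True if a category slot is reserved for missing values. */
    public boolean hasMissingValues() {
        return missingCategory >= 0;
    }

    /** Branch that currently holds the missing values, or -1 if there is none. */
    public int getMissingCurrentIndex() {
        return hasMissingValues() ? categoryToBranch[missingCategory] : -1;
    }

    /**
     * Returns the branch for an attribute value given as a category index.
     * A NaN value stands for "missing", which is how Weka encodes missing values.
     */
    public int whichSubset(double value) {
        if (Double.isNaN(value)) {
            return getMissingCurrentIndex();
        }
        return categoryToBranch[(int) value];
    }

    public static void main(String[] args) {
        // Categories 0 and 2 share branch 0; category 1 and the missing slot (3) share branch 1.
        MergedCategorySketch split = new MergedCategorySketch(new int[] {0, 1, 0, 1}, 3);
        System.out.println(split.whichSubset(2.0));
        System.out.println(split.whichSubset(Double.NaN));
    }
}
```

The sketch keeps the original slot of the missing values separate from the branch they end up in, which is one plausible reading of the distinction between m_missingIdx and getMissingCurrentIndex().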
public int whichSubset(weka.core.Instance instance) throws java.lang.Exception
whichSubset in class weka.classifiers.trees.j48.ClassifierSplitModel
java.lang.Exception - if something goes wrong
public final java.lang.String leftSide(weka.core.Instances data)
leftSide in class weka.classifiers.trees.j48.ClassifierSplitModel
data - training set.
public java.lang.String rightSide(int index, weka.core.Instances data)
rightSide in class weka.classifiers.trees.j48.ClassifierSplitModel
index - of subset
data - training set.
public java.lang.String sourceExpression(int index, weka.core.Instances data)
sourceExpression in class weka.classifiers.trees.j48.ClassifierSplitModel
index - index of the nominal value tested
data - the data containing instance structure info
public final weka.core.Instances[] split(weka.core.Instances data) throws java.lang.Exception
split in class weka.classifiers.trees.j48.ClassifierSplitModel
java.lang.Exception - if something goes wrong
public final double[] weights(weka.core.Instance instance)
weights in class weka.classifiers.trees.j48.ClassifierSplitModel
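Because weights returns null to signal that an instance belongs to exactly one subset, a caller has to handle that case by falling back to whichSubset. The sketch below shows one way to do so in plain Java. The interface SplitLike, the class NullWeightsSketch, and the method branchWeights are hypothetical stand-ins introduced for illustration; they are not part of Weka or of this class.

```java
public class NullWeightsSketch {

    /** Hypothetical stand-in for the two split-model methods used here. */
    public interface SplitLike {
        /** Per-branch weights, or null if the instance goes to exactly one branch. */
        double[] weights(double[] instance);

        /** Index of the single branch the instance is assigned to. */
        int whichSubset(double[] instance);
    }

    /** Returns a weight per branch, turning a null result into a one-hot vector. */
    public static double[] branchWeights(SplitLike split, double[] instance, int numBranches) {
        double[] weights = split.weights(instance);
        if (weights != null) {
            return weights.clone();
        }
        double[] oneHot = new double[numBranches];
        int branch = split.whichSubset(instance);
        if (branch >= 0 && branch < numBranches) {
            oneHot[branch] = 1.0;
        }
        return oneHot;
    }

    public static void main(String[] args) {
        // A split that always sends the instance to branch 2 and reports null weights.
        SplitLike single = new SplitLike() {
            public double[] weights(double[] instance) { return null; }
            public int whichSubset(double[] instance) { return 2; }
        };
        double[] w = branchWeights(single, new double[] {0.0}, 3);
        System.out.println(java.util.Arrays.toString(w));
    }
}
```

Normalising both cases to one array lets downstream code treat single-branch and fractional assignments uniformly, at the cost of an extra allocation per instance.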
public void resetDistribution(weka.core.Instances data) throws java.lang.Exception
resetDistribution in class weka.classifiers.trees.j48.ClassifierSplitModel
java.lang.Exception
public java.lang.String getRevision()