Complete results and additional material for the article “BFPART: Best-First PART”

2016-05-13

 

This page contains the full tables related to the work presented in the article “BFPART: Best-First PART”.

First, we present the table with the characteristics of the 36 real-world datasets used in this study.

Then, we include the full tables of results for the different proposed variants of the PART algorithm on each performance metric used.

Finally, we include the full tables of the results related to the analyzed algorithms: C4.5, CHAID*, PART and the proposed BFPART.

All the tables of results can be downloaded as an Excel document or as a CSV file.

 

INDEX

1. Datasets characteristics

·         Table 1. Description of the datasets used in this work.

 

2. Results for the sixteen variants of PART

·         Table 2. AUC values of the sixteen variants of the PART algorithm for each dataset

·         Table 3. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the AUC metric

·         Table 4. Significant p-values according to Holm’s test for the 16 PART variant comparison for the AUC metric

·         Table 5. Error values of the sixteen variants of the PART algorithm for each dataset

·         Table 6. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the Error metric

·         Table 7. Significant p-values according to Holm’s test for the 16 PART variant comparison for the Error metric

·         Table 8. Complexity values of the sixteen variants of the PART algorithm for each dataset

·         Table 9. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the Complexity metric

·         Table 10. Significant p-values according to Holm’s test for the 16 PART variant comparison for the Complexity metric

·         Table 11. Length values of the sixteen variants of the PART algorithm for each dataset

·         Table 12. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the Length metric

·         Table 13. Significant p-values according to Holm’s test for the 16 PART variant comparison for the Length metric

·         Table 14. Time values of the sixteen variants of the PART algorithm for each dataset

·         Table 15. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the Time metric

·         Table 16. Significant p-values according to Holm’s test for the 16 PART variant comparison for the Time metric

 

 

3. Results for C4.5, CHAID*, PART and BFPART

·         Table 17. AUC values of the four analyzed algorithms for each dataset

·         Table 18. Significant p-values according to Shaffer’s test for the comprehensible algorithm comparison for the AUC metric

·         Table 19. Significant p-values according to Holm’s test for the comprehensible algorithm comparison for the AUC metric

·         Table 20. Error values of the four analyzed algorithms for each dataset

·         Table 21. Complexity values of the four analyzed algorithms for each dataset

·         Table 22. Significant p-values according to Shaffer’s test for the comprehensible algorithm comparison for the Complexity metric

·         Table 23. Significant p-values according to Holm’s test for the comprehensible algorithm comparison for the Complexity metric

·         Table 24. Length values of the four analyzed algorithms for each dataset

·         Table 25. Significant p-values according to Shaffer’s test for the comprehensible algorithm comparison for the Length metric

·         Table 26. Significant p-values according to Holm’s test for the comprehensible algorithm comparison for the Length metric

·         Table 27. Time values of the four analyzed algorithms for each dataset

·         Table 28. Significant p-values according to Shaffer’s test for the comprehensible algorithm comparison for the Time metric

·         Table 29. Significant p-values according to Holm’s test for the comprehensible algorithm comparison for the Time metric

 

References

 

1. Datasets characteristics

This section contains the table with the characteristics of the 36 real-world datasets used in this study. Maximum and minimum values are shown in italics and underlined. Datasets whose name ends with the subscript 2 are two-class versions of multi-class datasets. This conversion was done using the same methodology as in [1], by grouping all classes other than the one with the fewest examples. For these datasets, the number in parentheses in the #Classes column is the number of classes in the original dataset.

 


 

Table 1. Description of the datasets used in this work.

 

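#Nominals and #Ordinals are the discrete attributes (#Discretes); together with #Continuous they add up to #Total, the total number of attributes (#Attributes). Empty cells mean the dataset has no attributes of that type.

| Dataset | #Examples | #Nominals | #Ordinals | #Continuous | #Total | #Classes | % Minority Class | Missing values |
|---|---|---|---|---|---|---|---|---|
| breast-w | 699 | | 10 | | 10 | 2 | 34.5 | y |
| heartc | 303 | 7 | 1 | 5 | 13 | 2 | 45.87 | y |
| spam | 4601 | | | 57 | 57 | 2 | 39.4 | n |
| hypo | 3163 | 18 | | 7 | 25 | 2 | 4.77 | y |
| liver | 345 | | | 6 | 6 | 2 | 42.03 | n |
| lymph | 148 | 17 | 1 | | 18 | 4 | 1.35 | n |
| lymph2 | 148 | 17 | 1 | | 18 | 2 (4) | 41.22 | n |
| credit_a | 690 | 8 | | 6 | 14 | 2 | 44.49 | n |
| vehicle | 846 | | | 18 | 18 | 4 | 23.52 | n |
| vehicle2 | 846 | | | 18 | 18 | 2 (4) | 23.52 | n |
| iris | 150 | | | 4 | 4 | 3 | 33.33 | n |
| iris2 | 150 | | | 4 | 4 | 2 (3) | 33.33 | n |
| glass | 214 | | | 9 | 9 | 7 | 4.2 | n |
| glass2 | 214 | | | 9 | 9 | 2 (7) | 23.83 | n |
| breast-y | 286 | 5 | 4 | | 9 | 2 | 29.72 | y |
| voting | 435 | 16 | | | 16 | 2 | 38.62 | y |
| heart-h | 294 | 8 | | 5 | 13 | 2 | 36.05 | y |
| hepatitis | 155 | 13 | | 6 | 19 | 2 | 20.65 | y |
| credit_g | 1000 | 10 | 3 | 7 | 20 | 2 | 30 | n |
| soybean_15CL | 290 | 34 | 1 | | 35 | 15 | 3.45 | y |
| soybean_15CL2 | 290 | 34 | 1 | | 35 | 2 (15) | 13.79 | y |
| segment2310 | 2310 | | | 19 | 19 | 7 | 14.29 | n |
| segment23102 | 2310 | | | 19 | 19 | 2 (7) | 14.29 | n |
| segment210 | 210 | | | 19 | 19 | 7 | 14.29 | n |
| segment2102 | 210 | | | 19 | 19 | 2 (7) | 14.29 | n |
| sick-euthyroid | 3164 | 18 | | 7 | 25 | 2 | 9.26 | n |
| bands | 540 | 18 | | 21 | 39 | 2 | 42.2 | y |
| ks-vs-kp | 3196 | 36 | | | 36 | 2 | 47.8 | n |
| optdigits2 | 5620 | | 64 | | 64 | 2 (10) | 9.9 | n |
| car2 | 1728 | | 6 | | 6 | 2 (4) | 30 | n |
| abalone2 | 4177 | | | 8 | 8 | 2 (29) | 8.6 | n |
| solar_flare | 1389 | 13 | | | 13 | 2 | 15.7 | n |
| yeast2 | 1484 | | | 8 | 8 | 2 (10) | 28.9 | n |
| splice_junction2 | 3190 | 60 | | | 60 | 2 (3) | 24.1 | n |
| kddcup | 4941 | 7 | | 34 | 41 | 2 | 19.69 | n |
| pima | 768 | | | 8 | 8 | 2 | 34.9 | n |
| Min | 148 | 5 | 1 | 4 | 4 | 2 | 1.35 | |
| Max | 5620 | 60 | 64 | 57 | 64 | 15 | 47.80 | |
| Mean | 1402.89 | 18.83 | 9.20 | 13.46 | 20.94 | 2.92 | 24.88 | |
| Median | 694.5 | 16.5 | 2 | 8 | 18 | 2.00 | 23.97 | |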
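The % Minority Class column of Table 1 can be read as the share of examples that belong to the least frequent class. The following is a minimal Python sketch of that calculation, not the authors' code; the function name `minority_percentage` is hypothetical, and the example uses the well-known 50/50/50 class split of the iris dataset.

```python
from collections import Counter


def minority_percentage(labels):
    """Percentage of examples that belong to the least frequent class."""
    counts = Counter(labels)
    return 100.0 * min(counts.values()) / len(labels)


# iris has three classes with 50 examples each
iris_labels = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50
print(round(minority_percentage(iris_labels), 2))  # 33.33
```

The result agrees with the 33.33 listed for iris in Table 1.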

 


 

2. Results for the sixteen variants of PART

This section includes the full tables of results for the different proposed variants of the PART algorithm on the five performance metrics used for the comparison: AUC, Error, Complexity, Length and Time. Complexity is measured as the number of decisions in the rule-set. Length is measured as the average number of decisions in each rule of the rule-set. Time (computational cost) is measured as the milliseconds taken to build the full rule-set.
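To make the two structural metrics concrete, here is a minimal Python sketch. It rests on an assumption that is not stated on this page: Complexity is read as the number of rules in the rule-set (one decision per rule) and Length as the mean number of conditions per rule. The rule-set, its threshold and the function names are hypothetical and only serve as an illustration.

```python
from statistics import mean

# Hypothetical rule-set: each rule is (list of conditions, predicted class).
# The last rule is a default rule with no conditions.
rules = [
    (["petal_length <= 2.45"], "setosa"),
    ([], "other"),
]


def complexity(rule_set):
    """Number of rules in the rule-set (assumed reading of 'decisions')."""
    return len(rule_set)


def length(rule_set):
    """Average number of conditions per rule; 0.0 for an empty rule-set."""
    if not rule_set:
        return 0.0
    return mean(len(conditions) for conditions, _ in rule_set)


print(complexity(rules))  # 2
print(length(rules))      # 0.5
```

Under this reading, a rule-set made of one single-condition rule plus a default rule has Complexity 2 and Length 0.5, which is the pattern reported for iris2 in the Complexity and Length tables.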

Table 2. AUC values of the sixteen variants of the PART algorithm for each dataset

Metric: AUC. Variant codes: HC = Hill Climbing, BF = Best-First; TL = Treated Leaves, AL = All Leaves; NP = No Pruning, PR = Pruning; PP = Prioritize Pures, DP = Do not Prioritize. The PART column is the HC_TL_NP_PP combination.

| Dataset | PART | HC_TL_NP_DP | HC_TL_PR_PP | HC_TL_PR_DP | HC_AL_NP_PP | HC_AL_NP_DP | HC_AL_PR_PP | HC_AL_PR_DP | BF_TL_NP_PP | BF_TL_NP_DP | BF_TL_PR_PP | BF_TL_PR_DP | BF_AL_NP_PP | BF_AL_NP_DP | BF_AL_PR_PP | BF_AL_PR_DP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| breast-w | 97.50 | 96.17 | 97.26 | 97.09 | 97.32 | 97.32 | 97.07 | 97.07 | 97.39 | 95.64 | 97.48 | 97.28 | 97.48 | 97.23 | 97.32 | 97.55 |
| heartc | 80.83 | 78.27 | 78.37 | 77.36 | 76.41 | 76.41 | 73.80 | 74.03 | 80.66 | 78.76 | 77.36 | 81.64 | 79.85 | 78.75 | 75.75 | 81.44 |
| spam | 95.77 | 95.96 | 90.60 | 90.58 | 70.18 | 70.18 | 70.47 | 70.45 | 95.72 | 95.84 | 90.73 | 96.15 | 70.20 | 93.08 | 70.49 | 93.98 |
| hypo | 96.03 | 96.21 | 96.73 | 96.73 | 95.52 | 95.52 | 96.73 | 96.73 | 95.87 | 97.11 | 96.71 | 96.72 | 96.02 | 96.07 | 96.71 | 96.72 |
| liver | 65.05 | 66.02 | 63.14 | 63.63 | 59.66 | 59.66 | 58.77 | 59.06 | 64.58 | 66.16 | 63.75 | 64.21 | 61.13 | 66.53 | 61.35 | 63.85 |
| lymph | 81.68 | 80.47 | 80.68 | 80.75 | 79.02 | 79.02 | 80.92 | 81.10 | 81.52 | 81.03 | 81.18 | 86.30 | 82.08 | 81.18 | 80.62 | 85.31 |
| lymph2 | 80.43 | 80.16 | 81.79 | 80.33 | 75.73 | 75.73 | 75.15 | 75.15 | 82.03 | 78.76 | 78.93 | 83.98 | 75.80 | 82.87 | 74.94 | 84.08 |
| credit_a | 84.32 | 84.02 | 85.97 | 85.97 | 85.65 | 85.65 | 85.97 | 85.97 | 84.13 | 84.07 | 85.94 | 89.20 | 87.94 | 84.20 | 85.94 | 89.13 |
| vehicle | 86.50 | 86.61 | 87.69 | 87.25 | 77.24 | 77.24 | 77.47 | 77.29 | 86.67 | 86.36 | 86.11 | 87.68 | 77.39 | 86.80 | 76.66 | 87.89 |
| vehicle2 | 93.95 | 94.38 | 95.41 | 94.84 | 89.37 | 89.37 | 90.00 | 90.00 | 94.33 | 94.55 | 95.61 | 95.76 | 92.53 | 94.16 | 92.53 | 95.09 |
| iris | 96.47 | 96.55 | 97.09 | 97.18 | 96.47 | 96.47 | 97.09 | 97.09 | 96.47 | 96.47 | 97.09 | 97.09 | 96.47 | 96.97 | 97.09 | 97.09 |
| iris2 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 | 99.00 |
| glass | 80.00 | 80.94 | 79.99 | 80.50 | 64.35 | 64.35 | 64.46 | 64.46 | 80.31 | 82.13 | 80.35 | 81.70 | 64.35 | 82.60 | 64.45 | 82.62 |
| glass2 | 92.10 | 91.95 | 91.45 | 91.63 | 92.10 | 92.10 | 91.45 | 91.96 | 92.10 | 93.15 | 90.76 | 91.22 | 92.10 | 94.18 | 90.76 | 92.11 |
| breast-y | 60.42 | 60.61 | 59.55 | 59.53 | 60.31 | 60.31 | 60.01 | 60.01 | 59.67 | 57.76 | 59.71 | 61.39 | 62.56 | 57.13 | 60.01 | 61.39 |
| voting | 97.77 | 97.88 | 97.16 | 97.16 | 97.61 | 97.61 | 97.16 | 97.16 | 97.74 | 98.13 | 97.63 | 97.78 | 97.89 | 98.26 | 97.63 | 97.78 |
| heart-h | 86.41 | 85.58 | 77.06 | 77.37 | 83.72 | 83.72 | 76.91 | 77.21 | 85.99 | 85.77 | 76.91 | 77.08 | 84.45 | 85.03 | 76.91 | 77.04 |
| hepatitis | 75.62 | 74.45 | 64.70 | 65.26 | 74.88 | 74.88 | 64.59 | 64.56 | 76.07 | 75.78 | 64.45 | 67.16 | 73.97 | 76.11 | 64.45 | 66.71 |
| credit_g | 65.63 | 63.17 | 50.00 | 50.00 | 71.15 | 71.15 | 50.00 | 50.00 | 62.95 | 63.81 | 50.00 | 71.44 | 70.81 | 63.55 | 50.00 | 71.27 |
| soybean_15CL | 96.82 | 95.11 | 96.81 | 96.21 | 86.16 | 86.16 | 86.16 | 86.16 | 96.77 | 95.04 | 97.19 | 95.68 | 86.16 | 96.49 | 86.16 | 96.47 |
| soybean_15CL2 | 90.37 | 89.84 | 77.37 | 76.58 | 90.68 | 90.68 | 78.20 | 78.20 | 87.96 | 90.86 | 81.55 | 85.40 | 89.03 | 89.67 | 81.55 | 85.68 |
| segment2310 | 98.55 | 98.34 | 93.95 | 98.42 | 76.35 | 76.35 | 82.27 | 76.35 | 98.51 | 98.50 | 93.80 | 98.57 | 76.35 | 98.55 | 82.31 | 98.62 |
| segment23102 | 98.96 | 98.74 | 79.01 | 79.00 | 78.11 | 78.11 | 78.11 | 78.11 | 98.95 | 99.16 | 88.93 | 98.52 | 97.14 | 98.91 | 88.93 | 98.36 |
| segment210 | 93.02 | 92.11 | 93.19 | 92.69 | 77.19 | 77.19 | 77.19 | 77.19 | 92.88 | 92.45 | 93.40 | 93.04 | 77.19 | 94.18 | 77.19 | 94.09 |
| segment2102 | 92.87 | 94.81 | 92.87 | 94.81 | 92.87 | 92.87 | 92.87 | 92.87 | 92.87 | 94.13 | 92.11 | 94.13 | 92.87 | 92.87 | 92.11 | 92.87 |
| sick-euthyroid | 95.70 | 95.70 | 67.07 | 66.76 | 94.60 | 94.60 | 66.83 | 66.83 | 96.03 | 95.05 | 66.74 | 86.02 | 94.88 | 94.74 | 66.79 | 85.84 |
| bands | 76.96 | 77.72 | 68.18 | 68.78 | 69.28 | 69.28 | 67.95 | 67.95 | 77.83 | 78.96 | 69.89 | 78.22 | 71.68 | 78.15 | 71.06 | 79.48 |
| ks-vs-kp | 99.64 | 99.72 | 99.43 | 99.17 | 94.52 | 94.52 | 93.57 | 93.45 | 99.67 | 99.61 | 98.85 | 99.48 | 97.04 | 99.50 | 98.10 | 99.72 |
| optdigits2 | 96.46 | 95.70 | 66.22 | 65.97 | 78.31 | 78.31 | 66.06 | 66.06 | 96.74 | 96.18 | 66.17 | 93.44 | 78.29 | 97.28 | 66.03 | 93.62 |
| car2 | 99.48 | 99.06 | 96.64 | 95.77 | 92.42 | 92.42 | 91.84 | 91.85 | 99.55 | 99.82 | 97.02 | 98.15 | 93.08 | 99.55 | 92.56 | 98.72 |
| abalone2 | 72.62 | 72.79 | 50.01 | 50.01 | 72.37 | 72.37 | 50.01 | 50.01 | 72.69 | 72.34 | 50.00 | 50.00 | 72.53 | 72.48 | 50.00 | 50.00 |
| solar_flare | 70.12 | 70.22 | 50.00 | 50.00 | 70.64 | 70.64 | 50.00 | 50.00 | 69.00 | 67.50 | 50.33 | 57.78 | 71.16 | 67.38 | 50.33 | 57.78 |
| yeast2 | 75.78 | 75.28 | 57.99 | 57.97 | 65.22 | 65.22 | 57.27 | 57.20 | 75.71 | 75.80 | 64.57 | 72.87 | 72.90 | 75.38 | 63.85 | 72.50 |
| splice_junction2 | 94.05 | 94.12 | 50.00 | 50.00 | 82.30 | 82.30 | 50.00 | 50.00 | 94.23 | 94.01 | 50.00 | 97.41 | 82.30 | 94.23 | 50.00 | 97.66 |
| kddcup | 99.44 | 99.50 | 99.27 | 99.32 | 99.33 | 99.33 | 99.26 | 99.32 | 99.45 | 99.58 | 99.17 | 99.49 | 99.32 | 99.48 | 99.17 | 99.52 |
| pima | 77.93 | 77.73 | 71.20 | 71.98 | 74.37 | 74.37 | 71.49 | 71.31 | 77.88 | 78.26 | 71.56 | 75.34 | 75.31 | 78.42 | 71.78 | 75.70 |
| Mean | 87.34 | 87.08 | 80.08 | 80.16 | 81.68 | 81.68 | 76.84 | 76.70 | 87.22 | 87.15 | 80.58 | 86.01 | 83.03 | 87.25 | 77.79 | 86.02 |
| Median | 92.49 | 92.03 | 81.23 | 80.63 | 78.66 | 78.66 | 77.33 | 77.20 | 92.49 | 92.80 | 83.74 | 90.21 | 82.19 | 92.97 | 77.05 | 90.62 |

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.
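The sixteen variant codes in the column headers of Table 2 combine four binary options: search strategy, leaves considered, pruning, and prioritization of pure nodes. As an illustration only, the short Python sketch below enumerates the codes in the same order as the table columns and expands a code into its description. The dictionary and function names are hypothetical; the abbreviations are taken from the table header.

```python
from itertools import product

# Abbreviation -> meaning, in the order the options appear in the header.
SEARCH = {"HC": "Hill Climbing", "BF": "Best-First"}
LEAVES = {"TL": "Treated Leaves", "AL": "All Leaves"}
PRUNING = {"NP": "No Pruning", "PR": "Pruning"}
PRIORITY = {"PP": "Prioritize Pures", "DP": "Do not Prioritize"}

# 2 x 2 x 2 x 2 = 16 variant codes, e.g. "HC_TL_NP_PP".
codes = ["_".join(parts) for parts in product(SEARCH, LEAVES, PRUNING, PRIORITY)]


def describe(code):
    """Expand a variant code into its four option names."""
    search, leaves, pruning, priority = code.split("_")
    return ", ".join(
        [SEARCH[search], LEAVES[leaves], PRUNING[pruning], PRIORITY[priority]]
    )


print(len(codes))          # 16
print(codes[0])            # HC_TL_NP_PP (the column labelled PART)
print(describe(codes[-1]))  # Best-First, All Leaves, Pruning, Do not Prioritize
```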

Table 3. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the AUC metric

| Comparison | p-value (Shaffer adjusted) |
|---|---|
| BF_AL_PR_DP vs. HC_AL_PR_PP | 0 |
| BF_AL_PR_DP vs. HC_AL_PR_DP | 0 |
| BF_AL_PR_DP vs. BF_AL_PR_PP | 0 |
| HC_TL_NP_PP vs. HC_AL_PR_PP | 0 |
| BF_AL_NP_DP vs. HC_AL_PR_PP | 0 |
| BF_TL_PR_DP vs. HC_AL_PR_PP | 0 |
| BF_TL_NP_DP vs. HC_AL_PR_PP | 0 |
| HC_TL_NP_PP vs. HC_AL_PR_DP | 0 |
| BF_AL_NP_DP vs. HC_AL_PR_DP | 0 |
| HC_TL_NP_PP vs. BF_AL_PR_PP | 0.000001 |
| BF_TL_PR_DP vs. HC_AL_PR_DP | 0.000001 |
| BF_AL_NP_DP vs. BF_AL_PR_PP | 0.000001 |
| BF_TL_NP_PP vs. HC_AL_PR_PP | 0.000001 |
| BF_TL_NP_DP vs. HC_AL_PR_DP | 0.000001 |
| BF_TL_PR_DP vs. BF_AL_PR_PP | 0.000001 |
| BF_TL_NP_DP vs. BF_AL_PR_PP | 0.000001 |
| BF_TL_NP_PP vs. HC_AL_PR_DP | 0.000002 |
| BF_TL_NP_PP vs. BF_AL_PR_PP | 0.000003 |
| HC_TL_NP_DP vs. HC_AL_PR_PP | 0.000012 |
| HC_TL_NP_DP vs. HC_AL_PR_DP | 0.000031 |
| HC_TL_NP_DP vs. BF_AL_PR_PP | 0.000053 |
| BF_AL_PR_DP vs. HC_AL_NP_DP | 0.000113 |
| BF_AL_PR_DP vs. HC_AL_NP_PP | 0.000113 |
| HC_TL_NP_PP vs. HC_AL_NP_DP | 0.000863 |
| HC_TL_NP_PP vs. HC_AL_NP_PP | 0.000863 |
| BF_AL_NP_DP vs. HC_AL_NP_DP | 0.001025 |
| BF_AL_NP_DP vs. HC_AL_NP_PP | 0.001025 |
| BF_AL_PR_DP vs. BF_TL_PR_PP | 0.001215 |
| BF_TL_PR_DP vs. HC_AL_NP_DP | 0.00136 |
| BF_TL_PR_DP vs. HC_AL_NP_PP | 0.00136 |
| BF_AL_PR_DP vs. HC_TL_PR_DP | 0.00136 |
| BF_AL_PR_DP vs. HC_TL_PR_PP | 0.001583 |
| BF_TL_NP_DP vs. HC_AL_NP_DP | 0.001673 |
| BF_TL_NP_DP vs. HC_AL_NP_PP | 0.001673 |
| BF_TL_NP_PP vs. HC_AL_NP_DP | 0.00358 |
| BF_TL_NP_PP vs. HC_AL_NP_PP | 0.00358 |
| HC_TL_NP_PP vs. BF_TL_PR_PP | 0.006716 |
| BF_AL_NP_DP vs. BF_TL_PR_PP | 0.007833 |
| HC_TL_NP_PP vs. HC_TL_PR_DP | 0.008243 |
| BF_AL_NP_DP vs. HC_TL_PR_DP | 0.009599 |
| HC_TL_NP_PP vs. HC_TL_PR_PP | 0.009599 |
| BF_TL_PR_DP vs. BF_TL_PR_PP | 0.009846 |
| BF_AL_NP_DP vs. HC_TL_PR_PP | 0.010749 |
| BF_TL_PR_DP vs. HC_TL_PR_DP | 0.010964 |
| BF_TL_NP_DP vs. BF_TL_PR_PP | 0.012109 |
| BF_TL_PR_DP vs. HC_TL_PR_PP | 0.012723 |
| BF_TL_NP_DP vs. HC_TL_PR_DP | 0.014744 |
| BF_TL_NP_DP vs. HC_TL_PR_PP | 0.017063 |
| BF_TL_NP_PP vs. BF_TL_PR_PP | 0.023874 |
| HC_TL_NP_DP vs. HC_AL_NP_DP | 0.02399 |
| HC_TL_NP_DP vs. HC_AL_NP_PP | 0.02399 |
| BF_TL_NP_PP vs. HC_TL_PR_DP | 0.027632 |
| BF_AL_NP_PP vs. HC_AL_PR_PP | 0.028537 |
| BF_TL_NP_PP vs. HC_TL_PR_PP | 0.030864 |

 

Table 4. Significant p-values according to Holm’s test for the 16 PART variant comparison for the AUC metric

| Comparison | p-value (Holm adjusted) |
|---|---|
| BF_AL_PR_DP vs. HC_AL_PR_PP | 0 |
| BF_AL_PR_DP vs. HC_AL_PR_DP | 0 |
| BF_AL_PR_DP vs. BF_AL_PR_PP | 0 |
| BF_AL_PR_DP vs. HC_AL_NP_DP | 0.000121 |
| BF_AL_PR_DP vs. HC_AL_NP_PP | 0.000121 |
| BF_AL_PR_DP vs. BF_TL_PR_PP | 0.001228 |
| BF_AL_PR_DP vs. HC_TL_PR_DP | 0.001488 |
| BF_AL_PR_DP vs. HC_TL_PR_PP | 0.00174 |
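Table 4 reports Holm-adjusted p-values. As a reminder of how Holm's step-down adjustment works in general, here is a minimal Python sketch. It is not the authors' code (the statistical software they used is not stated on this page), the function name `holm_adjust` is hypothetical, and the raw p-values are made up for illustration.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # made-up raw p-values
print([round(p, 4) for p in holm_adjust(raw)])  # [0.03, 0.06, 0.06, 0.02]
```

A comparison is declared significant when its adjusted p-value falls below the chosen significance level.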

 


 

Table 5. Error values of the sixteen variants of the PART algorithm for each dataset

Metric: Error. Variant codes: HC = Hill Climbing, BF = Best-First; TL = Treated Leaves, AL = All Leaves; NP = No Pruning, PR = Pruning; PP = Prioritize Pures, DP = Do not Prioritize. The PART column is the HC_TL_NP_PP combination.

| Dataset | PART | HC_TL_NP_DP | HC_TL_PR_PP | HC_TL_PR_DP | HC_AL_NP_PP | HC_AL_NP_DP | HC_AL_PR_PP | HC_AL_PR_DP | BF_TL_NP_PP | BF_TL_NP_DP | BF_TL_PR_PP | BF_TL_PR_DP | BF_AL_NP_PP | BF_AL_NP_DP | BF_AL_PR_PP | BF_AL_PR_DP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| breast-w | 5.21 | 5.58 | 6.78 | 5.18 | 6.93 | 6.93 | 6.33 | 6.33 | 5.72 | 5.81 | 6.72 | 5.67 | 6.81 | 4.98 | 6.15 | 5.95 |
| heartc | 21.12 | 23.11 | 23.84 | 24.50 | 26.30 | 26.30 | 27.07 | 26.74 | 22.25 | 23.71 | 23.97 | 21.65 | 24.04 | 22.93 | 24.76 | 21.79 |
| spam | 6.15 | 6.20 | 9.06 | 9.02 | 23.65 | 23.65 | 23.58 | 23.57 | 6.38 | 6.46 | 8.75 | 6.02 | 23.65 | 8.32 | 23.56 | 7.42 |
| hypo | 0.93 | 0.94 | 0.74 | 0.74 | 1.60 | 1.60 | 0.78 | 0.78 | 0.93 | 0.95 | 0.75 | 0.75 | 1.46 | 1.22 | 0.78 | 0.75 |
| liver | 35.05 | 35.16 | 35.28 | 34.76 | 41.87 | 41.87 | 38.86 | 38.86 | 36.15 | 34.87 | 33.08 | 35.43 | 40.44 | 35.26 | 35.57 | 36.00 |
| lymph | 21.78 | 25.52 | 21.91 | 21.92 | 22.02 | 22.02 | 20.73 | 20.45 | 22.27 | 22.00 | 20.13 | 18.55 | 25.09 | 22.64 | 20.95 | 19.05 |
| lymph2 | 22.12 | 21.78 | 16.89 | 17.85 | 25.75 | 25.75 | 23.56 | 23.56 | 20.87 | 21.89 | 17.59 | 18.25 | 29.94 | 19.13 | 23.83 | 17.41 |
| credit_a | 16.78 | 17.26 | 14.69 | 14.69 | 15.24 | 15.24 | 14.69 | 14.69 | 16.42 | 18.16 | 14.81 | 14.60 | 15.35 | 18.87 | 14.81 | 14.72 |
| vehicle | 27.81 | 27.40 | 29.40 | 29.68 | 49.74 | 49.74 | 49.58 | 49.72 | 27.42 | 28.53 | 31.95 | 28.40 | 47.37 | 29.33 | 47.47 | 27.89 |
| vehicle2 | 5.34 | 5.91 | 6.48 | 6.14 | 11.99 | 11.99 | 11.63 | 11.63 | 5.77 | 5.29 | 6.98 | 5.46 | 10.90 | 5.70 | 10.33 | 5.56 |
| iris | 6.00 | 5.60 | 5.07 | 5.07 | 6.00 | 6.00 | 5.07 | 5.07 | 6.00 | 5.60 | 5.07 | 5.07 | 6.00 | 6.00 | 5.07 | 5.07 |
| iris2 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 | 0.67 |
| glass | 31.17 | 29.33 | 31.80 | 30.90 | 50.94 | 50.94 | 50.85 | 50.85 | 30.99 | 29.57 | 31.31 | 29.13 | 50.94 | 31.22 | 50.85 | 31.46 |
| glass2 | 6.54 | 6.71 | 6.73 | 7.19 | 6.54 | 6.54 | 6.73 | 6.73 | 6.54 | 6.54 | 6.83 | 7.47 | 6.54 | 6.16 | 6.83 | 6.92 |
| breast-y | 35.62 | 33.96 | 26.29 | 26.29 | 32.43 | 32.43 | 26.48 | 26.48 | 35.71 | 35.33 | 26.29 | 25.45 | 32.09 | 37.61 | 26.63 | 25.45 |
| voting | 4.88 | 4.83 | 4.14 | 4.14 | 4.56 | 4.56 | 4.14 | 4.14 | 4.55 | 4.78 | 3.87 | 3.68 | 4.65 | 4.83 | 3.87 | 3.73 |
| heart-h | 19.74 | 20.03 | 19.53 | 19.39 | 20.97 | 20.97 | 19.53 | 19.40 | 19.75 | 21.70 | 19.60 | 19.81 | 19.88 | 21.56 | 19.53 | 19.81 |
| hepatitis | 19.70 | 20.17 | 17.59 | 17.59 | 20.82 | 20.82 | 17.72 | 17.72 | 18.44 | 21.03 | 17.85 | 19.29 | 22.04 | 18.95 | 17.85 | 19.06 |
| credit_g | 29.30 | 31.44 | 30.00 | 30.00 | 31.92 | 31.92 | 30.00 | 30.00 | 30.84 | 31.88 | 30.00 | 27.26 | 32.10 | 31.98 | 30.00 | 27.26 |
| soybean_15CL | 9.52 | 12.28 | 11.38 | 13.24 | 56.76 | 56.76 | 56.76 | 56.76 | 9.79 | 12.07 | 11.52 | 13.31 | 56.76 | 8.69 | 56.76 | 11.10 |
| soybean_15CL2 | 7.10 | 8.00 | 7.66 | 7.59 | 7.59 | 7.59 | 7.45 | 7.45 | 7.66 | 7.10 | 6.69 | 5.72 | 6.21 | 6.90 | 6.69 | 5.79 |
| segment2310 | 3.31 | 3.83 | 21.14 | 5.99 | 56.81 | 56.81 | 44.43 | 56.81 | 3.37 | 3.79 | 21.60 | 4.05 | 56.81 | 3.55 | 44.32 | 3.63 |
| segment23102 | 0.55 | 0.54 | 6.08 | 6.10 | 6.45 | 6.45 | 6.45 | 6.45 | 0.55 | 0.41 | 3.43 | 0.73 | 2.01 | 0.61 | 3.43 | 0.74 |
| segment210 | 13.71 | 15.14 | 13.33 | 13.62 | 55.81 | 55.81 | 55.81 | 55.81 | 14.00 | 13.81 | 13.24 | 13.52 | 55.81 | 12.48 | 55.81 | 12.10 |
| segment2102 | 4.48 | 2.48 | 4.48 | 2.48 | 4.48 | 4.48 | 4.48 | 4.48 | 4.48 | 4.38 | 4.76 | 4.38 | 4.48 | 4.48 | 4.76 | 4.48 |
| sick-euthyroid | 2.81 | 2.60 | 6.68 | 6.65 | 5.84 | 5.84 | 6.93 | 6.93 | 2.90 | 2.79 | 6.73 | 3.74 | 5.56 | 4.53 | 6.73 | 3.69 |
| bands | 24.68 | 24.16 | 29.72 | 29.35 | 29.67 | 29.67 | 29.82 | 29.82 | 23.82 | 22.49 | 28.86 | 24.75 | 29.63 | 25.74 | 28.96 | 25.02 |
| ks-vs-kp | 0.73 | 0.66 | 1.80 | 1.61 | 8.27 | 8.27 | 9.17 | 9.24 | 0.77 | 0.84 | 4.04 | 0.84 | 9.99 | 1.11 | 6.46 | 0.95 |
| optdigits2 | 1.82 | 2.07 | 7.27 | 7.36 | 7.38 | 7.38 | 7.48 | 7.43 | 1.89 | 2.37 | 7.28 | 2.84 | 7.38 | 1.49 | 7.51 | 3.11 |
| car2 | 1.86 | 2.71 | 5.71 | 8.00 | 15.59 | 15.59 | 15.11 | 15.12 | 1.44 | 0.71 | 5.50 | 3.09 | 15.14 | 1.09 | 14.34 | 3.14 |
| abalone2 | 8.75 | 8.74 | 8.67 | 8.67 | 8.70 | 8.70 | 8.67 | 8.67 | 8.72 | 8.84 | 8.67 | 8.67 | 8.75 | 8.81 | 8.67 | 8.67 |
| solar_flare | 16.90 | 17.13 | 15.69 | 15.69 | 15.98 | 15.98 | 15.69 | 15.69 | 17.05 | 17.00 | 15.71 | 15.58 | 16.00 | 16.76 | 15.71 | 15.58 |
| yeast2 | 24.77 | 24.80 | 27.51 | 27.64 | 29.07 | 29.07 | 28.19 | 28.28 | 24.95 | 24.49 | 26.65 | 24.37 | 28.60 | 25.34 | 27.95 | 24.62 |
| splice_junction2 | 4.98 | 5.23 | 24.08 | 24.08 | 24.08 | 24.08 | 24.08 | 24.08 | 5.02 | 5.95 | 24.08 | 4.31 | 24.08 | 5.09 | 24.08 | 4.27 |
| kddcup | 0.46 | 0.46 | 0.80 | 0.76 | 0.96 | 0.96 | 0.85 | 0.85 | 0.43 | 0.36 | 1.03 | 0.54 | 0.92 | 0.45 | 1.07 | 0.63 |
| pima | 26.97 | 27.00 | 25.67 | 25.88 | 28.07 | 28.07 | 26.50 | 26.50 | 27.15 | 26.90 | 25.47 | 26.14 | 27.99 | 25.99 | 26.30 | 26.17 |
| Mean | 13.04 | 13.32 | 14.57 | 14.18 | 21.15 | 21.15 | 20.16 | 20.49 | 13.10 | 13.31 | 14.48 | 12.48 | 21.00 | 13.34 | 19.70 | 12.49 |
| Median | 7.93 | 8.37 | 12.36 | 11.13 | 18.40 | 18.40 | 16.71 | 16.71 | 8.19 | 7.97 | 12.38 | 8.07 | 17.94 | 8.51 | 16.78 | 8.04 |

 

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.

 

Table 6. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the Error metric

| Comparison | p-value (Shaffer adjusted) |
|---|---|
| BF_TL_PR_DP vs. HC_AL_NP_DP | 0 |
| BF_TL_PR_DP vs. HC_AL_NP_PP | 0 |
| BF_TL_PR_DP vs. BF_AL_NP_PP | 0 |
| BF_AL_PR_DP vs. HC_AL_NP_DP | 0 |
| BF_AL_PR_DP vs. HC_AL_NP_PP | 0 |
| BF_AL_PR_DP vs. BF_AL_NP_PP | 0 |
| HC_TL_NP_PP vs. HC_AL_NP_DP | 0 |
| HC_TL_NP_PP vs. HC_AL_NP_PP | 0.000002 |
| HC_TL_PR_DP vs. HC_AL_NP_DP | 0.000002 |
| HC_TL_PR_DP vs. HC_AL_NP_PP | 0.000041 |
| HC_TL_NP_PP vs. BF_AL_NP_PP | 0.000041 |
| BF_TL_PR_DP vs. HC_AL_PR_PP | 0.000044 |
| BF_TL_PR_DP vs. HC_AL_PR_DP | 0.000083 |
| BF_TL_NP_PP vs. HC_AL_NP_DP | 0.000083 |
| BF_TL_NP_PP vs. HC_AL_NP_PP | 0.000106 |
| BF_TL_NP_DP vs. HC_AL_NP_DP | 0.000106 |
| BF_TL_NP_DP vs. HC_AL_NP_PP | 0.000253 |
| BF_TL_PR_PP vs. HC_AL_NP_DP | 0.000253 |
| BF_TL_PR_PP vs. HC_AL_NP_PP | 0.000253 |
| HC_TL_PR_PP vs. HC_AL_NP_DP | 0.000253 |
| HC_TL_PR_PP vs. HC_AL_NP_PP | 0.000381 |
| HC_TL_PR_DP vs. BF_AL_NP_PP | 0.000381 |
| BF_AL_NP_DP vs. HC_AL_NP_DP | 0.000483 |
| BF_AL_NP_DP vs. HC_AL_NP_PP | 0.000576 |
| HC_TL_NP_DP vs. HC_AL_NP_DP | 0.000576 |
| HC_TL_NP_DP vs. HC_AL_NP_PP | 0.000726 |
| BF_TL_PR_DP vs. BF_AL_PR_PP | 0.000726 |
| BF_AL_PR_DP vs. HC_AL_PR_PP | 0.000914 |
| BF_AL_PR_DP vs. HC_AL_PR_DP | 0.000914 |
| BF_TL_NP_PP vs. BF_AL_NP_PP | 0.000914 |
| BF_TL_NP_DP vs. BF_AL_NP_PP | 0.001136 |
| BF_TL_PR_PP vs. BF_AL_NP_PP | 0.002203 |
| HC_TL_PR_PP vs. BF_AL_NP_PP | 0.002456 |
| BF_AL_NP_DP vs. BF_AL_NP_PP | 0.00358 |
| HC_TL_NP_DP vs. BF_AL_NP_PP | 0.00518 |
| BF_AL_PR_DP vs. BF_AL_PR_PP | 0.006378 |

 

Table 7. Significant p-values according to Holm’s test for the 16 PART variant comparison for the Error metric

| Comparison | p-value (Holm adjusted) |
|---|---|
| BF_TL_PR_DP vs. HC_AL_NP_DP | 0 |
| BF_TL_PR_DP vs. HC_AL_NP_PP | 0 |
| BF_TL_PR_DP vs. BF_AL_NP_PP | 0 |
| BF_TL_PR_DP vs. HC_AL_PR_PP | 0.000086 |
| BF_TL_PR_DP vs. HC_AL_PR_DP | 0.000086 |
| BF_TL_PR_DP vs. BF_AL_PR_PP | 0.000934 |

 

 


Table 8. Complexity values of the sixteen variants of the PART algorithm for each dataset

Metric: Complexity. Variant codes: HC = Hill Climbing, BF = Best-First; TL = Treated Leaves, AL = All Leaves; NP = No Pruning, PR = Pruning; PP = Prioritize Pures, DP = Do not Prioritize. The PART column is the HC_TL_NP_PP combination.

| Dataset | PART | HC_TL_NP_DP | HC_TL_PR_PP | HC_TL_PR_DP | HC_AL_NP_PP | HC_AL_NP_DP | HC_AL_PR_PP | HC_AL_PR_DP | BF_TL_NP_PP | BF_TL_NP_DP | BF_TL_PR_PP | BF_TL_PR_DP | BF_AL_NP_PP | BF_AL_NP_DP | BF_AL_PR_PP | BF_AL_PR_DP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| breast-w | 26.92 | 42.00 | 9.56 | 11.32 | 12.00 | 12.00 | 5.32 | 5.32 | 29.62 | 40.58 | 8.04 | 5.98 | 13.28 | 25.18 | 5.54 | 7.32 |
| heartc | 23.70 | 23.92 | 6.04 | 5.96 | 7.86 | 7.86 | 4.68 | 4.78 | 28.24 | 25.58 | 6.12 | 7.80 | 13.04 | 25.08 | 5.74 | 7.90 |
| spam | 47.12 | 49.10 | 9.10 | 8.90 | 6.92 | 6.92 | 4.48 | 4.70 | 42.44 | 30.80 | 8.44 | 17.90 | 8.24 | 33.70 | 5.04 | 19.16 |
| hypo | 11.58 | 10.72 | 3.14 | 3.14 | 8.88 | 8.88 | 3.04 | 3.04 | 18.80 | 19.12 | 3.90 | 4.36 | 16.20 | 17.08 | 3.90 | 4.36 |
| liver | 8.28 | 9.86 | 5.94 | 6.08 | 7.20 | 7.20 | 5.82 | 5.96 | 9.12 | 10.48 | 7.00 | 7.82 | 8.28 | 9.44 | 7.02 | 7.74 |
| lymph | 18.04 | 18.24 | 6.02 | 6.30 | 8.56 | 8.56 | 5.60 | 5.64 | 19.32 | 17.72 | 5.32 | 7.40 | 10.04 | 18.34 | 5.12 | 7.28 |
| lymph2 | 15.00 | 15.38 | 5.14 | 5.16 | 7.46 | 7.46 | 3.12 | 3.12 | 16.14 | 16.24 | 3.48 | 5.78 | 10.34 | 15.68 | 3.44 | 5.70 |
| credit_a | 41.08 | 43.88 | 2.24 | 2.24 | 11.68 | 11.68 | 2.24 | 2.24 | 70.40 | 57.24 | 3.74 | 8.32 | 29.44 | 67.52 | 3.74 | 8.24 |
| vehicle | 32.78 | 38.12 | 19.64 | 23.06 | 6.70 | 6.70 | 6.62 | 6.66 | 32.02 | 42.00 | 18.72 | 22.92 | 9.20 | 28.14 | 7.62 | 22.38 |
| vehicle2 | 13.64 | 16.16 | 9.10 | 8.30 | 5.70 | 5.70 | 4.50 | 4.50 | 13.50 | 17.34 | 8.68 | 10.54 | 8.66 | 13.18 | 7.68 | 10.70 |
| iris | 3.90 | 3.74 | 3.08 | 3.12 | 3.90 | 3.90 | 3.08 | 3.12 | 3.90 | 3.64 | 3.08 | 3.08 | 3.90 | 3.90 | 3.08 | 3.08 |
| iris2 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 | 2.00 |
| glass | 15.78 | 15.66 | 10.88 | 11.36 | 3.38 | 3.38 | 3.20 | 3.20 | 15.68 | 15.24 | 11.56 | 13.06 | 7.62 | 15.56 | 6.36 | 13.16 |
| glass2 | 4.92 | 5.52 | 4.80 | 4.76 | 4.92 | 4.92 | 4.80 | 4.80 | 4.82 | 4.96 | 4.76 | 4.70 | 4.82 | 4.58 | 4.76 | 4.84 |
| breast-y | 41.00 | 42.80 | 3.14 | 3.14 | 12.48 | 12.48 | 3.02 | 3.02 | 52.54 | 47.56 | 3.12 | 3.38 | 15.20 | 47.34 | 3.08 | 3.38 |
| voting | 10.14 | 10.00 | 2.46 | 2.46 | 6.48 | 6.48 | 2.46 | 2.46 | 20.72 | 20.00 | 4.80 | 4.80 | 26.86 | 27.64 | 4.80 | 4.80 |
| heart-h | 17.92 | 18.08 | 2.84 | 2.78 | 7.78 | 7.78 | 2.76 | 2.70 | 20.76 | 21.50 | 3.18 | 3.26 | 17.50 | 21.68 | 3.18 | 3.26 |
| hepatitis | 10.90 | 10.08 | 3.72 | 3.72 | 7.76 | 7.76 | 3.66 | 3.66 | 12.30 | 10.76 | 5.02 | 5.00 | 12.20 | 12.18 | 5.02 | 5.10 |
| credit_g | 102.64 | 109.64 | 1.00 | 1.00 | 13.60 | 13.60 | 1.00 | 1.00 | 114.36 | 125.96 | 1.00 | 18.54 | 20.78 | 97.12 | 1.00 | 18.40 |
| soybean_15CL | 30.40 | 30.10 | 22.68 | 24.04 | 5.56 | 5.56 | 5.56 | 5.56 | 32.56 | 31.34 | 23.14 | 24.84 | 6.06 | 30.54 | 6.04 | 24.56 |
| soybean_15CL2 | 17.82 | 13.54 | 3.16 | 2.96 | 10.18 | 10.18 | 3.16 | 3.16 | 17.82 | 15.60 | 3.70 | 3.76 | 8.78 | 16.92 | 3.70 | 4.00 |
| segment2310 | 27.74 | 29.46 | 14.14 | 20.20 | 4.96 | 4.96 | 5.42 | 4.96 | 28.52 | 28.44 | 15.34 | 23.36 | 5.64 | 25.96 | 7.08 | 21.38 |
| segment23102 | 7.16 | 15.64 | 2.20 | 2.36 | 3.54 | 3.54 | 3.30 | 3.30 | 8.56 | 7.94 | 5.72 | 6.08 | 8.48 | 8.56 | 5.72 | 6.08 |
| segment210 | 10.38 | 10.52 | 9.54 | 9.62 | 3.18 | 3.18 | 3.18 | 3.18 | 10.60 | 9.86 | 9.40 | 9.06 | 3.52 | 10.04 | 3.40 | 9.16 |
| segment2102 | 4.62 | 3.02 | 4.62 | 3.02 | 4.62 | 4.62 | 4.62 | 4.62 | 4.62 | 3.80 | 4.48 | 3.80 | 4.62 | 4.62 | 4.48 | 4.62 |
| sick-euthyroid | 25.30 | 23.90 | 3.46 | 3.10 | 11.14 | 11.14 | 2.06 | 2.06 | 27.92 | 42.94 | 5.48 | 7.54 | 16.98 | 31.18 | 5.44 | 7.76 |
| bands | 36.44 | 36.90 | 5.90 | 7.04 | 5.44 | 5.44 | 4.02 | 4.02 | 39.24 | 37.24 | 9.50 | 16.50 | 10.74 | 34.46 | 5.06 | 17.48 |
| ks-vs-kp | 22.88 | 17.22 | 12.74 | 9.26 | 7.46 | 7.46 | 5.32 | 4.28 | 25.76 | 22.14 | 10.30 | 9.44 | 8.84 | 24.18 | 8.72 | 15.62 |
| optdigits2 | 72.54 | 112.10 | 7.48 | 7.46 | 8.90 | 8.90 | 4.58 | 4.62 | 74.96 | 109.64 | 7.16 | 13.16 | 8.86 | 58.34 | 4.48 | 9.60 |
| car2 | 37.62 | 61.52 | 12.06 | 23.56 | 11.90 | 11.90 | 7.82 | 7.94 | 33.52 | 24.02 | 12.46 | 19.66 | 15.16 | 28.38 | 9.96 | 21.90 |
| abalone2 | 9.42 | 9.52 | 1.06 | 1.06 | 7.32 | 7.32 | 1.06 | 1.06 | 9.34 | 12.24 | 1.00 | 1.00 | 8.96 | 10.38 | 1.00 | 1.00 |
| solar_flare | 46.00 | 47.60 | 1.00 | 1.00 | 11.70 | 11.70 | 1.00 | 1.00 | 76.02 | 63.14 | 1.56 | 2.52 | 25.92 | 55.10 | 1.56 | 2.52 |
| yeast2 | 11.98 | 13.74 | 3.30 | 3.36 | 5.86 | 5.86 | 2.44 | 2.48 | 11.36 | 10.88 | 5.88 | 8.14 | 8.26 | 10.26 | 5.26 | 7.98 |
| splice_junction2 | 75.10 | 104.00 | 1.00 | 1.00 | 11.76 | 11.76 | 1.00 | 1.00 | 93.94 | 115.22 | 1.00 | 22.26 | 11.76 | 81.48 | 1.00 | 21.02 |
| kddcup | 10.16 | 13.86 | 6.70 | 7.44 | 7.54 | 7.54 | 6.46 | 6.30 | 10.70 | 9.98 | 5.98 | 7.34 | 7.90 | 9.86 | 5.84 | 7.18 |
| pima | 7.28 | 8.90 | 5.58 | 6.08 | 6.06 | 6.06 | 6.34 | 6.48 | 8.58 | 8.78 | 5.94 | 7.02 | 6.92 | 8.52 | 6.10 | 6.82 |
| Mean | 25.01 | 28.79 | 6.29 | 6.87 | 7.57 | 7.57 | 3.85 | 3.83 | 28.91 | 30.05 | 6.67 | 9.50 | 11.25 | 25.95 | 4.80 | 9.65 |
| Median | 17.87 | 16.69 | 4.97 | 4.96 | 7.39 | 7.39 | 3.48 | 3.48 | 20.02 | 19.56 | 5.60 | 7.47 | 8.91 | 20.01 | 5.03 | 7.53 |

 

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.

 

Table 9. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the Complexity metric

| Comparison | p-value (Shaffer adjusted) |
|---|---|
| HC_AL_PR_PP vs. BF_TL_NP_PP | 0 |
| HC_AL_PR_DP vs. BF_TL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_PP | 0 |
| BF_AL_PR_PP vs. BF_TL_NP_PP | 0 |
| HC_AL_PR_DP vs. BF_AL_NP_DP | 0 |
| HC_AL_PR_DP vs. HC_TL_NP_PP | 0 |
| BF_AL_PR_PP vs. BF_TL_NP_DP | 0 |
| BF_AL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_TL_PR_PP vs. BF_TL_NP_PP | 0 |
| HC_TL_PR_DP vs. BF_TL_NP_PP | 0 |
| BF_AL_PR_PP vs. BF_AL_NP_DP | 0 |
| BF_TL_PR_PP vs. BF_TL_NP_PP | 0 |
| BF_AL_PR_PP vs. HC_TL_NP_PP | 0 |
| HC_TL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_TL_PR_DP vs. BF_TL_NP_DP | 0 |
| HC_TL_PR_PP vs. HC_TL_NP_DP | 0 |
| BF_TL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_TL_PR_DP vs. HC_TL_NP_DP | 0 |
| BF_TL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_TL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_TL_PR_PP vs. HC_TL_NP_PP | 0 |
| HC_TL_PR_DP vs. BF_AL_NP_DP | 0 |
| HC_TL_PR_DP vs. HC_TL_NP_PP | 0 |
| BF_TL_PR_PP vs. BF_AL_NP_DP | 0 |
| BF_TL_PR_PP vs. HC_TL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_AL_NP_PP | 0 |
| HC_AL_NP_DP vs. BF_TL_NP_PP | 0 |
| HC_AL_NP_PP vs. BF_TL_NP_PP | 0 |
| HC_AL_PR_DP vs. BF_AL_NP_PP | 0 |
| BF_TL_PR_DP vs. BF_TL_NP_PP | 0.000001 |
| HC_AL_NP_DP vs. BF_TL_NP_DP | 0.000003 |
| HC_AL_NP_PP vs. BF_TL_NP_DP | 0.000003 |
| HC_AL_NP_DP vs. HC_TL_NP_DP | 0.00001 |
| HC_AL_NP_PP vs. HC_TL_NP_DP | 0.00001 |
| BF_AL_PR_DP vs. BF_TL_NP_PP | 0.000024 |
| BF_TL_PR_DP vs. BF_TL_NP_DP | 0.000027 |
| HC_AL_PR_PP vs. BF_AL_PR_DP | 0.000055 |
| BF_TL_PR_DP vs. HC_TL_NP_DP | 0.000074 |
| HC_AL_NP_DP vs. BF_AL_NP_DP | 0.000074 |
| HC_AL_NP_PP vs. BF_AL_NP_DP | 0.000074 |
| HC_AL_NP_DP vs. HC_TL_NP_PP | 0.0001 |
| HC_AL_NP_PP vs. HC_TL_NP_PP | 0.0001 |
| HC_AL_PR_DP vs. BF_AL_PR_DP | 0.000154 |
| BF_AL_PR_PP vs. BF_AL_NP_PP | 0.000154 |
| BF_AL_PR_DP vs. BF_TL_NP_DP | 0.000322 |
| BF_TL_PR_DP vs. BF_AL_NP_DP | 0.000458 |
| HC_AL_PR_PP vs. BF_TL_PR_DP | 0.000611 |
| BF_TL_PR_DP vs. HC_TL_NP_PP | 0.000611 |
| BF_AL_PR_DP vs. HC_TL_NP_DP | 0.00079 |
| HC_AL_PR_DP vs. BF_TL_PR_DP | 0.001609 |
| HC_AL_PR_PP vs. HC_AL_NP_DP | 0.003195 |
| HC_AL_PR_PP vs. HC_AL_NP_PP | 0.003195 |
| BF_AL_NP_PP vs. BF_TL_NP_PP | 0.003901 |
| BF_AL_PR_DP vs. BF_AL_NP_DP | 0.00411 |
| HC_TL_PR_PP vs. BF_AL_NP_PP | 0.005324 |
| BF_AL_PR_DP vs. HC_TL_NP_PP | 0.005324 |
| HC_AL_PR_DP vs. HC_AL_NP_DP | 0.007354 |
| HC_AL_PR_DP vs. HC_AL_NP_PP | 0.007354 |
| HC_TL_PR_DP vs. BF_AL_NP_PP | 0.010072 |
| BF_AL_PR_PP vs. BF_AL_PR_DP | 0.016882 |
| BF_TL_PR_PP vs. BF_AL_NP_PP | 0.019123 |
| BF_AL_NP_PP vs. BF_TL_NP_DP | 0.028813 |

 

Table 10. Significant p-values according to Holm’s test for the 16 PART variant comparison for the Complexity metric

| Comparison | p-value (Holm adjusted) |
|---|---|
| HC_AL_PR_PP vs. BF_TL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_AL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_AL_PR_DP | 0.000055 |
| HC_AL_PR_PP vs. BF_TL_PR_DP | 0.000611 |
| HC_AL_PR_PP vs. HC_AL_NP_DP | 0.003195 |
| HC_AL_PR_PP vs. HC_AL_NP_PP | 0.003195 |

 


Table 11. Length values of the sixteen variants of the PART algorithm for each dataset

Metric: Length. Variant codes: HC = Hill Climbing, BF = Best-First; TL = Treated Leaves, AL = All Leaves; NP = No Pruning, PR = Pruning; PP = Prioritize Pures, DP = Do not Prioritize. The PART column is the HC_TL_NP_PP combination.

| Dataset | PART | HC_TL_NP_DP | HC_TL_PR_PP | HC_TL_PR_DP | HC_AL_NP_PP | HC_AL_NP_DP | HC_AL_PR_PP | HC_AL_PR_DP | BF_TL_NP_PP | BF_TL_NP_DP | BF_TL_PR_PP | BF_TL_PR_DP | BF_AL_NP_PP | BF_AL_NP_DP | BF_AL_PR_PP | BF_AL_PR_DP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| breast-w | 1.54 | 1.98 | 0.91 | 1.19 | 1.14 | 1.14 | 0.81 | 0.81 | 1.66 | 1.83 | 0.90 | 0.97 | 1.27 | 1.54 | 0.83 | 0.95 |
| heartc | 2.59 | 3.34 | 1.21 | 1.32 | 1.44 | 1.44 | 0.96 | 0.99 | 3.12 | 2.98 | 1.31 | 1.74 | 2.43 | 2.85 | 1.29 | 1.70 |
| spam | 5.01 | 7.96 | 2.62 | 2.92 | 3.04 | 3.04 | 0.93 | 0.99 | 6.87 | 16.97 | 3.15 | 8.90 | 3.93 | 10.70 | 1.85 | 7.87 |
| hypo | 2.69 | 3.09 | 1.30 | 1.30 | 2.00 | 2.00 | 1.27 | 1.27 | 4.36 | 4.26 | 1.38 | 1.45 | 4.03 | 4.12 | 1.38 | 1.44 |
| liver | 2.48 | 3.01 | 1.59 | 1.86 | 1.20 | 1.20 | 1.10 | 1.10 | 2.58 | 2.79 | 1.68 | 2.19 | 2.00 | 2.61 | 1.58 | 2.14 |
| lymph | 2.61 | 3.19 | 1.54 | 1.58 | 1.92 | 1.92 | 1.49 | 1.49 | 2.83 | 2.98 | 1.59 | 2.13 | 2.34 | 2.93 | 1.56 | 2.12 |
| lymph2 | 1.90 | 2.47 | 1.05 | 1.13 | 1.21 | 1.21 | 0.86 | 0.87 | 2.17 | 2.31 | 1.16 | 1.66 | 1.66 | 2.18 | 1.13 | 1.60 |
| credit_a | 2.44 | 3.08 | 0.57 | 0.57 | 1.72 | 1.72 | 0.54 | 0.54 | 3.33 | 3.67 | 0.82 | 1.69 | 2.49 | 3.40 | 0.81 | 1.68 |
| vehicle | 3.72 | 4.67 | 3.44 | 4.01 | 1.16 | 1.16 | 1.15 | 1.17 | 4.39 | 5.03 | 3.32 | 4.60 | 2.32 | 5.01 | 1.82 | 4.35 |
| vehicle2 | 2.37 | 4.09 | 1.90 | 3.08 | 1.20 | 1.20 | 1.06 | 1.06 | 2.91 | 3.23 | 1.93 | 2.51 | 1.97 | 2.89 | 1.71 | 2.61 |
| iris | 1.00 | 1.55 | 1.00 | 1.36 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 1.12 | 1.00 | 1.00 | 1.02 | 1.02 | 1.00 | 1.00 |
| iris2 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 |
| glass | 2.80 | 3.09 | 2.36 | 2.50 | 0.77 | 0.77 | 0.75 | 0.75 | 3.23 | 3.74 | 2.70 | 3.20 | 2.31 | 3.48 | 2.03 | 3.07 |
| glass2 | 1.49 | 2.13 | 1.44 | 2.00 | 1.49 | 1.49 | 1.44 | 1.45 | 1.58 | 2.04 | 1.38 | 1.72 | 1.58 | 1.57 | 1.38 | 1.46 |
| breast-y | 2.50 | 2.65 | 0.64 | 0.65 | 1.35 | 1.35 | 0.63 | 0.63 | 2.93 | 2.81 | 0.65 | 0.73 | 1.66 | 2.93 | 0.65 | 0.73 |
| voting | 1.93 | 1.96 | 0.58 | 0.58 | 1.16 | 1.16 | 0.58 | 0.58 | 5.12 | 5.07 | 1.00 | 1.00 | 5.33 | 5.36 | 1.00 | 1.00 |
| heart-h | 2.52 | 2.68 | 0.84 | 0.83 | 1.57 | 1.57 | 0.81 | 0.80 | 3.13 | 3.35 | 0.96 | 1.01 | 2.92 | 3.19 | 0.96 | 1.01 |
| hepatitis | 2.41 | 3.10 | 0.89 | 0.92 | 1.89 | 1.89 | 0.88 | 0.88 | 2.66 | 2.98 | 1.76 | 1.85 | 2.64 | 2.65 | 1.76 | 1.79 |
| credit_g | 3.25 | 3.97 | 0.00 | 0.00 | 1.91 | 1.91 | 0.00 | 0.00 | 3.83 | 4.82 | 0.00 | 2.58 | 2.63 | 4.28 | 0.00 | 2.56 |
| soybean_15CL | 2.67 | 5.29 | 2.45 | 4.38 | 1.36 | 1.36 | 1.28 | 1.28 | 3.02 | 5.34 | 2.72 | 4.29 | 1.44 | 3.21 | 1.36 | 3.16 |
| soybean_15CL2 | 2.03 | 4.60 | 0.55 | 0.72 | 1.56 | 1.56 | 0.54 | 0.54 | 2.04 | 3.85 | 0.68 | 0.87 | 1.66 | 2.03 | 0.68 | 0.75 |
| segment2310 | 3.06 | 4.27 | 1.98 | 3.67 | 1.18 | 1.18 | 1.19 | 1.18 | 3.51 | 6.29 | 2.19 | 4.78 | 1.29 | 4.29 | 1.41 | 3.68 |
| segment23102 | 1.73 | 2.81 | 1.02 | 1.07 | 0.71 | 0.71 | 0.69 | 0.69 | 1.76 | 2.19 | 1.40 | 2.14 | 1.76 | 1.84 | 1.40 | 1.48 |
| segment210 | 1.85 | 3.11 | 1.79 | 2.83 | 0.82 | 0.82 | 0.82 | 0.82 | 2.00 | 3.26 | 1.78 | 2.96 | 0.83 | 2.01 | 0.87 | 2.04 |
| segment2102 | 1.14 | 1.48 | 1.14 | 1.48 | 1.14 | 1.14 | 1.14 | 1.14 | 1.14 | 1.36 | 1.10 | 1.36 | 1.14 | 1.14 | 1.10 | 1.14 |

sick-euthyroid

4.61

6.43

0.52

0.95

2.46

2.46

0.40

0.40

5.63

6.08

1.37

2.02

4.26

6.04

1.37

1.98

bands

2.92

4.08

1.39

2.02

1.07

1.07

1.03

1.03

3.77

5.02

1.92

4.48

2.28

4.21

1.76

3.56

ks-vs-kp

3.11

7.23

2.01

4.77

2.28

2.28

1.67

1.63

4.18

6.89

2.01

5.06

2.10

4.47

2.00

3.39

optdigits2

1.76

3.60

1.21

1.45

2.08

2.08

1.20

1.14

1.89

3.41

1.25

1.75

2.09

2.36

1.23

1.62

car2

2.07

4.12

1.00

3.07

0.92

0.92

0.86

0.86

2.03

2.93

1.07

1.91

1.21

1.86

0.94

1.55

abalone2

1.90

2.09

0.02

0.02

1.70

1.70

0.02

0.02

1.95

2.53

0.00

0.00

1.91

2.18

0.00

0.00

solar_flare

3.10

3.33

0.00

0.00

1.59

1.59

0.00

0.00

4.19

4.62

0.19

0.51

2.49

4.17

0.19

0.51

yeast2

2.88

3.46

0.60

0.66

1.31

1.31

0.54

0.54

3.03

3.44

1.63

2.30

2.30

3.06

1.46

2.27

splice_junction2

2.43

4.37

0.00

0.00

0.91

0.91

0.00

0.00

2.85

4.35

0.00

2.43

0.91

3.60

0.00

1.75

kddcup

2.56

3.74

1.87

2.65

2.05

2.05

1.83

1.99

2.76

3.23

1.88

2.85

2.21

2.55

1.80

2.40

pima

2.02

2.31

1.44

1.73

1.11

1.11

1.28

1.29

2.16

2.40

1.44

1.98

1.54

2.23

1.39

1.92

Mean

2.43

3.47

1.21

1.66

1.44

1.44

0.87

0.87

2.95

3.88

1.38

2.31

2.12

3.18

1.17

2.02

Median

2.46

3.15

1.09

1.34

1.33

1.33

0.87

0.87

2.88

3.30

1.38

1.94

2.05

2.91

1.33

1.73

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.

Table 12. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the Length metric

| Comparison | p-value (Shaffer adjusted) |
|---|---|
| HC_AL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_AL_NP_DP | 0 |
| BF_AL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_TL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. BF_TL_NP_PP | 0 |
| BF_AL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_TL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_TL_NP_PP | 0 |
| BF_TL_PR_PP vs. BF_TL_NP_DP | 0 |
| BF_TL_PR_PP vs. HC_TL_NP_DP | 0 |
| BF_AL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_TL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_AL_NP_DP vs. BF_TL_NP_DP | 0 |
| HC_AL_NP_PP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_TL_PR_DP | 0 |
| HC_AL_NP_DP vs. HC_TL_NP_DP | 0 |
| HC_AL_NP_PP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. HC_TL_NP_PP | 0 |
| BF_AL_PR_PP vs. BF_TL_NP_PP | 0 |
| HC_TL_PR_PP vs. BF_TL_NP_PP | 0 |
| HC_AL_PR_DP vs. BF_TL_PR_DP | 0 |
| BF_TL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_AL_PR_PP vs. BF_AL_NP_PP | 0 |
| HC_TL_PR_DP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_AL_NP_PP | 0 |
| BF_TL_PR_PP vs. BF_TL_NP_PP | 0 |
| HC_TL_PR_DP vs. HC_TL_NP_DP | 0 |
| HC_AL_NP_DP vs. BF_AL_NP_DP | 0.000001 |
| HC_AL_NP_PP vs. BF_AL_NP_DP | 0.000001 |
| BF_AL_PR_DP vs. BF_TL_NP_DP | 0.000001 |
| HC_AL_PR_PP vs. BF_AL_PR_DP | 0.000003 |
| BF_AL_PR_DP vs. HC_TL_NP_DP | 0.000008 |
| HC_AL_PR_DP vs. BF_AL_PR_DP | 0.00001 |
| HC_AL_NP_DP vs. BF_TL_NP_PP | 0.000017 |
| HC_AL_NP_PP vs. BF_TL_NP_PP | 0.000017 |
| BF_AL_PR_PP vs. HC_TL_NP_PP | 0.000017 |
| HC_TL_PR_PP vs. HC_TL_NP_PP | 0.000018 |
| BF_AL_PR_PP vs. BF_TL_PR_DP | 0.00002 |
| HC_TL_PR_PP vs. BF_TL_PR_DP | 0.00002 |
| HC_AL_PR_PP vs. HC_TL_PR_DP | 0.000106 |
| HC_TL_PR_DP vs. BF_AL_NP_DP | 0.000136 |
| BF_AL_NP_PP vs. BF_TL_NP_DP | 0.000196 |
| HC_AL_PR_DP vs. HC_TL_PR_DP | 0.000265 |
| BF_AL_PR_PP vs. BF_AL_NP_PP | 0.000536 |
| HC_TL_PR_PP vs. BF_AL_NP_PP | 0.000545 |
| BF_AL_NP_PP vs. HC_TL_NP_DP | 0.000964 |
| BF_TL_PR_PP vs. HC_TL_NP_PP | 0.001682 |
| BF_TL_PR_PP vs. BF_TL_PR_DP | 0.001953 |
| HC_TL_PR_DP vs. BF_TL_NP_PP | 0.002145 |
| BF_AL_PR_DP vs. BF_AL_NP_DP | 0.002766 |
| BF_TL_PR_DP vs. BF_TL_NP_DP | 0.003743 |
| HC_TL_NP_PP vs. BF_TL_NP_DP | 0.00411 |
| HC_AL_PR_PP vs. HC_AL_NP_DP | 0.009289 |
| HC_AL_PR_PP vs. HC_AL_NP_PP | 0.009289 |
| BF_TL_PR_DP vs. HC_TL_NP_DP | 0.014456 |
| HC_TL_NP_PP vs. HC_TL_NP_DP | 0.016435 |
| HC_AL_PR_DP vs. HC_AL_NP_DP | 0.018655 |
| HC_AL_PR_DP vs. HC_AL_NP_PP | 0.018655 |
| BF_TL_PR_PP vs. BF_AL_NP_PP | 0.025064 |
| BF_AL_PR_DP vs. BF_TL_NP_PP | 0.027022 |
| BF_AL_PR_PP vs. BF_AL_PR_DP | 0.027022 |
| HC_TL_PR_PP vs. BF_AL_PR_DP | 0.027022 |
| HC_AL_NP_DP vs. HC_TL_NP_PP | 0.036178 |
| HC_AL_NP_PP vs. HC_TL_NP_PP | 0.036178 |
| HC_AL_NP_DP vs. BF_TL_PR_DP | 0.040604 |
| HC_AL_NP_PP vs. BF_TL_PR_DP | 0.040604 |

 

Table 13. Significant p-values according to Holm’s test for the 16 PART variant comparison for the Length metric

| Comparison | p-value (Holm adjusted) |
|---|---|
| HC_AL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_AL_PR_PP vs. BF_TL_NP_PP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_TL_PR_DP | 0 |
| HC_AL_PR_PP vs. BF_AL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_AL_PR_DP | 0.000004 |
| HC_AL_PR_PP vs. HC_TL_PR_DP | 0.000112 |
| HC_AL_PR_PP vs. HC_AL_NP_DP | 0.009594 |
| HC_AL_PR_PP vs. HC_AL_NP_PP | 0.009594 |


Table 14. Time values of the sixteen variants of the PART algorithm for each dataset

Variant codes: HC = Hill Climbing, BF = Best-First; TL = Treated Leaves, AL = All Leaves; NP = No Pruning, PR = Pruning; PP = Prioritize Pures, DP = Do not Prioritize. The PART column is the HC_TL_NP_PP variant.

| Time | PART | HC_TL_NP_DP | HC_TL_PR_PP | HC_TL_PR_DP | HC_AL_NP_PP | HC_AL_NP_DP | HC_AL_PR_PP | HC_AL_PR_DP | BF_TL_NP_PP | BF_TL_NP_DP | BF_TL_PR_PP | BF_TL_PR_DP | BF_AL_NP_PP | BF_AL_NP_DP | BF_AL_PR_PP | BF_AL_PR_DP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| breast-w | 26.14 | 40.50 | 10.84 | 15.02 | 10.66 | 13.50 | 8.80 | 8.68 | 29.30 | 50.26 | 10.94 | 13.78 | 14.40 | 32.46 | 8.52 | 14.66 |
| heartc | 33.42 | 38.62 | 12.42 | 12.94 | 11.82 | 12.20 | 10.08 | 10.26 | 60.28 | 84.52 | 15.26 | 21.02 | 30.74 | 58.02 | 15.16 | 21.10 |
| spam | 5005.74 | 6347.72 | 1957.70 | 1941.84 | 727.20 | 726.82 | 689.98 | 689.16 | 6713.92 | 11330.38 | 2328.40 | 5925.82 | 1721.82 | 10312.92 | 1379.28 | 6002.66 |
| hypo | 116.96 | 117.48 | 82.40 | 80.52 | 96.70 | 97.96 | 80.20 | 80.18 | 238.28 | 563.74 | 92.66 | 92.62 | 219.86 | 230.96 | 93.88 | 93.50 |
| liver | 19.02 | 20.84 | 14.14 | 12.84 | 11.92 | 12.44 | 10.86 | 10.34 | 21.86 | 29.32 | 17.82 | 27.14 | 19.66 | 27.34 | 16.14 | 26.22 |
| lymph | 25.60 | 29.90 | 34.66 | 23.78 | 19.96 | 8.48 | 12.18 | 4.96 | 29.74 | 31.74 | 22.48 | 22.50 | 27.28 | 22.46 | 19.38 | 13.46 |
| lymph2 | 28.04 | 22.06 | 51.52 | 20.54 | 17.12 | 6.82 | 11.80 | 5.34 | 32.14 | 28.72 | 20.02 | 21.90 | 34.94 | 21.52 | 24.36 | 10.68 |
| credit_a | 170.96 | 183.74 | 92.10 | 57.70 | 60.02 | 52.86 | 57.74 | 40.58 | 403.74 | 770.94 | 78.38 | 81.14 | 187.84 | 483.36 | 62.38 | 71.14 |
| vehicle | 302.64 | 392.50 | 241.14 | 263.42 | 52.38 | 50.32 | 55.86 | 50.58 | 351.86 | 1075.78 | 224.96 | 479.60 | 98.88 | 556.98 | 92.06 | 477.74 |
| vehicle2 | 63.04 | 150.36 | 64.94 | 72.94 | 31.82 | 31.90 | 31.80 | 31.42 | 91.06 | 154.16 | 67.34 | 90.62 | 62.48 | 105.44 | 61.72 | 92.72 |
| iris | 5.60 | 7.50 | 3.42 | 5.58 | 5.28 | 4.06 | 4.62 | 5.66 | 4.40 | 4.02 | 4.32 | 5.30 | 4.66 | 4.36 | 5.66 | 5.30 |
| iris2 | 3.46 | 1.84 | 2.82 | 3.46 | 2.52 | 2.20 | 2.22 | 4.34 | 2.22 | 1.88 | 1.84 | 3.44 | 4.02 | 2.82 | 2.20 | 2.82 |
| glass | 79.66 | 95.40 | 78.68 | 64.32 | 32.68 | 18.46 | 20.58 | 13.12 | 76.10 | 116.74 | 63.92 | 86.78 | 49.64 | 84.86 | 40.92 | 78.94 |
| glass2 | 47.28 | 42.10 | 72.68 | 47.70 | 33.94 | 24.72 | 42.22 | 25.60 | 27.08 | 32.80 | 60.20 | 43.58 | 30.62 | 25.58 | 34.06 | 24.04 |
| breast-y | 32.70 | 35.48 | 4.70 | 5.24 | 10.32 | 12.22 | 5.02 | 4.98 | 66.78 | 88.28 | 5.62 | 8.76 | 17.56 | 86.66 | 8.46 | 10.22 |
| voting | 17.72 | 15.30 | 13.72 | 11.86 | 13.18 | 14.02 | 12.34 | 11.56 | 112.28 | 114.56 | 18.64 | 17.78 | 133.90 | 135.34 | 17.76 | 17.54 |
| heart-h | 43.68 | 43.34 | 23.72 | 16.02 | 18.62 | 18.98 | 9.28 | 10.68 | 85.56 | 90.28 | 20.88 | 20.00 | 66.34 | 89.26 | 19.00 | 19.26 |
| hepatitis | 16.56 | 16.18 | 6.20 | 6.54 | 13.70 | 12.04 | 7.24 | 7.74 | 26.88 | 31.26 | 13.66 | 13.76 | 26.18 | 27.16 | 12.82 | 12.44 |
| credit_g | 670.58 | 755.16 | 123.30 | 67.78 | 112.64 | 106.72 | 65.82 | 50.52 | 907.08 | 3124.64 | 99.52 | 260.12 | 156.60 | 1389.64 | 64.50 | 238.72 |
| soybean_15CL | 44.92 | 58.36 | 44.28 | 49.56 | 6.22 | 9.08 | 9.10 | 8.78 | 61.84 | 120.44 | 51.78 | 92.68 | 10.96 | 92.26 | 13.26 | 90.52 |
| soybean_15CL2 | 24.66 | 25.02 | 5.94 | 7.16 | 16.60 | 15.52 | 6.54 | 7.78 | 27.16 | 36.54 | 8.66 | 8.76 | 14.16 | 28.34 | 7.72 | 9.70 |
| segment2310 | 3262.62 | 5707.16 | 2440.46 | 4602.98 | 1072.36 | 1068.92 | 1655.84 | 1075.48 | 3607.44 | 16028.40 | 2654.52 | 7945.40 | 1153.22 | 4746.84 | 1999.56 | 4578.24 |
| segment23102 | 1062.98 | 3512.80 | 975.64 | 1035.04 | 521.08 | 519.48 | 524.84 | 522.66 | 936.72 | 1034.34 | 853.64 | 900.10 | 936.38 | 945.96 | 850.30 | 858.02 |
| segment210 | 107.12 | 138.54 | 118.28 | 125.76 | 37.12 | 38.86 | 37.06 | 37.50 | 115.78 | 152.90 | 113.86 | 145.40 | 42.46 | 138.62 | 42.04 | 136.90 |
| segment2102 | 24.32 | 27.70 | 36.26 | 34.32 | 28.20 | 23.44 | 25.62 | 24.98 | 22.84 | 23.70 | 27.56 | 24.04 | 24.00 | 24.74 | 22.40 | 27.50 |
| sick-euthyroid | 283.60 | 413.10 | 134.80 | 138.24 | 192.26 | 189.70 | 124.54 | 124.48 | 431.16 | 2203.04 | 173.74 | 192.86 | 337.24 | 808.04 | 173.64 | 193.82 |
| bands | 216.78 | 272.02 | 59.26 | 64.72 | 28.06 | 28.74 | 27.84 | 25.26 | 319.56 | 621.16 | 81.82 | 250.76 | 72.34 | 425.16 | 48.70 | 251.50 |
| kr-vs-kp | 107.68 | 244.34 | 98.88 | 124.16 | 43.38 | 42.16 | 41.80 | 37.74 | 143.58 | 332.20 | 106.34 | 174.72 | 79.56 | 189.46 | 97.86 | 183.96 |
| optdigits2 | 1154.08 | 4611.62 | 176.46 | 178.24 | 253.76 | 254.12 | 138.28 | 138.30 | 1221.78 | 5018.54 | 172.66 | 553.90 | 258.24 | 1586.44 | 138.46 | 478.70 |
| car2 | 35.30 | 117.56 | 37.72 | 54.88 | 13.48 | 13.88 | 15.06 | 15.28 | 43.64 | 46.88 | 26.30 | 60.92 | 19.02 | 57.10 | 19.46 | 63.38 |
| abalone2 | 1243.28 | 1251.70 | 616.78 | 616.50 | 970.30 | 969.78 | 610.92 | 613.94 | 1121.10 | 1281.36 | 647.06 | 653.60 | 1117.82 | 1257.90 | 650.62 | 655.78 |
| solar_flare | 67.34 | 68.88 | 10.40 | 11.46 | 28.30 | 27.42 | 10.30 | 9.64 | 284.50 | 296.12 | 16.86 | 28.42 | 100.48 | 268.28 | 17.18 | 29.70 |
| yeast2 | 62.70 | 70.68 | 21.40 | 19.72 | 22.70 | 20.94 | 17.34 | 15.66 | 75.22 | 96.76 | 43.78 | 75.56 | 47.46 | 89.18 | 39.72 | 73.88 |
| splice_junction2 | 1211.22 | 2718.50 | 44.14 | 33.00 | 63.16 | 63.98 | 26.84 | 27.22 | 2006.58 | 6061.32 | 27.92 | 542.00 | 64.78 | 2639.94 | 27.80 | 513.46 |
| kddcup | 208.44 | 636.46 | 182.90 | 342.66 | 178.78 | 179.06 | 167.22 | 168.08 | 260.74 | 557.58 | 193.46 | 360.12 | 229.02 | 263.56 | 191.32 | 234.96 |
| pima | 68.68 | 79.86 | 41.80 | 41.52 | 41.84 | 41.52 | 37.70 | 38.72 | 83.00 | 90.80 | 45.24 | 76.72 | 56.10 | 89.58 | 44.92 | 73.28 |
| Mean | 441.51 | 786.40 | 220.46 | 283.61 | 133.34 | 131.48 | 128.21 | 109.92 | 556.76 | 1436.84 | 233.67 | 536.71 | 207.52 | 759.68 | 176.76 | 435.74 |
| Median | 65.19 | 87.63 | 47.90 | 48.63 | 30.06 | 26.07 | 26.23 | 25.12 | 88.31 | 118.59 | 48.51 | 78.93 | 59.29 | 98.85 | 36.89 | 73.58 |

 

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.

Table 15. Significant p-values according to Shaffer’s test for the 16 PART variant comparison for the Time metric

| Comparison | p-value (Shaffer adjusted) |
|---|---|
| HC_AL_PR_DP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_AL_NP_DP | 0 |
| HC_AL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_AL_PR_DP vs. HC_TL_NP_DP | 0 |
| HC_AL_NP_DP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_TL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_TL_NP_PP | 0 |
| HC_AL_NP_PP vs. BF_TL_NP_DP | 0 |
| HC_AL_NP_DP vs. BF_AL_NP_DP | 0 |
| BF_AL_PR_PP vs. BF_TL_NP_DP | 0 |
| HC_AL_NP_DP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. HC_TL_NP_PP | 0 |
| HC_AL_PR_DP vs. BF_TL_PR_DP | 0 |
| HC_AL_PR_PP vs. HC_TL_NP_PP | 0 |
| HC_AL_PR_PP vs. BF_TL_PR_DP | 0 |
| HC_AL_NP_PP vs. BF_AL_NP_DP | 0 |
| HC_AL_NP_DP vs. BF_TL_NP_PP | 0 |
| BF_AL_PR_PP vs. BF_AL_NP_DP | 0 |
| HC_AL_NP_PP vs. HC_TL_NP_DP | 0 |
| HC_TL_PR_PP vs. BF_TL_NP_DP | 0 |
| BF_TL_PR_PP vs. BF_TL_NP_DP | 0 |
| BF_AL_PR_PP vs. HC_TL_NP_DP | 0 |
| HC_TL_PR_DP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_AL_PR_DP | 0 |
| HC_AL_PR_PP vs. BF_AL_PR_DP | 0.000001 |
| HC_AL_PR_DP vs. BF_AL_NP_PP | 0.000001 |
| HC_AL_NP_PP vs. BF_TL_NP_PP | 0.000003 |
| HC_AL_PR_PP vs. BF_AL_NP_PP | 0.000003 |
| HC_AL_NP_DP vs. HC_TL_NP_PP | 0.000003 |
| HC_AL_NP_DP vs. BF_TL_PR_DP | 0.000004 |
| BF_AL_PR_PP vs. BF_TL_NP_PP | 0.000009 |
| HC_TL_PR_PP vs. BF_AL_NP_DP | 0.000016 |
| BF_TL_PR_PP vs. BF_AL_NP_DP | 0.000026 |
| HC_TL_PR_DP vs. BF_AL_NP_DP | 0.000053 |
| HC_TL_PR_PP vs. HC_TL_NP_DP | 0.000072 |
| BF_TL_PR_PP vs. HC_TL_NP_DP | 0.000112 |
| HC_TL_PR_DP vs. HC_TL_NP_DP | 0.00022 |
| HC_AL_NP_PP vs. HC_TL_NP_PP | 0.000451 |
| HC_AL_NP_PP vs. BF_TL_PR_DP | 0.000494 |
| HC_AL_NP_DP vs. BF_AL_PR_DP | 0.000494 |
| BF_AL_NP_PP vs. BF_TL_NP_DP | 0.000732 |
| BF_AL_PR_PP vs. HC_TL_NP_PP | 0.001126 |
| BF_AL_PR_PP vs. BF_TL_PR_DP | 0.001259 |
| HC_AL_NP_DP vs. BF_AL_NP_PP | 0.001854 |
| BF_AL_PR_DP vs. BF_TL_NP_DP | 0.00271 |
| HC_TL_PR_PP vs. BF_TL_NP_PP | 0.00271 |
| HC_AL_PR_DP vs. HC_TL_PR_DP | 0.003017 |
| BF_TL_PR_PP vs. BF_TL_NP_PP | 0.00377 |
| HC_AL_PR_DP vs. BF_TL_PR_PP | 0.005159 |
| HC_TL_PR_DP vs. BF_TL_NP_PP | 0.006673 |
| HC_AL_PR_DP vs. HC_TL_PR_PP | 0.007282 |
| HC_AL_PR_PP vs. HC_TL_PR_DP | 0.007282 |
| HC_AL_PR_PP vs. BF_TL_PR_PP | 0.012252 |
| HC_AL_PR_PP vs. HC_TL_PR_PP | 0.016968 |
| HC_AL_NP_PP vs. BF_AL_PR_DP | 0.021209 |
| BF_AL_PR_PP vs. BF_AL_PR_DP | 0.048566 |

 

Table 16. Significant p-values according to Holm’s test for the 16 PART variant comparison for the Time metric

| Comparison | p-value (Holm adjusted) |
|---|---|
| HC_AL_PR_DP vs. BF_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_AL_NP_DP | 0 |
| HC_AL_PR_DP vs. HC_TL_NP_DP | 0 |
| HC_AL_PR_DP vs. BF_TL_NP_PP | 0 |
| HC_AL_PR_DP vs. HC_TL_NP_PP | 0 |
| HC_AL_PR_DP vs. BF_TL_PR_DP | 0 |
| HC_AL_PR_DP vs. BF_AL_PR_DP | 0 |
| HC_AL_PR_DP vs. BF_AL_NP_PP | 0.000001 |
| HC_AL_PR_DP vs. HC_TL_PR_DP | 0.003017 |
| HC_AL_PR_DP vs. BF_TL_PR_PP | 0.005234 |
| HC_AL_PR_DP vs. HC_TL_PR_PP | 0.007282 |


 

3. Results for C4.5, CHAID*, PART and BFPART

This section includes the full tables of results for the analyzed algorithms (C4.5, CHAID*, PART and the proposed BFPART) on the five performance metrics used for the comparison: AUC, Error, Complexity, Length and Time. Complexity is measured as the number of decisions in the rule-set (the number of internal nodes in the case of trees). Length is measured as the average number of decisions per rule of the rule-set (the average number of decisions per branch in the case of trees). Time, the computational cost, is measured as the milliseconds taken to build the full rule-set.
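
The Complexity, Length and Time definitions can be illustrated with a short sketch. It assumes a hypothetical representation in which a rule-set is a list of rules and each rule is the list of its decisions, with a default rule holding no decisions; the function names and the example rules are illustrative assumptions, not the code used in the study.

```python
import time

# Hypothetical representation: each rule is the list of decisions
# (attribute tests) in its antecedent; a default rule has no decisions.
Rule = list[str]


def complexity(rule_set: list[Rule]) -> int:
    """Total number of decisions in the rule-set."""
    return sum(len(rule) for rule in rule_set)


def length(rule_set: list[Rule]) -> float:
    """Average number of decisions per rule (0.0 for an empty rule-set)."""
    if not rule_set:
        return 0.0
    return complexity(rule_set) / len(rule_set)


def build_time_ms(build):
    """Run build() and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = build()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


if __name__ == "__main__":
    # Made-up rules, used only to exercise the functions above.
    example = [
        ["petal_width <= 0.6"],
        ["petal_width <= 1.7", "petal_length <= 4.9"],
        [],  # default rule
    ]
    rules, ms = build_time_ms(lambda: example)
    print(complexity(rules), length(rules), ms)
```

Under this representation, counting the decision-free default rule as a rule is one way a Length value below 1 can arise; whether the study counts it this way is an assumption of the sketch.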

Table 17. AUC values of the four analyzed algorithms for each dataset

| AUC | C4.5 | CHAID* | PART | BFPART |
|---|---|---|---|---|
| breast-w | 94.64 | 96.28 | 97.50 | 97.55 |
| heartc | 77.07 | 81.51 | 80.83 | 81.44 |
| spambase | 94.04 | 93.89 | 95.77 | 93.98 |
| hypo | 96.04 | 96.65 | 96.03 | 96.72 |
| liver | 63.41 | 63.30 | 65.05 | 63.85 |
| lymph | 80.92 | 77.31 | 81.68 | 85.31 |
| lymph2 | 76.46 | 57.99 | 80.43 | 84.08 |
| credit_a | 88.10 | 87.69 | 84.32 | 89.13 |
| vehicle | 85.61 | 87.06 | 86.50 | 87.89 |
| vehicle2 | 93.34 | 95.92 | 93.95 | 95.09 |
| iris | 96.19 | 96.09 | 96.47 | 97.09 |
| iris2 | 99.00 | 99.00 | 99.00 | 99.00 |
| glass | 81.37 | 79.54 | 80.00 | 82.62 |
| glass2 | 88.98 | 94.60 | 92.10 | 92.11 |
| breast-y | 61.33 | 66.71 | 60.42 | 61.39 |
| voting | 98.09 | 95.00 | 97.77 | 97.78 |
| heart-h | 75.87 | 77.72 | 86.41 | 77.04 |
| hepatitis | 68.51 | 69.77 | 75.62 | 66.71 |
| credit_g | 64.07 | 71.48 | 65.63 | 71.27 |
| soybean15CL | 96.62 | 96.51 | 96.82 | 96.47 |
| soybean15CL2 | 82.97 | 84.25 | 90.37 | 85.68 |
| segment2310 | 98.63 | 98.83 | 98.55 | 98.63 |
| segment23102 | 98.65 | 98.42 | 98.96 | 98.36 |
| segment210 | 93.23 | 93.30 | 93.02 | 94.09 |
| segment2102 | 94.83 | 93.50 | 92.87 | 92.87 |
| sick_euthyroid | 94.06 | 95.75 | 95.70 | 85.84 |
| bands | 79.27 | 78.58 | 76.96 | 79.48 |
| kr-vs-kp | 99.76 | 99.62 | 99.64 | 99.72 |
| optdigits2 | 82.99 | 95.11 | 96.46 | 93.62 |
| car2 | 97.85 | 99.46 | 99.48 | 98.72 |
| abalone2 | 50.01 | 50.00 | 72.62 | 50.00 |
| solar_flare | 58.63 | 50.00 | 70.12 | 57.78 |
| yeast2 | 72.73 | 70.62 | 75.78 | 72.50 |
| splice_junction2 | 96.46 | 97.35 | 94.05 | 97.66 |
| kddcup | 99.38 | 99.53 | 99.44 | 99.52 |
| pima | 75.14 | 74.40 | 77.93 | 75.70 |
| Mean | 84.84 | 85.08 | 87.34 | 86.02 |
| Median | 86.85 | 90.49 | 91.24 | 88.51 |

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.

 

There were no significant differences found between comprehensible algorithms for the AUC metric.

 


Table 18. Error values of the four analyzed algorithms for each dataset

| Error | C4.5 | CHAID* | PART | BFPART |
|---|---|---|---|---|
| breast-w | 5.58 | 5.09 | 5.21 | 5.95 |
| heartc | 24.75 | 20.50 | 21.12 | 21.79 |
| spambase | 7.06 | 8.39 | 6.15 | 7.42 |
| hypo | 0.73 | 0.86 | 0.93 | 0.75 |
| liver | 35.50 | 33.47 | 35.05 | 36.00 |
| lymph | 20.44 | 26.51 | 21.78 | 19.05 |
| lymph2 | 22.63 | 19.38 | 22.12 | 17.41 |
| credit_a | 14.55 | 15.98 | 16.78 | 14.72 |
| vehicle | 27.68 | 31.01 | 27.81 | 27.89 |
| vehicle2 | 5.41 | 6.52 | 5.34 | 5.56 |
| iris | 5.73 | 5.60 | 6.00 | 5.07 |
| iris2 | 0.67 | 0.67 | 0.67 | 0.67 |
| glass | 31.63 | 34.35 | 31.17 | 31.46 |
| glass2 | 6.66 | 6.82 | 6.54 | 6.92 |
| breast-y | 25.67 | 24.95 | 35.62 | 25.45 |
| voting | 3.45 | 4.94 | 4.88 | 3.73 |
| heart-h | 20.90 | 20.14 | 19.74 | 19.81 |
| hepatitis | 20.16 | 19.86 | 19.70 | 19.06 |
| credit_g | 28.48 | 27.59 | 29.30 | 27.26 |
| soybean15CL | 10.90 | 12.36 | 9.52 | 11.10 |
| soybean15CL2 | 5.86 | 6.21 | 7.10 | 5.79 |
| segment2310 | 3.21 | 4.19 | 3.31 | 3.63 |
| segment23102 | 0.68 | 0.62 | 0.55 | 0.74 |
| segment210 | 13.62 | 14.19 | 13.71 | 12.10 |
| segment2102 | 2.57 | 2.67 | 4.48 | 4.48 |
| sick_euthyroid | 1.95 | 2.23 | 2.81 | 3.69 |
| bands | 22.97 | 26.70 | 24.68 | 25.02 |
| kr-vs-kp | 0.55 | 0.60 | 0.73 | 0.94 |
| optdigits2 | 5.78 | 1.65 | 1.82 | 3.11 |
| car2 | 4.75 | 1.67 | 1.86 | 3.14 |
| abalone2 | 8.67 | 8.67 | 8.75 | 8.67 |
| solar_flare | 15.52 | 15.69 | 16.90 | 15.58 |
| yeast2 | 24.57 | 24.13 | 24.77 | 24.62 |
| splice_junction2 | 4.10 | 4.88 | 4.98 | 4.27 |
| kddcup | 0.46 | 0.58 | 0.46 | 0.63 |
| pima | 25.88 | 25.47 | 26.97 | 26.17 |
| Mean | 12.77 | 12.92 | 13.04 | 12.49 |
| Median | 9.78 | 10.51 | 9.13 | 9.88 |

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.

 

There were no significant differences found between comprehensible algorithms for the Error metric.


 

Table 19. Complexity values of the four analyzed algorithms for each dataset

| Complexity | C4.5 | CHAID* | PART | BFPART |
|---|---|---|---|---|
| breast-w | 28.90 | 12.24 | 26.92 | 7.32 |
| heartc | 27.22 | 8.62 | 23.70 | 7.90 |
| spambase | 100.76 | 57.08 | 47.12 | 19.16 |
| hypo | 6.28 | 7.40 | 11.58 | 4.36 |
| liver | 24.28 | 4.08 | 8.28 | 7.74 |
| lymph | 20.74 | 11.02 | 18.04 | 7.28 |
| lymph2 | 17.84 | 7.86 | 15.00 | 5.70 |
| credit_a | 23.06 | 5.62 | 41.08 | 8.24 |
| vehicle | 67.06 | 31.30 | 32.78 | 22.38 |
| vehicle2 | 19.68 | 12.38 | 13.64 | 10.70 |
| iris | 4.72 | 3.96 | 3.90 | 3.08 |
| iris2 | 2.00 | 2.00 | 2.00 | 2.00 |
| glass | 23.28 | 11.58 | 15.78 | 13.16 |
| glass2 | 6.82 | 4.62 | 4.92 | 4.84 |
| breast-y | 8.16 | 3.90 | 41.00 | 3.38 |
| voting | 5.80 | 3.92 | 10.14 | 4.80 |
| heart-h | 6.78 | 4.10 | 17.92 | 3.26 |
| hepatitis | 9.52 | 5.70 | 10.90 | 5.10 |
| credit_g | 92.84 | 16.36 | 102.64 | 18.40 |
| soybean15CL | 48.28 | 25.30 | 30.40 | 24.56 |
| soybean15CL2 | 8.40 | 4.80 | 17.82 | 4.00 |
| segment2310 | 41.22 | 36.76 | 27.74 | 21.38 |
| segment23102 | 8.74 | 7.70 | 7.16 | 6.08 |
| segment210 | 12.70 | 13.02 | 10.38 | 9.16 |
| segment2102 | 4.44 | 4.24 | 4.62 | 4.62 |
| sick_euthyroid | 12.86 | 8.96 | 25.30 | 7.76 |
| bands | 59.24 | 18.38 | 36.44 | 17.48 |
| kr-vs-kp | 29.22 | 34.54 | 22.88 | 15.62 |
| optdigits2 | 415.40 | 67.76 | 72.54 | 9.60 |
| car2 | 70.92 | 32.50 | 37.62 | 21.90 |
| abalone2 | 1.06 | 1.00 | 9.42 | 1.00 |
| solar_flare | 2.56 | 1.00 | 46.00 | 2.52 |
| yeast2 | 33.98 | 7.10 | 11.98 | 7.98 |
| splice_junction2 | 121.68 | 19.54 | 75.10 | 21.02 |
| kddcup | 51.98 | 15.72 | 10.16 | 7.18 |
| pima | 22.50 | 5.18 | 7.28 | 6.82 |
| Mean | 40.03 | 14.37 | 25.01 | 9.65 |
| Median | 21.62 | 8.24 | 17.87 | 7.75 |

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.

 

Table 20. Significant p-values according to Shaffer’s test for the comprehensible algorithm comparison for the Complexity metric

| Comparison | p-value (Shaffer adjusted) |
|---|---|
| BFPART vs. C4.5 | 0 |
| BFPART vs. PART | 0 |
| C4.5 vs. CHAID* | 0.000001 |
| PART vs. CHAID* | 0.000546 |

 

Table 21. Significant p-values according to Holm’s test for the comprehensible algorithm comparison for the Complexity metric

| Comparison | p-value (Holm adjusted) |
|---|---|
| BFPART vs. PART | 0 |
| BFPART vs. C4.5 | 0 |


Table 24. Length values of the four analyzed algorithms for each dataset

| Length | C4.5 | CHAID* | PART | BFPART |
|---|---|---|---|---|
| breast-w | 1.92 | 2.19 | 1.54 | 0.95 |
| heartc | 3.45 | 3.16 | 2.59 | 1.70 |
| spambase | 11.08 | 7.65 | 5.01 | 7.87 |
| hypo | 3.45 | 3.20 | 2.69 | 1.44 |
| liver | 5.71 | 2.21 | 2.48 | 2.14 |
| lymph | 3.35 | 3.34 | 2.61 | 2.12 |
| lymph2 | 3.17 | 2.53 | 1.90 | 1.60 |
| credit_a | 4.44 | 2.46 | 2.44 | 1.68 |
| vehicle | 7.88 | 5.68 | 3.72 | 4.35 |
| vehicle2 | 6.77 | 4.43 | 2.37 | 2.61 |
| iris | 2.65 | 2.23 | 1.00 | 1.00 |
| iris2 | 1.00 | 1.00 | 0.50 | 0.50 |
| glass | 5.87 | 4.23 | 2.80 | 3.07 |
| glass2 | 3.02 | 2.25 | 1.49 | 1.46 |
| breast-y | 1.97 | 2.12 | 2.50 | 0.73 |
| voting | 3.22 | 1.49 | 1.93 | 1.00 |
| heart-h | 2.46 | 2.13 | 2.52 | 1.01 |
| hepatitis | 3.92 | 2.70 | 2.41 | 1.79 |
| credit_g | 5.78 | 3.51 | 3.25 | 2.56 |
| soybean15CL | 6.16 | 4.82 | 2.67 | 3.16 |
| soybean15CL2 | 1.79 | 1.90 | 2.03 | 0.75 |
| segment2310 | 7.44 | 7.45 | 3.06 | 3.68 |
| segment23102 | 3.57 | 3.20 | 1.73 | 1.48 |
| segment210 | 4.92 | 5.52 | 1.85 | 2.04 |
| segment2102 | 2.32 | 2.31 | 1.14 | 1.14 |
| sick_euthyroid | 5.05 | 4.20 | 4.61 | 1.98 |
| bands | 7.20 | 4.20 | 2.92 | 3.56 |
| kr-vs-kp | 7.79 | 7.76 | 3.11 | 3.39 |
| optdigits2 | 2.68 | 3.59 | 1.76 | 1.62 |
| car2 | 4.32 | 5.01 | 2.07 | 1.55 |
| abalone2 | 0.05 | 0.00 | 1.90 | 0.00 |
| solar_flare | 1.30 | 0.00 | 3.10 | 0.51 |
| yeast2 | 7.59 | 3.20 | 2.88 | 2.27 |
| splice_junction2 | 5.55 | 6.10 | 2.43 | 1.75 |
| kddcup | 5.90 | 4.31 | 2.56 | 2.40 |
| pima | 5.69 | 2.43 | 2.02 | 1.92 |
| Mean | 4.46 | 3.46 | 2.43 | 2.02 |
| Median | 4.38 | 3.20 | 2.46 | 1.77 |

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.

 

Table 22. Significant p-values according to Shaffer’s test for the comprehensible algorithm comparison for the Length metric

| Comparison | p-value (Shaffer adjusted) |
|---|---|
| BFPART vs. C4.5 | 0 |
| BFPART vs. CHAID* | 0.000001 |
| PART vs. C4.5 | 0.000001 |
| PART vs. CHAID* | 0.0121 |
| BFPART vs. PART | 0.039843 |
| C4.5 vs. CHAID* | 0.039843 |

 

Table 23. Significant p-values according to Holm’s test for the comprehensible algorithm comparison for the Length metric

| Comparison | p-value (Holm adjusted) |
|---|---|
| BFPART vs. C4.5 | 0 |
| BFPART vs. CHAID* | 0.000001 |
| BFPART vs. PART | 0.039843 |


Table 27. Time values of the four analyzed algorithms for each dataset

| Time | C4.5 | CHAID* | PART | BFPART |
|---|---|---|---|---|
| breast-w | 8.40 | 71.80 | 26.14 | 14.66 |
| heartc | 15.64 | 100.06 | 33.42 | 21.10 |
| spambase | 1354.78 | 31639.10 | 5005.74 | 6002.66 |
| hypo | 65.22 | 1165.22 | 116.96 | 93.50 |
| liver | 15.44 | 34.98 | 19.02 | 26.22 |
| lymph | 7.42 | 71.40 | 25.60 | 13.46 |
| lymph2 | 6.70 | 42.74 | 28.04 | 10.68 |
| credit_a | 31.86 | 280.84 | 170.96 | 71.14 |
| vehicle | 76.70 | 816.82 | 302.64 | 477.74 |
| vehicle2 | 33.76 | 426.76 | 63.04 | 92.72 |
| iris | 4.68 | 18.08 | 5.60 | 5.30 |
| iris2 | 5.32 | 9.02 | 3.46 | 2.82 |
| glass | 24.78 | 167.48 | 79.66 | 78.94 |
| glass2 | 10.96 | 104.60 | 47.28 | 24.04 |
| breast-y | 8.66 | 21.54 | 32.70 | 10.22 |
| voting | 15.92 | 81.44 | 17.72 | 17.54 |
| heart-h | 16.08 | 66.08 | 43.68 | 19.26 |
| hepatitis | 9.94 | 56.76 | 16.56 | 12.44 |
| credit_g | 59.14 | 587.98 | 670.58 | 238.72 |
| soybean15CL | 12.38 | 105.14 | 44.92 | 90.52 |
| soybean15CL2 | 8.44 | 40.54 | 24.66 | 9.70 |
| segment2310 | 1067.74 | 22230.28 | 3262.62 | 4578.24 |
| segment23102 | 625.90 | 14286.46 | 1062.98 | 858.02 |
| segment210 | 46.30 | 442.44 | 107.12 | 136.90 |
| segment2102 | 21.20 | 246.38 | 24.32 | 27.50 |
| sick_euthyroid | 129.76 | 1312.00 | 283.60 | 193.82 |
| bands | 50.44 | 539.84 | 216.78 | 251.50 |
| kr-vs-kp | 38.38 | 515.12 | 107.68 | 183.96 |
| optdigits2 | 183.84 | 4186.40 | 1154.08 | 478.70 |
| car2 | 15.08 | 89.00 | 35.30 | 63.38 |
| abalone2 | 526.04 | 8643.60 | 1243.28 | 655.78 |
| solar_flare | 23.98 | 24.32 | 67.34 | 29.70 |
| yeast2 | 32.38 | 291.26 | 62.70 | 73.88 |
| splice_junction2 | 73.86 | 1072.28 | 1211.22 | 513.46 |
| kddcup | 107.36 | 4529527.60 | 208.44 | 234.96 |
| pima | 34.38 | 373.78 | 68.68 | 73.28 |
| Mean | 132.47 | 128324.70 | 441.51 | 435.74 |
| Median | 32.12 | 286.05 | 68.01 | 76.41 |

This table can be downloaded as an Excel document or as a CSV file by clicking on the following links xls and csv.

 

Table 24. Significant p-values according to Shaffer’s test for the comprehensible algorithm comparison for the Time metric

| Comparison | p-value (Shaffer adjusted) |
|---|---|
| C4.5 vs. CHAID* | 0 |
| C4.5 vs. PART | 0 |
| BFPART vs. CHAID* | 0.000004 |
| C4.5 vs. BFPART | 0.000035 |
| PART vs. CHAID* | 0.000252 |

 

Table 25. Significant p-values according to Holm’s test for the comprehensible algorithm comparison for the Time metric

| Comparison | p-value (Holm adjusted) |
|---|---|
| C4.5 vs. CHAID* | 0 |
| C4.5 vs. PART | 0 |
| C4.5 vs. BFPART | 0.000035 |

 

References

[1] G. M. Weiss and F. Provost, “Learning when training data are costly: The effect of class distribution on tree induction,” Journal of Artificial Intelligence Research, vol. 19, pp. 315–354, Oct. 2003.