This page contains the complete material related to "J48PartiallyConsolidated: An implementation of the PCTBagging algorithm for WEKA".
Below you can find all the information related to the most recent update (v1.1, July 2025) of this implementation.
To access previous versions:
We have updated the code to explore further variants of the algorithm, with the aim of improving its results.
Technical information:
Class for generating a Partially Consolidated Tree-Bagging (PCTBagging) multiple classifier.
Allows building a classifier that lies between a single consolidated tree (100%), based on J48Consolidated, and a bagging ensemble (0%), according to the given consolidation percentage. The objective is to build a classifier with the high discriminative capability of Multiple Classifier Systems (MCS), in this case Bagging, while preserving interpretability through the partial consolidated tree generated in the initial phase.
Originally, the only parameter in PCTBagging was the consolidation percentage, defining the proportion of internal nodes to retain from the whole consolidated tree relative to its total node count. The process was as follows: First, the complete consolidated tree is built, and its internal nodes are counted. The exact number of nodes to preserve is then derived from the consolidation percentage. Starting at the root, the algorithm iteratively selects the largest nodes (by instance count) from the available nodes in the current subtree, ensuring top-down hierarchical integrity. All non-selected nodes are then collapsed, yielding a partially retained consolidated tree, which will provide the interpretable component of the final classifier. From this point onward, all trees associated with the sample set are developed independently, following the standard Bagging procedure.
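As a rough illustration, the original percentage-based selection described above could be sketched in Java as follows. This is not the package's code: the classes SketchNode and PartialSelectionSketch, their fields and methods, and the rounding rule used to turn the percentage into a node count are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical stand-in for a node of the fully built consolidated tree. */
class SketchNode {
    final String name;
    final int instances; // number of training instances reaching this node
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name, int instances) {
        this.name = name;
        this.instances = instances;
    }

    SketchNode add(SketchNode child) {
        children.add(child);
        return this;
    }

    boolean isInternal() {
        return !children.isEmpty();
    }
}

class PartialSelectionSketch {

    /** Counts the internal (non-leaf) nodes of the tree rooted at n. */
    static int countInternal(SketchNode n) {
        if (!n.isInternal()) {
            return 0;
        }
        int count = 1;
        for (SketchNode child : n.children) {
            count += countInternal(child);
        }
        return count;
    }

    /** Derives the number of nodes to keep; rounding to nearest is an assumption. */
    static int nodesToKeep(int internalNodes, double percent) {
        return (int) Math.round(internalNodes * percent / 100.0);
    }

    /**
     * Starting at the root, repeatedly keeps the largest internal node among
     * those whose parent has already been kept, so the result is always a
     * connected top part of the tree. Nodes not returned would be collapsed.
     */
    static List<SketchNode> selectLargest(SketchNode root, double percent) {
        int target = nodesToKeep(countInternal(root), percent);
        List<SketchNode> kept = new ArrayList<>();
        PriorityQueue<SketchNode> available = new PriorityQueue<>(
                Comparator.comparingInt((SketchNode n) -> n.instances).reversed());
        if (root.isInternal()) {
            available.add(root);
        }
        while (kept.size() < target && !available.isEmpty()) {
            SketchNode next = available.poll();
            kept.add(next);
            for (SketchNode child : next.children) {
                if (child.isInternal()) {
                    available.add(child);
                }
            }
        }
        return kept;
    }
}
```

Because a child only becomes available once its parent has been kept, a large node deep in the tree cannot be selected before its ancestors, which is one way to keep the hierarchy intact from the top down.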
In this revised version of PCTBagging, we focused on practical improvements to the algorithm's implementation. Specifically, users can now directly specify the exact number of internal nodes to retain in the partially consolidated tree. Unlike the original approach, which required full tree construction before node selection, the enhanced implementation terminates the consolidation process as soon as the target node count is reached. This significantly reduces classifier construction time, particularly since the requested number of explanatory internal nodes (which form the interpretable component) is typically very low. The number of consolidated nodes of the partial tree can still be given as a percentage of the final size of the fully developed tree; a parameter lets the user choose which of the two ways to use.
This iterative (non-recursive) implementation introduces a predefined node-selection criterion as a configurable parameter. While the original approach used node size (Size), this version allows specifying one of seven possible criteria (including Size) prior to execution. The chosen criterion guides the order in which nodes are expanded during partial tree construction, directly influencing both the tree's final structure and, consequently, the classifier's explanatory capacity and discriminative performance. The text-mode tree display in Weka Explorer now annotates each internal node with its development sequence number (e.g., [3]), indicating the exact order in which the node was expanded during partial consolidated tree construction, as determined by the active selection criterion.
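The following Java sketch illustrates the general idea of criterion-driven, iterative growth with early termination. It is only a sketch under stated assumptions: GrowNode, DirectGrowthSketch, the split function and the two comparators are hypothetical names, only two illustrative criteria (size and gain ratio) are shown rather than the seven offered by the package, and a simple best-first search is assumed as the expansion strategy.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Function;

/** Hypothetical candidate node; the fields are illustrative only. */
class GrowNode {
    final String name;
    final int size;         // instances reaching the node
    final double gainRatio; // gain ratio of its best split
    int order = 0;          // development sequence number; 0 means never expanded

    GrowNode(String name, int size, double gainRatio) {
        this.name = name;
        this.size = size;
        this.gainRatio = gainRatio;
    }
}

class DirectGrowthSketch {

    // Two illustrative priority criteria: larger value is developed first.
    static final Comparator<GrowNode> BY_SIZE =
            Comparator.comparingInt((GrowNode n) -> n.size).reversed();
    static final Comparator<GrowNode> BY_GAIN_RATIO =
            Comparator.comparingDouble((GrowNode n) -> n.gainRatio).reversed();

    /**
     * Expands nodes one at a time, always taking the best open node according
     * to the criterion, and stops as soon as the target number of internal
     * nodes has been developed. The split function (an assumption of this
     * sketch) returns the children of a node, or an empty list if the node
     * cannot be split and therefore stays a leaf.
     */
    static List<GrowNode> grow(GrowNode root, int targetInternalNodes,
                               Comparator<GrowNode> criterion,
                               Function<GrowNode, List<GrowNode>> split) {
        List<GrowNode> expanded = new ArrayList<>();
        PriorityQueue<GrowNode> open = new PriorityQueue<>(criterion);
        open.add(root);
        while (expanded.size() < targetInternalNodes && !open.isEmpty()) {
            GrowNode next = open.poll();
            List<GrowNode> children = split.apply(next);
            if (children.isEmpty()) {
                continue; // not splittable: remains a leaf
            }
            next.order = expanded.size() + 1; // e.g. shown as [3] in the output
            expanded.add(next);
            open.addAll(children);
        }
        return expanded;
    }
}
```

With the same candidate nodes, passing BY_SIZE or BY_GAIN_RATIO can yield different partial trees for the same target count, which is why the choice of criterion affects both the explanatory part and the final classifier.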
Possible values of the priority criterion for selecting the next node to be developed in the partial consolidated tree:
For non-topological criteria (Size, Gain Ratio variants), users may select one of two heuristic search algorithms via a dedicated parameter:
We introduce a new pruning parameter, pruneBaseTreesWithoutPreservingConsolidatedStructure, which modifies the treatment of Bagging-associated trees after partial consolidated tree construction. When enabled, base trees are pruned independently from their roots without structural alignment to the consolidated tree (unlike the original approach, where pruning preserved the consolidated structure). In addition, for each node of the partial consolidated tree, the implementation counts how many base trees in the final Bagging set retained that node during independent pruning, and then calculates the percentage this represents of the total set size (e.g., [Str: 75%]). These annotations appear in the text-mode representation of the tree in the ‘Classifier output’ panel of Weka Explorer.
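The percentage annotation can be illustrated with a small Java sketch. The class PreservationSketch, its methods, and the representation of each pruned base tree as a set of node identifiers are assumptions made for this example and do not reflect the package's internal data structures.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

class PreservationSketch {

    /**
     * Returns the percentage of base trees whose independently pruned form
     * still contains the given node of the partial consolidated tree.
     * Each set holds the identifiers of the nodes one base tree retained.
     */
    static double preservedPercent(String nodeId, List<Set<String>> retainedPerTree) {
        if (retainedPerTree.isEmpty()) {
            return 0.0;
        }
        long kept = retainedPerTree.stream()
                .filter(nodes -> nodes.contains(nodeId))
                .count();
        return 100.0 * kept / retainedPerTree.size();
    }

    /** Formats the value in the style of the annotation, e.g. [Str: 75%]. */
    static String annotate(double percent) {
        return String.format(Locale.ROOT, "[Str: %.0f%%]", percent);
    }
}
```

For instance, if three of four base trees keep a given node after independent pruning, the sketch yields 75.0 for that node, which annotate formats as [Str: 75%].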
Finally, to enable comprehensive evaluation of the enhanced PCTBagging algorithm, we have implemented four categories of performance measures in WEKA's Experimenter: timing diagnostics, tree structure analysis, ensemble aggregation statistics, and structure preservation metrics.
For more information, see:
Jesús M. Pérez and Olatz Arbelaitz. "Multi-Criteria Node Selection in Direct PCTBagging: Balancing Interpretability and Accuracy with Bootstrap Sampling and Unrestricted Pruning". Information Sciences (2025), submitted. doi:10.1016/j.ins.2025.XX.XXX
Igor Ibarguren, Jesús M. Pérez, Javier Muguerza, Olatz Arbelaitz and Ainhoa Yera. "PCTBagging: From Inner Ensembles to Ensembles. A trade-off between Discriminating Capacity and Interpretability". Information Sciences (2022), Vol. 583, pp. 219-238. doi:10.1016/j.ins.2021.11.010
Weka package:
The Weka package containing the J48PartiallyConsolidated classifier (along with the J48Consolidated package on which it is based) has been tested with Weka 3.8.7-SNAPSHOT and can be installed from Weka's package manager. It includes compiled code, source code, javadocs and package description files, and can be found here (soon, hopefully, also in the official Weka package list (v0.3)):
Source code:
The source code of the classes that implement the J48PartiallyConsolidated classifier (on the stable-3-8-6 version of Weka) can be found in:
This classifier is based on the J48Consolidated package, whose code is also included in the above zip file. That package can also be installed via the Weka package manager or downloaded from the package's website (http://www.aldapa.eus/res/weka-ctc/).
To obtain the complete source code of the implementation, also download the Weka source code from https://waikato.github.io/weka-wiki/downloading_weka/.
Executable file:
The executable file in Weka is a .jar file. The file containing the current J48PartiallyConsolidated implementation, built into the stable-3-8-6 version of Weka, can be found in:
To run Weka, type:
java -Xmx1000M -jar Weka-DrivenPCTBag.jar
(see https://waikato.github.io/weka-wiki/downloading_weka/ for more information)
Last modification: 2025/07/03