Data that can be considered logs

There was a classic task: using tabular data to predict whether a certain event will happen or not. No matter how I approached the data, from whatever angle I looked at it, the result, alas, was not impressive. There was little data, and what was available had little predictive power. Still, it seemed that something could be pulled out of it. And so, looking through the individual decision trees, it dawned on me: I'll try to prune every tree used in the Random Forest down to its single most efficient branch.
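
The post contains no code, so here is a rough sketch of what "looking through individual decision trees" can mean in practice, assuming scikit-learn (the library is not named in the post) and a synthetic dataset standing in for the real one:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import export_text

    # Synthetic, imbalanced stand-in for the real tabular data.
    X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    # Each element of estimators_ is an ordinary DecisionTreeClassifier,
    # so the trees can be printed and examined one by one.
    print(export_text(forest.estimators_[0], max_depth=3))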


Behold: both precision and recall increased noticeably, and recall grew especially at high levels of precision. I tested this method on other tasks, and everywhere recall increased noticeably at the same precision. What have I done?

The idea. Pruning, or trimming, a tree is a way to combat overfitting. Usually it comes down to specifying a criterion for stopping growth: maximum tree depth, minimum number of objects in a leaf, etc. This is the so-called pre-pruning, though in fact it is simply a growth restriction. But there is also a much more delicate pruning of the already constructed tree: post-pruning.
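
In scikit-learn terms (an assumption, since the post never names the library), pre-pruning is nothing more than the growth-limiting constructor parameters; the values below are purely illustrative:

    from sklearn.ensemble import RandomForestClassifier

    pre_pruned_forest = RandomForestClassifier(
        n_estimators=100,
        max_depth=6,           # stop growing past this depth
        min_samples_leaf=20,   # minimum number of objects in a leaf
        min_samples_split=40,  # minimum number of objects required to split a node
        random_state=0,
    )
    # Post-pruning of an already grown tree is a separate mechanism,
    # e.g. minimal cost complexity pruning via the ccp_alpha parameter.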

Forest: recall increasing at a given precision

For example, the minimal cost complexity pruning technique. In my case it did not provide any improvement. The effect was achieved by a much simpler option, and strictly speaking not so much by pruning as by pulling out one branch from each tree. An example of a tree from my task:

Decision tree example

It can be seen that the objects of the class that needs to be predicted are clearly separated, including along the rightmost branch (the blue leaves). The picture is roughly the same on almost all the trees. By that time I already had models trained on other data, and I used them, mixed with the original ones, as features.
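
The post does not show how the branch was pulled out, so the sketch below is only one possible reading of the idea, again assuming scikit-learn: for each tree, keep the leaf with the highest share of the target class, and call a sample positive when enough trees route it into their kept leaf. The helper names, the purity criterion and the voting threshold are my assumptions, not the author's method (scikit-learn's own minimal cost complexity pruning is the ccp_alpha parameter mentioned above).

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

    def best_leaf_per_tree(forest, min_samples=20):
        # For each tree, pick the leaf with the highest share of the positive class.
        best = []
        for est in forest.estimators_:
            t = est.tree_
            is_leaf = t.children_left == -1
            value = t.value[:, 0, :]                    # per-node class distribution
            frac_pos = value[:, 1] / value.sum(axis=1)  # share of class 1 in each node
            ok = np.where(is_leaf & (t.n_node_samples >= min_samples))[0]
            best.append(ok[np.argmax(frac_pos[ok])] if len(ok) else -1)
        return best

    def predict_from_best_branch(forest, best_leaves, X, vote_threshold=0.3):
        # A sample is positive if enough trees route it into their "best" leaf.
        votes = np.zeros(len(X))
        for est, leaf in zip(forest.estimators_, best_leaves):
            if leaf >= 0:
                votes += (est.apply(X) == leaf)
        return (votes / len(forest.estimators_) >= vote_threshold).astype(int)

    leaves = best_leaf_per_tree(forest)
    y_pred = predict_from_best_branch(forest, leaves, X_test)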
