Bank churn

Reference to code Bank churn



This benchmark is taken from Neural Designer; the location of their article may change over time. The model estimates the likelihood that a bank client churns, based on demographic data. A fragment of the dataset is shown below:
15737888;850;Spain;Female;43;2;125510.82;1;1;1;79084.1;0
15574012;645;Spain;Male;44;8;113755.78;2;1;0;149756.71;1
15592531;822;France;Male;50;7;0;2;1;1;10062.8;0
15656148;376;Germany;Female;29;4;115046.74;4;1;0;119346.88;1
15792365;501;France;Male;44;4;142051.07;2;0;1;74940.5;0
15592389;684;France;Male;27;2;134603.88;1;1;1;71725.73;0
Strictly speaking, this is data for a probabilistic model, because both targets $0$ and $1$ may occur for the same features. When there are only two classes, however, the model output can be a real number that is treated as a probability. In addition, the only purpose of this test is a comparison with code designed by other developers.
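To make the record format concrete, here is a minimal sketch that parses the fragment above using only the Python standard library. It assumes that the last field of each row is the $0$/$1$ target; the remaining fields are kept as raw strings because the column names are not given here. The names `FRAGMENT` and `parse_rows` are hypothetical and are not part of the referenced code.

```python
import csv
import io

# The six rows of the dataset fragment, copied verbatim.
FRAGMENT = """\
15737888;850;Spain;Female;43;2;125510.82;1;1;1;79084.1;0
15574012;645;Spain;Male;44;8;113755.78;2;1;0;149756.71;1
15592531;822;France;Male;50;7;0;2;1;1;10062.8;0
15656148;376;Germany;Female;29;4;115046.74;4;1;0;119346.88;1
15792365;501;France;Male;44;4;142051.07;2;0;1;74940.5;0
15592389;684;France;Male;27;2;134603.88;1;1;1;71725.73;0
"""


def parse_rows(text):
    """Split semicolon-separated rows into (features, target) pairs.

    Assumption: the last field is the 0/1 target. All other fields are
    returned as unconverted strings.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=";"):
        if not fields:  # skip blank lines
            continue
        *features, target = fields
        rows.append((features, int(target)))
    return rows


records = parse_rows(FRAGMENT)
targets = [target for _, target in records]
```

Categorical fields such as the country and gender would still need a numeric encoding before training; the source does not say which encoding is used, so none is shown.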

The dataset has $10{,}000$ records. It is divided into training (60%), selection (20%), and testing (20%) subsets. The selection subset plays the role of a validation sample, and the testing subset is the one used for the conclusion.
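The 60/20/20 division can be sketched as follows. This is only an illustration: it assumes a random shuffle with a fixed seed, whereas the actual assignment of records to subsets in the benchmark is not specified here. The function name `split_indices` is hypothetical.

```python
import random


def split_indices(n, train=0.6, selection=0.2, seed=0):
    """Shuffle range(n) and cut it into training, selection, and testing parts.

    Assumption: records are assigned at random; the remainder after the
    training and selection parts becomes the testing part.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(round(n * train))
    n_selection = int(round(n * selection))
    train_part = indices[:n_train]
    selection_part = indices[n_train:n_train + n_selection]
    test_part = indices[n_train + n_selection:]
    return train_part, selection_part, test_part


train_idx, selection_idx, test_idx = split_indices(10_000)
```

For $10{,}000$ records this gives 6000 training, 2000 selection, and 2000 testing indices with no overlap.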

The model output is a real number, which is rounded and returned as 0 or 1. The accuracy metric is the ratio of correct predictions to the total number of predictions. Neural Designer reports about 87.4%; our piecewise linear KAN model gives about 90%, and the spline model about 92%.
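The accuracy computation can be sketched as below. The rounding is written as an explicit threshold at 0.5, because Python's built-in `round` rounds halves to the nearest even integer; how an output of exactly 0.5 is treated in the referenced code is an assumption here. The function name `accuracy` and the example values are illustrative only and are not results of the benchmark.

```python
def accuracy(outputs, targets, threshold=0.5):
    """Return the fraction of real-valued outputs that match the 0/1 targets.

    Assumption: an output at or above `threshold` is counted as class 1.
    """
    if len(outputs) != len(targets):
        raise ValueError("outputs and targets must have the same length")
    if not targets:
        raise ValueError("at least one prediction is required")
    predictions = [1 if y >= threshold else 0 for y in outputs]
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Made-up model outputs and targets, used only to exercise the function.
example_outputs = [0.1, 0.8, 0.4, 0.6, 0.3]
example_targets = [0, 1, 1, 1, 0]
example_accuracy = accuracy(example_outputs, example_targets)
```

In this toy example four of the five thresholded outputs match their targets, so the accuracy is 0.8.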