

Download Piecewise Linear C++ version

Code KANC++
Version with smart array pointers KANC2




This version exists to back up the claimed performance. C# and Java, along with their IDEs (integrated development environments), are very user friendly and not far behind C++ in performance, but their compiled executables are still slower than C++ binaries. Many people use MATLAB, which runs highly optimized code in the background, and that is what my code can most easily be compared to. For that reason I wrote relatively optimized C++ code, compared it to MATLAB myself, and can certify that it performs the same or better. The version available for download is already good enough for comparison, but I did not exhaust all possible options. The code still has a lot of room for improvement, such as parallel training of different model pieces and embedded assembly language. To keep the code friendly and readable for users, I refrain from that for the moment, since it already matches or outperforms professionally developed MLP libraries. I will improve it further some time in the future, just to establish the record.

The dataset is computed by the formula
$$ y = \frac{2 + 2 x_3}{3\pi} \left[ \arctan\left( 20 \left( x_1 - \frac{1}{2} + \frac{x_2}{6} \right) \exp\left(x_5\right) \right) + \frac{\pi}{2} \right] + \frac{2 + 2 x_4}{3\pi} \left[ \arctan\left( 20 \left( x_1 - \frac{1}{2} - \frac{x_2}{6} \right) \exp\left(x_5\right) \right) + \frac{\pi}{2} \right], $$
with 10 000 records and 5 features. The accuracy test is performed on records not used in training. The relative error is less than 1%, and training takes less than 0.1 second over 36 epochs.
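For readers who want to reproduce a comparable dataset, the formula can be sketched in C++ as follows. This is only an illustration, not code from the download: the names `targetFormula`, `Dataset` and `makeDataset` are hypothetical, and the sampling of every feature uniformly from [0, 1] is an assumption, since the feature ranges are not stated here.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Evaluates the target formula for one record; x holds x_1..x_5 in order.
double targetFormula(const std::vector<double>& x) {
    assert(x.size() == 5);
    const double pi = std::acos(-1.0);
    const double scale = 20.0 * std::exp(x[4]);
    const double left = std::atan(scale * (x[0] - 0.5 + x[1] / 6.0)) + pi / 2.0;
    const double right = std::atan(scale * (x[0] - 0.5 - x[1] / 6.0)) + pi / 2.0;
    return (2.0 + 2.0 * x[2]) / (3.0 * pi) * left
         + (2.0 + 2.0 * x[3]) / (3.0 * pi) * right;
}

// Hypothetical container for the generated records.
struct Dataset {
    std::vector<std::vector<double>> inputs;
    std::vector<double> targets;
};

// ASSUMPTION: every feature is drawn uniformly from [0, 1].
Dataset makeDataset(int nRecords, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> feature(0.0, 1.0);
    Dataset data;
    for (int r = 0; r < nRecords; ++r) {
        std::vector<double> x(5);
        for (double& v : x) v = feature(gen);
        data.targets.push_back(targetFormula(x));
        data.inputs.push_back(x);
    }
    return data;
}
```

Calling `makeDataset(10000, seed)` with any seed would give a dataset of the size used here. Part of it can then be held out for the accuracy test.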



During training, the code skips records on which the required accuracy has already been achieved and keeps training on the remaining records with larger residual errors. Once it has finished with those, it returns to the entire dataset.
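The skipping schedule described above might look roughly like the sketch below. It is one possible reading of that description, not the code from the download. `LinearStandIn` is a deliberately trivial hypothetical model used only to make the example self-contained, and `trainWithSkipping`, `tol` and `maxPasses` are assumed names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in model (NOT the KAN code): y = w * x + b.
struct LinearStandIn {
    double w = 0.0, b = 0.0, rate = 0.1;
    double predict(double x) const { return w * x + b; }
    void update(double x, double residual) {
        w += rate * residual * x;
        b += rate * residual;
    }
};

std::vector<std::size_t> allIndices(std::size_t n) {
    std::vector<std::size_t> idx(n);
    for (std::size_t i = 0; i < n; ++i) idx[i] = i;
    return idx;
}

// Returns the number of passes used, or -1 if maxPasses was reached.
template <class Model>
int trainWithSkipping(Model& model, const std::vector<double>& xs,
                      const std::vector<double>& ys, double tol, int maxPasses) {
    assert(xs.size() == ys.size());
    std::vector<std::size_t> active = allIndices(xs.size());
    for (int pass = 1; pass <= maxPasses; ++pass) {
        std::vector<std::size_t> stillLarge;
        for (std::size_t i : active) {
            const double residual = ys[i] - model.predict(xs[i]);
            if (std::fabs(residual) <= tol) continue;  // accurate enough: skip
            model.update(xs[i], residual);
            stillLarge.push_back(i);
        }
        if (!stillLarge.empty()) {
            active = stillLarge;  // keep working on the inaccurate records
        } else if (active.size() == xs.size()) {
            return pass;  // a full pass needed no updates: done
        } else {
            active = allIndices(xs.size());  // return to the entire dataset
        }
    }
    return -1;
}
```

The point of the design is that passes over a shrinking active set are cheap, while the final pass over all records confirms that the accuracy holds everywhere.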

The model target $T$ is a sum of 11 individual addends:
$$ z_k = \Phi_k\left(\sum_{p=1}^{5} \phi_{k,p}(x_{p})\right), $$ $$ T = \sum_{k=0}^{10} z_k, $$ which requires 66 functions (11 addends, each with 5 inner functions $\phi_{k,p}$ and 1 outer function $\Phi_k$). The model is initialized randomly at the start.
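    int nModels = 11;  // number of addends in the model
    // Each addend covers an equal share of the target range.
    double zmin = targetMin / nModels;
    double zmax = targetMax / nModels;
    KANAddend** addends = new KANAddend*[nModels];
    for (int i = 0; i < nModels; ++i) {
        // 6 and 12: points in the inner and outer functions; 0.1 and 0.01: regularization parameters.
        addends[i] = new KANAddend(xmin, xmax, zmin, zmax, 6, 12, 0.1, 0.01, f3->nInputs);
    }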
Each addend has 6 function points (5 linear blocks) in each inner function $\phi$ and 12 points in its outer function $\Phi$. The model is fully defined by $11 \times (5 \times 6 + 12) = 462$ parameters.
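To make the relation between function points, linear blocks and the parameter count concrete, here is a minimal sketch. It assumes equally spaced points over the function's domain, which may differ from the actual implementation. `PiecewiseLinear`, `functionCount` and `parameterCount` are hypothetical names, not part of the downloadable code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a piecewise linear function given by its values
// at equally spaced points on [xmin, xmax] (ASSUMED spacing).
struct PiecewiseLinear {
    double xmin, xmax;
    std::vector<double> values;  // one value per function point

    double evaluate(double x) const {
        assert(values.size() >= 2 && xmax > xmin);
        const std::size_t blocks = values.size() - 1;  // 6 points -> 5 blocks
        // Position in units of blocks, clamped to the domain.
        double t = (x - xmin) / (xmax - xmin);
        t = std::min(std::max(t, 0.0), 1.0) * static_cast<double>(blocks);
        const std::size_t left = std::min(static_cast<std::size_t>(t), blocks - 1);
        const double frac = t - static_cast<double>(left);
        // Linear interpolation inside the block [left, left + 1].
        return values[left] * (1.0 - frac) + values[left + 1] * frac;
    }
};

// Each addend has nInputs inner functions and one outer function.
constexpr int functionCount(int nAddends, int nInputs) {
    return nAddends * (nInputs + 1);
}

// One parameter per function point.
constexpr int parameterCount(int nAddends, int nInputs,
                             int innerPoints, int outerPoints) {
    return nAddends * (nInputs * innerPoints + outerPoints);
}
```

With the settings used here, `functionCount(11, 5)` gives 66 and `parameterCount(11, 5, 6, 12)` gives 462, matching the numbers quoted in the text.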

The computational part is fully explained in KAN's core section. The addends need not have the same number of linear blocks, and the regularization parameters (here 0.1 and 0.01) can be chosen individually. The result does not depend significantly on these choices. For example, changing the number of addends to 8, the numbers of points to 10 and 8, and the regularization parameters to 0.05 and 0.07 makes no significant difference in the result. That is an indicator of stability: the choices for the model should not be hard to make, and the end user only needs to hit a very large target.

The logical part is about 250 lines long (*.cpp files) and uses no third-party libraries, only built-in C++ types. It consists of 3 classes and 15 functions, almost all of them shorter than 8 lines. To understand both the mathematical concept and the code, the user only needs another 30 minutes to read KAN's core section.