

Download Piecewise Linear version

Code KAN lin



This is the piecewise linear version. Its accuracy is only marginally lower than that of the spline version, but training is much faster. All datasets of physical systems contain unaccounted features and observation errors, which makes exact modeling impossible. For such data the piecewise linear version is good enough. The spline version should be used when an exact solution exists in theory, for example in the numerical solution of partial differential equations.

A Kolmogorov-Arnold network looks complicated, $$ M(x_1, x_2, x_3, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\left(\sum_{p=1}^{n} \phi_{q,p}(x_{p})\right), $$ but each individual addend is much simpler: $$ \Phi\left(\sum_{p=1}^{n} \phi_{p}(x_{p})\right). $$ The idea of the training process in this code is to take the same difference $\Delta M = M_i - \hat M_i$ between the target $M_i$ of record $i$ and the current model output $\hat M_i$, and use it to update every individual addend. Here is the main processing loop:
            for (int epoch = 0; epoch < 201; ++epoch)
            {
                double error2 = 0.0;
                for (int i = 0; i < inputs.Count; ++i)
                {
                    double residual = target[i];
                    for (int j = 0; j < addends.Length; ++j)
                    {
                        residual -= addends[j].ComputeUsingInput(inputs[i]);
                    }
                    for (int j = 0; j < addends.Length; ++j)
                    {
                        //this method reuses properties computed in ComputeUsingInput
                        addends[j].UpdateUsingMemory(residual);

                        //next method updates independently without reusing properties computed
                        //by ComputeUsingInput
                        //addends[j].UpdateUsingInput(inputs[i], residual);
                    }
                    error2 += residual * residual;
                }
                error2 /= inputs.Count;
                error2 = Math.Sqrt(error2);
                error2 /= (targetMax - targetMin);
                if (0 == epoch % 25)
                {
                    Console.WriteLine("Training step {0}, relative RMSE {1:0.0000}", epoch, error2);
                }
            }
Each addends[j] object is one addend of the sum. For every record, the residual between the target and the sum of all addend outputs is computed once, and the same value is then passed to the update method of every addend.
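
The shared-residual update can be illustrated on a much smaller model. The sketch below is hypothetical and is not the KANAddendPL class from the download: it assumes a model that is a plain sum of two univariate piecewise linear functions on a uniform grid, with no outer function, and a fixed step size of 0.1. The names PiecewiseLinear and SharedResidualDemo, the target function, and all constants are chosen only for this example.

```csharp
using System;

// Hypothetical illustration; this is not the KANAddendPL class from the download.
public class PiecewiseLinear
{
    private readonly double _xmin, _step;
    private readonly double[] _y;  // function values at uniformly spaced nodes
    private int _left;             // left node of the segment used by the last Compute call
    private double _t;             // relative position inside that segment, in [0, 1]

    public PiecewiseLinear(double xmin, double xmax, int nodes)
    {
        _xmin = xmin;
        _step = (xmax - xmin) / (nodes - 1);
        _y = new double[nodes];
    }

    // Linear interpolation between the two nodes around x; remembers the segment.
    public double Compute(double x)
    {
        double pos = (x - _xmin) / _step;
        pos = Math.Max(0.0, Math.Min(pos, _y.Length - 1));
        _left = Math.Min((int)pos, _y.Length - 2);
        _t = pos - _left;
        return (1.0 - _t) * _y[_left] + _t * _y[_left + 1];
    }

    // Moves the two nodes of the remembered segment toward the residual,
    // each in proportion to its interpolation weight.
    public void UpdateUsingMemory(double residual, double mu)
    {
        _y[_left] += mu * residual * (1.0 - _t);
        _y[_left + 1] += mu * residual * _t;
    }
}

public static class SharedResidualDemo
{
    // Fits f0(x0) + f1(x1) to x0^2 + sin(x1); returns the RMSE of the last epoch.
    public static double Train()
    {
        var rnd = new Random(1);
        var xs = new double[1000][];
        for (int i = 0; i < xs.Length; ++i)
        {
            xs[i] = new[] { rnd.NextDouble(), rnd.NextDouble() };
        }
        var f = new[] { new PiecewiseLinear(0.0, 1.0, 6), new PiecewiseLinear(0.0, 1.0, 6) };
        double rmse = 0.0;
        for (int epoch = 0; epoch < 50; ++epoch)
        {
            double sum2 = 0.0;
            foreach (double[] x in xs)
            {
                // One residual per record: target minus the sum of all terms.
                double residual = x[0] * x[0] + Math.Sin(x[1]);
                for (int j = 0; j < f.Length; ++j) residual -= f[j].Compute(x[j]);
                // The same residual updates every term.
                for (int j = 0; j < f.Length; ++j) f[j].UpdateUsingMemory(residual, 0.1);
                sum2 += residual * residual;
            }
            rmse = Math.Sqrt(sum2 / xs.Length);
        }
        return rmse;
    }
}
```

Calling SharedResidualDemo.Train() runs 50 epochs over 1000 random records and returns the final RMSE, which should be small compared with the target range. Storing the segment index and position in Compute is what lets the update run without re-evaluating the input, the same idea the comment in the main loop describes for UpdateUsingMemory.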

Comparisons with an MLP show approximately the same performance and accuracy. This code can be tested on Mike's formula, which is just an algebraic expression with 5 features; the test dataset has 10 000 records.

$$ y = \frac{2 + 2 x_3}{3\pi} \left[ \arctan\left( 20 \left( x_1 - \frac{1}{2} + \frac{x_2}{6} \right) \exp\left(x_5\right) \right) + \frac{\pi}{2} \right] + \frac{2 + 2 x_4}{3\pi} \left[ \arctan\left( 20 \left( x_1 - \frac{1}{2} - \frac{x_2}{6} \right) \exp\left(x_5\right) \right) + \frac{\pi}{2} \right] $$
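
For reference, the formula can be evaluated directly in code. The class below is a hypothetical stand-alone sketch, not the Formula3 class from the download; the name MikesFormula and the convention that the array holds x1 to x5 in order are assumptions. The feature ranges used to generate the 10 000 records are not stated here, so no data generator is included.

```csharp
using System;

// Hypothetical stand-alone evaluator; not the Formula3 class from the download.
public static class MikesFormula
{
    // x must hold the five features x1..x5 in that order (x[0] is x1).
    public static double Evaluate(double[] x)
    {
        double scale = 20.0 * Math.Exp(x[4]);
        // The two bracketed terms of the formula.
        double a = Math.Atan(scale * (x[0] - 0.5 + x[1] / 6.0)) + Math.PI / 2.0;
        double b = Math.Atan(scale * (x[0] - 0.5 - x[1] / 6.0)) + Math.PI / 2.0;
        // Both terms share the denominator 3*pi.
        return ((2.0 + 2.0 * x[2]) * a + (2.0 + 2.0 * x[3]) * b) / (3.0 * Math.PI);
    }
}
```

A quick sanity check: at x1 = 1/2 with all other features equal to zero, both arctangents vanish and the value is 2/3.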

Training takes about half a second and achieves a relative error of about half a percent.
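
The relative error here is the quantity printed by the main processing loop, written out as a formula. It is read off the loop itself, with $N$ the number of records and $M_{\max}$, $M_{\min}$ the largest and smallest target values (targetMax and targetMin in the code):

$$ E = \frac{1}{M_{\max} - M_{\min}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( M_i - \hat M_i \right)^2 } $$

An error near half a percent therefore means $E \approx 0.005$, that is, the RMSE is about 0.5% of the target range. Each residual is taken before the addends are updated for that record.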



For Mike's formula the network configuration needs to be changed, and the number of epochs can be reduced to about 50.
            Formula3 f3 = new Formula3();
            (List<double[]> inputs, List<double> target) = f3.GenerateData(10000);

            (double[] xmin, double[] xmax, double targetMin, double targetMax) = FindMinMax(inputs, target);

            DateTime start = DateTime.Now;

            int nModels = 11;
            double zmin = targetMin / nModels;
            double zmax = targetMax / nModels;
            KANAddendPL[] addends = new KANAddendPL[nModels];
            for (int i = 0; i < nModels; ++i)
            {
                addends[i] = new KANAddendPL(xmin, xmax, zmin, zmax, 6, 12, 0.01, 0.01);
            }