Spambase

Reference to code

The dataset is available from the UCI Machine Learning Repository.
It has 4601 instances and 57 features. A single sample record is shown below; the final value is the class label.

0.15,0,0.46,0,0.61,0,0.3,0,0.92,0.76,0.76,0.92,0,0,0,0,0,0.15,1.23,3.53,2,0,0,0.15,0,0,0,
0,0,0,0,0,0.15,0,0,0,0,0,0,0,0,0,0.3,0,0,0,0,0,0,0.271,0,0.181,0.203,0.022,9.744,445,1257,1
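As a small illustration of the record layout, the sample above can be split into its 57 features and the trailing class label. This is only a sketch in Python, not the code this page refers to; the helper name `parse_record` is hypothetical.

```python
# The sample record from the text, joined back into a single line.
RECORD = (
    "0.15,0,0.46,0,0.61,0,0.3,0,0.92,0.76,0.76,0.92,0,0,0,0,0,0.15,1.23,3.53,2,0,0,0.15,0,0,0,"
    "0,0,0,0,0,0.15,0,0,0,0,0,0,0,0,0,0.3,0,0,0,0,0,0,0.271,0,0.181,0.203,0.022,9.744,445,1257,1"
)

def parse_record(line):
    """Split one comma-separated record into (features, label)."""
    values = [float(v) for v in line.strip().split(",")]
    # Everything except the last value is a feature; the last value is the class.
    return values[:-1], int(values[-1])

features, label = parse_record(RECORD)
print(len(features), label)
```

The same helper could be applied line by line to the full data file, assuming every row has the same 58 comma-separated values.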

The training block in the code is 62% of all records and the rest is used for validation. Accuracy is $93.3 \pm 0.4$ percent, which is in line with what other people report. Execution takes about $0.06$ seconds. The performance is hard to compare directly, because I found only Python implementations. The result can be compared to this recent preprint, which reports training times of 0.69 for MLP and 0.38 for KAN.
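The split and the way an accuracy figure with a spread can be obtained may be sketched as follows. This is an illustration under assumptions, not the project's code: the text does not say how the 62% boundary is rounded or how the spread is computed, so truncation and a sample standard deviation over repeated runs are assumed here, and the names `split_indices` and `summarize` are hypothetical.

```python
import random
import statistics

def split_indices(n_records, train_fraction=0.62, seed=0):
    """Shuffle record indices and cut them into training and validation parts."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Assumption: the training size is truncated to an integer.
    n_train = int(train_fraction * n_records)
    return indices[:n_train], indices[n_train:]

def summarize(accuracies):
    """Return (mean, sample standard deviation) of per-run accuracies."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)

train_idx, val_idx = split_indices(4601)
print(len(train_idx), len(val_idx))  # 2852 1749
```

Calling `split_indices` with different seeds, training on each split, and passing the resulting validation accuracies to `summarize` would give a mean and spread in the same form as the figure quoted above.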

I have to note that the KAN implementation in the preprint uses an improved Broyden method, while mine uses the Kaczmarz method, explained in KAN's Core.
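For readers unfamiliar with the name, the textbook Kaczmarz iteration for a linear system $Ax = b$ projects the current estimate onto one equation at a time. The sketch below shows only that generic form on a tiny made-up system; how it is adapted to KAN training is described in KAN's Core and is not reproduced here.

```python
def kaczmarz(A, b, sweeps=100):
    """Cyclic Kaczmarz iteration for a consistent linear system A x = b."""
    x = [0.0] * len(A[0])
    for _ in range(sweeps):
        for row, target in zip(A, b):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0.0:
                continue  # an all-zero row carries no information
            # Residual of the current equation at the current estimate.
            residual = target - sum(a * xi for a, xi in zip(row, x))
            # Project x onto the hyperplane defined by this equation.
            step = residual / norm_sq
            x = [xi + step * a for xi, a in zip(x, row)]
    return x

# Illustrative 2x2 system with exact solution x = (1, 3).
A = [[2.0, 1.0], [1.0, 3.0]]
b = [5.0, 10.0]
x = kaczmarz(A, b)
print(x)
```

Each update touches a single row, which is what makes the method attractive for record-by-record training.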