kan package

Submodules

kan.KAN module

class kan.KAN.KAN(*args: Any, **kwargs: Any)

Bases: Module

KAN class

Attributes:

biases: a list of nn.Linear()

biases are added on nodes (in principle, biases can be absorbed into activation functions. However, we still have them for better optimization)

act_fun: a list of KANLayer

KANLayers

depth: int

depth of KAN

width: list

number of neurons in each layer. e.g., [2,5,5,3] means 2D inputs, 3D outputs, and 2 hidden layers with 5 neurons each.

grid: int

the number of grid intervals

k: int

the order of piecewise polynomial

base_fun: fun

residual function b(x); each activation function is phi(x) = sb_scale * b(x) + sp_scale * spline(x)

symbolic_fun: a list of Symbolic_KANLayer

Symbolic_KANLayers

symbolic_enabled: bool

If False, the symbolic front is not computed (to save time). Default: True.

Methods:

__init__():

initialize a KAN

initialize_from_another_model():

initialize a KAN from another KAN (with the same shape, but potentially different grids)

update_grid_from_samples():

update spline grids based on samples

initialize_grid_from_another_model():

initialize KAN grids from another KAN

forward():

forward

set_mode():

set the mode of an activation function: ‘n’ for numeric, ‘s’ for symbolic, ‘ns’ for combined (note they are visualized differently in plot(). ‘n’ as black, ‘s’ as red, ‘ns’ as purple).

fix_symbolic():

fix an activation function to be symbolic

suggest_symbolic():

suggest the symbolic candidates of a numeric spline-based activation function

lock():

lock activation functions to share parameters

unlock():

unlock locked activations

get_range():

get the input and output ranges of an activation function

plot():

plot the diagram of KAN

train():

train KAN

prune():

prune KAN

remove_edge():

remove an edge of KAN

remove_node():

remove a node of KAN

auto_symbolic():

automatically fit all splines to be symbolic functions

symbolic_formula():

obtain the symbolic formula of the KAN network

__init__(width=None, grid=3, k=3, noise_scale=0.1, noise_scale_base=0.1, base_fun=torch.nn.SiLU, symbolic_enabled=True, bias_trainable=True, grid_eps=1.0, grid_range=[-1, 1], sp_trainable=True, sb_trainable=True, device='cpu', seed=0)

initialize a KAN model

Args:

width : list of int

\([n_0, n_1, .., n_{L-1}]\) specifies the number of neurons in each layer (including inputs/outputs)

grid : int

number of grid intervals. Default: 3.

k : int

order of piecewise polynomial. Default: 3.

noise_scale : float

initial injected noise to spline. Default: 0.1.

base_fun : fun

the residual function b(x). Default: torch.nn.SiLU().

symbolic_enabled : bool

compute or skip symbolic computations (for efficiency). Default: True.

bias_trainable : bool

whether bias parameters are updated. Default: True.

grid_eps : float

When grid_eps = 0, the grid is uniform; when grid_eps = 1, the grid is partitioned using percentiles of samples. 0 < grid_eps < 1 interpolates between the two extremes. Default: 0.02.

grid_range : list/np.array of shape (2,)

setting the range of grids. Default: [-1,1].

sp_trainable : bool

If true, scale_sp is trainable. Default: True.

sb_trainable : bool

If true, scale_base is trainable. Default: True.

device : str

device

seed : int

random seed

Returns:

self

Example

>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> (model.act_fun[0].in_dim, model.act_fun[0].out_dim), (model.act_fun[1].in_dim, model.act_fun[1].out_dim)
((2, 5), (5, 1))
auto_symbolic(a_range=(-10, 10), b_range=(-10, 10), lib=None, verbose=1)

automatic symbolic regression: use the top-1 suggestion from suggest_symbolic to replace each spline with a symbolic activation

Args:

lib : None or a list of function names

the symbolic library

verbose : int

verbosity

Returns:

None (print suggested symbolic formulas)

Example 1

>>> # default library
>>> from utils import create_dataset
>>> model = KAN(width=[2,5,1], grid=5, k=3, noise_scale=0.1, seed=0)
>>> f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
>>> dataset = create_dataset(f, n_var=2)
>>> model.train(dataset, opt='LBFGS', steps=50, lamb=0.01);
>>> model = model.prune()
>>> model(dataset['train_input'])
>>> model.auto_symbolic()
fixing (0,0,0) with sin, r2=0.9994837045669556
fixing (0,1,0) with cosh, r2=0.9978033900260925
fixing (1,0,0) with arctan, r2=0.9997088313102722

Example 2

>>> # customized library
>>> from utils import create_dataset
>>> model = KAN(width=[2,5,1], grid=5, k=3, noise_scale=0.1, seed=0)
>>> f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
>>> dataset = create_dataset(f, n_var=2)
>>> model.train(dataset, opt='LBFGS', steps=50, lamb=0.01);
>>> model = model.prune()
>>> model(dataset['train_input'])
>>> model.auto_symbolic(lib=['exp','sin','x^2'])
fixing (0,0,0) with sin, r2=0.999411404132843
fixing (0,1,0) with x^2, r2=0.9962921738624573
fixing (1,0,0) with exp, r2=0.9980258941650391
clear_ckpts(folder='./model_ckpt')

clear all checkpoints

Args:

folder : str

the folder that stores checkpoints

Returns:

None
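
A minimal usage sketch (not from the original docs); the checkpoint name 'step0' is hypothetical:

>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> model.save_ckpt('step0')   # written under ./model_ckpt by default
>>> model.clear_ckpts()        # removes all checkpoints in that folder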

fix_symbolic(l, i, j, fun_name, fit_params_bool=True, a_range=(-10, 10), b_range=(-10, 10), verbose=True, random=False)

set (l,i,j) activation to be symbolic (specified by fun_name)

Args:

l : int

layer index

i : int

input neuron index

j : int

output neuron index

fun_name : str

function name

fit_params_bool : bool

obtaining affine parameters through fitting (True) or setting default values (False)

a_range : tuple

sweeping range of a

b_range : tuple

sweeping range of b

verbose : bool

If True, more information is printed.

random : bool

initialize affine parameters randomly or as [1,0,1,0]

Returns:

None or r2 (coefficient of determination)

Example 1

>>> # when fit_params_bool = False
>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> model.fix_symbolic(0,1,3,'sin',fit_params_bool=False)
>>> print(model.act_fun[0].mask.reshape(2,5))
>>> print(model.symbolic_fun[0].mask.reshape(2,5))
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 0., 1., 1.]])
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0.]])

Example 2

>>> # when fit_params_bool = True
>>> model = KAN(width=[2,5,1], grid=5, k=3, noise_scale=1.)
>>> x = torch.normal(0,1,size=(100,2))
>>> model(x) # obtain activations (otherwise model does not have attributes acts)
>>> model.fix_symbolic(0,1,3,'sin',fit_params_bool=True)
>>> print(model.act_fun[0].mask.reshape(2,5))
>>> print(model.symbolic_fun[0].mask.reshape(2,5))
r2 is 0.8131332993507385
r2 is not very high, please double check if you are choosing the correct symbolic function.
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 0., 1., 1.]])
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0.]])
forward(x)

KAN forward

Args:

x : 2D torch.float

inputs, shape (batch, input dimension)

Returns:

y : 2D torch.float

outputs, shape (batch, output dimension)

Example

>>> model = KAN(width=[2,5,3], grid=5, k=3)
>>> x = torch.normal(0,1,size=(100,2))
>>> model(x).shape
torch.Size([100, 3])
get_range(l, i, j, verbose=True)

Get the input range and output range of the (l,i,j) activation

Args:

l : int

layer index

i : int

input neuron index

j : int

output neuron index

Returns:

x_min : float

minimum of input

x_max : float

maximum of input

y_min : float

minimum of output

y_max : float

maximum of output

Example

>>> model = KAN(width=[2,3,1], grid=5, k=3, noise_scale=1.)
>>> x = torch.normal(0,1,size=(100,2))
>>> model(x) # do a forward pass to obtain model.acts
>>> model.get_range(0,0,0)
x range: [-2.13 , 2.75 ]
y range: [-0.50 , 1.83 ]
(tensor(-2.1288), tensor(2.7498), tensor(-0.5042), tensor(1.8275))
initialize_from_another_model(another_model, x)

initialize from a parent model. The parent has the same width as the current model but may have different grids.

Args:

another_model : KAN

the parent model used to initialize the current model

x : 2D torch.float

inputs, shape (batch, input dimension)

Returns:

self : KAN

Example

>>> model_coarse = KAN(width=[2,5,1], grid=5, k=3)
>>> model_fine = KAN(width=[2,5,1], grid=10, k=3)
>>> print(model_fine.act_fun[0].coef[0][0].data)
>>> x = torch.normal(0,1,size=(100,2))
>>> model_fine.initialize_from_another_model(model_coarse, x);
>>> print(model_fine.act_fun[0].coef[0][0].data)
tensor(-0.0030)
tensor(0.0506)
initialize_grid_from_another_model(model, x)

initialize grid from a parent model

Args:

model : KAN

parent model

x : 2D torch.float

inputs, shape (batch, input dimension)

Returns:

None

Example

>>> model_parent = KAN(width=[1,1], grid=5, k=3)
>>> model_parent.act_fun[0].grid.data = torch.linspace(-2,2,steps=6)[None,:]
>>> x = torch.linspace(-2,2,steps=1001)[:,None]
>>> model = KAN(width=[1,1], grid=5, k=3)
>>> print(model.act_fun[0].grid.data)
>>> model = model.initialize_from_another_model(model_parent, x)
>>> print(model.act_fun[0].grid.data)
tensor([[-1.0000, -0.6000, -0.2000,  0.2000,  0.6000,  1.0000]])
tensor([[-2.0000, -1.2000, -0.4000,  0.4000,  1.2000,  2.0000]])
load_ckpt(name, folder='./model_ckpt')

load a checkpoint to the current model

Args:

name : str

the name of the checkpoint to be loaded

folder : str

the folder that stores checkpoints

Returns:

None
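
A minimal round-trip sketch (not from the original docs); the checkpoint name 'before_train' is hypothetical:

>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> model.save_ckpt('before_train')   # snapshot the current parameters
>>> # ... train or otherwise modify the model ...
>>> model.load_ckpt('before_train')   # restore the saved parameters into the same model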

lock(l, ids)

lock ids in the l-th layer to be the same function

Args:

l : int

layer index

ids : 2D list

\([[i_1,j_1],[i_2,j_2],...]\) set \((l,i_1,j_1), (l,i_2,j_2), ...\) to be the same function

Returns:

None

Example

>>> model = KAN(width=[2,3,1], grid=5, k=3, noise_scale=1.)
>>> print(model.act_fun[0].weight_sharing.reshape(3,2))
>>> model.lock(0,[[1,0],[1,1]])
>>> print(model.act_fun[0].weight_sharing.reshape(3,2))
tensor([[0, 1],
        [2, 3],
        [4, 5]])
tensor([[0, 1],
        [2, 1],
        [4, 5]])
plot(folder='./figures', beta=3, mask=False, mode='supervised', scale=0.5, tick=False, sample=False, in_vars=None, out_vars=None, title=None)

plot KAN

Args:

folder : str

the folder to store pngs

beta : float

positive number. Controls the transparency of each activation; transparency = tanh(beta*l1).

mask : bool

If True, plot with mask (need to run prune() first to obtain mask). If False (by default), plot all activation functions.

mode : str

“supervised” or “unsupervised”. If “supervised”, l1 is measured by absolute value (not subtracting mean); if “unsupervised”, l1 is measured by standard deviation (subtracting mean).

scale : float

controls the size of the diagram

in_vars: None or list of str

the name(s) of input variables

out_vars: None or list of str

the name(s) of output variables

title: None or str

title

Returns:

Figure

Example

>>> # see more interactive examples in demos
>>> model = KAN(width=[2,3,1], grid=3, k=3, noise_scale=1.0)
>>> x = torch.normal(0,1,size=(100,2))
>>> model(x) # do a forward pass to obtain model.acts
>>> model.plot()
prune(threshold=0.01, mode='auto', active_neurons_id=None)

prune KAN at the node level. If a node has only small incoming or outgoing connections, it is pruned away.

Args:

threshold : float

the threshold used to determine whether a node is small enough

mode : str

“auto” or “manual”. If “auto”, the threshold will be used to automatically prune away nodes. If “manual”, active_neurons_id is needed to specify which neurons are kept (others are thrown away).

active_neurons_id : list of id lists

For example, [[0,1],[0,2,3]] means keeping the 0/1 neuron in the 1st hidden layer and the 0/2/3 neuron in the 2nd hidden layer. Pruning input and output neurons is not supported yet.

Returns:

model2 : KAN

pruned model

Example

>>> # for more interactive examples, please see demos
>>> from utils import create_dataset
>>> model = KAN(width=[2,5,1], grid=5, k=3, noise_scale=0.1, seed=0)
>>> f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
>>> dataset = create_dataset(f, n_var=2)
>>> model.train(dataset, opt='LBFGS', steps=50, lamb=0.01);
>>> model.prune()
>>> model.plot(mask=True)
remove_edge(l, i, j)

remove activation phi(l,i,j) (set its mask to zero)

Args:

l : int

layer index

i : int

input neuron index

j : int

output neuron index

Returns:

None
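
A minimal sketch (not part of the original docs) showing the effect on the layer mask:

>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> model.remove_edge(0,0,1)       # deactivate the activation from input neuron 0 to hidden neuron 1
>>> print(model.act_fun[0].mask)   # the corresponding mask entry is now zero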

remove_node(l, i)

remove neuron (l,i) (set the masks of all incoming and outgoing activation functions to zero)

Args:

l : int

layer index

i : int

neuron index

Returns:

None
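
A minimal sketch (not part of the original docs); removing a hidden neuron masks its incoming and outgoing activations:

>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> model.remove_node(1,2)          # remove neuron 2 in layer 1 (the hidden layer)
>>> print(model.act_fun[0].mask)    # edges into the removed neuron are masked
>>> print(model.act_fun[1].mask)    # edges out of the removed neuron are masked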

save_ckpt(name, folder='./model_ckpt')

save the current model as checkpoint

Args:

name : str

the name of the checkpoint to be saved

folder : str

the folder that stores checkpoints

Returns:

None
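
A minimal sketch (not from the original docs); 'init' and './my_ckpts' are hypothetical names:

>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> model.save_ckpt('init')                       # saved under ./model_ckpt by default
>>> model.save_ckpt('init', folder='./my_ckpts')  # or under an explicit folder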

set_mode(l, i, j, mode, mask_n=None)

set (l,i,j) activation to have mode

Args:

l : int

layer index

i : int

input neuron index

j : int

output neuron index

mode : str

‘n’ (numeric) or ‘s’ (symbolic) or ‘ns’ (combined)

mask_n : None or float

magnitude of the numeric front

Returns:

None
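
A minimal sketch (not from the original docs): fix an activation to be symbolic, then switch its mode back to numeric:

>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> model.fix_symbolic(0,1,3,'sin',fit_params_bool=False)   # (0,1,3) is now in symbolic mode
>>> model.set_mode(0,1,3,'n')                               # switch it back to the numeric spline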

suggest_symbolic(l, i, j, a_range=(-10, 10), b_range=(-10, 10), lib=None, topk=5, verbose=True)

suggest the symbolic candidates of phi(l,i,j)

Args:

l : int

layer index

i : int

input neuron index

j : int

output neuron index

lib : dict

library of symbolic bases. If lib = None, the global default library will be used.

topk : int

display the top k symbolic functions (according to r2)

verbose : bool

If True, more information will be printed.

Returns:

None

Example

>>> model = KAN(width=[2,5,1], grid=5, k=3, noise_scale=0.1, seed=0)
>>> f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
>>> dataset = create_dataset(f, n_var=2)
>>> model.train(dataset, opt='LBFGS', steps=50, lamb=0.01);
>>> model = model.prune()
>>> model(dataset['train_input'])
>>> model.suggest_symbolic(0,0,0)
function , r2
sin , 0.9994412064552307
gaussian , 0.9196369051933289
tanh , 0.8608126044273376
sigmoid , 0.8578218817710876
arctan , 0.842217743396759
symbolic_formula(floating_digit=2, var=None, normalizer=None, simplify=False, output_normalizer=None)

obtain the symbolic formula

Args:

floating_digit : int

the number of digits to display

var : list of str

the name of variables (if not provided, by default using [‘x_1’, ‘x_2’, …])

normalizer : [mean array (floats), variance array (floats)]

the normalization applied to inputs

simplify : bool

If True, simplify the equation at each step (usually quite slow), so it is False by default.

output_normalizer : [mean array (floats), variance array (floats)]

the normalization applied to outputs

Returns:

symbolic formula : sympy function

Example

>>> model = KAN(width=[2,5,1], grid=5, k=3, noise_scale=0.1, seed=0, grid_eps=0.02)
>>> f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
>>> dataset = create_dataset(f, n_var=2)
>>> model.train(dataset, opt='LBFGS', steps=50, lamb=0.01);
>>> model = model.prune()
>>> model(dataset['train_input'])
>>> model.auto_symbolic(lib=['exp','sin','x^2'])
>>> model.train(dataset, opt='LBFGS', steps=50, lamb=0.00, update_grid=False);
>>> model.symbolic_formula()
train(dataset, opt='LBFGS', steps=100, log=1, lamb=0.0, lamb_l1=1.0, lamb_entropy=2.0, lamb_coef=0.0, lamb_coefdiff=0.0, update_grid=True, grid_update_num=10, loss_fn=None, lr=1.0, stop_grid_update_step=50, batch=-1, small_mag_threshold=1e-16, small_reg_factor=1.0, metrics=None, sglr_avoid=False, save_fig=False, in_vars=None, out_vars=None, beta=3, save_fig_freq=1, img_folder='./video', device='cpu')

training

Args:

dataset : dict

contains dataset[‘train_input’], dataset[‘train_label’], dataset[‘test_input’], dataset[‘test_label’]

opt : str

“LBFGS” or “Adam”

steps : int

training steps

log : int

logging frequency

lamb : float

overall penalty strength

lamb_l1 : float

l1 penalty strength

lamb_entropy : float

entropy penalty strength

lamb_coef : float

coefficient magnitude penalty strength

lamb_coefdiff : float

penalty strength on the difference of nearby coefficients (smoothness)

update_grid : bool

If True, update grid regularly before stop_grid_update_step

grid_update_num : int

the number of grid updates before stop_grid_update_step

stop_grid_update_step : int

no grid updates after this training step

batch : int

batch size; if -1, use the full batch.

small_mag_threshold : float

threshold to determine large or small numbers (may want to apply larger penalty to smaller numbers)

small_reg_factor : float

penalty strength applied to small factors relative to large factors

device : str

device

save_fig_freq : int

save figure every (save_fig_freq) steps

Returns:

results : dict

results[‘train_loss’]: 1D array of training losses (RMSE); results[‘test_loss’]: 1D array of test losses (RMSE); results[‘reg’]: 1D array of regularization values

Example

>>> # for interactive examples, please see demos
>>> from utils import create_dataset
>>> model = KAN(width=[2,5,1], grid=5, k=3, noise_scale=0.1, seed=0)
>>> f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
>>> dataset = create_dataset(f, n_var=2)
>>> model.train(dataset, opt='LBFGS', steps=50, lamb=0.01);
>>> model.plot()
unfix_symbolic(l, i, j)

unfix the (l,i,j) activation function.

unfix_symbolic_all()

unfix all activation functions.
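
A minimal sketch (not from the original docs):

>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> model.fix_symbolic(0,1,3,'sin',fit_params_bool=False)
>>> model.unfix_symbolic(0,1,3)     # revert that single activation to its numeric spline
>>> model.unfix_symbolic_all()      # or revert every symbolically fixed activation at once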

unlock(l, ids)

unlock locked activation functions (specified by ids) in the l-th layer

Args:

l : int

layer index

ids : 2D list

[[i1,j1],[i2,j2],…] set (l,i1,j1), (l,i2,j2), … to be unlocked

Example:

>>> model = KAN(width=[2,3,1], grid=5, k=3, noise_scale=1.)
>>> model.lock(0,[[1,0],[1,1]])
>>> print(model.act_fun[0].weight_sharing.reshape(3,2))
>>> model.unlock(0,[[1,0],[1,1]])
>>> print(model.act_fun[0].weight_sharing.reshape(3,2))
tensor([[0, 1],
        [2, 1],
        [4, 5]])
tensor([[0, 1],
        [2, 3],
        [4, 5]])
update_grid_from_samples(x)

update grid from samples

Args:

x : 2D torch.float

inputs, shape (batch, input dimension)

Returns:

None

Example

>>> model = KAN(width=[2,5,1], grid=5, k=3)
>>> print(model.act_fun[0].grid[0].data)
>>> x = torch.rand(100,2)*5
>>> model.update_grid_from_samples(x)
>>> print(model.act_fun[0].grid[0].data)
tensor([-1.0000, -0.6000, -0.2000,  0.2000,  0.6000,  1.0000])
tensor([0.0128, 1.0064, 2.0000, 2.9937, 3.9873, 4.9809])

kan.KANLayer module

class kan.KANLayer.KANLayer(*args: Any, **kwargs: Any)

Bases: Module

KANLayer class

Attributes:

in_dim: int

input dimension

out_dim: int

output dimension

size: int

the number of splines = input dimension * output dimension

k: int

the piecewise polynomial order of splines

grid: 2D torch.float

grid points

noises: 2D torch.float

injected noises to splines at initialization (to break degeneracy)

coef: 2D torch.tensor

coefficients of B-spline bases

scale_base: 1D torch.float

magnitude of the residual function b(x)

scale_sp: 1D torch.float

magnitude of the spline function spline(x)

base_fun: fun

residual function b(x)

mask: 1D torch.float

mask of spline functions. Setting an element of the mask to zero sets the corresponding activation to the zero function.

grid_eps: float in [0,1]

a hyperparameter used in update_grid_from_samples. When grid_eps = 0, the grid is uniform; when grid_eps = 1, the grid is partitioned using percentiles of samples. 0 < grid_eps < 1 interpolates between the two extremes.

weight_sharing: 1D tensor int

allow spline activations to share parameters

lock_counter: int

counts how many activation functions are locked (weight sharing)

lock_id: 1D torch.int

the id of activation functions that are locked

device: str

device

Methods:

__init__():

initialize a KANLayer

forward():

forward

update_grid_from_samples():

update grids based on samples’ incoming activations

initialize_grid_from_parent():

initialize grids from another model

get_subset():

get subset of the KANLayer (used for pruning)

lock():

lock several activation functions to share parameters

unlock():

unlock already locked activation functions

__init__(in_dim=3, out_dim=2, num=5, k=3, noise_scale=0.1, scale_base=1.0, scale_sp=1.0, base_fun=torch.nn.SiLU, grid_eps=0.02, grid_range=[-1, 1], sp_trainable=True, sb_trainable=True, device='cpu')

initialize a KANLayer

Args:

in_dim : int

input dimension. Default: 3.

out_dim : int

output dimension. Default: 2.

num : int

the number of grid intervals = G. Default: 5.

k : int

the order of piecewise polynomial. Default: 3.

noise_scale : float

the scale of noise injected at initialization. Default: 0.1.

scale_base : float

the scale of the residual function b(x). Default: 1.0.

scale_sp : float

the scale of the spline function spline(x). Default: 1.0.

base_fun : function

residual function b(x). Default: torch.nn.SiLU()

grid_eps : float

When grid_eps = 0, the grid is uniform; when grid_eps = 1, the grid is partitioned using percentiles of samples. 0 < grid_eps < 1 interpolates between the two extremes. Default: 0.02.

grid_range : list/np.array of shape (2,)

setting the range of grids. Default: [-1,1].

sp_trainable : bool

If true, scale_sp is trainable. Default: True.

sb_trainable : bool

If true, scale_base is trainable. Default: True.

device : str

device

Returns:

self

Example

>>> model = KANLayer(in_dim=3, out_dim=5)
>>> (model.in_dim, model.out_dim)
(3, 5)
forward(x)

KANLayer forward given input x

Args:

x : 2D torch.float

inputs, shape (number of samples, input dimension)

Returns:

y : 2D torch.float

outputs, shape (number of samples, output dimension)

preacts : 3D torch.float

fan out x into activations, shape (number of samples, output dimension, input dimension)

postacts : 3D torch.float

the outputs of activation functions with preacts as inputs

postspline : 3D torch.float

the outputs of spline functions with preacts as inputs

Example

>>> model = KANLayer(in_dim=3, out_dim=5)
>>> x = torch.normal(0,1,size=(100,3))
>>> y, preacts, postacts, postspline = model(x)
>>> y.shape, preacts.shape, postacts.shape, postspline.shape
(torch.Size([100, 5]),
 torch.Size([100, 5, 3]),
 torch.Size([100, 5, 3]),
 torch.Size([100, 5, 3]))
get_subset(in_id, out_id)

get a smaller KANLayer from a larger KANLayer (used for pruning)

Args:

in_id : list

id of selected input neurons

out_id : list

id of selected output neurons

Returns:

spb : KANLayer

Example

>>> kanlayer_large = KANLayer(in_dim=10, out_dim=10, num=5, k=3)
>>> kanlayer_small = kanlayer_large.get_subset([0,9],[1,2,3])
>>> kanlayer_small.in_dim, kanlayer_small.out_dim
(2, 3)
initialize_grid_from_parent(parent, x)

update grid from a parent KANLayer & samples

Args:

parent : KANLayer

a parent KANLayer (whose grid is usually coarser than the current model)

x : 2D torch.float

inputs, shape (number of samples, input dimension)

Returns:

None

Example

>>> batch = 100
>>> parent_model = KANLayer(in_dim=1, out_dim=1, num=5, k=3)
>>> print(parent_model.grid.data)
>>> model = KANLayer(in_dim=1, out_dim=1, num=10, k=3)
>>> x = torch.normal(0,1,size=(batch, 1))
>>> model.initialize_grid_from_parent(parent_model, x)
>>> print(model.grid.data)
tensor([[-1.0000, -0.6000, -0.2000,  0.2000,  0.6000,  1.0000]])
tensor([[-1.0000, -0.8000, -0.6000, -0.4000, -0.2000,  0.0000,  0.2000,  0.4000,
  0.6000,  0.8000,  1.0000]])
lock(ids)

lock activation functions to share parameters based on ids

Args:

ids : list

list of ids of activation functions

Returns:

None

Example

>>> model = KANLayer(in_dim=3, out_dim=3, num=5, k=3)
>>> print(model.weight_sharing.reshape(3,3))
>>> model.lock([[0,0],[1,2],[2,1]]) # set (0,0),(1,2),(2,1) functions to be the same
>>> print(model.weight_sharing.reshape(3,3))
tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])
tensor([[0, 1, 2],
        [3, 4, 0],
        [6, 0, 8]])
unlock(ids)

unlock activation functions

Args:

ids : list

list of ids of activation functions

Returns:

None

Example

>>> model = KANLayer(in_dim=3, out_dim=3, num=5, k=3)
>>> model.lock([[0,0],[1,2],[2,1]]) # set (0,0),(1,2),(2,1) functions to be the same
>>> print(model.weight_sharing.reshape(3,3))
>>> model.unlock([[0,0],[1,2],[2,1]]) # unlock the locked functions
>>> print(model.weight_sharing.reshape(3,3))
tensor([[0, 1, 2],
        [3, 4, 0],
        [6, 0, 8]])
tensor([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]])
update_grid_from_samples(x)

update grid from samples

Args:

x : 2D torch.float

inputs, shape (number of samples, input dimension)

Returns:

None

Example

>>> model = KANLayer(in_dim=1, out_dim=1, num=5, k=3)
>>> print(model.grid.data)
>>> x = torch.linspace(-3,3,steps=100)[:,None]
>>> model.update_grid_from_samples(x)
>>> print(model.grid.data)
tensor([[-1.0000, -0.6000, -0.2000,  0.2000,  0.6000,  1.0000]])
tensor([[-3.0002, -1.7882, -0.5763,  0.6357,  1.8476,  3.0002]])

kan.LBFGS module

class kan.LBFGS.LBFGS(*args: Any, **kwargs: Any)

Bases: Optimizer

Implements L-BFGS algorithm.

Heavily inspired by minFunc.

Warning

This optimizer doesn’t support per-parameter options and parameter groups (there can be only one).

Warning

Right now all parameters have to be on a single device. This will be improved in the future.

Note

This is a very memory intensive optimizer (it requires additional param_bytes * (history_size + 1) bytes). If it doesn’t fit in memory try reducing the history size, or use a different algorithm.

Args:

lr (float): learning rate (default: 1)

max_iter (int): maximal number of iterations per optimization step (default: 20)

max_eval (int): maximal number of function evaluations per optimization step (default: max_iter * 1.25)

tolerance_grad (float): termination tolerance on first-order optimality (default: 1e-7)

tolerance_change (float): termination tolerance on function value/parameter changes (default: 1e-9)

history_size (int): update history size (default: 100)

line_search_fn (str): either ‘strong_wolfe’ or None (default: None)

__init__(params, lr=1, max_iter=20, max_eval=None, tolerance_grad=1e-07, tolerance_change=1e-09, tolerance_ys=1e-32, history_size=100, line_search_fn=None)
step(closure)

Perform a single optimization step.

Args:

closure (Callable): A closure that reevaluates the model and returns the loss.
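
A minimal closure sketch (not from the original docs), fitting a single parameter to a quadratic loss; the parameter name w and the target value 3.0 are arbitrary:

>>> import torch
>>> from kan.LBFGS import LBFGS
>>> w = torch.nn.Parameter(torch.tensor([0.0]))
>>> opt = LBFGS([w], lr=1, max_iter=20, line_search_fn='strong_wolfe')
>>> def closure():
...     opt.zero_grad()                  # reset gradients
...     loss = ((w - 3.0) ** 2).sum()    # simple quadratic objective
...     loss.backward()
...     return loss
>>> opt.step(closure)
>>> w.data  # should be close to tensor([3.])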

kan.Symbolic_KANLayer module

class kan.Symbolic_KANLayer.Symbolic_KANLayer(*args: Any, **kwargs: Any)

Bases: Module

Symbolic_KANLayer class

Attributes:

in_dim: int

input dimension

out_dim: int

output dimension

funs: 2D array of torch functions (or lambda functions)

symbolic functions (torch)

funs_name: 2D array of str

names of symbolic functions

funs_sympy: 2D array of sympy functions (or lambda functions)

symbolic functions (sympy)

affine: 3D array of floats

affine transformations of inputs and outputs

Methods:

__init__():

initialize a Symbolic_KANLayer

forward():

forward

get_subset():

get subset of the KANLayer (used for pruning)

fix_symbolic():

fix an activation function to be symbolic

__init__(in_dim=3, out_dim=2, device='cpu')

initialize a Symbolic_KANLayer (activation functions are initialized to be identity functions)

Args:

in_dim : int

input dimension

out_dim : int

output dimension

device : str

device

Returns:

self

Example

>>> sb = Symbolic_KANLayer(in_dim=3, out_dim=3)
>>> len(sb.funs), len(sb.funs[0])
(3, 3)
fix_symbolic(i, j, fun_name, x=None, y=None, random=False, a_range=(-10, 10), b_range=(-10, 10), verbose=True)

fix an activation function to be symbolic

Args:

i : int

the id of input neuron

j : int

the id of output neuron

fun_name : str

the name of the symbolic function

x : 1D array

preactivations

y : 1D array

postactivations

a_range : tuple

sweeping range of a

b_range : tuple

sweeping range of b

verbose : bool

print more information if True

Returns:

r2 (coefficient of determination)

Example 1

>>> # when x & y are not provided. Affine parameters are set to a = 1, b = 0, c = 1, d = 0
>>> sb = Symbolic_KANLayer(in_dim=3, out_dim=2)
>>> sb.fix_symbolic(2,1,'sin')
>>> print(sb.funs_name)
>>> print(sb.affine)
[['', '', ''], ['', '', 'sin']]
Parameter containing:
tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [1., 0., 1., 0.]], requires_grad=True)

Example 2

>>> # when x & y are provided, fit_params() is called to find the best fit coefficients
>>> sb = Symbolic_KANLayer(in_dim=3, out_dim=2)
>>> batch = 100
>>> x = torch.linspace(-1,1,steps=batch)
>>> noises = torch.normal(0,1,(batch,)) * 0.02
>>> y = 5.0*torch.sin(3.0*x + 2.0) + 0.7 + noises
>>> sb.fix_symbolic(2,1,'sin',x,y)
>>> print(sb.funs_name)
>>> print(sb.affine[1,2,:].data)
r2 is 0.9999701976776123
[['', '', ''], ['', '', 'sin']]
tensor([2.9981, 1.9997, 5.0039, 0.6978])
forward(x)

Args:

x : 2D array

inputs, shape (batch, input dimension)

Returns:

y : 2D array

outputs, shape (batch, output dimension)

postacts : 3D array

activations after activation functions but before summing on nodes

Example

>>> sb = Symbolic_KANLayer(in_dim=3, out_dim=5)
>>> x = torch.normal(0,1,size=(100,3))
>>> y, postacts = sb(x)
>>> y.shape, postacts.shape
(torch.Size([100, 5]), torch.Size([100, 5, 3]))
get_subset(in_id, out_id)

get a smaller Symbolic_KANLayer from a larger Symbolic_KANLayer (used for pruning)

Args:

in_id : list

id of selected input neurons

out_id : list

id of selected output neurons

Returns:

spb : Symbolic_KANLayer

Example

>>> sb_large = Symbolic_KANLayer(in_dim=10, out_dim=10)
>>> sb_small = sb_large.get_subset([0,9],[1,2,3])
>>> sb_small.in_dim, sb_small.out_dim
(2, 3)

kan.spline module

kan.spline.B_batch(x, grid, k=0, extend=True, device='cpu')

evaluate x on B-spline bases

Args:

x : 2D torch.tensor

inputs, shape (number of splines, number of samples)

grid : 2D torch.tensor

grids, shape (number of splines, number of grid points)

k : int

the piecewise polynomial order of splines.

extend : bool

If True, k points are extended on both ends. If False, no extension (zero boundary condition). Default: True

device : str

device

Returns:

spline values : 3D torch.tensor

shape (number of splines, number of B-spline bases (coefficients), number of samples). The number of B-spline bases = number of grid points + k - 1.

Example

>>> num_spline = 5
>>> num_sample = 100
>>> num_grid_interval = 10
>>> k = 3
>>> x = torch.normal(0,1,size=(num_spline, num_sample))
>>> grids = torch.einsum('i,j->ij', torch.ones(num_spline,), torch.linspace(-1,1,steps=num_grid_interval+1))
>>> B_batch(x, grids, k=k).shape
torch.Size([5, 13, 100])
kan.spline.coef2curve(x_eval, grid, coef, k, device='cpu')

converting B-spline coefficients to B-spline curves. Evaluate x on B-spline curves (summing up B_batch results over B-spline basis).

Args:

x_eval : 2D torch.tensor

shape (number of splines, number of samples)

grid : 2D torch.tensor

shape (number of splines, number of grid points)

coef : 2D torch.tensor

shape (number of splines, number of coef params). number of coef params = number of grid intervals + k

k : int

the piecewise polynomial order of splines.

device : str

device

Returns:

y_eval : 2D torch.tensor

shape (number of splines, number of samples)

Example

>>> num_spline = 5
>>> num_sample = 100
>>> num_grid_interval = 10
>>> k = 3
>>> x_eval = torch.normal(0,1,size=(num_spline, num_sample))
>>> grids = torch.einsum('i,j->ij', torch.ones(num_spline,), torch.linspace(-1,1,steps=num_grid_interval+1))
>>> coef = torch.normal(0,1,size=(num_spline, num_grid_interval+k))
>>> coef2curve(x_eval, grids, coef, k=k).shape
torch.Size([5, 100])
kan.spline.curve2coef(x_eval, y_eval, grid, k, device='cpu')

converting B-spline curves to B-spline coefficients using least squares.

Args:

x_eval : 2D torch.tensor

shape (number of splines, number of samples)

y_eval : 2D torch.tensor

shape (number of splines, number of samples)

grid : 2D torch.tensor

shape (number of splines, number of grid points)

k : int

the piecewise polynomial order of splines.

device : str

device

Example

>>> num_spline = 5
>>> num_sample = 100
>>> num_grid_interval = 10
>>> k = 3
>>> x_eval = torch.normal(0,1,size=(num_spline, num_sample))
>>> y_eval = torch.normal(0,1,size=(num_spline, num_sample))
>>> grids = torch.einsum('i,j->ij', torch.ones(num_spline,), torch.linspace(-1,1,steps=num_grid_interval+1))
>>> curve2coef(x_eval, y_eval, grids, k=k).shape
torch.Size([5, 13])

kan.utils module

kan.utils.add_symbolic(name, fun)

add a symbolic function to library

Args:

name : str

name of the function

fun : fun

torch function or lambda function

Returns:

None

Example

>>> print(SYMBOLIC_LIB['Bessel'])
KeyError: 'Bessel'
>>> add_symbolic('Bessel', torch.special.bessel_j0)
>>> print(SYMBOLIC_LIB['Bessel'])
(<built-in function special_bessel_j0>, Bessel)
kan.utils.create_dataset(f, n_var=2, ranges=[-1, 1], train_num=1000, test_num=1000, normalize_input=False, normalize_label=False, device='cpu', seed=0)

create dataset

Args:

f : function

the symbolic formula used to create the synthetic dataset

ranges : list or np.array; shape (2,) or (n_var, 2)

the range of input variables. Default: [-1,1].

train_num : int

the number of training samples. Default: 1000.

test_num : int

the number of test samples. Default: 1000.

normalize_input : bool

If True, apply normalization to inputs. Default: False.

normalize_label : bool

If True, apply normalization to labels. Default: False.

device : str

device. Default: ‘cpu’.

seed : int

random seed. Default: 0.

Returns:

dataset : dict

Train/test inputs/labels are dataset[‘train_input’], dataset[‘train_label’], dataset[‘test_input’], dataset[‘test_label’]

Example

>>> f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
>>> dataset = create_dataset(f, n_var=2, train_num=100)
>>> dataset['train_input'].shape
torch.Size([100, 2])
kan.utils.fit_params(x, y, fun, a_range=(-10, 10), b_range=(-10, 10), grid_number=101, iteration=3, verbose=True, device='cpu')

fit a, b, c, d such that

\[|y-(cf(ax+b)+d)|^2\]

is minimized. Both x and y are 1D arrays. a and b are swept over their ranges to find the best fit.

Args:

x : 1D array

x values

y : 1D array

y values

fun : function

symbolic function

a_range : tuple

sweeping range of a

b_range : tuple

sweeping range of b

grid_number : int

number of steps along a and b

iteration : int

number of zooming iterations

verbose : bool

print extra information if True

device : str

device

Returns:

a_best : float

best fitted a

b_best : float

best fitted b

c_best : float

best fitted c

d_best : float

best fitted d

r2_best : float

best r2 (coefficient of determination)

Example

>>> num = 100
>>> x = torch.linspace(-1,1,steps=num)
>>> noises = torch.normal(0,1,(num,)) * 0.02
>>> y = 5.0*torch.sin(3.0*x + 2.0) + 0.7 + noises
>>> fit_params(x, y, torch.sin)
r2 is 0.9999727010726929
(tensor([2.9982, 1.9996, 5.0053, 0.7011]), tensor(1.0000))

Module contents