# How to Interpolate Data with Scipy

Interpolation may sound like a fancy mathematical exercise, but in many ways, it is much like what machine learning does.

• Start with a limited set of data points relating multiple variables
• Interpolate (basically, create a model)
• Construct a new function that can be used to predict any future or new point from the interpolation

So the idea is: ingest, interpolate, predict.

Concretely, suppose we have a limited number of data points for a pair of variables (x, y) with an unknown (and nonlinear) relationship between them, i.e., y = f(x). From this limited data, we want to construct a prediction function that can generate y values for any given x values within the range covered by the original data.

There is a large body of mathematical theory and prior work on this subject. You can certainly write your own algorithm to implement those interpolation methods. But why not take advantage of the open-source (and optimized) options?

## Scipy interpolate

We start with a quadratic function sampled at only 11 data points. The code to interpolate is basically a one-liner:

f1 = interp1d(x, y, kind='linear')


Note that interp1d is a class: calling interp1d(x, y, ...) builds an object that implements the __call__ method, so the object can be used like a function. This callable object, f1, is our prediction model.
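
To make this concrete, here is a minimal sketch that builds the same kind of interpolator and evaluates it at a few points. The variable names y_single and y_many are only illustrative, and the query points are arbitrary choices within the data range.

```python
import numpy as np
from scipy.interpolate import interp1d

# Same setup as the quadratic example: 11 points on [0, 10]
x = np.linspace(0, 10, num=11, endpoint=True)
y = x**2 + 7*x - 28

f1 = interp1d(x, y, kind='linear')

# f1 is an instance of the interp1d class, and that instance is callable
print(type(f1).__name__)
print(callable(f1))

# Evaluate at a single point or at several points at once
y_single = f1(2.5)               # illustrative query point
y_many = f1([0.5, 2.5, 9.75])    # illustrative query points
print(y_single, y_many)
```

Calling f1 with new x values is all that "prediction" means here: the object looks up the two neighbouring data points and interpolates between them.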

Here is the code we used:

from scipy.interpolate import interp1d
import numpy as np
import matplotlib.pyplot as plt

NUM_DATA = 11
NUM_INTERPOLATE = 41

x = np.linspace(0, 10, num=NUM_DATA, endpoint=True)
y = x**2+7*x-28

f1 = interp1d(x, y, kind='linear')
xnew = np.linspace(0, 10, num=NUM_INTERPOLATE, endpoint=True)

fig, ax  = plt.subplots(1,2,figsize=(6,3),dpi=120)
ax[0].scatter(x, y)
ax[0].set_title("Original data")
ax[1].scatter(x, y)
ax[1].plot(xnew, f1(xnew), color='orange',linestyle='--')
ax[1].set_title("Interpolation")
plt.show()
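
As mentioned earlier, the prediction function is meant to be used within the range of the original data. The sketch below illustrates what happens outside that range, assuming the default settings of interp1d. The query point 12.0 and the names out_of_range_ok, f1_extrap, and y_outside are illustrative choices, not part of the original example.

```python
import numpy as np
from scipy.interpolate import interp1d

x = np.linspace(0, 10, num=11, endpoint=True)
y = x**2 + 7*x - 28
f1 = interp1d(x, y, kind='linear')

# With default settings, a query outside [0, 10] raises a ValueError
try:
    f1(12.0)
    out_of_range_ok = True
except ValueError as err:
    out_of_range_ok = False
    print("Out-of-range query rejected:", err)

# Extrapolation has to be requested explicitly (use with care)
f1_extrap = interp1d(x, y, kind='linear', fill_value='extrapolate')
y_outside = float(f1_extrap(12.0))
print(y_outside, "vs. true value", 12.0**2 + 7*12.0 - 28)
```

The extrapolated value simply continues the last straight segment, so it drifts away from the true quadratic the further we move outside the data.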


Let us go one degree higher to a cubic generating function. The interpolation result looks as smooth as ever.

x = np.linspace(0, 10, num=NUM_DATA, endpoint=True)
y = 0.15*x**3+0.23*x**2-7*x+18

f1 = interp1d(x, y, kind='linear')
xnew = np.linspace(0, 10, num=NUM_INTERPOLATE, endpoint=True)

fig, ax  = plt.subplots(1,2,figsize=(6,3),dpi=120)
ax[0].scatter(x, y)
ax[0].set_title("Original data")
ax[1].scatter(x, y)
ax[1].plot(xnew, f1(xnew), color='red',linestyle='--')
ax[1].set_title("Interpolation")
plt.show()



Note that although the original data comes from a cubic function, this is still a ‘linear’ interpolation, as set by the kind parameter shown above. This means that the intermediate points between the original data points lie on straight line segments.
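
A quick numerical check of that statement, as a rough sketch: halfway between two neighbouring data points, linear interpolation should return the average of their y values, which generally differs from the true cubic value. The helper cubic and the choice of the midpoint x = 4.5 are illustrative.

```python
import numpy as np
from scipy.interpolate import interp1d

def cubic(x):
    # The cubic generating function used in the example above
    return 0.15*x**3 + 0.23*x**2 - 7*x + 18

x = np.linspace(0, 10, num=11, endpoint=True)
y = cubic(x)
f1 = interp1d(x, y, kind='linear')

# Midpoint of the segment between x = 4 and x = 5
mid_linear = float(f1(4.5))          # value from the linear interpolant
mid_average = 0.5 * (y[4] + y[5])    # plain average of the two neighbours
mid_true = cubic(4.5)                # value of the underlying cubic

print(mid_linear, mid_average, mid_true)
```

The first two numbers agree, while the third differs: the interpolant follows the straight segment, not the curve.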

But Scipy offers quadratic and cubic splines too (kind='quadratic' or kind='cubic'). Let’s see them in action.

Things can be a little tricky to handle with linear interpolation when the original data is not polynomial in nature or the data has inherent noise (natural for any scientific measurement).
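
To illustrate the point about noise, here is a small sketch under assumed conditions: Gaussian noise with a standard deviation of 2.0 and a fixed random seed, both chosen arbitrarily for illustration. It shows that the interpolant passes through every noisy sample, so it reproduces the noise rather than smoothing it out.

```python
import numpy as np
from scipy.interpolate import interp1d

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice

x = np.linspace(0, 10, num=11, endpoint=True)
y_clean = 0.15*x**3 + 0.23*x**2 - 7*x + 18

# Assumed noise model for illustration: Gaussian with standard deviation 2.0
y_noisy = y_clean + rng.normal(scale=2.0, size=x.size)

f_noisy = interp1d(x, y_noisy, kind='linear')

# At the original x values, the interpolant returns the noisy samples exactly
passes_through_noise = np.allclose(f_noisy(x), y_noisy)
print(passes_through_noise)
```

If the goal is to recover the underlying trend rather than to reproduce each measurement, a smoothing or regression method is usually a better fit than pure interpolation.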

Here is a demo that fits both a linear and a cubic interpolant to the same data:

x = np.linspace(0, 10, num=NUM_DATA, endpoint=True)
y = 0.15*x**3+0.23*x**2-7*x+18

f1 = interp1d(x, y, kind='linear')
f3 = interp1d(x,y, kind='cubic')
xnew = np.linspace(0, 10, num=NUM_INTERPOLATE, endpoint=True)

fig, ax  = plt.subplots(1,3,figsize=(6,3),dpi=120)
ax[0].scatter(x, y)
ax[0].set_title("Original data")
ax[1].scatter(x, y)
ax[1].plot(xnew, f1(xnew), color='red',linestyle='--')
ax[1].set_title("Linear")
ax[2].scatter(x, y)
ax[2].plot(xnew, f3(xnew), color='orange',linestyle='--')
ax[2].set_title("Cubic")

plt.show()
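
The plots give a visual impression. As a complementary sketch, the snippet below puts rough numbers on the difference by measuring the largest deviation from the generating function at the 41 evaluation points. The helper cubic and the dictionary max_err are illustrative names, and the comparison only makes sense here because the true function is known.

```python
import numpy as np
from scipy.interpolate import interp1d

def cubic(x):
    # The cubic generating function used in the demo above
    return 0.15*x**3 + 0.23*x**2 - 7*x + 18

x = np.linspace(0, 10, num=11, endpoint=True)
y = cubic(x)
xnew = np.linspace(0, 10, num=41, endpoint=True)

# Largest absolute error of each interpolant on the evaluation grid
max_err = {}
for kind in ('linear', 'quadratic', 'cubic'):
    f = interp1d(x, y, kind=kind)
    max_err[kind] = float(np.max(np.abs(f(xnew) - cubic(xnew))))
    print(f"{kind:>9}: max abs error = {max_err[kind]:.3g}")
```

For this smooth, noise-free data, the cubic spline should track the generating function much more closely than the piecewise-linear interpolant.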