Regression

This post explains simple and multiple linear regression.

Regression

  • Finds the relationship between variables so that the outcome of future events can be predicted

Linear Regression

  • Uses the relationship between the data points to draw the straight line that best fits them (the line generally does not pass through every point)

  • The line can be used to predict future values

  • Example - Car Age vs Speed

  • $Age = [5,7,8,7,2,17,2,9,4,11,12,9,6]$

  • $Speed = [99,86,87,88,111,86,103,87,94,78,77,85,86]$

  • %matplotlib inline
      
    import matplotlib.pyplot as plt
      
    ages = [5,7,8,7,2,17,2,9,4,11,12,9,6]
    speeds = [99,86,87,88,111,86,103,87,94,78,77,85,86]
      
    plt.scatter(ages, speeds);
    plt.xlabel('Age of car', fontsize=14);
    plt.ylabel('Speed of Car', fontsize=14);
    
  • from scipy import stats
      
    # Fit a least-squares line; r is the correlation coefficient
    slope, intercept, r, p, std_err = stats.linregress(ages, speeds)
      
    def model(age):
        # Predicted speed for a car of the given age
        return slope * age + intercept
      
    # Fitted speed for every observed age
    fitted = [model(age) for age in ages]
      
    plt.scatter(ages, speeds)
    plt.plot(ages, fitted)
      
    plt.title(f'Correlation r = {r:.3f}', fontsize=14);
    plt.xlabel('Age of car', fontsize=14);
    plt.ylabel('Speed of Car', fontsize=14);
    
  • Predict Future Value

  • Find the predicted speed for a car of age 10

    • Speed = 85.59

    • # Predicting Future Value
      age = 10
      speed = model(age)
      print(f'Speed for Age {age} is {speed:.2f}') # 85.59
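
As a rough sketch of what `stats.linregress` works out for this example, the slope, intercept and correlation coefficient can be computed by hand from the usual least-squares formulas. The helper name `fit_line` is made up for illustration and is not part of SciPy; only the standard library is used.

```python
from math import sqrt

ages = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
speeds = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

def fit_line(xs, ys):
    """Return (slope, intercept, r) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

slope, intercept, r = fit_line(ages, speeds)
speed_at_10 = slope * 10 + intercept

print(f'slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}')
print(f'Speed for Age 10 is {speed_at_10:.2f}')  # 85.59
```

The negative slope and negative r both say the same thing: in this data, older cars tend to be slower.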
      

Multiple Regression

  • Regression with more than one independent variable, meaning that we try to predict a value based on two or more variables.

  • import pandas as pd
    from sklearn import linear_model
      
    df = pd.read_csv('./data/cars.csv')
    df.head(2)
      
    X = df[['Weight', 'Volume']]
    y = df['CO2']
      
    model = linear_model.LinearRegression()
      
    model.fit(X, y)
      
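    # Predict the CO2 emission of a car with weight = 2300 kg and volume = 1300 cm3
    # (a DataFrame with the training column names avoids a feature-name warning)
    new_car = pd.DataFrame([[2300, 1300]], columns=['Weight', 'Volume'])
    predictedCO2 = model.predict(new_car)
    print(f'Predicted CO2 = {predictedCO2}') # [107.2087328]
      
    # One coefficient per feature, in the order Weight, Volume
    print(f'Model Coefficients {model.coef_}') # [0.00755095 0.00780526]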
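
To see what the coefficients mean, here is a small sketch that rebuilds the prediction by hand as intercept + coef_weight * weight + coef_volume * volume. The coefficients and the predicted value are the ones shown in the output comments above. The intercept is not printed in this post, so it is backed out of those numbers here; with a fitted model it would be read from `model.intercept_`. The function name `predict_co2` is made up for this illustration.

```python
# Values copied from the output comments in the post (rounded as printed)
coef_weight, coef_volume = 0.00755095, 0.00780526
predicted_2300_1300 = 107.2087328

# Assumption: the intercept is not shown in the post, so derive it from the
# printed prediction; a fitted scikit-learn model exposes it as model.intercept_
intercept = predicted_2300_1300 - (coef_weight * 2300 + coef_volume * 1300)

def predict_co2(weight, volume):
    """Linear model: intercept plus one coefficient per feature."""
    return intercept + coef_weight * weight + coef_volume * volume

base = predict_co2(2300, 1300)
heavier = predict_co2(3300, 1300)  # same volume, 1000 kg heavier

print(f'Intercept (derived) = {intercept:.4f}')
print(f'Predicted CO2 at 2300 kg = {base:.4f}')
print(f'Predicted CO2 at 3300 kg = {heavier:.4f}')
print(f'Change for +1000 kg = {heavier - base:.3f}')
```

With volume held fixed, each extra kilogram adds about 0.00755 to the predicted CO2, so 1000 kg more adds about 7.55.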