Show understanding of back propagation of errors and regression methods in machine learning

Cambridge A-Level Computer Science 9618 - 18.1 Artificial Intelligence

Artificial Intelligence (AI)

Objective

Show understanding of back propagation of errors and regression methods in machine learning.

Regression Methods

Regression is a fundamental machine learning technique used to model the relationship between a dependent variable (the one being predicted) and one or more independent variables (the predictors). In the context of AI, regression algorithms are used for tasks like predicting house prices, sales figures, or stock values.

There are several types of regression, including:

  • Linear Regression: Assumes a linear relationship between the variables.
  • Polynomial Regression: Models a non-linear relationship using polynomial terms.
  • Support Vector Regression (SVR): Uses support vector machines for regression tasks.
  • Decision Tree Regression: Employs decision trees to make predictions.
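
As a quick illustration (beyond what the syllabus requires), the sketch below fits each of these regression types to a small synthetic dataset. It assumes the scikit-learn and NumPy libraries are available; the data values are made up.

```python
# A minimal sketch comparing the four regression types on made-up data.
# Assumes scikit-learn and NumPy are installed (not part of the syllabus).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

np.random.seed(0)
X = np.arange(1, 11).reshape(-1, 1)               # independent variable
y = 2.0 * X.ravel() + 3.0 + np.random.randn(10)   # noisy dependent variable

models = {
    "Linear": LinearRegression(),
    "Polynomial (degree 2)": make_pipeline(PolynomialFeatures(2), LinearRegression()),
    "SVR": SVR(kernel="rbf"),
    "Decision Tree": DecisionTreeRegressor(max_depth=3),
}

for name, model in models.items():
    model.fit(X, y)                                # learn the relationship
    print(name, "prediction at x=5:", model.predict([[5.0]]))
```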

Linear Regression in Detail

Linear regression aims to find the best-fitting straight line (in the case of one independent variable) or hyperplane (in the case of multiple independent variables) that minimizes the difference between the predicted values and the actual values.

The equation of a simple linear regression model is:

$$y = mx + c$$

where:

  • y is the dependent variable.
  • x is the independent variable.
  • m is the slope of the line.
  • c is the y-intercept.

The goal is to determine the optimal values of m and c that minimize the sum of squared errors (SSE).

The formula for SSE is:

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

where:

  • $y_i$ is the actual value of the dependent variable for the i-th data point.
  • $\hat{y}_i$ is the predicted value of the dependent variable for the i-th data point, calculated as $mx_i + c$.
  • $n$ is the total number of data points.

The values of m and c can be calculated using the following formulas:

  • Slope (m): $m = \frac{n\sum_{i=1}^{n}(x_i y_i) - (\sum_{i=1}^{n}x_i)(\sum_{i=1}^{n}y_i)}{n\sum_{i=1}^{n}(x_i^2) - (\sum_{i=1}^{n}x_i)^2}$
  • Y-intercept (c): $c = \bar{y} - m\bar{x}$, where $\bar{y}$ is the mean of the y values and $\bar{x}$ is the mean of the x values.
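
Applied to a small made-up dataset, these formulas give the fitted line directly. The sketch below is a plain-Python illustration that computes m and c, then the resulting SSE for the fitted line.

```python
# Least-squares fit for simple linear regression, using the closed-form
# formulas above. The data values are made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.3, 6.2, 8.1, 9.9]
n = len(xs)

sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Slope: m = (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Intercept: c = mean(y) - m * mean(x)
c = sum_y / n - m * sum_x / n

# Sum of squared errors between actual and predicted values.
sse = sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))
print(f"m = {m:.3f}, c = {c:.3f}, SSE = {sse:.3f}")
```

For this dataset the result is m = 1.94 and c = 0.30, so the fitted line is approximately y = 1.94x + 0.30.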

Back Propagation of Errors in Neural Networks

Back propagation is a crucial algorithm used to train artificial neural networks. It's a method for calculating the gradient of the cost function with respect to the network's weights, allowing the weights to be adjusted to minimize the error.

The process involves the following steps:

  1. Forward Pass: The input data is fed forward through the network to produce an output.
  2. Calculate Error: The error is calculated as the difference between the network's predicted output and the target (expected) output.
  3. Backward Pass: The error is propagated backward through the network, layer by layer.
  4. Gradient Calculation: The gradient of the cost function with respect to each weight in the network is calculated. This indicates how much the cost function would change if each weight were adjusted slightly.
  5. Weight Update: The weights are updated using the calculated gradients and a learning rate. The learning rate controls the size of the adjustments.

The chain rule of calculus is used to calculate the gradients during the backward pass. This allows the algorithm to determine how each weight contributed to the overall error.
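
The toy sketch below walks through these five steps for a network with one input, one hidden neuron, and one output neuron (both using the sigmoid activation), trained on a single made-up example. It is an illustrative implementation only, not a prescribed one.

```python
# Back propagation on a toy 1-1-1 network with sigmoid activations,
# trained on one made-up example. All values are for illustration.
import math
import random

def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
w1, b1 = random.random(), 0.0   # input-to-hidden weight and bias
w2, b2 = random.random(), 0.0   # hidden-to-output weight and bias
lr = 0.5                        # learning rate (size of each adjustment)

x, target = 1.0, 0.25           # single made-up training example

for epoch in range(1001):
    # 1. Forward pass: feed the input through the network.
    h = sigmoid(w1 * x + b1)
    out = sigmoid(w2 * h + b2)
    # 2. Calculate error (squared error for this one example).
    error = 0.5 * (target - out) ** 2
    # 3-4. Backward pass: the chain rule gives each weight's gradient.
    d_out = (out - target) * out * (1 - out)   # gradient at the output neuron
    grad_w2 = d_out * h
    grad_b2 = d_out
    d_h = d_out * w2 * h * (1 - h)             # error propagated back to the hidden layer
    grad_w1 = d_h * x
    grad_b1 = d_h
    # 5. Weight update: step each weight against its gradient.
    w2 -= lr * grad_w2
    b2 -= lr * grad_b2
    w1 -= lr * grad_w1
    b1 -= lr * grad_b1
    if epoch % 250 == 0:
        print(f"epoch {epoch}: error = {error:.6f}")
```

Each iteration performs one forward pass, one backward pass, and one weight update; the printed error falls steadily as the output approaches the target.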

The cost function is a measure of how well the network is performing. Common cost functions include:

  • Mean Squared Error (MSE): $$MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$
  • Cross-Entropy Loss: Commonly used for classification tasks.
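
For concreteness, the short sketch below evaluates both cost functions on some made-up targets and predictions (the binary form of cross-entropy is shown).

```python
# MSE and binary cross-entropy on made-up values, for illustration.
import math

actual = [1.0, 0.0, 1.0, 1.0]        # made-up target labels
predicted = [0.9, 0.2, 0.7, 0.6]     # made-up network outputs
n = len(actual)

# Mean Squared Error: the average of the squared differences.
mse = sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n

# Binary cross-entropy: penalises confident wrong predictions heavily.
bce = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
           for y, p in zip(actual, predicted)) / n

print(f"MSE = {mse:.4f}, cross-entropy = {bce:.4f}")
```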

Back propagation is an iterative process, and the weights are adjusted repeatedly until the cost function reaches a minimum or a satisfactory level of accuracy.