Building a Simple Neural Network from Scratch with NumPy
In this post, we'll walk through a complete implementation of a simple feedforward neural network using only NumPy. This small project is an excellent way to understand the core concepts of how neural networks learn from data using gradient descent and backpropagation, without relying on high-level libraries like TensorFlow or PyTorch.
What Are We Building?
We're building a neural network with:
- 1 hidden layer
- Sigmoid activation function
- Manual weight updates via gradient descent
- No external libraries beyond NumPy
The network will be trained on a small synthetic dataset to learn a binary classification task.
Prerequisites: The Sigmoid Function
The sigmoid function squashes input values into the range (0, 1), which makes it ideal for binary classification problems.
import numpy as np

def sigmoid(x, derivative=False):
    if derivative:
        # here x is assumed to already be a sigmoid output, so this is s * (1 - s)
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))
The derivative of the sigmoid function is essential during backpropagation when calculating gradients.
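A quick illustrative check of the convention used here (not part of the original code, assuming the sigmoid defined above): when derivative=True, the function expects the already-activated sigmoid output rather than the raw input.
z = 0.5
a = sigmoid(z)                       # activation, roughly 0.622
slope = sigmoid(a, derivative=True)  # pass the activation a, not the raw input z
print(slope)                         # roughly 0.235, i.e. a * (1 - a)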
The Dataset
We'll define a small dataset with 6 samples and 3 binary features. The corresponding labels are binary (0 or 1).
X = np.array([
[0, 0, 1],
[0, 1, 1],
[1, 0, 0],
[1, 1, 0],
[1, 0, 1],
[1, 1, 1],
])
y = np.array([[0, 1, 0, 1, 1, 0]]).T
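As a quick sanity check (an optional snippet, not part of the original code), the arrays have the shapes the rest of the walkthrough relies on:
print(X.shape)  # (6, 3): 6 samples, 3 binary features
print(y.shape)  # (6, 1): one binary label per sample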
Network Architecture and Initialization
We'll use a single hidden layer with 3 neurons. We initialize the weights uniformly at random in the range [-1, 1) and set a fixed random seed for reproducibility.
np.random.seed(1)
alpha = 0.1  # learning rate
num_hidden = 3
hidden_weights = 2 * np.random.random((X.shape[1] + 1, num_hidden)) - 1
output_weights = 2 * np.random.random((num_hidden + 1, y.shape[1])) - 1
Note: We add a bias node by increasing the input dimensions by 1.
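If you want to verify the initialization (an optional check, not in the original code), the weight matrices have the following shapes:
print(hidden_weights.shape)  # (4, 3): 3 input features + 1 bias -> 3 hidden neurons
print(output_weights.shape)  # (4, 1): 3 hidden neurons + 1 bias -> 1 output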
Training the Network
We'll train the network for 10,000 iterations of full-batch gradient descent with backpropagation (every iteration uses all 6 samples). Each snippet below is one step of the update that runs inside the training loop; a sketch of the complete loop follows at the end of this section.
Forward Pass
- Add a bias term to the input layer.
- Compute the hidden layer outputs using the sigmoid activation.
- Add a bias term to the hidden layer.
- Compute the final outputs (no activation on the output layer).
# Prepend a column of ones to the inputs as the bias term
input_layer_outputs = np.hstack((np.ones((X.shape[0], 1)), X))
# Hidden activations, again with a bias column prepended
hidden_layer_outputs = np.hstack((np.ones((X.shape[0], 1)), sigmoid(np.dot(input_layer_outputs, hidden_weights))))
# Linear output layer (no activation)
output_layer_outputs = np.dot(hidden_layer_outputs, output_weights)
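As a quick check of the forward pass (an optional snippet, assuming the arrays defined above), the intermediate shapes line up as expected:
print(input_layer_outputs.shape)   # (6, 4): inputs plus bias column
print(hidden_layer_outputs.shape)  # (6, 4): hidden activations plus bias column
print(output_layer_outputs.shape)  # (6, 1): one raw score per sample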
Backpropagation
We calculate the error at each layer and propagate it backward to update the weights.
Output Layer Error
output_error = output_layer_outputs - y  # derivative of a squared-error loss w.r.t. the outputs (up to a constant factor)
Hidden Layer Error
We drop the bias column from the hidden outputs (the bias node has no incoming weights to update) and the corresponding bias weight from the output weights, then apply the derivative of the sigmoid function:
hidden_error = hidden_layer_outputs[:, 1:] * (1 - hidden_layer_outputs[:, 1:]) * np.dot(output_error, output_weights.T[:, 1:])
Gradients
We compute the per-sample partial derivatives and average them over the batch to get the gradients.
# Per-sample products of each layer's inputs with the errors flowing into it
hidden_pd = input_layer_outputs[:, :, np.newaxis] * hidden_error[:, np.newaxis, :]
output_pd = hidden_layer_outputs[:, :, np.newaxis] * output_error[:, np.newaxis, :]
# Average over the 6 samples to get the full-batch gradients
total_hidden_gradient = np.average(hidden_pd, axis=0)
total_output_gradient = np.average(output_pd, axis=0)
Update Weights
# Step the weights opposite the gradient, scaled by the learning rate
hidden_weights += -alpha * total_hidden_gradient
output_weights += -alpha * total_output_gradient
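The snippets above describe a single update step. Here is a minimal sketch of how they might be assembled into the 10,000-iteration training loop (assuming the sigmoid function, dataset, learning rate, and weight initialization defined earlier; the iteration count is taken from the text above):
num_iterations = 10000  # iteration count from the text

for i in range(num_iterations):
    # Forward pass
    input_layer_outputs = np.hstack((np.ones((X.shape[0], 1)), X))
    hidden_layer_outputs = np.hstack((np.ones((X.shape[0], 1)),
                                      sigmoid(np.dot(input_layer_outputs, hidden_weights))))
    output_layer_outputs = np.dot(hidden_layer_outputs, output_weights)

    # Backpropagation
    output_error = output_layer_outputs - y
    hidden_error = (hidden_layer_outputs[:, 1:] * (1 - hidden_layer_outputs[:, 1:])
                    * np.dot(output_error, output_weights.T[:, 1:]))

    # Gradients, averaged over the batch
    hidden_pd = input_layer_outputs[:, :, np.newaxis] * hidden_error[:, np.newaxis, :]
    output_pd = hidden_layer_outputs[:, :, np.newaxis] * output_error[:, np.newaxis, :]
    total_hidden_gradient = np.average(hidden_pd, axis=0)
    total_output_gradient = np.average(output_pd, axis=0)

    # Gradient-descent update
    hidden_weights += -alpha * total_hidden_gradient
    output_weights += -alpha * total_output_gradient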
Results
After training, we print the network's final predictions:
print("Output After Training: \n{}".format(output_layer_outputs))
Example output (exact values depend on the random seed, learning rate, and number of iterations):
Output After Training:
[[0.01]
 [0.97]
 [0.03]
 [0.95]
 [0.98]
 [0.02]]
As you can see, the network correctly predicts values close to 0 or 1, aligning well with the training labels.
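If you want hard 0/1 predictions rather than raw scores, one possible post-processing step (an illustrative addition, not part of the original walkthrough) is to threshold the outputs at 0.5 and compare them with the labels:
predictions = (output_layer_outputs > 0.5).astype(int)
print(predictions.ravel())        # e.g. [0 1 0 1 1 0]
print(np.mean(predictions == y))  # fraction of correct predictions; 1.0 if every sample matches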
Key Takeaways
- Implementing a neural network from scratch builds intuition for how learning happens under the hood.
- The sigmoid activation and its derivative are essential for backpropagation.
- Bias terms play a critical role in allowing the network to shift activation thresholds.
- Vectorized operations with NumPy keep the implementation clean and efficient.