Building a Simple Neural Network from Scratch with NumPy

In this post, we'll walk through a complete implementation of a simple feedforward neural network using only NumPy. This small project is an excellent way to understand the core concepts of how neural networks learn from data using gradient descent and backpropagation, without relying on high-level libraries like TensorFlow or PyTorch.

What Are We Building?

We're building a neural network with:

  • 1 hidden layer

  • Sigmoid activation function

  • Manual weight updates via gradient descent

  • No external libraries beyond NumPy

The network will be trained on a small synthetic dataset to learn a binary classification task.

Prerequisites: The Sigmoid Function

The sigmoid function squashes input values into the range (0, 1), which makes it ideal for binary classification problems.


      import numpy as np

      def sigmoid(x, derivative=False):
          # When derivative=True, x is assumed to already be a sigmoid output,
          # so the derivative can be computed with the shortcut x * (1 - x).
          if derivative:
              return x * (1 - x)
          return 1 / (1 + np.exp(-x))

The derivative of the sigmoid function is essential during backpropagation, when we compute gradients. Note that the derivative=True branch expects x to already be a sigmoid activation, which is why it can use the shortcut x * (1 - x) instead of recomputing the exponential.
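As a quick sanity check of that convention (a small illustrative snippet, not part of the network code): sigmoid(0) is 0.5, and the derivative at that activation is 0.5 * (1 - 0.5) = 0.25.

      a = sigmoid(np.array([0.0]))        # activation: [0.5]
      print(a)                            # [0.5]
      print(sigmoid(a, derivative=True))  # [0.25] = 0.5 * (1 - 0.5)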

The Dataset

We'll define a small dataset with 6 samples and 3 binary features. The corresponding labels are binary (0 or 1).


      X = np.array([
          [0, 0, 1],
          [0, 1, 1],
          [1, 0, 0],
          [1, 1, 0],
          [1, 0, 1],
          [1, 1, 1],
      ])
      y = np.array([[0, 1, 0, 1, 1, 0]]).T

Network Architecture and Initialization

We'll use a single hidden layer with 3 neurons. We initialize the weights randomly in the range [-1, 1) and set a fixed random seed for reproducibility.


      np.random.seed(1)
      alpha = 0.1     # learning rate
      num_hidden = 3  # number of hidden-layer neurons
      hidden_weights = 2 * np.random.random((X.shape[1] + 1, num_hidden)) - 1
      output_weights = 2 * np.random.random((num_hidden + 1, y.shape[1])) - 1

Note: We account for a bias node by adding 1 to the input dimension of each weight matrix (hence X.shape[1] + 1 and num_hidden + 1).
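To see what that bias column looks like in practice, here is a minimal illustration using the same np.hstack trick the forward pass below relies on:

      # Prepend a column of ones so the first row of each weight matrix acts as a bias.
      X_with_bias = np.hstack((np.ones((X.shape[0], 1)), X))
      print(X.shape, X_with_bias.shape)  # (6, 3) (6, 4) -- matches hidden_weights' 4 rows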

Training the Network

We'll train the network for 10,000 iterations of full-batch gradient descent, using backpropagation to compute the gradients (the gradients are averaged over all six samples each iteration).
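The snippets in the next sections all live inside a single training loop. Here is a minimal sketch of the overall structure; the loop body is filled in step by step below.

      for iteration in range(10000):
          # 1. Forward pass: input -> hidden (sigmoid) -> output (linear)
          # 2. Backpropagation: output error -> hidden error -> averaged gradients
          # 3. Weight update: step hidden_weights and output_weights against their gradients
          pass  # replaced by the code from the sections that follow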

Forward Pass

  1. Add a bias term to the input layer.

  2. Compute hidden layer outputs using sigmoid activation.

  3. Add a bias to the hidden layer.

  4. Compute final outputs (no activation on output layer).

      input_layer_outputs = np.hstack((np.ones((X.shape[0], 1)), X))
      hidden_layer_outputs = np.hstack((np.ones((X.shape[0], 1)),
                                        sigmoid(np.dot(input_layer_outputs, hidden_weights))))
      output_layer_outputs = np.dot(hidden_layer_outputs, output_weights)

Backpropagation

We calculate the error at each layer and propagate it backward to update the weights.

Output Layer Error

With a linear output layer, this difference corresponds to the gradient of a (halved) squared-error loss with respect to the outputs:

      output_error = output_layer_outputs - y

Hidden Layer Error

We remove the bias from the error calculation and apply the derivative of the sigmoid function.


      hidden_error = hidden_layer_outputs[:, 1:] * (1 - hidden_layer_outputs[:, 1:]) * np.dot(output_error, output_weights.T[:, 1:])  

Gradients

We compute partial derivatives and average them to get the gradients.


      hidden_pd = input_layer_outputs[:, :, np.newaxis] * hidden_error[:, np.newaxis, :]
      output_pd = hidden_layer_outputs[:, :, np.newaxis] * output_error[:, np.newaxis, :]
      total_hidden_gradient = np.average(hidden_pd, axis=0)  
      total_output_gradient = np.average(output_pd, axis=0)  

Update Weights

      hidden_weights += -alpha * total_hidden_gradient
      output_weights += -alpha * total_output_gradient

Results

After training, we print the network's final predictions:

print("Output After Training: \n{}".format(output_layer_outputs))  

Example output (will vary depending on initial weights):

      Output After Training:
      [[0.01]
       [0.97]
       [0.03]
       [0.95]
       [0.98]
       [0.02]]

As you can see, the network correctly predicts values close to 0 or 1, aligning well with the training labels.
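If you want hard class labels rather than raw scores, one simple option (not part of the training code above, since the output layer is linear) is to threshold the outputs at 0.5:

      # Turn raw outputs into hard 0/1 predictions by thresholding at 0.5.
      predictions = (output_layer_outputs > 0.5).astype(int)
      print("Predicted labels:", predictions.ravel())  # e.g. [0 1 0 1 1 0]
      print("True labels:     ", y.ravel())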

Key Takeaways

  • Implementing a neural network from scratch builds intuition for how learning happens under the hood.

  • The sigmoid activation and its derivative are essential for backpropagation.

  • Bias terms play a critical role in allowing the network to shift activation thresholds.

  • Vectorized operations with NumPy keep the implementation clean and efficient.
