smoltorch 🔥
A tiny autograd engine and neural network library built from first principles. Implements automatic differentiation and deep learning in ~500 lines of readable Python code using only NumPy.
Overview
smoltorch is a minimalist deep learning library that implements automatic differentiation (autograd) and neural networks from scratch using only NumPy. Inspired by Andrej Karpathy's micrograd, it's designed to be educational, transparent, and functional—you can train real models on real datasets with competitive performance.
The name comes from "Smol" + PyTorch: a tiny implementation that captures the essence of modern deep learning frameworks.
Features
Core Engine
- Automatic differentiation with dynamic computational graphs
- NumPy-backed tensors for efficient numerical computing
- Broadcasting support with proper gradient handling
- Topological sorting for correct backpropagation
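Broadcast-aware gradient handling means reducing the upstream gradient back to each input's original shape by summing over the broadcast axes. A minimal NumPy sketch of that reduction (the helper name `unbroadcast` is illustrative, not necessarily smoltorch's internal API):

```python
import numpy as np

def unbroadcast(grad, shape):
    """Reduce `grad` to `shape` by summing over broadcast dimensions."""
    # Sum away leading axes that broadcasting prepended
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # Sum over axes that were size 1 in the original shape
    for axis, size in enumerate(shape):
        if size == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad

g = np.ones((2, 3))            # upstream gradient of z = x + y
print(unbroadcast(g, (1, 3)))  # x.grad: shape (1, 3), values [[2. 2. 2.]]
print(unbroadcast(g, (2, 1)))  # y.grad: shape (2, 1), values [[3.] [3.]]
```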
Operations
- Arithmetic: +, -, *, /, **
- Matrix operations: @ (matmul)
- Activations: ReLU, tanh, sigmoid
- Reductions: sum, mean
- Element-wise: log
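Each of these operations pairs a forward computation with a closure that applies the chain rule during the backward pass. A minimal scalar sketch of the pattern for multiplication (the `Value` class here is illustrative, not smoltorch's actual `Tensor`):

```python
class Value:
    """Minimal scalar autograd node illustrating the op pattern."""
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # each op fills this in

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # Chain rule: d(out)/d(self) = other.data, d(out)/d(other) = self.data
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

x = Value(2.0)
y = Value(3.0)
z = x * y
z.grad = 1.0     # seed the output gradient
z._backward()
print(x.grad, y.grad)  # 3.0 2.0
```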
Neural Networks
- Layers: Linear (fully connected)
- Models: Multi-layer perceptron (MLP)
- Loss functions: MSE, Binary Cross-Entropy
- Optimizers: SGD (Stochastic Gradient Descent)
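An SGD step subtracts the learning rate times each parameter's gradient. A hedged sketch of the optimizer pattern (the `Param` class is a stand-in for smoltorch tensors; the `zero_grad`/`step` method names mirror the PyTorch-style API used in the examples below):

```python
import numpy as np

class Param:
    """Stand-in parameter: holds data and its accumulated gradient."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
        self.grad = np.zeros_like(self.data)

class SGD:
    """Vanilla stochastic gradient descent: p <- p - lr * p.grad."""
    def __init__(self, params, lr=0.01):
        self.params = list(params)
        self.lr = lr

    def zero_grad(self):
        for p in self.params:
            p.grad = np.zeros_like(p.data)

    def step(self):
        for p in self.params:
            p.data -= self.lr * p.grad

w = Param([1.0, 2.0])
w.grad = np.array([0.5, -0.5])
opt = SGD([w], lr=0.1)
opt.step()
print(w.data)  # [0.95 2.05]
```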
Performance
smoltorch achieves competitive results on standard benchmarks.
Installation
```shell
uv add smoltorch
```
Quick Start
Basic Tensor Operations
```python
from smoltorch import Tensor

# Create tensors
x = Tensor([1.0, 2.0, 3.0])
y = Tensor([4.0, 5.0, 6.0])

# Operations
z = x + y    # Element-wise addition
w = x * y    # Element-wise multiplication
a = x @ y.T  # Matrix multiplication

# Backward pass
a.backward()
print(x.grad)  # Gradients computed automatically!
```
Training a Neural Network
```python
from smoltorch import Tensor, MLP, SGD
from sklearn.datasets import make_regression
import numpy as np

# Generate data
X, y = make_regression(n_samples=100, n_features=5, noise=10)
y = y.reshape(-1, 1)

# Create model
model = MLP([5, 16, 16, 1])  # 5 inputs -> 16 -> 16 -> 1 output
optimizer = SGD(model.parameters(), lr=0.001)

# Training loop
for epoch in range(100):
    # Forward pass
    X_tensor = Tensor(X)
    y_tensor = Tensor(y)
    y_pred = model(X_tensor)

    # Compute loss (MSE)
    loss = ((y_pred - y_tensor) ** 2).mean()

    # Backward pass
    optimizer.zero_grad()
    loss.backward()

    # Update weights
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch + 1}, Loss: {loss.data:.4f}")
```
How Autograd Works
smoltorch builds a dynamic computational graph during the forward pass:
- Forward pass: Build computational graph with operations as nodes
- Topological sort: Order nodes for correct gradient flow
- Backward pass: Apply chain rule in reverse topological order
- Gradient accumulation: Sum gradients from multiple paths
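The four steps above can be sketched in a few lines. A hedged outline of how a `backward()` pass might be implemented (the `Node` class and the `_parents`/`_backward` attribute names are illustrative, not smoltorch's internals):

```python
class Node:
    """Tiny stand-in node: holds data, grad, parents, and a backward closure."""
    def __init__(self, data, parents=()):
        self.data, self.grad = data, 0.0
        self._parents = parents
        self._backward = lambda: None

def add(a, b):
    out = Node(a.data + b.data, (a, b))
    def _backward():
        a.grad += out.grad  # d(a+b)/da = 1
        b.grad += out.grad  # d(a+b)/db = 1
    out._backward = _backward
    return out

def backward(root):
    # 1-2. Topologically sort: each node appears after all of its parents
    order, visited = [], set()
    def build(node):
        if node not in visited:
            visited.add(node)
            for p in node._parents:
                build(p)
            order.append(node)
    build(root)
    # 3-4. Seed the output gradient, then apply the chain rule in
    # reverse topological order; += accumulates multi-path gradients
    root.grad = 1.0
    for node in reversed(order):
        node._backward()

x = Node(2.0)
z = add(add(x, x), x)  # z = 3x; x feeds z through multiple paths
backward(z)
print(x.grad)  # 3.0
```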
```python
x = Tensor([2.0])
y = Tensor([3.0])
z = (x * y) + (x ** 2)  # Graph: z -> [+] -> [*, **] -> [x, y]
z.backward()            # Backpropagate through graph
print(x.grad)           # dz/dx = y + 2x = 3 + 4 = 7.0
```
Broadcasting Support
smoltorch correctly handles broadcasting in both forward and backward passes:
```python
x = Tensor([[1, 2, 3]])  # shape (1, 3)
y = Tensor([[1], [2]])   # shape (2, 1)
z = x + y                # shape (2, 3) - broadcasting!
z.backward()
# x.grad sums over broadcast dimensions: shape (1, 3)
# y.grad sums over broadcast dimensions: shape (2, 1)
```
Supported Operations
- Arithmetic: x + y, x - y, x * y, x / y, x ** 2
- Matrix: x @ y
- Activations: x.relu(), x.tanh(), x.sigmoid()
- Reductions: x.sum(), x.sum(axis=0), x.mean(), x.mean(axis=1)
- Element-wise: x.log()
Project Structure
- smoltorch/tensor.py: Core Tensor class with autograd implementation
- smoltorch/nn.py: Neural network layers and models (Linear, MLP)
- smoltorch/optim.py: Optimizers (SGD)
- examples/train_regression.py: Regression training example
- examples/train_classification.py: Classification training example
- tests/: Comprehensive test suite covering all operations
Roadmap
Coming Soon
- More optimizers: Adam, RMSprop with momentum
- More activations: Leaky ReLU, ELU, Softmax
- Regularization: Dropout, L2 weight decay
- Mini-batch training: Efficient batch processing
- Multi-class classification: Softmax + Cross-Entropy loss
Future
- Convolutional layers: CNN support for images
- Model serialization: Save/load weights in safetensors format
- GPU acceleration: Explore Metal Performance Shaders for Apple Silicon
- Better initialization: He initialization for ReLU networks
- Learning rate scheduling: Decay strategies