Panther: Faster & Cheaper Computations with RandNLA
High-performance randomized numerical linear algebra for PyTorch. Reduce memory usage, accelerate training, and scale to larger models with sketching algorithms.
import torch
import panther as pr
# Replace standard linear layer
# linear = torch.nn.Linear(4096, 4096)
sketched = pr.nn.SKLinear(4096, 4096, num_terms=1, low_rank=64)
# Same call as the dense layer, with a smaller memory footprint
x = torch.randn(128, 4096)
output = sketched(x)  # same API as torch.nn.Linear
Replace standard linear and convolutional layers with memory-efficient sketched alternatives that maintain accuracy while reducing computational cost.
Fast QR and SVD decompositions using randomized sketching, enabling efficient matrix operations on large-scale problems with theoretical guarantees.
Optimized CUDA kernels with Tensor Core support for maximum performance on modern GPUs, seamlessly integrated with PyTorch workflows.
Install Panther with step-by-step instructions for CPU and GPU environments.
Learn the basics with our quick start guide and first examples.
Comprehensive tutorials covering all aspects of Panther from basics to advanced topics.
Quick Summary
Panther is a PyTorch library that uses randomized numerical linear algebra (RandNLA) to build sketched neural network layers, which reduce memory use and speed up large-scale machine learning models. It provides sketched linear layers, randomized matrix decompositions, and GPU-accelerated operations, which makes it well suited to resource-constrained environments.
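To make the memory argument concrete, here is a minimal sketch in plain PyTorch of the low-rank idea behind a sketched linear layer. It is not Panther's `SKLinear` implementation: the `LowRankLinear` class, its initialization, and the `count_params` helper are hypothetical names introduced for illustration, and the rank of 64 simply mirrors the `low_rank=64` argument used in the example at the top of this page.

```python
import torch
from torch import nn


class LowRankLinear(nn.Module):
    """Hypothetical illustration: y = (x @ A) @ B + bias.

    A has shape (in_features, rank) and B has shape (rank, out_features),
    so the layer never materializes a full (in_features, out_features) weight.
    """

    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        # Simple scaled Gaussian init (an assumption, not Panther's scheme)
        self.A = nn.Parameter(torch.randn(in_features, rank) * in_features ** -0.5)
        self.B = nn.Parameter(torch.randn(rank, out_features) * rank ** -0.5)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Multiply by the thin factor first to keep the intermediate small
        return (x @ self.A) @ self.B + self.bias


def count_params(module: nn.Module) -> int:
    """Total number of trainable scalars in a module."""
    return sum(p.numel() for p in module.parameters())


dense = nn.Linear(4096, 4096)
low_rank = LowRankLinear(4096, 4096, rank=64)

x = torch.randn(128, 4096)
y = low_rank(x)

dense_params = count_params(dense)
low_rank_params = count_params(low_rank)
print(f"dense: {dense_params:,}  low-rank: {low_rank_params:,}")
```

Under these assumptions the dense layer holds 16,781,312 parameters and the factorized one 528,384, roughly a 32x reduction. The actual savings of a Panther layer depend on its `num_terms` and `low_rank` settings.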
🛠️ Key Features
Sketched Linear Layers: Memory-efficient alternatives to standard linear layers
Randomized Matrix Decompositions: Fast QR and SVD algorithms using sketching
Neural Network Operations: Optimized convolution and attention mechanisms
GPU Acceleration: CUDA kernels with Tensor Core support
AutoTuner: Automatic hyperparameter optimization for sketching parameters
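As an illustration of the randomized decompositions listed above, the following is a minimal sketch of a randomized SVD in the style of the standard range-finder approach, written with plain `torch.linalg` calls. It does not use or reproduce Panther's own routines: the function name `randomized_svd` and the `oversample` and `power_iters` parameters are assumptions chosen for this example.

```python
import torch


def randomized_svd(A: torch.Tensor, rank: int, oversample: int = 10, power_iters: int = 2):
    """Approximate the top-`rank` SVD of A using a Gaussian sketch."""
    m, n = A.shape
    k = min(rank + oversample, m, n)

    # Sketch the column space of A with a random Gaussian test matrix
    omega = torch.randn(n, k, dtype=A.dtype, device=A.device)
    Q, _ = torch.linalg.qr(A @ omega)

    # Optional power iterations sharpen the subspace when the spectrum decays slowly
    for _ in range(power_iters):
        Q, _ = torch.linalg.qr(A.T @ Q)
        Q, _ = torch.linalg.qr(A @ Q)

    # Project A onto the sketched subspace and decompose the small matrix
    B = Q.T @ A  # shape (k, n)
    U_small, S, Vh = torch.linalg.svd(B, full_matrices=False)
    U = Q @ U_small

    return U[:, :rank], S[:rank], Vh[:rank]


torch.manual_seed(0)
# Test matrix with exact rank 20, in double precision
A = torch.randn(500, 20, dtype=torch.float64) @ torch.randn(20, 300, dtype=torch.float64)

U, S, Vh = randomized_svd(A, rank=20)
A_approx = (U * S) @ Vh
rel_err = (torch.linalg.norm(A - A_approx) / torch.linalg.norm(A)).item()
print(f"relative reconstruction error: {rel_err:.2e}")
```

The expensive SVD runs only on the small `k x n` matrix `B`, which is where the speedup over a full decomposition comes from. For matrices whose rank exceeds the sketch size, the error depends on how quickly the singular values decay.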
🎯 Why Panther?
Panther enables you to:
Reduce Memory Usage: Sketched layers use significantly less memory than standard layers
Accelerate Training: Faster forward and backward passes with optimized kernels
Scale to Larger Models: Handle bigger networks with limited GPU memory
Maintain Accuracy: Randomized algorithms with theoretical guarantees
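One way to see why sketching can preserve accuracy is the Johnson-Lindenstrauss-type property of Gaussian sketches: a suitably scaled random projection approximately preserves vector norms. The snippet below is a small numerical check of that property in plain PyTorch. It is an illustration only, not Panther code, and the sketch size of 512 is an arbitrary choice for this example.

```python
import torch

torch.manual_seed(0)

n, k = 4096, 512  # original and sketched dimensions (k chosen for illustration)

# Gaussian sketching matrix scaled so that E[||S x||^2] = ||x||^2
S = torch.randn(k, n, dtype=torch.float64) / k ** 0.5

x = torch.randn(128, n, dtype=torch.float64)
sketched = x @ S.T  # shape (128, 512)

# Ratio of squared norms after and before sketching, one value per row
ratios = sketched.pow(2).sum(dim=1) / x.pow(2).sum(dim=1)
mean_ratio = ratios.mean().item()
print(f"mean squared-norm ratio: {mean_ratio:.3f}")
```

The ratios concentrate around 1, and the spread shrinks as the sketch size `k` grows, which is the trade-off between compression and accuracy that sketching parameters control.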