Contributing to Panther

Contributing to Panther#

We welcome contributions to Panther! This guide will help you get started with contributing to the project.

Getting Started#

1. Fork and Clone the Repository

# Fork the repository on GitHub, then clone your fork
git clone https://github.com/YOUR_USERNAME/panther.git
cd panther

# Add the original repository as upstream
git remote add upstream https://github.com/FahdSeddik/panther.git

2. Set Up Development Environment

# Install development dependencies
poetry install --with dev

# Install pre-commit hooks
poetry run pre-commit install

# Build the native backend
cd pawX
make all  # Linux/macOS
# or
.\build.ps1  # Windows

3. Verify Installation

# Run tests to ensure everything works
poetry run pytest tests/

Development Workflow#

1. Create a Feature Branch

git checkout -b feature/your-feature-name

2. Make Your Changes

Follow the coding standards (see below)
Add tests for new functionality
Update documentation as needed
Ensure all tests pass

3. Run Tests and Checks

# Run all tests
poetry run pytest tests/

# Run type checking
poetry run mypy panther/

# Run linting
poetry run ruff check panther/

# Run formatting
poetry run ruff format panther/

4. Commit and Push

git add .
git commit -m "feat: add your feature description"
git push origin feature/your-feature-name

5. Create a Pull Request

Go to GitHub and create a pull request
Describe your changes clearly
Link any related issues
Wait for review and address feedback

Coding Standards#

Python Code Style

We use Ruff for linting and formatting:

# Good: Clear variable names and docstrings
def compute_sketched_linear(
    input_tensor: torch.Tensor,
    sketch_matrices: List[torch.Tensor],
    bias: Optional[torch.Tensor] = None
) -> torch.Tensor:
    """
    Compute sketched linear transformation.

    Args:
        input_tensor: Input tensor of shape (batch_size, in_features)
        sketch_matrices: List of sketching matrices
        bias: Optional bias tensor

    Returns:
        Output tensor of shape (batch_size, out_features)
    """
    # Implementation here
    pass

Type Hints

All functions should include type hints:

from typing import Optional, Tuple, List
import torch

def cqrrpt(
    matrix: torch.Tensor,
    gamma: float = 1.25,
    distribution: DistributionFamily = DistributionFamily.Gaussian
) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    """Randomized QR with column pivoting."""
    # Implementation
    pass

Documentation

All public functions and classes must have docstrings:

class SKLinear(nn.Module):
    """
    Sketched linear layer using randomized low-rank approximation.

    This layer approximates a standard linear transformation using a sum
    of low-rank terms, reducing memory usage while maintaining performance.

    Args:
        in_features: Number of input features
        out_features: Number of output features
        num_terms: Number of low-rank terms in the approximation
        low_rank: Rank of each low-rank term
        bias: If True, adds a learnable bias

    Example:
        >>> layer = SKLinear(512, 256, num_terms=8, low_rank=64)
        >>> x = torch.randn(32, 512)
        >>> y = layer(x)  # Shape: (32, 256)
    """

C++/CUDA Code Style

For C++ and CUDA code:

// Good: Clear function documentation and naming
/**
 * @brief Compute sketched linear forward pass on GPU.
 *
 * @param input Input tensor with shape [batch_size, in_features]
 * @param S1s First sketch matrices [num_terms, in_features, low_rank]
 * @param S2s Second sketch matrices [num_terms, low_rank, out_features]
 * @param U1s Fixed random matrices [num_terms, low_rank, out_features]
 * @param U2s Fixed random matrices [num_terms, low_rank, in_features]
 * @param bias Optional bias tensor [out_features]
 * @return Output tensor [batch_size, out_features]
 */
torch::Tensor sketched_linear_forward_cuda(
    const torch::Tensor& input,
    const torch::Tensor& S1s,
    const torch::Tensor& S2s,
    const torch::Tensor& U1s,
    const torch::Tensor& U2s,
    const torch::Tensor& bias
);

Testing Guidelines#

Test Structure

Organize tests by functionality:

tests/
├── test_linalg.py          # Linear algebra functions
├── test_nn.py              # Neural network layers
├── test_sketch.py          # Sketching operations
├── test_cuda.py            # CUDA kernel tests
└── test_integration.py     # End-to-end tests

Writing Tests

import pytest
import torch
import panther as pr

class TestSKLinear:
    """Test suite for SKLinear layer."""

    def test_forward_pass_shape(self):
        """Test that forward pass produces correct output shape."""
        layer = pr.nn.SKLinear(
            in_features=128,
            out_features=64,
            num_terms=4,
            low_rank=32
        )

        x = torch.randn(16, 128)
        y = layer(x)

        assert y.shape == (16, 64), f"Expected (16, 64), got {y.shape}"

    def test_backward_pass(self):
        """Test that backward pass computes gradients correctly."""
        layer = pr.nn.SKLinear(128, 64, num_terms=4, low_rank=32)
        x = torch.randn(16, 128, requires_grad=True)

        y = layer(x)
        loss = y.sum()
        loss.backward()

        # Check that gradients were computed
        assert x.grad is not None
        assert layer.S1s.grad is not None
        assert layer.S2s.grad is not None

    @pytest.mark.parametrize("device", ["cpu", "cuda"])
    def test_device_compatibility(self, device):
        """Test layer works on different devices."""
        if device == "cuda" and not torch.cuda.is_available():
            pytest.skip("CUDA not available")

        layer = pr.nn.SKLinear(64, 32, num_terms=2, low_rank=16)
        layer = layer.to(device)

        x = torch.randn(8, 64, device=device)
        y = layer(x)

        assert y.device.type == device

Performance Tests

import time
import pytest

def test_sketched_layer_performance():
    """Test that sketched layers provide memory benefits."""
    # Create layers
    standard = torch.nn.Linear(2048, 2048)
    sketched = pr.nn.SKLinear(2048, 2048, num_terms=8, low_rank=128)

    # Compare parameter counts
    standard_params = sum(p.numel() for p in standard.parameters())
    sketched_params = sum(p.numel() for p in sketched.parameters())

    assert sketched_params < standard_params, "Sketched layer should use fewer parameters"

    # Compare memory usage
    x = torch.randn(64, 2048)

    # Time forward pass
    start = time.time()
    for _ in range(100):
        _ = standard(x)
    standard_time = time.time() - start

    start = time.time()
    for _ in range(100):
        _ = sketched(x)
    sketched_time = time.time() - start

    # Sketched should be competitive (within 2x)
    assert sketched_time < 2 * standard_time

Documentation Guidelines#

Building Documentation

cd docs

# Clean previous build
make clean  # Linux/macOS
.\make.bat clean  # Windows

# Build HTML documentation
make html  # Linux/macOS
.\make.bat html  # Windows

# Open in browser
open _build/html/index.html  # macOS
xdg-open _build/html/index.html  # Linux
start _build/html/index.html  # Windows

Writing Documentation

Use reStructuredText format with clear examples:

Function Name
=============

Brief description of what the function does.

Parameters
----------
param1 : type
    Description of parameter 1
param2 : type, optional
    Description of parameter 2 (default: value)

Returns
-------
return_type
    Description of return value

Examples
--------
>>> import panther as pr
>>> result = pr.function_name(param1, param2)
>>> print(result.shape)

Areas for Contribution#

1. Core Algorithms

New sketching methods
Improved randomized algorithms
Memory optimization techniques

2. Neural Network Layers

Sketched versions of more PyTorch layers
Custom activation functions
Attention mechanisms

3. CUDA Kernels

Performance optimizations
Support for new GPU architectures
Memory-efficient implementations

4. AutoTuner Improvements

Better hyperparameter search strategies
Multi-objective optimization
Integration with popular ML frameworks

5. Documentation and Examples

Tutorial improvements
Real-world examples
Performance benchmarks

6. Testing and CI

Expanded test coverage
Performance regression tests
Cross-platform compatibility

Submitting Issues#

When submitting bug reports or feature requests:

Bug Reports

Include:

Clear description of the problem
Minimal code to reproduce the issue
Error messages and stack traces
Environment information (OS, Python version, CUDA version)

# Example bug report template
import torch
import panther as pr

# Environment
print(f"Python: {sys.version}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA: {torch.version.cuda}")
print(f"Panther: {pr.__version__}")

# Minimal reproduction case
layer = pr.nn.SKLinear(512, 256, num_terms=8, low_rank=64)
x = torch.randn(32, 512)

# This causes the error:
y = layer(x)  # Error occurs here

Feature Requests

Include:

Clear description of the desired feature
Use cases and motivation
Proposed API design (if applicable)
Willingness to implement

Release Process#

For maintainers:

1. Version Bumping

# Update version in pyproject.toml
poetry version patch  # or minor, major

# Update version in __init__.py
# Update CHANGELOG.md

2. Testing

# Run full test suite
poetry run pytest tests/

# Run tests on different Python versions
tox

3. Building

# Build Python package
poetry build

# Build documentation
cd docs && make html

4. Release

# Tag release
git tag v0.1.3
git push origin v0.1.3

# Publish to PyPI
poetry publish

Recognition#

Contributors will be recognized in:

CONTRIBUTORS.md file
Release notes
Documentation acknowledgments

Thank you for contributing to Panther!