Contributing to Panther

We welcome contributions to Panther! This guide covers environment setup, the development workflow, coding standards, testing, documentation, and how to report issues.

Getting Started

1. Fork and Clone the Repository

# Fork the repository on GitHub, then clone your fork
git clone https://github.com/YOUR_USERNAME/panther.git
cd panther

# Add the original repository as upstream
git remote add upstream https://github.com/FahdSeddik/panther.git

2. Set Up Development Environment

# Install development dependencies
poetry install --with dev

# Install pre-commit hooks
poetry run pre-commit install

# Build the native backend
cd pawX
make all  # Linux/macOS
# or
.\build.ps1  # Windows

# Return to the repository root
cd ..

3. Verify Installation

# Run tests to ensure everything works
poetry run pytest tests/
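If the setup or test commands stop before any test runs, one thing worth ruling out is a tool missing from PATH. The sketch below is not part of the Panther repository; it only checks for the command-line tools this guide invokes, and the tool list is an assumption drawn from the commands above.

```python
import shutil

# Tools invoked by the setup commands in this guide. `make` applies to
# Linux/macOS only; on Windows the guide uses build.ps1 instead.
REQUIRED_TOOLS = ("git", "poetry", "make")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def report(tools=REQUIRED_TOOLS):
    """Build a one-line, human-readable summary of the PATH check."""
    missing = missing_tools(tools)
    if missing:
        return "Missing from PATH: " + ", ".join(missing)
    return "All required tools found."
```

Calling `print(report())` from the repository root gives a quick summary before you start debugging the build itself.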

Development Workflow

1. Create a Feature Branch

git checkout -b feature/your-feature-name

2. Make Your Changes

  • Follow the coding standards (see below)

  • Add tests for new functionality

  • Update documentation as needed

  • Ensure all tests pass

3. Run Tests and Checks

# Run all tests
poetry run pytest tests/

# Run type checking
poetry run mypy panther/

# Run linting
poetry run ruff check panther/

# Run formatting
poetry run ruff format panther/
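The four commands above can be chained in a small helper so that one failure does not hide the others. This is a sketch and not a script shipped with Panther; the function names are hypothetical. Note that `ruff format` rewrites files in place; `ruff format --check` only reports, which may suit a pre-push check better.

```python
import subprocess
import sys

# The four commands from this step, in the order listed above.
CHECKS = [
    ["poetry", "run", "pytest", "tests/"],
    ["poetry", "run", "mypy", "panther/"],
    ["poetry", "run", "ruff", "check", "panther/"],
    ["poetry", "run", "ruff", "format", "panther/"],
]


def run_checks(commands):
    """Run each command in turn and return the ones that failed.

    A command counts as failed if it exits non-zero or cannot be started.
    All commands are run even if an earlier one fails.
    """
    failed = []
    for command in commands:
        print("$ " + " ".join(command))
        try:
            result = subprocess.run(command, check=False)
        except OSError:
            # The executable was not found or could not be launched.
            failed.append(command)
            continue
        if result.returncode != 0:
            failed.append(command)
    return failed


def main():
    """Run all checks; return a process exit code (0 means all passed)."""
    failed = run_checks(CHECKS)
    for command in failed:
        print("FAILED: " + " ".join(command), file=sys.stderr)
    return 1 if failed else 0
```

Saved as a script, it can end with `sys.exit(main())` so the shell sees a non-zero exit code when any check fails.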

4. Commit and Push

git add .
git commit -m "feat: add your feature description"
git push origin feature/your-feature-name
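The example commit message uses a `feat:` prefix, which resembles the Conventional Commits style. This guide does not list the accepted prefixes, so the set in the sketch below is an assumption; only `feat` comes from the example above.

```python
import re

# Assumed prefix set, borrowed from common Conventional Commits usage.
# Only `feat` appears in this guide; the rest are illustrative.
ASSUMED_TYPES = ("feat", "fix", "docs", "test", "refactor", "chore")

# type, optional "(scope)", then ": " and a non-empty description.
_SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()\s]+)\))?: (?P<description>\S.*)$"
)


def is_valid_subject(subject, allowed_types=ASSUMED_TYPES):
    """Return True if the first line of a commit message matches the pattern."""
    match = _SUBJECT_RE.match(subject)
    return bool(match) and match.group("type") in allowed_types
```

For example, `is_valid_subject("feat: add your feature description")` is true, while a message with no prefix such as `"added stuff"` is rejected.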

5. Create a Pull Request

  • Go to GitHub and create a pull request

  • Describe your changes clearly

  • Link any related issues

  • Wait for review and address feedback

Coding Standards

Python Code Style

We use Ruff for linting and formatting. Beyond what Ruff enforces, prefer clear names, type hints, and docstrings:

# Good: Clear variable names and docstrings
from typing import List, Optional

import torch


def compute_sketched_linear(
    input_tensor: torch.Tensor,
    sketch_matrices: List[torch.Tensor],
    bias: Optional[torch.Tensor] = None,
) -> torch.Tensor:
    """
    Compute sketched linear transformation.

    Args:
        input_tensor: Input tensor of shape (batch_size, in_features)
        sketch_matrices: List of sketching matrices
        bias: Optional bias tensor

    Returns:
        Output tensor of shape (batch_size, out_features)
    """
    # Implementation here
    pass

Type Hints

All functions should include type hints:

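from typing import Tuple

import torch

# DistributionFamily is assumed to be importable from Panther; the import
# is omitted because this guide does not show its module path.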

def cqrrpt(
    matrix: torch.Tensor,
    gamma: float = 1.25,
    distribution: DistributionFamily = DistributionFamily.Gaussian
) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    """Randomized QR with column pivoting."""
    # Implementation
    pass
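To spot functions that are missing annotations before review, a quick check with the standard `inspect` module can help. The sketch below is illustrative and not part of Panther; `unannotated`, `scale`, and `scale_untyped` are hypothetical names. mypy, which this guide already runs, offers `--disallow-untyped-defs` for the same purpose; whether Panther's configuration enables it is not stated here.

```python
import inspect


def unannotated(func):
    """Return the names of parameters (and 'return') lacking annotations."""
    signature = inspect.signature(func)
    missing = [
        name
        for name, param in signature.parameters.items()
        if name not in ("self", "cls")
        and param.annotation is inspect.Parameter.empty
    ]
    if signature.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing


def scale(values: list, factor: float = 2.0) -> list:
    """Fully annotated example."""
    return [v * factor for v in values]


def scale_untyped(values, factor=2.0):
    """Same function without annotations."""
    return [v * factor for v in values]
```

Here `unannotated(scale)` returns an empty list, while `unannotated(scale_untyped)` names both parameters and the return value.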

Documentation

All public functions and classes must have docstrings:

from torch import nn


class SKLinear(nn.Module):
    """
    Sketched linear layer using randomized low-rank approximation.

    This layer approximates a standard linear transformation using a sum
    of low-rank terms, reducing memory usage while maintaining performance.

    Args:
        in_features: Number of input features
        out_features: Number of output features
        num_terms: Number of low-rank terms in the approximation
        low_rank: Rank of each low-rank term
        bias: If True, adds a learnable bias

    Example:
        >>> layer = SKLinear(512, 256, num_terms=8, low_rank=64)
        >>> x = torch.randn(32, 512)
        >>> y = layer(x)  # Shape: (32, 256)
    """
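The docstring rule can be checked mechanically as well. The following sketch scans a namespace for public functions and classes without a docstring; it is a standalone illustration with hypothetical names, not a tool from the Panther repository.

```python
import inspect


def missing_docstrings(namespace):
    """Return sorted names of public functions/classes without a docstring.

    `namespace` is a mapping of names to objects, e.g. vars(some_module).
    Names starting with an underscore are treated as private and skipped.
    """
    missing = []
    for name, obj in namespace.items():
        if name.startswith("_"):
            continue
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue
        # Use __doc__ directly so inherited docstrings are not counted.
        if not (obj.__doc__ or "").strip():
            missing.append(name)
    return sorted(missing)


def documented():
    """Example function with a docstring."""


def undocumented():
    pass
```

Passing `vars(module)` for a module under `panther/` would list the public names that still need documentation.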

C++/CUDA Code Style

C++ and CUDA functions should carry Doxygen-style doc comments and descriptive names:

// Good: Clear function documentation and naming
/**
 * @brief Compute sketched linear forward pass on GPU.
 *
 * @param input Input tensor with shape [batch_size, in_features]
 * @param S1s First sketch matrices [num_terms, in_features, low_rank]
 * @param S2s Second sketch matrices [num_terms, low_rank, out_features]
 * @param U1s Fixed random matrices [num_terms, low_rank, out_features]
 * @param U2s Fixed random matrices [num_terms, low_rank, in_features]
 * @param bias Optional bias tensor [out_features]
 * @return Output tensor [batch_size, out_features]
 */
torch::Tensor sketched_linear_forward_cuda(
    const torch::Tensor& input,
    const torch::Tensor& S1s,
    const torch::Tensor& S2s,
    const torch::Tensor& U1s,
    const torch::Tensor& U2s,
    const torch::Tensor& bias
);

Testing Guidelines

Test Structure

Organize tests by functionality:

tests/
├── test_linalg.py          # Linear algebra functions
├── test_nn.py              # Neural network layers
├── test_sketch.py          # Sketching operations
├── test_cuda.py            # CUDA kernel tests
└── test_integration.py     # End-to-end tests

Writing Tests

import pytest
import torch
import panther as pr

class TestSKLinear:
    """Test suite for SKLinear layer."""

    def test_forward_pass_shape(self):
        """Test that forward pass produces correct output shape."""
        layer = pr.nn.SKLinear(
            in_features=128,
            out_features=64,
            num_terms=4,
            low_rank=32
        )

        x = torch.randn(16, 128)
        y = layer(x)

        assert y.shape == (16, 64), f"Expected (16, 64), got {y.shape}"

    def test_backward_pass(self):
        """Test that backward pass computes gradients correctly."""
        layer = pr.nn.SKLinear(128, 64, num_terms=4, low_rank=32)
        x = torch.randn(16, 128, requires_grad=True)

        y = layer(x)
        loss = y.sum()
        loss.backward()

        # Check that gradients were computed
        assert x.grad is not None
        assert layer.S1s.grad is not None
        assert layer.S2s.grad is not None

    @pytest.mark.parametrize("device", ["cpu", "cuda"])
    def test_device_compatibility(self, device):
        """Test layer works on different devices."""
        if device == "cuda" and not torch.cuda.is_available():
            pytest.skip("CUDA not available")

        layer = pr.nn.SKLinear(64, 32, num_terms=2, low_rank=16)
        layer = layer.to(device)

        x = torch.randn(8, 64, device=device)
        y = layer(x)

        assert y.device.type == device

Performance Tests

import time

import torch
import panther as pr


def test_sketched_layer_performance():
    """Test that a sketched layer uses fewer parameters and stays competitive in speed."""
    # Create layers. Assuming learnable parameters scale as
    # num_terms * low_rank * (in + out), this setting stays below
    # the dense in * out count.
    standard = torch.nn.Linear(2048, 2048)
    sketched = pr.nn.SKLinear(2048, 2048, num_terms=4, low_rank=128)

    # Compare parameter counts
    standard_params = sum(p.numel() for p in standard.parameters())
    sketched_params = sum(p.numel() for p in sketched.parameters())

    assert sketched_params < standard_params, "Sketched layer should use fewer parameters"

    x = torch.randn(64, 2048)

    # Time 100 forward passes per layer, without autograd bookkeeping
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(100):
            _ = standard(x)
        standard_time = time.perf_counter() - start

        start = time.perf_counter()
        for _ in range(100):
            _ = sketched(x)
        sketched_time = time.perf_counter() - start

    # Sketched should be competitive (within 2x)
    assert sketched_time < 2 * standard_time
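Whether the parameter-count assertion holds depends on the layer configuration. The arithmetic below is a rough sketch that assumes the only learnable tensors are S1s and S2s, with the shapes given in the C++ doc comment in this guide, plus the bias. The actual SKLinear implementation may differ, so treat the numbers as an estimate.

```python
def dense_params(in_features, out_features, bias=True):
    """Parameter count of a standard linear layer (weight plus optional bias)."""
    return in_features * out_features + (out_features if bias else 0)


def sketched_params(in_features, out_features, num_terms, low_rank, bias=True):
    """Parameter count under the assumption stated in the lead-in.

    S1s: num_terms * in_features * low_rank
    S2s: num_terms * low_rank * out_features
    """
    learnable = num_terms * low_rank * (in_features + out_features)
    return learnable + (out_features if bias else 0)


dense = dense_params(2048, 2048)
tied = sketched_params(2048, 2048, num_terms=8, low_rank=128)
smaller = sketched_params(2048, 2048, num_terms=4, low_rank=128)

print(dense)    # 4196352
print(tied)     # 4196352
print(smaller)  # 2099200
```

Under this assumption the sketched layer is smaller only when `num_terms * low_rank * (in_features + out_features)` is below `in_features * out_features`. For a 2048 x 2048 layer, `num_terms=8, low_rank=128` lands exactly on the dense count, while `num_terms=4` roughly halves it.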

Documentation Guidelines

Building Documentation

cd docs

# Clean previous build
make clean  # Linux/macOS
.\make.bat clean  # Windows

# Build HTML documentation
make html  # Linux/macOS
.\make.bat html  # Windows

# Open in browser
open _build/html/index.html  # macOS
xdg-open _build/html/index.html  # Linux
start _build/html/index.html  # Windows
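The three platform-specific open commands can be replaced by one call to Python's standard `webbrowser` module. A minimal sketch, assuming it is run from the `docs` directory and that the build output sits in `_build/html` as in the commands above; the function names are hypothetical.

```python
import webbrowser
from pathlib import Path


def docs_index_uri(build_dir="_build/html"):
    """Return a file:// URI for index.html under `build_dir`."""
    # Joining onto the current directory yields the absolute path as_uri() needs.
    return (Path.cwd() / build_dir / "index.html").as_uri()


def open_docs(build_dir="_build/html"):
    """Open the built documentation in the default browser.

    Returns False if index.html does not exist or no browser could be launched.
    """
    index = Path(build_dir) / "index.html"
    if not index.is_file():
        return False
    return webbrowser.open(docs_index_uri(build_dir))
```

Running `open_docs()` after `make html` behaves the same on Linux, macOS, and Windows.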

Writing Documentation

Use reStructuredText format with clear examples:

Function Name
=============

Brief description of what the function does.

Parameters
----------
param1 : type
    Description of parameter 1
param2 : type, optional
    Description of parameter 2 (default: value)

Returns
-------
return_type
    Description of return value

Examples
--------
>>> import panther as pr
>>> result = pr.function_name(param1, param2)
>>> print(result.shape)

Areas for Contribution

1. Core Algorithms

  • New sketching methods

  • Improved randomized algorithms

  • Memory optimization techniques

2. Neural Network Layers

  • Sketched versions of more PyTorch layers

  • Custom activation functions

  • Attention mechanisms

3. CUDA Kernels

  • Performance optimizations

  • Support for new GPU architectures

  • Memory-efficient implementations

4. AutoTuner Improvements

  • Better hyperparameter search strategies

  • Multi-objective optimization

  • Integration with popular ML frameworks

5. Documentation and Examples

  • Tutorial improvements

  • Real-world examples

  • Performance benchmarks

6. Testing and CI

  • Expanded test coverage

  • Performance regression tests

  • Cross-platform compatibility

Submitting Issues

When submitting a bug report or feature request, include the details listed below.

Bug Reports

Include:

  • Clear description of the problem

  • Minimal code to reproduce the issue

  • Error messages and stack traces

  • Environment information (OS, Python version, CUDA version)

# Example bug report template
import platform
import sys

import torch
import panther as pr

# Environment
print(f"OS: {platform.platform()}")
print(f"Python: {sys.version}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA: {torch.version.cuda}")
print(f"Panther: {pr.__version__}")

# Minimal reproduction case
layer = pr.nn.SKLinear(512, 256, num_terms=8, low_rank=64)
x = torch.randn(32, 512)

# This causes the error:
y = layer(x)  # Error occurs here

Feature Requests

Include:

  • Clear description of the desired feature

  • Use cases and motivation

  • Proposed API design (if applicable)

  • Whether you are willing to implement it

Release Process

For maintainers:

1. Version Bumping

# Update version in pyproject.toml
poetry version patch  # or minor, major

# Update version in __init__.py
# Update CHANGELOG.md

2. Testing

# Run full test suite
poetry run pytest tests/

# Run tests on different Python versions
tox

3. Building

# Build Python package
poetry build

# Build documentation
cd docs && make html

4. Release

# Tag release
git tag v0.1.3
git push origin v0.1.3

# Publish to PyPI
poetry publish

Recognition

Contributors will be recognized in:

  • CONTRIBUTORS.md file

  • Release notes

  • Documentation acknowledgments

Thank you for contributing to Panther!