@neuralmagic

Neural Magic

Neural Magic helps developers accelerate machine learning performance using automated model sparsification techniques and inference technologies.

Pinned

  1. nm-vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python, 240 stars, 9 forks

  2. deepsparse Public

    Sparsity-aware deep learning inference runtime for CPUs

    Python, 2.9k stars, 169 forks

  3. sparseml Public

    Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

    Python, 2k stars, 141 forks

  4. docs Public

    Top-level directory for documentation and general content

    MDX, 120 stars, 7 forks

  5. examples Public

    Notebooks using the Neural Magic libraries 📓

    Jupyter Notebook, 39 stars, 6 forks

  6. sparsezoo Public

    Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

    Python, 362 stars, 23 forks

Repositories

Showing 10 of 47 repositories
  • nm-vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 240 3,355 0 25 Updated Jul 18, 2024
  • deepsparse Public

    Sparsity-aware deep learning inference runtime for CPUs

    Python 2,945 169 5 20 Updated Jul 18, 2024
  • sparsezoo Public

    Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

    Python 362 Apache-2.0 23 1 6 Updated Jul 19, 2024
  • compressed-tensors Public

    A safetensors extension to efficiently store sparse quantized tensors on disk

    Python 14 Apache-2.0 0 1 9 Updated Jul 18, 2024
  • AutoFP8 Public
    Python 88 Apache-2.0 12 4 3 Updated Jul 18, 2024
  • nm-vllm-certs
    0 0 0 0 Updated Jul 18, 2024
  • nm-actions Public

    Neural Magic GHA

    0 Apache-2.0 0 0 2 Updated Jul 18, 2024
  • guidellm Public
    Python 2 Apache-2.0 0 0 4 Updated Jul 18, 2024
  • upstream-llm-foundry Public Forked from mosaicml/llm-foundry

    LLM training code for MosaicML foundation models

    Python 0 Apache-2.0 512 0 0 Updated Jul 18, 2024
  • inference Public Forked from mlcommons/inference

    Reference implementations of MLPerf™ inference benchmarks

    Python 1 Apache-2.0 507 0 1 Updated Jul 17, 2024