repo · GitHub · Trust 82 · Primary · Published 20h ago · Live · 12h ago
NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Covers
Implements
paper · FlexViT: A Flexible FPGA-based Accelerator for Edge Vision Transformers
paper · GPU Parallelization Strategies for Forward and Backward Propagation in Shallow Neural Networks: A CUDA-Based Comparative Study
paper · Efficient PEFT Methods with Adaptive Checkpointing for Vision Models and VLMs on Resource Constrained Consumer-GPUs
Related across the graph
paper · Efficient PEFT Methods with Adaptive Checkpointing for Vision Models and VLMs on Resource Constrained Consumer-GPUs
news · From Materials Simulation to Experimental Astronomy, New NVIDIA AI Software Unlocks Scientific Discoveries
paper · FlexViT: A Flexible FPGA-based Accelerator for Edge Vision Transformers
news · Accelerating Transformers Fine-Tuning with NVIDIA NeMo AutoModel
paper · GPU Parallelization Strategies for Forward and Backward Propagation in Shallow Neural Networks: A CUDA-Based Comparative Study
