OliP's Collections

LLM Deployment

updated Sep 18, 2024

  • 📊 Llm Pricing

    Space • Paused • 273

    Display a React app with TypeScript


  • 🚀 Can You Run It? LLM version

    Space • Running • Featured • 1.04k

    Determine GPU requirements for running large language models


  • Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

    Paper • 2312.15234 • Published Dec 23, 2023 • 3

  • EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

    Paper • 2407.11062 • Published Jul 10, 2024 • 10

  • Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

    Paper • 2408.03314 • Published Aug 6, 2024 • 63

  • 📊 Transformer Calculator

    Space • Paused • 36

    Calculate memory, parameters, and FLOPs for transformer models
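
For readers wondering what kind of arithmetic the calculator-style Spaces in this collection deal with, here is a rough back-of-the-envelope sketch. It is not the code of any Space listed above. The function names are hypothetical, and the formulas are common approximations for a decoder-only transformer: biases, layer norms, and position embeddings are ignored, the output head is assumed to be tied to the token embeddings, and the feed-forward width is assumed to default to 4x the model width.

```python
def transformer_param_count(n_layers, d_model, vocab_size, d_ff=None):
    """Approximate parameter count of a decoder-only transformer.

    Assumptions (not taken from any listed Space): no biases, no layer-norm
    or position-embedding parameters, output head tied to token embeddings.
    """
    if d_ff is None:
        d_ff = 4 * d_model                 # common default feed-forward width
    attention = 4 * d_model * d_model      # Q, K, V and output projections
    mlp = 2 * d_model * d_ff               # up and down projections
    embeddings = vocab_size * d_model      # token embedding matrix
    return n_layers * (attention + mlp) + embeddings


def weight_memory_gib(n_params, bytes_per_param=2):
    """Memory needed just to hold the weights, in GiB (2 bytes = fp16/bf16)."""
    return n_params * bytes_per_param / 2**30


def forward_flops_per_token(n_params):
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per parameter per token.

    This ignores the attention cost that grows with context length.
    """
    return 2 * n_params


if __name__ == "__main__":
    # A GPT-2-small-like shape: 12 layers, width 768, vocabulary 50257.
    n = transformer_param_count(n_layers=12, d_model=768, vocab_size=50257)
    print(f"parameters: {n:,}")
    print(f"fp16 weights: {weight_memory_gib(n):.2f} GiB")
    print(f"forward FLOPs per token: {forward_flops_per_token(n):,}")
```

Real serving memory is higher than the weight figure alone, since the KV cache, activations, and framework overhead also take space; treat these numbers as a lower bound.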
