Bucket Integrations
Storage Buckets can be read from and written to by many Python data libraries using hf://buckets/ paths, backed by the huggingface_hub filesystem interface.
For the underlying access mechanisms — mounts, volume mounts, and fsspec — see Access Patterns.
pandas
import pandas as pd
df = pd.read_parquet("hf://buckets/username/my-bucket/data.parquet")
df.to_parquet("hf://buckets/username/my-bucket/output.parquet")
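For private buckets, credentials can typically be passed through storage_options, which pandas forwards to the underlying hf:// filesystem. A minimal sketch, assuming token authentication; the hf_xxx value is a placeholder:
import pandas as pd

# storage_options is forwarded to the hf:// filesystem backend (placeholder token)
df = pd.read_parquet(
    "hf://buckets/username/my-bucket/data.parquet",
    storage_options={"token": "hf_xxx"},
)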
Dask
import dask.dataframe as dd
df = dd.read_parquet("hf://buckets/username/my-bucket/data.parquet")
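Writing back should work through the same fsspec path; note that Dask writes a directory of per-partition files. A sketch, assuming write access to the bucket and an illustrative output directory:
import dask.dataframe as dd

df = dd.read_parquet("hf://buckets/username/my-bucket/data.parquet")
# Dask writes one Parquet file per partition under this directory
df.to_parquet("hf://buckets/username/my-bucket/output/")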
PyArrow
import pyarrow.parquet as pq
table = pq.read_table("hf://buckets/username/my-bucket/data.parquet")
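To write a table back, one option is to pass an explicit HfFileSystem instance together with a path without the hf:// prefix. A sketch; HfFileSystem is the fsspec implementation from huggingface_hub, and the output path is illustrative:
import pyarrow.parquet as pq
from huggingface_hub import HfFileSystem

table = pq.read_table("hf://buckets/username/my-bucket/data.parquet")
# Hand PyArrow the fsspec filesystem explicitly and drop the hf:// scheme from the path
pq.write_table(table, "buckets/username/my-bucket/output.parquet", filesystem=HfFileSystem())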
PySpark
With pyspark_huggingface installed:
df = (
    spark.read.format("huggingface")
    .option("data_files", '["data.parquet"]')
    .load("buckets/username/my-bucket")
)
See PySpark on the Hub for more.
🤗 Datasets
from datasets import load_dataset
ds = load_dataset("buckets/username/my-bucket", data_files=["data.parquet"])
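For larger buckets, streaming may avoid materializing everything locally. A sketch, assuming buckets support the same streaming path as regular dataset repositories:
from datasets import load_dataset

# Iterate over rows without downloading the full dataset first (assumption: buckets
# stream like regular dataset repositories)
ds = load_dataset("buckets/username/my-bucket", data_files=["data.parquet"], streaming=True)
for example in ds["train"]:
    print(example)
    break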
Filesystem operations
For direct file operations, huggingface_hub exposes a pre-instantiated filesystem object, hffs:
from huggingface_hub import hffs
with hffs.open("buckets/username/my-bucket/hello.txt", "w") as f:
    f.write("Hello world!")
hffs.cp("buckets/username/my-bucket/hello.txt", "buckets/username/my-bucket/hello2.txt")
hffs.rm("buckets/username/my-bucket/hello2.txt")
files = hffs.ls("buckets/username/my-bucket")
text_files = hffs.glob("buckets/username/my-bucket/*.txt")
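If you need to pass credentials explicitly instead of relying on a cached login, you can instantiate the filesystem yourself rather than using the shared hffs object. A sketch; the token value is a placeholder:
from huggingface_hub import HfFileSystem

# An explicit instance accepts a token (and other constructor options) directly
fs = HfFileSystem(token="hf_xxx")
files = fs.ls("buckets/username/my-bucket")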
Other languages
OpenDAL provides a similar filesystem interface for Rust, Java, Go, JavaScript, and more.
Coming soon
Support for more libraries is on the way — including Polars, DuckDB (native hf:// URL support), Daft, and webdataset.