File system

Pyxet implements a simple API based on the fsspec library. Use it to access local files, remote files, and files in XetHub.

Using URLs

Xet URLs are in the form:

  xet://<endpoint>:<user_name>/<repo_name>/<branch>/<path_to_file>

Use our public endpoint unless you’re on a custom enterprise deployment.

The <path_to_file> component is optional when the URL refers to a repository, and the xet:// prefix is optional when using pyxet.XetFS. If pyxet.XetFS is initialized with an endpoint, the xet://<endpoint>: prefix is inferred.
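To make the optional parts concrete, here is a minimal sketch of splitting a Xet URL into its components. It assumes the layout `xet://<endpoint>:<user_name>/<repo_name>/<branch>/<path_to_file>`, inferred from the examples in this section. `XetPath` and `split_xet_url` are hypothetical helpers written for illustration, not part of pyxet, and `hub.example.com` is a placeholder endpoint.

```python
from typing import NamedTuple, Optional


class XetPath(NamedTuple):
    """Components of a Xet URL (illustrative only, not a pyxet type)."""
    endpoint: Optional[str]
    user: str
    repo: str
    branch: Optional[str]
    path: str


def split_xet_url(url: str) -> XetPath:
    """Split a Xet URL or bare repository path into its parts."""
    # The xet:// prefix is optional, so strip it only if present.
    rest = url[len("xet://"):] if url.startswith("xet://") else url

    # Assumption: an endpoint, when given, is the text before the first ':'
    # in the first path segment.
    endpoint = None
    first, sep, tail = rest.partition("/")
    if ":" in first:
        endpoint, _, first = first.partition(":")
        rest = first + sep + tail

    # Remaining layout: <user_name>/<repo_name>[/<branch>[/<path_to_file>]]
    parts = rest.strip("/").split("/", 3)
    if len(parts) < 2 or not all(parts[:2]):
        raise ValueError(f"expected at least <user_name>/<repo_name>: {url!r}")

    branch = parts[2] if len(parts) > 2 else None
    path = parts[3] if len(parts) > 3 else ""
    return XetPath(endpoint, parts[0], parts[1], branch, path)


# Full URL with a placeholder endpoint.
full = split_xet_url("xet://hub.example.com:XetHub/titanic/main/titanic.csv")
# Bare path with no prefix, endpoint, or file path.
bare = split_xet_url("XetHub/Flickr30k/main")
```

In this sketch, `full.endpoint` is the placeholder endpoint and `bare.endpoint` is `None`, which mirrors the case where the endpoint is inferred from how the file system was initialized.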

Accessing private repositories

To create your own repositories or access private repositories, first create a XetHub account and set your personal access token.


To work with a XetHub repository as a file system, use the pyxet.XetFS class. It provides a file system handle for a XetHub repository that supports opens, reads, and writes. Initializing the class takes a repository URL and optional arguments for branch, user, and token. Every write operation automatically commits the change back to XetHub; if you set a commit message, it is applied to that commit.

Example usage of pyxet.XetFS:

  import pyxet

  # Create a file system handle for a repository
  fs = pyxet.XetFS('')

  # List files in the repository.
  files = fs.ls('XetHub/Flickr30k/main')

  # Open a file from the repository.
  f = fs.open('XetHub/Flickr30k/main/results.csv')

  # Read the contents of the file.
  contents = f.read()

  # Write to a repository with an optional commit message
  with fs.transaction as tr:
    tr.set_commit_message("Writing things")
    fs.open("<user_name>/<repo_name>/main/foo", 'w').write("Hello world!")

Other common utilities

  import pyxet

  fs = pyxet.XetFS('')  # fsspec filesystem

  # Read functions
  fs.info("XetHub/titanic/main/titanic.csv")
  # returns repo level info: {'name': '', 'size': 61194, 'type': 'file'}

  fs.open("XetHub/titanic/main/titanic.csv", 'r').read(20)
  # returns first 20 characters: 'PassengerId,Survived'

  fs.get("XetHub/titanic/main/data/", "data", recursive=True)
  # download remote directory recursively into a local data folder

  fs.ls("XetHub/titanic/main/data/", detail=False)
  # returns ['data/titanic_0.parquet', 'data/titanic_1.parquet']

  # Write functions, with optional commit message
  with fs.transaction as tr:
    tr.set_commit_message("Write hi")
    fs.open("<user_name>/<repo_name>/main/text.txt", 'w').write("Hello world!")
  # writes "Hello world!" to text.txt and commits the change with message "Write hi" in the main branch of the repository

  with fs.transaction as tr:
    tr.set_commit_message("Copy file")
    fs.cp("<user_name>/<repo_name>/main/text.txt", "<user_name>/<repo_name>/main/text2.txt")
  # copies text.txt into text2.txt in the main branch of the repository, commits the change with "Copy file"

  with fs.transaction as tr:
    tr.set_commit_message("Remove file")
    fs.rm("<user_name>/<repo_name>/main/text.txt")
  # removes a file from the main branch of the repository with comment "Remove file"


Many packages, such as pandas and pyarrow, support the fsspec protocol. When you pass a file path to these packages, it must be a full xet:// URL. For example, to read a CSV with pandas, use:

  import pyxet   # make xet protocol available to fsspec
  import pandas as pd

  df = pd.read_csv('xet://')
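Because pyxet.XetFS accepts paths without the xet:// prefix while fsspec-aware packages need it, a small normalizing helper can avoid mixing the two forms. The sketch below is one possible approach; `to_xet_url` is a hypothetical name, not a pyxet function.

```python
def to_xet_url(path: str) -> str:
    """Return `path` with the xet:// scheme, adding it only when missing."""
    if path.startswith("xet://"):
        return path
    # Drop any leading slash so the result does not contain 'xet:///'.
    return "xet://" + path.lstrip("/")


# A bare repository path, as accepted by pyxet.XetFS, becomes a full URL.
url = to_xet_url("XetHub/titanic/main/titanic.csv")
```

The resulting string can then be passed wherever a package expects an fsspec URL, for example as the first argument to `pd.read_csv` after `import pyxet` has registered the protocol.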