Skip to main content

Internal differences from Mercurial

note

This page assumes that you are familiar with Mercurial internals.

Visibility

Mercurial treats all commits as visible by default, using obsolescence data to mark obsoleted commits as invisible.

Sapling treats all commits as invisible by default, using "visible heads" and bookmark references to mark commits and their ancestors as visible. This is similar to Git.

Performance wise, too much obsolescence data can slow down a Mercurial repo. Similarly, too many bookmarks and visible heads can slow down a Sapling repo. However, obsolescence data can grow over time unbounded while bookmarks and visible heads can shrink using commands like sl bookmark -d and sl hide. Practically, we assume a bounded number of bookmarks and visible heads.

Mercurial has a "repo view" layer to forbid access to hidden commits. Accessing them (for example, using the predecessors() revset) requires a global flag --hidden. Sapling removes the "repo view" layer. Revsets like all(), children(), descendants() handle the visibility transparently by not including invisible commits. Revsets like predecessors() do not care about visibility and return invisible commits. If the user explicitly requests them using commit hashes, they will be included.

Phase

Mercurial tracks phases (public, draft, secret) explicitly using "phase roots". Commits are public by default. Draft and secret roots are explicitly listed. The "phase roots" can grow unbounded and slow down the repo over time.

Sapling infers phases from remote bookmarks and visibility. Commits are secret (invisible) by default. Main remote bookmarks and their ancestors are marked public. Other visible commits are draft.

In Mercurial visibility and phase are separate concepts. A secret commit can be visible or invisible. In Sapling "secret" is just an alias to "invisible" - there are no "visible secret" commits.

Obsolescence

Mercurial uses the "obsstore" to track commit rewrites. Sapling uses "mutation". Their differences are:

  • Obsstore decides visibility. Mutation does not decide visibility.
  • Obsstore supports "prune" operation to remove a commit without a successor commit. Mutation requires at least one successor commit so it cannot track "prune" rewrites.
  • If all successors of a mutation are invisible, then the mutation is ignored. This means mutation can be implicitly tracked by visibility. Restoring visibility to a previous state during an undo operation effectively restores the commit rewrite state.

Implementation wise, mutation uses IndexedLog for O(log N) lookup. Nothing in Sapling requires O(N) loading of the entire mutation data.

Storage format

Mercurial uses Revlog as its main file format. Sapling uses IndexedLog instead.

For working copy state, Mercurial uses Dirstate. Sapling switched to TreeState in 2017. Mercurial 5.9 released in 2021 introduced Dirstate v2 that improves performance in a similar way.

For repo references such as bookmarks and remote bookmarks, Mercurial tracks them in individual files like .hg/bookmarks. Sapling uses MetaLog to track them so changes across state files are atomic.

Protocols

Mercurial supports ssh and http wireprotocols. Sapling's main protocol is defined in a Rust EdenApi trait. It is very different from the original wireprotocols.

There are two implementations of the EdenApi trait: an HTTP client that talks to a supported server and an EagerRepo for lightweight local testing. The HTTP implementation uses multiple connections to saturate network bandwidth for better performance.

Python 3 and Unicode

Python 3 switched the str type from bytes to unicode. This affects keyword arguments, and stdlib APIs like os.listdir, sys.argv.

Sapling adopts Unicode more aggressively. Command line arguments, bookmark names, file names, config files are considered Unicode and are encoded using utf-8 during serialization. Sapling does not turn Python keyword arguments and stdlib output back to bytes.

Treating file names as utf-8 allows Sapling to read and write correct file names between Windows and *nix systems for a given repo.

Pure Python support

Mercurial maintains a pure Python implementation. It can run without building with a C or Rust compiler by setting HGMODULEPOLICY to py. This is not possible for Sapling.

Ignore files

Mercurial supports .hgignore, optionally .gitignore through extensions. Sapling only supports .gitignore.

Git support

There are 2 extensions that add Git support to Mercurial:

hg-git mirrors the bare Git repo to a regular hg repo. Therefore it double stores file content, and produces different hashes.

hgext/git tries to be compatible with an existing Git repo. Therefore it is limited to git specifications like what the .git directory should contain and in what format.

Sapling treats Git as an implementation of its repo data abstraction. This means:

  • The working copy implementation is Sapling's. It can integrate with our virtualized working copy filesystem in the future.
  • The repo data implementation can adopt Sapling's components in the future for benefits like on-demand fetching, data bookkeeping without repack.
  • Git commands are not supported in Sapling's Git repo.