api-parity: A tool for tracking API coverage

If you’ve ever written a port of someone else’s library - a Rust client for a Python SDK, a TypeScript wrapper around a Go service, anything that promises “we mirror the upstream API” - you know the question that doesn’t go away: what’s actually covered? How far along is the port?
The honest answer is usually “I’m not totally sure.” You start with a wishlist and a TODO file. The TODO file goes stale. Upstream releases a minor version, adds three methods, and you find out two months later when an issue lands. A reviewer asks “is this whole class done?” and the only way to answer is to grep both repos.
I’ve been building spark-connect, a Rust port of PySpark Connect, and ran into exactly this. You see, I needed a way to:
- Enumerate what exists on the upstream side (a reference).
- Enumerate what the port claims to mirror (a port).
- Diff the two, with status and ownership commentary on every covered method.
Rather than hard-code that into the Spark project, I lifted it out into a general-purpose tool: api-parity.
How api-parity works
At the time of writing, api-parity is three pieces:
- A language-agnostic differ: It eats two JSON files and produces a markdown report: per-class coverage, stale port entries (typos, removed APIs, drift), per-method status.
- A Python plugin - walks any Python package via `inspect.getmembers` and emits the public surface as JSON. Also ships decorators (`@parity`, `@parity_impl`) so a Python port can be annotated in code.
- A Rust plugin - ships attribute macros (`#[parity]`, `#[parity_impl]`) for annotating a Rust port. Optionally ships a walker (behind a `walker` Cargo feature) that shells out to `cargo +nightly rustdoc --output-format json` and parses with `public-api`, so you can also use a Rust crate as the reference side.
The plugins talk to the differ over JSON - there’s no shared in-process API. That makes adding a new language a matter of writing a plugin, not patching the core.
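To make the differ's job concrete, here is a minimal sketch of the core comparison in Rust, assuming each side has already been reduced to a list of fully qualified paths. The `Diff` struct and the `diff_paths` function are hypothetical names for illustration, not api-parity's actual types, and the real JSON envelopes carry more than bare paths (status, comments, and so on).

```rust
use std::collections::BTreeSet;

/// Result of comparing a reference surface against a port's claims.
/// (Hypothetical shape, for illustration only.)
#[derive(Debug, PartialEq)]
pub struct Diff {
    /// Paths the port mentions that also exist upstream.
    pub covered: Vec<String>,
    /// Upstream paths the port does not mention at all.
    pub missing: Vec<String>,
    /// Port paths that match nothing upstream (typos, removed APIs, drift).
    pub stale: Vec<String>,
}

/// Compare two lists of fully qualified paths using plain set operations.
/// BTreeSet keeps every output list sorted and free of duplicates.
pub fn diff_paths(reference: &[&str], port: &[&str]) -> Diff {
    let reference: BTreeSet<&str> = reference.iter().copied().collect();
    let port: BTreeSet<&str> = port.iter().copied().collect();

    Diff {
        covered: reference.intersection(&port).map(|s| s.to_string()).collect(),
        missing: reference.difference(&port).map(|s| s.to_string()).collect(),
        stale: port.difference(&reference).map(|s| s.to_string()).collect(),
    }
}
```

Given a reference of `A.x` and `A.y` and a port that claims `A.x` and `A.typo`, this reports `A.y` as missing and `A.typo` as stale. Because the comparison only ever sees paths, nothing in it depends on which language produced either list.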
The four-direction trick
Each plugin makes two orthogonal choices:
- kind: is this envelope the reference (truth) or the port (claims)?
- mode: how are the entries produced - by walking the public API surface, or by collecting annotations attached to code?
These are independent, so each plugin supports all four kind/mode combinations. Across the two plugins, that gives four reference/port pairings:
| X | py REFERENCE | rs REFERENCE |
|---|---|---|
| py PORT | py library vs another py library | rs crate → py port |
| rs PORT | py library → rs crate (e.g. PySpark → Rust) | rs library vs another rs library |
The default mapping (reference → walker, port → annotation) covers the common case: you can’t annotate the upstream library (you don’t own it), and you do want to annotate your port (because the status - implemented, partial, unimplemented - is opinion, not fact). But all four cells work end-to-end.
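The two choices and the default pairing can be modelled in a few lines of Rust. This is only a sketch: the `Kind` and `Mode` enums, their variant names, and the helper functions are assumptions made for illustration and may not match how api-parity represents them.

```rust
/// Which side of the comparison an envelope represents.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    /// The truth: what exists upstream.
    Reference,
    /// The claims: what the port says it mirrors.
    Port,
}

/// How the entries in an envelope were produced.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    /// Walk the public API surface.
    Walker,
    /// Collect annotations attached to code.
    Annotation,
}

impl Kind {
    /// Default pairing from the text: reference -> walker, port -> annotation.
    pub fn default_mode(self) -> Mode {
        match self {
            Kind::Reference => Mode::Walker,
            Kind::Port => Mode::Annotation,
        }
    }
}

/// Every kind/mode combination a single plugin can produce.
pub fn all_combinations() -> Vec<(Kind, Mode)> {
    let mut out = Vec::new();
    for kind in [Kind::Reference, Kind::Port] {
        for mode in [Mode::Walker, Mode::Annotation] {
            out.push((kind, mode));
        }
    }
    out
}
```

Keeping the two as separate enums, rather than one enum with four variants, mirrors the point that the choices are orthogonal: the default is just one function from `Kind` to `Mode`, and overriding it does not touch anything else.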
What api-parity looks like
In a Rust port:
```rust
// Annotate the impl block with the upstream class it mirrors.
#[parity_impl(path = "pyspark.sql.session.SparkSession", status = Implemented)]
impl SparkSession {
    #[parity(path = ".sql", status = Implemented, since = "3.4")]
    pub fn sql(&self, q: &str) -> Result<DataFrame, SparkError> { /* ... */ }

    // Tracked but not implemented yet; the comment is carried into the report.
    #[parity(path = ".stop", status = Unimplemented, comment = "no shutdown hook yet")]
    pub fn stop(&self) -> Result<(), SparkError> { unimplemented!() }
}
```

Then run the differ on the reference and port JSON files. The resulting `report.md` tells you exactly where you stand: a coverage percentage per class, a list of stale port entries that no longer match any upstream path, and the status / comment you wrote for each method.
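As a rough illustration of where a per-class percentage could come from, here is a small Rust sketch. It rests on two assumptions that the tool may not share: that a method path with a leading `.` is resolved relative to the enclosing impl block's path, and that "covered" means the port mentions the path at all, whatever its status. The function names `resolve` and `coverage_by_class` are hypothetical.

```rust
use std::collections::BTreeMap;

/// Resolve a method-level path against its enclosing class path.
/// Assumption: a leading '.' means "relative to the impl block's path".
pub fn resolve(class_path: &str, method_path: &str) -> String {
    if method_path.starts_with('.') {
        format!("{}{}", class_path, method_path)
    } else {
        method_path.to_string()
    }
}

/// Percentage of reference methods, per class, that the port mentions.
/// Paths are split at the last '.' into (class, method); paths without
/// a '.' are skipped.
pub fn coverage_by_class(reference: &[&str], port: &[&str]) -> BTreeMap<String, f64> {
    // class -> (covered, total)
    let mut counts: BTreeMap<String, (usize, usize)> = BTreeMap::new();
    for path in reference {
        if let Some((class, _method)) = path.rsplit_once('.') {
            let entry = counts.entry(class.to_string()).or_insert((0, 0));
            entry.1 += 1;
            if port.contains(path) {
                entry.0 += 1;
            }
        }
    }
    counts
        .into_iter()
        .map(|(class, (covered, total))| (class, 100.0 * covered as f64 / total as f64))
        .collect()
}
```

Under these assumptions, `resolve("pyspark.sql.session.SparkSession", ".sql")` yields `pyspark.sql.session.SparkSession.sql`, and a class with two upstream methods of which the port mentions one comes out at 50%.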
Check it out!
The repo is on GitHub. v0.0.2 ships all four directions. Issues and PRs welcome - especially plugins for other languages.