spark-connect: A Spark Connect Client for the Rust ecosystem

I’d like to share a crate that I’ve been working on: spark-connect.
It is a fully asynchronous, strongly typed Rust client for interacting with a remote Spark Connect server.
Like many others, I created this crate out of necessity: I needed a dedicated client for running SQL queries against a Spark server from Rust.
Similar efforts already existed in the community (shoutout to spark-connect-rs, which provided plenty of inspiration), but I found the existing options either bloated or explicitly marked as experimental.
spark-connect emphasizes a clean API, clarity, and maintainability over a broad feature set, making it easier to reason about and build upon while providing the stability needed to move beyond the experimental phase.
Key Features
If you are coming from other Rust SQL toolkits (e.g. sqlx), the API should feel very familiar. It supports parameterized queries, safe binding, and standard connection strings.
- Async/Await: Built on top of tokio and tonic (gRPC).
- SQL-First: Focuses on executing SQL queries with support for parameter binding (`?` syntax).
- Arrow Native: Returns data as `Vec<RecordBatch>`, making it easy to integrate with polars or datafusion.
- Lazy Execution: Supports `sql()` for lazy evaluation (similar to PySpark/Scala) or `query()` for immediate execution.
Example Usage
Get started by installing the crate with cargo:
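```sh
# assumes the crate is published on crates.io under the same name
cargo add spark-connect
```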
Next, start a session and execute queries against a Spark cluster:
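A minimal end-to-end sketch (the `SparkSession::connect`, `query()`, and `bind()` names below are illustrative; see the crate docs for the exact API):

```rust
use spark_connect::SparkSession;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect using a standard Spark Connect connection string
    // (host and port here are illustrative).
    let session = SparkSession::connect("sc://localhost:15002").await?;

    // `query()` executes immediately; `bind()` safely fills the `?`
    // placeholder. Results come back as Arrow record batches.
    let batches = session
        .query("SELECT id, name FROM users WHERE id = ?")
        .bind(42)
        .await?;

    for batch in &batches {
        println!("rows in batch: {}", batch.num_rows());
    }

    Ok(())
}
```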
You can also use the `sql()` / `collect()` interface for lazy execution, much like idiomatic Spark:
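Sketched under the same caveat, assuming `sql()` returns a lazy DataFrame-like handle whose `collect()` triggers execution:

```rust
// Reusing the `session` from the previous example.
// `sql()` only builds the plan; nothing runs on the cluster yet.
let df = session.sql("SELECT category, count(*) AS n FROM events GROUP BY category");

// `collect()` triggers execution and returns the Arrow record batches.
let batches = df.collect().await?;
```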
Closing Thoughts
I’m excited to share spark-connect with the Rust community and hope it makes working with Spark simpler, more reliable, and more idiomatic. My goal has been to provide a client that is easy to understand, maintain, and extend, while embracing Rust’s async patterns and SQL-first workflows.
I’d love to hear your feedback, see what you build with it, or collaborate on improving it further. You can check out the crate and get started here: