Hi, developers! This is my first post here, so don’t roast me too hard 😅
I’d like to share a pet project my teammate and I have been working on. The core idea is to build a multi-language computational graph that also lets you quickly deploy a mini-FaaS (Function as a Service) platform on your local machine. In other words, you can easily mix and match code from various sources (and even different third-party tools) using a local framework and server. Right now, we’re calling this project SPL (Smart Pipe Lime).
How did this idea come about?
While working on a complex model, we realized we needed to combine code and utilities written in different programming languages, pulled from earlier projects. We had separate database queries, several fundamentally different methods of preprocessing large datasets, a two-stage training process, plus final evaluation and validation of the resulting model.
We considered well-known tools like Airflow, Dagster, and Prefect. However, they felt a bit heavy for some simpler scenarios, and they weren’t ideal for rapid prototyping. Besides, part of our dataset required lower-level processing with C++ rather than standard Python. That’s how the idea for a pet project arose—something that would let us seamlessly bring together code that otherwise wouldn’t play nicely, and also allow us to share our work within the team. Essentially, we had a few key goals:
- Build a connected computational graph made up of functions or utilities, regardless of their language or dependencies.
- Support both local and remote execution of these graphs (so teams can share their work).
- Make it possible to run only part of a graph, keeping the state and results of previous steps, to simplify testing new approaches.
Some implementation details
A computational graph is a directed, connected graph with nodes (functions or utilities) and links between them (inputs and outputs). Each node takes input parameters, performs a specific task, and sends the result along to the next node.
We use a few key terms:
- Node: A function that has input and output ports.
- Port: A named argument (input port) or a return value (output port).
- Artifact: The result of a node (a specific port), which gets cached and passed on.
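To make these terms concrete, here is a minimal sketch of how they could map onto plain Python objects. The class names `Port`, `Artifact`, and `Node`, the `from_function` helper, and the default output port name `result` are assumptions made for illustration, not SPL's actual data model.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Port:
    """A named input argument or a named return value (hypothetical model)."""
    name: str
    direction: str  # "in" or "out"


@dataclass
class Artifact:
    """The value produced on one output port of one node."""
    node: str
    port: str
    value: Any


@dataclass
class Node:
    """A function together with the ports derived from its signature."""
    func: Callable[..., Any]
    inputs: List[Port] = field(default_factory=list)
    outputs: List[Port] = field(default_factory=list)

    @classmethod
    def from_function(cls, func: Callable[..., Any], output: str = "result") -> "Node":
        # Every parameter of the function becomes an input port.
        params = inspect.signature(func).parameters
        return cls(
            func=func,
            inputs=[Port(name, "in") for name in params],
            outputs=[Port(output, "out")],
        )

    def run(self, **kwargs: Any) -> Artifact:
        # Call the wrapped function and wrap its return value as an artifact.
        value = self.func(**kwargs)
        return Artifact(self.func.__name__, self.outputs[0].name, value)


def scale(values, factor):
    return [v * factor for v in values]


node = Node.from_function(scale)
artifact = node.run(values=[1, 2, 3], factor=2)
print([p.name for p in node.inputs], artifact)
```

Deriving input ports from the function signature keeps node definitions short, at the cost of supporting only a single output port in this sketch.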
Our approach has two parts: a framework for a specific language (used whenever we’re running code in that language), plus a server that handles the FaaS side, orchestrates the nodes in the graph, and takes care of passing artifacts around correctly.
We chose to build the first version of the SPL framework for Python, since that’s the language we use most. The end result will be a library that lets you intuitively create computational graphs right from a Python notebook.
To manage the graph itself (adding or removing nodes, saving results, and running only certain parts), we decided on a mechanism similar to PyTorch's. In PyTorch, you can build a model by adding layers one after another to nn.Sequential(). In SPL, you likewise assemble a graph by adding function-nodes in a simple, easy-to-follow way:
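# Example of creating a simple graph
# (load_data, process_data, train_model, and save_results are your own functions, defined elsewhere)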
from spl import Graph, step
graph = Graph()
graph.add(step(load_data))
graph.add(step(process_data, params={'method': 'mean'}))
graph.add(step(train_model, params={'epochs': 10}))
graph.add(step(save_results))
Running the graph or individual nodes is also straightforward:
# Run the entire graph
graph.run()
# Run the graph starting from the third step (step indices are zero-based)
graph.run(from_step=2)
# Run a specific step
graph.run(step=process_data)
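To show why partial runs are cheap once results are cached, here is a toy sketch of a linear pipeline that remembers each step's output and can resume from any step. `ToyGraph`, its `cache` dictionary, and the three example functions are assumptions for illustration only; this is not how SPL is implemented internally.

```python
from typing import Any, Callable, Dict, List


class ToyGraph:
    """Linear pipeline: each step receives the previous step's result."""

    def __init__(self) -> None:
        self.steps: List[Callable[..., Any]] = []
        self.cache: Dict[int, Any] = {}  # step index -> cached artifact

    def add(self, func: Callable[..., Any]) -> None:
        self.steps.append(func)

    def run(self, from_step: int = 0) -> Any:
        # Resuming mid-graph only works if the previous step was already run.
        if from_step > 0 and (from_step - 1) not in self.cache:
            raise RuntimeError(f"step {from_step - 1} has no cached result")
        result = self.cache.get(from_step - 1)
        for index in range(from_step, len(self.steps)):
            func = self.steps[index]
            # The first step takes no input; later steps take the prior result.
            result = func() if index == 0 else func(result)
            self.cache[index] = result
        return result


calls: List[str] = []  # records which functions actually executed


def load_data():
    calls.append("load_data")
    return [1.0, 2.0, 3.0]


def process_data(data):
    calls.append("process_data")
    return sum(data) / len(data)


def report(mean):
    calls.append("report")
    return f"mean={mean}"


graph = ToyGraph()
for func in (load_data, process_data, report):
    graph.add(func)

first = graph.run()              # runs all three steps
second = graph.run(from_step=2)  # reruns only `report`, reusing the cached mean
print(first, second, calls)
```

In this sketch the second call never touches `load_data` or `process_data`, which is the property that makes it practical to iterate on a late step of an expensive pipeline.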
A key feature is the ability to “freeze” the execution environment (including Python and library versions) with a single command:
# Freeze the environment in one go
graph.freeze_environment()
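For a rough idea of what freezing an environment involves, here is a minimal sketch that records the Python version and installed package versions using only the standard library. The function name `snapshot_environment` and the shape of the resulting dictionary are assumptions; what `graph.freeze_environment()` actually captures may differ.

```python
import json
import platform
from importlib import metadata


def snapshot_environment() -> dict:
    """Collect the interpreter version and installed distributions."""
    packages = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            # Skip distributions whose metadata cannot be read.
            continue
        if name:
            packages[name] = version
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "packages": dict(sorted(packages.items())),
    }


snapshot = snapshot_environment()
lockfile_text = json.dumps(snapshot, indent=2)  # could be stored next to the graph
print(lockfile_text[:200])
```

A snapshot like this is enough to detect version drift between machines; reproducing the environment elsewhere would additionally need an installer step, which this sketch leaves out.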
A pocket-sized FaaS
One of the coolest features, in our opinion, is that you can quickly spin up your own mini-FaaS right on your machine. If your computer has internet access, your functions and graphs become instantly available to other users.
Right now, the SPL server supports:
- An HTTP API for remotely executing functions.
- Import and export of graphs in JSON format.
- Task coordination and distributed result caching.
- A simple web interface for viewing and editing graphs.
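As an illustration of the JSON import and export idea, here is a small sketch that serializes a linear graph to JSON and restores it. The schema (`nodes` with `id`, `function`, and `params`, plus `links` with `from` and `to`) and the helper names `export_graph` and `import_graph` are hypothetical; SPL's real format is not shown in this post.

```python
import json
from typing import Any, Dict, List, Tuple

Step = Tuple[str, Dict[str, Any]]  # (function name, parameters)


def export_graph(steps: List[Step]) -> str:
    """Serialize a linear chain of steps into a JSON document."""
    nodes = [
        {"id": i, "function": name, "params": params}
        for i, (name, params) in enumerate(steps)
    ]
    # In a linear chain, each node links to the one that follows it.
    links = [{"from": i, "to": i + 1} for i in range(len(steps) - 1)]
    return json.dumps({"nodes": nodes, "links": links}, indent=2)


def import_graph(text: str) -> List[Step]:
    """Rebuild the ordered list of steps from the JSON document."""
    doc = json.loads(text)
    ordered = sorted(doc["nodes"], key=lambda node: node["id"])
    return [(node["function"], node["params"]) for node in ordered]


steps: List[Step] = [
    ("load_data", {}),
    ("process_data", {"method": "mean"}),
    ("train_model", {"epochs": 10}),
    ("save_results", {}),
]

text = export_graph(steps)
restored = import_graph(text)
print(text)
```

Storing functions by name rather than by code keeps the document small and language-neutral, but it means the receiving side must already know how to resolve each name to something runnable.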
Possible use cases for SPL
- Local development: Build graphs and functions you can reuse across different projects without constantly copying code.
- Production usage: Keep business logic and infrastructure separate, with easy zero-downtime updates.
- Personal FaaS (including a function marketplace): Potentially publish your work for others (including monetization), delivering only results instead of the entire codebase.
- Visualizing business processes: The server supports graph rendering and displays input and output ports, which can be handy for high-level project management.
Why am I writing this post?
We’d really love to hear what you think:
- Have you faced similar challenges?
- Would such a tool be useful for you?
- What features would you like to see in a project like this?
We’re excited to discuss these ideas in the comments here or on Telegram: https://t.me/SPLime_io