So, you've come across deep search tools and are wondering if you can build your own. Well, you certainly can! Building one yourself might involve tasks like scraping web URLs, implementing your own RAG (Retrieval-Augmented Generation) system to find relevant information, and then summarizing it using an agent loop.
Luckily, the open-source community has already created tools that handle many of these steps. The purpose of this article is to demonstrate how easily we can spin up the necessary environment for such a tool using Nix. We'll focus not only on running the service, but also on development environment setup, making it easy to modify the tool if needed.
For demonstration purposes, I'll be using the gpt-researcher tool.
GitHub link: https://github.com/assafelovic/gpt-researcher
The basic setup for almost any Python project often feels trivial: install Python, install the dependencies from requirements.txt, and you're good to go. However, if you manage tens of these projects on your machine, you'll quickly realize you not only need to isolate your project environments (which .venv solves) but ideally also isolate the Python installations themselves.
Enter Nix flakes!
Why Use Nix for Your Development Environment?
Nix is a powerful package manager that helps ensure your development environment is reproducible and isolated. By defining your dependencies (like Python and other tools) in a Nix flake, anyone (including your future self!) can recreate the exact same environment with a single command, avoiding "it works on my machine" problems.
To get a better grasp of what Nix is, you can check out my beginner's article.
Nix flake
Once you've pulled the repo, the only thing you need to do (besides configuring .env with your own API keys) is to set up the dev environment using a Nix flake. Here's what my flake.nix file looks like:
{
  inputs = {
    nixpkgs.url = "nixpkgs";
    flake-utils.url = "github:numtide/flake-utils";
  };
  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = import nixpkgs {
          inherit system;
          config.allowUnfree = true;
        };
      in {
        devShells.default = pkgs.mkShell {
          buildInputs = with pkgs; [
            git
            python311
            python311Packages.pip
            python311Packages.virtualenv
          ];
          shellHook = ''
            # Create and activate virtual environment if it doesn't exist
            VENV=.venv
            REQUIREMENTS_HASH=".venv/.requirements.hash"
            # Calculate new hash of requirements file
            NEW_HASH=""
            if [ -f "requirements.txt" ]; then
              NEW_HASH=$(sha256sum requirements.txt)
            fi
            if [ ! -d "$VENV" ]; then
              echo "Creating virtual environment..."
              python -m venv "$VENV"
              source "$VENV/bin/activate"
              # Initial pip upgrade and requirements installation
              pip install --upgrade pip
              if [ -f "requirements.txt" ]; then
                echo "Installing requirements from requirements.txt..."
                pip install -r requirements.txt
                echo "$NEW_HASH" > "$REQUIREMENTS_HASH"
              fi
              echo "Virtual environment setup complete!"
            else
              source "$VENV/bin/activate"
              # Check if requirements have changed
              if [ -f "requirements.txt" ]; then
                if [ ! -f "$REQUIREMENTS_HASH" ] || [ "$NEW_HASH" != "$(cat $REQUIREMENTS_HASH)" ]; then
                  echo "Requirements have changed. Updating..."
                  pip install --upgrade pip
                  pip install -r requirements.txt
                  echo "$NEW_HASH" > "$REQUIREMENTS_HASH"
                fi
              fi
            fi
            # Print Python version
            echo "Using Python: $(python --version)"
          '';
        };
      });
}
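The change-detection part of the hook can be tried on its own, outside Nix. Below is a minimal sketch of that idea, assuming sha256sum is on your PATH. The helper names needs_install and record_install are hypothetical and do not appear in the flake above, and the demo never calls pip.

```shell
# Sketch of the hash-based change detection used in the shellHook.
# needs_install and record_install are illustrative names, not part of the flake.

# Succeeds (exit 0) when the requirements file should be (re)installed.
needs_install() {
  local req_file="$1" hash_file="$2"
  [ -f "$req_file" ] || return 1     # no requirements file: nothing to install
  [ -f "$hash_file" ] || return 0    # no stored hash: never installed before
  [ "$(sha256sum "$req_file")" != "$(cat "$hash_file")" ]
}

# Stores the current hash; call this after a successful install.
record_install() {
  sha256sum "$1" > "$2"
}

# Demo in a throwaway directory.
demo_dir=$(mktemp -d)
echo "requests" > "$demo_dir/requirements.txt"
needs_install "$demo_dir/requirements.txt" "$demo_dir/.hash" && echo "first run: install"
record_install "$demo_dir/requirements.txt" "$demo_dir/.hash"
needs_install "$demo_dir/requirements.txt" "$demo_dir/.hash" || echo "unchanged: skip"
echo "httpx" >> "$demo_dir/requirements.txt"
needs_install "$demo_dir/requirements.txt" "$demo_dir/.hash" && echo "changed: reinstall"
```

Storing the hash inside .venv, as the flake does, has a handy side effect: deleting the virtual environment also discards the stored hash, so the next shell entry reinstalls everything.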
Using the Environment
Once you have the flake.nix file in your project's root directory (assuming Nix with flakes is installed), the only command you need to run is nix develop. Running this command will drop you into an isolated development shell.
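If you'd rather run a single command in the environment than open an interactive shell, a small wrapper can help. The sketch below defines a hypothetical helper, in_dev_shell, which is my own illustration and not part of gpt-researcher or Nix. It assumes you call it from the project root, and it passes the experimental-features flag explicitly, which only matters if flakes are not already enabled in your Nix configuration.

```shell
# Hypothetical helper: run a command inside the flake's dev shell,
# failing early with a readable message when a prerequisite is missing.
in_dev_shell() {
  if [ ! -f flake.nix ]; then
    echo "flake.nix not found in $(pwd)" >&2
    return 2
  fi
  if ! command -v nix >/dev/null 2>&1; then
    echo "nix is not installed or not on PATH" >&2
    return 3
  fi
  # 'nix develop --command' runs the given command in the dev environment
  # instead of starting an interactive shell.
  nix --extra-experimental-features 'nix-command flakes' develop --command "$@"
}
```

For example, in_dev_shell python --version runs python --version inside the flake's environment and then returns you to your regular shell.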
Within this shell, you'll have access to the tools defined in the flake, such as the specific Python version requested. Additionally, the custom shellHook I included in the flake automates the management of your Python dependencies. This hook automatically creates and activates a standard Python virtual environment (.venv). More importantly, it checks your requirements.txt file and automatically installs or updates your Python packages using pip whenever the contents of requirements.txt change. And everything is isolated to this specific repo, without any leaks to the outside world!
The result is a fully prepared, self-contained development environment that keeps your project dependencies neatly isolated from the rest of your system, and also isolates the core tools like the Python interpreter itself, a key advantage over managing global installations or relying solely on .venv.
Why not Docker?
Now, why not Docker? While Docker is awesome for deploying things, for active development where you're constantly tweaking code, it can feel a bit heavyweight. Jumping in and out of container builds or dealing with bind mounts just isn't as immediate as simply entering an interactive shell. That's it.
Bonus: Comparing Perplexity vs. local tool reports
So, I put a short summary request for "LLM history, roots, most impactful studies (scan all starting with 20th century)" to both Perplexity and the local tool we've been discussing. My goal was to show that for basic deep research, you don't necessarily need to pay for a service; you can configure your own using free tiers like Gemini for LLM tasks and a free search API like Serper, which the local tool uses to search the web.
To get an unbiased comparison of the results, I asked a third LLM to analyze the style and substance of the two reports. Here's a summary of its findings:
While Article 2 (Perplexity) is more concise and its table of "Most Impactful Studies" is a valuable quick reference, Article 1 (Local tool) provides a richer, more detailed, and better-structured historical account that allows for a deeper understanding of the evolution of LLMs.
And you know what?
You can always tweak the local tool's implementation to fit your own needs! (For example, I've adjusted it to narrow down the set of resources I want it to look into.) I mean, it's programming, you do whatever you want! And with Nix, preparing your dev setup is as simple as running a single command: nix develop!