I just open-sourced CodeBeaver, a tool I built after LLM-generated code kept sneaking weird bugs into my projects.

With just a few lines of YAML, CodeBeaver can:

  • Run end-to-end (E2E) tests written in natural language
  • Generate, maintain, and execute unit tests automatically
  • Analyze test failures to determine whether each one is a real bug or just a flaky test
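
To make the bug-vs-flaky distinction in the last bullet concrete, here is a minimal sketch of one common heuristic: rerun the failing test a few times and see whether it ever passes. This is only an illustration and not how CodeBeaver itself does the analysis; `classify_failure`, `reruns`, and the two example tests are hypothetical names made up for this sketch.

```python
from typing import Callable


def classify_failure(test: Callable[[], None], reruns: int = 5) -> str:
    """Rerun a test that already failed once and label the failure.

    Hypothetical helper for illustration only: if any rerun passes, the
    failure is treated as flaky; if every rerun fails, as a likely bug.
    """
    passes = 0
    for _ in range(reruns):
        try:
            test()
            passes += 1
        except Exception:
            # AssertionError and other test errors count as a failed run.
            pass
    return "flaky" if passes > 0 else "likely bug"


def always_fails() -> None:
    # Deterministic failure: fails on every run.
    assert 1 + 1 == 3


_calls = {"n": 0}


def fails_every_other_run() -> None:
    # Simulated flakiness: fails on odd-numbered calls, passes on even ones.
    _calls["n"] += 1
    assert _calls["n"] % 2 == 0


print(classify_failure(always_fails))
print(classify_failure(fails_every_other_run))
```

Rerunning only catches nondeterministic failures. A test that fails consistently because of a broken environment would still be labeled "likely bug", so a real tool would need more context than this.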

You can run it locally with a quick pip install, or integrate it into CI/CD with GitHub Actions, where it will even open PRs with missing tests.

It's basically vibe testing :D

We use BrowserUse for the E2E tests, o3-mini for unit test generation, and a bunch of shell scripts to tie everything together.

Currently supports Python & TypeScript, with more languages on the roadmap. Would love to hear your thoughts!

You can check it out here on GitHub