Testing

Testing your plugin

Plugins are pure Python — every existing testing pattern applies. The three levels of confidence: unit-test the handlers in isolation, integration-test through a real PluginContext, and manual-test against a running gateway with a real agent.

Three levels of testing

Unit — test handler functions directly. Fastest, narrowest scope. No Flowly imports needed.
Integration — boot a real PluginContext, register your plugin against an isolated ToolRegistry + HookRegistry, then dispatch tool calls or fire hooks programmatically. Catches wiring bugs.
Manual end-to-end — install in a real Flowly profile, restart the gateway, drive the agent from the desktop chat. Catches real-world UX bugs that unit/integration miss.

Most plugins only need unit + manual. Integration tests pay off once the plugin becomes complex enough that boot-time wiring is non-trivial.

Unit tests for handlers

Your handlers are just Python functions. Test them like any other function — no Flowly machinery needed.

python

# tests/test_my_plugin_handlers.py
import pytest
from pathlib import Path

# Import your plugin handlers directly
from my_plugin import (
    weather_handler,
    slash_handler,
    _extract_paths_from_exec,
)


@pytest.mark.asyncio
async def test_weather_handler_default_city(monkeypatch):
    # Mock the network call
    async def fake_get(url, **kwargs):
        class _R:
            text = "Berlin: ⛅️ +14°C"
            def raise_for_status(self): pass
        return _R()
    monkeypatch.setattr("httpx.AsyncClient.get", fake_get)

    result = await weather_handler("Berlin")
    assert "+14°C" in result


def test_slash_help_returned_on_empty_args():
    out = slash_handler("")
    assert out.startswith("/auto-commit")  # help text starts here


def test_slash_status_with_no_config(tmp_path, monkeypatch):
    monkeypatch.setenv("FLOWLY_HOME", str(tmp_path))
    monkeypatch.delenv("FLOWLY_AUTO_COMMIT_PATHS", raising=False)
    out = slash_handler("status")
    assert "DISABLED" in out


def test_extract_paths_from_heredoc():
    cmd = "cat > /tmp/file.py << 'EOF'\nprint('x')\nEOF\n"
    paths = _extract_paths_from_exec({"command": cmd})
    assert "/tmp/file.py" in paths

Tip

Write tests directly against the parsing/scrape helpers (_extract_paths_from_exec, _was_recently_written). They have the most regression risk when you tweak the implementation, and they're the cheapest to test.

Integration tests with PluginContext

When you want to verify that register(ctx) wires everything up correctly, build a minimal context and run it:

python

# tests/test_my_plugin_integration.py
import pytest

from flowly.agent.hooks import HookRegistry, ToolHookContext
from flowly.agent.tools.registry import ToolRegistry
from flowly.plugins import PluginManager


@pytest.fixture
def loaded_plugin(tmp_path, monkeypatch):
    """Boot your plugin against a fresh registry inside an isolated home."""
    monkeypatch.setenv("FLOWLY_HOME", str(tmp_path / "home"))

    tools = ToolRegistry()
    hooks = HookRegistry()
    mgr = PluginManager(tool_registry=tools, hook_registry=hooks)
    mgr.discover_and_load(
        enabled={"my-plugin"},
        disabled=set(),
    )
    return tools, hooks, mgr


def test_my_plugin_registers_expected_surfaces(loaded_plugin):
    tools, hooks, mgr = loaded_plugin
    info = next(p for p in mgr.list_plugins() if p["key"] == "my-plugin")

    assert info["enabled"] is True
    assert "my_tool" in info["tools"]
    assert "post_tool_call" in info["hooks"]
    assert "my-cmd" in info["commands"]


@pytest.mark.asyncio
async def test_hook_runs_on_tool_call(loaded_plugin):
    tools, hooks, _ = loaded_plugin

    # Fire a synthetic post_tool_call event
    ctx = ToolHookContext(
        tool_name="write_file",
        params={"path": "/some/path"},
        success=True,
    )
    await hooks.fire("post_tool_call", ctx)

    # Assert your plugin's side effects (e.g. log file written, state updated)
    # ...

The pattern: build a fresh, isolated PluginManager, point it at a temp FLOWLY_HOME, force-enable your plugin, then drive hook/tool calls programmatically.

Reference

The Flowly source includes integration tests at tests/test_disk_cleanup_plugin.py showing this pattern end-to-end.

Manual end-to-end test recipe

The most reliable way to catch real-world bugs:

bash

# 1. Install fresh
flowly plugins remove my-plugin --yes 2>/dev/null
flowly plugins install /path/to/my-plugin
flowly plugins enable my-plugin

# 2. Restart the gateway with debug logging
LOGURU_LEVEL=DEBUG flowly gateway --port 18790

# 3. In another terminal, watch your plugin's audit log
tail -f ~/.flowly/my-plugin/log.jsonl

# 4. Drive the agent from desktop chat / Telegram / web
#    — test the slash commands manually
#    — ask the agent to do something that triggers your hook
#    — observe the audit log for hook_fired events

# 5. Inspect persistent state
cat ~/.flowly/my-plugin/config.json

Checklist

flowly plugins list shows your plugin as enabled after gateway start
Slash commands respond — including help, status, and error paths
The hook fires for the expected tools (verify via hook_fired audit events)
Skip filters log a reason — never silently return
Successful actions write a clearly-named audit event
Plugin survives a misconfigured environment (missing env var, missing files)

Useful fixtures

Patterns we've found valuable across plugin test suites:

python

@pytest.fixture
def isolated_flowly_home(tmp_path, monkeypatch):
    """Each test gets its own pristine $FLOWLY_HOME."""
    home = tmp_path / "home"
    home.mkdir()
    monkeypatch.setenv("FLOWLY_HOME", str(home))
    return home


@pytest.fixture
def temp_git_repo(tmp_path):
    """A throwaway git repo for plugins that touch git."""
    import subprocess
    repo = tmp_path / "repo"
    repo.mkdir()
    for cmd in [
        ["git", "init"],
        ["git", "config", "user.email", "t@local"],
        ["git", "config", "user.name", "test"],
    ]:
        subprocess.run(cmd, cwd=repo, check=True, capture_output=True)
    return repo


@pytest.fixture
def fresh_audit_log(isolated_flowly_home, monkeypatch):
    """Returns a Path to your plugin's log.jsonl — guaranteed empty."""
    log = isolated_flowly_home / "my-plugin" / "log.jsonl"
    yield log
    # Optionally print contents on test failure for debugging
    if log.exists():
        print("Audit log contents:")
        print(log.read_text())

Tip

Combine isolated_flowly_home + temp_git_repo + a real PluginManager for hermetic integration tests that mirror actual production behavior without touching your real ~/.flowly/.