Commit Graph

109 Commits

Author SHA1 Message Date
Julia McGhee
620fbc6b83 Add MCP servers (Gitea, K8s, Postgres, filesystem, git) to harness agents
Some checks failed
CI / lint-and-test (push) Successful in 36s
Deploy Production / deploy (push) Failing after 40s
CI / build (push) Failing after 59s
Wire 5 MCP servers into Claude Code agents spawned by the harness:
- Gitea MCP for repo/issue/PR management on self-hosted Gitea
- Kubernetes MCP with read-only RBAC for cluster inspection
- Postgres MCP with read-only user for database queries
- Filesystem and Git MCP scoped to task worktrees

Generates .claude/settings.json in each worktree before the agent is spawned.
Skips this step for Codex/OpenCode runtimes, which have no MCP support.

Also fixes node-pty build failure by using local Node.js headers
instead of downloading from unofficial-builds.nodejs.org (ECONNRESET).
2026-03-21 20:55:19 +00:00
Julia McGhee
a5ef56b052 Fix input focus loss in NewTaskTab form fields
Some checks failed
CI / lint-and-test (push) Successful in 32s
Deploy Production / deploy (push) Failing after 2m49s
CI / build (push) Failing after 3m32s
Move Field component out of NewTaskTab to prevent React from
remounting input wrappers on every keystroke. Same root cause as
the ProjectsTab DetailView fix.
2026-03-21 20:46:06 +00:00
Julia McGhee
af090b1de2 Fix input focus loss when creating a project
Some checks failed
CI / lint-and-test (push) Successful in 36s
CI / build (push) Has been cancelled
Deploy Production / deploy (push) Has been cancelled
DetailView was defined as a component inside ProjectsTab's render,
causing React to unmount/remount it on every keystroke. Replace with
inline JSX so the input element identity stays stable across renders.
2026-03-21 20:45:12 +00:00
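The root cause named in this commit (and in the NewTaskTab fix) can be illustrated without React itself: a function declared inside another function is a brand-new value on every call, and React treats a changed component type as a different component, so it unmounts the old subtree and mounts a new one. The sketch below only demonstrates the identity problem; the names `renderWithInnerComponent`, `renderWithStableComponent`, and `StableDetailView` are hypothetical and do not come from the harness source.

```typescript
// Hypothetical stand-ins; none of these names come from the harness source.
type Component = () => string;

// Anti-pattern: the component function is re-created on every render call.
function renderWithInnerComponent(): Component {
  const DetailView: Component = () => "<input />";
  return DetailView;
}

// Fix: declare the component once at module scope so its identity is stable.
const StableDetailView: Component = () => "<input />";
function renderWithStableComponent(): Component {
  return StableDetailView;
}

// A changed function identity is what makes React discard the old subtree
// (including the focused input) instead of updating it in place.
const innerIdentityStable =
  renderWithInnerComponent() === renderWithInnerComponent();
const outerIdentityStable =
  renderWithStableComponent() === renderWithStableComponent();

console.log({ innerIdentityStable, outerIdentityStable });
```

Both remedies described in these two commits (inlining the JSX, or moving the component out of the parent) avoid creating a new component type per render.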
Julia McGhee
7bb091d4b3 Add interactive PTY Chat tab with xterm.js terminal emulator
Some checks failed
CI / lint-and-test (push) Successful in 33s
CI / build (push) Has been cancelled
Deploy Production / deploy (push) Has been cancelled
Browser-based interactive terminal sessions with agent CLIs via
WebSocket + node-pty. Supports full TUI rendering (colors, cursor,
ctrl-c) through xterm.js in the browser.

Architecture: xterm.js ←WebSocket→ pty-server.js ←PTY→ agent CLI

- Extract shared buildAgentEnv() from executor into agent-env.ts
- Add internal /api/agents/[id]/env endpoint for PTY server
- Add pty-server.js (WebSocket + node-pty, max 3 sessions, 2hr cleanup)
- Add custom server.js wrapping Next.js with WebSocket upgrade
- Add ChatTab component with agent selector and terminal
- Wire CHAT tab into dashboard nav and render
- Configure serverExternalPackages for node-pty
- Update Dockerfile with build tools and custom server
- Bump k8s memory limit 1Gi → 2Gi for PTY sessions
2026-03-21 20:43:07 +00:00
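The "max 3 sessions, 2hr cleanup" rule in the commit above could be enforced with a small registry like the one sketched here. This is an assumption-laden illustration, not the contents of pty-server.js: the class and field names are hypothetical, the PTY and WebSocket are reduced to a `close` callback, and the 2-hour limit is interpreted as an idle timeout (the commit does not say whether it is idle time or total lifetime).

```typescript
// Limits taken from the commit message; all names below are hypothetical.
const MAX_SESSIONS = 3;
const SESSION_TTL_MS = 2 * 60 * 60 * 1000; // 2 hours

interface PtySession {
  id: string;
  agentId: string;
  lastActiveAt: number; // epoch milliseconds
  close: () => void; // would kill the PTY and close the WebSocket
}

class SessionRegistry {
  private readonly sessions = new Map<string, PtySession>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Evicts stale sessions first, then returns null if the cap is still reached.
  open(id: string, agentId: string, close: () => void): PtySession | null {
    this.sweep();
    if (this.sessions.size >= MAX_SESSIONS) return null;
    const session: PtySession = { id, agentId, lastActiveAt: this.now(), close };
    this.sessions.set(id, session);
    return session;
  }

  // Call on every input/output event to keep an active session alive.
  touch(id: string): void {
    const session = this.sessions.get(id);
    if (session) session.lastActiveAt = this.now();
  }

  // Closes and removes sessions idle longer than the TTL; returns the count.
  sweep(): number {
    let removed = 0;
    this.sessions.forEach((session, id) => {
      if (this.now() - session.lastActiveAt > SESSION_TTL_MS) {
        session.close();
        this.sessions.delete(id);
        removed++;
      }
    });
    return removed;
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

In a real server, `sweep()` would likely also run on a timer so that idle sessions are reclaimed even when nobody tries to open a new one.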
gitea-actions[bot]
f45fa64855 deploy: update production images to ff0573703f
2026-03-21 20:36:48 +00:00
Julia McGhee
ff0573703f Fix harness Dockerfile standalone paths for monorepo workspace build
All checks were successful
CI / lint-and-test (push) Successful in 29s
Deploy Production / deploy (push) Successful in 1m9s
CI / build (push) Successful in 1m42s
Next.js standalone output nests server.js under apps/harness/ when
built from a pnpm workspace. Preserve the directory structure and
update CMD to point to the correct server.js path.
2026-03-21 20:35:09 +00:00
Julia McGhee
a687652bcd Add Gitea as a git provider for harness workspace repositories
Some checks failed
CI / lint-and-test (push) Successful in 30s
CI / build (push) Has been cancelled
Deploy Production / deploy (push) Has been cancelled
Support Gitea alongside GitHub/GitLab for repo search, authenticated
cloning, and pull request creation via Gitea API. Tasks can specify
gitProvider and gitBaseUrl in their spec (defaults to github for
backward compatibility). Auto-discovers GITEA_TOKEN from env on boot.
2026-03-21 20:33:35 +00:00
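The defaulting behaviour described in the commit above (a task spec with optional gitProvider and gitBaseUrl, falling back to github) might look roughly like this. Only the two field names come from the commit; the `TaskSpec` shape, the `repo` field, the default hosts, and the helper names are assumptions made for illustration, and token handling is deliberately left out.

```typescript
type GitProvider = "github" | "gitlab" | "gitea";

// Hypothetical spec shape; only gitProvider and gitBaseUrl are named in the commit.
interface TaskSpec {
  repo: string; // "owner/name"
  gitProvider?: GitProvider;
  gitBaseUrl?: string;
}

interface ResolvedGit {
  provider: GitProvider;
  baseUrl: string;
}

// Assumed public defaults; a self-hosted Gitea has no sensible default host.
const DEFAULT_BASE_URLS: Partial<Record<GitProvider, string>> = {
  github: "https://github.com",
  gitlab: "https://gitlab.com",
};

function resolveGit(spec: TaskSpec): ResolvedGit {
  // Older specs carry no provider, so they keep resolving to GitHub.
  const provider = spec.gitProvider ?? "github";
  const baseUrl = spec.gitBaseUrl ?? DEFAULT_BASE_URLS[provider];
  if (!baseUrl) {
    throw new Error(`gitBaseUrl is required for provider "${provider}"`);
  }
  // Strip trailing slashes so URL joins stay predictable.
  return { provider, baseUrl: baseUrl.replace(/\/+$/, "") };
}

function cloneUrl(spec: TaskSpec): string {
  const { baseUrl } = resolveGit(spec);
  return `${baseUrl}/${spec.repo}.git`;
}
```

Making both fields optional is what keeps existing specs valid: nothing written before the Gitea support needs to change.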
gitea-actions[bot]
11192da432 deploy: update production images to e2b339aac8
2026-03-21 20:28:54 +00:00
Julia McGhee
e2b339aac8 Auto-discover OpenCode Zen and Go models, add catalog search and pagination
All checks were successful
CI / lint-and-test (push) Successful in 29s
Deploy Production / deploy (push) Successful in 3m40s
CI / build (push) Successful in 1m34s
Add model fetchers for OpenCode Zen (https://opencode.ai/zen/v1/models) and
Go (https://opencode.ai/zen/go/v1/models) APIs. Register opencode-go as a new
provider, load shared credentials from auth.json, add known models with pricing,
and create default agents for both tiers on first boot.

Replace the manual "Add Model" form with a search bar that filters by model
name/ID and paginate the catalog at 25 models per page.
2026-03-21 20:24:38 +00:00
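The search-and-paginate behaviour described above (filter by model name/ID, 25 models per page) can be sketched as a pure function. The `CatalogModel` and `CatalogPage` shapes and the function name are hypothetical; case-insensitive substring matching and clamping of out-of-range page numbers are assumptions, since the commit does not specify either.

```typescript
// Hypothetical shapes; the commit only states "name/ID" and "25 models per page".
interface CatalogModel {
  id: string;
  name: string;
}

interface CatalogPage {
  items: CatalogModel[];
  page: number; // 1-based, clamped to a valid page
  pageCount: number;
  total: number; // number of matches across all pages
}

const PAGE_SIZE = 25;

function searchCatalog(models: CatalogModel[], query: string, page: number): CatalogPage {
  const needle = query.trim().toLowerCase();
  // An empty query matches everything; otherwise match on name or ID.
  const matches =
    needle === ""
      ? models
      : models.filter(
          (m) =>
            m.name.toLowerCase().includes(needle) ||
            m.id.toLowerCase().includes(needle),
        );
  const pageCount = Math.max(1, Math.ceil(matches.length / PAGE_SIZE));
  const current = Math.min(Math.max(1, page), pageCount);
  const start = (current - 1) * PAGE_SIZE;
  return {
    items: matches.slice(start, start + PAGE_SIZE),
    page: current,
    pageCount,
    total: matches.length,
  };
}
```

Filtering before paginating means the page count always reflects the current search, so a narrowed query never leaves the user on an empty page.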
Julia McGhee
f0d9482bc8 Fix Docker build for harness workspace dependency on @homelab/db
Some checks failed
CI / lint-and-test (push) Successful in 29s
CI / build (push) Has been cancelled
Deploy Production / deploy (push) Has been cancelled
Switch harness Dockerfile to pnpm with repo root build context so
workspace:^ dependency on @homelab/db resolves. Use .dockercontext
marker file to opt individual apps into root context builds while
keeping web/api on their local app context.
2026-03-21 20:24:02 +00:00
Julia McGhee
3fe75a8e04 Migrate harness from in-memory stores to CloudNativePG
Some checks failed
CI / lint-and-test (push) Successful in 22s
Deploy Production / deploy (push) Failing after 21s
CI / build (push) Failing after 1m51s
Replace all in-memory Map-backed stores (credentials, models, agents,
tasks, iterations, usage) with Drizzle ORM queries against the
homelab-pg PostgreSQL cluster. All store functions are now async.

- Add 6 harness_* tables to @homelab/db schema
- Generate and apply initial Drizzle migration
- Add lazy DB connection proxy to avoid build-time errors
- Wire DATABASE_URL from sealed secret into harness deployment
- Update all API routes, orchestrator, executor, and boot to await
  async store operations
2026-03-21 20:17:08 +00:00
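The "lazy DB connection proxy" bullet above refers to a common pattern: export an object that looks like the database client but only connects on first use, so that merely importing the module during a build does not require DATABASE_URL. The sketch below shows that general technique with a stand-in client; `DbClient`, `connect`, and `lazy` are hypothetical names, and the real harness uses Drizzle ORM rather than this toy interface.

```typescript
// Hypothetical minimal client shape; the real harness uses Drizzle ORM.
interface DbClient {
  query(sql: string): string;
}

let connectCount = 0;

// Stand-in for the real connection factory, which would read DATABASE_URL.
function connect(): DbClient {
  connectCount++;
  return { query: (sql) => `ran: ${sql}` };
}

// Wraps a factory in a Proxy that defers construction until first property access.
function lazy<T extends object>(factory: () => T): T {
  let instance: T | undefined;
  return new Proxy({} as T, {
    get(_target, prop) {
      if (instance === undefined) instance = factory();
      const value = Reflect.get(instance, prop);
      // Bind methods so `this` refers to the real client, not the proxy target.
      return typeof value === "function" ? value.bind(instance) : value;
    },
  });
}

// Creating (or importing) `db` does not connect; the first call does.
const db = lazy(connect);
```

Because the factory runs at most once, later calls reuse the same underlying client.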
gitea-actions[bot]
df351439d6 deploy: update production images to a60754d5a2
2026-03-21 20:00:58 +00:00
Julia McGhee
a60754d5a2 Fix boot state sharing across Next.js module boundaries
All checks were successful
CI / lint-and-test (push) Successful in 29s
Deploy Production / deploy (push) Successful in 47s
CI / build (push) Successful in 1m16s
Use globalThis for all in-memory stores (credentials, models, agents,
tasks) so the instrumentation hook and API route handlers share the
same data. Next.js bundles these as separate chunks with independent
module instances, causing boot-populated state to be invisible to
API routes.
2026-03-21 19:59:41 +00:00
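The globalThis approach described above can be sketched as follows. Module-level variables are duplicated when a bundler emits the same module into separate chunks, but every chunk in one process sees the same `globalThis`, so a store parked there is shared. The key `__harness_credentials__`, the `Credential` shape, and the function names are hypothetical; `bootHook` and `apiRoute` merely stand in for the instrumentation hook and a route handler.

```typescript
// Hypothetical shape and key; any process-unique string would do.
interface Credential {
  provider: string;
  token: string;
}

const STORE_KEY = "__harness_credentials__";

// Returns the single process-wide store, creating it on first access.
function getCredentialStore(): Map<string, Credential> {
  const g = globalThis as unknown as Record<string, unknown>;
  let store = g[STORE_KEY] as Map<string, Credential> | undefined;
  if (!store) {
    store = new Map<string, Credential>();
    g[STORE_KEY] = store;
  }
  return store;
}

// Stand-in for the instrumentation hook that populates state at boot.
function bootHook(): void {
  getCredentialStore().set("claude", { provider: "claude", token: "placeholder" });
}

// Stand-in for an API route handler living in a different chunk.
function apiRoute(): string[] {
  return Array.from(getCredentialStore().keys());
}
```

In a real bundle each chunk would carry its own copy of `getCredentialStore`, yet all copies resolve to the same Map because the lookup goes through `globalThis` rather than a module-scoped variable.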
gitea-actions[bot]
a079225367 deploy: update production images to 25b4769ff8
2026-03-21 19:54:48 +00:00
Julia McGhee
25b4769ff8 Auto-discover credentials, models, and agents on harness startup
All checks were successful
CI / lint-and-test (push) Successful in 20s
Deploy Production / deploy (push) Successful in 59s
CI / build (push) Successful in 1m13s
Read mounted secret files (Claude OAuth, OpenCode auth.json) and env
vars on boot, register them as credentials, fetch available models
from provider APIs, and create default agent configs for each viable
runtime+provider+model combination.
2026-03-21 19:53:29 +00:00
gitea-actions[bot]
e97614d568 deploy: update production images to df1111da15
2026-03-21 19:44:21 +00:00
Julia McGhee
df1111da15 Remove mock data from harness and add agent credential healthchecks
All checks were successful
CI / lint-and-test (push) Successful in 25s
Deploy Production / deploy (push) Successful in 59s
CI / build (push) Successful in 1m11s
Strip all seed/mock data (fake tasks, models, usage entries, agent configs)
so the dashboard starts clean and populates from real API state. Add
/api/agents/health endpoint that validates each agent's provider credentials
and CLI availability.
2026-03-21 19:42:53 +00:00
Julia McGhee
9a40240bd2 Enable ServerSideApply for app-of-apps to fix CRD annotation size limit
All checks were successful
CI / lint-and-test (push) Successful in 23s
Deploy Production / deploy (push) Successful in 25s
CI / build (push) Successful in 24s
ArgoCD v3.3 ApplicationSet CRD exceeds the 262144-byte client-side apply
annotation limit. ServerSideApply=true avoids this.
2026-03-21 19:33:24 +00:00
Julia McGhee
cfa9699926 Upgrade ArgoCD v2.13.3 → v3.3.4
Some checks failed
CI / lint-and-test (push) Successful in 28s
Deploy Production / deploy (push) Successful in 24s
CI / build (push) Has been cancelled
Stepped through v2.14.12 → v3.0.7 → v3.1.6 → v3.2.5 → v3.3.4.
Use server-side apply with force-conflicts for CRD size limits in v3.3+.
2026-03-21 19:32:09 +00:00
gitea-actions[bot]
28ec38bc59 deploy: update production images to fccf749598
2026-03-21 19:16:47 +00:00
Julia McGhee
fccf749598 Set Gitea deployment strategy to Recreate to avoid LevelDB lock conflicts
All checks were successful
CI / lint-and-test (push) Successful in 23s
Deploy Production / deploy (push) Successful in 15s
CI / build (push) Successful in 17s
2026-03-21 19:14:32 +00:00
Julia McGhee
0d7fa44577 Fix Gitea admin: use existing lazorgurl account and matching email
All checks were successful
CI / lint-and-test (push) Successful in 26s
CI / build (push) Successful in 22s
2026-03-21 19:06:41 +00:00
Julia McGhee
8eefb12c97 Fix Gitea admin init: set email explicitly to avoid default conflict
All checks were successful
CI / lint-and-test (push) Successful in 19s
CI / build (push) Successful in 16s
2026-03-21 19:05:32 +00:00
Julia McGhee
76cda86791 Fix Gitea upgrade: disable bundled valkey (renamed from redis in chart v12)
All checks were successful
CI / lint-and-test (push) Successful in 21s
CI / build (push) Successful in 23s
2026-03-21 19:03:20 +00:00
Julia McGhee
f7ffc91a4c Upgrade Gitea Helm chart 10.6.0 → 12.5.0 for workflow_dispatch UI
All checks were successful
CI / lint-and-test (push) Successful in 22s
CI / build (push) Successful in 21s
2026-03-21 19:00:58 +00:00
Julia McGhee
82225fa8c9 chore: trigger harness rebuild
All checks were successful
CI / lint-and-test (push) Successful in 23s
CI / build (push) Successful in 1m14s
2026-03-21 18:27:12 +00:00
Julia McGhee
3153f0eda5 chore: trigger web rebuild
All checks were successful
CI / lint-and-test (push) Successful in 19s
CI / build (push) Successful in 1m14s
2026-03-21 18:23:40 +00:00
Julia McGhee
3a15f6ed07 chore: trigger api rebuild
All checks were successful
CI / lint-and-test (push) Successful in 25s
CI / build (push) Successful in 29s
2026-03-21 18:21:11 +00:00
Julia McGhee
a525fc8aec chore: trigger full rebuild (7)
All checks were successful
CI / lint-and-test (push) Successful in 18s
CI / build (push) Successful in 2m11s
2026-03-21 18:13:19 +00:00
Julia McGhee
580c6dced7 Fix registry auth: use REGISTRY_TOKEN secret instead of gitea.token
All checks were successful
CI / lint-and-test (push) Successful in 18s
CI / build (push) Successful in 19s
2026-03-21 18:12:06 +00:00
Julia McGhee
d7f0931fa6 Fix harness: add ca-certificates, make opencode install non-fatal
Some checks failed
CI / lint-and-test (push) Successful in 17s
CI / build (push) Failing after 3m38s
2026-03-21 18:04:44 +00:00
Julia McGhee
adaff14c36 chore: trigger full rebuild (6)
Some checks failed
CI / lint-and-test (push) Successful in 37s
CI / build (push) Failing after 5m16s
2026-03-21 17:58:13 +00:00
Julia McGhee
1dd93aa5a3 Disable telemetry for turbo, next.js in runner image
Some checks failed
CI / lint-and-test (push) Failing after 0s
CI / build (push) Has been skipped
2026-03-21 17:54:10 +00:00
Julia McGhee
8958372716 Fix cache: combine export + pnpm install in single step
Some checks failed
CI / lint-and-test (push) Successful in 42s
CI / build (push) Has been cancelled
2026-03-21 17:53:16 +00:00
Julia McGhee
264e498657 chore: trigger full rebuild (5)
Some checks failed
CI / lint-and-test (push) Successful in 22s
CI / build (push) Has been cancelled
2026-03-21 17:52:17 +00:00
Julia McGhee
4bf4c0f639 Fix pnpm cache: use inline export instead of workflow env block
All checks were successful
CI / lint-and-test (push) Successful in 19s
CI / build (push) Successful in 19s
act_runner v0.3.0 doesn't propagate workflow-level or job-level
env: blocks to job containers. Use export in run commands instead.
First run warms cache, subsequent runs will show reused packages.
2026-03-21 17:51:14 +00:00
Julia McGhee
dc46f8c54a Move cache env vars to workflow level, remove debug step
All checks were successful
CI / lint-and-test (push) Successful in 35s
CI / build (push) Successful in 20s
Move PNPM_STORE_DIR and COREPACK_HOME to workflow-level env,
which may propagate differently from job-level env in act_runner.
2026-03-21 17:48:59 +00:00
Julia McGhee
1ef3383ba1 debug: check if pnpm-store volume is mounted in job container
All checks were successful
CI / lint-and-test (push) Successful in 43s
CI / build (push) Successful in 19s
2026-03-21 17:46:21 +00:00
Julia McGhee
2e32e02adb Add empty public dir for harness (required by Dockerfile)
Some checks failed
CI / lint-and-test (push) Successful in 44s
CI / build (push) Has been cancelled
2026-03-21 17:45:31 +00:00
Julia McGhee
5e86e56bed Add .dockerignore for harness to exclude node_modules from COPY
Some checks failed
CI / lint-and-test (push) Successful in 34s
CI / build (push) Failing after 1m12s
2026-03-21 17:43:10 +00:00
Julia McGhee
1e3d4bceaa Install opencode via curl installer (Go binary, not on npm)
Some checks failed
CI / lint-and-test (push) Successful in 23s
CI / build (push) Failing after 1m11s
2026-03-21 17:41:05 +00:00
Julia McGhee
188003f0e8 Fix harness Dockerfile: remove opencode (not on npm)
Some checks failed
CI / lint-and-test (push) Successful in 37s
CI / build (push) Has been cancelled
2026-03-21 17:39:32 +00:00
Julia McGhee
e672ca5d2d chore: trigger full rebuild (4)
Some checks failed
CI / lint-and-test (push) Successful in 44s
CI / build (push) Failing after 3m3s
2026-03-21 17:34:23 +00:00
Julia McGhee
0a8b65a496 Mount Docker socket into job containers for docker build
Some checks failed
CI / lint-and-test (push) Failing after 8s
CI / build (push) Has been skipped
Job containers need access to the DinD daemon for docker build/push.
Mount /var/run/docker.sock from DinD into job containers and set
docker_host in runner config.
2026-03-21 17:32:53 +00:00
Julia McGhee
0be7ad6dca chore: trigger full rebuild of all app images (3)
Some checks failed
CI / lint-and-test (push) Successful in 34s
CI / build (push) Failing after 45s
2026-03-21 17:30:47 +00:00
Julia McGhee
d8715e361f Fix CI matrix and pnpm cache: set env vars in workflow, drop matrix
All checks were successful
CI / lint-and-test (push) Successful in 27s
CI / build (push) Successful in 21s
- Set PNPM_STORE_DIR and COREPACK_HOME as job env vars instead of
  relying on container.options -e flags which act_runner may ignore
- Replace fragile cross-job matrix with single-job loop for builds
- Together these fix the empty matrix app name and the 0 reused packages
2026-03-21 17:29:26 +00:00
Julia McGhee
8ceea37976 chore: trigger full rebuild of all app images
Some checks failed
CI / changes (push) Successful in 16s
CI / lint-and-test (push) Successful in 43s
CI / build (push) Failing after 19s
2026-03-21 17:27:06 +00:00
Julia McGhee
64baf319fe Fix runner: use explicit register + daemon with --config flag
All checks were successful
CI / changes (push) Successful in 1s
CI / lint-and-test (push) Successful in 32s
CI / build (push) Has been skipped
The act_runner entrypoint ignores CONFIG_FILE for the daemon
command, so container.options (pnpm cache volume) never loads.
Use a custom command that registers manually then runs daemon
with --config explicitly.
2026-03-21 17:23:25 +00:00
Julia McGhee
d13bc9103a Fix CI changes detection: build JSON array without jq
All checks were successful
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Successful in 43s
CI / build (push) Has been skipped
2026-03-21 17:22:16 +00:00
Julia McGhee
3ef1cbd1bb chore: trigger initial image builds for Gitea registry
Some checks failed
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Successful in 43s
CI / build (push) Failing after 20s
2026-03-21 17:20:14 +00:00