Commit Graph

116 Commits

Author SHA1 Message Date
Julia McGhee
0a8b65a496 Mount Docker socket into job containers for docker build
Some checks failed
CI / lint-and-test (push) Failing after 8s
CI / build (push) Has been skipped
Job containers need access to the DinD daemon for docker build/push.
Mount /var/run/docker.sock from DinD into job containers and set
docker_host in runner config.
2026-03-21 17:32:53 +00:00
Julia McGhee
0be7ad6dca chore: trigger full rebuild of all app images (3)
Some checks failed
CI / lint-and-test (push) Successful in 34s
CI / build (push) Failing after 45s
2026-03-21 17:30:47 +00:00
Julia McGhee
d8715e361f Fix CI matrix and pnpm cache: set env vars in workflow, drop matrix
All checks were successful
CI / lint-and-test (push) Successful in 27s
CI / build (push) Successful in 21s
- Set PNPM_STORE_DIR and COREPACK_HOME as job env vars instead of
  relying on container.options -e flags which act_runner may ignore
- Replace fragile cross-job matrix with single-job loop for builds
- Fixes both symptoms: empty matrix app name and 0 reused packages
2026-03-21 17:29:26 +00:00
Julia McGhee
8ceea37976 chore: trigger full rebuild of all app images
Some checks failed
CI / changes (push) Successful in 16s
CI / lint-and-test (push) Successful in 43s
CI / build (push) Failing after 19s
2026-03-21 17:27:06 +00:00
Julia McGhee
64baf319fe Fix runner: use explicit register + daemon with --config flag
All checks were successful
CI / changes (push) Successful in 1s
CI / lint-and-test (push) Successful in 32s
CI / build (push) Has been skipped
The act_runner entrypoint ignores CONFIG_FILE for the daemon
command, so container.options (pnpm cache volume) never loads.
Use a custom command that registers manually then runs daemon
with --config explicitly.
2026-03-21 17:23:25 +00:00
Julia McGhee
d13bc9103a Fix CI changes detection: build JSON array without jq
All checks were successful
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Successful in 43s
CI / build (push) Has been skipped
2026-03-21 17:22:16 +00:00
Julia McGhee
3ef1cbd1bb chore: trigger initial image builds for Gitea registry
Some checks failed
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Successful in 43s
CI / build (push) Failing after 20s
2026-03-21 17:20:14 +00:00
Julia McGhee
e57f458058 Fix runner: use CONFIG_FILE env var instead of command override
All checks were successful
CI / changes (push) Successful in 14s
CI / lint-and-test (push) Successful in 37s
CI / build (push) Has been skipped
The command override bypasses the entrypoint that handles
registration. Use CONFIG_FILE env var which the entrypoint
respects, keeping the registration flow intact.
2026-03-21 17:14:30 +00:00
Julia McGhee
ab52874970 Fix pnpm cache: use explicit /pnpm-store path and env vars
Some checks are pending
CI / build (push) Blocked by required conditions
CI / changes (push) Successful in 15s
CI / lint-and-test (push) Successful in 21s
Mount volume at /pnpm-store and set PNPM_STORE_DIR and
COREPACK_HOME env vars in job containers so pnpm and corepack
both write to the cached volume. Corepack cache avoids
re-downloading pnpm binary each run.
2026-03-21 16:52:46 +00:00
Julia McGhee
b6bd2dbae0 Add workflow_dispatch trigger to deploy-production
All checks were successful
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Successful in 31s
CI / build (push) Has been skipped
Allows manual trigger to build all apps (or specific ones).
Empty input builds web, api, harness. Useful for initial
registry population after migration.
2026-03-21 16:49:47 +00:00
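The input handling described in b6bd2dbae0 can be sketched roughly as below. This is a minimal illustration, not the workflow's actual logic: the function name `resolve_apps` and the comma-separated input format are assumptions, and only the default set (web, api, harness) comes from the commit message.

```python
# Default set named in the commit message for b6bd2dbae0.
ALL_APPS = ["web", "api", "harness"]


def resolve_apps(raw_input: str) -> list[str]:
    """Return the apps to build for a manual trigger.

    Assumption: the dispatch input is a comma-separated list of app names.
    An empty or whitespace-only input selects every app.
    """
    requested = [name.strip() for name in raw_input.split(",") if name.strip()]
    if not requested:
        return list(ALL_APPS)
    unknown = [name for name in requested if name not in ALL_APPS]
    if unknown:
        # Fail early rather than silently skipping a mistyped app name.
        raise ValueError(f"unknown app(s): {', '.join(unknown)}")
    return requested
```

Rejecting unknown names is a design choice of this sketch; the commit does not say how the real workflow treats them.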
Julia McGhee
14cf33f57f Bake pnpm into runner image, fix config loading with --config flag
Some checks are pending
CI / build (push) Blocked by required conditions
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Successful in 27s
Deploy Production / deploy (push) Successful in 24s
Pre-install pnpm 9.15.4 via corepack in the image so it doesn't
download every run. Use --config CLI flag instead of CONFIG_FILE
env var to ensure container.options volume mount is applied.
2026-03-21 16:49:14 +00:00
Julia McGhee
65abed3426 Fix runner config: timeout needs duration string not int
All checks were successful
CI / changes (push) Successful in 10s
CI / lint-and-test (push) Successful in 51s
CI / build (push) Has been skipped
Deploy Production / deploy (push) Successful in 22s
2026-03-21 16:43:50 +00:00
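The distinction behind 65abed3426 (a duration string such as `3h` rather than a bare integer) can be illustrated with a small validator. This is a simplified sketch, assuming the runner's `timeout` field follows Go-style duration syntax; `parse_duration` is a hypothetical helper, and it skips parts of that syntax such as signs and a unitless `0`.

```python
import re

# Seconds per unit for Go-style duration suffixes.
_UNIT_SECONDS = {
    "ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
    "s": 1.0, "m": 60.0, "h": 3600.0,
}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_duration(value) -> float:
    """Convert a duration string like '3h' or '1m30s' to seconds.

    Raises TypeError for non-strings (for example a bare YAML integer)
    and ValueError for strings that are not number+unit sequences.
    """
    if not isinstance(value, str):
        raise TypeError(f"duration must be a string such as '3h', got {value!r}")
    pos, total = 0, 0.0
    while pos < len(value):
        match = _TOKEN.match(value, pos)
        if match is None:
            raise ValueError(f"invalid duration: {value!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0:
        raise ValueError("empty duration")
    return total
```

Under this sketch, `parse_duration("3h")` is accepted while `parse_duration(3)` raises a `TypeError`, which mirrors the kind of config error the commit subject describes.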
Julia McGhee
eced4c1473 Add pnpm store cache to runner via persistent Docker volume
Some checks failed
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Successful in 49s
Deploy Production / deploy (push) Failing after 20s
CI / build (push) Has been skipped
Mount a named Docker volume (pnpm-store) into every job container
at the default pnpm store path. The volume persists in the DinD
sidecar across job runs, so pnpm install reuses cached packages.
2026-03-21 16:41:37 +00:00
Julia McGhee
98ab851b60 Use custom runner image with jq, kustomize, docker pre-installed
Some checks failed
CI / changes (push) Successful in 1s
Deploy Production / deploy (push) Failing after 26s
CI / build (push) Has been skipped
CI / lint-and-test (push) Successful in 35s
Build a runner-image based on node:20-bookworm with all CI tools
baked in, avoiding apt-get install in every workflow run. Runner
labels now point to gitea.coreworlds.io/lazorgurl/runner-image.
2026-03-21 16:39:34 +00:00
Julia McGhee
eb8e090283 Fix kustomize install: download binary, not apt package
Some checks failed
CI / changes (push) Successful in 5s
CI / lint-and-test (push) Successful in 23s
Deploy Production / deploy (push) Failing after 2m35s
CI / build (push) Has been skipped
kustomize isn't in Debian repos. Download from GitHub releases.
2026-03-21 16:36:06 +00:00
Julia McGhee
22488d5bf5 Fix CI: install jq/kustomize, fetch-depth 2 for deploy-production
Some checks failed
CI / changes (push) Successful in 5s
CI / lint-and-test (push) Successful in 41s
Deploy Production / deploy (push) Failing after 5s
CI / build (push) Has been skipped
node:20-bookworm doesn't include jq or kustomize. Also need
fetch-depth: 2 so HEAD~1 exists for turbo's change detection.
2026-03-21 16:34:22 +00:00
Julia McGhee
0b69d6c6f4 Simplify workflows: drop setup-node/pnpm-action, use corepack
Some checks failed
CI / changes (push) Successful in 1s
CI / lint-and-test (push) Successful in 32s
Deploy Production / deploy (push) Failing after 19s
CI / build (push) Has been skipped
The runner containers use node:20-bookworm which already has Node
and corepack. Remove actions/setup-node and pnpm/action-setup
which hang in Gitea Actions. Use corepack enable + pnpm directly.
Also fix preview comment to use Gitea API instead of github-script.
2026-03-21 16:31:01 +00:00
Julia McGhee
b28b1fcae2 Use gitea.token instead of secrets.GITEA_TOKEN in workflows
Some checks failed
CI / build (push) Blocked by required conditions
Deploy Production / deploy (push) Waiting to run
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Has been cancelled
The built-in gitea.token is automatically available in Gitea
Actions without needing a repo secret configured.
2026-03-21 16:26:39 +00:00
Julia McGhee
9c02fd7f4c Add Gitea SSH host key to ArgoCD known_hosts via kustomize patch
Some checks failed
CI / build (push) Blocked by required conditions
Deploy Production / deploy (push) Waiting to run
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Has been cancelled
Without this, ArgoCD rejects SSH connections to the in-cluster
Gitea service. Uses a patch file to replace the known_hosts
ConfigMap with defaults + Gitea key.
2026-03-21 16:23:49 +00:00
Julia McGhee
b8ef09359d Re-seal ArgoCD repo secret with insecure flag for in-cluster SSH
Some checks failed
CI / build (push) Blocked by required conditions
Deploy Production / deploy (push) Waiting to run
CI / changes (push) Successful in 2s
CI / lint-and-test (push) Has been cancelled
2026-03-21 16:19:29 +00:00
Julia McGhee
1d98d6e131 Cut over ArgoCD to Gitea: update all repoURLs and PR generator
Some checks failed
CI / build (push) Blocked by required conditions
Deploy Production / deploy (push) Waiting to run
CI / changes (push) Successful in 1s
CI / lint-and-test (push) Has been cancelled
Switch app-of-apps, platform, apps, and previews ApplicationSets
to read from in-cluster Gitea (gitea-helm-ssh.platform.svc:2222).
Previews now use Gitea PR generator instead of GitHub.
2026-03-21 16:15:22 +00:00
Julia McGhee
e6f8054055 Fix runner DinD: disable TLS between sidecar containers
Some checks failed
CI / build (push) Blocked by required conditions
Deploy Production / deploy (push) Waiting to run
CI / changes (push) Successful in 19s
CI / lint-and-test (push) Has been cancelled
TLS between act_runner and DinD in the same pod is unnecessary
and causes race conditions with cert generation. Use port 2375
(no TLS) and set DOCKER_TLS_CERTDIR="" on the DinD sidecar.
2026-03-21 16:13:19 +00:00
Julia McGhee
30c6f89f20 Seal remaining Gitea secrets: API token, runner token, pull secret
Some checks are pending
CI / changes (push) Waiting to run
CI / lint-and-test (push) Waiting to run
CI / build (push) Blocked by required conditions
Deploy Production / deploy (push) Waiting to run
All placeholder secrets replaced with real sealed values:
- argocd-gitea-token: API token for ArgoCD PR generator
- gitea-runner-token: registration token for in-cluster runner
- gitea-pull-secret: registry credentials for app image pulls
2026-03-21 16:09:19 +00:00
Julia McGhee
e0fcf2b756 Fix Gitea username: julia → lazorgurl in all registry/API refs
Some checks are pending
CI / changes (push) Waiting to run
CI / lint-and-test (push) Waiting to run
CI / build (push) Blocked by required conditions
Deploy Production / deploy (push) Waiting to run
Gitea admin username is julia but the Gitea account name is
lazorgurl. Update container registry URLs, workflow refs,
Taskfile API calls, and pull secret placeholders.
2026-03-21 16:06:58 +00:00
Julia McGhee
cb733c92a0 Add internal-only middleware to Gitea IngressRoute
Some checks are pending
CI / changes (push) Waiting to run
CI / lint-and-test (push) Waiting to run
CI / build (push) Blocked by required conditions
Deploy Production / deploy (push) Waiting to run
Restrict Gitea web UI to LAN access only, matching other
platform services. SSH NodePort (30022) is unaffected.
2026-03-21 16:02:24 +00:00
Julia McGhee
a4553fbeae Fix Gitea service names: gitea-http → gitea-helm-http
The Gitea Helm chart names services as gitea-helm-http and
gitea-helm-ssh, not gitea-http/gitea-ssh. Update IngressRoute
and runner deployment to match.
2026-03-21 16:00:08 +00:00
Julia McGhee
e78807bff1 Fix Gitea Valkey auth: inject password via env var interpolation
Valkey requires authentication. Use additionalConfigFromEnvs to
read the password from valkey-credentials secret and interpolate
it into the Redis URLs for cache and session config.
2026-03-21 15:58:48 +00:00
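The interpolation step in e78807bff1 amounts to placing a secret-sourced password into a Redis-style URL. A minimal sketch follows, assuming the standard `redis://:password@host:port/db` form. The environment variable name `VALKEY_PASSWORD`, the host `valkey.example.svc`, and the use of separate database numbers for cache and session are all hypothetical and not taken from the repository.

```python
import os
from urllib.parse import quote


def redis_url(host: str, password: str, db: int = 0, port: int = 6379) -> str:
    """Build a redis:// URL, percent-encoding the password.

    Encoding matters because characters such as '@' or '/' in the
    password would otherwise change how the URL is parsed.
    """
    return f"redis://:{quote(password, safe='')}@{host}:{port}/{db}"


# Hypothetical names, for illustration only.
password = os.environ.get("VALKEY_PASSWORD", "example-password")
cache_url = redis_url("valkey.example.svc", password, db=0)
session_url = redis_url("valkey.example.svc", password, db=1)
```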
Julia McGhee
a3c73dccb0 Fix Gitea DB auth: use additionalConfigFromEnvs for password
The _secret/_key syntax doesn't work in Gitea Helm values. Use
additionalConfigFromEnvs to inject GITEA__database__PASSWD from
the sealed secret, which the chart translates into app.ini config.
2026-03-21 15:56:18 +00:00
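The translation mentioned in a3c73dccb0, where an environment variable such as `GITEA__database__PASSWD` ends up as a key in an app.ini section, can be sketched as below. This is a simplified illustration of the `GITEA__<section>__<key>` naming scheme, not Gitea's implementation: `env_to_ini` is a hypothetical helper, and it ignores any escape sequences or file-based variants the real tooling may support.

```python
import configparser


def env_to_ini(environ: dict[str, str], prefix: str = "GITEA__") -> configparser.ConfigParser:
    """Map PREFIX<section>__<key>=value variables onto INI sections."""
    # interpolation=None so values containing '%' are stored verbatim.
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # preserve key case (PASSWD, not passwd)
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        section, sep, key = name[len(prefix):].partition("__")
        if not sep or not section or not key:
            continue  # not of the form <section>__<key>
        if not config.has_section(section):
            config.add_section(section)
        config.set(section, key, value)
    return config
```

For example, passing `{"GITEA__database__PASSWD": "..."}` yields a `[database]` section with a `PASSWD` key, while unrelated variables are ignored.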
Julia McGhee
7db7bc916e Fix longhorn-nvme: add storageclass.yaml to Longhorn kustomization
The longhorn-nvme StorageClass was defined but never included in the
Longhorn kustomization, so it was never deployed. Add it and revert
Gitea manifests back to longhorn-nvme as intended.
2026-03-21 15:51:24 +00:00
Julia McGhee
aed0bff28a Fix storage class: use longhorn instead of longhorn-nvme
The longhorn-nvme storage class doesn't exist yet in the cluster.
Use the available longhorn class to unblock PVC provisioning.
2026-03-21 15:49:49 +00:00
Julia McGhee
5b4086e71f Revert ArgoCD repoURLs to GitHub temporarily
Gitea needs to be deployed before ArgoCD can read from it.
Keep GitHub repoURLs so ArgoCD can discover and deploy the
new gitea-pg, gitea, and gitea-runner directories. Switch
to Gitea repoURLs after Gitea is running and repo is pushed.
2026-03-21 15:46:41 +00:00
Julia McGhee
f04ecbf5cd Add Gitea self-hosted git/CI/registry to replace GitHub
Deploy Gitea via Helm with dedicated CloudNativePG database,
in-cluster Actions runner (DinD), and built-in container registry.
ArgoCD repoURLs updated to use in-cluster Gitea SSH. Preview
ApplicationSet switched from GitHub PR generator to Gitea PR
generator. App images now pull from gitea.coreworlds.io registry.

Remaining setup after deploy: seal runner token, ArgoCD API token,
and registry pull secret once Gitea is running. Add ArgoCD deploy
key to Gitea repo settings.
2026-03-21 15:43:30 +00:00
Julia McGhee
06ae2c7d46 Add ESLint config for harness app to fix CI lint
2026-03-21 15:38:12 +00:00
Julia McGhee
6dde7c8aef Add harness app: agent orchestrator with cluster deployment
- Next.js app for orchestrating coding agent benchmarks (Claude Code, Codex, OpenCode)
- Dockerfile installs git, gh CLI, and agent CLIs for headless execution
- K8s deployment with workspace volume, sealed credentials for Claude + OpenCode
- Traefik IngressRoute at harness.coreworlds.io with internal-only middleware + TLS
- CI pipeline path filter for harness builds
- Fix OpenCode runtime flags (subcommand-based headless mode)
2026-03-21 15:26:09 +00:00
Julia McGhee
9e7077cd82 Add Grafana admin sealed secret
2026-03-21 13:19:08 +00:00
Julia McGhee
c6ce40a557 Add Ansible storage role for NVMe setup and Longhorn dual-disk config
Automates LV expansion, NVMe mount, and Longhorn node disk tagging
(hdd/nvme) via Ansible instead of Kustomize-managed manifests.
2026-03-21 13:19:04 +00:00
Julia McGhee
3b8fd4afd2 Expand disk storage
2026-03-21 09:53:50 +00:00
Julia McGhee
051c957347 Add observability stack: ServiceMonitors, Tempo, OTel API instrumentation, dashboards
- Add ServiceMonitors for Traefik, ArgoCD, and Longhorn
- Enable cert-manager ServiceMonitor via helm values
- Deploy Grafana Tempo for distributed tracing (single-binary, Longhorn PVC)
- Add Tempo datasource with trace-to-logs and trace-to-metrics correlation
- Instrument API with OpenTelemetry SDK (Prometheus metrics + OTLP traces)
- Replace console.log with pino structured logging + pino-http middleware
- Add Grafana dashboards for Traefik, API overview, and PostgreSQL (CNPG)
2026-03-20 21:01:05 +00:00
github-actions[bot]
8a23d5d5f6 deploy: update production images to da95687db9
2026-03-20 20:36:32 +00:00
Julia McGhee
da95687db9 Fix db package lint: add missing @types/node
tsc --noEmit failed in CI because process.env requires @types/node.
2026-03-20 20:29:57 +00:00
Julia McGhee
11f1365f75 Fix CI lint failures: add ESLint config, use next/font, fix JSX comment
- Add .eslintrc.json so next lint doesn't prompt interactively in CI
- Switch Google Fonts from <link> tags to next/font/google
- Wrap "// SECURE_NODE_7" in JSX expression to avoid comment parse error
2026-03-20 20:27:13 +00:00
Julia McGhee
3d61911868 Add Tactical Monolith design system and landing page to web app
Set up Tailwind CSS v4 with full design token system from Stitch project
(obsidian surfaces, neon cyan/magenta/lime palette, Space Grotesk + Inter
typography, 0px border-radius). Landing page includes hero section, side
nav, module cards, system status panels, terminal log, and CRT overlay.
2026-03-20 20:24:19 +00:00
Julia McGhee
18b2564c8e Add sealed api-secrets with database and Valkey connection strings
2026-03-20 20:16:35 +00:00
Julia McGhee
9ae228f0f3 Add ghcr.io pull secret for private container images
Sealed secret provides auth for pulling from ghcr.io/lazorgurl/*.
Added imagePullSecrets to both app deployments.
2026-03-20 20:06:18 +00:00
github-actions[bot]
a38c6d399a deploy: update production images to 6df9afdc20
2026-03-20 19:58:01 +00:00
Julia McGhee
6df9afdc20 Add packages:write permission for ghcr.io push
2026-03-20 19:55:51 +00:00
Julia McGhee
6317291330 Add empty public directory for Next.js Docker build
2026-03-20 19:53:32 +00:00
Julia McGhee
68261e17a2 Add .dockerignore files to prevent node_modules copy conflicts
2026-03-20 19:51:12 +00:00
Julia McGhee
c9f612d5ce Switch Dockerfiles from pnpm to npm for standalone app builds
pnpm in workspace mode can't generate per-app lockfiles, and without a
lockfile the install is unreliable in CI. npm works fine for these
standalone app builds since they have no workspace dependencies.
2026-03-20 19:49:50 +00:00
Julia McGhee
dafbb59463 Fix Docker builds: drop frozen-lockfile for standalone app builds
Apps build in isolation from the monorepo, so the root pnpm-lock.yaml
doesn't match the app-level package.json. Use plain pnpm install
since each app's package.json is the source of truth.
2026-03-20 19:48:00 +00:00