Forgejo Runner Operator

A Kubernetes operator that manages pools of Forgejo Actions runners, supporting both in-cluster Pods (with Docker-in-Docker) and off-cluster ephemeral VMs.

Core Principle: Ephemerality Without Invisibility

Every disposable resource — a runner pod, a VM, a registration, a job execution — is also an instrumentation point. The shorter something lives, the more important it is to capture what it did while it existed.

  • Structured event log: every lifecycle transition emits a structured event to a durable store.
  • Metrics at every boundary: registration latency, job queue wait time, execution duration, VM boot time.
  • Status on the CRD: RunnerPool.status carries a full accounting including recently-destroyed runners.
  • Log forwarding from ephemeral VMs: logs ship to a collector before the runner accepts work.

Architecture

The operator consists of three components:

  1. Runner Controller (in-cluster) — reconciles RunnerPool CRDs, manages Deployments (kubernetes backend) or calls the provisioner (remote backend).
  2. Webhook Receiver (in-cluster, same binary) — receives Forgejo workflow_job webhooks for scale-to-zero support.
  3. Provisioner Agent (on VM host, separate binary) — manages ephemeral VM lifecycle via a pluggable hypervisor driver (Firecracker recommended).

Quick Start

Prerequisites

  • Go 1.26+
  • A Kubernetes cluster with kubectl configured
  • A Forgejo instance with Actions enabled

Build

# Resolve dependencies
go mod tidy

# Generate CRD manifests and deepcopy methods
make generate

# Build both binaries
make build

# Build container images
make podman-build

Install CRDs

make install

Deploy the operator

# Edit config/manager/manager.yaml to set your image
make deploy

Create a RunnerPool

# Create the namespace and token secret
kubectl create namespace forgejo-runners-trusted
kubectl create secret generic forgejo-admin-token \
  --namespace=forgejo-runners-trusted \
  --from-literal=token=<YOUR_FORGEJO_API_TOKEN>

# Apply the sample RunnerPool
kubectl apply -f config/samples/trusted_dind.yaml
# or a repo-enrolled trusted pool:
kubectl apply -f config/samples/trusted_repo_forgejo_runner_operator.yaml

For trusted runners, enroll specific projects by setting spec.forgejo.scope:

  • scope: repository with spec.forgejo.repository: owner/repo for a single project
  • scope: organization with spec.forgejo.organization: org for a whole organization

This lets you run multiple trusted pools (for example, one per enrolled project) without exposing those runners to unrelated repositories.

Observe

# Watch the pool status
kubectl get runnerpools -A

# Detailed status including runner history
kubectl describe runnerpool trusted-dind -n forgejo-runners-trusted

# Prometheus metrics
curl http://localhost:8443/metrics | grep forgejo_runner_operator

CRD: RunnerPool

A single CRD covers both runner topologies. The backend field selects the execution strategy. Use spec.forgejo.scope (global, organization, or repository) to control which Forgejo namespace can request runners from the pool.
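To make the shape concrete, here is a minimal sketch of a repository-scoped pool. The field layout is inferred from this README and is hypothetical; the manifests under config/samples/ are authoritative:

```yaml
# Hypothetical fragment -- see config/samples/ for real manifests.
apiVersion: <operator-group>/v1alpha1   # API group name omitted here
kind: RunnerPool
metadata:
  name: trusted-dind
  namespace: forgejo-runners-trusted
spec:
  backend: kubernetes        # or: remote (ephemeral VMs via the provisioner)
  forgejo:
    scope: repository        # global | organization | repository
    repository: owner/repo   # required when scope is repository
```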

See config/samples/ for example manifests.

Project Layout

cmd/
  controller/       Operator entrypoint
  provisioner/      Provisioner agent entrypoint
api/v1alpha1/       CRD type definitions
internal/
  controller/       Reconciliation loop
  backend/          Execution strategy implementations
  forgejo/          Forgejo API client
  webhook/          Webhook receiver
  metrics/          Prometheus metrics
  audit/            Append-only audit logger
  provisioner/
    server/         gRPC server
    driver/         Hypervisor drivers (Firecracker, etc.)
    events/         Event ring buffer
proto/              Protobuf definitions
config/
  crd/              CRD manifests
  rbac/             RBAC resources
  manager/          Controller deployment
  samples/          Example RunnerPool manifests

Development Phases

  • Phase 0: Self-registering init container (in infra repo, no operator needed)
  • Phase 1: Operator scaffolding + in-cluster Kubernetes backend
  • Phase 2: Webhook receiver + scale-to-zero autoscaling
  • Phase 3: Provisioner agent + remote ephemeral VM backend
  • Phase 4: mTLS hardening, audit logging, Grafana dashboards

License

Apache License 2.0