# Forgejo Runner Operator
A Kubernetes operator that manages pools of Forgejo Actions runners, supporting both in-cluster Pods (with Docker-in-Docker) and off-cluster ephemeral VMs.
## Core Principle: Ephemerality Without Invisibility
Every disposable resource — a runner pod, a VM, a registration, a job execution — is also an instrumentation point. The shorter something lives, the more important it is to capture what it did while it existed.
- Structured event log: every lifecycle transition emits a structured event to a durable store.
- Metrics at every boundary: registration latency, job queue wait time, execution duration, VM boot time.
- Status on the CRD: `RunnerPool.status` carries a full accounting, including recently-destroyed runners.
- Log forwarding from ephemeral VMs: logs ship to a collector before the runner accepts work.
## Architecture
The operator consists of three components:
- Runner Controller (in-cluster) — reconciles `RunnerPool` CRDs, manages Deployments (kubernetes backend) or calls the provisioner (remote backend).
- Webhook Receiver (in-cluster, same binary) — receives Forgejo `workflow_job` webhooks for scale-to-zero support.
- Provisioner Agent (on VM host, separate binary) — manages ephemeral VM lifecycle via a pluggable hypervisor driver (Firecracker recommended).
## Quick Start

### Prerequisites
- Go 1.23+
- A Kubernetes cluster with `kubectl` configured
- A Forgejo instance with Actions enabled
### Build

```sh
# Resolve dependencies
go mod tidy

# Generate CRD manifests and deepcopy methods
make generate

# Build both binaries
make build

# Build container images
make podman-build
```
### Install CRDs

```sh
make install
```
### Deploy the operator

```sh
# Edit config/manager/manager.yaml to set your image
make deploy
```
### Create a RunnerPool

```sh
# Create the namespace and token secret
kubectl create namespace forgejo-runners-trusted
kubectl create secret generic forgejo-admin-token \
  --namespace=forgejo-runners-trusted \
  --from-literal=token=<YOUR_FORGEJO_API_TOKEN>

# Apply the sample RunnerPool
kubectl apply -f config/samples/trusted_dind.yaml

# or a repo-enrolled trusted pool:
kubectl apply -f config/samples/trusted_repo_forgejo_runner_operator.yaml
```
For trusted runners, enroll specific projects by setting `spec.forgejo.scope`:

- `repository` + `spec.forgejo.repository: owner/repo` for a single project
- `organization` + `spec.forgejo.organization: org` for a whole org

This lets you run multiple trusted pools (for example, one per enrolled project) without exposing those runners to unrelated repositories.
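As a sketch, an organization-enrolled pool's `spec.forgejo` block might look like this (the exact field layout is assumed from the scope settings above; the sample manifests are authoritative):

```yaml
spec:
  forgejo:
    scope: organization
    organization: org   # replace with your Forgejo organization name
```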
### Observe

```sh
# Watch the pool status
kubectl get runnerpools -A

# Detailed status including runner history
kubectl describe runnerpool trusted-dind -n forgejo-runners-trusted

# Prometheus metrics
curl http://localhost:8443/metrics | grep forgejo_runner_operator
```
## CRD: RunnerPool

A single CRD covers both runner topologies. The `backend` field selects the execution strategy. Use `spec.forgejo.scope` (`global`, `organization`, or `repository`) to control which Forgejo namespace can request runners from the pool.

See `config/samples/` for example manifests.
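As an illustration, a minimal RunnerPool manifest might look like the sketch below. The `backend` and `spec.forgejo` fields follow the description above; the API group in `apiVersion` and the `tokenSecretRef` field are assumptions, so treat `config/samples/` as authoritative:

```yaml
apiVersion: forgejo.example.org/v1alpha1  # API group assumed, not verified
kind: RunnerPool
metadata:
  name: trusted-dind
  namespace: forgejo-runners-trusted
spec:
  backend: kubernetes          # in-cluster Pods; "remote" selects the VM provisioner
  forgejo:
    scope: repository
    repository: owner/repo
    tokenSecretRef:            # assumed field: references the forgejo-admin-token secret
      name: forgejo-admin-token
```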
## Project Layout

```
cmd/
  controller/       Operator entrypoint
  provisioner/      Provisioner agent entrypoint
api/v1alpha1/       CRD type definitions
internal/
  controller/       Reconciliation loop
  backend/          Execution strategy implementations
  forgejo/          Forgejo API client
  webhook/          Webhook receiver
  metrics/          Prometheus metrics
  audit/            Append-only audit logger
  provisioner/
    server/         gRPC server
    driver/         Hypervisor drivers (Firecracker, etc.)
    events/         Event ring buffer
proto/              Protobuf definitions
config/
  crd/              CRD manifests
  rbac/             RBAC resources
  manager/          Controller deployment
  samples/          Example RunnerPool manifests
```
## Development Phases
- Phase 0: Self-registering init container (in infra repo, no operator needed)
- Phase 1: Operator scaffolding + in-cluster Kubernetes backend
- Phase 2: Webhook receiver + scale-to-zero autoscaling
- Phase 3: Provisioner agent + remote ephemeral VM backend
- Phase 4: mTLS hardening, audit logging, Grafana dashboards
## License
Apache License 2.0