This guide covers a production Nebula install on GKE using Cloud SQL for PostgreSQL, Google Cloud Storage (via HMAC keys or a MinIO bridge), GKE Workload Identity, and external secrets via External Secrets Operator with GCP Secret Manager.

Prereqs

Before helm install, the following must be in place on the cluster side.

Cluster

  • GKE 1.30+ (Autopilot or Standard mode both work; Standard gives more control over node pools)
  • Workload Identity enabled on the cluster (--workload-pool=<project>.svc.id.goog) — required for keyless SA binding to GCP IAM
  • OIDC provider is implicit on GKE when Workload Identity is enabled; no separate step needed
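The Workload Identity requirement above can be checked before installing. The following is a minimal sketch, assuming an authenticated gcloud CLI; the function name check_workload_pool and its arguments are illustrative and not part of the Nebula bundle.

```shell
# Hypothetical helper: succeeds only if the cluster's workload pool is <project>.svc.id.goog.
# Usage: check_workload_pool <cluster> <region> <project>
check_workload_pool() {
  local cluster="$1" region="$2" project="$3" pool
  pool="$(gcloud container clusters describe "$cluster" \
    --region "$region" --project "$project" \
    --format='value(workloadIdentityConfig.workloadPool)')"
  if [ "$pool" = "${project}.svc.id.goog" ]; then
    echo "Workload Identity enabled: ${pool}"
  else
    echo "Workload Identity not enabled (found: '${pool}')" >&2
    return 1
  fi
}
```

An empty result usually means the cluster was created without --workload-pool.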

Addons + controllers

| Component | Purpose | Install reference |
| --- | --- | --- |
| GKE Cluster Autoscaler | Node autoscaling | GKE built-in: --enable-autoscaling per node pool |
| nginx Ingress Controller (or GCE Ingress) | HTTP/HTTPS ingress | kubernetes.github.io/ingress-nginx |
| cert-manager | TLS from Let’s Encrypt | cert-manager.io/docs |
| External Secrets Operator (recommended) | Sync from GCP Secret Manager | external-secrets.io |

On GKE Standard, create node pools manually and size them to match the workload sizing table below. GKE Autopilot provisions nodes on demand from pod resource requests; set resources.requests precisely so Autopilot selects the right machine family.
  • Cloud SQL for PostgreSQL 16 in the same region as the cluster, with Private IP enabled. Enable the pgvector extension: set the cloudsql.enable_pgvector flag to on in the Cloud SQL console (Cloud SQL 15.7+ / 16.3+), then run CREATE EXTENSION IF NOT EXISTS vector in each database after connecting.
  • GCS bucket in the same region. Grant the Nebula service account roles/storage.objectAdmin on the bucket.
Object storage note: the chart’s objectStorage block uses S3-protocol env vars. GCS exposes an S3-compatible XML API at https://storage.googleapis.com. Use HMAC keys (Service Accounts → HMAC keys in the Cloud Console) as the credentialsSecret, and set objectStorage.forcePathStyle: false for the GCS XML API. Alternatively, run a MinIO gateway in front of GCS.
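As a sketch of the HMAC route: create an HMAC key for the service account (for example with gcloud storage hmac create followed by the service account email), then store the resulting Access ID and Secret in a Kubernetes Secret. The troubleshooting section states that this Secret needs the keys AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The default Secret name nebula-object-storage and the helper function are assumptions; use whatever name objectStorage.credentialsSecret points at.

```shell
# Hypothetical helper: stores a GCS HMAC key pair under the key names the chart reads.
# Usage: create_object_storage_secret <access-id> <hmac-secret> [secret-name]
create_object_storage_secret() {
  local access_id="$1" hmac_secret="$2" name="${3:-nebula-object-storage}"
  kubectl -n nebula create secret generic "$name" \
    --from-literal=AWS_ACCESS_KEY_ID="$access_id" \
    --from-literal=AWS_SECRET_ACCESS_KEY="$hmac_secret"
}
```

If External Secrets Operator manages your credentials, the same two keys would instead come from an ExternalSecret rather than from kubectl create secret.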

Workload Identity setup

  1. Create a GCP service account for Nebula:
    gcloud iam service-accounts create nebula-sa \
      --project <project>
    
  2. Bind it to the Kubernetes service account the chart creates:
    gcloud iam service-accounts add-iam-policy-binding \
      nebula-sa@<project>.iam.gserviceaccount.com \
      --role roles/iam.workloadIdentityUser \
      --member "serviceAccount:<project>.svc.id.goog[nebula/<release>-nebula-sa]"
    
    Replace <release> with your helm install release name.
  3. Grant the GCP service account access to GCS:
    gcloud storage buckets add-iam-policy-binding gs://<bucket> \
      --role roles/storage.objectAdmin \
      --member "serviceAccount:nebula-sa@<project>.iam.gserviceaccount.com"
    
  4. If ESO uses the same GCP service account for Secret Manager access, also grant roles/secretmanager.secretAccessor on the secrets.
  5. Annotate the Kubernetes service account in your values file:
    serviceAccount:
      annotations:
        iam.gke.io/gcp-service-account: nebula-sa@<project>.iam.gserviceaccount.com
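Step 4 names a role but shows no command. A minimal sketch, assuming the service account is called nebula-sa as in step 1; the helper name grant_secret_access is illustrative.

```shell
# Hypothetical helper for step 4: grant nebula-sa read access to one Secret Manager secret.
# Usage: grant_secret_access <project> <secret-name>
grant_secret_access() {
  local project="$1" secret="$2"
  gcloud secrets add-iam-policy-binding "$secret" \
    --project "$project" \
    --role roles/secretmanager.secretAccessor \
    --member "serviceAccount:nebula-sa@${project}.iam.gserviceaccount.com"
}
```

It can be called once per secret, for example in a loop over OPENAI_API_KEY and NEBULA_SECRET_KEY. Binding per secret rather than project-wide keeps the grant limited to the secrets Nebula reads.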
    

Install

1. Push images to Artifact Registry

tar -xzf nebula-enterprise-<version>.tar.gz
cd nebula-enterprise-<version>/
sha256sum -c checksums.txt
docker load -i images.tar

REGION=us-central1
AR="${REGION}-docker.pkg.dev/<project>/<repo>"
gcloud auth configure-docker "${REGION}-docker.pkg.dev"

docker tag nebula:enterprise-<version>              "${AR}/nebula/nebula-runtime:<version>"
docker tag nebula-graph-engine:enterprise-<version> "${AR}/nebula/graph-engine:<version>"
docker push "${AR}/nebula/nebula-runtime:<version>"
docker push "${AR}/nebula/graph-engine:<version>"
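For private-cluster GKE (no public-registry egress), also mirror third-party images. docker tag only works on images already in the local Docker image store, so docker pull each source image first if docker load did not provide it: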
docker tag ghcr.io/hatchet-dev/hatchet/hatchet-engine:v0.79.0 "${AR}/hatchet-engine:v0.79.0"
docker tag pgvector/pgvector:0.8.0-pg16                       "${AR}/pgvector/pgvector:0.8.0-pg16"
docker tag rabbitmq:3.13-management                           "${AR}/rabbitmq:3.13-management"
docker tag busybox:1.37.0                                     "${AR}/busybox:1.37.0"
docker push "${AR}/hatchet-engine:v0.79.0"
docker push "${AR}/pgvector/pgvector:0.8.0-pg16"
docker push "${AR}/rabbitmq:3.13-management"
docker push "${AR}/busybox:1.37.0"
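The third-party mirroring commands can also be written as a loop. This is a sketch under the assumption that the workstation can reach the public registries; it pulls each image before tagging, since docker tag needs the source image locally. The function name is illustrative, and the source/target pairs are copied from the commands above.

```shell
# Hypothetical helper: pull, retag, and push each third-party image.
# Each heredoc line is "<source image> <path under the Artifact Registry repo>".
# Usage: mirror_third_party_images "${AR}"
mirror_third_party_images() {
  local ar="$1" src dst
  while read -r src dst; do
    # </dev/null keeps docker from consuming the heredoc that feeds the loop.
    docker pull "$src" </dev/null &&
      docker tag "$src" "${ar}/${dst}" </dev/null &&
      docker push "${ar}/${dst}" </dev/null || return 1
  done <<'EOF'
ghcr.io/hatchet-dev/hatchet/hatchet-engine:v0.79.0 hatchet-engine:v0.79.0
pgvector/pgvector:0.8.0-pg16 pgvector/pgvector:0.8.0-pg16
rabbitmq:3.13-management rabbitmq:3.13-management
busybox:1.37.0 busybox:1.37.0
EOF
}
```

The function stops at the first failing pull, tag, or push so a partial mirror is noticed immediately.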

2. Seed secrets in GCP Secret Manager

echo -n "sk-..."           | gcloud secrets create OPENAI_API_KEY       --data-file=-
echo -n "$(openssl rand -hex 32)" | gcloud secrets create NEBULA_SECRET_KEY --data-file=-
# Repeat for NEBULA_SERVICE_API_KEY, NEBULA_WEBHOOK_HMAC_SECRET,
# NEBULA_INTERNAL_WAKE_TOKEN, NEBULA_VECTOR_BUILD_HATCHET_TRIGGER_TOKEN.
Postgres credentials go in separate secrets and are materialized into Kubernetes Secrets (with username and password keys) by ESO.
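To illustrate what "materialized by ESO" might look like, the sketch below prints an ExternalSecret manifest that maps two Secret Manager entries to the username and password keys of one Kubernetes Secret. Everything named here is an assumption: the SecretStore gcp-secret-manager, the remote secret names passed as arguments, and the external-secrets.io/v1beta1 API version (newer ESO releases may serve external-secrets.io/v1; match the CRD installed in your cluster). The reference values file may already wire this up for you.

```shell
# Hypothetical helper: prints an ExternalSecret manifest to stdout.
# Usage: render_postgres_external_secret <k8s-secret-name> <remote-username-secret> <remote-password-secret>
render_postgres_external_secret() {
  local name="$1" user_key="$2" password_key="$3"
  cat <<EOF
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: ${name}
  namespace: nebula
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: SecretStore
    name: gcp-secret-manager   # assumed SecretStore name
  target:
    name: ${name}
  data:
    - secretKey: username
      remoteRef:
        key: ${user_key}
    - secretKey: password
      remoteRef:
        key: ${password_key}
EOF
}
```

The output can be reviewed first and then piped to kubectl apply -f -.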

3. Copy + fill the reference values file

The bundle ships helm/examples/gke/values.yaml with GKE-specific knobs pre-wired (Workload Identity annotation, GCS endpoint, nginx ingress, Secret Manager ESO). Copy it, fill in the <placeholder> markers, and save as your-values.yaml.

4. Install

gcloud container clusters get-credentials <cluster> --region <region> --project <project>

helm install nebula ./helm/nebula-<version>.tgz \
  -n nebula --create-namespace \
  -f helm/examples/_common/production-sizing.yaml \
  -f your-values.yaml
_common/production-sizing.yaml is the shared production-shape sizing block (replicas, CPU/memory requests + limits, persistence) used by all three cloud-managed K8s examples (EKS/AKS/GKE). Omit it to keep the chart’s minimal-dev defaults; override per-workload in your-values.yaml to fit your GKE node SKUs.

The chart runs schema migrations and catalog-apply automatically via a per-revision Job (<release>-nebula-migrations-<revision>); API and worker pods gate startup on an init container that polls public.nebula_release_contract for the install’s release row. releaseContract.releaseId and releaseContract.gitSha are stamped by bundle.sh and consumed automatically.
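Because API and worker pods wait on the migrations Job, it can help to watch that Job directly when pods sit in Init. A minimal sketch, assuming the Job name pattern given above and the nebula namespace; the helper name and default timeout are illustrative.

```shell
# Hypothetical helper: block until the per-revision migrations Job completes.
# Usage: wait_for_migrations <release> <revision> [timeout]
wait_for_migrations() {
  local release="$1" revision="$2" timeout="${3:-10m}"
  kubectl -n nebula wait --for=condition=complete \
    "job/${release}-nebula-migrations-${revision}" --timeout="$timeout"
}
```

For a first install of a release named nebula, the call would be wait_for_migrations nebula 1. If the wait times out, kubectl -n nebula logs on the same Job is the next place to look.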

5. Verify

kubectl -n nebula get pods
kubectl -n nebula get ingress nebula
curl -fsS https://nebula.<your-domain>.com/v1/health

Upgrade

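Pull the new bundle, push new images to Artifact Registry, then upgrade with the same values files used at install. Helm rebuilds the release values from the files passed on the command line, so leaving out the sizing file would reset sizing to the chart defaults:
helm upgrade nebula ./helm/nebula-<new-version>.tgz \
  -n nebula \
  -f helm/examples/_common/production-sizing.yaml \
  -f your-values.yaml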

Sizing reference

| Workload | Starter | When to scale |
| --- | --- | --- |
| API | 2 replicas, 1 CPU / 2-4 GB | HPA on CPU >70% sustained |
| Worker | 2 replicas, 2 CPU / 4-8 GB | HPA on queue depth (Hatchet metric) |
| Graph engine | 2 replicas, 2 CPU / 4-8 GB | Manual; restart-sensitive (WAL replay) |
| Compactor | 1 replica, 1 CPU / 2-4 GB | Single-writer; do not scale horizontally |
| RabbitMQ | 1 replica, 8 GB PVC | Single-broker is fine up to ~10k workflows/min |

Recommended GKE machine types: n2-standard-4 (4 vCPU / 16 GB) for API, worker, Hatchet; n2-highmem-4 (4 vCPU / 32 GB) for graph-engine and compactor.
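On GKE Standard, the recommended machine types translate into node pools roughly as follows. This is a sketch only: the helper name, the pool names used in the usage note, and the autoscaling bounds of 1 to 4 nodes are assumptions to adjust for your workload. On regional clusters the node counts apply per zone.

```shell
# Hypothetical helper: create an autoscaled node pool that exposes the GKE metadata
# server, which Workload Identity relies on.
# Usage: create_node_pool <cluster> <region> <pool-name> <machine-type>
create_node_pool() {
  local cluster="$1" region="$2" pool="$3" machine_type="$4"
  gcloud container node-pools create "$pool" \
    --cluster "$cluster" --region "$region" \
    --machine-type "$machine_type" \
    --workload-metadata=GKE_METADATA \
    --enable-autoscaling --min-nodes 1 --max-nodes 4
}
```

One call with n2-standard-4 (for example a pool named nebula-general) and one with n2-highmem-4 (for example nebula-highmem) would cover the two recommendations. Autopilot clusters do not need this step.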

Troubleshooting

If pods cannot authenticate to GCP through Workload Identity, confirm the Kubernetes SA annotation is set: kubectl -n nebula describe sa <release>-nebula-sa should show iam.gke.io/gcp-service-account. Also verify the IAM binding: gcloud iam service-accounts get-iam-policy nebula-sa@<project>.iam.gserviceaccount.com should list the workloadIdentityUser binding for the K8s SA. Ensure the cluster’s Workload Identity pool (<project>.svc.id.goog) is enabled.
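An end-to-end check is to run a throwaway pod under the chart's Kubernetes service account and see which identity it receives. The sketch below assumes the cluster can pull google/cloud-sdk:slim (on a private cluster, mirror it to Artifact Registry first) and that the service account follows the <release>-nebula-sa pattern; the helper and pod names are illustrative.

```shell
# Hypothetical helper: run a one-off pod as the chart's service account and
# list the credentials the GKE metadata server hands out.
# Usage: wi_smoke_test <release>
wi_smoke_test() {
  local release="$1"
  kubectl -n nebula run wi-smoke-test --rm -i --restart=Never \
    --image=google/cloud-sdk:slim \
    --overrides="{\"spec\":{\"serviceAccountName\":\"${release}-nebula-sa\"}}" \
    -- gcloud auth list
}
```

When the binding and annotation are correct, the listed active account should be the nebula-sa GCP service account rather than the node's default identity.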
If the Ingress has no address yet, note that the GCE Ingress controller provisions a Google Cloud Load Balancer, which can take 5-10 minutes. Check kubectl -n nebula describe ingress nebula for events. If you need faster provisioning, set ingress.className: nginx and install the nginx Ingress controller instead.
If the vector extension is missing, note that Cloud SQL for PostgreSQL 16.3+ supports pgvector via the vector extension. Enable it in the Cloud SQL flags (cloudsql.enable_pgvector=on), then run CREATE EXTENSION IF NOT EXISTS vector; in each database. Cloud SQL docs: Use pgvector.
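The extension step can be run and verified in one psql call. A minimal sketch, assuming psql can reach the instance's private IP and that the connecting user is allowed to create extensions; the helper name and the connection string are illustrative.

```shell
# Hypothetical helper: create the extension if needed, then print its installed version.
# Usage: enable_pgvector <libpq-connection-string>
enable_pgvector() {
  local dsn="$1"
  psql "$dsn" -v ON_ERROR_STOP=1 \
    -c "CREATE EXTENSION IF NOT EXISTS vector;" \
    -c "SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';"
}
```

Run it once per database Nebula uses. An empty result from the second statement means the extension is still not installed in that database.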
If GCS requests fail with authentication errors, verify the HMAC key is created for a service account (not a user account). HMAC keys for service accounts are under IAM & Admin → Service Accounts → select the account → Keys tab → HMAC keys. Store the Access ID and Secret in the Kubernetes Secret referenced by objectStorage.credentialsSecret. The Secret must have AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY keys (those exact uppercase names), because the chart’s nebula.objectStorageEnv helper reads them via secretKeyRef.key.