Nivis: All your base belongs to Nix

Nivis

Infrastructure as Nix Code. All your base belongs to Nix. (Nivis, Latin, "of snow"; it belongs to Nix. Formerly nixform, then Terrae Nivis.)

A Nix-native infrastructure tool where Terraform/OpenTofu provider resources are first-class Nix values. A thin Go executor speaks the Terraform plugin protocol directly to unmodified provider binaries: Nix is the configuration frontend, Go is pure orchestration.

The headline capability (the reason this project exists) is the round trip: a provider-created resource returns computed values (an IP, an ID, a generated secret) back into Nix, which re-evaluates to produce dependent configuration, repeating to a fixpoint. This is proven end to end across two providers with unknown values originating on both sides.

How it works

Nix evaluates your configuration to a JSON IR (docs/IR-CONTRACT.md). Values that aren't known until apply-time are emitted as typed placeholders: a __ref (a direct reference to another resource's output) or a __derived (a value Nix computed from an output, e.g. a string built from an IP). The Go executor ingests the IR, spawns the relevant provider binaries, drives GetProviderSchema/PlanResourceChange/ApplyResourceChange, and collects the real outputs into an outputs ledger. It then re-evaluates Nix with the ledger injected, so placeholders resolve to concrete values; the new IR may unlock more resources. This loop repeats to a fixpoint (no new value resolves). Because each Nix-mediated (__derived) hop needs its own re-evaluation, deep chains take more than two phases; the loop generalizes to N phases. See DESIGN.md for why this (not an Output<T> promise model) is the honest, Nix-shaped approach.
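The placeholder step can be sketched in Go as a walk over the decoded IR that swaps in ledger values. This is a minimal sketch under loud assumptions: the single-key {"__ref": "<key>"} shape and the ledger key format are hypothetical (the real encoding is defined in docs/IR-CONTRACT.md), and __derived values are not modelled because they need a Nix re-evaluation rather than a lookup.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resolve walks a decoded JSON value and replaces every {"__ref": "<key>"}
// placeholder whose key is present in the ledger. The second result reports
// whether any placeholder is still unresolved.
// NOTE: the placeholder shape used here is hypothetical.
func resolve(v any, ledger map[string]any) (any, bool) {
	switch t := v.(type) {
	case map[string]any:
		if key, ok := t["__ref"].(string); ok && len(t) == 1 {
			if val, known := ledger[key]; known {
				return val, false
			}
			return t, true // still unknown: keep the placeholder
		}
		out := make(map[string]any, len(t))
		pending := false
		for k, child := range t {
			r, p := resolve(child, ledger)
			out[k] = r
			pending = pending || p
		}
		return out, pending
	case []any:
		out := make([]any, len(t))
		pending := false
		for i, child := range t {
			r, p := resolve(child, ledger)
			out[i] = r
			pending = pending || p
		}
		return out, pending
	default:
		return v, false
	}
}

func main() {
	var config any
	raw := `{"from": {"__ref": "alpha.alpha_token.A.value"}, "ttl": 60}`
	if err := json.Unmarshal([]byte(raw), &config); err != nil {
		panic(err)
	}

	// Before anything is applied the ledger is empty, so the value is pending.
	_, pending := resolve(config, map[string]any{})
	fmt.Println("pending before apply:", pending)

	// After A is applied its output is in the ledger and the config is concrete.
	ledger := map[string]any{"alpha.alpha_token.A.value": "alpha::0"}
	resolved, pending := resolve(config, ledger)
	out, _ := json.Marshal(resolved)
	fmt.Println(string(out), "pending:", pending)
}
```

A resource whose config still reports pending is simply left for a later phase.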

Installing Nivis

The Nivis CLI is nivis (schema codegen is the nivis gen subcommand). It is distributed as a Nix flake, so you can run it without cloning anything. Pick whichever fits how you work.

Runtime needs. nivis shells out to nix to evaluate your configuration, so Nix must be on your PATH (with flakes enabled). The first time you use a real provider, nivis downloads it from the OpenTofu registry (e.g. the AWS provider is ~900 MB) and caches it, so that first run needs network and a little patience.

Run it ad hoc (no install)

The quickest way: run straight from the flake; Nix builds it on first use and caches the result:

nix run github:wearetechnative/nivis#nivis -- --version
nix run github:wearetechnative/nivis#nivis -- plan      # in your infra dir

Everything after -- is passed to nivis, so codegen uses the same form: nix run github:wearetechnative/nivis#nivis -- gen ….

A throwaway shell

Drop into a shell with nivis on PATH for the session, handy while iterating:

nix shell github:wearetechnative/nivis#nivis
nivis --version

Install it persistently

Add nivis to your user profile so it's always available:

nix profile install github:wearetechnative/nivis#nivis
nivis --version

Update later with nix profile upgrade, remove with nix profile remove.

From a clone (contributors)

If you've checked out the repository:

nix run .#nivis -- --version          # from the repo root
# or build a binary:
go build -o bin/nivis ./cmd/nivis
nix build .#nivis                     # -> ./result/bin/nivis

Pinning

The github: reference floats on the default branch. For reproducible infra, pin it in your own flake's flake.lock (the AWS S3 tutorial does this: Nivis becomes an input, and nix flake lock records the exact revision). Re-pin deliberately with nix flake update nivis.

Getting started with Nivis

A hands-on walkthrough using the in-repo fake providers. Everything here runs offline: no provider registry, no cloud account, no credentials. You need Go 1.22+ and Nix.

1. Build the binaries

go build -o bin/provider-alpha ./cmd/provider-alpha
go build -o bin/provider-beta  ./cmd/provider-beta
go build -o bin/nivis ./cmd/nivis

provider-alpha and provider-beta are minimal tfprotov6 providers used as a hermetic test substrate. Their outputs are a deterministic function of inputs (a per-process counter seeded by TERRAE_NIVIS_FAKE_COUNTER, default 0), so every run is reproducible.

Prefer Nix? The flake builds the CLIs from source: nix run .#nivis -- … and nix run .#nivis-gen -- … (or nix build .#nivis). Everywhere below, ./bin/nivis can be read as nix run .#nivis --. You still build the fake providers with go build (they aren't packaged as apps).

2. The example configuration

The flake's nivis.plan (in nix/example/) describes three resources and a consumer, wired so each hop crosses the Nix boundary:

alpha_token.A            (alpha)            -- no inputs; A.value computed at apply
   └─ name = "rec-" + A.value               (a __derived value)
beta_record.B  (beta)    from = name        -- B.endpoint computed at apply
   └─ final = B.endpoint + "::" + A.value    (a __derived on BOTH providers)
alpha_token.C  (alpha)   label = final

systemConfig (a Nix consumer) reads:
  recordEndpoint = B.endpoint   # from beta
  tokenValue     = A.value      # from alpha
  combined       = final        # from both

Because name and final are values Nix computes from provider outputs, they can't be known until those outputs exist and Nix is re-evaluated, which is what forces multiple phases.

3. Plan and apply

./bin/nivis plan
+ alpha.alpha_token.A (alpha_token)
+ beta.beta_record.B (beta_record)
+ alpha.alpha_token.C (alpha_token)

3 resource(s) to resolve across phases. Run `nivis apply`.
./bin/nivis apply
Applied 3 resource(s) across 3 phase(s):
  ✓ alpha.alpha_token.A
  ✓ beta.beta_record.B
  ✓ alpha.alpha_token.C

Three phases, not one: phase 1 applies A (nothing else is ready); re-evaluating with A.value known unlocks B; re-evaluating with B.endpoint known unlocks C. The loop halts at a fixpoint once nothing new resolves.
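The phase count can be reproduced with a toy model of the loop. The Go sketch below is a simulation, not the executor: readiness checks stand in for Nix re-evaluation, the ledger keys are hypothetical, and the output formats are inferred from the state listing in this walkthrough.

```go
package main

import "fmt"

// ledger maps an output name to a value that is only known after apply.
type ledger map[string]string

// resource is one entry of the simulated IR: ready reports whether every
// input is concrete under the current ledger, apply records its outputs.
type resource struct {
	id    string
	ready func(l ledger) bool
	apply func(l ledger)
}

// has reports whether every key is already in the ledger.
func has(l ledger, keys ...string) bool {
	for _, k := range keys {
		if _, ok := l[k]; !ok {
			return false
		}
	}
	return true
}

// example models the A -> B -> C chain from the example configuration.
func example() []resource {
	return []resource{
		{
			id:    "alpha.alpha_token.A",
			ready: func(l ledger) bool { return true },
			apply: func(l ledger) { l["A.value"] = "alpha::0" },
		},
		{
			id:    "beta.beta_record.B",
			ready: func(l ledger) bool { return has(l, "A.value") },
			apply: func(l ledger) { l["B.endpoint"] = "beta://rec-" + l["A.value"] },
		},
		{
			id:    "alpha.alpha_token.C",
			ready: func(l ledger) bool { return has(l, "A.value", "B.endpoint") },
			apply: func(l ledger) { l["C.label"] = l["B.endpoint"] + "::" + l["A.value"] },
		},
	}
}

// run applies everything that is ready, then re-checks readiness against
// the grown ledger, and stops when a phase finds nothing new to apply.
func run(rs []resource) (int, ledger) {
	l := ledger{}
	done := map[string]bool{}
	phases := 0
	for {
		// Readiness is decided from the ledger as it stood when the phase began.
		var batch []resource
		for _, r := range rs {
			if !done[r.id] && r.ready(l) {
				batch = append(batch, r)
			}
		}
		if len(batch) == 0 {
			return phases, l // fixpoint: nothing new resolved
		}
		for _, r := range batch {
			r.apply(l)
			done[r.id] = true
		}
		phases++
	}
}

func main() {
	phases, l := run(example())
	fmt.Println(phases, l["C.label"]) // 3 beta://rec-alpha::0::alpha::0
}
```

The sketch stops silently if some resource never becomes ready; a real executor would have to report that case instead.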

4. Inspect the round trip

./bin/nivis state list
./bin/nivis state show alpha.alpha_token.C
alpha.alpha_token.C (alpha_token)
  id = alpha-1
  label = beta://rec-alpha::0::alpha::0
  value = alpha:beta://rec-alpha::0::alpha::0:1

C.label is final: a string Nix built from both B.endpoint (beta) and A.value (alpha). That value only became concrete after both providers applied and Nix re-evaluated. That is the round trip.

5. Refresh and destroy

./bin/nivis refresh    # reconciles state via ReadResource; no changes here
./bin/nivis destroy    # tears down in reverse dependency order
Destroyed 3 resource(s):
  - alpha.alpha_token.C
  - beta.beta_record.B
  - alpha.alpha_token.A
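Reverse dependency order is a topological order walked backwards. A minimal Go sketch, assuming an acyclic graph and a hand-written dependency map (how Nivis actually derives and stores its graph is not shown here):

```go
package main

import (
	"fmt"
	"sort"
)

// createOrder returns the resources so that each one appears after
// everything it depends on (depth-first post-order). It assumes the graph
// has no cycles.
func createOrder(deps map[string][]string) []string {
	ids := make([]string, 0, len(deps))
	for id := range deps {
		ids = append(ids, id)
	}
	sort.Strings(ids) // deterministic traversal

	var order []string
	seen := map[string]bool{}
	var visit func(id string)
	visit = func(id string) {
		if seen[id] {
			return
		}
		seen[id] = true
		for _, d := range deps[id] {
			visit(d)
		}
		order = append(order, id)
	}
	for _, id := range ids {
		visit(id)
	}
	return order
}

// destroyOrder is the creation order reversed: dependents go first.
func destroyOrder(deps map[string][]string) []string {
	order := createOrder(deps)
	for i, j := 0, len(order)-1; i < j; i, j = i+1, j-1 {
		order[i], order[j] = order[j], order[i]
	}
	return order
}

func main() {
	deps := map[string][]string{
		"alpha.alpha_token.A": nil,
		"beta.beta_record.B":  {"alpha.alpha_token.A"},
		"alpha.alpha_token.C": {"alpha.alpha_token.A", "beta.beta_record.B"},
	}
	for _, id := range destroyOrder(deps) {
		fmt.Println("-", id) // C, then B, then A
	}
}
```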

6. Generate constructors from a provider schema

nivis gen turns any provider's schema into typed Nix constructors:

go run ./cmd/nivis gen -- --provider ./bin/provider-alpha --out ./generated
cat ./generated/alpha/alpha_token.nix

The generated constructor requires the provider's required inputs (throwing a named error if missing), passes optional inputs through, omits computed-only attributes (they're outputs), and accepts an overrides argument so you can adjust the generated output.
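The rules in that paragraph can be restated as a small Go function. This is an illustration of the rules only: the generated constructors are Nix, and the attr struct, the build function, the error wording, and the example schema are all hypothetical.

```go
package main

import "fmt"

// attr is a pared-down schema attribute. The three flags mirror the ones a
// provider schema reports; the struct itself is hypothetical.
type attr struct {
	Name                         string
	Required, Optional, Computed bool
}

// build applies the constructor rules: required inputs must be present,
// optional inputs pass through when set, computed-only attributes are
// outputs and are never sent, and overrides are merged in last.
func build(resource string, schema []attr, config, overrides map[string]any) (map[string]any, error) {
	out := map[string]any{}
	for _, a := range schema {
		v, set := config[a.Name]
		switch {
		case a.Required && !set:
			return nil, fmt.Errorf("%s: missing required input %q", resource, a.Name)
		case a.Computed && !a.Optional && !a.Required:
			continue // output only: never sent as an input
		case set:
			out[a.Name] = v
		}
	}
	for k, v := range overrides {
		out[k] = v
	}
	return out, nil
}

func main() {
	// A made-up schema for illustration; not read from a real provider.
	schema := []attr{
		{Name: "from", Required: true},
		{Name: "ttl", Optional: true},
		{Name: "endpoint", Computed: true},
	}

	_, err := build("example_record.demo", schema, map[string]any{}, nil)
	fmt.Println(err)

	inputs, _ := build("example_record.demo", schema,
		map[string]any{"from": "rec-1"}, map[string]any{"ttl": 30})
	fmt.Println(inputs)
}
```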

7. A real provider (AWS)

Everything above is offline against the fakes. The same nivis commands drive real providers: nivis resolves a provider by address from the OpenTofu registry, downloads and checksum-verifies the binary, negotiates the plugin protocol (AWS speaks v5), configures it, and runs plan/apply/destroy. The example nix/example/aws.nix (flake attr nivis.aws) declares the hashicorp/aws provider with mkProvider and one aws_s3_bucket.

⚠️ This creates a real resource in your AWS account: one (free-tier) S3 bucket, then destroys it. The provider's region lives in the Nix config; only credentials come from the environment (the AWS SDK default chain), so set AWS_PROFILE (or AWS_ACCESS_KEY_ID/…). The first run downloads the ~900 MB AWS provider (cached afterwards).

For the full, hand-held walkthrough (prerequisites, writing the config line by line, plan/apply/inspecting state/destroy, and troubleshooting) follow the AWS S3 tutorial.

Where to go next

  • IR-CONTRACT.md + ir-schema.json: the IR, the stable contract between the Nix frontend and the Go executor.
  • TESTING.md: the test layers and the headline two-provider e2e.
  • DESIGN.md: why the architecture is the way it is (spawn-not-link, batch-not-live, phased re-eval to a fixpoint).

The core test suite is hermetic (fakes, no network/credentials); real-provider support (registry download + checksum verification, tfprotov5/6) is proven against AWS as shown above.

Tutorial: an S3 bucket on AWS

A genuinely from-scratch walkthrough. You start in an empty directory on your own machine (not a checkout of Nivis), install the nivis CLI, scaffold a fresh flake that uses Nivis as a dependency, declare one S3 bucket, and drive it through plan → apply → inspect → destroy. By the end you'll have a small infra flake you own and a real bucket created and torn down.

⚠️ This creates a real resource in your AWS account: a single S3 bucket (no objects, negligible cost) that you destroy at the end. The commands and outputs below come from real runs.

Prerequisites: Nix (with flakes enabled) on your PATH, and AWS credentials you can use locally.

Part 1: Install nivis

The CLI is nivis. You don't need to clone anything; the quickest path is to run it straight from the flake:

nix run github:wearetechnative/nivis#nivis -- --version

If you'd rather have nivis on your PATH for the rest of this tutorial, install it persistently or open a shell with it: see Installing Nivis for all the options (nix run, nix shell, nix profile install, building from a clone). The rest of this tutorial writes nivis …; if you chose the ad-hoc form, read that as nix run github:wearetechnative/nivis#nivis -- ….

Part 2: A fresh infra flake

2.1 Scaffold the flake

mkdir my-infra && cd my-infra
nix flake init

nix flake init drops a placeholder flake.nix (a hello package). Replace its contents with the infra flake below.

2.2 The boilerplate

{
  description = "My infrastructure, as Nix code (Nivis).";

  # Pull Nivis in as a dependency. `nix flake lock` (run automatically by
  # the first nivis command) records the exact revision in flake.lock.
  inputs.nivis.url = "github:wearetechnative/nivis";

  outputs =
    { self, nivis }:
    let
      # The Nivis Nix library: mkResource, mkProvider, toIR, str, …
      lib = nivis.lib;
    in
    {
      # `nivis` (the CLI) evaluates the attribute `nivis.plan` by default. It's a
      # function of the outputs ledger (the apply-time values fed back in each
      # phase).
      nivis.plan =
        ledger:
        let
          # A bucket. `bucket` (the name) is omitted, so AWS generates one.
          bucket = lib.mkResource {
            provider = "aws";
            type = "aws_s3_bucket";
            name = "demo"; # id becomes aws.aws_s3_bucket.demo
            config = {
              force_destroy = true; # let `nivis destroy` delete it even if non-empty
            };
          };

          # A text file whose CONTENT is generated by Nix, from the bucket's own
          # output. `bucket = bucket.refAttr "id"` makes the object depend on the
          # bucket; `content` is a Nix string embedding the bucket's generated
          # name, which doesn't exist until the bucket is applied. This is the
          # round trip (see below).
          note = lib.mkResource {
            provider = "aws";
            type = "aws_s3_object";
            name = "note";
            config = {
              bucket = bucket.refAttr "id";
              key = "hello-from-nix.txt";
              content = lib.str [
                "This file's content was generated by Nix.\n"
                "It is stored in the bucket named: "
                (bucket.refAttr "id")
                "\n"
              ];
              content_type = "text/plain";
            };
          };
        in
        lib.toIR {
          # --- providers --------------------------------------------------
          providers.aws = lib.mkProvider {
            source = "registry.opentofu.org/hashicorp/aws";
            config = {
              region = "eu-central-1"; # set in Nix, not via AWS_REGION
              # default_tags is a *list-nested* block in the AWS provider, so it
              # takes a list (a bare attrset is rejected at configure time).
              default_tags = [ { tags = { managed-by = "nivis"; }; } ];
            };
          };

          # --- resources --------------------------------------------------
          resources = [ bucket note ];

          inherit ledger;
        };
    };
}

Reading it:

  • inputs.nivis.url makes Nivis a dependency; lib = nivis.lib binds its Nix library.
  • nivis.plan is the attribute nivis evaluates by default: a function ledger → IR. (Name it something else and pass nivis plan --attr <name>.)
  • mkProvider declares the AWS provider: source is its registry address, region lives in Nix, and default_tags is a one-element list because that block is list-nested in the AWS provider.
  • mkResource declares the aws_s3_bucket (force_destroy for easy teardown; bucket omitted so AWS picks a unique name) and an aws_s3_object.
  • The object is the interesting part. bucket = bucket.refAttr "id" is a reference to the bucket's output, so Nivis knows the object depends on the bucket. And content = lib.str [ … (bucket.refAttr "id") … ] builds the file's body in Nix, embedding the bucket's name, a value that does not exist until the bucket is applied. See A file whose content comes from Nix below.

2.3 Credentials

nivis uses the AWS SDK default credential chain (the same one the AWS CLI uses). Point it at your account, typically a named profile:

export AWS_PROFILE=your-profile
aws sts get-caller-identity   # sanity check

Only credentials come from the environment; the region is in the flake above.

Part 3: Plan, apply, inspect, destroy

Run these from your my-infra directory.

Plan

nivis plan
+ aws.aws_s3_bucket.demo (aws_s3_bucket)
+ aws.aws_s3_object.note (aws_s3_object)

2 resource(s) to resolve across phases (+ create, ~ change). Run `nivis apply`.

The first nivis command resolves the nivis input (writing flake.lock) and, on first use of a real provider, downloads it, so the first run is slower.

Apply

nivis apply
Applied 2 resource(s) across 2 phase(s):
  ✓ aws.aws_s3_bucket.demo
  ✓ aws.aws_s3_object.note

Two phases, not one. Phase 1 creates the bucket: nothing else can run, because the object's content needs the bucket's name. Nivis then re-evaluates the Nix config with the bucket's generated name injected, which resolves the object's content; phase 2 uploads it. nivis writes the resulting state to nivis.state.json in my-infra.

Inspect the round trip

nivis state show aws.aws_s3_bucket.demo
  arn = arn:aws:s3:::terraform-20260615181937557000000001
  tags_all = map[managed-by:nivis]
  force_destroy = true
  region = eu-central-1
  id = terraform-20260615181937557000000001
  …

The bucket name (terraform-2026…) was generated by AWS and read back into state: a value that didn't exist until apply is now concrete. tags_all shows the provider's default_tags were applied.

A file whose content comes from Nix

This is the point of the project. Look at the object:

nivis state show aws.aws_s3_object.note
  bucket = terraform-20260615181937557000000001
  key = hello-from-nix.txt
  content_type = text/plain
  content = This file's content was generated by Nix.
  id = terraform-20260615181937557000000001/hello-from-nix.txt

The content was built in Nix, and it embeds the bucket's AWS-generated name, which did not exist until phase 1 created the bucket. Fetch the actual object from S3 to see it landed in the real world:

aws s3 cp "s3://$(nivis state show aws.aws_s3_bucket.demo | awk '/^  id = /{print $3}')/hello-from-nix.txt" -
This file's content was generated by Nix.
It is stored in the bucket named: terraform-20260615181937557000000001

A value computed in the Nix domain became the body of a real resource in the Terraform/AWS domain, resolved across phases. That round trip, not just "Terraform from Nix," but Nix and provider state feeding each other, is why Nivis exists.

Destroy

nivis destroy
Destroyed 2 resource(s):
  - aws.aws_s3_object.note
  - aws.aws_s3_bucket.demo

Resources tear down in reverse dependency order: the object first, then the bucket. Confirm nothing's left:

aws s3api list-buckets --query 'Buckets[?contains(Name, `terraform-`)].Name'

Make it your own

  • A specific bucket name: add bucket = "globally-unique-name"; to the resource config.
  • A different region: change region in the provider config.
  • More resources: add more mkResource entries to the resources list; wire one resource's output into another with the reference helpers (refAttr) and Nivis resolves them across phases.
  • Pin Nivis: the input floats on the default branch; flake.lock pins the exact revision. Re-pin deliberately with nix flake update nivis.

Troubleshooting

  • NoCredentialProviders / could not find credentials: the SDK chain found nothing. Set AWS_PROFILE (or the access-key vars) and confirm with aws sts get-caller-identity.
  • expected array … got map[string]interface {} for a provider block: that block is list-nested; wrap it in a list, e.g. default_tags = [ { tags = { … }; } ].
  • First run is slow / seems to hang: it's resolving the flake input and downloading the ~900 MB AWS provider once; later runs use the cache.
  • BucketAlreadyExists: you set an explicit bucket name someone already owns (S3 names are global). Omit bucket, or pick another.
  • nivis can't find your flake: run nivis from the directory containing flake.nix, or pass nivis plan --flake /path/to/my-infra.
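The map[string]interface {} in that second error is Go's name for a decoded JSON object, which is what a bare Nix attrset becomes in the IR. A small sketch of such a shape check follows; the asBlockList helper and its message wording are illustrative, not the executor's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// asBlockList accepts the decoded JSON value of a list-nested block and
// returns its elements, or an error naming the Go type it found instead.
func asBlockList(name string, v any) ([]any, error) {
	list, ok := v.([]any)
	if !ok {
		return nil, fmt.Errorf("%s: expected array, got %T", name, v)
	}
	return list, nil
}

func main() {
	var bare, wrapped any
	// A bare attrset serialises to a JSON object, a one-element list to an array.
	_ = json.Unmarshal([]byte(`{"tags": {"managed-by": "nivis"}}`), &bare)
	_ = json.Unmarshal([]byte(`[{"tags": {"managed-by": "nivis"}}]`), &wrapped)

	_, err := asBlockList("default_tags", bare)
	fmt.Println(err) // default_tags: expected array, got map[string]interface {}

	blocks, err := asBlockList("default_tags", wrapped)
	fmt.Println(len(blocks), err) // 1 <nil>
}
```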

Tutorial: a NixOS machine on EC2

This goes further than the S3 tutorial: you build a NixOS image in Nix, register it as an AMI, and launch it as an EC2 instance, and the entire AWS pipeline (upload, import, register, launch) is driven by Nivis. The machine runs a tiny web server, and you verify the running instance answers HTTP 200 on port 80, then tear it all down.

⚠️ This creates real, billable AWS resources (an EBS snapshot, an AMI, and a t3.micro instance) and uploads a ~2 GB image to S3. The walkthrough destroys everything at the end. Credentials come from the environment (AWS_PROFILE); the region is in the Nix config.

The shape:

nix build  ──►  NixOS amazon image (a .vhd, nginx baked in)
                      │
   Nivis ── aws_iam_role + policy (vmimport)      the VM-import service role
         ── aws_s3_bucket + aws_s3_object         the .vhd uploaded to S3
         ── aws_ebs_snapshot_import               S3 .vhd → EBS snapshot
         ── aws_ami                               register the snapshot as an AMI
         ── aws_security_group                    ingress :80
         ── aws_instance                          launch it  (public_ip → Nix)
                      │
   curl http://<public_ip>/  ──►  200

This mirrors elastinix (the wearetechnative NixOS-on-AWS flake) and its Terraform module, with Nivis driving the pipeline in place of Terraform.

Part 1: The OS and the infra in one file

The key idea: the image and the infrastructure live in the same Nix file. A NixOS "amazon image" is itself a Nix derivation, config.system.build.images.amazon, a .vhd disk image of a machine configuration, so you reference its build output directly as aws_s3_object.source. When you run nivis apply, the image is realised (built) before the upload, and its store path flows straight into aws_s3_object.source. One expression defines the OS and the cloud resources that ship it: that two-domain mix is the whole point of this tutorial.

{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-25.05";
  inputs.nivis.url = "github:wearetechnative/nivis";

  outputs = { self, nixpkgs, nivis }:
    let
      # --- domain 1: the OS, built in Nix (nginx, returns 200 on :80) -----
      image = (nixpkgs.lib.nixosSystem {
        system = "x86_64-linux";
        modules = [ ({ modulesPath, ... }: {
          imports = [ (modulesPath + "/virtualisation/amazon-image.nix") ];
          services.nginx.enable = true;
          services.nginx.virtualHosts."_".locations."/".return =
            ''200 "hello from NixOS on EC2, built and launched by Nivis\n"'';
          networking.firewall.allowedTCPPorts = [ 80 ];
          system.stateVersion = "25.05";
        }) ];
      }).config.system.build.images.amazon;

      # --- domain 2: the infra, as Nivis resources, fed by that image -----
      pipeline = import (nivis + "/nix/example/ec2.nix") {
        nivis = nivis.lib;
        nixosImage = image;   # `drv image` -> aws_s3_object.source (realised at apply)
      };
    in {
      nivis.plan = ledger: pipeline (ledger // { vars.suffix = "demo"; });
    };
}

nivis apply builds image first (the heavy step, the one that uses the Nix binary cache, ≈2 GB), then drives the AWS pipeline; everything after the build is pure AWS. (This repo's nix/example/ec2.nix + the nivis.ec2 flake attr are exactly this, ready to run.)

Part 2: The Nivis pipeline

Here is the whole AWS side: a function of the built image and the outputs ledger, returning the IR. It's the contents of this repo's nix/example/ec2.nix (exposed as the flake attr nivis.ec2); drop it in your own repo and import it as shown in Part 1, or inline it.

{
  nivis,        # the Nivis library (mkResource / mkProvider / toIR / drv)
  nixosImage,   # the NixOS amazon image derivation from Part 1
}:
ledger:
let
  inherit (nivis) mkResource mkProvider toIR drv;

  suffix = (ledger.vars or { }).suffix or "demo";
  bucketName = "nivis-ec2nix-${suffix}";

  # The OS crossing into the infra: `drv` marks the image as a build output,
  # a __build leaf that `nivis apply` realises (builds) before uploading, then
  # substitutes the concrete .vhd path. No manual store-path interpolation; no
  # separate `nix build`. (drv uses the image's passthru.filePath, the .vhd.)
  imgSource = drv nixosImage;

  # --- the vmimport service role AWS requires to import a disk image ---------
  role = mkResource {
    provider = "aws"; type = "aws_iam_role"; name = "vmimport";
    config = {
      name = "nivis-vmimport-${suffix}";
      assume_role_policy = builtins.toJSON {
        Version = "2012-10-17";
        Statement = [{
          Effect = "Allow";
          Principal.Service = "vmie.amazonaws.com";
          Action = "sts:AssumeRole";
          Condition.StringEquals."sts:Externalid" = "vmimport";
        }];
      };
    };
  };

  policy = mkResource {
    provider = "aws"; type = "aws_iam_policy"; name = "vmimport";
    config = {
      name = "nivis-vmimport-${suffix}";
      policy = builtins.toJSON {
        Version = "2012-10-17";
        Statement = [
          {
            Effect = "Allow";
            Action = [ "s3:GetBucketLocation" "s3:GetObject" "s3:ListBucket" "s3:PutObject" "s3:GetBucketAcl" ];
            Resource = [ "arn:aws:s3:::${bucketName}" "arn:aws:s3:::${bucketName}/*" ];
          }
          {
            Effect = "Allow";
            Action = [ "ec2:ModifySnapshotAttribute" "ec2:CopySnapshot" "ec2:RegisterImage" "ec2:Describe*" ];
            Resource = "*";
          }
        ];
      };
    };
  };

  attach = mkResource {
    provider = "aws"; type = "aws_iam_role_policy_attachment"; name = "vmimport";
    config = { role = role.refAttr "name"; policy_arn = policy.refAttr "arn"; };
  };

  # --- upload the built .vhd to S3 ------------------------------------------
  bucket = mkResource {
    provider = "aws"; type = "aws_s3_bucket"; name = "image";
    config = { bucket = bucketName; force_destroy = true; };
  };

  image = mkResource {
    provider = "aws"; type = "aws_s3_object"; name = "image";
    config = { bucket = bucket.refAttr "id"; key = "nixos.vhd"; source = imgSource; };
  };

  # --- S3 .vhd -> EBS snapshot ----------------------------------------------
  # disk_container and user_bucket are LIST-nested blocks, so each is a
  # one-element list (a bare attrset is rejected at apply).
  snapshot = mkResource {
    provider = "aws"; type = "aws_ebs_snapshot_import"; name = "nixos";
    config = {
      role_name = role.refAttr "name";
      disk_container = [{
        format = "VHD";
        user_bucket = [{ s3_bucket = bucket.refAttr "id"; s3_key = "nixos.vhd"; }];
      }];
    };
  };

  # --- register the snapshot as a bootable AMI ------------------------------
  ami = mkResource {
    provider = "aws"; type = "aws_ami"; name = "nixos";
    config = {
      name = "nivis-ec2nix-${suffix}";
      virtualization_type = "hvm";
      root_device_name = "/dev/xvda";
      ena_support = true;
      ebs_block_device = [{ device_name = "/dev/xvda"; snapshot_id = snapshot.refAttr "id"; }];
    };
  };

  # --- a security group opening port 80 -------------------------------------
  sg = mkResource {
    provider = "aws"; type = "aws_security_group"; name = "web";
    config = {
      name = "nivis-ec2nix-web-${suffix}";
      description = "Nivis EC2+NixOS demo: allow HTTP";
      ingress = [{ from_port = 80; to_port = 80; protocol = "tcp"; cidr_blocks = [ "0.0.0.0/0" ]; description = "http"; }];
      egress  = [{ from_port = 0;  to_port = 0;  protocol = "-1";  cidr_blocks = [ "0.0.0.0/0" ]; description = "all";  }];
    };
  };

  # --- launch the AMI -------------------------------------------------------
  instance = mkResource {
    provider = "aws"; type = "aws_instance"; name = "web";
    config = {
      ami = ami.refAttr "id";
      instance_type = "t3.micro";
      vpc_security_group_ids = [ (sg.refAttr "id") ];
      tags = { Name = "nivis-ec2nix-${suffix}"; managed-by = "nivis"; };
    };
  };
in
toIR {
  providers.aws = mkProvider {
    source = "registry.opentofu.org/hashicorp/aws";
    config = { region = "eu-central-1"; };
  };
  resources = [ role policy attach bucket image snapshot ami sg instance ];
  inherit ledger;
}

Reading the chain: image's source is the built .vhd path (Part 1): the OS crossing into the infra. Every later resource references the previous one's output with refAttr (a __ref), so Nivis resolves the chain across phases: the snapshot import waits on the upload, the AMI on the snapshot, the instance on the AMI. The IAM role + policy create the vmimport service role AWS requires for disk-image import. The only per-deployment knob is suffix (unique resource names).
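One way to estimate how many phases a chain like this needs is to compute each resource's reference depth. The Go sketch below is a simplified model, not the executor's scheduler: it assumes a resource can be applied one phase after the deepest resource it reads with refAttr, and it uses short names in place of the full resource ids. The edges are the refAttr references written in the pipeline above.

```go
package main

import "fmt"

// refs lists, per resource, the resources it reads with refAttr in the
// pipeline above (short names stand in for the full resource ids).
var refs = map[string][]string{
	"role":     nil,
	"policy":   nil,
	"bucket":   nil,
	"sg":       nil,
	"attach":   {"role", "policy"},
	"image":    {"bucket"},
	"snapshot": {"role", "bucket"},
	"ami":      {"snapshot"},
	"instance": {"ami", "sg"},
}

// phaseOf returns 1 for a resource with no references, otherwise one more
// than the deepest resource it references. It assumes an acyclic graph.
func phaseOf(id string, refs map[string][]string, memo map[string]int) int {
	if p, ok := memo[id]; ok {
		return p
	}
	p := 1
	for _, dep := range refs[id] {
		if d := phaseOf(dep, refs, memo) + 1; d > p {
			p = d
		}
	}
	memo[id] = p
	return p
}

// phaseCount is the depth of the deepest resource.
func phaseCount(refs map[string][]string) int {
	memo := map[string]int{}
	deepest := 0
	for id := range refs {
		if p := phaseOf(id, refs, memo); p > deepest {
			deepest = p
		}
	}
	return deepest
}

func main() {
	fmt.Println("phases:", phaseCount(refs)) // phases: 4
}
```

Under this model the role, policy, bucket, and security group land in phase 1, the attachment, object, and snapshot import in phase 2, the AMI in phase 3, and the instance in phase 4.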

Part 3: Apply

export AWS_PROFILE=your-profile
nivis plan      # 9 resources to create across phases
nivis apply     # build the image, upload (~2 GB), import, register, launch

Because source is a drv (__build) leaf, nivis apply builds the image itself before uploading it: no separate nix build step. The image build is the heavy part (it uses the Nix binary cache, ≈2 GB); everything after is pure AWS. (Pass --build=false to skip realising if you've pre-built.)

A real run of this pipeline (verified against AWS) resolves across four phases, because the AWS chain can't all happen at once:

Applied 9 resource(s) across 4 phase(s):
  ✓ aws.aws_iam_role.vmimport
  ✓ aws.aws_iam_policy.vmimport
  ✓ aws.aws_s3_bucket.image
  ✓ aws.aws_security_group.web
  ✓ aws.aws_iam_role_policy_attachment.vmimport
  ✓ aws.aws_s3_object.image          # the ~2 GB NixOS .vhd
  ✓ aws.aws_ebs_snapshot_import.nixos
  ✓ aws.aws_ami.nixos
  ✓ aws.aws_instance.web

Read the instance's public address back out of state and check it serves:

nivis state show aws.aws_instance.web    # public_ip / public_dns / instance_state
curl -sS -o /dev/null -w '%{http_code}\n' "http://<public_ip>/"
# 200

The instance boots, nginx comes up on port 80, and returns 200: a machine whose OS you built in Nix, registered as an AMI through Nivis, and launched, all from one flake. (Give it a minute after apply: the instance has to boot before nginx answers.)

That public_ip did not exist until AWS launched the instance; it was read back into state (and is available to Nix for dependent config).

Tear it all down (reverse dependency order: instance, AMI, snapshot, bucket, role):

nivis destroy

Notes

  • Cost & safety: a t3.micro is cheap, but don't leave it running; nivis destroy removes everything this created. The EBS snapshot import takes a few minutes; that's AWS, not Nivis.
  • The vmimport role: AWS requires this specific service role for disk-image import; the example creates it (and a least-privilege policy) so the pipeline is self-contained. If your account already has a vmimport role, point aws_ebs_snapshot_import.role_name at it instead.
  • Production: for a real fleet, use elastinix: it owns the image-build + upload pipeline and a maintained module. This tutorial shows the mechanism, end to end, driven entirely by Nivis.

DESIGN.md: Nivis architecture & decisions

This is the decision ledger. It exists so that a future session does not re-derive (or undo) conclusions that were expensive to reach. Each decision records the choice, the reasoning, and the alternative that was rejected.

D1. Don't fork OpenTofu; drive the provider plugin protocol

Decision. Build a minimal Go engine that speaks the Terraform plugin protocol (tfprotov6, over HashiCorp go-plugin/gRPC) to provider binaries. Use terraform-plugin-go as the dependency and read OpenTofu's internal/plugin as the reference for how to use it.

Why. Forking to "strip what we don't need" inverts the cost. The parser and HCL loader are the small, easily-replaced parts (Nix replaces them). The provider plugin client, state engine, dependency graph, and DAG scheduler are the large parts, and we need those, so we'd inherit exactly the maintenance burden we wanted to avoid. The protocol is stable; the config frontend is the part we're actually changing.

Rejected. Forking OpenTofu and removing HCL. Higher ongoing cost, no upside.

D2. Spawn provider binaries; don't link them

Decision. Launch the upstream provider binary as a subprocess and talk tfprotov6 to it.

Why / prior art. The Pulumi Terraform Bridge is the closest prior art and is worth mining, but it makes the opposite choice here, and the contrast is instructive. Pulumi does not use provider binaries; it compiles the provider's Go modules (against a forked plugin SDK) into its own provider binary, per-provider, with a shim. That buys Pulumi tighter integration at the cost of a per-provider build and a maintained SDK fork. Our headline goal is universal support for all existing providers with zero per-provider work, which spawn-not-link delivers directly. So we deliberately diverge from Pulumi here. Do not refactor toward the link model.

Mine from Pulumi instead: its schema type-mapping (required/optional/computed/sensitive, sets vs lists, nested blocks), its ProviderInfo/overlay pattern (raw schema→code is usable but not idiomatic, so plan an override seam), and how it encodes unknown values to the provider during plan/diff (relevant to D4 below).

D3. Nix as a batch frontend; resolution by phased re-evaluation to a fixpoint

Decision. Nix evaluates configuration to a JSON IR. Cross-resource and cross-domain references that aren't yet known are emitted as typed placeholders. The Go executor applies what it can, collects real outputs, and the system re-evaluates Nix with those outputs injected, repeating until a fixpoint (no phase produces a new resolved value). Two phases is the shallow case; deeper Nix-mediated dependency chains need more. We explicitly support N phases.

Why. This is the central constraint of the whole project. Nix evaluation is a single forward batch pass that completes or errors: there is eval-time, then build/apply-time, and they are separate. A value a provider computes at apply (an IP, an ID, a generated secret) does not exist at eval time. Anything Nix must compute from that value (a hostname string, a NixOS option, another resource's input) therefore cannot be produced in the same evaluation. The only faithful way to feed apply-time values back into Nix is to evaluate again with them in scope.

The two flavors of reference (the executor must distinguish):

  • TF→TF: resource A's output feeds resource B's input. Resolved inside the executor during apply; no re-eval needed.
  • *→Nix: a Nix expression computes something from an apply-time value (and that result may feed further resources). Requires re-eval with the value injected. This is what drives phase count.

Why not Pulumi's elegant model. Pulumi represents not-yet-known values as Output<T> (a promise) and resolves them in-process as the program runs, because a Pulumi program is a live running process the engine can feed values back into. Nix has no promise, no suspend/resume, no live runtime to re-enter. Pulumi's model is unavailable to us not because it's cleverer but because its substrate is a different kind of thing. Our phased re-eval is the honest Nix-shaped equivalent, not a workaround to feel bad about.

Rejected (for now). "Option B": a live evaluator the engine drives via suspend/resume (libexpr internals). Elegant in theory, fragile and effectively unsupported in practice. Our phased loop converges toward B's expressiveness as iterations grow, without B's dependence on Nix internals. Revisit only if re-eval cost becomes a measured problem.

D4. The IR is the single frozen contract

Decision. IR-CONTRACT.md defines the JSON IR. It is the API between the Nix library (Epic 1), the codegen (Epic 2), and the executor (Epic 3/3.5). Breaking changes require an OpenSpec change to the contract first.

Why. Three workstreams depend on it; once it is stable, they can progress in parallel. An underspecified linchpin is how this kind of project fragments. The hard parts must therefore be decided in the contract, not improvised per-epic: reference encoding (nested attrs, list/set indices, refs inside for_each/count), for_each/count expansion timing (Nix expands, executor receives concrete resources), unknown-value representation toward the provider, and how sensitive values cross the JSON boundary without landing in world-readable nix eval output or the Nix store.

D5. Prove the round trip before building breadth

Decision. Critical path is: Nix lib core → IR contract → executor that drives one (fake) provider through plan/apply → the phased-eval loop → the two-provider e2e. General schema codegen for arbitrary providers and registry integration come after the thesis is proven.

Why. The conceptual risk lives entirely in the round trip and the phased loop. Codegen is breadth (how we reach "all providers"), not risk. Hand-written constructors for the fake providers are enough to validate everything. Building the generation machinery first means a lot of code before a single resource round-trips.

D6. Hermetic testing via in-repo fake providers

Decision. Write minimal Go binaries that speak tfprotov6 and return canned/computed values (no real APIs, no credentials, no network). The executor drives them exactly as it would a real provider.

Why. Proves the protocol client and the whole pipeline deterministically and offline, essential given the restricted network, and the right substrate for the headline e2e. Real-provider runs are low conceptual risk and network-gated; they are out of scope for the PoC and tracked as a separate bean.

D7. Flake apps use nixpkgs; the library stays input-free

Decision. The flake exposes packages/apps for the nivis and nivis gen CLIs, built with nixpkgs buildGoModule (Go toolchain from a pinned nixpkgs input, module deps pinned by a committed vendorHash). The library outputs (lib, nivis.*) remain pure builtins and do not depend on the nixpkgs input: evaluating them imports nothing from nixpkgs.

Why. Originally the flake took no inputs at all, so the library evaluated without the binary cache (the configuration frontend must be cheap to evaluate every phase, and the cache was unreachable). A runnable CLI needs a real Go toolchain, which means nixpkgs. The refinement keeps the property that actually matters (the configuration-frontend outputs never force nixpkgs) while letting nix run .#nivis build the executor from source. The two concerns are kept separate in flake.nix: only packages/apps touch nixpkgs.

Rejected. flake-utils (replaced by a few lines of Nix that enumerate systems); a committed vendor/ directory (a one-line vendorHash keeps the repo lean). Keeping the CLIs go-build-only was the prior state; nix run is strictly additive: go build/go run still work unchanged.

Nivis vs the usual suspects

How does Nivis relate to the other tools people reach for when they want infrastructure-as-code, especially in (or near) the Nix world? This page is an honest comparison: where Nivis is genuinely different, and where it is young and the others are mature.

Maturity, stated plainly. Nivis is early (0.x, alpha). The round trip works across two providers and real providers (AWS today) apply / update / replace / destroy, but it has not been run at scale, the surface is small, and the contracts (the IR, the flake interface) are the stable parts while the rest moves. Everything below should be read with that in mind: a tool is not "better" than CloudFormation because a feature table has more checkmarks in its column. Maturity, ecosystem, and operational track record are features too, and there the established tools lead.

The one-line positioning

| Tool | One line |
| --- | --- |
| Nivis | Terraform/OpenTofu provider resources as first-class Nix values, driven by a thin Go executor that spawns unmodified provider binaries. Nix is the config; the provider does the work. |
| OpenTofu / Terraform | The provider ecosystem and engine. HCL config, its own state, a huge provider registry. Mature, ubiquitous. |
| Terranix | Generates HCL/JSON from Nix, then hands it to Terraform/OpenTofu to run. Nix as an HCL generator. |
| NixOps 4 | Nix-native deployment orchestrator (the NixOps line, reworked around a resource/provider model). Nix-centric, NixOS-deployment heritage. |
| Pulumi | Real programming languages (TS, Python, Go, …) for IaC. Reuses Terraform providers by compiling them into per-provider plugins via its bridge. |
| AWS CDK | Real languages that synthesize CloudFormation. AWS-first; CDKTF variant synthesizes Terraform. |
| CloudFormation | AWS's native, declarative, AWS-only IaC service. Managed state, deep AWS integration. |

What actually makes Nivis different

Three choices that none of the other tools combine:

  1. Provider resources are first-class Nix values. Not generated HCL (Terranix), not a separate program (Pulumi/CDK), not a bespoke resource DSL. You write mkResource { … } and wire outputs with refAttr in plain Nix.

  2. Spawn unmodified providers; do not link. Nivis talks the Terraform plugin protocol (tfprotov5/v6) over gRPC to the same provider binaries OpenTofu uses. Contrast Pulumi, which compiles each provider's Go into its own plugin via a bridge and a maintained SDK fork, per provider. Spawn-not-link is what buys universal, zero-per-provider compatibility with the OpenTofu ecosystem, at the cost of the tighter integration Pulumi's bridge gives.

  3. The round trip via phased re-evaluation. A provider-created value (an IP, an ID, a generated secret) flows back into Nix, which re-evaluates to produce dependent config, repeating to a fixpoint. Pulumi gets a live Output<T> promise model for free because a Pulumi program is a running process; Nix is a batch evaluator with no live runtime, so Nivis does the honest Nix-shaped thing (re-eval to a fixpoint) rather than pretending to have promises. Terranix has no round trip at all: it generates HCL once and stops.

The closest neighbor by intent is Terranix (Nix + the Terraform provider ecosystem); the closest by mechanism for reusing providers is Pulumi (both ride Terraform providers). Nivis sits between them and matches neither: it is Nix-native like Terranix and provider-reusing like Pulumi, but it neither generates HCL (as Terranix does) nor links providers (as Pulumi does).

Feature comparison

Legend: ✅ yes · ⚠️ partial / with caveats · ❌ no · n/a not applicable.

Essential features

FeatureNivisOpenTofu/TFTerranixNixOps 4PulumiCDKCloudFormation
Config languageNixHCLNix → HCLNixTS/Py/Go/…TS/Py/…YAML/JSON
Reuses Terraform/OpenTofu providers✅ (spawn)✅ (native)✅ (via TF)⚠️✅ (bridge)⚠️ (CDKTF)
Multi-cloud / any provider⚠️⚠️❌ (AWS)
Plan / preview before apply✅ (via TF)⚠️✅ (change sets)
Apply / update / replace / destroy✅ (via TF)
State management✅ (local)✅ (TF)n/a (CFN)✅ (managed)
Outputs feed back into config (round trip)✅ (phased re-eval)⚠️ (HCL refs, no host-lang feedback)⚠️✅ (Output<T>)⚠️⚠️
Typed/validated config✅ (Nix + schema codegen)✅ (Nix)✅ (host lang)⚠️
Modules / composition✅ (Nix modules)✅ (Nix)⚠️ (nested stacks)
Mix OS build + cloud in one expr✅ (NixOS image → AMI)⚠️

Enterprise / operational features

This is where Nivis is youngest. Honest status:

FeatureNivisOpenTofu/TFPulumiCloudFormation
Remote / shared state backends❌ (local only, today)✅ (Pulumi Cloud + self-host)✅ (managed)
State locking❌ (today)
Drift detection / refresh⚠️ (refresh)
Policy as code / guardrails⚠️ (Sentinel/OPA)✅ (CrossGuard)✅ (Guard/SCP)
Secrets handling across the boundary✅ (sensitive refs, 0600 ledger)
RBAC / teams / audit (hosted)✅ (TFC/Enterprise)✅ (Pulumi Cloud)✅ (IAM/CloudTrail)
Provider registry / auto-download⚠️ (planned; offline by default)n/a
Production track record / scale❌ (alpha)
Commercial support✅ (vendors)✅ (AWS)

Licensing (a real differentiator)

| Tool | License posture |
| --- | --- |
| Nivis | Own code Apache-2.0; vendored Terraform-protocol files are MPL-2.0. No BUSL anywhere. |
| OpenTofu | MPL-2.0 (the open fork created after Terraform's BUSL relicense). |
| Terraform | BUSL-1.1 (source-available) since v1.6. |
| Terranix | Open source (MIT); generates HCL for whichever engine you run. |
| Pulumi | Apache-2.0 core; Pulumi Cloud is a commercial service. |
| CDK / CloudFormation | CDK Apache-2.0; CloudFormation is an AWS service. |

When to pick what

  • You live in Nix and want real, multi-cloud infra with provider outputs feeding back into your Nix config: Nivis is the only tool aimed squarely at that, but accept the alpha maturity.
  • You want Nix to author config but run it through battle-tested tooling: Terranix (Nix generates HCL, OpenTofu/Terraform runs it). No round trip, but mature and boring in the good way.
  • You want a mature engine and the biggest provider ecosystem, HCL is fine: OpenTofu (open) or Terraform (BUSL).
  • You want general-purpose languages and a hosted control plane: Pulumi.
  • You are AWS-only and want native, deeply-integrated IaC: CloudFormation, or CDK if you want a real language synthesizing it.
  • You deploy NixOS machines and want a Nix-native orchestrator: NixOps 4.

Sources (re-verify against these)

External facts above (versions, licenses, features of other tools) drift. When re-checking, confirm against the upstream docs and update the last-verified date at the top of this file:

Nivis's own claims are grounded in this repo: docs/DESIGN.md (the spawn-not-link and phased-eval decisions) and docs/OVERVIEW.md (the round trip).

docs/IR-CONTRACT.md: the frozen IR

This is the contract between the Nix library (Epic 1), codegen (Epic 2), and the executor (Epic 3 / 3.5). It is an API. Changing its shape requires an OpenSpec change to this document first, then downstream updates. Version it.

schemaVersion is bumped on any breaking change.

Top-level shape

{
  "schemaVersion": 1,
  "providers": {
    "<provider-id>": {                 // e.g. "alpha", "registry.opentofu.org/x/alpha"
      "source": "<source-or-path>",    // for PoC: filesystem path to the binary
      "config": { /* provider config, may contain refs */ }
    }
  },
  "resources": [
    {
      "id": "<provider>.<type>.<name>",       // stable identity, unique
      "provider": "<provider-id>",
      "type": "<resource-type>",               // e.g. "alpha_token"
      "name": "<name>",
      "config": { /* attribute tree; leaves may be values, refs, or unknowns */ },
      "meta": {
        "dependsOn": ["<id>", ...],            // explicit edges (additive to implicit)
        "lifecycle": { "preventDestroy": false, "ignoreChanges": ["<path>"] }
        // count/for_each are NOT here: expansion already happened in Nix (see below)
      }
    }
  ],
  "edges": [
    { "from": "<id>", "to": "<id>", "via": "<config-path-in-to>" }  // dependency graph
  ],
  "nixConsumers": [
    // values Nix computed FROM resource outputs, surfaced for the round trip.
    // On a given phase these may still be unknown; they become concrete once
    // their inputs are resolved and Nix is re-evaluated.
    { "id": "<consumer-id>", "value": { /* tree; leaves may be values/refs/unknowns */ } }
  ]
}

Reference encoding (the core of the contract)

A not-yet-known cross-resource or cross-domain value is a typed ref leaf:

{ "__ref": { "resource": "<id>", "path": ["attr"] } }                 // scalar attr
{ "__ref": { "resource": "<id>", "path": ["net", 0, "ip"] } }         // nested + list index
{ "__ref": { "resource": "<id>", "path": ["tags", "env"] } }          // map key
  • path is an ordered list of string keys / integer indices into the source resource's output object. This covers nested attributes, list/set indices, and map keys uniformly. (For sets, index is the post-apply stable ordering the provider returns.)
  • A ref inside an expanded for_each/count instance is just a normal ref whose resource is the concrete expanded id (e.g. alpha.alpha_token.web["a"] → id alpha.alpha_token.web__a). There is no special "expansion ref" because expansion is already done (below).
  • A ref whose target resource does not yet exist in state is unresolved, not an error, until fixpoint (Epic 3.5.3).
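The path rules above can be sketched as a small resolver. This is a minimal illustration in Python, not the executor's Go code: `resolve_ref`, `Unresolved`, and the sample `outputs` object are hypothetical names and data, and "unresolved" is modelled as an exception the caller may catch and retry in a later phase.

```python
from typing import Any


class Unresolved(Exception):
    """The ref target (or part of its path) is not present in the outputs yet."""


def resolve_ref(ref: dict, outputs: dict) -> Any:
    """Follow a __ref's path (string keys / integer indices) into the target's outputs."""
    body = ref["__ref"]
    if body["resource"] not in outputs:
        # Not an error by itself: the target may simply not be applied yet.
        raise Unresolved(body["resource"])
    node = outputs[body["resource"]]
    for step in body["path"]:
        try:
            node = node[step]  # works for both dict keys and list indices
        except (KeyError, IndexError, TypeError) as exc:
            raise Unresolved(f"{body['resource']} at {step!r}") from exc
    return node


# Hypothetical outputs for one expanded for_each instance.
outputs = {
    "alpha.alpha_token.web__a": {
        "id": "tok-1",
        "net": [{"ip": "10.0.0.5"}],
        "tags": {"env": "prod"},
    }
}

nested = {"__ref": {"resource": "alpha.alpha_token.web__a", "path": ["net", 0, "ip"]}}
print(resolve_ref(nested, outputs))  # prints 10.0.0.5
```

Treating nested attributes, list indices, and map keys with the same indexing step is what lets one `path` encoding cover all three cases.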

Ref classification (drives phase behavior, DESIGN D3)

The executor classifies each ref:

  • TF→TF: the ref appears in a resources[].config. Resolved in-executor when the target's output is known; does not require Nix re-eval.
  • *→Nix: the ref appears in a nixConsumers[].value, or a resources[].config leaf that Nix itself derived from another resource's output (Nix marks these, see "derived" below). Resolving these requires re-eval with the outputs ledger injected.
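A rough sketch of that classification, assuming only the leaf markers described in this contract. The function names and the sample IR (including the `peer` attribute, which the fake beta_record does not have) are hypothetical; the real executor does this in Go.

```python
def walk_markers(tree, path=()):
    """Yield (path, marker) for every __ref / __derived leaf in a config tree."""
    if isinstance(tree, dict):
        for marker in ("__ref", "__derived"):
            if marker in tree:
                yield path, marker
                return  # a marker object is a leaf; do not descend into it
        for key, value in tree.items():
            yield from walk_markers(value, path + (key,))
    elif isinstance(tree, list):
        for index, value in enumerate(tree):
            yield from walk_markers(value, path + (index,))


def classify(ir):
    """Return (owner id, path, class) triples for every placeholder in the IR."""
    found = []
    for res in ir.get("resources", []):
        for path, marker in walk_markers(res["config"]):
            # A plain __ref in a resource config is TF→TF; a __derived leaf is *→Nix.
            kind = "TF→TF" if marker == "__ref" else "*→Nix"
            found.append((res["id"], path, kind))
    for consumer in ir.get("nixConsumers", []):
        for path, _marker in walk_markers(consumer["value"]):
            # Anything a Nix consumer reads needs a re-eval to materialise.
            found.append((consumer["id"], path, "*→Nix"))
    return found


sample_ir = {
    "resources": [{
        "id": "beta.beta_record.B",
        "config": {
            "from": {"__derived": {"inputs": ["alpha.alpha_token.A.value"]}},
            "peer": {"__ref": {"resource": "alpha.alpha_token.A", "path": ["id"]}},
        },
    }],
    "nixConsumers": [{
        "id": "systemConfig",
        "value": {"tokenValue": {"__ref": {"resource": "alpha.alpha_token.A", "path": ["value"]}}},
    }],
}

for row in classify(sample_ir):
    print(row)
```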

Unknown values (toward the provider)

When the executor calls PlanResourceChange with inputs that are still refs, it must present them to the provider as the protocol's unknown value sentinel, not as the __ref JSON (providers don't understand our refs). The mapping { "__ref": ... } → tfprotov6 unknown is the executor's responsibility (Epic 3a.5). Mine Pulumi's bridge for how it encodes unknowns at plan/diff time.
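The shape of that mapping can be illustrated as a tree rewrite. In this Python sketch, `UNKNOWN` is only a stand-in object for the protocol's unknown sentinel, and `to_provider_config` is a hypothetical name; how the real executor encodes unknowns on the wire is not shown here.

```python
class _Unknown:
    """Stand-in for the provider protocol's 'unknown value' sentinel."""

    def __repr__(self):
        return "UNKNOWN"


UNKNOWN = _Unknown()


def to_provider_config(tree):
    """Copy a config tree, replacing every still-unresolved __ref leaf with UNKNOWN."""
    if isinstance(tree, dict):
        if "__ref" in tree:
            return UNKNOWN  # providers never see the __ref JSON itself
        return {key: to_provider_config(value) for key, value in tree.items()}
    if isinstance(tree, list):
        return [to_provider_config(value) for value in tree]
    return tree  # plain known value passes through unchanged


config = {
    "label": "web",
    "from": {"__ref": {"resource": "alpha.alpha_token.A", "path": ["value"]}},
    "extra": [1, {"__ref": {"resource": "alpha.alpha_token.A", "path": ["id"]}}],
}
print(to_provider_config(config))
```

The rewrite returns a new tree and leaves the IR untouched, so the same `__ref` leaves are still available for resolution once the target's outputs arrive.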

for_each / count expansion timing

Expansion happens in Nix. The IR contains concrete, already-multiplied resources with deterministic ids (<base>__<key>). The executor never sees count/for_each; it only sees resolved instances and edges between them. This keeps the Go ResourceNode simple and the graph explicit.
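Expansion itself is done by the Nix library; the Python below only illustrates the `<base>__<key>` id scheme on the output side. `expand_for_each` is a hypothetical helper, and carrying the key into the instance `name` as well is an assumption made so that `id` stays equal to `<provider>.<type>.<name>`.

```python
def expand_for_each(base_id, instances):
    """Turn one for_each resource into concrete instances with deterministic ids.

    base_id   -- "<provider>.<type>.<name>" of the unexpanded resource
    instances -- mapping of for_each key -> that instance's config
    """
    provider, rtype, name = base_id.split(".", 2)
    return [
        {
            "id": f"{base_id}__{key}",
            "provider": provider,
            "type": rtype,
            "name": f"{name}__{key}",  # assumption: keeps id == provider.type.name
            "config": instances[key],
        }
        for key in sorted(instances)  # sorted so the IR is stable across evaluations
    ]


expanded = expand_for_each(
    "alpha.alpha_token.web",
    {"b": {"label": "second"}, "a": {"label": "first"}},
)
print([res["id"] for res in expanded])
# prints ['alpha.alpha_token.web__a', 'alpha.alpha_token.web__b']
```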

"Derived" Nix values

A config leaf that Nix computed from a resource output (e.g. "web-" + alpha.id) cannot be a plain __ref (it's a transformation). Nix emits such a leaf as unknown-pending until the inputs are available:

{ "__derived": { "inputs": ["<id>.attr", ...] } }   // value computed by Nix once inputs known

The executor treats __derived leaves as *→Nix: it cannot compute them; it records that the listed inputs are required, and once those outputs are in the ledger, the next Nix re-eval produces the concrete value. This is the mechanism that forces N>2 phases for chained Nix-mediated dependencies.
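A sketch of the bookkeeping this implies: given a `__derived` leaf and the current ledger outputs, report which inputs are still missing. It assumes each input string is a resource id followed by a single top-level attribute name, split at the last dot; `split_input` and `pending_inputs` are hypothetical names.

```python
def split_input(spec):
    """Split "<id>.attr" at the last dot (assumes the attr name itself has no dots)."""
    resource_id, attr = spec.rsplit(".", 1)
    return resource_id, attr


def pending_inputs(derived, ledger_outputs):
    """Return the inputs of a __derived leaf that the ledger cannot supply yet."""
    missing = []
    for spec in derived["__derived"]["inputs"]:
        resource_id, attr = split_input(spec)
        if attr not in ledger_outputs.get(resource_id, {}):
            missing.append(spec)
    return missing


final = {"__derived": {"inputs": ["beta.beta_record.B.endpoint", "alpha.alpha_token.A.value"]}}

# After the first apply only A's outputs are in the ledger.
ledger_outputs = {"alpha.alpha_token.A": {"id": "tok-1", "value": "token-1"}}
print(pending_inputs(final, ledger_outputs))  # prints ['beta.beta_record.B.endpoint']
```

An empty result means the next Nix re-eval can produce the concrete value; a non-empty one means at least one more apply has to happen first.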

Build outputs (__build)

A config leaf that is the output of a Nix build (e.g. a resource source that is a built disk image) is a __build leaf carrying its store path:

{ "__build": { "path": "/nix/store/<hash>-<name>/<file>" } }   // a Nix build output

Unlike __ref/__derived, a __build leaf is a known value: its path is fixed at evaluation. But nivis evaluates (it does not build), so before a resource is applied the executor realises each __build path it references (building the derivation if the store path is not yet valid) and substitutes the concrete path into the provider config. This is done per resource as it becomes ready, so a build whose derivation depends on an earlier resource's apply-time output is realised in a later phase: the build participates in the phased fixpoint. A __build leaf is not an edge and not unknown-pending; authors emit it with the drv helper (source = drv image).

Sensitive values across the boundary

Provider schema marks attributes sensitive. Sensitive outputs:

  • Must not be written into the IR JSON emitted by nix eval (that output and the Nix store are world-readable). The Nix side emits a ref/placeholder only.
  • Live only in the executor's outputs ledger, which is written with restricted permissions (0600) and is not a Nix store path.
  • When a sensitive output must feed a later Nix re-eval, it is injected via a private channel (file path passed as --argstr, file mode 0600), never baked into a derivation. The re-eval may use it but must not re-emit it into a world-readable output.

This is a hard requirement; getting it wrong leaks secrets into the store.
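For illustration, one way to write a file with owner-only permissions on a POSIX system. This is a Python sketch of the 0600 requirement, not the executor's actual Go code; `write_ledger` and the temporary path are hypothetical.

```python
import json
import os
import stat
import tempfile


def write_ledger(path, ledger):
    """Create or overwrite the ledger file so only the owner can read or write it."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)  # mode applies when the file is created
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # A pre-existing file keeps its old mode, so enforce 0600 explicitly.
        os.fchmod(handle.fileno(), 0o600)
        json.dump(ledger, handle, indent=2)


with tempfile.TemporaryDirectory() as workdir:
    ledger_path = os.path.join(workdir, "ledger.json")
    write_ledger(ledger_path, {"phase": 1, "outputs": {}})
    mode = stat.S_IMODE(os.stat(ledger_path).st_mode)
    print(oct(mode))  # prints 0o600
```

Setting the mode at creation time, rather than with a later chmod on the path, avoids a window in which the file exists with wider permissions.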

Outputs ledger (the phased-eval injection format)

The file the executor accumulates and injects on each re-eval:

{
  "phase": 2,
  "outputs": {
    "<resource-id>": { "<attr>": <value-or-{__sensitiveRef}>, ... }
  }
}

Nix reads this (path passed in via the flake plan argument) to resolve refs and compute __derived values on the next phase. __sensitiveRef points at the restricted-mode channel rather than embedding the secret.

Validation

The contract is machine-checkable, not just prose. Two artifacts make it so:

  • docs/ir-schema.json: the normative JSON Schema (Draft 2020-12) encoding the structural rules of everything above: top-level shape, the __ref/__derived/__sensitiveRef leaf encodings, and the no count/for_each in the IR rule. A leaf-marker object (__ref etc.) is dispatched to its exact subschema, so a malformed marker reports a precise, addressed error (e.g. at resources/1/config/label/__ref: 'path' is a required property) rather than a generic failure.
  • tests/ir-conformance/: the executable conformance suite. check.py layers (1) JSON-Schema structural validation over (2) the referential rules JSON Schema cannot express: unique ids, every provider declared, every edge endpoint present, every __ref/__sensitiveRef target existing. Fixtures under fixtures/valid and fixtures/invalid lock both directions; each invalid fixture asserts the error names the offending element.
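The referential layer can be sketched as follows. This is not check.py itself, only an illustration of the kind of rule JSON Schema cannot express; `ref_targets` and `referential_errors` are hypothetical names, the error wording is invented, and `__sensitiveRef` targets are left out because their exact shape is not spelled out above.

```python
def ref_targets(tree, where):
    """Yield (location, resource id) for every __ref leaf under tree."""
    if isinstance(tree, dict):
        if "__ref" in tree:
            yield f"{where}/__ref", tree["__ref"]["resource"]
            return
        for key, value in tree.items():
            yield from ref_targets(value, f"{where}/{key}")
    elif isinstance(tree, list):
        for index, value in enumerate(tree):
            yield from ref_targets(value, f"{where}/{index}")


def referential_errors(ir):
    """Check unique ids, declared providers, edge endpoints, and __ref targets."""
    errors = []
    known = set()
    for index, res in enumerate(ir["resources"]):
        if res["id"] in known:
            errors.append(f"resources/{index}/id: duplicate id {res['id']!r}")
        known.add(res["id"])
    for index, res in enumerate(ir["resources"]):
        if res["provider"] not in ir["providers"]:
            errors.append(f"resources/{index}/provider: undeclared provider {res['provider']!r}")
        for where, target in ref_targets(res["config"], f"resources/{index}/config"):
            if target not in known:
                errors.append(f"{where}: unknown resource {target!r}")
    for index, edge in enumerate(ir.get("edges", [])):
        for end in ("from", "to"):
            if edge[end] not in known:
                errors.append(f"edges/{index}/{end}: unknown resource {edge[end]!r}")
    return errors


bad_ir = {
    "providers": {"alpha": {}},
    "resources": [
        {"id": "alpha.alpha_token.a", "provider": "alpha", "config": {}},
        {"id": "alpha.alpha_token.b", "provider": "alpha", "config": {
            "label": {"__ref": {"resource": "alpha.alpha_token.zzz", "path": ["value"]}},
        }},
    ],
    "edges": [{"from": "alpha.alpha_token.a", "to": "alpha.alpha_token.b", "via": "label"}],
}
print(referential_errors(bad_ir))
```

Each message starts with the address of the offending element, in the spirit of the "error names the offending element" requirement on the invalid fixtures.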

Both producer/consumer sides MUST conform to these artifacts:

  • Nix (Epic 1.5 toIR): a property test that, for arbitrary valid resource graphs, toIR output passes check.py validate: every leaf is a value, a well-formed __ref, a __derived, or a __sensitiveRef; ids unique; every edge endpoint exists.
  • Go (Epic 3a.1 IngestIR): rejects malformed IR with an actionable error naming the offending resource/path (Epic 4c), matching the failure classes in tests/ir-conformance/fixtures/invalid. check.py is the reference behavior the Go validator is tested against.

Run the suite: python3 tests/ir-conformance/check.py test.

docs/TESTING.md: testing strategy & the headline e2e

Testing is part of "done." No OpenSpec change is complete without its tests.

Layers

  1. Pure Nix functions, property tests. mkResource, the reference system, for_each/count expansion, and toIR conformance to docs/IR-CONTRACT.md. Property: for arbitrary valid resource graphs, every IR leaf is a value, a well-formed __ref, or a __derived; ids are unique; every edge endpoint exists.
  2. Go, table-driven unit tests. IR ingestion/validation, DAG construction, ref classification (TF→TF vs *→Nix), TF→TF in-executor resolution, the __ref→tfprotov6-unknown mapping, state read/write/lock, fixpoint detection.
  3. Integration, against fake providers, no network. Executor spawns a fake tfprotov6 provider, completes the go-plugin handshake, drives GetProviderSchema/PlanResourceChange/ApplyResourceChange, and persists state. Proves the protocol client end-to-end offline.
  4. E2E, the full pipeline. .nix → IR → phased plan/apply → state → refresh → destroy, culminating in the headline test below.

All provider-touching tests use in-repo fake providers so the suite is hermetic and runs in CI without credentials or registry/network access.

The fake providers (Epic 4a)

Two minimal Go binaries that speak tfprotov6. Each returns a static schema and produces computed (unknown-at-plan) outputs at apply.

provider-alpha, resource alpha_token:

  • inputs: label (string, optional)
  • computed outputs (known only after apply): id (string), value (string, derived deterministically from label + a counter so tests are reproducible)

provider-beta, resource beta_record:

  • inputs: from (string, required)
  • computed output (known only after apply): endpoint (string, derived from from)

Determinism: outputs are a pure function of inputs (+ a per-process counter that the test harness seeds), so assertions are exact. No clocks, no randomness, no external calls.

Headline e2e: milestone exit criterion (Epic 4b)

What it must prove, in one test: two providers, unknown values originating on both sides, resolution requiring ≥3 phases, and a Nix-side consumer reading outputs from both providers (the round trip). The dependency graph is acyclic; the phase count comes from each hop being Nix-mediated (__derived), which is exactly what forces N>2.

Topology (tests/e2e/two-providers.nix)

alpha_token.A           (alpha)  : no inputs
   └─ A.value  ─┐
                ▼  Nix: name = "rec-" + A.value          (__derived on A.value)
beta_record.B  (beta)   from = name
   └─ B.endpoint ─┐
                  ▼ Nix: final = B.endpoint + "::" + A.value   (__derived on B.endpoint, A.value)
alpha_token.C  (alpha)  label = final

# Nix-side consumer reading from BOTH providers (simulated NixOS option):
systemConfig = {
  recordEndpoint = B.endpoint;   # from beta
  tokenValue     = A.value;      # from alpha
  combined       = final;        # from both
};

  • Unknowns originate on both sides: A.value/C.* from provider alpha and B.endpoint from provider beta.
  • The chain A → (Nix name) → B → (Nix final) → C is acyclic but each arrow crosses the Nix boundary via __derived, so it cannot collapse into one pass.

Required phase progression

  • Phase 0 eval: A.config fully known; name is __derived on A.value (unknown); B, final, C, systemConfig all pending.
  • Phase 1 apply: only A is ready → apply A → ledger gains A.id, A.value.
  • Phase 1→2 eval: re-eval injects A.value → name resolves → B.config.from now known; final still pending on B.endpoint.
  • Phase 2 apply: B ready → apply B → ledger gains B.endpoint.
  • Phase 2→3 eval: re-eval injects B.endpoint → final resolves → C.config.label known; systemConfig fully resolves (both providers present).
  • Phase 3 apply: C ready → apply C. No pending refs remain.
  • Phase 4 eval: produces no new resolved value → fixpoint → halt.
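The progression above can be simulated in a few lines. This is a toy model under loud assumptions: `evaluate` stands in for a Nix evaluation of the topology, `fake_apply` stands in for the two fake providers (its output formats are invented, not the real fakes'), resources are named by their short letters, and "fixpoint" is approximated as "an evaluation leaves nothing ready to apply".

```python
def evaluate(ledger):
    """Stand-in for one Nix evaluation of the A -> B -> C topology. None = pending."""
    a_value = ledger.get("A", {}).get("value")
    b_endpoint = ledger.get("B", {}).get("endpoint")
    name = None if a_value is None else "rec-" + a_value
    final = None if None in (a_value, b_endpoint) else b_endpoint + "::" + a_value
    resources = {"A": {}, "B": {"from": name}, "C": {"label": final}}
    system_config = {"recordEndpoint": b_endpoint, "tokenValue": a_value, "combined": final}
    return resources, system_config


def fake_apply(resource, config, counter):
    """Deterministic stand-in for the fake providers (formats are assumptions)."""
    if resource == "B":
        return {"endpoint": "ep-" + config["from"]}
    label = config.get("label") or "token"
    return {"id": f"tok-{counter}", "value": f"{label}-{counter}"}


def run(max_apply_phases=None):
    ledger, trace, counter = {}, [], 0
    resources, system_config = evaluate(ledger)  # phase 0 eval
    while True:
        ready = [r for r, cfg in resources.items()
                 if r not in ledger and None not in cfg.values()]
        if not ready:
            break  # nothing new can be applied: fixpoint
        for resource in ready:
            counter += 1
            ledger[resource] = fake_apply(resource, resources[resource], counter)
        trace.append(ready)
        if max_apply_phases is not None and len(trace) >= max_apply_phases:
            break  # cap reached before the next re-eval
        resources, system_config = evaluate(ledger)  # re-eval with ledger injected
    pending = [r for r in resources if r not in ledger]
    return trace, system_config, pending


trace, system_config, pending = run()
print(trace)                      # prints [['A'], ['B'], ['C']]
print(system_config["combined"])  # prints ep-rec-token-1::token-1
```

Calling `run(max_apply_phases=2)` stops with `C` pending and `combined` still unresolved, which mirrors the 2-phase-cap case.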

Assertions

  • Total phases that performed an apply == 3 (and the loop halts at fixpoint, not by a hardcoded count).
  • Attempting to resolve with a 2-phase cap leaves final/C/systemConfig unresolved → the engine reports them as pending (proves >2 phases is required, not incidental).
  • Final outputs ledger contains A.id, A.value, B.endpoint, C.*.
  • systemConfig evaluates to concrete values for recordEndpoint, tokenValue, and combined, each matching the deterministic provider outputs, proving TF→Nix feedback from both providers.
  • A cycle variant (make A.label depend on C.*) is rejected at fixpoint with an actionable "unresolvable / cycle" error naming A and C (Epic 3.5.3).
  • destroy removes C, B, A in reverse dependency order; refresh reconciles state via ReadResource without changing the plan.
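The reverse-dependency destroy order and the cycle rejection can both be illustrated with a topological sort. This sketch uses Python's standard `graphlib` and reads an edge as "`to` depends on `from`" (since `via` is a config path in `to`); `apply_order` and `destroy_order` are hypothetical names, and the real engine reports cycles at fixpoint rather than necessarily through a static sort like this.

```python
from graphlib import CycleError, TopologicalSorter


def apply_order(resource_ids, edges):
    """Order resources so every dependency comes before its dependents."""
    sorter = TopologicalSorter()
    for resource_id in resource_ids:
        sorter.add(resource_id)
    for edge in edges:
        sorter.add(edge["to"], edge["from"])  # "to" depends on "from"
    return list(sorter.static_order())  # raises CycleError if the graph has a cycle


def destroy_order(resource_ids, edges):
    """Destroy dependents first: the reverse of the apply order."""
    return list(reversed(apply_order(resource_ids, edges)))


ids = ["A", "B", "C"]
edges = [{"from": "A", "to": "B"}, {"from": "B", "to": "C"}, {"from": "A", "to": "C"}]
print(destroy_order(ids, edges))  # prints ['C', 'B', 'A']

# Cycle variant: make A depend on C as well.
try:
    apply_order(ids, edges + [{"from": "C", "to": "A"}])
except CycleError as exc:
    print("cycle through:", exc.args[1])
```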

Why this is the right exit test

It exercises every load-bearing decision at once: the protocol client (real tfprotov6 handshake to two providers), TF→TF and *→Nix ref handling, the __derived mechanism, N-phase fixpoint resolution with N>2, and the round trip that is the project's entire reason for existing, with unknowns genuinely originating on both provider sides.

ROADMAP.md: Nivis

Nivis cleared its proof-of-concept milestone. The thesis is proven: real Terraform/OpenTofu provider resources are first-class Nix values, driven by a thin Go executor that spawns unmodified provider binaries, and provider outputs round-trip back into Nix across phases to a fixpoint. On top of that we have real AWS apply/update/replace/destroy, schema codegen, and an end-to-end "build a NixOS AMI and launch it" example.

That makes Nivis experimental / alpha (0.3.x): real, but small. This roadmap is about the next thing, taking Nivis from "the demo works" to "I can run my real infrastructure on this," and eventually to something an enterprise can adopt. The PoC roadmap that got us here is preserved at the bottom as history.

How this maps to beans. Each phase below is a beans milestone; each theme under it is a beans epic; each task inside an epic is an OpenSpec change (spec before code). See CLAUDE.md §3. The doc is the why and what; beans is the audit trail.

Where we are honestly weak

docs/COMPARISON.md states this plainly. Versus Terraform/OpenTofu, Pulumi and CDK, the gaps that actually block adoption today are:

  • State is local-only. There is a Store interface seam, but no shared backend and no locking. Two people (or CI) cannot safely touch the same infra.
  • No variables / overrides. Config is whatever the flake hard-codes plus an ad-hoc ledger.vars. There is no first-class way to parameterise per environment or pass values at the CLI.
  • No datasources. The provider protocol's ReadDataSource is unused; you cannot look up an existing AMI, VPC, or zone the way every other tool can.
  • Thin DX. Plan/apply/destroy output is not colorised by change type, there is no shell completion, and there is no per-provider reference documentation.
  • No enterprise controls. No policy-as-code, no RBAC/audit, no hosted control plane, and provider download from the registry is network-gated and not the default path.

The phases close these in the order that unlocks the widest audience soonest.

Architecture invariants (do not regress)

Every phase below is bound by docs/DESIGN.md. In particular: spawn unmodified providers, do not link them; Nix is a batch evaluator resolved by phased re-evaluation to a fixpoint, not a live Output<T> runtime; the IR is the frozen contract (docs/IR-CONTRACT.md), so any feature that changes the IR shape needs an OpenSpec change to the contract first; and tests run against in-repo fake providers (hermetic, no network, no credentials).


Phase A: a daily-driver for Nix developers ⟵ the next milestone

Beans milestone: nixform2-zdj0 ("Road to v1"). Epics: A1 nixform2-kym5, A2 nixform2-6e6i, A3 nixform2-yqd3, A4 nixform2-oycy, A5 nixform2-n2rg, A6 nixform2-z8e1.

Definition of done: a Nix developer can manage a real, multi-resource project end to end, day to day, without dropping back to Terraform, with shared state, parameterised config, datasource lookups, and a plan they can actually read. This is the headline goal for the next milestone, the same role the round-trip e2e played for the PoC.

  • A1. Variables and overrides. First-class inputs to a plan: typed variables with defaults, a CLI way to set them (--var, --var-file), and a clear precedence (defaults < file < flag < environment). Must thread cleanly through the phased-eval loop (the ledger already carries vars; formalise it) and stay pure: no impurity sneaks into the Nix evaluation. Probably an IR-contract touch for how vars enter the plan function.
  • A2. Datasources. Drive the provider protocol's ReadDataSource so a config can read existing infrastructure (an AMI by filter, a VPC, an availability zone) and feed it into resources. Needs a Nix-lib constructor (mkData or similar), executor support, and an IR-contract addition for the datasource node and its outputs. Datasource reads happen per phase like any other node.
  • A3. Legible plan/apply/destroy output. Colorise by change type (+ create, ~ update, -/+ replace, - destroy, = no-op), summarise counts, and make the phased nature visible (which resources resolved in which phase). Respect NO_COLOR and non-TTY output. No behaviour change, pure DX.
  • A4. Shell completion. Cobra can generate bash/zsh/fish completion; wire it up (nivis completion <shell>) and complete resource ids for state show / --target from the state file.
  • A5. Per-provider reference docs. Today a user reads the provider's Terraform registry docs and mentally translates HCL to Nivis. Generate or curate a "Terraform docs to Nivis" mapping so aws_instance's arguments are discoverable in Nivis terms. Couples naturally to schema codegen (Epic 2, already built).
  • A6. State ergonomics. Configurable state path is done; add the small things a real project needs: state list/show/rm polish, a state pull/push shape that the remote backend (Phase B) will reuse, and clear errors on a stale or locked state file.
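As a sketch of the precedence stated in A1 (defaults < file < flag < environment), later layers simply overwrite earlier ones. Nothing here is implemented in Nivis yet; `resolve_vars` is a hypothetical name, and rejecting undeclared variable names is an assumption about what "typed variables with defaults" would imply.

```python
def resolve_vars(defaults, file_vars=None, flag_vars=None, env_vars=None):
    """Merge variable layers in precedence order: defaults < file < flag < environment."""
    resolved = dict(defaults)
    for layer in (file_vars, flag_vars, env_vars):
        for name, value in (layer or {}).items():
            if name not in defaults:
                # Assumption: only declared variables may be set.
                raise KeyError(f"undeclared variable: {name}")
            resolved[name] = value
    return resolved


resolved = resolve_vars(
    defaults={"region": "eu-west-1", "size": "small", "env": "dev"},
    file_vars={"size": "medium", "env": "staging"},
    flag_vars={"env": "prod"},
    env_vars={"region": "us-east-1"},
)
print(resolved)
# prints {'region': 'us-east-1', 'size': 'medium', 'env': 'prod'}
```

Resolving the layers into one plain mapping before evaluation is one way to keep the Nix side pure: the plan function would receive a single explicit value rather than reading flags or the environment itself.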

Phase B: team-ready ⟵ after Phase A

Beans milestone: nixform2-kovh. Epics: B1 nixform2-izhk, B2 nixform2-0oqk, B3 nixform2-tyzs, B4 nixform2-cdfj.

Definition of done: multiple people and CI can safely operate the same infrastructure concurrently.

  • B1. Remote state backend (S3 first). Implement the Store seam against S3 (object per state, server-side encryption, the credential chain Nivis already uses). Keep the format Nivis's own; no tfstate compatibility guarantee (DESIGN). Configured in the flake, not via env soup.
  • B2. State locking. A lock so two concurrent applies cannot corrupt state (DynamoDB-style advisory lock for the S3 backend, with a force-unlock escape hatch and clear "who holds the lock" errors).
  • B3. Drift detection. refresh exists; build a real "plan shows drift" experience that reconciles remote reality against stored state and surfaces out-of-band changes.
  • B4. Multiple environments. A clean pattern for dev/staging/prod from one config: workspaces or per-environment var-files + state keys, decided in a spec, not improvised.

Phase C: enterprise-credible ⟵ the longer horizon

Beans milestone: nixform2-1okn. Epics: C1 nixform2-alr9, C2 nixform2-84fs, C3 nixform2-m83a, C4 nixform2-q7fx, C5 nixform2-7evo.

NixOS is gaining enterprise traction; this is where Nivis earns a seat there. These are deliberately later, after the basics are solid, and several are large enough to be their own milestones.

  • C1. Policy as code / guardrails. A pre-apply policy hook (deny by rule, required tags, allowed regions). Evaluate doing this in Nix (assertions in the module system) versus an external engine; Nix-native is the differentiator.
  • C2. RBAC, teams, audit. The story for who can apply what, and an audit trail. Likely pairs with a remote backend and possibly a hosted control plane; scope carefully, this is where tools grow a SaaS.
  • C3. Provider registry integration. Real provider download/verify/cache from the OpenTofu registry. Network-gated (CLAUDE.md §6); today providers are fetched on first use but this needs hardening, offline/air-gapped mirrors, and supply-chain verification for enterprise.
  • C4. Secrets at scale. The IR already keeps sensitive values out of the world-readable store; extend to integration with real secret stores (Vault, SSM, sops-nix) so secrets never transit the Nix store at all.
  • C5. Scale and performance. Phased re-eval cost on large graphs is currently unmeasured. Measure it; optimise only if it is a measured problem (DESIGN rejects premature cleverness like a live evaluator).
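
To make the C1 rule shapes concrete (required tags, allowed regions, deny before apply), the sketch below checks a list of planned resources and returns violations. It is only an illustration: the roadmap leaves open whether policy lives in Nix assertions or an external engine, and Resource, Policy and Check are hypothetical names, not part of Nivis.

```go
package main

import (
	"fmt"
	"sort"
)

// Resource is a hypothetical, simplified view of one planned resource.
type Resource struct {
	Address string
	Region  string
	Tags    map[string]string
}

// Policy holds two example rules: required tags and allowed regions.
type Policy struct {
	RequiredTags   []string
	AllowedRegions map[string]bool // empty means "any region"
}

// Check returns one message per broken rule, sorted for stable output.
// An empty result means the plan may proceed to apply.
func (p Policy) Check(resources []Resource) []string {
	var violations []string
	for _, r := range resources {
		if len(p.AllowedRegions) > 0 && !p.AllowedRegions[r.Region] {
			violations = append(violations,
				fmt.Sprintf("%s: region %q is not allowed", r.Address, r.Region))
		}
		for _, tag := range p.RequiredTags {
			if _, ok := r.Tags[tag]; !ok {
				violations = append(violations,
					fmt.Sprintf("%s: missing required tag %q", r.Address, tag))
			}
		}
	}
	sort.Strings(violations)
	return violations
}

func main() {
	policy := Policy{
		RequiredTags:   []string{"owner"},
		AllowedRegions: map[string]bool{"eu-west-1": true},
	}
	plan := []Resource{
		{Address: "aws_instance.web", Region: "eu-west-1", Tags: map[string]string{"owner": "platform"}},
		{Address: "aws_s3_bucket.logs", Region: "us-east-1", Tags: map[string]string{}},
	}
	for _, v := range policy.Check(plan) {
		fmt.Println("denied:", v)
	}
}
```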

Cross-cutting, every phase

  • Stay hermetic. Every feature lands with tests against the in-repo fakes; the fakes grow new capabilities (a datasource-serving fake, a drift-injecting fake) as the features that need them arrive.
  • Keep the lib pure. nivis.lib stays builtins-only (no nixpkgs); only packages/apps may force nixpkgs.
  • Spec before code. IR-affecting work (vars entry, datasource node) updates IR-CONTRACT.md via an OpenSpec change first.

History: the PoC milestone (delivered)

Kept for the record. This is the roadmap that proved the thesis; every epic below is complete (see the beans milestone nixform PoC / alpha base and its epics).

Milestone exit criterion (met): the headline e2e in TESTING.md: two providers, unknown values originating on both sides, resolved across ≥3 phases, with a Nix-side consumer reading outputs from both providers.
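
Why a chain of dependent values needs three or more phases can be seen in a toy model. The sketch below is not the Nivis executor: Node, runPhases and the ledger keys are hypothetical, and it assumes every hop is Nix-mediated, so a value learned in one phase only becomes usable in the next evaluation. Each phase applies every node whose inputs are already in the ledger; the loop stops when a phase resolves nothing new, and anything still unapplied is reported (a cycle, for example).

```go
package main

import "fmt"

// Node is a hypothetical stand-in for one resource in the IR.
type Node struct {
	Name  string
	Needs []string // ledger keys that must be resolved before this node can apply
	Gives string   // ledger key produced once the node is applied
}

// ready reports whether all of a node's inputs are present in the ledger snapshot.
func ready(n Node, known map[string]bool) bool {
	for _, k := range n.Needs {
		if !known[k] {
			return false
		}
	}
	return true
}

// runPhases applies every ready node per phase and stops at a fixpoint.
// It returns the number of productive phases and the nodes left unresolved.
func runPhases(nodes []Node) (int, []string) {
	ledger := map[string]bool{}
	applied := map[string]bool{}
	phases := 0
	for {
		// Snapshot the ledger: outputs learned during this phase only
		// become visible to the next evaluation.
		known := make(map[string]bool, len(ledger))
		for k := range ledger {
			known[k] = true
		}
		progressed := false
		for _, n := range nodes {
			if applied[n.Name] || !ready(n, known) {
				continue
			}
			applied[n.Name] = true
			ledger[n.Gives] = true
			progressed = true
		}
		if !progressed {
			break // fixpoint: no new value resolved
		}
		phases++
	}
	var unresolved []string
	for _, n := range nodes {
		if !applied[n.Name] {
			unresolved = append(unresolved, n.Name)
		}
	}
	return phases, unresolved
}

func main() {
	chain := []Node{
		{Name: "alpha_thing", Gives: "alpha_thing.id"},
		{Name: "beta_thing", Needs: []string{"alpha_thing.id"}, Gives: "beta_thing.ip"},
		{Name: "alpha_consumer", Needs: []string{"beta_thing.ip"}, Gives: "alpha_consumer.id"},
	}
	phases, unresolved := runPhases(chain)
	fmt.Println("phases:", phases, "unresolved:", unresolved)
}
```

In this model the three-node chain takes three phases, while two nodes that need each other's output never become ready and are returned as unresolved.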

Critical path that was followed:

E1 (Nix lib core: mkResource + refs + IR serializer)
        │
E1.5 ── IR CONTRACT  (linchpin; written & frozen first)
        │
E4a ── fake tfprotov6 providers (alpha, beta)  (test substrate)
        │
E3a ── executor: ingest IR, spawn ONE fake provider, plan+apply, write state
        │
E3.5 ── PHASED EVALUATION TO FIXPOINT  (the thesis)
        │
E4b ── headline two-provider / unknowns-both-sides e2e  (milestone exit)
        │
(then breadth:) E2 schema codegen · E3b refresh/destroy/CLI · E4c/4d error UX & docs

Epics delivered (PoC and the alpha follow-ons):

  • E1 Nix library core: mkResource, mkProvider, the reference system, meta-arguments (depends_on, lifecycle, count/for_each expanded in Nix), the module system, toIR, and the flake interface (nivis.plan).
  • E1.5 The IR contract: IR-CONTRACT.md + ir-schema.json, the frozen JSON contract pinning ref encoding, expansion timing, unknown representation, and sensitive-value handling.
  • E2 Provider schema codegen (nivis gen): typed Nix constructors from a provider's GetProviderSchema.
  • E3 Go executor: IR ingestion, the lockable local Store, the plugin manager (spawn + gRPC handshake, v5 and v6), the DAG, plan and apply engines.
  • E3.5 Phased evaluation to fixpoint: the outputs ledger, the phase driver, fixpoint and cycle detection, and verified *→Nix feedback (the round trip).
  • E3b Refresh and destroy engines and CLI (plan/apply/destroy/refresh/state, --target, --refresh, --build).
  • E4a Fake tfprotov6/tfprotov5 providers (the hermetic test substrate).
  • E4b The headline two-provider, unknowns-both-sides, ≥3-phase e2e.
  • E4c/4d Error UX and docs (actionable errors; README, getting-started, the stable-contract docs).
  • Real-provider support (M2): real tfprotov5 + on-first-use registry fetch, proven against AWS.
  • Resource lifecycle: update and replace beyond create-only, with prevent_destroy.
  • EC2 + NixOS: build a NixOS AMI in Nix and launch it through Nivis, with nivis apply realising the image itself (__build / nivis.drv).
  • Branding, rename to Nivis, release management (versioning, changelog, releases, the docs site).

Nivis: brand reference

Nivis (Latin, "of snow") · tagline "Infrastructure as Nix Code." Formerly nixform. This file records the brand tokens so future work (docs site, UI, slides) has the palette and type in-repo, and is the in-repo source of truth for the logo geometry and treatments.

  • assets/nivis-emblem.svg: full emblem (snow-capped twin summit on a navy disc with silver ring + ember star). Use at ≥40px.
  • assets/nivis-glyph.svg: simplified single-peak mark for 16-64px (favicons, tabs, avatars).
  • Never recolour (beyond the ember star), stretch, skew, rotate, or shadow the emblem. Clear space ≥ the ring thickness. Below 40px use the glyph.

Colour tokens

| Token | Hex | Use |
|---|---|---|
| Ink | #081726 | deepest background |
| Deep Navy | #0E3157 | primary brand, icon tile, dark surfaces |
| Disc Navy | #0B2A48 | emblem disc fill |
| Steel Blue | #2D5E8E | secondary |
| Glacier Blue | #4A93C8 | links, cool UI accent |
| Ice Blue | #AECFE6 | text on navy, tagline |
| Pale Ice | #DCEDF7 | subtle fills |
| Silver | #C3D2DE | ring, hairlines, metallic |
| Snow | #F5FAFD | light surfaces, text on navy |
| Volcanic Ember | #F2632E | the one warm accent: the star, key highlights |
| Magma | #C4361A | deep warm shade |

Typography (all Google Fonts, OFL)

  • Cinzel (600): wordmark / display, Roman caps, letter-spacing ≈ .07em, used UPPERCASE ("NIVIS").
  • Schibsted Grotesk (400/500/600): UI, body, tagline.
  • IBM Plex Mono (400/500): code, CLI, labels.

CLI colours (nivis)

Truecolor ANSI, applied only on a TTY (honours NO_COLOR): ember \e[38;2;242;99;46m for the prompt and "fixpoint reached"; ice blue \e[38;2;174;207;230m for resource names/values; dim grey for secondary text.
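
A minimal sketch of that rule, assuming nothing about how the nivis CLI actually implements it: colourEnabled and paint are hypothetical helpers, and the terminal check uses os.ModeCharDevice, which is only an approximation of "is a TTY" (other character devices also pass it). The escape sequences are the ember and ice blue values given above, written with Go's \x1b escape.

```go
package main

import (
	"fmt"
	"os"
)

const (
	ember   = "\x1b[38;2;242;99;46m"   // Volcanic Ember #F2632E
	iceBlue = "\x1b[38;2;174;207;230m" // Ice Blue #AECFE6
	reset   = "\x1b[0m"
)

// colourEnabled reports whether colour should be emitted: NO_COLOR must be
// unset or empty, and the output must look like a terminal.
func colourEnabled(f *os.File, noColor string) bool {
	if noColor != "" {
		return false
	}
	info, err := f.Stat()
	if err != nil {
		return false
	}
	// Approximation: a character device is treated as a terminal.
	return info.Mode()&os.ModeCharDevice != 0
}

// paint wraps s in the given colour code when colour is enabled.
func paint(code, s string, enabled bool) string {
	if !enabled {
		return s
	}
	return code + s + reset
}

func main() {
	on := colourEnabled(os.Stdout, os.Getenv("NO_COLOR"))
	fmt.Println(paint(ember, "fixpoint reached", on))
	fmt.Println(paint(iceBlue, "aws_instance.web", on))
}
```

Piping the output to a file or setting NO_COLOR=1 yields plain text with no escape sequences.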

Regenerating the raster assets

The logo SVGs are the source; the icons and banner PNG are generated from them. Requires rsvg-convert and ImageMagick (magick), plus the Cinzel and IBM Plex Mono fonts available to fontconfig.

# favicon (32px) + apple-touch-icon (180px glyph on a #0E3157 tile, ~14% padding)
cp assets/nivis-glyph.svg assets/favicon.svg
rsvg-convert -w 32 -h 32 assets/nivis-glyph.svg -o /tmp/f32.png
magick /tmp/f32.png assets/favicon.ico
rsvg-convert -w 180 -h 180 docs/assets/apple-touch-icon.svg -o assets/apple-touch-icon.png

# README hero banner (1280×640): needs Cinzel + IBM Plex Mono on fontconfig
rsvg-convert -w 1280 -h 640 docs/assets/banner.svg -o docs/assets/banner.png

The banner and apple-touch source SVGs live in docs/assets/ or are generated inline; the committed PNGs are reproducible from them.