Kyverno Chainsaw - The ultimate end to end testing tool!

Have fun testing Kubernetes operators!

Creating Kubernetes operators is hard, and testing them is hard too. Creating, maintaining, and testing an operator all at once is, of course, even harder.

Proper end to end testing often requires writing and maintaining additional code. It takes time, the process is cumbersome, and making changes becomes a pain. All this often leads to poorly tested operators, which can hurt operator quality.

Today we are extremely proud to release the first stable version of Kyverno Chainsaw, a tool to make end to end testing of Kubernetes operators entirely declarative, simple and almost fun.

In this blog post, we will introduce Chainsaw, explain how it works, and show what problems it solves. Hopefully, after reading it, you will never look at writing end to end tests the same way!

What are Kubernetes operators

Kubernetes operators are described in the Kubernetes documentation on the operator pattern.

Operators are software extensions to Kubernetes that make use of custom resources to manage applications and their components. Operators follow Kubernetes principles, notably the control loop.

They often rely on Custom Resource Definitions and continuously reconcile the cluster state with the spec of Custom Resources.

How do we test a Kubernetes operator

An operator is essentially responsible for watching certain resources in a cluster and reacting to maintain a state matching the spec described in the Custom Resources.

Testing an operator boils down to creating, updating, or deleting certain resources and verifying the state of the cluster changes accordingly.
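
Done by hand, this kind of check usually ends up as a polling loop, because an operator reacts asynchronously. The sketch below only illustrates the idea. `wait_until` is a hypothetical helper, not part of Chainsaw or kubectl, and the file and resource names in the commented usage are placeholders.

```shell
# wait_until LIMIT COMMAND...: hypothetical helper that reruns COMMAND about
# once per second until it succeeds, giving up after LIMIT failed attempts.
wait_until() {
  limit="$1"
  shift
  attempts=0
  until "$@" >/dev/null 2>&1; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge "$limit" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Example usage against a cluster (file and resource names are placeholders):
#   kubectl apply -f my-resource.yaml
#   wait_until 30 kubectl get clusterrolebinding my-expected-binding
```

Every operator test suite written this way needs some variation of this loop, plus cleanup logic, which is the kind of additional code that has to be maintained alongside the operator.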

For example, an operator could be responsible for managing role bindings and service accounts in a cluster based on a simplified definition of permissions. Such an operator exists: rbac-manager from FairWinds.

In the next sections of this blog post, we will demonstrate how Chainsaw can help test the rbac-manager operator.

Getting started

Before we can look at Chainsaw we need a Kubernetes cluster with rbac-manager installed. We can create a local cluster with KinD and use Helm to install the operator.

# create a cluster
kind create cluster

# deploy rbac-manager
helm install rbac-manager --repo https://charts.fairwinds.com/stable rbac-manager --namespace rbac-manager --create-namespace

Once the operator is installed, you should see a new Custom Resource Definition in the cluster:

kubectl get crd

NAME                                         CREATED AT
rbacdefinitions.rbacmanager.reactiveops.io   2023-12-12T12:20:19Z
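
In a script or CI job, where nobody is watching the output, the same check can be made to block until the CRD is ready. This is a sketch, assuming `kubectl` is on the PATH and points at the cluster created above. The 60 second timeout is an arbitrary choice.

```shell
# Name of the CRD installed by the rbac-manager chart (as listed by kubectl get crd).
crd="rbacdefinitions.rbacmanager.reactiveops.io"

# Only attempt the check when kubectl is available on this machine.
if command -v kubectl >/dev/null 2>&1; then
  # Block until the API server reports the CRD as established, or fail after 60s.
  kubectl wait --for=condition=Established "crd/${crd}" --timeout=60s
fi
```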

Install Chainsaw

Chainsaw can be installed in different ways. If you are using macOS or Linux, the simplest option is to use brew.

# add the chainsaw tap
brew tap kyverno/chainsaw https://github.com/kyverno/chainsaw

# install chainsaw
brew install kyverno/chainsaw/chainsaw

What is a test

To put it simply, a test can be represented as an ordered sequence of test steps.

Test steps within a test run sequentially: if any test step fails, the entire test is considered failed.

A test step can consist of one or more operations:

  • To delete resources present in a cluster
  • To create or update resources in a cluster
  • To assert one or more resources in a cluster meet the expectations (or the opposite)
  • To run arbitrary commands or scripts

In Chainsaw, tests are entirely declarative and created with YAML files.
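
To make the vocabulary concrete, here is a sketch of a test with two steps that uses each kind of operation once. It is an illustration only. The file name `operations-example.yaml`, the `leftover` ConfigMap, and `unexpected.yaml` are hypothetical, and the `script` operation with a `content` field reflects our understanding of the `v1alpha1` schema, so check it against the reference for your Chainsaw version.

```shell
# The quoted 'EOF' keeps the shell from expanding anything inside the manifest.
cat > operations-example.yaml << 'EOF'
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: operations-example
spec:
  steps:
  # first step: prepare the cluster
  - try:
    # delete a resource (hypothetical ConfigMap)
    - delete:
        ref:
          apiVersion: v1
          kind: ConfigMap
          name: leftover
    # create or update resources
    - apply:
        file: resources.yaml
  # second step: runs only after the first step
  - try:
    # expect a resource matching expected.yaml to exist
    - assert:
        file: expected.yaml
    # expect no resource matching unexpected.yaml to exist
    - error:
        file: unexpected.yaml
    # run an arbitrary script
    - script:
        content: |
          echo "hello chainsaw"
EOF
```

Each entry under `steps` is one test step, and each entry under `try` is one operation within that step.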

Our first test

In this first test, we’re going to create an RBACDefinition and verify the rbac-manager operator created the corresponding ClusterRoleBinding in the cluster.

RBACDefinition

The RBACDefinition below states that the service account rbac-manager/test-rbac-manager should be bound to a test-rbac-manager cluster role.

cat > resources.yaml << EOF
apiVersion: rbacmanager.reactiveops.io/v1beta1
kind: RBACDefinition
metadata:
  name: rbac-manager-definition
rbacBindings:
  - name: admins
    subjects:
      - kind: ServiceAccount
        name: test-rbac-manager
        namespace: rbac-manager
    clusterRoleBindings:
      - clusterRole: test-rbac-manager
EOF

ClusterRoleBinding

If we apply the RBACDefinition above, the operator is expected to create the corresponding ClusterRoleBinding.

cat > expected.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    rbac-manager: reactiveops
  ownerReferences:
  - apiVersion: rbacmanager.reactiveops.io/v1beta1
    kind: RBACDefinition
    name: rbac-manager-definition
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: test-rbac-manager
subjects:
- kind: ServiceAccount
  name: test-rbac-manager
  namespace: rbac-manager
EOF

An important point in this manifest is that it doesn’t contain a name. Chainsaw won’t use this manifest to create resources in the cluster. Instead, it uses it to verify that a resource exists in the cluster and matches this definition.
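
Because the manifest is matched rather than applied, it can list only the fields the test cares about. As a sketch, assuming Chainsaw ignores fields that are absent from the manifest, a looser variant could drop the owner reference and check only the label, role, and subject. `expected-minimal.yaml` is a hypothetical file name that is not used elsewhere in this post.

```shell
# Looser expectation: same kind, label, role and subject, no ownerReferences.
cat > expected-minimal.yaml << 'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    rbac-manager: reactiveops
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: test-rbac-manager
subjects:
- kind: ServiceAccount
  name: test-rbac-manager
  namespace: rbac-manager
EOF
```

The stricter manifest is the better choice here, since the owner reference is what ties the ClusterRoleBinding back to the RBACDefinition that produced it.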

Finally writing the test file

To summarize, the test we want to write should:

  1. Apply the RBACDefinition in the cluster
  2. Verify the corresponding ClusterRoleBinding is created by the operator
  3. Clean up and move to the next test

Such a Chainsaw test can be written like this:

cat > chainsaw-test.yaml << EOF
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: clusterrolebindings
spec:
  steps:
  - try:
    # create resources in the cluster
    - apply:
        file: resources.yaml
    # verify the operator reacted as expected
    - assert:
        file: expected.yaml
EOF

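Please note that the file containing the test is named chainsaw-test.yaml. As the "Using test file" line in the output below shows, this is the file name Chainsaw looks for with its default configuration.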

Invoking Chainsaw

To execute the test we just created against the local cluster, we need to invoke Chainsaw with the test command.

chainsaw test

Version: 0.1.0
Loading default configuration...
- Using test file: chainsaw-test.yaml
- TestDirs [.]
- SkipDelete false
- FailFast false
- ReportFormat ''
- ReportName 'chainsaw-report'
- Namespace ''
- FullName false
- IncludeTestRegex ''
- ExcludeTestRegex ''
- ApplyTimeout 5s
- AssertTimeout 30s
- CleanupTimeout 30s
- DeleteTimeout 15s
- ErrorTimeout 30s
- ExecTimeout 5s
Loading tests...
- clusterrolebindings (.)
Running tests...
=== RUN   chainsaw
=== PAUSE chainsaw
=== CONT  chainsaw
=== RUN   chainsaw/clusterrolebindings
=== PAUSE chainsaw/clusterrolebindings
=== CONT  chainsaw/clusterrolebindings
    | 13:41:26 | clusterrolebindings | @setup   | CREATE    | OK    | v1/Namespace @ chainsaw-ample-racer
    | 13:41:26 | clusterrolebindings | step-1   | TRY       | RUN   |
    | 13:41:26 | clusterrolebindings | step-1   | APPLY     | RUN   | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:41:26 | clusterrolebindings | step-1   | CREATE    | OK    | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:41:26 | clusterrolebindings | step-1   | APPLY     | DONE  | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:41:26 | clusterrolebindings | step-1   | ASSERT    | RUN   | rbac.authorization.k8s.io/v1/ClusterRoleBinding @ *
    | 13:41:26 | clusterrolebindings | step-1   | ASSERT    | DONE  | rbac.authorization.k8s.io/v1/ClusterRoleBinding @ *
    | 13:41:26 | clusterrolebindings | step-1   | TRY       | DONE  |
    | 13:41:26 | clusterrolebindings | @cleanup | DELETE    | RUN   | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:41:26 | clusterrolebindings | @cleanup | DELETE    | OK    | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:41:26 | clusterrolebindings | @cleanup | DELETE    | DONE  | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:41:26 | clusterrolebindings | @cleanup | DELETE    | RUN   | v1/Namespace @ chainsaw-ample-racer
    | 13:41:26 | clusterrolebindings | @cleanup | DELETE    | OK    | v1/Namespace @ chainsaw-ample-racer
    | 13:41:31 | clusterrolebindings | @cleanup | DELETE    | DONE  | v1/Namespace @ chainsaw-ample-racer
--- PASS: chainsaw (0.00s)
    --- PASS: chainsaw/clusterrolebindings (5.28s)
PASS
Tests Summary...
- Passed  tests 1
- Failed  tests 0
- Skipped tests 0
Done.

Chainsaw will discover tests and run them, either concurrently or sequentially depending on the tool and test configuration.
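
The settings printed at the top of the output above (timeouts, fail-fast, and so on) come from the default configuration and can be changed. The sketch below writes a configuration file that raises the assert timeout, stops at the first failure, and runs one test at a time. The field names (`timeouts.assert`, `failFast`, `parallel`), the `--config` flag, and the file name `chainsaw-config.yaml` reflect our understanding of the `v1alpha1` Configuration resource, not something verified against every release, so compare them with `chainsaw test --help` for your version.

```shell
# Hypothetical configuration file; field names should be checked against your Chainsaw version.
cat > chainsaw-config.yaml << 'EOF'
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Configuration
metadata:
  name: custom-config
spec:
  # give slow operators more time before an assert is considered failed
  timeouts:
    assert: 1m
  # stop at the first failure
  failFast: true
  # run at most one test at a time
  parallel: 1
EOF

# Then, with a cluster available:
#   chainsaw test --config chainsaw-config.yaml
```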

A more advanced test

In the test above, we only covered the creation of RBACDefinition resources. While it’s a good starting point, we also want to test updates and deletions. If we delete an RBACDefinition resource for example, the corresponding ClusterRoleBinding should be deleted from the cluster by the operator.

Chainsaw can easily do that. We just need to add two more operations to our test step: one to delete the RBACDefinition and one to verify the ClusterRoleBinding is deleted accordingly.

cat > chainsaw-test.yaml << EOF
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: clusterrolebindings
spec:
  steps:
  - try:
    # create resources in the cluster
    - apply:
        file: resources.yaml
    # verify the operator reacted as expected
    - assert:
        file: expected.yaml
    # delete previously created resources
    - delete:
        ref:
          apiVersion: rbacmanager.reactiveops.io/v1beta1
          kind: RBACDefinition
          name: rbac-manager-definition
    # make sure expected resources have been deleted
    - error:
        file: expected.yaml
EOF

Running Chainsaw again

If we execute this new test, Chainsaw will now verify that deleting a resource has the expected effect in the cluster.

chainsaw test

Version: 0.1.0
Loading default configuration...
- Using test file: chainsaw-test.yaml
- TestDirs [.]
- SkipDelete false
- FailFast false
- ReportFormat ''
- ReportName 'chainsaw-report'
- Namespace ''
- FullName false
- IncludeTestRegex ''
- ExcludeTestRegex ''
- ApplyTimeout 5s
- AssertTimeout 30s
- CleanupTimeout 30s
- DeleteTimeout 15s
- ErrorTimeout 30s
- ExecTimeout 5s
Loading tests...
- clusterrolebindings (.)
Running tests...
=== RUN   chainsaw
=== PAUSE chainsaw
=== CONT  chainsaw
=== RUN   chainsaw/clusterrolebindings
=== PAUSE chainsaw/clusterrolebindings
=== CONT  chainsaw/clusterrolebindings
    | 13:50:35 | clusterrolebindings | @setup   | CREATE    | OK    | v1/Namespace @ chainsaw-causal-cobra
    | 13:50:35 | clusterrolebindings | step-1   | TRY       | RUN   |
    | 13:50:35 | clusterrolebindings | step-1   | APPLY     | RUN   | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:50:35 | clusterrolebindings | step-1   | CREATE    | OK    | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:50:35 | clusterrolebindings | step-1   | APPLY     | DONE  | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:50:35 | clusterrolebindings | step-1   | ASSERT    | RUN   | rbac.authorization.k8s.io/v1/ClusterRoleBinding @ *
    | 13:50:35 | clusterrolebindings | step-1   | ASSERT    | DONE  | rbac.authorization.k8s.io/v1/ClusterRoleBinding @ *
    | 13:50:35 | clusterrolebindings | step-1   | DELETE    | RUN   | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:50:35 | clusterrolebindings | step-1   | DELETE    | OK    | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:50:35 | clusterrolebindings | step-1   | DELETE    | DONE  | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:50:35 | clusterrolebindings | step-1   | ERROR     | RUN   | rbac.authorization.k8s.io/v1/ClusterRoleBinding @ *
    | 13:50:35 | clusterrolebindings | step-1   | ERROR     | DONE  | rbac.authorization.k8s.io/v1/ClusterRoleBinding @ *
    | 13:50:35 | clusterrolebindings | step-1   | TRY       | DONE  |
    | 13:50:35 | clusterrolebindings | @cleanup | DELETE    | RUN   | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:50:35 | clusterrolebindings | @cleanup | DELETE    | DONE  | rbacmanager.reactiveops.io/v1beta1/RBACDefinition @ rbac-manager-definition
    | 13:50:35 | clusterrolebindings | @cleanup | DELETE    | RUN   | v1/Namespace @ chainsaw-causal-cobra
    | 13:50:35 | clusterrolebindings | @cleanup | DELETE    | OK    | v1/Namespace @ chainsaw-causal-cobra
    | 13:50:40 | clusterrolebindings | @cleanup | DELETE    | DONE  | v1/Namespace @ chainsaw-causal-cobra
--- PASS: chainsaw (0.00s)
    --- PASS: chainsaw/clusterrolebindings (5.32s)
PASS
Tests Summary...
- Passed  tests 1
- Failed  tests 0
- Skipped tests 0
Done.

Conclusion

In this short blog post we demonstrated how Chainsaw can be useful to test Kubernetes operators.

Chainsaw can go a lot deeper and offers many more features than what we demonstrated here.

If you’re writing an operator, chances are you need to write end to end tests, and this can be painful. Chainsaw can help tremendously by letting you focus on the tests you need rather than on writing and maintaining a test framework.

Using it within the Kyverno project helped improve the test coverage by orders of magnitude. Converting issues into end to end tests is often a matter of copying and pasting a couple of manifests. This simplicity does more than help fix issues: it prevents regressions, because each fix comes with a test that continuously verifies the issue does not happen again.