A question that comes up every now and then is whether it’s possible to compose policies based on dynamic attributes provided with the request when querying Open Policy Agent (OPA) for decisions. Could we for example provide a group, team or role name as part of the input and have OPA evaluate all policies provided for that group, team or role, but no additional policies other than those? Imagine you have several teams in an organization, each of them with their own responsibilities. These teams might include a compute team, a storage team and a networking team.
When an end user tries to create a new resource, say on Kubernetes or a public cloud, we want OPA to decide whether the resource is safe or not. At this point OPA can look at the resource, and if it's a compute workload (a VM, container or lambda), delegate the decision to the compute team's policy. Likewise for network and storage.
Developers experienced in object-oriented languages might be familiar with the concept of dynamic dispatch, where an interface defines an action to be performed on some event, but the concrete implementation to run is not selected until the event actually happens at runtime. Similarly, we can use policy composition to determine which policy should be evaluated when a request is received.
There are a couple of reasons why this might be a good idea. First, it helps organize policy documents in logical groups that make sense for the domain. While this type of policy organization is possible by dividing policy into multiple documents and having the application or service query OPA for specific documents, it requires the application to know which documents are relevant to the query, thus creating a coupling between the two.
Secondly, the dataset needed for performing policy decisions could be big enough (several gigabytes) that running an instance of OPA per instance of the application, with all the data loaded into memory, would be prohibitively expensive. While it might be possible to partition the data and distribute it among OPA instances, it could be that all instances of OPA need access to all available data, thus forcing only a few instances of OPA to serve all applications in a system.
Lastly, just like resource and cost constraints sometimes define the design of a system, architectural decisions commonly impose other types of constraints. One such model where policy composition makes sense is that of multi-tenant systems, where many users (potentially thousands) share the same instances of OPA without having access to (or even knowledge of) the policies or data owned by other tenants of the system.
So how do we go about designing policy composition in OPA?
The way we’d normally design this type of policy composition system with OPA is by using a single main policy, which evaluates any incoming request and “routes” the request to one or more target policies determined by some value provided in the input, commonly (but not necessarily) in combination with some predefined data that maps input X to policy Y and Z.
Whether the target policies and rules are provided to OPA when starting the server or provided in the form of bundles is not important for the main policy, but for the sake of a simplified example we can assume a local directory structure with the main policy at the root and one directory per team (compute, storage and network) under policies/.
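A layout consistent with this description could look as follows. Only the path policies/compute/policy1.rego is named in the text; the other file names, including main.rego, are illustrative assumptions.

```text
.
├── main.rego
└── policies
    ├── compute
    │   ├── policy1.rego
    │   └── policy2.rego
    ├── network
    │   └── policy1.rego
    └── storage
        └── policy1.rego
```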
With each request to OPA we can expect the provided input to contain at least the action being performed (like create or update) and the type of resource it is being performed on. Assuming the package name of each policy follows the same structure as the directories (so that the policy at policies/compute/policy1.rego would have a package declaration like package policies.compute.policy1) and assuming each target policy has a deny rule returning a set of messages for any violation it reports, we would now be able to start OPA in the root of the policy directory (opa run --server .) and have our service(s) query the main document with a resource type provided in input. The main rule would evaluate the deny rules of each referenced document and aggregate the result.
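To make the conventions above concrete, here is a minimal sketch of what one target policy might contain. The package name follows the directory structure as described; the rule itself, the input.labels.owner field and the message text are hypothetical and only serve as an illustration. The sketch uses `import rego.v1`, which assumes a reasonably recent OPA release.

```rego
# policies/compute/policy1.rego
package policies.compute.policy1

import rego.v1

# Hypothetical rule: any compute resource being created must carry an
# owner label. Each violation adds one message to the deny set.
deny contains msg if {
	input.action == "create"
	not input.labels.owner
	msg := "compute resources must have an owner label"
}
```

Because deny is a set-generating rule, a policy with no violations still evaluates to an empty set rather than being undefined, which keeps aggregation in the main policy simple.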
A super simple example of such a main policy could look something like this.
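A minimal sketch of such a main policy, assuming it lives in a package named main and that `import rego.v1` is available, might be:

```rego
package main

import rego.v1

# Route on input.resource: collect every message from the deny rule of
# every policy found under data.policies[<resource>].
deny contains msg if {
	msg := data.policies[input.resource][_].deny[_]
}
```

With OPA running as a server, this rule can be queried through the Data API by sending a POST request to /v1/data/main/deny with a body such as {"input": {"action": "create", "resource": "compute"}}.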
Only three lines of policy code, but thanks to the expressiveness of Rego that’s really all we need in order to build the base functionality of our “router” or “dispatcher.” Let’s take a look at what’s going on here. Rather than returning a single true or false, the deny rule defined here returns a set of messages. This pattern is common in rules where the policy needs to communicate one or more reasons (or other types of messages, like warnings) back to the caller.
Rather than generating these messages itself, the rule delegates that work to the policies defined under data.policies, namely those found under the name mapped from input.resource. For every policy document found there, we evaluate its deny rule and aggregate all the resulting messages into our main deny rule.
Now, while we could route directly on values provided in input, it often makes sense to map the input against some values known beforehand. In the example provided above we could see that if input.resource was equal to compute, then the compute policies would be evaluated. In a more realistic scenario though, there would probably be many resource types contained under the “compute” category, such as “vm,” “lambda” and “container.” Creating a mapping table to allow for this routing comes with two big benefits: it creates an abstraction between client input and the policies routed to, and it provides basic input validation, since the client can never be routed to policies not contained in the mapping table. Let’s extend our example to include this.
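One way to sketch this is with a mapping defined directly in the main policy; it could equally be loaded as data. The rule name resource_category and the storage and network resource types are assumptions made for illustration; only “vm,” “lambda” and “container” come from the text above.

```rego
package main

import rego.v1

# Hypothetical mapping table from resource type to policy category.
resource_category := {
	"vm": "compute",
	"lambda": "compute",
	"container": "compute",
	"bucket": "storage",
	"loadbalancer": "network",
}

# Look up the category first; an unknown resource type leaves the lookup
# undefined, so no target policy is evaluated for it.
deny contains msg if {
	category := resource_category[input.resource]
	msg := data.policies[category][_].deny[_]
}
```

Note that with this sketch an unmapped resource type yields an empty deny set, so a real deployment would likely want an extra rule that rejects unknown types explicitly.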
Things are starting to look pretty good, and we could certainly leave it like this. One thing you might notice when querying the deny rule of the main policy, however, is that the result doesn’t tell you which policies were evaluated. For debugging purposes that would be pretty useful to know, so let’s extend our policy to include it.
Map generating rules
One underused feature of Rego is rules that generate objects (rather than scalar values or sets). This construct allows us not only to generate a set of violations, but also to group them by the policy from which they originate. Let’s rewrite our routing deny rule to instead generate a map. Since we already have other deny rules that generate sets we’re going to rename this one to “router.”
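A sketch of such a router rule, reusing a hypothetical resource_category mapping table (abbreviated here) and assuming `import rego.v1`, might look like this:

```rego
package main

import rego.v1

# Hypothetical mapping table from resource type to policy category.
resource_category := {
	"vm": "compute",
	"lambda": "compute",
	"container": "compute",
}

# Object-generating rule: the key is the name of each policy found under
# the selected category, the value is that policy's deny set.
router[policy] := violations if {
	category := resource_category[input.resource]
	some policy
	violations := data.policies[category][policy].deny
}
```

A policy that reports no violations still appears in the map with an empty set as its value, which is what makes it visible that the policy was evaluated.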
As you can see, the router rule looks very similar to our previous deny rule, but we now use the policy name as a map key, and each deny set as the map value. If we were to query the router rule directly, we should expect a response to look something like this.
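As an illustration, assume two hypothetical compute policies, policy1 and policy2, where only policy1 reports a violation. Queried through the Data API, where results are wrapped in a "result" key and sets are serialized as JSON arrays, the response would have roughly this shape (the message text is made up):

```json
{
  "result": {
    "policy1": ["compute resources must have an owner label"],
    "policy2": []
  }
}
```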
While this information may be great for debugging, it might be a bit too verbose for most production deployments. With a few additional helper rules we can achieve something that would provide us the best of both options by providing only messages by default, with an option (provided in input) to toggle the more verbose “explain” behavior.
Looking at the finished main policy we’ll see that in addition to the new router rule, we reintroduce the aggregating deny rule. This time however it aggregates from the router rule. In order to achieve a response with an optional “explain,” we’ll create another map rule. This time we know the keys are going to be static, so we might as well provide the keys as strings directly in the rule heads.
The values are however still variables, and both the “allow” and “reason” items are built from aggregating the deny rule. The “explain” item is not just variable but even conditional on input.explain being present and true. This way we’ll allow our clients to decide whether they need this extra bit of information as part of the response. The decision rule can now be queried directly, and as it is a map we could extend it further in the future.
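Put together, the finished main policy could be sketched as below. The resource_category table is a hypothetical, abbreviated mapping, and joining the messages into a single “reason” string with concat is an assumption; returning the deny set as-is would work just as well.

```rego
package main

import rego.v1

# Hypothetical mapping table from resource type to policy category.
resource_category := {
	"vm": "compute",
	"lambda": "compute",
	"container": "compute",
}

# Violations grouped by the policy that reported them.
router[policy] := violations if {
	category := resource_category[input.resource]
	some policy
	violations := data.policies[category][policy].deny
}

# Flat set of all messages, aggregated from the router rule.
deny contains msg if {
	msg := router[_][_]
}

# Static keys in the rule heads, computed values.
decision["allow"] := count(deny) == 0

decision["reason"] := concat(", ", deny)

# Only present when the client asks for it.
decision["explain"] := router if input.explain == true
```

A client that sends "explain": true in its input gets the per-policy breakdown in addition to “allow” and “reason”; all other clients get just those two keys.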
Last little thing! With OPA configured like this, we might want to consider changing the default decision to point to the path of the decision rule in our main policy. This means that clients would not even need to provide a path with their query, but could just use the root path of the OPA server. Quite convenient.
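Assuming the decision rule lives in a package named main, a configuration fragment along these lines would do it. The file name config.yaml is arbitrary, and the file is best kept outside the policy directory so that OPA doesn't also load it as data.

```yaml
# config.yaml
# Serve data.main.decision for queries against the root path of the server.
default_decision: main/decision
```

The server would then be started with something like opa run --server --config-file config.yaml followed by the path to the policy directory. Note that for queries against the root path the request body itself is used as the input document, rather than being wrapped in an "input" key.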
One way to look at what we've done here is that we're resolving conflicts across teams. A classic problem in policy is that a request comes in, and different policies make different decisions, and somehow those need to get resolved to return a single, cogent answer to the caller. Usually this conflict-resolution logic is hard-coded in Go, Java or some other language, but OPA gives the policy author the ability to control that logic, because after all, resolving conflicts is arguably a policy decision too.
A more classic kind of conflict resolution is when multiple statements within a single policy conflict, e.g. allow and deny both evaluate to true. You can resolve these conflicts as well. We sometimes call this keyword-based conflict resolution.
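As a small sketch of what that can look like, the policy below lets deny override allow. The package name, the authorized rule and the conditions on input.user and input.action are all hypothetical; the point is only that the resolution logic is itself written in Rego.

```rego
package conflict_example

import rego.v1

default allow := false

default deny := false

default authorized := false

# Hypothetical statements that may both be true for the same request.
allow if input.user == "alice"

deny if input.action == "delete"

# Conflict resolution: deny wins whenever both allow and deny are true.
authorized if {
	allow
	not deny
}
```

Swapping the body of authorized for a different combination of allow and deny changes the resolution strategy without touching the individual statements.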
And of course you can combine team-based conflict resolution and keyword-based conflict resolution as well.
Whether you intend to use policy composition in your OPA systems or not, if you’ve made it this far, I hope you’ve at least found the idea interesting! Given the versatile nature of OPA, you never know when you’ll face a system to integrate with where policy composition is perfect for the use case. If you have questions or would like to share your own experiences around policy composition, feel free to do so on the OPA Slack! Finally, if you’d like to quickly try out the example policies without copy-pasting from the examples provided here, the full example is available on GitHub.