Group Filtering

The code function of git-ws is to allow having git based projects to have dependencies that they can pull in easily, stashing everything together within a large workspace.

These workspaces quickly can grow large, especially when dependencies bring in transitive dependencies. To allow bringing down the size of workspaces, group filtering can be used. But, what is a group filter?

To understand this, let’s step back and have a look at the dependencies a project might have:

Some of them will be hard dependencies, e.g. libraries that a project depends on that are absolutely needed for the project in question to be able to function.

On the other side, there are soft dependencies. These dependencies are not needed for the project to work. These could, for example, be dependencies used during testing the project. When a project is built, it needs to be tested as well in order to guarantee stability and functionality. But: Once tested (and released), when the project is used within another one as a dependency, then these tests don’t necessarily need to be run.

To make this easier to grasp, let’s consider the following scenario: We have a library PrintLib, that has a dependency to another library called IOLib plus - for testing - a dependency towards SimpleUT for writing unit tests:

PrintLib
├── IOLib
└── SimpleUT

Now, let’s say we write a simple calculator tool, where we want to use PrintLib. The dependency tree in this case would look like this:

Calculator
└── PrintLib
    ├── IOLib
    └── SimpleUT

With this, the final workspace created would have four sub-folders:

  • Calculator

  • PrintLib

  • IOLib

  • SimpleUT

But: What if for Calculator we don’t want to use SimpleUT? By default, we would still get it, even if we never run the unit tests of PrintLib. This is where group filtering can help. In the manifest of PrintLib, we can put SimpleUT into a group - let’s call it dev:

[[dependencies]]
name = "IOLib"
revision = "v42.0"

[[dependencies]]
name = "SimpleUT"
revision = "v3.2.0"
# Put this dependency into a group called "dev":
groups = ["dev"]

Without further ado, what would happen now? It’s as simple as that:

  • When we create a workspace for PrintLib, all of its dependencies will be included.

  • But when creating a workspace for our Calculator app, the SimpleUT dependency would not get installed!

Perfect! So, for simple cases, these simple rules apply:

  1. Direct dependencies of a project will always be pulled into the workspace.

  2. All transitive dependencies that are in any group will be skipped.

With these simple rules, creating efficient modules is quite easy: Whatever is a strict dependency should get no group assigned. Everything optional in turn could get assigned any group (or groups). You can use any group you like for this, e.g. you could have a group used for unit testing, another one for linters and code formatters used in your project, and so on.

Enabling Groups

So now you know how to create efficient projects which hide parts of their dependencies to contribute to smaller workspaces. But, what if you want to install groups that got deselected due to the group filtering? There are two ways to do so.

Filtering Groups on the Command Line

A lot of the commands of git-ws tool allow you to specify group filters via the --group-filter (or -G) option:

git ws clone -G +dev https://example.com/Calculator.git

This would create a workspace for the Calculator project, including also transitive dependencies that are in the dev group.

When the --group-filter option is used during the git ws init or git ws clone operations, the filter is stored in the workspace settings (and can be updated using the git ws config set command). The option can be used multiple times to specify additional groups. And, you are not limited to enabling groups. A group filter string is structured like this:

(+|-) group [ @ path]

So in more prose:

  • A group filter expression always starts with a + (to select) or a - (to deselect) a group.

  • It is followed by a group name, where group names are valid identifier names.

  • Optionally, there can be an @ character, followed by a path. In this case, the filter is applied only to the project specified by the path.

Here are some examples:

  • As shown above, a group filter of +dev would enable the group of development dependencies also for transitive dependencies.

  • On the other side, we could disable dependencies e.g. for generating documentation via -doc. If explicitly specified, this would exclude dependencies of the main project.

  • Finally, we could selectively select groups, e.g. like +network@PrintLib. This would enable the network group of our PrintLib dependency (which could e.g. be used to pull in optional libraries that allow it to provide input and output via network connections).

Filtering Groups Via The Manifest

Sometimes, controlling groups via the command line might not be convenient. Consider the last example from the previous section: Assuming the PrintLib library has optional dependencies for networking input/output, if we mark them as optional by putting them into a network group, they would - by default - not be installed into a workspace of our Calculator app. But what if we actually want this dependency in most cases? Telling everyone to use a special, non-standard command for initializing a workspace is certainly a bad idea. In this case, we can specify the appropriate dependency directly in the manifest of our Calculator app:

group-filters = ["+network@PrintLib"]

With this, everyone would - by default - also get the networking dependencies of PrintLib in their workspace (unless they override the group filter on the command line).

Last Match Wins

As we’ve seen, group filters can be set in various locations: In the main repo’s manifest, in manifest files of dependent repositories itself but also “manually” on the command line.

Naturally, such filters might conflict - so it is important to understand which takes precedence and how the evaluation works.

A group filter is - basically - a list of instructions turning individual groups of repositories on or off. The initial filter is build from the main repo’s manifest. While evaluating the dependency tree, we’ll append filter expressions of dependencies to that list of filters. Before actually evaluating a filter on a repository, the filter set on the command line is appended to the list.

What happens technically is, that the last matching expression in this sequence takes precedence. In other words: If git-ws has to decide for a given repository if it needs to be included in the workspace or not, it will evaluate the list of filter expressions in order. The decision (include or exclude) of the last matching expression will be used to determine if this particular repository will be included or not.

This has some handy implications:

  • We can always specify group filters on the command line. They’ll override filters set in the manifests.

  • We can specify sane defaults for group filters in the main repo. However, to aid encapsulation of information, if a dependency absolutely needs some peer repositories next to it, it can define a group filter to pull them in - even if the main repo excluded the specific group these dependencies belong to.