Auto-generated files are pretty common in software projects. A project that needs them generally handles them in one of three ways:

  • An external, pre-built tool generates the files before compilation, and the generated files are checked into the tree.
  • The tool is compiled in-tree, but is still run before compilation, and its output is checked into the tree.
  • The tool is built during the main build step and generates the needed files just in time.

Arguably, the last method is usually the best one. It keeps friction to a minimum during development, because the files are always regenerated from whatever has changed locally. It also guards against human error: someone forgetting to commit the separately generated files, or the tree being out of sync for a while because the generated and hand-written files were committed separately. The downside is that the generated files are not available to someone reading the code statically to understand or debug it. But one could always combine both methods if so desired.

I recently needed to generate some files for a side project of mine, so I thought I would write down how I achieve this with Bazel, using the last mechanism mentioned above. Our (simplified) problem statement is:

  • Our primary code is in main.cc which needs to include an auto-generated header file header.h.
  • header.h is generated by running a tool header_generator.
  • header_generator is built by compiling header_generator.cc.
  • We only want to run a single build step: building our primary target. Everything else should happen on its own, driven by the dependency graph.

To achieve this, we work through the dependencies in reverse order. First, let’s define the straightforward header_generator target. It has no dependencies and can be built standalone as well.

cc_binary(
  name = "header_generator",
  srcs = ["header_generator.cc"],
)
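For illustration, here is a minimal sketch of the core of header_generator.cc. The generated constant kGeneratedValue and the helper names WriteHeader and HeaderText are made up for this example; my real generator differs. The only thing that matters for this setup is that the tool prints the header text to standard output.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Writes the text of the generated header to the given stream.
// The contents are hypothetical: a single constant, just to have
// something for main.cc to use.
void WriteHeader(std::ostream& out) {
  out << "#pragma once\n"
      << "\n"
      << "// Generated by header_generator. Do not edit.\n"
      << "constexpr int kGeneratedValue = 42;\n";
}

// Convenience wrapper that returns the header text as a string,
// which makes the generator easy to check in a unit test.
std::string HeaderText() {
  std::ostringstream buffer;
  WriteHeader(buffer);
  return buffer.str();
}

// In header_generator.cc, main() would only need to call
// WriteHeader(std::cout) and return 0.
```

Keeping the generation logic in a function that takes a stream, rather than writing to std::cout directly, is just a design choice that makes the output testable. The tool does not need to know the output file name, since the build rule can redirect its standard output.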

Next, we define a library target that provides header.h. If this were a hand-written or pre-generated file, the target would simply be:

cc_library(
  name = "header",
  hdrs = ["header.h"],
)

But this file does not exist in our tree, so Bazel wouldn’t know where to find it. Instead, we point the hdrs attribute of the library at another target, a genrule, which generates the file on the fly.

Bazel documentation: A genrule generates one or more files using a user-defined Bash command. Genrules are generic build rules that you can use if there’s no specific rule for the task.

genrule(
  name = "generate_header",
  outs = ["header.h"],
  cmd = "./$(location :header_generator) > $@",
  tools = [":header_generator"],
)

cc_library(
  name = "header",
  hdrs = [":generate_header"],
)

This target runs the string given in the cmd attribute as a Bash command. The labels listed in the tools attribute tell Bazel which targets must be built before the command runs. $(location :header_generator) expands to the path of the compiled header_generator binary, and $@ expands to the output file, which works here because outs lists exactly one file. The command therefore redirects the tool’s standard output into header.h.

Finally, we wire this into our primary target as usual, making it depend on the library that contains the generated header file.

cc_binary(
  name = "main",
  srcs = ["main.cc"],
  deps = [":header"],
)

With this set of rules, whenever I run bazel run //src:main, Bazel first compiles the header_generator binary, then runs it to generate header.h, then compiles main.cc along with these dependencies, and finally runs the resulting binary.