What is a Golden File in software development?

Golden Files to run your test with large input and output data

While programming, one of the best things you can do to make sure that your software works as expected is to write tests.

Actually, if you write enough tests to make sure that the code works as expected, it should take you between two and three times as long to finish your project. So if it takes you one day to write a function, it will take you another two days to write good tests for that one function.

Many of those tests, especially as you construct more and more complex software, require large amounts of input and output data. It's easy enough to test a function such as strlen(). It has one string as input and one integer as output.

How do you verify a function which takes megabytes of data as input and generates megabytes of data as output?

One solution is to make use of Golden Files.

Before Golden Files

In the old days, I would write code that generated the expected output. That means writing additional code in the test to do what the software being tested is already expected to do.

I still think that this is a good technique as it ensures that the expected output is generated by code separate from the software being tested. However, you are likely to repeat the code you are using to generate that data anyway. In other words, you are replicating work you've already done to write your software.
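As an illustration of that older approach, here is a minimal sketch in Python. Both to_csv() and expected_csv() are hypothetical names made up for this example. The point is only that the test-side generator ends up re-implementing the logic of the function it verifies.

```python
def to_csv(rows):
    """Hypothetical function under test: one comma-separated line per row."""
    return "".join(",".join(str(cell) for cell in row) + "\n" for row in rows)


def expected_csv(rows):
    """Test-side generator, written independently of to_csv()."""
    lines = []
    for row in rows:
        line = ""
        for index, cell in enumerate(row):
            if index > 0:
                line += ","
            line += str(cell)
        lines.append(line + "\n")
    return "".join(lines)


def test_to_csv_large_input():
    # Large input: the expected output is far too big to type in by hand.
    rows = [(i, i * i, f"item{i}") for i in range(10_000)]
    assert to_csv(rows) == expected_csv(rows)
```

The two functions are written differently, yet they encode the same formatting rules, which is exactly the duplicated work described above.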

With Golden Files

The concept of Golden Files is to make use of the test to generate the output file and save that data. In other words, your test uses your software to generate the output data, saves it, and then each and every time you run the test again, it uses that first instance as the expected output.

The one drawback with this technique is that you have to trust your software enough to generate that file with the proper values. The very first time, you should verify that the contents of that file are as expected.

Generally speaking, your test will have an --update command line option which tells it to overwrite the saved data with whatever is newly generated instead of comparing the new output against the existing data.
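A minimal sketch of such a test in Python might look like the following. The names render_report(), check_golden() and parse_args() are assumptions made up for this example, as is the golden/report.txt path. Only the --update option comes from the description above.

```python
import argparse
import tempfile
from pathlib import Path


def render_report(records):
    """Hypothetical function under test: one 'name: value' line per record."""
    return "".join(f"{name}: {value}\n" for name, value in records)


def check_golden(actual, golden_path, update=False):
    """Compare `actual` with the golden file, or overwrite the file if `update` is set."""
    golden_path = Path(golden_path)
    if update:
        # Update mode: whatever the software generated becomes the new expected data.
        golden_path.parent.mkdir(parents=True, exist_ok=True)
        golden_path.write_text(actual, encoding="utf-8")
        return True
    # Normal mode: the saved file is the expected output.
    expected = golden_path.read_text(encoding="utf-8")
    return actual == expected


def parse_args(argv=None):
    """Expose the --update switch on the test's command line."""
    parser = argparse.ArgumentParser(description="Golden file test sketch")
    parser.add_argument("--update", action="store_true",
                        help="rewrite golden files instead of comparing")
    return parser.parse_args(argv)


def demo():
    """Create the golden file once, then compare later runs against it."""
    with tempfile.TemporaryDirectory() as tmp:
        golden = Path(tmp) / "golden" / "report.txt"
        output = render_report([("width", 640), ("height", 480)])
        check_golden(output, golden, update=True)   # first run, with --update
        return check_golden(output, golden)         # later runs, without it
```

In a real test the result of parse_args().update would be passed as the update argument, and the golden file would live next to the tests in version control rather than in a temporary directory.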

One way to limit that drawback is to have a comparison tool which clearly shows you what has changed. That should make it pretty easy for the programmer to know whether the new data is what is now valid (e.g. a field was added, another removed, the value of a field changed, etc.).
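For text output, one possible way to get such a comparison is Python's standard difflib module. The sketch below assumes the golden data and the new output are both available as strings. The golden_diff() helper and the sample field names are made up for this example.

```python
import difflib


def golden_diff(expected, actual, name="golden"):
    """Return a unified diff between the golden data and the new output."""
    return "".join(
        difflib.unified_diff(
            expected.splitlines(keepends=True),
            actual.splitlines(keepends=True),
            fromfile=f"{name} (golden)",
            tofile=f"{name} (actual)",
        )
    )


if __name__ == "__main__":
    old = "name: snap\nversion: 1\n"
    new = "name: snap\nversion: 2\nlicense: GPL\n"
    # Removed lines are prefixed with '-', added lines with '+'.
    print(golden_diff(old, new, name="report"), end="")
```

An empty result means the two versions match. Anything else shows the programmer which fields changed, so they can decide whether to fix the software or rerun the test with --update.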

Many of our tests for Snap! C++ can be found here. We are not yet using Golden Files in our tests, but with more and more tests running against network connections, we are looking into doing so to make sure that large data transfers work better.