We keep our engine and internal projects 100% covered with unit tests, and almost all of that (95%+) is reachable from NCrunch, which we run continuously to make sure we do not break anything when making changes (which we do every day, even when working on other projects). However, this only covers the logic and whatever we wrote in the unit tests. In the past we used functional testing and constantly tested our samples and tests manually to see if things still worked as they should, but after reaching several thousand tests over the years, and with a much smaller team now, we don't have time to constantly test rendering functionality or sample games. Almost all of the sample games had to be removed because we just did not have the time to keep adjusting and testing them. We probably also had way too many, so we reduced them to the bare minimum plus the current project we are working on.
To help with testing the frameworks, graphics and rendering features, we used approval tests in the past, which is done by simply adding an extra attribute to visual tests (any test can be visual in our engine as long as the test class derives from TestWithMocksOrVisually). We also integrated this into our nightly builds in TeamCity, but we constantly had problems with NUnit crashing or the whole TeamCity process freezing up after hundreds of tests (many minutes in), which was impossible to reproduce locally, so we disabled this 2 years ago :(
Now with our new Editor Viewport technology, which works similarly to how Unity3D loads and unloads .NET code (we just do it from inside our engine, unlike Unity3D, which does it from native code with some super old Mono runtime), I could finally reactivate approval tests on TeamCity, and this is what it looks like (pretty proud of myself; it does not look like much, but this took a long time to get working):
- Currently runs about 150 visual tests
- Runs each of them until 0.4s of test time has passed (in reality this takes less than 20ms each)
- Takes a screenshot (1280x720; this and the comparison take up most of the time, around 50ms)
- Checks that it matches the approved screenshot almost pixel-perfectly (the approved one is either from an earlier run or one we specify)
- A 0.1% difference is allowed (not all frameworks render things exactly the same, pixel for pixel)
- Repeats this for all of our now 12 frameworks to make sure it looks correct on all supported platforms (more on that in a later post)
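
The comparison step with its 0.1% tolerance can be sketched roughly like this. This is not our engine code (which is C#), just a minimal language-neutral illustration in Python. The names `difference_ratio` and `is_approved` are made up for this example, and it assumes that "0.1% difference" means the fraction of pixels that are not identical, which is only one possible way to measure it:

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def difference_ratio(approved: Sequence[Pixel], actual: Sequence[Pixel]) -> float:
    """Return the fraction of pixels that differ between two screenshots.

    Assumption: a pixel counts as different if any color channel differs.
    """
    if len(approved) != len(actual):
        raise ValueError("screenshots must have the same dimensions")
    if not approved:
        return 0.0
    differing = sum(1 for a, b in zip(approved, actual) if a != b)
    return differing / len(approved)


def is_approved(approved: Sequence[Pixel], actual: Sequence[Pixel],
                tolerance: float = 0.001) -> bool:
    """Accept the new screenshot if at most `tolerance` (0.1%) of pixels differ."""
    return difference_ratio(approved, actual) <= tolerance


# Hypothetical 1280x720 screenshots: an all-black approved image and a
# new image in which 500 pixels (about 0.05%) came out white.
WIDTH, HEIGHT = 1280, 720
approved = [(0, 0, 0)] * (WIDTH * HEIGHT)
actual = list(approved)
for i in range(500):
    actual[i] = (255, 255, 255)

print(is_approved(approved, actual))
```

At 1280x720 a 0.1% tolerance works out to roughly 920 pixels, so small anti-aliasing differences between frameworks pass while a missing sprite or broken shader does not.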