I’ve been working on the new test/benchmark programme.
It’s a lot of work. There’s a -lot- of code.
Problem is, the test code is not serious. All it does is run lots of threads doing lots of work and tries to make things go wrong – there’s no directed effort to cause unusual cases, it’s just run-and-hope – and there’s fairly limited checking you can do at the end of the test to see if data is correct.
It does find bugs, but that’s the most you can say.
I need and I’ve known I’ve needed for a long time a more formal, rigourous test.
This is not a straightforward problem to solve. There’s ongoing research work into how to do this in a timely and practical manner – which is to say, on unmodified or essentially unmodified source code, no translating the source code into a model for a modeller (since such work obviously leads to bugs).
I read a white paper just now which is fairly recent (2014) and what’s being done basically is enumeration of paths of execution, with culling for paths which are equivelent, with the compiler back-end being co-oped to notice when memory accesses occur (although I don’t know how it really knows, since what the hardware does is different to what the compiler does – OTOH, in theory, the world looks normal to a single thread so maybe that all works out, and you the processor ensuring the world looks normal is enough?), and then there’s a correctness checker which I’ve not yet read about, so I’m not sure how it works.
The code is on github, it’s for C++, but you know open source – it doesn’t work. I’m going to need to write my own.