Another weekend fruitlessly trying to build GCC and glibc

Well, I think I made some progress, but I can’t be sure yet.

I’ve stopped trying to build binutils as part of a combined build with GCC. The docs say you can, but they’re wrong, AFAICT. What you have to do (the docs don’t explain this and it took me the first weekend to find this out) is dump the entire *contents* of binutils into the GCC dir. Problem is, both have an include dir, and they share some files, and the versions differ.
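To be concrete about what “contents” means (versions purely illustrative):

    # what the docs apparently mean: merge the binutils *contents* into the GCC source root
    cp -a binutils-2.17/* gcc-4.4.3/

And the include dir is exactly where that merge bites, because cp happily overwrites the colliding files.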

I read that the GCC files take precedence, but when I tried that, it did not build.

I also tried putting the binutils dir itself in the GCC dir (rather than the contents of the binutils dir), and you can make this work with some hackery. You have to make softlinks in the GCC dir to the build results of the binutils dir (which configures and builds just fine; it’s only that its output is in the wrong place, and you only find out you’ve done the wrong thing during the make – this is completely typical of GCC building). Those softlinks get wiped during every build stage, and a generated build script is built wrongly – it doesn’t know where nm is – so you need to fix it by hand. But the upshot of this binutils is that although it builds, it’s fucked; when you come to make glibc, you get freaky errors about CFI instructions, and there’s NOTHING in Google about what causes that problem…

I *think* what’s happening is *maybe* the OS native binutils are being used *in part* during the build – so the only binutils build I’ve made work is a stand-alone build, making binutils on its own, with the OS native compiler, and then installing it in /usr/local and using update-alternatives to get it into use.
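For the record, the stand-alone build is nothing special; it’s just the only variant which has actually worked for me (version and alternatives priority illustrative):

    # stand-alone binutils, built with the OS native compiler
    tar xf binutils-2.25.tar.bz2
    cd binutils-2.25
    ./configure --prefix=/usr/local
    make
    sudo make install
    # then get the new tools into use, e.g. for ld
    sudo update-alternatives --install /usr/bin/ld ld /usr/local/bin/ld 50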

That’s been the main advance I made this weekend (two solid twelve-hour days – again).

That lets me build glibc up to a point – it ends up complaining it can’t find “crt1.o”. This was a surprise, as I expected glibc wouldn’t be linking binaries, but it does. At this point it has already *made* crt1.o (with the GCC and binutils I’ve made), but it’s not *seeing* it. However, if I tell it about the stuff it’s made (LIBRARY_PATH, LD_LIBRARY_PATH), what I find is that the make seems to be calling iconv, which then core dumps because it’s using the libc shared object the build has just made…
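By “tell it about the stuff it’s made” I mean something like this (paths illustrative); and it’s the LD_LIBRARY_PATH part which, I think, makes every host tool the make spawns, iconv included, load the freshly built libc.so:

    # point the compile-time and run-time library search paths at the glibc build dir
    export LIBRARY_PATH=$HOME/glibc-build:$HOME/glibc-build/csu
    export LD_LIBRARY_PATH=$HOME/glibc-build
    make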

So, obviously, I’m messing it up somehow, but fuck me if I have any idea where. There are no meaningful docs (there is a page or two about building glibc; it’s like having one or two pages explaining, say, Egyptian hieroglyphs) and making pretty much any change induces a range of new, bizarre, wholly unexpected build errors – and this is before you’ve even got ANY build going. Like adding “--disable-shared”, which you might think would simplify things: it causes the make to fail almost immediately due to a missing header file… which apparently is not a problem if you ENABLE shared. How does that even make any sense…

However, I can see that glibc builds fine with the OS native compiler and tools, so the problem is down to me somewhere – in how I’m building or installing GCC, and/or how I build glibc. It’s just that these are complex build systems, with no docs and completely meaningless error messages, which I think are often massively downstream of the real problem.

I am basically now of the view that the only people who can make GCC and glibc build are the developers on those two projects.

glibc (the pain continues)

Well, after building GCC, my tolerance for fucked up builds has massively increased.

I’ve built 6.2.0, 6.1.0 and 5.4.0 now and I thought it prudent to get a glibc build passing, in case there’s something amiss with my GCC builds.

The glibc build is fucked, or maybe Debian is fucked, but it seems – compared to my new high tolerance – to be dead simple; although I’ve yet to get it working, and there’s no end of scope for totally fucked up insanity yet.

Problem is that glibc looks at /usr/include/linux/version.h to know the version of the libc kernel headers, and it looks like Debian, despite now being on kernel 4.7.0, is only shipping libc kernel headers for 3.16.36.

Not helpful.
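The fix, as far as I can tell, means getting a kernel source tree involved; headers_install is the kernel’s own sanitised-header export, and glibc’s configure can be pointed at the result (paths and versions illustrative):

    # export sanitised 4.7 headers from a kernel source tree
    cd linux-4.7
    make headers_install INSTALL_HDR_PATH=$HOME/linux-headers-4.7
    # then tell glibc's configure where they are
    ../glibc-2.24/configure --prefix=$HOME/glibc-2.24 --with-headers=$HOME/linux-headers-4.7/include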

So now off to fucking sort out the fucking fucked up fucking Linux build process once again, and indeed every fucking time I ever use it for anything, ever, amen.

General update

So, basically two lines of work in the air right now.

First, which is making progress, is building every GCC starting with 4.1.2, first on x64, then ARM32, ARM64 and MIPS32. Once done, then building liblfds with every version of GCC on all those platforms, and making sure test passes and then comparing the benchmarks.

Tell ya somethin’ – GCC 6.x.x takes freaking FOREVER to build. Four core i5 with SSD, takes about two hours.

Second, which is not making much progress, is thinking over the SMR design. I’ve been trying to think up a lock-free epoch-based design, which would mean not being forced to process generations sequentially – being able, where necessary, to skip generations until they became releasable – but I think it’s not possible in any obvious way. The information just can’t be obtained, even more so when thinking in terms of memory barriers only, where you depend on the atomic op of the data structure operation to publish.

I now see very clearly why hazard pointers are so good, but I don’t just want to use someone else’s idea – though given the limits imposed by hardware capabilities, it is hard to see any other approach.

I’m still waiting for the PINE64 to arrive, which is a blocker for the next release.

I need also to sort out the mysterious performance behaviour of the unbounded/single/single queue and also try to get the elimination layer on the freelist scaling.

I have also been told (and I think it’s true – I knew, really) that casting *away* volatile is undefined behaviour. That sucks, because being able to do so gives me the ability to do single-threaded ops on the data structures (and then, after such work, flip back to thread-safe behaviour with a function call).

Maybe I can determine behaviour for GCC and clang, and offer it on those compilers.

Building GCC

I’ve improved and extended the GCC article into a general article on building (curse, spit, etc) GCC.

http://liblfds.org/mediawiki/index.php?title=Building_GCC

It has a matrix of GCC/dependency release dates (so you can try to figure out which versions to use), general advice, specific advice, and currently explicit, tested instructions for building 4.4.3, which I will (God help me) be extending to cover all the versions of GCC I can manage to make build.

Today’s installment of GCC building disaster, insanity and catastrophe

GCC docs say this;

“If you also intend to build binutils (either to upgrade an existing installation or for use in place of the corresponding tools of your OS), unpack the binutils distribution either in the same directory or a separate one.”

And they then explain the separate-dir case.

So, what do I think they mean? I think they mean put the root binutils dir in the GCC sources, because otherwise I’ll be spaffing the ton of directories in binutils into the GCC source root and that’s nuts.

Guess what?

Nuts is the order of the day.

Now that wouldn’t be too bad – well, except it is – except that if you in fact do what I did, and put binutils into the GCC source dir, GCC WILL FIND IT, CONFIGURE IT AND BUILD IT. It will ONLY complain at the very end, when it will say it can’t find “ar”, because it’s now looking – finally, despite getting it all right up till then – in the wrong directory.

But it gets worse.

You see, now, having finally worked out how this way of building GCC works, I now simultaneously find out *IT DOES NOT WORK*.

That’s right.

You see, binutils has an include directory, and so does GCC, and they share some files, and the files differ.

I now understand a post I read on Stack Overflow, where the poster was talking about making softlinks to all the binaries made by binutils.

Open Source: broken out of the box and no docs.

binutils is death

I’ve spent the day trying to compile old versions of GCC.

GCC itself seems okay – but binutils is a freaking catastrophe.

Binutils is basically fucked. AFAICT, all versions compile with warnings, and all versions have warnings-as-errors turned on.

Remember that I’m building binutils through GCC – just putting the binutils source dir into the GCC source dir and letting GCC do the work.

I think now that --disable-werror passed to the GCC ./configure propagates to the binutils ./configure; this has got me to the point where the binutils (2.17) build seems to be breaking because… it’s trying to call something called “cc1obj” and it can’t – rightly so, because it’s not there – but I *think* cc1obj is actually the *Objective-C compiler*…
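So the top-level configure currently looks something like this (paths illustrative):

    # binutils source sitting inside the GCC source dir; werror disabled for both
    ../gcc-4.1.2/configure --prefix=/opt/gcc-4.1.2 --disable-werror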

I’ve also found out GCC depends on a couple of external math libraries (GMP and MPFR), which I also need to download and have GCC build, so I need to add them to the version matrix in the docs.

===

So, later…

2.27 will build, but it emits its “ar”, etc, into a directory which is different to the directory GCC looks for them in.

I guess maybe later GCCs look in the right place, so I should stick to the chronologically matching binutils version (2.17). So I go back to 2.17. I install the Objective-C compiler and the “can’t exec cc1obj” bug is fixed. So binutils is in fact dependent on Objective fucking C. It would be nice if I could find some – ANY – documentation on this. We are however talking Linux, which means broken-out-of-the-box-with-no-docs. They say there’s no guarantees with Linux. Believe me, this isn’t true. There really *is* a guarantee.

So, Objective C installed and now we’re on to the next problem – which turns out to be that modern gnumake has a built-in rule for the suffix “.m”, and that built-in rule is wrong for the gprof makefile.

I mean, shit. If you’re going to fuck it up, fuck it up REAL GOOD.
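If you want to see the offending rule, GNU make will print its built-in database, and -r turns the built-ins off entirely, which might be a gentler workaround than downgrading make (I haven’t verified -r against this build):

    # print make's internal database and pull out the built-in Objective-C rule
    make -p -f /dev/null 2>/dev/null | grep -A2 '^%\.o: %\.m'
    # run with all built-in rules disabled
    make -r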

===

More later. So I checked the release dates on gnumake. Sure enough, it looks like binutils 2.17 would have been built with 3.80, and 3.80 doesn’t have an implicit rule for “.m”.

So I download 3.80, make it, put it in /usr/local/bin.

Now here’s the thing; “which make” gives me /usr/local/bin/make.

But “make -v” gives me version 4.0.0 – the make in /usr/bin.

If I rename /usr/bin/make to /usr/bin/make.old, I get a “can’t find file” error when I do “make -v”.

In other words, there is a totally unexpected hidden extra layer masking which make you use. WHY THE FUCK DOES THIS EVEN EXIST?

Why do I even HAVE /usr/local IF THE OS IS GOING TO HARD CODE USE OF /USR/BIN?

FUCKS SAKE.

It’s bullshit enough trying to compile binutils without the operating fucking system pulling shit like this AS WELL.
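For what it’s worth, I think this one is the shell rather than the OS: bash caches the full path of every command it has already run, and installing a new binary earlier in PATH does not invalidate that cache (which would also explain the “can’t find file” after the rename). Assuming bash:

    # see what the shell has cached for make
    type make     # e.g. "make is hashed (/usr/bin/make)"
    # flush the command location cache; make is then looked up in PATH again
    hash -r
    make -v       # now the /usr/local/bin one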

===

More later.

I think I’ve hit the end of the road with 2.17. I *can’t* make it compile. I think it’s probably acting properly on the .m files; it’s just that one of those files is actually really broken – or anyway, broken in such a way that maybe if you could telepathically know the right config you could fix it, but that means all seven billion humans on planet Earth can’t fix it.

So I decided to have a look at version 2.17a – maybe it’s a bug fix?

Guess fucking what. When you untar it, it untars to “binutils-2.17”, not “binutils-2.17a”, thus over-writing your existing directory. FUCKING MORONS.

If this is what I can expect from binutils, then I am a fucking idiot for trying to make it compile.

===

More more later.

The earliest version I can make build is 2.20.

However, this has the same problem (at least with GCC 4.1.2, assuming GCC version is relevant) of emitting its binaries into a directory which is different to the directory GCC is looking in.

I’ll try a later GCC to see if it makes a difference.

===

So, GCC compiles and binutils compiles. I presume libgmp is compiling too, since it’s in there.

The problem now though is that the GCC docs are fucked. They won’t make.

Turns out the doc files with 4.4.3 (and prolly a bunch of older versions) won’t work with more modern versions of texinfo. Looks like there is a workaround.

Basic lesson here. If your code has a dependency on ANYTHING – A N Y T H I N G – then in a couple of years, your code will not build because of that dependency.

===

Well fuck me – the workaround does fuck all. I’m so surprised. The idea of the workaround is right though – explain to configure there’s no texinfo and not to make docs.
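For reference, the workaround is an override along these lines, so the build never calls the real makeinfo:

    # the commonly suggested workaround: substitute a makeinfo which always succeeds
    make MAKEINFO=true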

So I’ve uninstalled texinfo. That seems to have enabled me to move on to the next set of errors, which are that libmpfr is missing – which is a valid error, as I’ve not put that library into the GCC source tree.

I need to figure out which version to use though, and it’s bed time now – it was bed time 1.5 hours ago.

Matrix of release dates for GCC, binutils, glibc and numactl

I spent today figuring out release dates for GCC, binutils, glibc and numactl, going back to the release of GCC 4.1.2, so I can build the appropriate matching versions of these packages, to build liblfds.

This is the result!

http://liblfds.org/mediawiki/index.php?title=Matrix_of_GCC,_glibc,_binutils_and_numactl_release_dates#Introduction

More on building compilers

Turns out that GCC, when you build it, self-bootstraps, AND, if you put a binutils source tree in the root of the GCC source, it will build that too!

So that leaves libc. Looks like this might be coming together… …time for some chocolate!

Updatez’n’stuff

So, two lines of work are in progress.

First, setting up every version of GCC, so I can compile, test and benchmark against them all.

Second, Safe Memory Reclamation.

With regard to GCC, as I’ve been progressing with this work, I’ve been learning more about the GCC/compiler eco-system. Stuff I vaguely knew, but which I did not know in any detail, and which I am now learning in detail; there are in fact three pieces in this puzzle. There is GCC itself, then there is binutils (which provides the linker and assembler; GCC calls both when compiling C) and then there is libc. Each has its own – long – version history, with lots of releases. A canonical combination might be those versions which were available at the time a given GCC release was made – but this is by no means the only possible combination; I think there is a lot of scope for different versions to be used together.

The script I’m putting together to generate all the GCC versions currently has two steps: build a given release with the local compiler, and then build that release again with the newly built compiler itself – i.e. download 4.1.2, build it with the installed GCC (4.9.2), then use the 4.1.2 I now have (which was built with 4.9.2) to build 4.1.2 again – so I end up with a 4.1.2 which was compiled by 4.1.2.
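In outline, the script does something like this per version (paths, versions and configure flags illustrative; the real script has more to it):

    VER=4.1.2
    tar xf gcc-$VER.tar.bz2
    # stage one: build with the installed system GCC
    mkdir build-1 && cd build-1
    ../gcc-$VER/configure --prefix=$HOME/gcc/$VER-stage1
    make && make install
    cd ..
    # stage two: build the same release again, with the stage-one compiler
    mkdir build-2 && cd build-2
    CC=$HOME/gcc/$VER-stage1/bin/gcc ../gcc-$VER/configure --prefix=$HOME/gcc/$VER
    make && make install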

This means I’m using my local libc (2.19-18, compiled with GCC 4.8.4 on a Linux 3.16.7 system) and my local binutils (2.25).

I don’t actually myself care about libc, because liblfds itself does not use it; nor does libtest or libbenchmark (although they do use libpthread – huh, maybe a fourth piece in the puzzle). Only the test and benchmark veneers use libc, and they use it very lightly – printf and malloc, mainly. Now that I think of it, they also use libnuma.

I have something of a memory about not mixing libc versions with different compiler versions. I need to check on this. If it’s true, then I HAVE to recompile libc for every GCC I use. That would then leave the question of whether I can use binutils across GCC versions.

So, basically, pulling on a thread here. I care most about compilation problems, since they are fatal to users, and any user compiling with a given compiler version is likely to be on a coherent system and so will have the appropriate versions of the other libraries and tools. However, I do rather care about getting a sane final compile, so I can benchmark across compiler versions. Actually, I think I also care about libc versions for benchmarking, because I benchmark locking-based mechanisms too.

So, some Googling indicates that in theory you should only use a library with the compiler it was compiled with. This is because in theory each compiler has its own ABI. In practice the C ABI is very stable. Nevertheless, I actually want to compile all the different libpthread versions to see how their locking mechanisms vary in performance. (In fact, I’m using GCC 4.9 now with a libc compiled by 4.8.4, which looks wrong!)

It also looks like binutils has a relationship to both GCC and libc – certain versions are needed for certain versions.

So; GCC, libc, libnuma, libpthread, binutils.

If I only care about compilation errors, I only need GCC. I don’t even need to bootstrap, really – it’s not very likely there will be compilation problems which occur only if the compiler is compiled with itself.

If I care about benchmarking, then I need the lot.

(More Googling – pthreads is part of libc, apparently – the libc compile, I guess, producing more than one library.)

Moving on – second, Safe Memory Reclamation.

I’ve been trying to get my head around the core implementation of hazard pointers as presented by McKenney, here;

http://queue.acm.org/detail.cfm?id=2488549

The three core functions are provided, and they’re only a few lines of code each.

I’ve seen the white paper, of course, from Maged Michael. For me, it is utterly impenetrable. I understand – or think I understand – the concept of hazard pointers, and the concept is as simple and straightforward as the white paper is impenetrable. I don’t think much of the white paper.

The code itself – I didn’t get it. Probably part of the problem has been it’s out of context; when you’re dealing with memory barriers and so on, you often need to see what all the code is doing, to be able to follow the paths of execution, to see what gets flushed to memory when, and so on.

However, a day or so ago, I had an insight.

The code itself only issues memory barriers, so I couldn’t understand how it could work. When other threads came to check to see if the per-thread hazard pointers were pointing at something, how could they be guaranteed to have seen the values set by other threads?

Then it came to me – or at least, I think it came to me 🙂

The code is using the atomic operations performed *BY THE DATA STRUCTURE* to flush its writes to memory.

In my epoch-based SMR, on the entry to a read section, I’m issuing my own exchange to guarantee my status flag change is pushed to memory. In fact I need to find a way to use the atomic op performed by the data structure to do that job.

Oh – and there’s a third thing.

I read a post by Drepper, about static linking. He said – never do it. Always use shared objects – and he provided arguments which convinced me; indeed, there is one simple argument which is by itself enough. What happens if you have to issue a security fix? With a shared object, you replace the shared object. With static linking, you have to recompile all the dependent programmes.

With this thought in mind, it seems to me now that I am going about versioning in the wrong way. When I release a bugfix, I bump the API name prefix (7.1.0 to 7.1.1), i.e. I never in fact release fixes to any already-released library. That WAS the idea, in fact – but I’m going about it in the wrong way, because this approach forces recompilation of the dependent programmes.

buildtastic

Working on the build system.

Writing a script which downloads every release of GCC from the latest back to 4.1.2, builds the version in question and then uses that build to *build itself* again (i.e. so I have the given version of the compiler used to build the given version of the compiler, just as in the official releases).

I’m then going to build, test and benchmark against all versions, for every release.

I need to do the same for clang, and that’s more pressing, because the test app goes into a permanent loop with clang 3.8.0 in one of the elimination-layer freelist tests. It might be a compiler issue, or it might be an EA bug.

Using multiple compilers is a Bloody Good Thing.