ARM64 dev board has arrived

PINE I think gave up on the one circulating up and down the east coast of the USA and sent me a new unit from China.

I have written Debian 8 with XFCE to an SD card, booted, and voila! I now have 64-bit Linux on an ARM64 dev platform.

Beautiful!

Organizing a company

A few days ago I wrote a post about organizing a dev team.

It was not properly written, for it does not properly reflect my views: in part by being too brief, and in part because I was in effect writing about a compromise (a dysfunctional one) which was in my head at the time, due to the situation at work.

I need then to write down properly what I actually think, otherwise it looks like I’m saying things I myself consider partially rubbish.

1. Software engineers must be physically removed from the office. They must work at home, or in an office which has *just them* in. No other developers, no one else from the company, no other people at all. Distraction is death to software engineering productivity, and an office is no different to being in a disco. This is an order of magnitude effect. Developers come into the office for one day each week, to maintain relationships. One day per week is the minimum contact rate for humans to maintain close relationships.

2. Contact with developers is either through email, real-time typing, voice, or group voice. Some conversations are best in email, others best in voice; meetings often need group voice. I have yet to see any need for people to be physically together, other than for maintaining relationships (and this will be with people who are not otherwise contacted during the week).

3. The company cannot organize itself into large teams (sales, IT, etc) – but rather forms teams, with members from each discipline. Motivation is key to productivity. When software engineers are separated from the users of their work, put into an IT team, they are removed from motivation. This is death to software engineering productivity. Moreover, in a larger sense, any work which is done without input from major components of the company (sales, for example) is fundamentally brain damaged. It’s like trying to be sentient without your right hemisphere.

4. A second profound negative consequence of putting IT into a group and then having it control its own work is that other teams no longer care about the work being done in IT. If a non-IT team controls their own developers (which is not the situation I’m talking about here, but to explain the problem) then if those developers are doing the wrong work, that non-IT team *cares*, because *they* are wasting *their* resource. They incur the opportunity cost. However, people or teams or whatever can only bear an opportunity cost *if they could have directed the effort involved toward something else*. When all the non-IT team can do is submit tickets and someone else decides if they are done, that non-IT team no longer bears any opportunity costs, *because they have no capability to redirect the effort expended to other things*. If something is done for them, well, that’s nice! And because they cannot bear an opportunity cost, they are fundamentally and profoundly less interested in the work being done and in ensuring it goes correctly. Bearing opportunity costs is VITAL to productivity, because it is this and this alone which makes people direct effort towards the tasks which actually matter.

5. The closer someone is to something, the better they know it. The higher up the (management) hierarchy you go, the less they know what’s actually needed and the more oblivious they are to the unintended consequences (let alone the intended consequences!) of their actions. Small cross-discipline teams are close to the task and motivated; they are the smallest team necessary which possesses all the skills necessary to move from nothing to completion. Management as such no longer exists; rather, staff members with knowledge of the larger business environment are members on these teams. The IT work done also includes maintaining systems after they are created.

6. All members of a given discipline should be in event-driven communication with each other. Not regular meetings – i.e. polling – which are inefficient and needless. People know when they need something from someone else, and people know when something has happened that they need to tell other people about. *Regular* meetings are senseless – they have a profound impact on productivity (write off that half of the day) and they are not needed.

SimpleCGI

Yay!

I now have a simple CGI server. It takes CGI-style arguments and emits a page back to the browser, working with nginx.
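
For reference, the SimpleCGI (SCGI) request nginx sends is just a netstring: a decimal length, a colon, NUL-separated key/value header pairs (the first being CONTENT_LENGTH), then a comma. A minimal parser sketch (the function name is mine, not from my actual server):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse an SCGI request header: "<len>:<KEY>\0<VALUE>\0...<KEY>\0<VALUE>\0,"
   Prints each key/value pair; returns 0 on success, -1 on malformed input. */
int scgi_parse_headers( const char *request, size_t request_len )
{
  char *end;
  unsigned long body_len = strtoul( request, &end, 10 );

  if( *end != ':' )
    return -1;

  const char *body = end + 1;

  /* the netstring must be fully present and terminated by a comma */
  if( (size_t)(body - request) + body_len + 1 > request_len || body[body_len] != ',' )
    return -1;

  const char *p = body, *limit = body + body_len;

  while( p < limit )
  {
    const char *key = p;
    p += strlen( p ) + 1;        /* skip past KEY and its NUL */
    if( p >= limit )
      return -1;                 /* key with no value */
    const char *value = p;
    p += strlen( p ) + 1;        /* skip past VALUE and its NUL */
    printf( "%s = %s\n", key, value );
  }

  return 0;
}
```

On the nginx side, hooking such a server up is just `scgi_pass 127.0.0.1:4000;` plus `include scgi_params;` in the location block (port is illustrative).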

Now I need to lash up the database stuff to actually run an announcement list.

It’s not going to be a fully-fledged mailing list (at least not to start with), as I’d need to mess about with the SMTP/POP servers and be able to read mail, and that’s like having your brains slowly pulverized with a million tiny hammers. Rather, it’ll be an announcement list (just a descriptive phrase I’m using here), where you can receive email from liblfds (bug reports, new releases) and where I can when necessary send out the odd functionality/design question – all replies will come to the liblfds admin email addy, rather than going to everyone on the list.

Agile considered harmful

Where I currently work has been without a product manager for about four months.

We’re a small group of devs (five of us) and we have a lightweight “agile” process – really, it means a meeting Monday morning, a meeting Friday, and JIRA, which is convenient for organizing tickets.

I’ve seen a lot of companies, and I’ve seen them do a lot of things, and I’ve generally not minded or much cared, so long as it didn’t get in the way of me getting work done.

In fact the elephant in the room for software engineers is the simple fact we’re all forced to work in offices and that this reduces productivity by a factor of ten. No one speaks about it, because we all know managers require everyone to be in the office.

The problem is the people who run companies and organize software groups are never or almost never software engineers, and so they have absolutely no idea about the requirements for software engineering productivity; in fact, I would say, in all or almost all cases, the matter simply doesn’t come to mind. It’s not thought about. It’s simply – we have an office, everyone is in the office.

So, we’re all in an office, and then we have all this fashion about how to organize the software engineering effort, *as if this was a major factor in engineering productivity*. The problem is all or almost all companies are in fact gung-ho about organizing for productivity, but not about *actual productivity* – and software engineers working in an office have bugger all productivity. I do in one six-hour session at home what takes a week in the office.

The loss to companies is catastrophic. Software engineering is regarded as an incredibly slow process – in fact it is not, *at all*, but the problem is that software engineers are basically being asked to work while in a disco.

So – this is how I would arrange a software engineering group, although here I only talk about the internal organization of the group itself. I’m not talking here about integration with other groups, which is vital.

1. Everyone works at home. Engineers NEED to be PHYSICALLY REMOVED from the office, so other people cannot disrupt them. If home is not peaceful, they get an office of their own, provided by the company, AWAY from the main office.

2. Communication happens by email, real-time chat, or VOIP (and this specifically must include group VOIP). Some conversations are best by email, but some MUST happen verbally (too subtle or complex for typing) and group chat is necessary for meetings.

3. A product manager exists, who keeps track of all work requests and of who’s currently working on what. When an engineer finishes a task, the PM and the engineer select the next task. None of this “plan for two weeks ahead” idiocy. Why do it? When my OS switches tasks, it picks the next best job out of all available – it doesn’t pick a dozen so it’s committed for the next ten seconds.

And this will never happen, because managers want people in the office. They do not know enough about software engineering productivity to care about it, and as such they are in fact not qualified to run a software engineering group.

Versioning

I’ve been properly re-reading Drepper’s white paper about DSOs.

I’m working on the simple CGI server for the mailing list (crazy – I’m sitting here writing a server around epoll so I can run a CGI script) and for that I need another data structure library I threw together a year or three ago (never published), libstds, which is just a collection of single-threaded data structures.

So I need to tart this up a bit now – bring it to the same presentation as liblfds. Thing is, Drepper’s paper has raised the question of how I’m going about versioning.

Drepper makes a pivotal point: the reason DSOs are used is that if we distribute security or bug fixes, we update the entire system by replacing a DSO; but if we had statically linked, we would have to recompile every binary using the library in question.

(BTW, I can’t use DSO internal versioning at all, because it’s Solaris and Linux only).

So this means that the APIs must remain constant between versions, so the new versions work as drop-in replacements for the old versions.

This has one problem: it precludes the concurrent use of multiple versions of the library.

I want this for two reasons. Firstly, when new versions of the library are released *not* for security or bug fixes, but for new functionality, then if users adopt them, they need to revalidate their code. I would rather existing code was completely unchanged – i.e. the *binary of the DSO is unchanged* – as this is the only way that effort is not required. Secondly, I want to be able to benchmark multiple versions of the library in the benchmark programme (so users can see how performance changes over time).

Concurrent use of multiple versions requires that APIs differ by version. This means the DSOs cannot be used as drop-in replacements.

Right now, where the API changes on every release (as it contains the version number), we get some of the code-sharing benefit of DSOs (everyone using the same version re-uses that DSO in memory, as opposed to there being just the one DSO which everyone uses), but we do not get the linking benefits of DSOs (you *do* need to relink to use a new version, rather than simply replacing the DSO). We note though that this last behaviour is something explicitly eschewed, as it requires revalidation.
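
The version-in-the-API scheme itself is simple to sketch: embed the version number in every public symbol, so two versions of the library can be linked into one binary without collision. A macro sketch of the idea (the names here are illustrative, not the real liblfds or libstds API):

```c
#include <assert.h>

/* Paste the library version into every public symbol name. The extra
   level of indirection (STDS_PASTE -> STDS_PASTE2) is needed so that
   STDS_VERSION is expanded to 100 before token pasting occurs. */
#define STDS_PASTE2( a, b )  a##b
#define STDS_PASTE( a, b )   STDS_PASTE2( a, b )
#define STDS_VERSION         100
#define STDS_API( name )     STDS_PASTE( STDS_PASTE( stds_, STDS_VERSION ), STDS_PASTE( _, name ) )

/* expands to: int stds_100_stack_push( int value ) */
int STDS_API( stack_push )( int value )
{
  return value;  /* placeholder body for the sketch */
}
```

A caller linking against two such DSOs would then call stds_100_stack_push and stds_110_stack_push explicitly, which is exactly what the benchmark programme needs.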

At a pinch, if it were needed, I could handle multiple versions concurrently in the benchmark by code manipulation; after all, the benchmark code contains a copy of each of the earlier liblfds libraries. I can modify that local copy.

It certainly is the case that users normally expect a stable API, with the DSO changing behind the scenes. The fantasy here of course is that the authors of the DSO introduce no new bugs, unexpected behaviour, etc, such that revalidation is not required.

What we see in fact is that software being software, i.e. extremely complex and so error-prone, the point of maximum validity for an application is the point where all of the dependencies (DSOs, etc) are at the versions used at the time the test suite was run (and even this of course is only true for that OS version, those hardware revisions, etc, etc).

Of course at that point how valid the application is depends then on the quality of the test suite.

In all things, there are factors which encourage, and factors which discourage, and in the end you get what you get.

So we see in the actions we can take we move from the point of maximum validity (the systems the test suites have been run on – system here in all its glory, DSO versions, hardware, OS, etc) to points further away from maximum validity, where we can see benefits in other domains for such moves (being able to easily distribute security and bug fixes by DSO updates).

Update – we’re on a road to nowhere, come on inside…

Not much happening right now.

I ordered a PINE64 back in August as a 64-bit ARM platform. Instead of being posted to Germany, it was posted to the USA, where it’s gone in postal circles since then. I first emailed PINE about this about a month ago – they are now finally sorting it out.

It doesn’t matter much now in fact, because GCC 4.6.0 introduced a 128-bit type, so on that version or higher I can use the GCC atomic intrinsics and get built-in support for ARM64 double-word CAS. Still, it’ll be great to have another dev platform to actually test on.
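
The reason the 128-bit type matters: double-word CAS operates on a pointer packed together with an ABA counter into one 16-byte value. A sketch of the packing (helper names are my own; the CAS intrinsic itself appears only in a comment, since actually linking it is platform- and flag-dependent):

```c
#include <assert.h>
#include <stdint.h>

/* A pointer and an ABA-avoidance counter packed into one 128-bit value,
   the shape a double-word CAS operates on. unsigned __int128 is a GCC
   extension, available from GCC 4.6.0 onward on 64-bit targets. */
typedef unsigned __int128 dwcas_t;

static dwcas_t pack( void *pointer, uint64_t counter )
{
  return ( (dwcas_t) counter << 64 ) | (uintptr_t) pointer;
}

static void *unpack_pointer( dwcas_t dw )
{
  return (void *) (uintptr_t) dw;   /* low 64 bits */
}

static uint64_t unpack_counter( dwcas_t dw )
{
  return (uint64_t) ( dw >> 64 );   /* high 64 bits */
}

/* The real operation would then be, on a 16-byte aligned target:
   __atomic_compare_exchange_n( &target, &expected, desired, 0,
                                __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE );
*/
```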

I’ve been trying to get a mailing list going, A G A I N. Fourth or fifth try. Failed again. The problem is that almost all mailing list software uses CGI, which is not supported by nginx. Apache is a catastrophe – bloated and, it turned out, very broken out of the box (the default is IPv6 only and to deny all files); lighttpd is out of the question because it’s a thumping nightmare to configure. That left dadamail as a possibility, since it can use FastCGI – and this ALMOST worked – but then it turns out dada uses weird “cgi-name in the path with more path as arguments” URLs, and I could not face trying to configure that in the HTTP server, when it could just as well have used normal CGI arguments, which would have worked exactly as well.

So I’m still stuck on this. I’m totally repelled by all the existing mailing list software I’ve seen (except for dadamail, which had a viable install path, but – as above – then takes a wrong turn) and for my simple use I can roll my own much more easily than trying to install an existing mailing list. The problem is I can’t easily get the HTTP server to operate the mailing list. I looked at the FastCGI spec, thinking to write one, and it’s a monstrosity. I’ve never seen something so simple be made so complicated, MY GOD.

I’ve more or less come to a halt with building all GCCs starting with 4.1.2, and matching glibcs, and using them for compiling and building. I spent five weeks on that, and in the end, I’m doing something wrong, because glibc fails to build with my compiler, and I am at a dead end. I can’t see what’s wrong – the problem is that a call to iconv segfaults. I’d still like to get this working, but… after five weeks, and now being at a complete dead end with no clues, it’s not obvious what to do.

Once the mailing list is sorted out I can then get on with fixing the SMR design.

I’ve also now been closely reading Drepper’s white paper about DSOs. I have a lot of changes to make because of that.

Once SMR is up and the Drepper changes are done, I then need to figure out what’s happening with the performance of the unbounded/single/single queue and the freelist elimination layer, implement and test the SMR-based freelist, stack and unbounded/many/many queue, and modify the ringbuffer to use the improvement in the queue with regard to queue element reuse (which will mean the ringbuffer elements retain physical locality between the ringbuffer element and its queue element); then I can either release, or implement the full (not add-only) singly-linked list.

Complexity is oblivion, because you can’t use complex software

Well, I spent all of yesterday trying (again) to install (another) mailing list, and was unable to do so.

Today, I spent the whole day trying to run a server-side script from a web-server, i.e. a CGI script. Likewise, this has proved impractical.

The problem in the latter case reminds me of how HTML and CSS have evolved. In the early days, they were simple and thereby accessible. As time passed, HTML became increasingly complex, and then CSS came out – and the CSS spec is impenetrable unless you’re a computer scientist.

Similarly, in the early days, web-servers were simple. Anyone could easily and rapidly set them up and get stuff going. Today – it’s hopeless. Apache is a pig’s breakfast to configure – I tried it today. It listens by default on IPv6 only and denies all files by default, and the installer is broken and is now in a state where I *cannot* install it again. Lighttpd I abandoned years ago because configuration is a nightmare. Nginx is okay to configure – I had it up and running in sixty seconds – but there’s no CGI support, only FastCGI and SimpleCGI. Python itself does not support FastCGI, only WSGI. What I read is that to get FastCGI running, I need a FastCGI server which talks WSGI to Python.

Seriously, I REALLY, *REALLY* DON’T FUCKING NEED THAT.

What I need is to be able to *call a script from the web-server*.

Fuck me, huh? radical!

Without a fucking metric ton of configuration, because configuration on open source projects is *the kiss of death*. If you have to do any significant config, forget it. Give up before you start – save the time you’d waste discovering it’s fucked.

And I mean, that’s just to run *FastCGI*, which is already work for me, since my current script is CGI and I’ll need to make it FastCGI.

So these servers are now effectively non-existent, because they’re inaccessible. I can’t use them to do work.

I was reading the FastCGI spec and began to write a C server of my own which would just call system() to run the script I have, but oh Jesus, the FastCGI spec is written by fucking aliens. It’s the most incomprehensible garbage I’ve seen since I looked at the Git docs. It’s also bad C – they provide struct definitions which are invalid – they’re actually pseudocode, but you’d need to know it to know it.
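
For what it’s worth, the record header is one of the spec’s recoverable parts; rewritten as actually-compilable C, with helpers for the big-endian 16-bit fields the spec splits into B1/B0 byte pairs (field names lower-cased from the spec’s):

```c
#include <assert.h>

/* FastCGI record header, per the spec: eight bytes, with the 16-bit
   request id and content length carried big-endian as two byte fields. */
typedef struct fcgi_header
{
  unsigned char version;            /* FCGI_VERSION_1 == 1 */
  unsigned char type;               /* e.g. FCGI_BEGIN_REQUEST == 1, FCGI_STDIN == 5 */
  unsigned char request_id_b1;
  unsigned char request_id_b0;
  unsigned char content_length_b1;
  unsigned char content_length_b0;
  unsigned char padding_length;
  unsigned char reserved;
} fcgi_header;

static unsigned int fcgi_request_id( const fcgi_header *h )
{
  return ( (unsigned int) h->request_id_b1 << 8 ) | h->request_id_b0;
}

static unsigned int fcgi_content_length( const fcgi_header *h )
{
  return ( (unsigned int) h->content_length_b1 << 8 ) | h->content_length_b0;
}
```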

So, I’ve given up. I’m going to run the script I have manually, when I need to run it.

All those guys – Apache, Nginx, Lighttpd – the *decades* of work which have gone into those projects are totally and utterly useless for me, because I can’t use what they’ve done to do work. I’m running scripts manually instead. Complexity is death, because it makes software – especially open source, with its profound quality control, documentation and configuration issues – unusable, which is the same as non-existent.

I never want to see –

“Ensure your password is no longer than [small number] of characters”.

Where “small number” for Dadamail is sixteen.

My normal default passphrase is 25 characters.