Usage Guide (benchmark)

Revision as of 20:16, 17 February 2017 by Admin (talk | contribs) (1 revision imported)


The benchmark programme is a thin command-line veneer over libbenchmark, which compares lock-free data structures with their locking counterparts. It does so by implementing locking versions of the data structures in question and running the same benchmark on both the lock-free and the locking versions.

The benchmark code, through its abstraction layer, knows about the system processor and memory topology. It runs each benchmark over the meaningful combinations of logical cores, to illustrate the behaviour of the system under the various possible distributions of memory and processor load.

Finally, libbenchmark can emit gnuplots (tested with gnuplot 4.6).


The benchmark programme is run from the command line. Change directory into test_and_benchmark/benchmark/bin/ and type;

 benchmark -r

The full command line (which is printed by -h or by running with no arguments) is as follows;

benchmark -g [s] -h -l -m [n] -p -r -s [n] -t -v -x [n] -y [n]
  -g [s] : emit gnuplots, where [s] is an arbitrary string (in quotes if spaces) describing the system
  -h     : help (this text you're reading now)
  -l     : logarithmic gnuplot y-axis (normally linear)
  -m [n] : alloc [n] mb RAM for benchmarks, default is 64 (minimum 2 (two))
           (user specifies RAM as libbenchmark performs no allocs - rather it is handed a block of memory
            on NUMA systems, each node allocates an equal fraction of the total - benchmark knows about
            NUMA and does the right things, including NUMA and non-NUMA versions of the benchmarks)
  -p     : call gnuplot to emit PNGs (requires -g and gnuplot must be on the path)
  -r     : run (causes benchmarks to run; present so no args gives help)
  -s [n] : individual benchmark duration in integer seconds (min 1, duh)
  -t     : show CPU topology, uses -m (or its default) for amount of RAM to alloc
  -v     : build and version info
  -x [n] : gnuplot width in pixels (in case the computed values are no good)
  -y [n] : gnuplot height in pixels (in case the computed values are no good)


-g [s] : This flag causes gnuplots to be emitted once all benchmarks are complete. The argument is an arbitrary string intended to describe the system, e.g. "Core i5" or "Raspberry Pi 2 Model B"; it appears both in the gnuplot filename and in the gnuplot itself. This string is mandatory. If the string contains spaces, it must be in quotes.

-h : Show help.

-l : The gnuplots emitted by benchmark have a linear y-axis by default. However, there is usually such an enormous difference in the y-axis values, between one-core and many-core charts and between lock-free and locking benchmarks, that most charts end up with bars so short they are visually almost indistinguishable. To address this problem, this flag instructs benchmark to emit gnuplots with a logarithmic y-axis.

-m [n] : There are a large number of benchmarks. Mainly these benchmarks run one thread per logical core, but there are a few API benchmarks and a few single-threaded benchmarks. Each benchmark being run typically allocates a large number of data structure elements with which to perform the benchmark. The value supplied to the -m argument is the amount of memory to be used for element allocation, in megabytes. The code does not properly or consistently check for memory exhaustion; a safe absolute minimum is 4 MB, 64 MB is a reasonable minimum for benchmarking, and anything more than 1 GB gives no advantage.
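
The guidance above can be sketched as a pair of small shell helpers. This is only an illustration: clamp_mem_mb and per_node_mb are hypothetical names and not part of benchmark, 1 GB is assumed to mean 1024 MB, and the even per-node split uses integer division because the help text does not say how a remainder is handled.

```shell
# Sketch only: clamp_mem_mb and per_node_mb are illustrative helpers,
# not part of benchmark.

# Keep a requested -m value inside the range suggested in the text.
clamp_mem_mb() {
    requested=$1
    if [ "$requested" -lt 4 ]; then
        echo 4            # safe absolute minimum
    elif [ "$requested" -gt 1024 ]; then
        echo 1024         # assumes 1 GB = 1024 MB; more gives no advantage
    else
        echo "$requested"
    fi
}

# On NUMA systems each node allocates an equal fraction of the total.
# Integer division is an assumption; remainder handling is not documented.
per_node_mb() {
    total_mb=$1
    numa_nodes=$2
    echo $(( total_mb / numa_nodes ))
}
```

For example, the result can be passed straight to the flag with benchmark -r -m "$(clamp_mem_mb 2048)", and per_node_mb 64 2 gives the share each node would allocate on a two-node system with the default of 64.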

-p : A convenience argument, for use when generating gnuplots, which causes benchmark to call the gnuplot binary on the emitted gnuplot files to generate the gnuplot images. It requires -g, and gnuplot must be on the path.
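
A plotting run combines several of these flags, so a small wrapper may help. The following is a minimal sketch, not part of benchmark: run_with_plots and the BENCHMARK variable are hypothetical names, and it assumes that -r has to be given alongside -g, since gnuplots are emitted once the benchmarks are complete.

```shell
# Sketch only: BENCHMARK and run_with_plots are illustrative names,
# not part of benchmark.
BENCHMARK=${BENCHMARK:-./benchmark}

run_with_plots() {
    description=$1    # mandatory -g string, e.g. "Core i5"
    if [ -z "$description" ]; then
        echo "usage: run_with_plots <system description> [extra flags]" >&2
        return 1
    fi
    shift
    # -r runs the benchmarks, -g emits gnuplots, -p calls gnuplot for PNGs.
    # Quoting "$description" keeps a string with spaces as one argument.
    "$BENCHMARK" -r -g "$description" -p "$@"
}
```

Run from test_and_benchmark/benchmark/bin/, a call such as run_with_plots "Raspberry Pi 2 Model B" -l would request logarithmic charts; any further flags from the help text are passed through unchanged.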

-r : If run with no arguments, the help text is displayed. As such, an argument is needed to inform the benchmark programme that it is in fact to run benchmarks, which is why the "-r" argument exists.

-s [n] : Specifies how long benchmarks should run for. This is the time in seconds for each individual benchmark to run, not the total time for the entire set of benchmarks. The minimum value is one second. This value should be at least five seconds, the default, as there is auto-tuning behaviour in liblfds which needs a bit of time to find optimal values; very short benchmarks will be skewed by fail-safe initial settings inside liblfds not having had time to reach more sensible values.
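
Because -s is a per-benchmark duration, the whole run takes considerably longer than the value given. A rough lower bound can be sketched as below; min_total_seconds is a hypothetical helper, and the benchmark count is an input you would have to supply yourself, since the number of runs depends on the system topology and is not stated here. Setup and teardown time are ignored.

```shell
# Sketch only: min_total_seconds is an illustrative helper, not part of
# benchmark. The benchmark count is a hypothetical input.
min_total_seconds() {
    benchmark_count=$1    # how many individual benchmarks will run
    seconds_each=$2       # the value passed to -s
    echo $(( benchmark_count * seconds_each ))
}
```

For instance, if a system ended up running 40 individual benchmarks at the default of five seconds each, min_total_seconds 40 5 gives the minimum wall-clock time in seconds.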

-t : Display and only display system memory and processor topology, like so;

       R        R   = Notional system root     
       N        N   = NUMA node                
       S        S   = Socket (physical package)
      L3U       LnU = Level n unified cache    
   P       P    LnD = Level n data cache       
  L2U     L2U   LnI = Level n instruction cache
  L1I     L1I   P   = Physical core            
  L1D     L1D   nnn = Logical core             
003 001 002 000                                

-v : Shows the build version, like so;

benchmark 7.1.1 (release, user-mode, NUMA)
libbenchmark 7.1.1 (release, Linux, user-mode)
libshared 7.1.1 (release, Linux, user-mode)
liblfds 7.1.1 (release, Linux, user-mode, x64, GCC >= 4.7.3)

-x [n] : Overrides the computed width of the gnuplot, in pixels, for cases where the value computed by benchmark isn't any good.

-y [n] : Overrides the computed height of the gnuplot, in pixels, for cases where the value computed by benchmark isn't any good.

See Also