
Category: testing

For many scenarios it’s important to know how a database performs, especially these days, when the number of databases seems to grow by the day and a choice is hard to make.

To demon­strate how sones GraphDB per­forms at given use-cases we cre­ated a bench­mark frame­work and tool which basi­cally divides bench­mark­ing into two steps:

  1. Gen­er­ate and/or Import use-case spe­cific data and mea­sure the performance

  2. Exe­cute use-case spe­cific algo­rithms on the graph and mea­sure the performance

Because there are many different use cases, both steps are implemented as plug-ins, which can be addressed from the command line integrated into the benchmark tool.
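The two-step structure described above can be sketched roughly as follows. This is only an illustration in Python: the registries, decorator names and the `chain` / `count_edges` plug-ins are hypothetical and are not taken from the sones benchmark tool, whose real plug-in interface may look quite different.

```python
import time
from typing import Callable, Dict

# Hypothetical plug-in registries (assumption: not the real sones interface).
IMPORT_PLUGINS: Dict[str, Callable[[dict], None]] = {}
BENCHMARK_PLUGINS: Dict[str, Callable[[dict], int]] = {}


def import_plugin(name):
    """Register a function as a step 1 (generate/import) plug-in."""
    def register(func):
        IMPORT_PLUGINS[name] = func
        return func
    return register


def benchmark_plugin(name):
    """Register a function as a step 2 (algorithm) plug-in."""
    def register(func):
        BENCHMARK_PLUGINS[name] = func
        return func
    return register


@import_plugin("chain")
def import_chain(graph, size=1000):
    # Toy data set: node i points to node i + 1.
    for node in range(size):
        graph[node] = [node + 1] if node + 1 < size else []


@benchmark_plugin("count_edges")
def count_edges(graph):
    # Toy algorithm: count all edges in the adjacency lists.
    return sum(len(targets) for targets in graph.values())


def timed(func, *args):
    """Run func and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def run(import_name, benchmark_name):
    """Execute both steps and measure each one separately."""
    graph = {}
    _, import_seconds = timed(IMPORT_PLUGINS[import_name], graph)
    result, benchmark_seconds = timed(BENCHMARK_PLUGINS[benchmark_name], graph)
    return {
        "import_seconds": import_seconds,
        "benchmark_seconds": benchmark_seconds,
        "result": result,
    }


if __name__ == "__main__":
    print(run("chain", "count_edges"))
```

Selecting plug-ins by name mirrors the idea of addressing them from a command line; a real tool would parse those names from its arguments.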

The framework, tool and plug-ins are released as open source software under the AGPLv3 license and can be downloaded here.

We distribute the source code mainly because it’s the best way for you to reproduce the results and look at what is actually being tested. The other main reason is that we want everybody to be able to benchmark and test their own algorithms on GraphDB.


Source 1: https://github.com/sones/benchmark
Source 2: http://developers.sones.de/wiki/doku.php?id=benchmarks

9. September 2010
benchmarking the sones GraphDB
Cat: GraphDB sones testing  Tags:

Since we’re at it: we not only put the new Mono garbage collector through its paces regarding linear scaling, we also made some interesting measurements of query performance on the two .NET platform alternatives.

The same data was used as in the last article about the Mono GC. It’s basically a set of 200,000 nodes, each of which holds between 15 and 25 edges to instances of another node type. One INSERT operation means that the starting node, all its edges and the connected nodes are inserted at once.
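To make the shape of this data set concrete, here is a minimal sketch of a generator for such insert operations. It is an assumption-laden illustration: the dictionary layout, the node naming and the choice to create fresh connected nodes for every insert are hypothetical, and no real GraphDB query syntax is shown.

```python
import random


def generate_inserts(node_count, min_edges=15, max_edges=25, seed=42):
    """Yield one 'complex insert' per starting node.

    Each operation carries the starting node plus all connected nodes,
    so that node, edges and targets can be inserted at once.
    """
    rng = random.Random(seed)
    for node_id in range(node_count):
        edge_count = rng.randint(min_edges, max_edges)  # inclusive bounds
        yield {
            "start": f"node-{node_id}",
            # Assumption: connected nodes are of a second type and are
            # created together with the starting node.
            "targets": [f"target-{node_id}-{i}" for i in range(edge_count)],
        }


if __name__ == "__main__":
    operations = list(generate_inserts(1000))
    edges = sum(len(op["targets"]) for op in operations)
    print(len(operations), "inserts,", edges, "edges")
```

Fixing the seed keeps the generated data reproducible between runs, which matters when two platforms are supposed to import identical data.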

We did not use any bulk load­ing opti­miza­tions – we just fed the sones GraphDB with the INSERT queries. We tested on two plat­forms – on Win­dows x64 we used the Microsoft .NET Frame­work and on Linux x64 we used a cur­rent Mono 2.7 build which soon will be replaced by the 2.8 release.

After the import was done we started the benchmarking runs. Every run was given a specified time to complete its job, and the number of queries executed within this time window was logged. Each run used 10 clients querying simultaneously, and each client executed randomly generated queries with pre-specified complexity.
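A run of this kind can be sketched as below. This is a simplified stand-in rather than the actual benchmark code: `execute_query` is a hypothetical placeholder that burns a little CPU instead of talking to a database, and threads are just one possible way to model simultaneous clients.

```python
import random
import threading
import time


def execute_query(complexity, rng):
    # Hypothetical placeholder for sending one randomly generated query.
    return sum(rng.random() for _ in range(complexity))


def client(deadline, complexity, seed, counts, index):
    """Execute queries until the deadline and record how many completed."""
    rng = random.Random(seed)
    done = 0
    while time.perf_counter() < deadline:
        execute_query(complexity, rng)
        done += 1
    counts[index] = done


def run_benchmark(duration_seconds, clients=10, complexity=100):
    """Return (total queries, queries per second) for one timed run."""
    counts = [0] * clients
    deadline = time.perf_counter() + duration_seconds
    threads = [
        threading.Thread(target=client,
                         args=(deadline, complexity, i, counts, i))
        for i in range(clients)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    total = sum(counts)
    return total, total / duration_seconds


if __name__ == "__main__":
    total, qps = run_benchmark(duration_seconds=2, clients=10)
    print(f"{total} queries, {qps:.0f} queries per second")
```

Each client writes only to its own slot in `counts`, so no locking is needed to add up the totals after all threads have finished.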

The Import

Not surprisingly, both platforms are almost head-to-head in average import times. While Mono starts considerably faster than .NET, the .NET platform is faster at the end with a larger dataset. We also measured the RAM consumption on each platform: Mono takes 17 KB per complex insert operation on average, while the Microsoft .NET Framework seems to take only 11 KB.

The Bench­mark

Let the charts speak for them­selves first:

[Figure: mononet]

[Figure: benchmark-mono-sgen]

[Figure: benchmark-dotnet]

As you can see, on both platforms the sones GraphDB is able to work through more than 2,000 queries per second on average. For the longest-running benchmark (1,800 seconds) with all the data imported, .NET allows us to answer 2,339 queries per second while Mono allows us to answer 1,980 queries per second.
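To put the two figures for the longest run side by side, here is a small back-of-the-envelope calculation. It assumes that the quoted rates are averages over the full 1,800 seconds, which the text suggests but does not state explicitly; the function names are only illustrative.

```python
RUN_SECONDS = 1800
QUERIES_PER_SECOND = {".NET": 2339, "Mono": 1980}


def total_queries(qps, seconds):
    """Total queries answered if qps is the average over the whole run."""
    return qps * seconds


def relative_advantage(faster, slower):
    """Fraction by which the faster rate exceeds the slower one."""
    return (faster - slower) / slower


if __name__ == "__main__":
    for platform, qps in QUERIES_PER_SECOND.items():
        print(platform, total_queries(qps, RUN_SECONDS))
    advantage = relative_advantage(QUERIES_PER_SECOND[".NET"],
                                   QUERIES_PER_SECOND["Mono"])
    print(f".NET answers {advantage:.1%} more queries per second")
```

Under that assumption the run corresponds to roughly 4.2 million queries on .NET and 3.6 million on Mono, a difference of about 18 percent.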

The Con­clu­sion

With the new gen­er­a­tional garbage col­lec­tor Mono surely made a great leap for­ward. It’s impres­sive to see the progress the Mono team was able to make in the last months regard­ing per­for­mance and mem­ory con­sump­tion. We’re already con­sid­er­ing Mono an impor­tant part of our plat­form strat­egy – this new garbage col­lec­tor and bench­mark results are show­ing us that it’s the right thing to do!

UPDATE: There was a mishap in the “import objects per sec­ond” row of the above table.

“Mono is a soft­ware plat­form designed to allow devel­op­ers to eas­ily cre­ate cross plat­form appli­ca­tions. It is an open source imple­men­ta­tion of Microsoft’s .Net Frame­work based on the ECMA stan­dards for C# and the Com­mon Lan­guage Run­time. We feel that by embrac­ing a suc­cess­ful, stan­dard­ized soft­ware plat­form, we can lower the bar­ri­ers to pro­duc­ing great appli­ca­tions for Linux.” (Source)

In other words: Mono is the platform needed to run the sones GraphDB on any operating system other than Windows. It includes the so-called “Mono Runtime”, which is basically the place where the sones GraphDB “lives” to do its work.

Being a runtime is not an easy task. In fact, its abilities and algorithms have a deep impact on the performance of the application that runs on top of it. When it comes to all things related to memory management, the garbage collector is one of the most important parts of the runtime:

“In com­puter sci­ence, garbage col­lec­tion (GC) is a form of auto­matic mem­ory man­age­ment. It is a spe­cial case of resource man­age­ment, in which the lim­ited resource being man­aged is mem­ory. The garbage col­lec­tor, or just col­lec­tor, attempts to reclaim garbage, or mem­ory occu­pied by objects that are no longer in use by the pro­gram. Garbage col­lec­tion was invented by John McCarthy around 1959 to solve prob­lems in Lisp.” (Source)

The Mono run­time has always used a sim­ple garbage col­lec­tor imple­men­ta­tion called “Boehm-Demers-Weiser con­ser­v­a­tive garbage col­lec­tor”. This imple­men­ta­tion is mainly known for its sim­plic­ity. But as more and more data inten­sive appli­ca­tions, like the sones GraphDB, started to appear this type of garbage col­lec­tor wasn’t quite up to the job.

So the Mono team started development of a Simple Generational Garbage Collector whose properties are:

  • Two gen­er­a­tions.
  • Mostly pre­cise scan­ning (stacks and reg­is­ters are scanned conservatively).
  • Copy­ing minor collector.
  • Two major col­lec­tors: Copy­ing and Mark&Sweep.
  • Per-thread frag­ments for fast per-thread allocation.
  • Uses write bar­ri­ers to min­i­mize the work done on minor collections.
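The interplay of two generations, a minor collection and a write barrier from the list above can be illustrated with a toy model. This is a deliberately simplified sketch and not how Mono's collector is implemented: the class and method names are hypothetical, "copying" is modelled as moving objects between two Python sets, and conservative scanning, major collections and per-thread allocation are left out.

```python
class Obj:
    """A heap object that can reference other objects."""
    def __init__(self, name):
        self.name = name
        self.refs = []


class ToyHeap:
    def __init__(self):
        self.nursery = set()     # young generation
        self.old = set()         # old generation
        self.roots = []          # stand-in for stacks and globals
        self.remembered = set()  # old objects that point into the nursery

    def allocate(self, name):
        obj = Obj(name)
        self.nursery.add(obj)
        return obj

    def write_ref(self, source, target):
        """Store a reference; the write barrier records old -> young stores."""
        source.refs.append(target)
        if source in self.old and target in self.nursery:
            self.remembered.add(source)

    def minor_collect(self):
        """Promote reachable nursery objects and drop the rest.

        Only roots and remembered old objects are used as starting
        points, so the old generation is never traversed as a whole.
        """
        pending = [obj for obj in self.roots if obj in self.nursery]
        for old_obj in self.remembered:
            pending.extend(r for r in old_obj.refs if r in self.nursery)
        survivors = set()
        while pending:
            obj = pending.pop()
            if obj in survivors:
                continue
            survivors.add(obj)
            pending.extend(r for r in obj.refs if r in self.nursery)
        freed = len(self.nursery) - len(survivors)
        self.old |= survivors    # "copy" survivors to the old generation
        self.nursery = set()
        self.remembered.clear()  # no old -> young references remain
        return freed


if __name__ == "__main__":
    heap = ToyHeap()
    a = heap.allocate("a")
    heap.roots.append(a)
    b = heap.allocate("b")
    heap.write_ref(a, b)
    heap.allocate("garbage")
    print("freed in minor collection:", heap.minor_collect())
```

The point of the remembered set is visible in `minor_collect`: without the write barrier, finding young objects that are referenced only from old ones would require scanning the entire old generation on every minor collection.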

To fully under­stand what this new garbage col­lec­tor does you most prob­a­bly need to read this and take a look inside the mono s-gen garbage col­lec­tor code.

So we took the old and the new garbage collector and our GraphDB and ran them through an automated test which basically runs 200,000 insert queries, resulting in more than 3.4 million edges between more than 120,000 objects. The results were impressive when we compared the old Mono garbage collector to the new mono-sgen garbage collector.

When we plotted a basic graph of the measurements we got this:

[Figure: monovsmono-sgen]

The x-axis shows the number of inserts and the y-axis shows the time it takes to answer one query. This makes it a good way to see how big the impact of the garbage collector actually is on a complex application like the sones GraphDB.
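Data points for a chart like this (number of inserts against time per query) could be collected along the following lines. This is a minimal sketch under assumptions: `insert` is a hypothetical callable standing in for one insert query, and averaging over fixed-size batches is an illustrative choice, not necessarily how the original measurements were taken.

```python
import time


def measure_insert_latency(insert, operations, batch_size=1000):
    """Return (inserts so far, average seconds per insert) per batch.

    A trailing partial batch is ignored to keep the samples comparable.
    """
    samples = []
    batch_start = time.perf_counter()
    for count, operation in enumerate(operations, start=1):
        insert(operation)
        if count % batch_size == 0:
            now = time.perf_counter()
            samples.append((count, (now - batch_start) / batch_size))
            batch_start = now
    return samples


if __name__ == "__main__":
    store = []
    for count, seconds in measure_insert_latency(store.append, range(10000)):
        print(count, f"{seconds * 1e6:.2f} microseconds per insert")
```

Averaging per batch smooths out timer noise; a smaller `batch_size` makes individual pauses, such as garbage collections, more visible as spikes.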

The red curve is the old Boehm-Demers-Weiser conservative garbage collector built into current stable versions of Mono. The blue curve is the new SGEN garbage collector, which can be used by invoking Mono using the “mono-sgen” command instead of the “mono” command. Since mono-sgen is not included in any stable build yet, it’s necessary to build Mono from source. We documented how to do that here.

So what are we actually seeing in the chart? We can see that mono-sgen draws a fairly linear line in comparison to the old Mono garbage collector. It’s easy to tell why the blue curve is rising: the number of objects is growing with each millisecond. The blue line is just what we expect from a hard-working garbage collector. To our surprise, the old garbage collector seems to have problems coping with the number of objects over time. It spikes several times, and in the end it gets even worse, spiking all over the place. That’s what we don’t want to see happening anywhere.

The conclusion is that if you are running something on Mono that does more than print “Hello World”, you surely want to take a look at the new mono-sgen garbage collector. If you’re planning to run the sones GraphDB on Mono, we highly recommend using mono-sgen.

For easier access to any sones GraphDB instance you can use our new web-based shell. It is based on well-known technologies and libraries like HTML, JavaScript, jQuery and our REST API. You can choose between a text-based output format, as shown in the following screenshot, an XML output format or a JSON output format.

[Figure: WebShell-01-small]

The WebShell is open source software, licensed under the New BSD License.

To get access to your own personal test instance, go to the sones homepage and register a new account. Once you’re logged in, you’re only one click away from accessing the sones GraphDB WebShell. You can log in using the given URL, username and password.