
Category: benchmark

Today we added a plug-in called “FastImport” to version 2.1 of the GraphDB community edition. It is a bulk-import plug-in that takes a proprietary XML format as input and imports vertices and edges into a running GraphDB instance.

To use the new plug-in, you need to know that an import is a two-stage process:

  1. schema setup
  2. fast-import

First, define the vertex and edge types that step 2 will import. You normally do this with GraphQL by specifying one or more vertex types. For demonstration purposes, we use a small social network with only one vertex type:

CREATE VERTEX TYPE User ATTRIBUTES (String Name, Int64 Age, Set<User> Friends)

With the schema set up, the only thing left is to call the import plug-in with another short GraphQL query:

IMPORT FROM 'file:\\100k_import.xml' FORMAT FastImport

This takes, for example, the 100,000-user dataset and imports it into the current GraphDB instance. We have already run this for you, so here are the results comparing a GraphQL import with FastImport on both the persistent and the in-memory version of GraphDB:

[Figure: in-Memory_import]

[Figure: persistent_import]

You can also download the datasets used in this small benchmark here:

10,000 users / 592,374 edges: GraphQL Import, FastImport

100,000 users / 5,944,332 edges: GraphQL Import, FastImport
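The two-stage import described above can be scripted so that the schema is always created before the bulk import runs. The following is a minimal Python sketch and not part of GraphDB: `run_query` is a hypothetical callable standing in for whatever client you use to send GraphQL statements to your instance, and `import_statement` and `two_stage_import` are illustrative names. Only the query text itself is taken from the examples above.

```python
from typing import Callable, List

# Schema statement copied from the example above.
SCHEMA_QUERY = (
    "CREATE VERTEX TYPE User "
    "ATTRIBUTES (String Name, Int64 Age, Set<User> Friends)"
)


def import_statement(location: str, fmt: str = "FastImport") -> str:
    """Build the IMPORT statement for a given file location."""
    return f"IMPORT FROM '{location}' FORMAT {fmt}"


def two_stage_import(run_query: Callable[[str], object], location: str) -> List[str]:
    """Run schema setup first, then the bulk import; return the statements sent."""
    statements = [SCHEMA_QUERY, import_statement(location)]
    for statement in statements:
        run_query(statement)  # hypothetical client call, supplied by the caller
    return statements


if __name__ == "__main__":
    # Dry run: print the statements instead of sending them anywhere.
    two_stage_import(print, r"file:\\100k_import.xml")
```

Passing `print` as `run_query` gives a dry run that only shows the statements in the order they would be sent.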

For many scenarios it’s important to know how a database performs, especially these days, when the number of databases seems to grow by the day and choosing one is hard.

To demonstrate how sones GraphDB performs for given use cases, we created a benchmark framework and tool that divides benchmarking into two steps:

  1. Generate and/or import use-case-specific data and measure the performance

  2. Execute use-case-specific algorithms on the graph and measure the performance

Because there are many different use cases, both steps are implemented as plug-ins that can be addressed from the command line integrated into the benchmark tool.
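The two-step, plug-in-based structure described above can be sketched as follows. This is a minimal Python illustration of the idea, not the API of the sones benchmark tool: the names `Benchmark`, `register_import`, `register_algorithm`, and both toy plug-ins are hypothetical, and a plain dictionary stands in for the graph.

```python
import time
from typing import Callable, Dict, Set

Graph = Dict[int, Set[int]]  # toy stand-in for a graph: vertex id -> neighbour ids


class Benchmark:
    """Hypothetical two-step benchmark: import plug-ins, then algorithm plug-ins."""

    def __init__(self) -> None:
        self.importers: Dict[str, Callable[[], Graph]] = {}
        self.algorithms: Dict[str, Callable[[Graph], object]] = {}

    def register_import(self, name: str, plugin: Callable[[], Graph]) -> None:
        self.importers[name] = plugin

    def register_algorithm(self, name: str, plugin: Callable[[Graph], object]) -> None:
        self.algorithms[name] = plugin

    def run(self, importer: str, algorithm: str) -> Dict[str, object]:
        # Step 1: generate/import the data and measure how long it takes.
        start = time.perf_counter()
        graph = self.importers[importer]()
        import_seconds = time.perf_counter() - start

        # Step 2: execute the algorithm on the graph and measure it separately.
        start = time.perf_counter()
        result = self.algorithms[algorithm](graph)
        algorithm_seconds = time.perf_counter() - start

        return {
            "import_seconds": import_seconds,
            "algorithm_seconds": algorithm_seconds,
            "result": result,
        }


def ring_graph(size: int = 1000) -> Graph:
    """Toy import plug-in: every vertex is connected to its successor."""
    return {v: {(v + 1) % size} for v in range(size)}


def count_edges(graph: Graph) -> int:
    """Toy algorithm plug-in: count all edges."""
    return sum(len(neighbours) for neighbours in graph.values())


if __name__ == "__main__":
    bench = Benchmark()
    bench.register_import("ring", ring_graph)
    bench.register_algorithm("count_edges", count_edges)
    print(bench.run("ring", "count_edges"))
```

Timing the two steps separately keeps import cost from being mixed into the algorithm measurement, and looking plug-ins up by name is roughly what a command-line front end would need in order to select them.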

The framework, tool, and plug-ins are released as open-source software under the AGPLv3 license and can be downloaded here.

We distribute the source code mainly because it’s the best way for you to reproduce the results and see what is actually being tested. The other main reason is that we want everybody to be able to benchmark and test their own algorithms on GraphDB.


Source 1: https://github.com/sones/benchmark
Source 2: http://developers.sones.de/wiki/doku.php?id=benchmarks