
we’ve added a bulk import plug-in

Today we added a plug-in called “FastImport” to version 2.1 of the GraphDB community edition. It is a bulk import plug-in that takes a proprietary XML format as input and imports vertices and edges into a running GraphDB instance.

To use the new plug-in, you need to know that an import is a two-stage process:

  1. schema setup
  2. fast-import

First you define the vertex and edge types that step 2 will import. You normally do this with GraphQL by specifying several vertex types. For demonstration purposes we use a small social network with only one vertex type:

CREATE VERTEX TYPE User ATTRIBUTES (String Name, Int64 Age, Set<User> Friends)

With the schema set up, the only thing left is to call the import plug-in with another short GraphQL query:

IMPORT FROM 'file:\\100k_import.xml' FORMAT FastImport
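The two stages can be scripted as a fixed sequence of query strings. The following Python sketch only assembles the two GraphQL statements shown in this post in the right order. The names `build_import_queries`, `run_import` and the `send` callback are hypothetical, and how a query is actually submitted to a GraphDB instance is deliberately left open.

```python
from typing import Callable, List

# Stage 1: the schema query from this post, unchanged.
SCHEMA_QUERY = (
    "CREATE VERTEX TYPE User "
    "ATTRIBUTES (String Name, Int64 Age, Set<User> Friends)"
)


def build_import_queries(location: str) -> List[str]:
    """Return the schema query followed by the FastImport query."""
    # Stage 2: the import query from this post, with the file location filled in.
    import_query = f"IMPORT FROM '{location}' FORMAT FastImport"
    return [SCHEMA_QUERY, import_query]


def run_import(location: str, send: Callable[[str], None]) -> None:
    """Send both queries in order; `send` is a hypothetical transport."""
    for query in build_import_queries(location):
        send(query)


if __name__ == "__main__":
    # Print instead of sending, since no client library is assumed here.
    run_import(r"file:\\100k_import.xml", print)
```

Keeping the order in one place makes it harder to trigger the import before the vertex type exists.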

This takes, for example, the 100,000-user dataset and imports it into the current GraphDB instance. We have already done that for you, so here are the results comparing GraphQL import and FastImport on both the persistent and the in-memory version of GraphDB:

[Figure: in-memory import]

[Figure: persistent import]

You can also download the datasets used in this small benchmark here:

10,000 users / 592,374 edges: GraphQL Import, FastImport

100,000 users / 5,944,332 edges: GraphQL Import, FastImport
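For a rough sense of graph density, the two datasets can be compared by their average number of edges per user. The sketch below only does that arithmetic on the figures listed above; the names `DATASETS` and `edges_per_user` are illustrative and not part of GraphDB.

```python
# Dataset sizes as listed above: (users, edges).
DATASETS = {
    "10k": (10_000, 592_374),
    "100k": (100_000, 5_944_332),
}


def edges_per_user(users: int, edges: int) -> float:
    """Average number of edges per user vertex."""
    return edges / users


if __name__ == "__main__":
    for name, (users, edges) in DATASETS.items():
        print(f"{name}: {edges_per_user(users, edges):.1f} edges per user")
```

Both datasets come out at roughly 59 edges per user, so the larger one scales the graph up without making it noticeably denser.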
