Skip to content

Quick Start

Mathieu Bastian edited this page Jun 23, 2024 · 8 revisions

Graph

The library supports the most common graph paradigm and include the following features:

  • Directed: Edges can have a direction. Graphs can be directed, undirected or mixed.
  • Weighted: Edges can have a weight.
  • Self-loops: Nodes can have self-loops.
  • Labelled edges: Edges can have a label.
  • Properties: Each element in the graph can have properties associated to it.
  • Time-based: Use timestamp or intervals to represent element and values over time.

This representation is compatible with the Property Graph Model.

Fundamentals

This section explains the fundamental concepts to get started using the library.

Graph

The graph is the representation of the graph structure and is essentially a list of nodes and a list of edges.

Edge

Edges can be directed or undirected. A graph that contains a mixture of directed and undirected edges is called a 'mixed' graph.

Two directed edges of opposite direction between two nodes are called 'mutual' edges.

Element

An element is either a node or an edge and is part of the graph. Elements can have attributes associated to them such as 'age' or 'gender'. In the library, the Element concept is useful to provide a common set of features for both nodes and edges.

Attribute

An attribute is a specific piece of information associated to an element. It has a type and is identified by a key. For instance, each node could have a 'age' attribute of type 'int'.

Column

A column defines an attribute key and type. This concept is similar to what a column is in a spreadsheet. In addition of the type and the name, a column can also have a title and a default value.

Table

A table is the container for columns. Each graph has two tables: a node table and an edge table.

Index

An index holds the list of elements that have a specific attribute value. For instance, if nodes have a 'gender' column the index can be queried to retrieve the list of 'male' nodes.

Edge Label

An edge label characterise the type of relationship between the two nodes. This is optional and when not provided a 'default' label is assigned. We say 'parallel edges' when multiple edges exist between two nodes with different labels.

View

A view is a mask on the graph structure and represents a subgraph. The user controls the set of nodes and edges in the view.

Timestamp

The timestamp is an instant in time and can be associated with elements and attributes to represents graphs over time.

Interval

Intervals are the continuous counterpart of timestamps. Each interval has a 'start' and an 'end'.

Configuration

Different features can be customized with the following settings:

  • Node ID type (Class): The type of the node ID column. Default is String.class
  • Edge ID type (Class): The type of the edge ID column. Default is String.class
  • Edge Label type (Class): The type of the edge Label objects. Default is String.class
  • Edge Weight type (Class): The type of the edge weight. Default is Double.class
  • Spatial Index (Boolean): Whether to keep a spatial index based on the nodes' positions. Default is False.

For dynamic graphs (i.e. graphs over time):

  • Time representation (Enum): Either INTERVAL or TIMESTAMP. Default is TIMESTAMP.

API

Create Graph Model

The graph model is the entry point for all features and represents a single graph.

GraphModel gm = GraphModel.Factory.newInstance();

Create nodes & edges

The element creation pass through a factory. The elements are just created at this point and aren't added to the graph.

Simple example that creates two nodes and one directed edge

GraphModel gm = GraphModel.Factory.newInstance();
GraphFactory gf = gm.factory();

Node n1 = gf.newNode();
Node n2 = gf.newNode();
Edge e = gf.newEdge(n1, n2);

Edges can also be created undirected and/or with a type and a weight:

// Source, Target, Directed
Edge e = gf.newEdge(n1, n2, false)

Create parallel edges

Multiple edges between two nodes can be created with different labels:

int type1 = gm.addEdgeType("friend");
int type2 = gm.addEdgeType("coworker");

// Source, Target, Type, Directed
Edge e1 = gf.newEdge(n1, n2, type1, false);
Edge e2 = gf.newEdge(n1, n2, type2, false);

Get graph and add elements

GraphModel gm = GraphModel.Factory.newInstance();
GraphFactory gf = gm.factory();

Node n1 = gf.newNode();
Node n2 = gf.newNode();
Edge e = gf.newEdge(n1, n2);

Graph g = gm.getGraph();
g.addNode(n1);
g.addNode(n2);
g.addEdge(e);

Iterate over nodes

Graph g = ...
for(Node n : graph.getNodes()) {
}

Iterate over edges

Graph g = ...
for(Edge e : graph.getEdges()) {
}

Set attributes

Attributes are organized by columns so columns need to be created before values can be associated with elements

GraphModel gm = GraphModel.Factory.newInstance();
Column col = gm.getNodeTable().addColumn("Age", Integer.class);

Node node = ...
node.setAttribute(col, 30);

Get sizes and degrees

Graph graph = ...
Node node = ...

int nodeCount = graph.getNodeCount();
int edgeCount = graph.getEdgeCount();

graph.getDegree(node);

Custom configuration

A custom configuration can be passed to the newInstance() method:

Configuration config = Configuration.builder().build();
GraphModel gm = GraphModel.Factory.newInstance(config);

The configuration can't be modified once the graph model has been created.

Multithreading

By default, GraphStore automatically locks the graph structure using a ReentrantReadWriteLock behind the scene. This is done automatically when calling read or write functions, including iterators. If you would like to manage multithreading yourself, you can disable auto locking:

Configuration config = Configuration.builder()
   .enableAutoLocking(false).build();
GraphModel gm = GraphModel.Factory.newInstance(config);

Low memory footprint

Depending on the use-case, not all GraphStore features might be necessary. To reduce memory usage, one can create a custom configuration that disable all features that take extra memory per element.

Configuration config = Configuration.builder()
   .enableIndexNodes(false)
   .enableIndexEdges(false)
   .enableIndexTime(false)
   .enableNodeProperties(false)
   .enableEdgeProperties(false).build();
GraphModel gm = GraphModel.Factory.newInstance(config);

Miscellaneous

Using elements' store ids

GraphStore stores its nodes and edges in arrays. Directly using indices might be handy for some algorithms.

The underlying array index can be accessed via the Element.getStoreId() method. In addition, the Graph has a getNodeByStoreId() and getEdgeByStoreId method. Finally, GraphModel has a getMaxNodeStoreId() and getMaxEdgeStoreId() method.

Graph graph = ...
Node node = ...
int index = node.getStoreId();
node = graph.getNodeByStoreId(index);

Note that not all consecutive indices may be assigned. When elements are deleted, the store ids will be re-used.