-
Notifications
You must be signed in to change notification settings - Fork 26
Quick Start
The library supports the most common graph paradigm and include the following features:
- Directed: Edges can have a direction. Graphs can be directed, undirected or mixed.
- Weighted: Edges can have a weight.
- Self-loops: Nodes can have self-loops.
- Labelled edges: Edges can have a label.
- Properties: Each element in the graph can have properties associated to it.
- Time-based: Use timestamp or intervals to represent element and values over time.
This representation is compatible with the Property Graph Model.
This section explains the fundamental concepts to get started using the library.
The graph is the representation of the graph structure and is essentially a list of nodes and a list of edges.
Edges can be directed or undirected. A graph that contains a mixture of directed and undirected edges is called a 'mixed' graph.
Two directed edges of opposite direction between two nodes are called 'mutual' edges.
An element is either a node or an edge and is part of the graph. Elements can have attributes associated to them such as 'age' or 'gender'. In the library, the Element concept is useful to provide a common set of features for both nodes and edges.
An attribute is a specific piece of information associated to an element. It has a type and is identified by a key. For instance, each node could have a 'age' attribute of type 'int'.
A column defines an attribute key and type. This concept is similar to what a column is in a spreadsheet. In addition of the type and the name, a column can also have a title and a default value.
A table is the container for columns. Each graph has two tables: a node table and an edge table.
An index holds the list of elements that have a specific attribute value. For instance, if nodes have a 'gender' column the index can be queried to retrieve the list of 'male' nodes.
An edge label characterise the type of relationship between the two nodes. This is optional and when not provided a 'default' label is assigned. We say 'parallel edges' when multiple edges exist between two nodes with different labels.
A view is a mask on the graph structure and represents a subgraph. The user controls the set of nodes and edges in the view.
The timestamp is an instant in time and can be associated with elements and attributes to represents graphs over time.
Intervals are the continuous counterpart of timestamps. Each interval has a 'start' and an 'end'.
Different features can be customized with the following settings:
-
Node ID type (Class): The type of the node ID column. Default is
String.class
-
Edge ID type (Class): The type of the edge ID column. Default is
String.class
-
Edge Label type (Class): The type of the edge Label objects. Default is
String.class
-
Edge Weight type (Class): The type of the edge weight. Default is
Double.class
-
Spatial Index (Boolean): Whether to keep a spatial index based on the nodes' positions. Default is
False
.
For dynamic graphs (i.e. graphs over time):
-
Time representation (Enum): Either
INTERVAL
orTIMESTAMP
. Default isTIMESTAMP
.
The graph model is the entry point for all features and represents a single graph.
GraphModel gm = GraphModel.Factory.newInstance();
The element creation pass through a factory. The elements are just created at this point and aren't added to the graph.
Simple example that creates two nodes and one directed edge
GraphModel gm = GraphModel.Factory.newInstance();
GraphFactory gf = gm.factory();
Node n1 = gf.newNode();
Node n2 = gf.newNode();
Edge e = gf.newEdge(n1, n2);
Edges can also be created undirected and/or with a type and a weight:
// Source, Target, Directed
Edge e = gf.newEdge(n1, n2, false)
Multiple edges between two nodes can be created with different labels:
int type1 = gm.addEdgeType("friend");
int type2 = gm.addEdgeType("coworker");
// Source, Target, Type, Directed
Edge e1 = gf.newEdge(n1, n2, type1, false);
Edge e2 = gf.newEdge(n1, n2, type2, false);
GraphModel gm = GraphModel.Factory.newInstance();
GraphFactory gf = gm.factory();
Node n1 = gf.newNode();
Node n2 = gf.newNode();
Edge e = gf.newEdge(n1, n2);
Graph g = gm.getGraph();
g.addNode(n1);
g.addNode(n2);
g.addEdge(e);
Graph g = ...
for(Node n : graph.getNodes()) {
}
Graph g = ...
for(Edge e : graph.getEdges()) {
}
Attributes are organized by columns so columns need to be created before values can be associated with elements
GraphModel gm = GraphModel.Factory.newInstance();
Column col = gm.getNodeTable().addColumn("Age", Integer.class);
Node node = ...
node.setAttribute(col, 30);
Graph graph = ...
Node node = ...
int nodeCount = graph.getNodeCount();
int edgeCount = graph.getEdgeCount();
graph.getDegree(node);
A custom configuration can be passed to the newInstance()
method:
Configuration config = Configuration.builder().build();
GraphModel gm = GraphModel.Factory.newInstance(config);
The configuration can't be modified once the graph model has been created.
By default, GraphStore automatically locks the graph structure using a ReentrantReadWriteLock
behind the scene. This is done automatically when calling read or write functions, including iterators. If you would like to manage multithreading yourself, you can disable auto locking:
Configuration config = Configuration.builder()
.enableAutoLocking(false).build();
GraphModel gm = GraphModel.Factory.newInstance(config);
Depending on the use-case, not all GraphStore features might be necessary. To reduce memory usage, one can create a custom configuration that disable all features that take extra memory per element.
Configuration config = Configuration.builder()
.enableIndexNodes(false)
.enableIndexEdges(false)
.enableIndexTime(false)
.enableNodeProperties(false)
.enableEdgeProperties(false).build();
GraphModel gm = GraphModel.Factory.newInstance(config);
GraphStore stores its nodes and edges in arrays. Directly using indices might be handy for some algorithms.
The underlying array index can be accessed via the Element.getStoreId()
method. In addition, the Graph
has a getNodeByStoreId()
and getEdgeByStoreId
method. Finally, GraphModel
has a getMaxNodeStoreId()
and getMaxEdgeStoreId()
method.
Graph graph = ...
Node node = ...
int index = node.getStoreId();
node = graph.getNodeByStoreId(index);
Note that not all consecutive indices may be assigned. When elements are deleted, the store ids will be re-used.