-
Notifications
You must be signed in to change notification settings - Fork 4
/
README.Rmd
211 lines (168 loc) · 8.52 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
---
output: github_document
---
# netrankr <img src="man/figures/logo.png" align="right"/>
[![R-CMD-check](https://github.com/schochastics/netrankr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/schochastics/netrankr/actions/workflows/R-CMD-check.yaml)
[![CRAN Status Badge](https://www.r-pkg.org/badges/version/netrankr)](https://cran.r-project.org/package=netrankr)
[![CRAN Downloads](https://cranlogs.r-pkg.org/badges/netrankr)](https://CRAN.R-project.org/package=netrankr)
[![Codecov test coverage](https://codecov.io/gh/schochastics/netrankr/branch/main/graph/badge.svg)](https://app.codecov.io/gh/schochastics/netrankr?branch=main)
[![JOSS](https://joss.theoj.org/papers/10.21105/joss.04563/status.svg)](https://doi.org/10.21105/joss.04563)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7109041.svg)](https://doi.org/10.5281/zenodo.7109041)
```{r setup, include=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
fig.align = "center",
out.width = "80%",
comment = "#>",
fig.path = "man/figures/README-",
echo = TRUE,
warning = FALSE,
message = FALSE
)
```
## Overview
The literature is flooded with centrality indices and new ones are introduced
on a regular basis. Although there exist several theoretical and empirical guidelines
on when to use certain indices, there still exists plenty of ambiguity in the concept
of network centrality. To date, network centrality is nothing more than applying indices
to a network:
![](man/figures/flow_old.png)
The only degree of freedom is the choice of index. The package comes with an Rstudio addin (`index_builder()`),
which allows to build or choose from more than 20 different indices. Blindly (ab)using
this function is highly discouraged!
The `netrankr` package is based on the idea that centrality is more than a
conglomeration of indices. Decomposing them in a series of microsteps offers
the posibility to gradually add ideas about centrality, without succumbing to
trial-and-error approaches. Further, it allows for alternative assessment methods
which can be more general than the index-driven approach:
![](man/figures/flow_new.png)
The new approach is centered around the concept of *positions*, which are defined as
the relations and potential attributes of a node in a network. The aggregation
of the relations leads to the definition of indices. However, positions can also
be compared via *positional dominance*, leading to partial centrality rankings and
the option to calculate probabilistic centrality rankings.
For a more detailed theoretical background, consult the [Literature](#literature)
at the end of this page.
________________________________________________________________________________
## Installation
To install from CRAN:
```{r install_cran, eval=FALSE}
install.packages("netrankr")
```
To install the developer version from github:
```{r install_git, eval=FALSE}
# install.packages("remotes")
remotes::install_github("schochastics/netrankr")
```
________________________________________________________________________________
## Simple Example
This example briefly explains some of the functionality of the package and the
difference to an index driven approach. For a more realistic application see
the use case vignette.
We work with the following small graph.
```{r example_graph, warning=FALSE,message=FALSE}
library(igraph)
library(netrankr)
data("dbces11")
g <- dbces11
```
```{r dbces_neutral, echo=FALSE}
library(ggraph)
V(g)$name <- as.character(1:11)
ggraph(g, "stress") +
geom_edge_link0(edge_colour = "grey66") +
geom_node_point(shape = 21, fill = "grey25", size = 8) +
geom_node_text(aes(label = name), col = "white") +
theme_graph()
```
Say we are interested in the most central node of the graph and simply compute some
standard centrality scores with the `igraph` package. Defining centrality indices
in the `netrankr` package is explained in the centrality indices vignette.
```{r cent,warning=FALSE}
cent_scores <- data.frame(
degree = degree(g),
betweenness = round(betweenness(g), 4),
closeness = round(closeness(g), 4),
eigenvector = round(eigen_centrality(g)$vector, 4),
subgraph = round(subgraph_centrality(g), 4)
)
# What are the most central nodes for each index?
apply(cent_scores, 2, which.max)
```
```{r dbces_color, echo=FALSE}
V(g)$col <- "none"
V(g)$col[apply(cent_scores, 2, which.max)] <- names(apply(cent_scores, 2, which.max))
V(g)$lab <- ""
V(g)$lab[apply(cent_scores, 2, which.max)] <- stringr::str_to_upper(stringr::str_extract(names(apply(cent_scores, 2, which.max)), "^[a-z]"))
ggraph(g, "stress") +
geom_edge_link0(edge_colour = "grey66") +
geom_node_point(aes(fill = col), shape = 21, size = 8) +
geom_node_text(aes(label = lab), col = "white") +
scale_fill_manual(values = c("#1874CD", "#CD2626", "#EEB422", "#9A32CD", "#4D4D4D", "#EE30A7")) +
theme_graph() +
theme(legend.position = "none")
```
As you can see, each index assigns the highest value to a different vertex.
A more general assessment starts by calculating the neighborhood inclusion preorder.
```{r ex_ni}
P <- neighborhood_inclusion(g)
P
```
[Schoch & Brandes (2016)](https://doi.org/10.1017/S0956792516000401) showed that
`P[u,v]=1` implies that u is less central than v for
centrality indices which are defined via specific path algebras. These include
many of the well-known measures like closeness (and variants), betweenness (and variants)
as well as many walk-based indices (eigenvector and subgraph centrality, total communicability,...).
Neighborhood-inclusion defines a partial ranking on the set of nodes. Each ranking
that is in accordance with this partial ranking yields a proper centrality ranking.
Each of these ranking can thus potentially be the outcome of a centrality index.
Using rank intervals, we can examine the minimal and maximal possible rank of each node.
The bigger the intervals are, the more freedom exists for indices to rank nodes differently.
```{r partial}
plot(rank_intervals(P), cent_scores = cent_scores, ties.method = "average")
```
The potential ranks of nodes are not uniformly distributed in the intervals. To get
the exact probabilities, the function `exact_rank_prob()` can be used.
```{r ex_p}
res <- exact_rank_prob(P)
res
```
For the graph `g` we can therefore come up with
`r format(res$lin.ext,big.mark = ",")` indices that would rank the nodes differently.
`rank.prob` contains the probabilities for each node to occupy a certain rank.
For instance, the probability for each node to be the most central one is as follows.
```{r most_central}
round(res$rank.prob[, 11], 2)
```
`relative.rank` contains the relative rank probabilities. An entry `relative.rank[u,v]`
indicates how likely it is that `v` is more central than `u`.
```{r rel_rank}
# How likely is it, that 6 is more central than 3?
round(res$relative.rank[3, 6], 2)
```
`expected.ranks` contains the expected centrality ranks for all nodes. They are
derived on the basis of `rank.prob`.
```{r exp_rank}
round(res$expected.rank, 2)
```
The higher the value, the more central a node is expected to be.
**Note**: The set of rankings grows exponentially in the number of nodes and the exact
calculation becomes infeasible quite quickly and approximations need to be used.
Check the benchmark results for guidelines.
________________________________________________________________________________
## Theoretical Background {#literature}
`netrankr` is based on a series of papers that appeared in recent years. If you
want to learn more about the theoretical background of the package,
consult the following literature:
> Schoch, David. (2018). Centrality without Indices: Partial rankings and rank
Probabilities in networks. *Social Networks*, **54**, 50-60.([link](https://doi.org/10.1016/j.socnet.2017.12.003))
> Schoch, David & Valente, Thomas W., & Brandes, Ulrik. (2017). Correlations among centrality indices
and a class of uniquely ranked graphs. *Social Networks*, **50**, 46-54.([link](https://doi.org/10.1016/j.socnet.2017.03.010))
> Schoch, David & Brandes, Ulrik. (2016). Re-conceptualizing centrality in social networks.
*European Journal of Appplied Mathematics*, **27**(6), 971–985.
([link](https://doi.org/10.1017/S0956792516000401))
> Brandes, Ulrik. (2016). Network Positions.
*Methodological Innovations*, **9**, 2059799116630650.
([link](https://dx.doi.org/10.1177/2059799116630650))
## Code of Conduct
Please note that the netrankr project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/1/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.