as.network.data.frame does not follow as.network.matrix semantics #65

CarterButts · 2021-09-19T22:19:44Z

(This is related to, but not exactly the same as, the issue with two-mode adjacency processing with as.network.data.frame; see #64.)

When passed a matrix as input, network and as.network will use as.network.matrix to attempt to coerce the data into a network object. How to do this is complex, because there are many types of matrix representations of relational data, all of which are limited and/or leave ambiguities in what they represent. Thus, we use a combination of intelligent defaults and user-supplied cues to figure out what behavior is desired. Among these is the use of which.matrix.type to guess whether an input matrix is a (1) standard adjacency matrix, (2) two-mode matrix, (3) incidence matrix, or (4) edge list, along with supplied prompts (e.g., the bipartite or loops arguments, or the matrix.type argument to as.network.matrix, where given). We also follow some conventions on data handling, such as ignoring the diagonal of an adjacency matrix when loops==FALSE, which follows typical practice in the social network community (where diagonals are taken as undefined when loops are undefined, and anything might be stuck in those entries).
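For anyone who wants to see the guessing step in isolation, here is a minimal sketch that calls which.matrix.type on three toy inputs. The matrices are hypothetical illustrations, not data from any report, and no expected output is listed because the guess depends on the installed version of network.

```r
library(network)

# Hypothetical toy inputs, chosen only to differ in shape and content.
square    <- matrix(c(0, 1, 0,
                      0, 0, 1,
                      1, 0, 0), nrow = 3, byrow = TRUE)   # 3 x 3
rect_01   <- matrix(c(1, 0, 1,
                      0, 1, 1), nrow = 2, byrow = TRUE)   # 2 x 3, 0/1 entries
two_col   <- matrix(c(1, 2,
                      2, 3,
                      3, 1), ncol = 2, byrow = TRUE)      # 3 x 2, positive integers

# Ask the package how it would classify each input when no matrix.type is given.
guesses <- sapply(list(square = square, rect_01 = rect_01, two_col = two_col),
                  which.matrix.type)
print(guesses)
```

Passing matrix.type explicitly to as.network bypasses the guess, which is the documented way to resolve an ambiguous input.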

Currently, as.network.data.frame is not following all of the same semantics as as.network.matrix, which is problematic: wrapping a matrix in as.data.frame should not change the behavior of any network coercion function, in cases where the matrix version is well-defined. (It can extend matrix behavior, but never replace it.) This issue surfaced with the specific example of the above-cited two-mode issue, and I do not have an exhaustive list at this point. But here are examples that I see:

As noted, two-mode adjacency matrices are not processed as two-mode adjacency matrices.
Passing an adjacency structure with loops results in an error with loops==FALSE, instead of ignoring diagonal entries.
Setting loops=TRUE with an adjacency structure having replicated rows leads to an error about multiple edges that seems to imply that the function is interpreting the matrix as if it were an edge list or transposed incidence matrix. (This even holds for rows that are all zero!)
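
A minimal sketch for comparing the two code paths on the same data follows. The 3 x 3 matrix is a hypothetical example (not taken from the original report) with one nonzero diagonal entry; the data.frame call is wrapped in try() because, per the list above, it may stop with an error rather than return a network. No output is shown, since it depends on the installed version.

```r
library(network)

# Hypothetical adjacency matrix with a single diagonal (loop) entry at [1, 1].
adj <- matrix(c(1, 1, 0,
                0, 0, 1,
                1, 0, 0), nrow = 3, byrow = TRUE)

# Matrix path: the reference semantics.
net_m <- as.network(adj, loops = FALSE)

# Data frame path: identical data, only the container differs.
adj_df <- as.data.frame(adj)
net_df <- try(as.network(adj_df, loops = FALSE), silent = TRUE)

if (inherits(net_df, "try-error")) {
  # Report the error instead of aborting, so the comparison script completes.
  message("data.frame path failed: ",
          conditionMessage(attr(net_df, "condition")))
} else {
  # TRUE only if both paths produce the same adjacency structure.
  same <- isTRUE(all.equal(unname(as.matrix(net_m)),
                           unname(as.matrix(net_df))))
  message("matrix and data.frame paths agree: ", same)
}
```

Swapping loops = FALSE for loops = TRUE, or duplicating a row of adj, gives a quick check of the other cases listed above.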

There may be more. The needed fix is that as.network.data.frame must replicate the behavior of as.network.matrix wherever the matrix behavior is well-defined, only extending it where a data.frame offers functionality that a matrix cannot supply - otherwise, one gets capricious changes in behavior based on e.g. how data is imported or stored (and, indeed, we are already seeing problems resulting from this). A second needed fix is that any differences need to be documented. (But there really should not be many, as noted above.)

CarterButts added the bug label Sep 19, 2021

knapply mentioned this issue Sep 30, 2021

as.network.data.frame does not handle two-mode adjacency matrices correctly #64

Open

mbojan mentioned this issue Jul 4, 2023

as.network.data.frame test trips #83

Open