as.network.data.frame does not follow as.network.matrix semantics #65

Open
CarterButts opened this issue Sep 19, 2021 · 0 comments

(This is related to, but not exactly the same as, the issue with two-mode adjacency processing with as.matrix.data.frame.)

When passed a matrix as input, network and as.network use as.network.matrix to attempt to coerce the data into a network object. How to do this is complex, because there are many types of matrix representations of relational data, all of which are limited and/or leave ambiguities in what they represent. Thus, we use a combination of intelligent defaults and user-generated cues to figure out what behavior is desired. Among these is the use of which.matrix.type to guess whether an input matrix is a (1) standard adjacency matrix, (2) two-mode matrix, (3) incidence matrix, or (4) edge list, along with supplied prompts (e.g., the bipartite or loops attributes, or the matrix.type argument to as.network.matrix, where given). We also follow some conventions on data handling, such as ignoring the diagonal of an adjacency matrix when loops==FALSE, which follows typical practice in the social network community (where diagonals are taken as undefined when loops are undefined, and anything might be stuck in those entries).
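To make the diagonal convention concrete, here is a minimal sketch of the matrix path, assuming the network package is installed. The matrix `m` is made up for illustration, and no printed output is shown since details may vary across package versions.

```r
library(network)

# Hypothetical 3-node adjacency matrix with junk values on the diagonal.
m <- matrix(c(9, 1, 0,
              0, 9, 1,
              1, 0, 9), nrow = 3, byrow = TRUE)

# Square numeric input, so the type guess should be an adjacency matrix.
guessed <- which.matrix.type(m)

# With loops = FALSE, the diagonal entries should be ignored rather than
# treated as self-ties, leaving only the three off-diagonal edges.
g <- as.network(m, matrix.type = "adjacency", loops = FALSE)
n_edges <- network.edgecount(g)
```

This is the behavior that a data.frame holding the same values would be expected to reproduce.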

Currently, as.network.data.frame is not following all of the same semantics as as.network.matrix, which is problematic: wrapping a matrix in as.data.frame should not change the behavior of any network coercion function, in cases where the matrix version is well-defined. (It can extend matrix behavior, but never replace it.) This issue surfaced with the specific example of the above-cited two-mode issue, and I do not have an exhaustive list at this point. But here are examples that I see:

  • As noted, two-mode adjacency matrices are not processed as two-mode adjacency matrices.
  • Passing an adjacency structure with loops results in an error with loops==FALSE, instead of ignoring diagonal entries.
  • Setting loops=TRUE with an adjacency structure having replicated rows leads to an error about multiple edges that seems to imply that the function is interpreting the matrix as if it were an edge list or transposed incidence matrix. (This even holds for rows that are all zero!)
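The cases above can be probed with a script along the following lines. This is a sketch, assuming the network package is installed; `try_coerce` and the example matrices are hypothetical names introduced here, and whether the data.frame calls error will depend on the installed package version.

```r
library(network)

# Hypothetical helper for this sketch: evaluate a coercion and report either
# the resulting edge count or the error message, without stopping the script.
try_coerce <- function(expr) {
  tryCatch(
    paste("ok, edges:", network.edgecount(expr)),
    error = function(e) paste("error:", conditionMessage(e))
  )
}

# Two-mode adjacency structure (2 actors by 3 events).
bip <- matrix(c(1, 0, 1,
                0, 1, 1), nrow = 2, byrow = TRUE)
res_bip_mat <- try_coerce(as.network(bip))
res_bip_df  <- try_coerce(as.network(as.data.frame(bip)))

# Adjacency structure with nonzero diagonal entries, loops = FALSE.
adj <- matrix(c(1, 1, 0,
                0, 1, 1,
                1, 0, 1), nrow = 3, byrow = TRUE)
res_adj_mat <- try_coerce(as.network(adj, loops = FALSE))
res_adj_df  <- try_coerce(as.network(as.data.frame(adj), loops = FALSE))

# Adjacency structure with replicated rows (including an all-zero row),
# loops = TRUE.
dup <- matrix(c(0, 1, 1,
                0, 1, 1,
                0, 0, 0), nrow = 3, byrow = TRUE)
res_dup_mat <- try_coerce(as.network(dup, loops = TRUE))
res_dup_df  <- try_coerce(as.network(as.data.frame(dup), loops = TRUE))

print(c(res_bip_mat, res_bip_df, res_adj_mat, res_adj_df,
        res_dup_mat, res_dup_df))
```

Comparing each `*_mat` result with its `*_df` counterpart shows where the two code paths diverge.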

There may be more. The needed fix is that as.network.data.frame must replicate the behavior of as.network.matrix where the latter is well-defined, only extending it where a data.frame offers functionality that a matrix cannot supply - otherwise, one gets capricious changes in behavior based on e.g. how data is imported or stored (and, indeed, we are already seeing problems resulting from this). A second needed fix is that any differences need to be documented. (But there really should not be many, as noted above.)
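One way to state the requirement as a check is sketched below, assuming the network package is installed. `coercion_agrees` is a hypothetical helper, not part of the package, and it compares only adjacency structure (so it would miss differences in vertex names or attributes, and could report a mismatch if the two paths order vertices differently).

```r
library(network)

# Hypothetical helper: TRUE if the matrix path and the data.frame path yield
# the same adjacency structure for `m`; FALSE if they differ or if either
# path raises an error. Extra arguments are passed to both as.network calls.
coercion_agrees <- function(m, ...) {
  tryCatch({
    g_mat <- as.network(m, ...)
    g_df  <- as.network(as.data.frame(m), ...)
    a <- unname(as.matrix(g_mat, matrix.type = "adjacency"))
    b <- unname(as.matrix(g_df, matrix.type = "adjacency"))
    identical(dim(a), dim(b)) && isTRUE(all(a == b))
  }, error = function(e) FALSE)
}

# Example: adjacency structure with nonzero diagonal entries, loops = FALSE.
adj <- matrix(c(1, 1, 0,
                0, 1, 1,
                1, 0, 1), nrow = 3, byrow = TRUE)
agrees <- coercion_agrees(adj, loops = FALSE)
```

A check of this kind could be run over a set of well-defined matrix inputs, with any FALSE result marking either a bug or a difference that needs documenting.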
