Consider updating/extending the matrix.type heuristics #62

CarterButts · 2021-06-14T02:00:06Z

From a different thread:

Yeah, the matrix type heuristics have to make some judgment calls, and those are tricky in some cases. Currently, a square matrix is always assumed to be an adjacency matrix if not specified (since it usually is); if it has an n attribute that doesn't match the dimension, then that flags as an error. Regular edgelists with two edges, or sna edgelists with three edges, can be hard to distinguish from valued adjacency matrices. It may be worth revisiting those heuristics (especially since we weren't using extra attributes, IIRC, when they were first created); the help does specify that they are dubious, but one would like them to be as smart as they can reasonably be under the circumstances. One such heuristic might be that if a square matrix has 2 or 3 columns, an n attribute not matching the dimension, and the first two columns contain only values in 1:n, then it's probably an edgelist.

As background, coercing a matrix to a network requires knowing what type of matrix (sociomatrix, edgelist, or incidence matrix) is involved. This is normally specified by the user, but when the user does not specify it explicitly, we fall back on automagic solutions. Presently, this is controlled by the function which.matrix.type. Unfortunately, it is not always possible to determine the matrix type unambiguously from the data itself, so we rely on a series of heuristics that are based in part on common use cases (in fairness, we warn the user about this in the help pages). Over time, our most common use cases have evolved, so some heuristics may no longer be ideal (or, more broadly, we can do better). In particular, edgelists have gone from being a relatively uncommon data type to an extremely widely used one, so it is useful to be sure that we handle these well.
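To make the ambiguity concrete, here is a small sketch, written in Python purely for illustration and not taken from the package. It reads the same 2x2 matrix once as a conventional edgelist and once as a valued adjacency matrix. The helper names `edges_as_edgelist` and `edges_as_adjacency` are hypothetical.

```python
# Illustrative only: two equally valid readings of one square matrix.
m = [[1, 2],
     [2, 1]]


def edges_as_edgelist(mat):
    """Read each row as one edge (tail, head), with 1-based vertex ids."""
    return [(row[0], row[1]) for row in mat]


def edges_as_adjacency(mat):
    """Read cell (i, j) as the value of edge i -> j; nonzero means present."""
    return [
        (i + 1, j + 1, value)
        for i, row in enumerate(mat)
        for j, value in enumerate(row)
        if value != 0
    ]


as_edgelist = edges_as_edgelist(m)    # two edges: 1 -> 2 and 2 -> 1
as_adjacency = edges_as_adjacency(m)  # four valued edges, including loops
```

Nothing in the cell values alone favors one reading over the other, which is why extra information such as an n attribute is needed to break the tie.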

The proximate issue here has to do with edgelists that happen to create square matrices (so two-edge networks for conventional edgelists and three-edge networks for sna edgelists). The current heuristics assume that a square matrix is an adjacency matrix, which is almost always the right answer; however, it would be useful to be able to spot more of these odd cases. The above quote supplies a suggestion in that regard, which may be worth implementing. I am opening the issue mostly so that I don't forget, but also in case others have more cases that we'd like the heuristics to cover.
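As a rough sketch of how the suggested rule might look, the following Python function applies it to a square matrix given as a list of rows. It is an illustration under stated assumptions, not the code of which.matrix.type: the function name, the return labels, and the treatment of the n attribute as a plain optional argument are all hypothetical.

```python
def _is_vertex_id(value, n):
    """True if value is a whole number in 1..n (assumed vertex id range)."""
    return (
        isinstance(value, (int, float))
        and float(value).is_integer()
        and 1 <= value <= n
    )


def guess_square_matrix_type(mat, n=None):
    """Hypothetical sketch of the proposed heuristic for square matrices.

    mat: list of equal-length rows; n: optional declared vertex count,
    standing in for the matrix's n attribute.
    """
    rows = len(mat)
    cols = len(mat[0]) if rows else 0
    if rows != cols:
        raise ValueError("this sketch only handles square matrices")

    # Current default: a square matrix is assumed to be an adjacency matrix.
    if n is None or n == rows:
        return "adjacency"

    # Proposed addition: 2 or 3 columns, n does not match the dimension,
    # and the first two columns hold only values in 1..n.
    if cols in (2, 3) and all(
        _is_vertex_id(v, n) for row in mat for v in row[:2]
    ):
        return "edgelist"

    # Otherwise the mismatch between n and the dimension stays an error case.
    return "mismatch"
```

For example, `guess_square_matrix_type([[1, 2], [2, 1]], n=5)` yields `"edgelist"`, while the same matrix with no n supplied still yields `"adjacency"`, so the existing default is preserved wherever the extra evidence is absent.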

CarterButts added the enhancement label Jun 14, 2021

CarterButts self-assigned this Jun 14, 2021
