Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persist registry ID in block #445

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from
Draft

Persist registry ID in block #445

wants to merge 15 commits into from

Conversation

nbenn
Copy link
Collaborator

@nbenn nbenn commented Nov 4, 2024

In order to look up data from the block registry (such as input/output types) for a given block, we need to be able to identify the registry entry from which a block was created.

@nbenn nbenn mentioned this pull request Nov 4, 2024
@DivadNojnarg
Copy link
Collaborator

@nbenn while testing this branch, I realised there might be an issue in the stack code. Only the first block gets the new registry_id attribute. Checking the code, it looks like we don't call construct_block when the stack has 1 block or more:

if (!length(stack)) {

    tmp <- initialize_block(
      construct_block(block)
    )

  } else {

    tmp <- initialize_block(
      do.call(block, list(position = position)),
      data
    )
  }

I'd go for this instead:

...
tmp <- construct_block(block)

  if (!length(stack)) {
    tmp <- initialize_block(tmp)
  } else {
    attr(tmp, "position") <- position
    tmp <- initialize_block(tmp, data)
  }

@nbenn
Copy link
Collaborator Author

nbenn commented Nov 4, 2024

@DivadNojnarg, yes, my bad, that was a bit sloppy on my part. Should be fixed now.

@nbenn
Copy link
Collaborator Author

nbenn commented Nov 7, 2024

@DivadNojnarg not really tested this well yet, but everything we discussed is in place I believe. Any thoughts? Does this look good to you? Should fix all issues we had (as discussed in #436).

@DivadNojnarg
Copy link
Collaborator

@nicolas. I continued to play around with the registry PR. What I tested so far works rather well. I have a remark regarding the result block that appears when it should not (see below). You can reproduce it by running the code I published below with the palmer penguins.
Screenshot 2024-11-12 at 08 36 30

My only potential concern is the breaking change induced by the new input and output system and the impact it would have on @JohnCoene' packages. All blockr side packages we maintain (like blockr.ggplot2 and cie) as well as in our internal blockr vignettes, input and output are not using the new system and would need an update. An example, if I want to generate the palmer penguins stack interactively, I have to update the code to create missing methods for plot_block and plot_layer_block. I does not seem to much work to update but I don't know about the impact on the blocks used by the DPP, blockr.composer, composer ... (cc @MikeJohnPage).

new_ggplot_block <- function(col_x = character(), col_y = character(), ...) {

  data_cols <- function(data) colnames(data)

  new_block(
    fields = list(
      x = new_select_field(col_x, data_cols, type = "name"),
      y = new_select_field(col_y, data_cols, type = "name")
    ),
    expr = quote(
      ggplot(mapping = aes(x = .(x), y = .(y)))
    ),
    class = c("ggplot_block", "plot_block"),
    ...
  )
}

new_geompoint_block <- function(color = character(), shape = character(), ...) {

  data_cols <- function(data) colnames(data$data)

  new_block(
    fields = list(
      color = new_select_field(color, data_cols, type = "name"),
      shape = new_select_field(shape, data_cols, type = "name")
    ),
    expr = quote(
      geom_point(aes(color = .(color), shape = .(shape)), size = 2)
    ),
    class = c("geompoint_block", "plot_layer_block", "plot_block"),
    ...
  )
}

block_input_check.plot_block <- function(x, data, ...) {

  if (inherits(data, "data.frame")) {
    return(invisible(NULL))
  }

  input_failure("Expecting data.frame input.")
}

block_output_ptype.plot_block <- function(x, ...) ggplot()

block_input_check.plot_layer_block <- function(x, data, ...) {
  if (inherits(data, "ggplot")) {
    return(invisible(NULL))
  }

  input_failure("Expecting ggplot input.")
}

block_output_ptype.plot_layer_block <- function(x, ...) ggplot()

register_blocks(
  constructor = c(new_ggplot_block, new_geompoint_block),
  name = c("ggplot block", "geompoint block"),
  description = c(
    "Builds a ggplot object",
    "Add points geom to ggplot object"
  ),
  package = "blockr.demo",
  category = c("custom data", "visualization", "visualization")
)

stack <- new_stack(data_block = new_dataset_block("penguins", "palmerpenguins"))
serve_stack(stack)

@MikeJohnPage
Copy link
Collaborator

We are still finalising the API for the tables, listings and figures in composer, so we will need to reimplenet all of the blockr functionality in any case. So implementing new methods shouldn't be a concern for this particularly package.

@nbenn
Copy link
Collaborator Author

nbenn commented Nov 12, 2024

@DivadNojnarg thanks for checking this out and your feedback. To your two issues:

  1. The result block showing up where it should not: currently the rule is only this:

    blockr/R/block-core.R

    Lines 362 to 364 in c0c9987

    if (length(get_workspace_stacks()) >= 2L) {
    return(invisible(NULL))
    }

    easy to tweak; will do so.

  2. This representing a breaking change: yes, for any block that does not inherit from either data_block or transform_block, we need an implementation for block_input_check() and block_output_ptype(). This was the least intrusive was of doing this that I could come up with, as it does not require changes that inherit from "core" block classes (as long as the defaults work for the given block class). If you have good ideas for "lower-level" defaults, we can provide implementations for the block class to be used as fallback. But I don't think this is sensible, as it will most likely have to be overridden every time anyways (and a straight up error is easier I feel; happy to use the base-class for a better error msg though).

@nbenn
Copy link
Collaborator Author

nbenn commented Nov 12, 2024

@DivadNojnarg we could do something like

block_input_check.block <- function(x, data, ...) TRUE
block_output_ptype.block <- function(x, ...) list()

But then for example the plot_blocks are "compatible" with everything.

Let me know what you think.

@DivadNojnarg
Copy link
Collaborator

But then for example the plot_blocks are "compatible" with everything.

Seems more annoying. If others are ok with our change, I'd keep what we have.

@DivadNojnarg
Copy link
Collaborator

On a side note, are we going to always have this pattern?

block_input_check.plot_block <- function(x, data, ...) {

  if (inherits(data, "<PTYPE>")) {
    return(invisible(NULL))
  }

  input_failure("Expecting data.frame input.")
}

If yes, we could have a helper such as:

block_input_check_builder <- function(ptype) {
  function(x, data, ...) {
     if (inherits(data, ptype)) {
       return(invisible(NULL))
     }
     input_failure(sprintf("Expecting %s input.", ptype))
   }
}

block_input_check_builder("data.frame")

People are free to use it or not ...

@DivadNojnarg
Copy link
Collaborator

@nbenn add_block would need a little update:

add_block <- function(stack, block, position = NULL) {
  if (length(stack) == 0) {
    # we can't check check inherit because
    # this comes from the registry which is of class
    # block_descr
    if (!"data_block" %in% attr(block, "classes")) {
      stop("The first block must be a data block.")
    }
  }

With the current code base, there is some redundancy. In the server function, we now use get_compatible_block in the add_block_server module, which returns a compatible block to add_block_stack in the stack server function:

add_block_stack(
          block_to_add = available_blocks()[[block_to_add$selected()]], # pass in block constructor
          position = NULL,
          vals = vals
        )

add_block_stack in turns calls add_block:

add_block_stack <- function(
    block_to_add,
    position = NULL,
    vals,
    session = getDefaultReactiveDomain()) {
  vals$stack <- add_block(vals$stack, block_to_add, position)
...
}

On the other hand, it could happen that people forget to call get_compatible_block in their custom designed "add block" server functions.

What do you think would be the best?

@DivadNojnarg
Copy link
Collaborator

@nbenn I just found this a bit annoying. Assuming you have a package with custom blocks that are registered on load. Now say, we add other blocks which require some new block_input_check and block_output_ptype methods. When you call devtools::document() to add these new methods to the namespace, this yields:

Error in `get_s3_method()`:
! no method found for generic block_input_checkand classes plot_block, block

since the block registration happens on load in zzz.R and the method isn't in the namespace yet. The consequence: the new method can't be registered. Workaround: comment out the registration hook and rerun devtools::document().

}

input_failure("Expecting at least two stacks.")
NextMethod()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nbenn Spotted a new issue when checking the tests:

Error: Get compatible blocks
Error in `NextMethod()`: 'NextMethod' called from an anonymous function

Looking at are_blocks_compatible, problem is that the stored function inside attr(x, "input") is anonymous.

Code to reproduce:

stack1 <- new_stack(
    new_dataset_block,
    new_filter_block
  )

  stack2 <- new_stack()

  set_workspace(stack1 = stack1, stack2 = stack2)
serve_workspace(clear = FALSE)

Then, click on the + button for the first stack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants