Citations and cross-references in Quarto #215

Closed
andrewheiss opened this issue Mar 28, 2024 · 18 comments

@andrewheiss

When using Markdown inside tables with Quarto, Quarto ignores the content and will not parse it. That's ordinarily okay—using format_tt(..., markdown = TRUE) will format most things just fine.

It gets tricky with syntax that Quarto should parse, like cross references and citations. For instance, take this:

---
title: "Reference stuff"
references:
- type: article-journal
  id: Lovelace1842
  author:
  - family: Lovelace
    given: Augusta Ada
  issued:
    date-parts:
    - - 1842
  title: >-
    Sketch of the analytical engine invented by Charles Babbage, by LF Menabrea, 
    officer of the military engineers, with notes upon the memoir by the translator
  title-short: Molecular structure of nucleic acids
  container-title: Taylor’s Scientific Memoirs
  volume: 3
  page: 666-731
  language: en-GB
---

```{r}
library(tinytable)

x <- data.frame(Thing = 1234, Citation = "@Lovelace1842")
tt(x)
```

It emits this HTML:

<table>
  <thead>
    <tr>
      <th scope="col" class="tinytable_css_9qmquh7a5tfly3l7oiyn">Thing</th>
      <th scope="col" class="tinytable_css_9qmquh7a5tfly3l7oiyn">Citation</th>
    </tr>
  </thead>
  
  <tbody>
    <tr>
      <td>1234</td>
      <td>@Lovelace1842</td>
    </tr>
  </tbody>
</table>

The @Lovelace1842 citation key isn't parsed and appears verbatim in the rendered table.

Quarto has the ability to treat specific text as Markdown, though, if you wrap it in an element with a data-qmd attribute set. A td element containing this should render as an actual citation:

<td> <span data-qmd="@Lovelace1842"></span> </td>
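
For illustration, here is a minimal sketch of how cell contents could be wrapped in that span before a table is built. The helper name wrap_data_qmd() is hypothetical (it is not part of Quarto or {tinytable}), and the attribute escaping is an assumption about what is needed to keep the HTML well formed:

```r
# Hypothetical helper, not part of Quarto or tinytable:
# wrap Markdown source in the span that Quarto looks for.
wrap_data_qmd <- function(x) {
  x <- as.character(x)
  # Escape characters that would otherwise end the attribute value early
  # (assumption: the attribute is decoded as ordinary HTML).
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub('"', "&quot;", x, fixed = TRUE)
  sprintf('<span data-qmd="%s"></span>', x)
}

cat(wrap_data_qmd("@Lovelace1842"), "\n")
#> <span data-qmd="@Lovelace1842"></span>
```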

This is an issue with all table-making packages (see the discussion at Quarto: quarto-dev/quarto-cli#3340). {gt} has it fixed, and there's an open issue at {knitr} with more details: yihui/knitr#2289


I don't know the best way to handle this with {tinytable} though. format_tt(..., markdown = TRUE) uses the {markdown} package to convert to HTML rather than Quarto, and that's great.


One additional complication is that this also doesn't work in LaTeX output. {gt} doesn't handle it there either, but knitr::kable() does somehow (see quarto-dev/quarto-cli#3340 (comment)).

@vincentarelbundock
Owner

OK, I figured this out.

Proof of concept with bad user interface:

https://vincentarelbundock.github.io/tinytable/vignettes/tinytable.html#quarto-data-processing

Background on two issues:

  1. Quarto normally does a ton of pre-processing on all tables. By default, tinytable disables that pre-processing, because it breaks a bunch of features.
  2. Even when the pre-processing gets done, Quarto still requires users to specifically mark a cell with the special span: <span data-qmd="@Lovelace1842"></span>

On GitHub, I added a new global option to re-enable Quarto pre-processing. I also added an example to the vignette with a reference.

The user experience is terrible, but it works.

I'm not sure how to make the experience better. We can't enable pre-processing all the time, because there are tons of conflicts with nice features and styles.

What does <span data-qmd> do, exactly? Will this always interpret markdown?

Should we insert that span automatically when the global option is set and a user calls format_tt(markdown = TRUE)? Is this span a complete substitute for the markdown package?

@andrewheiss
Author

Oh cool, yeah, this is roughly what {gt} does too. You have to disable quarto processing in an option—see quarto.disable_processing here:

library(gt)
k <- data.frame(Thing = "x^2^", Citation = "@Lovelace1842")

k |>
  gt() |>
  tab_options(
    quarto.disable_processing = TRUE
  )

I'm fairly certain that when rendering to HTML, Quarto assumes that the output of chunks that create tables (with {gt}, {kableExtra}, {tinytable} and friends) is finished HTML. If that output contains Markdown that can only be resolved when the whole document is processed (like citation keys and cross references), it won't be parsed, because Quarto assumes that all the formatting has already been done (by the markdown package for {tinytable}, or whatever {gt} uses to create its HTML) and the table is ready to go. The special span tells Quarto to do some further processing on those cells (e.g. parse the citation key).

There's also a similar feature for LaTeX: a \QuartoMarkdownBase64{} command. See quarto-dev/quarto-cli#9342 for how it works, and for a bug where it works for cross references but not for citations (because of the complexity of Quarto's Lua filter ordering).

@vincentarelbundock
Owner

Thanks for the info. I've added a quarto argument to format_tt(). This means you can now do things like:

Mark a single cell for Quarto processing:

k <- data.frame(Thing = "qwerty", Citation = "@Lovelace1842")

tt(k) |> format_tt(i = 1, j = 2, quarto = TRUE)

Apply Quarto data processing to all tables using a theme and a global option:

theme_quarto <- function(x) format_tt(x, quarto = TRUE)
options(tinytable_tt_theme = theme_quarto)

tt(k)

@vincentarelbundock
Owner

Closing this now, but feel free to open separate issues if you run into issues that you believe can be fixed on tinytable's end.

@giabaio

giabaio commented Apr 18, 2024

Sorry to jump back into this, but is the fix provided by the extra quarto option meant to work on HTML output only, or is it supposed to work on PDF output too? If I run the example above with the supposed tinytable fix and try to output to PDF, I get a bunch of (Lua-related) errors...

---
title: "Reference stuff"
references:
- type: article-journal
  id: Lovelace1842
  author:
  - family: Lovelace
    given: Augusta Ada
  issued:
    date-parts:
    - - 1842
  title: >-
    Sketch of the analytical engine invented by Charles Babbage, by LF Menabrea, 
    officer of the military engineers, with notes upon the memoir by the translator
  title-short: Molecular structure of nucleic acids
  container-title: Taylor’s Scientific Memoirs
  volume: 3
  page: 666-731
  language: en-GB
---

```{r}
library(tinytable)
theme_quarto <- function(x) format_tt(x, quarto = TRUE)
options(tinytable_tt_theme = theme_quarto)

k <- data.frame(Thing = "qwerty", Citation = "@Lovelace1842")

tt(k) |> format_tt(quarto = TRUE)

options(tinytable_tt_theme=NULL)
```

Everything works OK when output to html...

@vincentarelbundock
Owner

@giabaio are you using the development version from GitHub? If so, what specific errors are you getting?

@giabaio

giabaio commented Apr 18, 2024

I am on the Github version. Here's the error message

Error running filter /home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/resources/filters/main.lua:
...arto-cli/src/resources//pandoc/datadir/lpegshortcode.lua:289: invalid UTF-8 code
stack traceback:
	...arto-cli/src/resources//pandoc/datadir/lpegshortcode.lua:289: in upvalue 'escape_unicode'
	...arto-cli/src/resources//pandoc/datadir/lpegshortcode.lua:307: in function 'lpegshortcode.wrap_lpeg_match'
	(...tail calls...)
	...qmd/quarto-cli/src/resources//pandoc/datadir/readqmd.lua:129: in function 'readqmd.readqmd'
	...qmd/quarto-cli/src/resources/filters/./common/pandoc.lua:216: in function 'string_to_quarto_ast_blocks'
	...i/src/resources/filters/./normalize/extractquartodom.lua:83: in function <...i/src/resources/filters/./normalize/extractquartodom.lua:70>
	[C]: in ?
	[C]: in method 'walk'
	...d/quarto-cli/src/resources/filters/./ast/customnodes.lua:76: in function <...d/quarto-cli/src/resources/filters/./ast/customnodes.lua:65>
	(...tail calls...)
	.../quarto-cli/src/resources/filters/./ast/runemulation.lua:82: in local 'callback'
	.../quarto-cli/src/resources/filters/./ast/runemulation.lua:100: in upvalue 'run_emulated_filter_chain'
	.../quarto-cli/src/resources/filters/./ast/runemulation.lua:136: in function <.../quarto-cli/src/resources/filters/./ast/runemulation.lua:133>
stack traceback:
	...d/quarto-cli/src/resources/filters/./ast/customnodes.lua:76: in function <...d/quarto-cli/src/resources/filters/./ast/customnodes.lua:65>
	(...tail calls...)
	.../quarto-cli/src/resources/filters/./ast/runemulation.lua:82: in local 'callback'
	.../quarto-cli/src/resources/filters/./ast/runemulation.lua:100: in upvalue 'run_emulated_filter_chain'
	.../quarto-cli/src/resources/filters/./ast/runemulation.lua:136: in function <.../quarto-cli/src/resources/filters/./ast/runemulation.lua:133>
ERROR: Error
    at renderFiles (file:///home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/command/render/render-files.ts:350:23)
    at eventLoopTick (ext:core/01_core.js:153:7)
    at async render (file:///home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/command/render/render-shared.ts:102:18)
    at async Command.actionHandler (file:///home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/command/render/cmd.ts:248:26)
    at async Command.execute (file:///home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/vendor/deno.land/x/[email protected]/command/command.ts:1948:7)
    at async Command.parseCommand (file:///home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/vendor/deno.land/x/[email protected]/command/command.ts:1780:14)
    at async quarto (file:///home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/quarto.ts:156:3)
    at async file:///home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/quarto.ts:170:5
    at async mainRunner (file:///home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/core/main.ts:35:5)
    at async file:///home/gianluca/Dropbox/Rstuff/Packages/qmd/quarto-cli/src/quarto.ts:160:3

@vincentarelbundock
Owner

Weird. I don't get the same error. Can you make sure you are also running the latest Quarto? Maybe even try the prerelease if 1.4 doesn't work.

@giabaio

giabaio commented Apr 18, 2024

I was on a fairly recent commit on quarto-cli; just updated to the latest, but I still get the same error... I am investigating further too...

@andrewheiss
Author

andrewheiss commented Apr 18, 2024

With Quarto 1.5.29 on macOS I'm getting the same error:

---
title: "Reference stuff"
references:
- type: article-journal
  id: Lovelace1842
  author:
  - family: Lovelace
    given: Augusta Ada
  issued:
    date-parts:
    - - 1842
  title: >-
    Sketch of the analytical engine invented by Charles Babbage, by LF Menabrea, 
    officer of the military engineers, with notes upon the memoir by the translator
  title-short: Molecular structure of nucleic acids
  container-title: Taylor’s Scientific Memoirs
  volume: 3
  page: 666-731
  language: en-GB
---

```{r}
library(tinytable)

x <- data.frame(Thing = 1234, Citation = "@Lovelace1842")
tt(x) |> format_tt(quarto = TRUE)
```

Here's the error:

> quarto render testing.qmd --to pdf
Error running filter /Applications/quarto/share/filters/main.lua:
/Applications/quarto/share/pandoc/datadir/lpegshortcode.lua:289: invalid UTF-8 code
stack traceback:
        /Applications/quarto/share/pandoc/datadir/lpegshortcode.lua:289: in upvalue 'escape_unicode'
        /Applications/quarto/share/pandoc/datadir/lpegshortcode.lua:307: in function 'lpegshortcode.wrap_lpeg_match'
        (...tail calls...)
        /Applications/quarto/share/pandoc/datadir/readqmd.lua:129: in function 'readqmd.readqmd'
        /Applications/quarto/share/filters/main.lua:3089: in function 'string_to_quarto_ast_blocks'
        /Applications/quarto/share/filters/main.lua:8502: in function </Applications/quarto/share/filters/main.lua:8489>
        [C]: in ?
        [C]: in method 'walk'
        /Applications/quarto/share/filters/main.lua:535: in function </Applications/quarto/share/filters/main.lua:524>
        (...tail calls...)
        /Applications/quarto/share/filters/main.lua:1312: in local 'callback'
        /Applications/quarto/share/filters/main.lua:1330: in upvalue 'run_emulated_filter_chain'
        /Applications/quarto/share/filters/main.lua:1366: in function </Applications/quarto/share/filters/main.lua:1363>
stack traceback:
        /Applications/quarto/share/filters/main.lua:535: in function </Applications/quarto/share/filters/main.lua:524>
        (...tail calls...)
        /Applications/quarto/share/filters/main.lua:1312: in local 'callback'
        /Applications/quarto/share/filters/main.lua:1330: in upvalue 'run_emulated_filter_chain'
        /Applications/quarto/share/filters/main.lua:1366: in function </Applications/quarto/share/filters/main.lua:1363>

Also, even without the error, the citation still wouldn't be processed and the @Lovelace1842 citation key would appear in the table, because right now the \QuartoMarkdownBase64{...} wrapper only works with cross-reference keys (e.g., @fig-whatever) and not with citations.

@andrewheiss
Author

andrewheiss commented Apr 18, 2024

Wait, the issue might be here:

out[i, col] <- sprintf('\\QuartoMarkdownBase64{%s}', ori[i, col, drop = TRUE])

I might be reading the code wrong here and it might already be doing it elsewhere in that file, but the content inside \QuartoMarkdownBase64{...} (or %s in the code now) needs to be base64-encoded

base64enc::base64encode(charToRaw("@Lovelace1842"))
#> QExvdmVsYWNlMTg0Mg==
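
As a rough sketch of what encoding each cell before wrapping it could look like (this is not the actual tinytable code, and the helper name wrap_quarto_latex() is made up for this example), assuming the base64enc package is installed:

```r
# Hypothetical helper, not the actual tinytable implementation:
# base64-encode each cell, then wrap it in the LaTeX command.
wrap_quarto_latex <- function(x) {
  encoded <- vapply(
    as.character(x),
    function(s) base64enc::base64encode(charToRaw(s)),
    character(1),
    USE.NAMES = FALSE
  )
  sprintf("\\QuartoMarkdownBase64{%s}", encoded)
}

cat(wrap_quarto_latex("@Lovelace1842"), "\n")
#> \QuartoMarkdownBase64{QExvdmVsYWNlMTg0Mg==}
```

The vapply() call keeps the helper vectorised, so it could be applied to a whole column at once rather than cell by cell.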

If you put the citation key in the data.frame as the base64-encoded version, it will render to PDF just fine:

```{r}
library(tinytable)

x <- data.frame(Thing = 1234, Citation = "QExvdmVsYWNlMTg0Mg==")
tt(x) |> format_tt(quarto = TRUE)
```

In the resulting PDF, the literal @Lovelace1842 is still there because of the Quarto issue, but the file renders at least.

@giabaio

giabaio commented Apr 18, 2024

I can replicate this!

@vincentarelbundock
Owner

Aaah thanks so much for the deep dive!

I was convinced this worked on my computer but it doesn't. (Was on the move without computer; sorry!)

I just pushed a new commit on GitHub which should at least give us compilation, as shown in Andrew's last post.

Thanks both!

@giabaio

giabaio commented Apr 18, 2024

Thank you both! One step closer!... :-)

@andrewheiss
Author

Oh awesome! I was just looking for a base R, no-external-packages method for base64-encoding to eliminate dependencies, but there isn't one. Everyone seems to use one of these:

  • base64enc::base64encode(charToRaw("@Lovelace1842"))
  • jsonlite::base64_enc("@Lovelace1842")
  • RCurl::base64Encode("@Lovelace1842")

I was this close to looking up the algorithm and trying to figure it out for a custom function here, but you just added base64enc to Suggests, so that fixes that :)
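
For reference, the algorithm is short enough to sketch in base R. This is only an illustration of standard base64 encoding with a made-up function name, base64_encode_string(); it is not what tinytable uses, and it handles a single string only:

```r
# Hypothetical dependency-free base64 encoder for one string.
base64_encode_string <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)
  alphabet <- c(LETTERS, letters, 0:9, "+", "/")
  bytes <- as.integer(charToRaw(x))
  n <- length(bytes)
  if (n == 0L) return("")

  # Pad with zero bytes so the input splits into 3-byte groups.
  pad <- (3L - n %% 3L) %% 3L
  m <- matrix(c(bytes, rep(0L, pad)), nrow = 3L)

  # Split each 24-bit group (one column) into four 6-bit indices.
  i1 <- m[1, ] %/% 4L
  i2 <- (m[1, ] %% 4L) * 16L + m[2, ] %/% 16L
  i3 <- (m[2, ] %% 16L) * 4L + m[3, ] %/% 64L
  i4 <- m[3, ] %% 64L
  chars <- alphabet[rbind(i1, i2, i3, i4) + 1L]

  # Characters that came only from padding bytes become "=".
  if (pad > 0L) {
    chars[(length(chars) - pad + 1L):length(chars)] <- "="
  }
  paste(chars, collapse = "")
}

base64_encode_string("@Lovelace1842")
#> [1] "QExvdmVsYWNlMTg0Mg=="
```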

@vincentarelbundock
Owner

Oh yeah, adding an optional package by Simon Urbanek feels like we're maintaining the spirit of the project 😂
