goldmark-jupyter: support cell attachments in markdown cells (#5)
This new package provides 2 extensions:
- jupyter.Attachments (goldmark)
- jupyter.Goldmark (nb)
Showing 9 changed files with 484 additions and 0 deletions.
@@ -0,0 +1,21 @@ | ||
MIT License | ||
|
||
Copyright (c) 2024 Dmytro Solovei | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
@@ -0,0 +1,85 @@ | ||
# goldmark-jupyter | ||
|
||
From `nbformat` documentation: | ||
|
||
```txt | ||
Markdown (and raw) cells can have a number of attachments, typically inline images, that can be referenced in the markdown content of a cell. 🖇 | ||
(punctuation mine) | ||
``` | ||
|
||
`goldmark-jupyter` helps [`goldmark`](https://github.com/yuin/goldmark) recognise [cell attachments](https://nbformat.readthedocs.io/en/latest/format_description.html#cell-attachments) and include them in the rendered markdown correctly. | ||
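To make the mechanism concrete, here is a minimal standalone sketch (standard library only, not this package's API) of what happens to an attachment. `nbformat` stores each attachment as a map from MIME type to base64-encoded data, and the image can then be inlined as a `data:` URI. The `attachments` type, the `dataURI` helper and the sample payload are hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// attachments mirrors the "attachments" field of a notebook cell:
// filename -> MIME type -> base64-encoded data.
// (Illustrative type; the package itself works with schema.Attachments.)
type attachments map[string]map[string]string

// dataURI is a hypothetical helper that builds a data: URI for the named
// attachment. It reports false if the file is not attached to the cell.
func dataURI(att attachments, filename string) (string, bool) {
	bundle, ok := att[filename]
	if !ok {
		return "", false
	}
	// A bundle usually holds a single MIME type; take the first one found.
	for mime, data := range bundle {
		return "data:" + mime + ";base64," + data, true
	}
	return "", false
}

func main() {
	// Placeholder payload: "aGVsbG8=" is not a real image.
	raw := []byte(`{"image.png": {"image/png": "aGVsbG8="}}`)

	var att attachments
	if err := json.Unmarshal(raw, &att); err != nil {
		panic(err)
	}

	if uri, ok := dataURI(att, "image.png"); ok {
		fmt.Printf("<img src=%q alt=\"img\">\n", uri)
	}
}
```

Without such a substitution, a destination like `attachment:image.png` is emitted verbatim and the browser has nothing to load, which is the difference the comparison table below illustrates.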
|
||
|
||
| `goldmark` | `goldmark-jupyter` | | ||
| ----------- | ----------- | | ||
| ![img](./assets/goldmark.png) | ![img](./assets/goldmark-jupyter.png) | | ||
|
||
## Installation | ||
|
||
```sh | ||
go get github.com/bevzzz/nb/extension/extra/goldmark-jupyter | ||
``` | ||
|
||
## Usage | ||
|
||
Package `goldmark-jupyter` exports 2 dedicated extensions for `goldmark` and `nb`, which should be used together like so: | ||
|
||
```go | ||
import ( | ||
"os" | ||
"github.com/bevzzz/nb" | ||
jupyter "github.com/bevzzz/nb/extension/extra/goldmark-jupyter" | ||
"github.com/yuin/goldmark" | ||
) | ||
|
||
// Teach goldmark to resolve "attachment:" image links. | ||
md := goldmark.New( | ||
goldmark.WithExtensions( | ||
jupyter.Attachments(), | ||
), | ||
) | ||
|
||
// Let nb render markdown cells with the extended goldmark instance. | ||
c := nb.New( | ||
nb.WithExtensions( | ||
jupyter.Goldmark(md), | ||
), | ||
) | ||
|
||
// b holds the raw notebook (.ipynb) bytes. | ||
if err := c.Convert(os.Stdout, b); err != nil { | ||
panic(err) | ||
} | ||
``` | ||
|
||
`Attachments` extends the default `goldmark.Markdown` with a custom link parser and an image renderer. The renderer accepts `html.Option` values, which can be passed to the constructor: | ||
|
||
```go | ||
import ( | ||
jupyter "github.com/bevzzz/nb/extension/extra/goldmark-jupyter" | ||
"github.com/yuin/goldmark" | ||
"github.com/yuin/goldmark/renderer/html" | ||
) | ||
|
||
md := goldmark.New( | ||
goldmark.WithExtensions( | ||
jupyter.Attachments( | ||
html.WithXHTML(), | ||
html.WithUnsafe(), | ||
), | ||
), | ||
) | ||
``` | ||
|
||
Note, however, that options not applicable to image rendering will have no effect. At the time of writing, `goldmark v1.6.0` consults only these options when rendering images: | ||
|
||
- `WithXHTML()` | ||
- `WithUnsafe()` | ||
- `WithWriter(w)` | ||
|
||
## Contributing | ||
|
||
Thank you for giving `goldmark-jupyter` a run! | ||
|
||
If you find a bug that needs fixing or a feature that needs adding, please consider describing it in an issue or opening a PR. | ||
|
||
## License | ||
|
||
This software is released under [the MIT License](https://opensource.org/license/mit/). |
@@ -0,0 +1,206 @@ | ||
// Package jupyter provides extensions for goldmark and nb. Together they add support | ||
// for inline images, which have their data stored as cell attachments, in markdown cells. | ||
// | ||
// How it is achieved: | ||
// | ||
// 1. Goldmark extends nb with a custom "markdown" cell renderer which | ||
// stores cell attachments to the parser.Context on every render. | ||
// | ||
// 2. Attachments extends goldmark with a custom link parser (ast.KindLink) | ||
// and an image NodeRenderFunc. | ||
// | ||
// The parser is context-aware and will get the related mime-bundle from the context | ||
// and store it to node attributes for every link whose destination looks like "attachment:image.png". | ||
// | ||
// Custom image renderer writes base64-encoded data from the mime-bundle if one's present, | ||
// falling back to the destination URL. | ||
package jupyter | ||
|
||
import ( | ||
"io" | ||
"regexp" | ||
|
||
"github.com/bevzzz/nb" | ||
"github.com/bevzzz/nb/extension" | ||
"github.com/bevzzz/nb/schema" | ||
"github.com/yuin/goldmark" | ||
"github.com/yuin/goldmark/ast" | ||
"github.com/yuin/goldmark/parser" | ||
"github.com/yuin/goldmark/renderer" | ||
"github.com/yuin/goldmark/renderer/html" | ||
"github.com/yuin/goldmark/text" | ||
"github.com/yuin/goldmark/util" | ||
) | ||
|
||
// Attachments adds support for Jupyter [cell attachments] to goldmark parser and renderer. | ||
// | ||
// [cell attachments]: https://nbformat.readthedocs.io/en/latest/format_description.html#cell-attachments | ||
func Attachments(opts ...html.Option) goldmark.Extender { | ||
c := html.NewConfig() | ||
for _, opt := range opts { | ||
opt.SetHTMLOption(&c) | ||
} | ||
return &attachments{ | ||
config: c, | ||
} | ||
} | ||
|
||
// Goldmark overrides the default rendering function for markdown cells | ||
// and stores cell attachments to the parser.Context on every render. | ||
func Goldmark(md goldmark.Markdown) nb.Extension { | ||
return extension.NewMarkdown( | ||
func(w io.Writer, c schema.Cell) error { | ||
ctx := newContext(c) | ||
return md.Convert(c.Text(), w, parser.WithContext(ctx)) | ||
}, | ||
) | ||
} | ||
|
||
var ( | ||
// key is a context key for storing cell attachments. | ||
key = parser.NewContextKey() | ||
|
||
// name is the name of a node attribute that holds the mime-bundle. | ||
// This package uses node attributes as a proxy for rendering context, | ||
// so <mime-bundle> will never be added to the HTML output. The name is | ||
// intentionally [invalid] to avoid name-clashes with other potential attributes. | ||
// | ||
// [invalid]: https://www.w3.org/TR/2011/WD-html5-20110525/syntax.html#attributes-0 | ||
name = []byte("<mime-bundle>") | ||
) | ||
|
||
// newContext adds mime-bundles from cell attachments to a new parser.Context. | ||
func newContext(cell schema.Cell) parser.Context { | ||
ctx := parser.NewContext() | ||
if c, ok := cell.(schema.HasAttachments); ok { | ||
ctx.Set(key, c.Attachments()) | ||
} | ||
return ctx | ||
} | ||
|
||
// linkParser adds base64-encoded image data from parser.Context to node's attributes. | ||
type linkParser struct { | ||
link parser.InlineParser // link is goldmark's default link parser. | ||
} | ||
|
||
func newLinkParser() *linkParser { | ||
return &linkParser{ | ||
link: parser.NewLinkParser(), | ||
} | ||
} | ||
|
||
var _ parser.InlineParser = (*linkParser)(nil) | ||
|
||
func (p *linkParser) Trigger() []byte { | ||
return p.link.Trigger() | ||
} | ||
|
||
// attachedFile retrieves the name of the attached file from the link's destination. | ||
var attachedFile = regexp.MustCompile(`attachment:(\w+\.\w+)$`) | ||
|
||
// Parse stores mime-bundle in node attributes for links whose destination is an attachment. | ||
func (p *linkParser) Parse(parent ast.Node, block text.Reader, pc parser.Context) (n ast.Node) { | ||
n = p.link.Parse(parent, block, pc) | ||
|
||
img, ok := n.(*ast.Image) | ||
if !ok { | ||
// goldmark's default link parser will return a "state node" whenever it's triggered | ||
// by the opening bracket of the link's alt-text "[" or any intermediate characters. | ||
// We only want to intercept when the link is done parsing and we get a valid *ast.Image. | ||
return n | ||
} | ||
|
||
submatch := attachedFile.FindSubmatch(img.Destination) | ||
if len(submatch) < 2 { | ||
return | ||
} | ||
filename := submatch[1] | ||
|
||
att, ok := pc.Get(key).(schema.Attachments) | ||
if att == nil || !ok { | ||
return | ||
} | ||
|
||
// Look up the mime-bundle by filename and keep it on the node for the renderer. | ||
data := att.MimeBundle(string(filename)) | ||
n.SetAttribute(name, data) | ||
return | ||
} | ||
|
||
// image renders inline images from cell attachments. | ||
type image struct { | ||
html.Config | ||
} | ||
|
||
var _ renderer.NodeRenderer = (*image)(nil) | ||
|
||
func (img *image) RegisterFuncs(reg renderer.NodeRendererFuncRegisterer) { | ||
reg.Register(ast.KindImage, img.render) | ||
} | ||
|
||
// render borrows heavily from goldmark's [renderImage]. | ||
// | ||
// [renderImage]: https://github.com/yuin/goldmark/blob/90c46e0829c11ca8d1010856b2a6f6f88bfc68a3/renderer/html/html.go#L673 | ||
func (img *image) render(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) { | ||
if !entering { | ||
return ast.WalkContinue, nil | ||
} | ||
|
||
n := node.(*ast.Image) | ||
_, _ = w.WriteString("<img src=\"") | ||
|
||
attr, hasAttachments := n.Attribute(name) | ||
if !hasAttachments { | ||
if img.Unsafe || !html.IsDangerousURL(n.Destination) { | ||
_, _ = w.Write(util.EscapeHTML(util.URLEscape(n.Destination, true))) | ||
} | ||
} else if mb, ok := attr.(schema.MimeBundle); ok { | ||
// Here we do not need to extract the filename again, as it is sufficient | ||
// that the mime-bundle is present in the attributes. | ||
io.WriteString(w, "data:") | ||
io.WriteString(w, mb.MimeType()) | ||
io.WriteString(w, ";base64,") | ||
w.Write(mb.Text()) | ||
} | ||
|
||
_, _ = w.WriteString(`" alt="`) | ||
_, _ = w.Write(nodeToHTMLText(n, source)) | ||
_ = w.WriteByte('"') | ||
|
||
if n.Title != nil { | ||
_, _ = w.WriteString(` title="`) | ||
img.Writer.Write(w, n.Title) | ||
_ = w.WriteByte('"') | ||
} | ||
|
||
if n.Attributes() != nil { | ||
html.RenderAttributes(w, n, html.ImageAttributeFilter) | ||
} | ||
|
||
if img.XHTML { | ||
_, _ = w.WriteString(" />") | ||
} else { | ||
_, _ = w.WriteString(">") | ||
} | ||
|
||
return ast.WalkSkipChildren, nil | ||
} | ||
|
||
// attachments implements goldmark.Extender. | ||
type attachments struct { | ||
config html.Config | ||
} | ||
|
||
var _ goldmark.Extender = (*attachments)(nil) | ||
|
||
// Extend adds a custom link parser and image renderer. | ||
// | ||
// Priorities are selected based on the ones used in goldmark. | ||
func (a *attachments) Extend(md goldmark.Markdown) { | ||
md.Parser().AddOptions( | ||
parser.WithInlineParsers(util.Prioritized(newLinkParser(), 199)), // default: 200 | ||
) | ||
md.Renderer().AddOptions( | ||
renderer.WithNodeRenderers(util.Prioritized(&image{Config: a.config}, 999)), // default: 1000 | ||
) | ||
} |
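
The `attachedFile` pattern in the link parser above decides which image destinations are treated as attachments. The standalone sketch below reuses the same expression to show which destinations it accepts; the `attachmentName` helper and the sample inputs are illustrative and not part of the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same expression as attachedFile in the package source.
var attachedFile = regexp.MustCompile(`attachment:(\w+\.\w+)$`)

// attachmentName is a hypothetical helper that extracts the attached
// file's name from a link destination, if the destination refers to one.
func attachmentName(dest string) (string, bool) {
	m := attachedFile.FindStringSubmatch(dest)
	if len(m) < 2 {
		return "", false
	}
	return m[1], true
}

func main() {
	for _, dest := range []string{
		"attachment:image.png",
		"attachment:my-image.png",       // hyphen is not matched by \w
		"https://example.com/image.png", // ordinary URL
	} {
		name, ok := attachmentName(dest)
		fmt.Printf("%q -> %q (match: %v)\n", dest, name, ok)
	}
}
```

Because `\w` covers only letters, digits and underscores, a destination such as `attachment:my-image.png` is not recognised by this pattern; in that case no mime-bundle is attached to the node and the renderer falls back to writing the destination URL.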