Example suggestion: How to configure tokenizers #115

Open
jaydenseric opened this issue Jan 19, 2021 · 3 comments

@jaydenseric
Contributor

jaydenseric commented Jan 19, 2021

I'm trying to migrate jsdoc-md to the comment-parser v1.x API, but I can't find any documentation on how to configure tokenizers for parsing standard JSDoc tags. Here is the old code:

https://github.com/jaydenseric/jsdoc-md/blob/98effb0b4d45af041e8ce91d6659512b53cbdfbb/private/jsdocCommentToMember.js#L3

Ideally the example will use deep require paths to just the functions needed (rather than getting things from index files), for a minimal memory footprint and bundle size.

@syavorsky
Owner

I believe you have already found this example. As you pointed out, though, it is not clear how to get the tokenizers.

CommonJS option

const {default: tag} = require('comment-parser/lib/parser/tokenizers/tag')
const {default: type} = require('comment-parser/lib/parser/tokenizers/type')
const {default: name} = require('comment-parser/lib/parser/tokenizers/name')
const {default: description} = require('comment-parser/lib/parser/tokenizers/description')

Or use ES6 imports, which might work for you with the right tooling. I still have to tune the ES6 build for distribution.

import tag from "./es6/parser/tokenizers/tag"
...

Take a look at what the existing tokenizers do; I don't have a written guide on implementing one yet. See

@jaydenseric
Contributor Author

For other people migrating from v0.x to v1, here is a jsdoc-md diff to reference:

jaydenseric/jsdoc-md@ebcdbf5#diff-04e9c1ef0da56feec83db032c6ee7f9db1230accf7530a0767902d21be9faf85

One thing I would like to investigate is if the comment-parser tokenizer can be configured to skip work for JSDoc tags not in an arbitrary whitelist. This would prevent it from trying to tokenize JSDoc tags we're not interested in, which most of the time don't fit the default behavior that expects a type, name, and description to be present. @syavorsky is there a way to do this?

The next jsdoc-md version is currently a work in progress, but I have a large amount of work locally that is nearly ready to push up and hopefully publish today. For errors, the CLI will display syntax-highlighted ranges of the problematic JSDoc code right in the terminal.

As I mentioned here, I've been working the past few weeks on a brand new JSDoc comment parser package, that has source location data for every node in the JSDoc AST (relating both to just the doclet, and the whole code file). I got about 80% of the way there, but now that comment-parser v1 is out and it makes it possible (with a bit of manual work) to figure out line and column numbers for JSDoc block tag spans, I couldn't justify spending another few weeks on my own solution. Frankly it was tiring me out! @syavorsky I appreciate your work :)

Once the next major version of jsdoc-md is published I will share my utility function that extracts source line and column numbers for JSDoc block tag spans for a given span token name, i.e. tag, name, type, or description.
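
Until that utility is shared, the "manual work" mentioned above can be sketched roughly as follows. This is not the jsdoc-md implementation. It assumes that each parsed source line carries a `number` and a `tokens` object, and that concatenating the tokens in the order listed in `TOKEN_ORDER` reproduces the original line; the helper name `locateToken` and the sample data are hypothetical.

```javascript
// Assumed order in which one source line is split into tokens.
const TOKEN_ORDER = [
  'start', 'delimiter', 'postDelimiter',
  'tag', 'postTag',
  'type', 'postType',
  'name', 'postName',
  'description', 'end',
];

/**
 * Finds where a span token sits within its source line (hypothetical helper).
 * @param {{number: number, tokens: Object<string, string>}} line One source line.
 * @param {string} tokenName E.g. 'tag', 'type', 'name', or 'description'.
 * @returns {{line: number, column: number, length: number} | null} Zero-based
 *   column, or null if the token is unknown or empty on this line.
 */
function locateToken(line, tokenName) {
  const value = line.tokens[tokenName];
  if (!TOKEN_ORDER.includes(tokenName) || !value) return null;

  // The column is the combined length of every token that precedes this one.
  let column = 0;
  for (const key of TOKEN_ORDER) {
    if (key === tokenName) break;
    column += (line.tokens[key] || '').length;
  }

  return { line: line.number, column, length: value.length };
}

// Hand-built stand-in for the line ` * @param {string} name The name.`
const exampleLine = {
  number: 3,
  tokens: {
    start: ' ', delimiter: '*', postDelimiter: ' ',
    tag: '@param', postTag: ' ',
    type: '{string}', postType: ' ',
    name: 'name', postName: ' ',
    description: 'The name.', end: '',
  },
};

console.log(locateToken(exampleLine, 'type'));
// → { line: 3, column: 10, length: 8 }
```

The line number here is whatever the parser assigned, so it depends on the configured start line; a real utility would also need to handle spans that continue over several lines.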

@syavorsky
Owner

syavorsky commented Jan 22, 2021

Great, I am trying to make comment-parser a flexible low-level parser for tools like this.

> One thing I would like to investigate is if the comment-parser tokenizer can be configured to skip work for JSDoc tags not in an arbitrary whitelist.

There is no such option. The parser processes the entire source over a few stages. I haven't done any benchmarking, but I would be interested to find out what volume of input data would show a noticeable performance boost from the proposed optimization.
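
Without a built-in option, one possible workaround is to wrap individual tokenizers so they return early for tags outside a whitelist. The sketch below assumes that a tokenizer is a plain function from a spec to a spec and that an earlier tokenizer in the chain has already set `spec.tag`. The helper `onlyForTags`, the two stand-in tokenizers, and the simplified spec shape are all hypothetical; they are not comment-parser's own code. The wrapper only skips the per-tag tokenizing step, so the other parser stages still run over the whole source, and whether this gives any measurable gain is untested.

```javascript
/**
 * Wraps a tokenizer so it only runs for whitelisted tag names
 * (hypothetical helper; assumes `spec.tag` is set without the "@").
 */
function onlyForTags(allowedTags, tokenizer) {
  const allowed = new Set(allowedTags);
  return (spec) => (allowed.has(spec.tag) ? tokenizer(spec) : spec);
}

// Stand-in tag tokenizer: moves a leading "@tag" off the remaining text.
const standInTag = (spec) => {
  const match = spec.description.match(/^@(\S+)\s*/);
  if (match === null) return spec;
  return {
    ...spec,
    tag: match[1],
    description: spec.description.slice(match[0].length),
  };
};

// Stand-in type tokenizer: moves a leading "{type}" off the remaining text.
const standInType = (spec) => {
  const match = spec.description.match(/^\{([^}]*)\}\s*/);
  if (match === null) return spec;
  return {
    ...spec,
    type: match[1],
    description: spec.description.slice(match[0].length),
  };
};

// The type tokenizer only does work for @param and @returns.
const tokenizers = [standInTag, onlyForTags(['param', 'returns'], standInType)];

// Runs the chain in order, each tokenizer receiving the previous result.
const tokenize = (text) =>
  tokenizers.reduce((spec, fn) => fn(spec), {
    tag: '',
    type: '',
    description: text,
  });

console.log(tokenize('@param {string} name The name.'));
// → { tag: 'param', type: 'string', description: 'name The name.' }
console.log(tokenize('@see {@link foo}'));
// → { tag: 'see', type: '', description: '{@link foo}' }
```

In the second call the braces are left in the description instead of being misread as a type, which is the kind of ill-fitting default behavior the whitelist is meant to avoid.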

It may get tricky, though, if you need to stringify the data back. For that you would need to iterate over Block.tags[].source instead of Block.source, which would need minor API tweaks (UPD: created #118)
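
As a rough illustration of iterating over `Block.tags[].source`, the sketch below rebuilds text from the source lines of selected tags only. It assumes each tag has a `source` array whose entries keep the original line text in a `source` string. The function `stringifyTags` and the sample block are hypothetical, and this is not the API change tracked in #118.

```javascript
/**
 * Rebuilds text from the source lines of selected tags only
 * (hypothetical helper, working on an assumed block shape).
 * @param {{tags: Array<{tag: string, source: Array<{source: string}>}>}} block
 * @param {(tagName: string) => boolean} keepTag Decides which tags to keep.
 * @returns {string} The kept lines joined with newlines.
 */
function stringifyTags(block, keepTag) {
  return block.tags
    .filter((tag) => keepTag(tag.tag))
    .flatMap((tag) => tag.source)
    .map((line) => line.source)
    .join('\n');
}

// Hand-built stand-in for a parsed block with two tags.
const exampleBlock = {
  tags: [
    {
      tag: 'param',
      source: [
        { number: 2, source: ' * @param {string} name The name,' },
        { number: 3, source: ' *   continued on a second line.' },
      ],
    },
    {
      tag: 'see',
      source: [{ number: 4, source: ' * @see somewhere' }],
    },
  ],
};

console.log(stringifyTags(exampleBlock, (name) => name === 'param'));
```

Because only tag lines are visited, anything that lives outside the tags, such as the block's opening line and its description, is not part of the output and would have to be added back separately.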
