Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot narrow Token type since v7.0.2 #2938

Open
rokoucha opened this issue Aug 13, 2023 · 17 comments
Open

Cannot narrow Token type since v7.0.2 #2938

rokoucha opened this issue Aug 13, 2023 · 17 comments

Comments

@rokoucha
Copy link

Marked version: v7.0.2

Describe the bug
Since Marked v7.0.2, cannot narrow Token type.

Tokens.Generic was added to Token in #2891 but this change makes cannot narrow Token type by Token.type.

| Tokens.Generic) & { loose?: boolean, tokens?: Token[] };

To Reproduce
Steps to reproduce the behavior:

import type { Token } from 'marked'

const t: Token = {
  type: 'text',
  raw: 'test',
  text: 'test',
}
if (t.type === 'text') {
  t // const t: (Tokens.Text & {
    //   loose?: boolean | undefined;
    //   tokens?: Token[] | undefined;
    // }) | (Tokens.Generic & {
    //   loose?: boolean | undefined;
    //   tokens?: Token[] | undefined;
    // })
  t.text // (property) Tokens.Text.text: any
}

Expected behavior

import type { Token } from 'marked'

const t: Token = {
  type: 'text',
  raw: 'test',
  text: 'test',
}
if (t.type === 'text') {
  t // const t: Tokens.Text & {
    // loose?: boolean | undefined;
    // tokens?: Token[] | undefined;
    // }
  t.text // (property) Tokens.Text.text: string
}
@UziTech
Copy link
Member

UziTech commented Aug 13, 2023

I've been running into this as well with fixing types in #2893

But even though this is less convenient it is actually correct. Just because a token has type 'text' doesn't mean that token wasn't created by an extension and returned as a Generic token with whatever properties it wants.

I don't know the best way to solve this. Either we can narrow by type or we can have a correctly typed variable. We can't do both.

@UziTech
Copy link
Member

UziTech commented Aug 13, 2023

I don't know if there is a way in typescript to say the Generic token type is a string but not one of the other type strings.

@rokoucha
Copy link
Author

An extension should override the type of Token in the right type and Marked should provide the correct type in their implementation of Marked, IMO.
Types other than Generic have useless now because it is not possible to narrow by type.

@UziTech
Copy link
Member

UziTech commented Aug 13, 2023

It is still possible to narrow with const textToken = token as Token.Text.

If you have a better idea on how to fix it PRs are always welcome 😁👍

@UziTech
Copy link
Member

UziTech commented Aug 13, 2023

IMO that is actually better, because even though it is possible that token is not type Token.Text if the type === 'text' it is explicitly stated in the code that we think token is Token.Text. If there is ever a bug because the token is not a Token.Text it is easier to see what assumptions we might have gotten wrong.

@UziTech
Copy link
Member

UziTech commented Aug 13, 2023

I do agree that we should limit the type string in Generic to any string that is not one of the main token types. If you can find a way to do that in typescript that would be very helpful.

@rokoucha
Copy link
Author

I'm confusing how to identify a Token safely without Token.type.
How do I identify 'space' and 'hr' without Token.type? they are the same keys and type of value.

I believe Marked will return the correct Token and can identify that with Token.type if never extensions are loaded. is that wrong?

@UziTech
Copy link
Member

UziTech commented Aug 13, 2023

I agree that token.type should identify the token. I just don't know how to do that in typescript when Token.Generic.type can be any string.

Using as Token.Text just tells typescript that even though this could be Token.Generic I assume it is Token.Text.

The solution is not as simple as removing Token.Generic from Token. That would cause a lot of places to use any which is what was fixed in 7.0.2

@rokoucha
Copy link
Author

I'm trying to improve the type of Token without Token.Generic in https://github.com/rokoucha/marked/tree/improve-token-typing .
It passes the tests and type-checking, and it's good enough, but what's the problem?

@rokoucha
Copy link
Author

I don't like non-null assertion operators and it seemed to me that some token objects were of the wrong type.

@UziTech
Copy link
Member

UziTech commented Aug 19, 2023

what's the problem?

First thing I see is that the extension tokenizer can only return one of the current tokens. This is totally wrong and will break almost every extension. Extension tokenizers can return any object as a token as long as it has a type and raw property. Hence the Tokens.Generic type.

@rokoucha
Copy link
Author

rokoucha commented Aug 20, 2023

OK, I see...
It's impossible to implement types that are extended by extensions before loading extensions, so I think Marked should provide extensible types, not Generic types.

For example, declaration merging and ambient module declarations allow extensions to define the correct types.

extend.d.ts

import 'marked'

declare module 'marked' {
  // It's unsightly but I have to do this
  export type Token = (
    | Tokens.Space
    | Tokens.Code
    | Tokens.Heading
    | Tokens.Table
    | Tokens.Hr
    | Tokens.Blockquote
    | Tokens.List
    | Tokens.ListItem
    | Tokens.Paragraph
    | Tokens.HTML
    | Tokens.Text
    | Tokens.Def
    | Tokens.Escape
    | Tokens.Tag
    | Tokens.Image
    | Tokens.Link
    | Tokens.Strong
    | Tokens.Em
    | Tokens.Codespan
    | Tokens.Br
    | Tokens.Del
    | Tokens.Custom // add custom Token type here
  ) & {
    loose?: boolean
    tokens?: Token[]
  }

  namespace Tokens {
    // Extend existing token
    interface Text {
      extended: true
    }
    // Something completely new token
    interface Custom {
      type: 'custom'
      customize: true
    }
  }
}

example.ts

import { Token, Tokens } from 'marked'

const text = {
  type: 'text',
  text: 'hoge',
  raw: 'hoge',
  extended: true,
} satisfies Tokens.Text // pass!

const custom = {
  type: 'custom',
  customize: true,
} satisfies Tokens.Extended // it also passes!!

const _ts = [text, extended] satisfies Token[] // this will also pass!!!

@UziTech
Copy link
Member

UziTech commented Aug 20, 2023

rebase on the latest version of marked. I fixed a few types and removed & { loose?: boolean, tokens?: Token[] } from the Token type.

It could work to extend the token types with what extensions could return but I don't know enough about typescript to know what that would look like. However the fact still remains that an extension can use one of the current types but not have the same properties. I think this is just a limitation of typescript that we can't say an extension can return an object with type "text" but it must have these properties.

@rokoucha
Copy link
Author

Yes, TypeScript will merge interfaces so cannot override existing tokens by extensions.
I'm looking for a way, but haven't found one yet.

@mikew
Copy link

mikew commented Mar 23, 2024

It's annoying not being able to narrow, and I haven't invest much time but feel like there is a better way to allow generic / custom tokens (maybe it's on marked to accept Token | TokenGeneric wherever it accepts Token), but this will work for downstream users:

function isMarkedToken<T extends Token>(
  obj: unknown,
  markedType: T['type'],
): obj is T {
  return (
    typeof obj === 'object' &&
    obj != null &&
    'type' in obj &&
    obj.type === markedType
  )
}

const token: Token = ...
if (isMarkedToken<Tokens.Heading>(token, 'heading')) {
  // token is `Token.Heading`
}

@vladnicula
Copy link

I wish I understood the necessity of the generic token to provide some kind of guidance or maybe a small PR for this. It seems to me that this entire issue is detrimental for developers that don't want to worry about the implementation details of the marked package.

I just wanted to get some standard markdown nodes out of markdown docs. I know typescript, and I often use a switch by type when working with type unions like the Token one to narrow down the type. This decision to have the generic token has derailed my implementation flow and has added additional complexity that requires entire phrases to explain during a code review, and it relates to the library, not the feature of the app we are building.

If someone can point me to the reasoning for the generic token I'd like to take a look.

@UziTech
Copy link
Member

UziTech commented Sep 4, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants