Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A different way to store custom types in JSON #17

Open
lucamug opened this issue Jun 16, 2023 · 8 comments
Open

A different way to store custom types in JSON #17

lucamug opened this issue Jun 16, 2023 · 8 comments

Comments

@lucamug
Copy link
Contributor

lucamug commented Jun 16, 2023

More a comment than an issue.

I wanted to remove the "tag" and "args" from inside the JSON output.

So I changed a couple of functions and now instead of generating this JSON
(for Red 42 Ciao as per the example in the README)

{
  "args": [
    42,
    "Ciao"
  ],
  "tag": "Red"
}

it generates this:

{
  "Red": [
    42,
    "Ciao"
  ]
}

I wonder if this has been considered as an option and if there are any concerns
about generating this type of JSON.

These are the functions that I modified to achieve this (I commented out the removed code for comparison)

variant :
    String
    -> ((List Value -> Value) -> a)
    -> Decoder v
    -> CustomCodec (a -> b) v
    -> CustomCodec b v
variant name matchPiece decoderPiece (CustomCodec am) =
    let
        enc v =
            JE.object
                -- [ ( "tag", JE.string name )
                -- , ( "args", JE.list identity v )
                -- ]
                [ ( name, JE.list identity v )
                ]
    in
    CustomCodec
        { match = am.match <| matchPiece enc
        , decoder = Dict.insert name decoderPiece am.decoder
        }

buildCustom : CustomCodec (a -> Value) a -> Codec a
buildCustom (CustomCodec am) =
    Codec
        { encoder = \v -> am.match v
        , decoder =
            -- JD.field "tag" JD.string
            --     |> JD.andThen
            --         (\tag ->
            --             case Dict.get tag am.decoder of
            --                 Nothing ->
            --                     JD.fail <| "tag " ++ tag ++ "did not match"
            --
            --                 Just dec ->
            --                     JD.field "args" dec
            --         )
            JD.oneOf
                (List.map
                    (\( tag, dec ) -> JD.field tag dec)
                    (Dict.toList am.decoder)
                )
        }
@miniBill
Copy link
Owner

miniBill commented Jun 21, 2023

Ooooh, I like this. I would probably still use andThen instead of oneOf for efficiency, but I like this change.

It can also be made in a backwards-compatible way, because even if a user picks "tag" or "args" as a tag we can just check if there are one or two fields

My main concern is... what would users prefer? I'm planning on doing a breaking release anyway, to fix all the nullable/optional/maybe fields mess, so I might have two different functions for this

@lucamug
Copy link
Contributor Author

lucamug commented Jun 22, 2023

My main concern is... what would users prefer?

Giving both options would answer this question.

Looking at the JSON with human eyes, I didn't like all the extra "tag" and "args", as these are strings that are not in my code base.

I was even thinking to create an option where these two strings could be customized ("key" and "values", for example), but then I realized that I can even remove them, so I went for this last option.

The extra thing I was thinking to clean up was about avoiding using a list if there is only one element in the payload of a custom type. For example, in the case of Red 42 just encode it as

{ "Red": 42 }

instead of

{ "Red": [ 42 ] }

but I didn't try to implement it

@miniBill
Copy link
Owner

I think a nice option would be to have:

  1. new way [in Codec and possibly also Codec.V2]
  2. old way [in Codec.V1]
  3. new way encoder but both decoders [also in Codec.V1]

So that people using the old way can migrate to either the compat way (if only Elm is reading the data) or old way (if other software is reading the data too)

My only unhappiness with this approach is that if someone updates the library without reading the changelog it will break everything without any compiler errors.

@miniBill
Copy link
Owner

I decided I'm going to implement this in elm-codec 3.0.0. The decoder will still accept the tag form, and I will have a separate module that produces the old form when encoding (but still decodes both), so that people can update without breaking by simply changing their imports.

I think I'm not going to implement the { "Red": 42 } form, because as much as I really like it, it means that any other software consuming the data would have to special case that. And if the only consumer is Elm, might as well use @MartinSStewart's elm-serialize and Bytes

@dillonkearns
Copy link

{
  "Red": [
    42,
    "Ciao"
  ]
}

The above one definitely reads more cleanly than the current format:

{
  "args": [
    42,
    "Ciao"
  ],
  "tag": "Red"
}

I think it might be worth considering how it is accessed though. For example, I often do a switch statement based on the tag like so:

  switch (body.tag) {
    case "EmptyBody": {
      return null;
    }
    case "StringBody": {
      return body.args[1];
    }
    case "BytesBody": {
      return Buffer.from(body.args[1], "base64");
    }
    case "JsonBody": {
      return JSON.stringify(body.args[0]);
    }
  }

https://github.com/dillonkearns/elm-pages/blob/36ccccfddab510119fd2e289d4966ca8b26aee9a/generator/src/request-cache.js#L275-L288

It seems a little more awkward to do a check for different tags with the proposed format. The way I can think to do it would be:

switch (Object.keys(body)[0]) {
    case "StringBody": {
        return Object.values(body)[0];
    }

Is there a better way to access it? If not, the proposed format seems less clean to access in my opinion.

@miniBill
Copy link
Owner

Another alternative would be [ "Red", 42, "Mandi" ]. This wouldn't be awkward to access from js/ts, but I don't love it either.

@dillonkearns
Copy link

Yeah, both seem more concise but also more dependent on understanding the structure than the current format. If I'm looking at something.args[0] then it states what is happening pretty clearly whether I have the format in my head or not. switch (something[0]) and something[1] to get the 0th argument seem more error prone and less intention revealing (more dependent on memorizing the format).

With Object.keys(something)[0], if I were looking at that code without context then I would be wondering whether to expect anything to be in Obect.keys(something)[1] or not (similar with Object.values(something)[0]). Not to mention that it's somewhat verbose and more advanced to access it.

@miniBill
Copy link
Owner

Mh... guess elm-codec 3 is not coming soon then :D

This idea needs more baking, I need to think of a nice way to do it.

It might end up being a Codec.Advanced module, where you'd have different options to shape your custom types encoding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants