Add exit_object() functionality #51

Open
bogdanCsn opened this issue Feb 5, 2016 · 10 comments

Comments

@bogdanCsn

In nested JSON, it would be useful to be able to exit an object you have entered: a function that does the opposite of enter_object(). Alternatively, enter_object() could be changed to support full and relative paths, so that, for example, "." refers to the top level and ".." refers to the previous level. Here is a code example:

library(tidyjson)
library(dplyr)


input <- '{
      "name": "Bob",
      "age": 30,
      "social": {
            "married": "yes",
            "kids": "no"
      },
      "work": {
            "title": "engineer",
            "salary": 5000
      } 
}'


output <- input %>% as.tbl_json() %>%
      spread_values(name = jstring("name"),
                    age = jnumber("age")) %>%
      enter_object("social") %>% 
      spread_values(married = jstring("married"),
                    kids = jstring("kids")) %>%
      #### I would need an exit_object() here
      enter_object("work") %>%
      spread_values(title = jstring("title"),
                    salary = jnumber("salary"))
@jeffwong-nflx

Bump on this feature!

@arochettesteinberg

This is not just a feature request; it is a real problem if nobody can explain how to work around it. How do you access several nested objects within one JSON object? This comes up often; it is not something exotic.

@timwilliate

+1 on this feature. It would make it possible to parse multiple nested objects in a single pipeline.

@rpalloni

+1, this function is absolutely needed. Alternatively, parallel pipes to get data from multiple objects would help. Real-world JSON is harder than the tidyjson examples...

@pgensler

+1, this is definitely needed. @arochettesteinberg the only way I know to use enter_object() is as the last step when piping, especially when working with jsonlite-style data.

@colearendt

Please see the discussion here. If you can provide a reprex there, that would be fantastic. The jstring('social','married') solution, which uses a more complex path as discussed there, solves the example above as well.

Explicitly:

output <- input %>% as.tbl_json() %>%
      spread_values(name = jstring("name"),
                    age = jnumber("age")) %>%
      spread_values(married = jstring("social","married"),
                    kids = jstring("social","kids")) %>%
      spread_values(title = jstring("work","title"),
                    salary = jnumber("work","salary"))

@pgensler

@robpalgit Is this a good example of the kind of data you run into? It is jsonlite-style data, but I think an example like this might better show how to use tidyjson appropriately to parse it. This works to parse out most of the data, and using a separate spread_values call for each element makes sense to me, since each is a separate object. Otherwise, you would have to do something like THIS approach, where you put every single command into one massive pipeline, which seems a bit unwieldy to me:

colearendt/tidyjson#95

#One interesting thing to note is that I am using the spread_values command at the end of the chain sequence, which didn't seem like it was possible based on the vignette

pacman::p_load("tidyjson","magrittr")
###Something to note is that the \\t must be escaped for R, AND the string must be in '' to parse properly
##data is jsonlite format
poop1 <- '{"review/appearance": 2.5,"beer/style": "Hefeweizen", "review/palate": 1.5, "review/taste": 1.5, "beer/name": "Sausa Weizen", "review/timeUnix": 1234817823, "beer/ABV": 5.0, "beer/beerId": "47986", "beer/brewerId": "10325", "review/timeStruct": {"isdst": 0, "mday": 16, "hour": 20, "min": 57, "sec": 3, "mon": 2, "year": 2009, "yday": 47, "wday": 0}, "review/overall": 1.5, "review/text": "A lot of foam. But a lot In the smell some banana, and then lactic and tart. Not a good start.\\tQuite dark orange in color, with a lively carbonation (now visible, under the foam).\\tAgain tending to lactic sourness.\\tSame for the taste. With some yeast and banana.", "user/profileName": "stcules", "review/aroma": 2.0}'
poop2 <- '{"review/appearance": 3.0, "beer/style": "English Strong Ale", "review/palate": 3.0, "review/taste": 3.0, "beer/name": "Red Moon", "review/timeUnix": 1235915097, "beer/ABV": 6.2, "beer/beerId": "48213", "beer/brewerId": "10325", "review/timeStruct": {"isdst": 0, "mday": 1, "hour": 13, "min": 44, "sec": 57, "mon": 3, "year": 2009, "yday": 60, "wday": 6}, "review/overall": 3.0, "review/text": "Dark red color, light beige foam, average.\\tIn the smell malt and caramel, not really light.\\tAgain malt and caramel in the taste, not bad in the end.\\tMaybe a note of honey in teh back, and a light fruitiness.\\tAverage body.\\tIn the aftertaste a light bitterness, with the malt and red fruit.\\tNothing exceptional, but not bad, drinkable beer.", "user/profileName": "stcules", "review/aroma": 2.5}'

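# Top-level keys are spread first. Extractors are matched to the JSON types:
# unquoted numbers use jnumber(), quoted values use jstring().
clean  <- poop2 %>%
  spread_values(
    review_appearance = jnumber("review/appearance"),
    beer_style = jstring("beer/style"),
    review_palate = jnumber("review/palate"),
    review_taste = jnumber("review/taste"),
    beer_name = jstring("beer/name"),
    review_time = jnumber("review/timeUnix"),
    beer_ABV = jnumber("beer/ABV"),
    beer_beerid = jstring("beer/beerId"),
    beer_breweryid = jstring("beer/brewerId"),
    review_overall = jnumber("review/overall"),
    review_text = jstring("review/text"),
    profile_name = jstring("user/profileName"),
    review_aroma = jnumber("review/aroma")
  ) %>% 
  # The time fields live inside the nested "review/timeStruct" object,
  # so each one is addressed with a two-element path.
  spread_values(
    isdst = jnumber("review/timeStruct", "isdst"),
    mday = jnumber("review/timeStruct", "mday"),
    hour = jnumber("review/timeStruct", "hour"),
    min = jnumber("review/timeStruct", "min"),
    sec = jnumber("review/timeStruct", "sec"),
    mon = jnumber("review/timeStruct", "mon"),
    year = jnumber("review/timeStruct", "year"),
    yday = jnumber("review/timeStruct", "yday"),
    wday = jnumber("review/timeStruct", "wday")
  )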

@ghost

ghost commented Jun 19, 2017

An exit_object() feature would be especially valuable, particularly if one needs to enter a nested repeating structure, then exit out and enter another one to obtain all the data. Currently this is not possible without workarounds.
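
One workaround of the kind mentioned above can be sketched as one pipeline per nested array, with document.id as the shared key between the resulting tables. The JSON and the names doc, phones, and emails are made up for illustration; only as.tbl_json(), enter_object(), gather_array(), and spread_values() are tidyjson functions.

```r
library(tidyjson)
library(dplyr)

# Hypothetical document with two sibling arrays of objects
doc <- '{
  "name": "Bob",
  "phones": [{"num": "555-0100"}, {"num": "555-0199"}],
  "emails": [{"addr": "bob@example.com"}]
}'

# First pipeline: enter "phones", stack the array, spread each element
phones <- doc %>% as.tbl_json() %>%
  enter_object("phones") %>%
  gather_array() %>%
  spread_values(num = jstring("num"))

# Second pipeline starts again from the top level instead of "exiting"
emails <- doc %>% as.tbl_json() %>%
  enter_object("emails") %>%
  gather_array() %>%
  spread_values(addr = jstring("addr"))
```

Each result keeps the document.id column, so the two tables can be related afterwards. This is a sketch of the workaround, not a replacement for the requested feature.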

@colearendt

@morebento If you can come up with a minimal use case / reprex, please post it here. At present, most of our use cases seem to revolve around better articulating the use of complex paths like json %>% spread_values(test=jnumber('one','two')) to pull information out of nested objects. The other piece of functionality worth looking at is ways to reduce the amount of typing, which can presently be done with spread_all, a function in the development version:

devtools::install_github('jeremystan/tidyjson')
library(tidyjson)

json <- '{"a":1,"b":"test","c":3}'
json %>% spread_all()

Thanks!
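
As a hedged illustration of how spread_all() might apply to the nested example from the top of this thread: assuming a tidyjson version in which spread_all() spreads nested objects recursively and joins the names with "." (its documented defaults, recursive = TRUE and sep = "."), the nested fields should come out as ordinary columns without any enter_object() call. The variable name out is arbitrary.

```r
library(tidyjson)

input <- '{
      "name": "Bob",
      "age": 30,
      "social": {
            "married": "yes",
            "kids": "no"
      },
      "work": {
            "title": "engineer",
            "salary": 5000
      }
}'

# Nested objects are flattened into columns such as social.married
out <- input %>% spread_all()

names(out)
```

If the installed version lacks recursive spreading, the complex-path form jstring("social", "married") shown earlier in the thread remains the fallback.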

7 participants