New Collabra paper draft #568

rempsyc · 2023-04-02T17:42:58Z

New Collabra paper draft.

Real time PDF: https://github.com/easystats/performance/blob/collabra_paper/papers/Collabra/paper.pdf

Edit: reminder to use [skip ci] in this PR

Follow-up on #544

rempsyc · 2023-04-02T19:09:29Z

Should we streamline the review process for Collabra?

Streamlined review?
Authors whose articles have been rejected within the previous 365 days from other journals for reasons that are not due to lack of scientific, methodological, or ethical rigor are welcome to submit prior reviews and decision letters along with their submission, and request a streamlined review in their cover letter.

rempsyc · 2023-04-02T20:59:41Z

@bwiernik, you mentioned the paper would need some polishing for Collabra. Any specific suggestions on the current version?

I did add the height-weight code demo as discussed earlier.

@DominiqueMakowski mentioned maybe arguing more strongly in favour of our multiple methods approach. But I I am not really able to do that. So Dom would you like to make an attempt? If not no worry, we can leave as is.

Also, there is no rticles template for Collabra, so I didn't know what template to use for the paper but they use double blind review so I used the preprint archive thing for now because I didn't want to have to change all the meta-info parameters to classic rmarkdown manually again.

rempsyc · 2023-04-02T21:27:27Z

Also, in the paper, we cite easystats as follows:

Lüdecke, Daniel, Mattan S. Ben-Shachar, Indrajeet Patil, Brenton M. Wiernik, Etienne Bacher, Rémi Thériault, and Dominique Makowski. 2022. “Easystats: Framework for Easy Statistical Modeling, Visualization, and Reporting.” CRAN. https://easystats.github.io/easystats/.

However, that does not correspond to the order of authors on the package website or using report::cite_easystats().

Lüdecke, D., Makowski, D., Ben-Shachar, M. S., Patil, I., Wiernik, B. M., Bacher, Etienne, & Thériault, R. (2023). easystats: Streamline model interpretation, visualization, and reporting (0.6.0) [R package]. https://easystats.github.io/easystats/ (Original work published 2019)

So @IndrajeetPatil which citation/author order is correct!?

IndrajeetPatil · 2023-04-03T06:30:37Z

Also, in the paper, we cite easystats as follows:

Lüdecke, Daniel, Mattan S. Ben-Shachar, Indrajeet Patil, Brenton M. Wiernik, Etienne Bacher, Rémi Thériault, and Dominique Makowski. 2022. “Easystats: Framework for Easy Statistical Modeling, Visualization, and Reporting.” CRAN. https://easystats.github.io/easystats/.

However, that does not correspond to the order of authors on the package website or using report::cite_easystats().

Lüdecke, D., Makowski, D., Ben-Shachar, M. S., Patil, I., Wiernik, B. M., Bacher, Etienne, & Thériault, R. (2023). easystats: Streamline model interpretation, visualization, and reporting (0.6.0) [R package]. https://easystats.github.io/easystats/ (Original work published 2019)

So @IndrajeetPatil which citation/author order is correct!?

Why are we citing the easystats meta-paper?! I don't think we should cite it. We haven't done so for any of the other publications. We can just mention it, and even link to the GitHub organization, but I don't think we need to cite it here. That said, we can cite the relevant JOSS papers. E.g., in ggstatsplot paper, this is what I do:

So @IndrajeetPatil which citation/author order is correct!?

At some point, we will need to decide on the order, but I feel it's a bit early for that.

rempsyc · 2023-04-03T13:02:29Z

I believe @DominiqueMakowski added the reference to easystats. I don't think it's necessarily a bad idea even if it's early and that things could change. I like for instance referring to the website.

I mean that's the outcome of using citation("easystats") so people are probably already citing that. If that's a problem maybe we need to change it?

I had not realized that the two titles were different though. As you say, the first one is probably the meta-paper, whereas in the second one, we're simply citing the website/package, so I think that's fine?

rempsyc · 2023-04-09T00:37:40Z

Anybody else would like to make a last reread of the paper before I submit to Collabra @easystats/core-team? I would like to submit next weekend if nobody requests changes. Thanks.

DominiqueMakowski · 2023-04-10T08:10:46Z

papers/Collabra/paper.Rmd

+
+## What happens ater?
+
+See comment in the thread.


It would be nice to have a discussion or some thoughts about what to do after outlier detection: re-run analyses? Show results with and without outliers? Describe outlying sample? I don't think there is one single best approach but we could at least mention some caveats and important things.

One thing we might highlight is the need for reporting the characteristics of the outliers (how many were removed, the percentage, and eventually the features). For instance, we could add an argument in report_participants() and report_sample(), outliers that takes a vector of outliers / the output of check_outliers, and that would add a description of the outliers.

In report_participants, could look like... "Description of whole sample. Out of this sample, X (3.54%) of participants were flagged as outliers (demographic summary), leaving X participants in the final set (new demographics without)."

in report_sample(): @strengejacke ideas how to present it?

It would be nice to have a discussion or some thoughts about what to do after outlier detection: re-run analyses? Show results with and without outliers?

We do have already a section called "Handling Outliers". And then we go about describing the different types of outliers, and what to do for each type of outlier, and whether to keep, exclude, or winsorize depending on the case. Does that correspond to what you wanted?

One thing we might highlight is the need for reporting the characteristics of the outliers (how many were removed, the percentage, and eventually the features)

We already kind of do that in the transparency section no? (I added a mention of the percentage though, thanks)

we could add an argument in report_participants() and report_sample(), outliers that takes a vector of outliers / the output of check_outliers, and that would add a description of the outliers.

It's true that we don't mention describing the characteristics of the outliers; perhaps I could add a paragraph on that. But do we really need a new function, when we can simply subset on the data directly with the outlier object or as you say a vector of outliers directly, if you have it?

report_participants(data[which(outliers),])

And I wonder if that assumes homogeneity of the outliers, which they should not be for random outliers. Even if they are, there would be no way of knowing unless you also report measures of homogeneity. But with so few observations as 1 to maybe 5 outliers, how meaningful would these data be? Especially if they are e.g., error outliers that are excluded, then any sample description would be meaningless. Maybe only for interesting outliers to try to figure out a pattern? Please convince me why you think this is important :)

rempsyc · 2023-04-17T17:31:14Z

papers/Collabra/cover_letter.Rmd

-In this sense, the paper fits very well with the special issue "Advances in Statistical Computing", as it essentially communicates to the wider public current advances in the statistical computing of outlier detection algorithms and their implementation in currently available open source and free software. This makes the manuscript relevant to data science, behavioural science, and statistical computing more generally.
+It explains the key approaches and highlights recommendations, and shows how users can adopt them in their R analysis with just one function. The manuscript covers univariate, multivariate, and model-based statistical outlier detection methods, their recommended threshold, standard output, and plotting method, among other things.
+
+Beyond acting like a concise review of outlier treatment procedures and practical tutorial, we also describe a new method (a consensus-based approach) and discuss its benefits ad limitations. In this sense, the paper fits very well with the scope of the journal, as it essentially communicates to the wider public current advances in the statistical computing of outlier detection algorithms and their implementation in currently available open source and free software. This makes the manuscript relevant to data science, behavioural science, and statistical computing more generally. 


Actually we don't discuss benefits and limitations of the consensus-based approach right now. @DominiqueMakowski actually what are the method's benefits and limitations exactly? Since I believe you came up with the method, if you give me some pointers I may be able to flesh it out in the paper.

rempsyc · 2023-04-17T21:58:26Z

Alright, so I have finished updating the cover letter and the manuscript, and writing the response to reviewers.

@IndrajeetPatil @strengejacke @DominiqueMakowski @mattansb that's the last opportunity to review those documents before I resubmit. Also two small things for the paper:

On p. 6 of the PDF, I added a summary decision table. Thoughts? Is it OK?
On p. 7, I wanted to draw a nice scatter plot to show the model expectation without adding too many lines of code. But I was not able to do this using see, ggplot2, or base R, so I just rempsyc::nice_scatter. But I don't know how I feel about using rempsyc here. Is it OK?

rempsyc · 2023-05-13T20:11:57Z

I'll create a new PR for JOSE. Should we merge this then?

mattansb · 2023-05-14T04:44:33Z

Merge? Or delete?

rempsyc · 2023-05-14T13:54:48Z

Hum, delete probably, yeah. I was thinking in case we want to keep archives or files and stuff but it was bugging the JOSE submission system anyway so I deleted it as well in the other PR.

new collabra paper draft

3a301ea

easystats deleted a comment from codecov-commenter Apr 2, 2023

address reviewer comment on x labels with ggplot layers [skip ci]

71f299c

remove test template files [skip ci]

621f961

minor cover letter + ms

91849b5

DominiqueMakowski reviewed Apr 10, 2023

View reviewed changes

integrate dom comments [skip ci]

45362ef

rempsyc commented Apr 17, 2023

View reviewed changes

update cover letter, response to reviewers [skip ci]

068804a

rempsyc mentioned this pull request May 13, 2023

JOSE paper #586

Merged

rempsyc closed this May 14, 2023

IndrajeetPatil deleted the collabra_paper branch December 3, 2023 11:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New Collabra paper draft #568

New Collabra paper draft #568

rempsyc commented Apr 2, 2023 •

edited

Loading

rempsyc commented Apr 2, 2023 •

edited

Loading

rempsyc commented Apr 2, 2023 •

edited

Loading

rempsyc commented Apr 2, 2023

IndrajeetPatil commented Apr 3, 2023

rempsyc commented Apr 3, 2023 •

edited

Loading

rempsyc commented Apr 9, 2023 •

edited

Loading

DominiqueMakowski Apr 10, 2023

rempsyc Apr 17, 2023 •

edited

Loading

rempsyc Apr 17, 2023 •

edited

Loading

rempsyc commented Apr 17, 2023

rempsyc commented May 13, 2023

mattansb commented May 14, 2023

rempsyc commented May 14, 2023

New Collabra paper draft #568

New Collabra paper draft #568

Conversation

rempsyc commented Apr 2, 2023 • edited Loading

Edit: reminder to use [skip ci] in this PR

rempsyc commented Apr 2, 2023 • edited Loading

rempsyc commented Apr 2, 2023 • edited Loading

rempsyc commented Apr 2, 2023

IndrajeetPatil commented Apr 3, 2023

rempsyc commented Apr 3, 2023 • edited Loading

rempsyc commented Apr 9, 2023 • edited Loading

DominiqueMakowski Apr 10, 2023

Choose a reason for hiding this comment

rempsyc Apr 17, 2023 • edited Loading

Choose a reason for hiding this comment

rempsyc Apr 17, 2023 • edited Loading

Choose a reason for hiding this comment

rempsyc commented Apr 17, 2023

rempsyc commented May 13, 2023

mattansb commented May 14, 2023

rempsyc commented May 14, 2023

rempsyc commented Apr 2, 2023 •

edited

Loading

rempsyc commented Apr 2, 2023 •

edited

Loading

rempsyc commented Apr 2, 2023 •

edited

Loading

rempsyc commented Apr 3, 2023 •

edited

Loading

rempsyc commented Apr 9, 2023 •

edited

Loading

rempsyc Apr 17, 2023 •

edited

Loading

rempsyc Apr 17, 2023 •

edited

Loading