write GenR blog post for Simon Worthington #14

Open
petermr opened this issue Sep 9, 2019 · 15 comments

Comments

@petermr
Owner

petermr commented Sep 9, 2019

No description provided.

@petermr
Owner Author

petermr commented Sep 9, 2019

SimonW asked:
Is it possible you could contribute a small post this week to GenR on
the 'OA Climate Change' question. Just as a suggestion, part based on
last week's #eLifeSprint - it would be great if people could know what
'Content Mine' software is, how it can be used in article analysis for
climate change, and essentially what the next step of mining CrossRef
would involve and the kind of stats, types of information, data that
could result. The reason I say this is that then I can use this to talk
with various tech teams, librarians etc., and try and muster some
volunteer help.

Greetings to GenR!

I'm Peter Murray-Rust, a retired chemistry academic from Cambridge University, and I feel that the most important thing in our lives now is climate change.

But what can I do that's most effective for me and the world?

However we solve climate change, one thing seems certain: we need global collaboration based on facts. Emotions will keep us going, but facts will decide what we do.

I don't know all the facts that I should. If I lectured to a first year university course I couldn't give an accurate picture of the facts and what actions they dictate. So I'm going to try to learn what is common knowledge. But my special contribution comes from a technology and philosophy that allows us to get huge numbers of facts from reliable sources - the scientific literature.

There are literally hundreds of thousands of articles (papers) that are about climate change (CC) to some degree. Here is a search of the biomedical literature (explanation later, and you'll find it's very simple):

getpapers -q "climate change" -n -a
info: Searching using eupmc API
info: Running in no-execute mode, so nothing will be downloaded
info: Found 135931 results

This uses Rik Smith-Unna's getpapers to search Europe PubMed Central (Europe PMC) for papers with "climate change" in the text. About half of them (65516) are "open access" and can be downloaded (but this proportion is likely to be lower for the non-biomedical literature).
They are about everything:

  • species extinction
  • sea level rise
  • spread of parasite vectors
  • weather changes
  • engineering responses
  • response by society

and they're about everywhere on the planet.
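As a rough sketch of what lies behind those counts, the same kind of query can be sent straight to the Europe PMC REST search service, which reports the number of matches in a `hitCount` field. This is not how getpapers itself is implemented. The helper names (`build_search_url`, `hit_count`, `fetch_hit_count`, `open_access_share`) are made up for illustration, and the `OPEN_ACCESS:y` filter is an assumption about the Europe PMC query syntax that should be checked against its documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Europe PMC REST search endpoint (assumed to be current; check the docs).
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(query, open_access_only=False, page_size=1):
    """Build a search URL; only the hit count is wanted, so one result is enough."""
    if open_access_only:
        # Assumed Europe PMC field syntax for restricting to open access papers.
        query = f"({query}) AND OPEN_ACCESS:y"
    params = {"query": query, "format": "json", "pageSize": page_size}
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"


def hit_count(response_text):
    """Read the total number of matching papers from a JSON response body."""
    return int(json.loads(response_text)["hitCount"])


def fetch_hit_count(query, open_access_only=False):
    """Run the search over the network (not called automatically here)."""
    with urlopen(build_search_url(query, open_access_only)) as response:
        return hit_count(response.read().decode("utf-8"))


def open_access_share(total, open_access):
    """Fraction of the hits that are open access."""
    return open_access / total


if __name__ == "__main__":
    # Figures quoted in the post: 65516 open access papers out of 135931 hits.
    print(f"{open_access_share(135931, 65516):.0%} open access")
```

Calling `fetch_hit_count('"climate change"')` today will give a different number from the one in the post, since the literature keeps growing.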

So if you want to find out about crops and West Africa...

getpapers -q "((climate change) AND (west africa) AND (crops))" -n -a
info: Found 1628 results

That's a lot of papers! But if you have enough disk space and a reasonably good connection you can download them in 5 minutes.
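If you want to try other combinations of topics, here is a minimal sketch of how such a query and command line could be assembled from a list of search terms. The function names `boolean_query` and `getpapers_args` are hypothetical. The `-q`, `-n` and `-a` flags are the ones used above; `-o` (output directory) and `-x` (download full-text XML) are given as I recall them from the getpapers README, so please confirm them with `getpapers --help` before relying on them.

```python
import shlex


def boolean_query(*terms):
    """AND the terms together, each in its own parentheses."""
    if not terms:
        raise ValueError("at least one search term is needed")
    return "(" + " AND ".join(f"({term})" for term in terms) + ")"


def getpapers_args(query, outdir=None, dry_run=True, include_non_open=True):
    """Build the argument list for a getpapers run.

    dry_run adds -n (count the hits, download nothing);
    include_non_open adds -a (count all papers, not only open access ones);
    outdir adds -o <dir> and -x (assumed flags, see the note above).
    """
    args = ["getpapers", "-q", query]
    if dry_run:
        args.append("-n")
    if include_non_open:
        args.append("-a")
    if outdir is not None:
        args += ["-o", outdir, "-x"]
    return args


if __name__ == "__main__":
    query = boolean_query("climate change", "west africa", "crops")
    # Print a shell-ready command line; nothing is executed here.
    print(shlex.join(getpapers_args(query)))
    print(shlex.join(getpapers_args(query, outdir="cc_crops",
                                    dry_run=False, include_non_open=False)))
```

The first printed command reproduces the dry run shown above; the second is what a real download of the open access papers might look like, with `cc_crops` as an arbitrary directory name.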

Are they useful? That's where our AMI comes in. AMI searches these papers on your disk within a minute or two for things you might be interested in:

  • species
  • vectors
  • tropical diseases
  • chemicals
  • countries
  • funders
  • international organizations
    and lots more
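To give a feel for what a dictionary search does, here is a toy sketch in Python. It is not AMI's actual code or data format: the dictionaries, the sample text and the function names (`compile_dictionary`, `count_terms`, `search_corpus`) are all invented for illustration. Each dictionary is just a list of terms, and each paper is scanned for whole-word, case-insensitive matches.

```python
import re
from collections import Counter


def compile_dictionary(terms):
    """Turn a list of terms into one case-insensitive, whole-word matcher."""
    if not terms:
        raise ValueError("a dictionary needs at least one term")
    # Longest terms first, so multi-word terms are preferred over their prefixes.
    ordered = sorted(set(terms), key=len, reverse=True)
    pattern = "|".join(re.escape(term) for term in ordered)
    return re.compile(rf"\b(?:{pattern})\b", re.IGNORECASE)


def count_terms(text, matcher):
    """Count how often each dictionary term occurs in one text."""
    return Counter(match.group(0).lower() for match in matcher.finditer(text))


def search_corpus(papers, dictionaries):
    """Return {paper_id: {dictionary_name: Counter of matched terms}}."""
    matchers = {name: compile_dictionary(terms)
                for name, terms in dictionaries.items()}
    return {paper_id: {name: count_terms(text, matcher)
                       for name, matcher in matchers.items()}
            for paper_id, text in papers.items()}


if __name__ == "__main__":
    # Invented sample data, standing in for downloaded papers and dictionaries.
    papers = {"paper1": "Maize and sorghum yields in Ghana and Nigeria fell; "
                        "maize was worst hit."}
    dictionaries = {"crops": ["maize", "sorghum", "cassava"],
                    "countries": ["Ghana", "Nigeria", "Niger"]}
    for paper_id, hits in search_corpus(papers, dictionaries).items():
        for name, counts in hits.items():
            print(paper_id, name, dict(counts))
```

The per-paper counts are the raw material for tables and plots showing which species, countries or funders co-occur across a corpus.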

The great thing is that anyone who can run a program can do this! Lars, in the Netherlands, 15 years old, learned how to do this and developed more software. If you love computers (and have one), or data, or tackling scientific problems, or combatting CC, that's all you need.

This makes a great citizen science project. Anyone anywhere with a Net connection can do it. The software, data and dictionaries are all Open (no restrictions on use, no fee, and you can change them without permission). We'll share the data we find (probably on GitHub) as soon as we capture it. This is "Open Notebook Science" (no insider knowledge), promoted by Jean-Claude Bradley.

Don't think that because you aren't a "scientist" you can't understand scientific papers. You won't follow all of them (I can't either), or every part of them, but there are many whose key points you can understand. If you like maps, graphs, and similar data then you'll feel right at home.

We've set up a project here, on GitHub. The technology is used in several projects (most notably plants and their medicinal products), which means that bugs get reported and hopefully fixed. No matter what your interests and skills, you're welcome.

There's a lot of useful stuff on the sister project, essential oils. Also the data we extract is open and well organized so we can use a wide range of other software to analyse it.

If you are a techie, there's a tutorial (rather XML-heavy!) I'm giving next week at XMLSummerSchool (Oxford) at
https://github.com/petermr/CEVOpen/blob/master/docs/2019_raw_petermr.potx - you'll have to download it. We're also starting a communal article for Beilstein J. Org. Chem. at https://github.com/petermr/CEVOpen/blob/master/BJOC

@mrchristian
Contributor

I need to check a few things about the software mentioned above:

I want to make sure I'm getting the correct address for the software, installation manuals, and use manuals.

  1. getpapers - Is this the software repository and install instructions to use: https://github.com/contentmine/getpapers ? And is this the tutorial you would recommend: https://github.com/petermr/tigr2ess/blob/master/getpapers/TUTORIAL.md ?
  2. AMI - software and installation guide: https://github.com/ContentMine/ami - and the tutorial for using AMI: https://github.com/petermr/tigr2ess/blob/master/search/TUTORIAL.md

Thanks

Simon

@petermr
Owner Author

petermr commented Sep 11, 2019 via email

@mrchristian
Contributor

That's great, I can go with these addresses for the moment. Thank you.

Another question: Why use getpapers + AMI and not just search on, say, Europe PMC https://europepmc.org/search?query=climate%20change ?

I can obviously string together a bunch of reasons grouped around 'data science', 'having a collection', and 'putting it to use in another context' (e.g. literature or media for a class, a citizen science project, etc.), with actions like: download all the papers, keep the papers, carry out further searches as and when you want, and use the content in another context for your community, in a project with a group. There is also the post-processing, which has not been touched on yet.

But it's better you give me the 'OK' that I'm on the right track, or you have another or a complementary vision.

@mrchristian
Contributor

NB: I will make three info boxes:

  1. eLifeSprint
  2. Content Mine software: getpapers + AMI
  3. Open Climate Knowledge projects - which combines 'openNotebook' with mission to build actionable plan for '100% OA Climate Change'.

I'll pass them by you when done and move the whole package to a collaborative doc to finish it off.

@petermr
Owner Author

petermr commented Sep 11, 2019 via email

@mrchristian
Contributor

Great, thank you. I'll add a line or two to the article to point out that it's more than search.

Info: block 1. about the eLife Sprint

InstruMinetal team: eLifeSprint 2019

Project: SaWaMine (working title)

#eLifeSprint2019 4–5 September 2019, Cambridge UK and online.

The ContentMine software was used by a sprint group of seven (Sabine Weber, Michael Owonibi, Tiago Lubiana, Peter Murray-Rust, Sophia K. Cheng, Wambui Karuga, and Leonie Mueck) to prototype a UI for users to identify scientific instruments from candidate search results. The candidates were produced with ContentMine's getpapers and AMI software and extracted from CEVOpen, a corpus of papers about phytochemistry.

Goals of the Project:

  • Create a way of automatically extracting candidates for scientific equipment terms from scientific papers.
  • Create a GUI to display the paragraphs of a paper that contain candidate scientific equipment terms, allowing the user to select the ones that are actually instruments.
  • Find out what kind of scientific equipment the papers in the CEVOpen corpus used and add the terms to Wikidata.
  • Long term goal: Connect tool and the UI.

NB: Content is partly based on https://github.com/caffiendFrog/elife2019 from Sophia Cheng
@caffiendFrog

@mrchristian
Contributor

Please edit the above infobox here on this pad, will be easier https://cryptpad.fr/code/#/2/code/edit/2nIMow-uTuQpNJv2RzZwdark/

@petermr
Owner Author

petermr commented Sep 11, 2019 via email

@mrchristian
Contributor

I've turned the 1. eLifeSprint 'infobox' around; my version yesterday was mixed up and a bit rubbish. I think I was running on empty. Much better now.

I will complete the whole blog piece edit and add other infoboxes in the same doc, a bit more sane.

I'm just trying to resolve the order of things. I think that my infobox '2. Content Mine software: getpapers + AMI' should really be called 'openNotebook' https://github.com/petermr/openNotebook | Am I getting this right: is openNotebook the wrapper, the vehicle, for putting forward the toolset and method?

I'll get finished up in content here
https://cryptpad.fr/code/#/2/code/edit/2nIMow-uTuQpNJv2RzZwdark/

@mrchristian
Contributor

The article is in reasonable shape now. It has an intro and three infoboxes for the end of the article. I need to give it another working over and gather together some images. I'll do this in an hour's time; first I have a meeting with a colleague.

https://cryptpad.fr/code/#/2/code/edit/2nIMow-uTuQpNJv2RzZwdark/

I will finish up the article today, recheck in the morning and post before 12 noon CEST Friday. Keep the momentum going :-)

@mrchristian
Contributor

OK, I have the article ready to publish. There is one term I'm not sure I'm getting right: 'species distribution and migration'. It's one of the subjects we said we'd cover in the openNotebook OA searches. I think I'm mixing up the term?

I will shortly move the doc to Wordpress, but not before I gather pics and I'll make a note here when it moves.

I can see that there is a need to explain a lot more about openNotebook, how it works, what you get out of it: stats, files, data? Good we're starting a new blog for it, there will be plenty to do :-)

@petermr
Owner Author

petermr commented Sep 13, 2019 via email

@mrchristian
Contributor

OK blogpost published, fiddle, fiddle, fiddle https://genr.eu/wp/open-climate-knowledge-100-oa-for-climate-change/ any mistakes please drop me a line - phew

@petermr
Owner Author

petermr commented Sep 13, 2019 via email

2 participants