Releases · brunoamaral/gregory-ai

27 Nov 13:26

de7d040

The António Lopes Edition

Hope you're ready, what we have is heavy. Let's hear it for @antoniolopes who shines from the shadows and gave Gregory an AI upgrade. Let's get to it before your patience starts to fade.

António has been helping Gregory since the early stage, with the relevancy algorithm, and advice worthy of a sage. This time he brought a new summariser for the abstracts that can process the database through Django's management commands.

./manage.py get_takeaways will populate the "takeaways" column with the key points within the abstract of each article.

In future releases we may use this to improve the newsletters and automatic tweets.

And his magic didn't stop here. There is a new API endpoint that allows you to add new articles via HTTP POST requests.
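As a rough idea of what such a request could look like from Python, here is a minimal sketch that only builds the request object and does not send it. The endpoint URL, the field names, and the token header are assumptions made for illustration, not the documented API, so check the project documentation for the real values.

```python
import json
import urllib.request

# Hypothetical values: the real endpoint path, field names, and auth scheme
# are not stated in these notes, so treat all of them as placeholders.
API_URL = "https://api.example.org/articles/post/"
API_KEY = "replace-with-your-key"


def build_article_request(title, link, doi=None):
    """Build (but do not send) a JSON POST request for a new article."""
    payload = {"title": title, "link": link}
    if doi is not None:
        # The DOI is optional when adding content through the API (see #281).
        payload["doi"] = doi
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {API_KEY}",  # assumed auth scheme
        },
        method="POST",
    )


request = build_article_request("An example paper", "https://example.org/paper")
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen() would send it, once the placeholders are replaced with real values.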

There is also a new SciencePaper class to make sure we have all the required information when saving an article. This is also used to clean up the abstracts of any weird characters or HTML.
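To give a feel for what this kind of clean-up involves, here is a small standalone sketch. It is not the SciencePaper code itself: the function below is a hypothetical helper using only the standard library, and the regex-based tag stripping is deliberately crude.

```python
import html
import re
import unicodedata


def clean_abstract(raw):
    """Strip HTML tags and odd characters from an abstract string."""
    # Drop anything that looks like a tag (crude, but enough for a sketch).
    text = re.sub(r"<[^>]+>", " ", raw)
    # Turn entities such as &amp; and &nbsp; into real characters.
    text = html.unescape(text)
    # Fold compatibility forms, e.g. a non-breaking space becomes a space.
    text = unicodedata.normalize("NFKC", text)
    # Remove control and zero-width characters, keep ordinary whitespace.
    text = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    # Collapse runs of whitespace left behind by the steps above.
    return re.sub(r"\s+", " ", text).strip()


print(clean_abstract("<p>Multiple&nbsp;sclerosis &amp; <b>MRI</b>\u200b</p>"))
```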

To save on CPU, and be gentle with the Crossref API, we now stop trying to fetch missing data for an article after 30 days.
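The cut-off can be pictured as a simple date comparison. The sketch below assumes the 30 days are counted from the date the article was discovered; the function name and that choice of starting point are assumptions, not taken from the code base.

```python
from datetime import datetime, timedelta, timezone

RETRY_WINDOW = timedelta(days=30)  # cut-off mentioned in the release notes


def should_retry_crossref(discovery_date, now=None):
    """Return True while an article is still inside the retry window."""
    # discovery_date is assumed to be timezone-aware, like the default `now`.
    now = now or datetime.now(timezone.utc)
    return now - discovery_date <= RETRY_WINDOW


today = datetime(2022, 11, 27, tzinfo=timezone.utc)
print(should_retry_crossref(today - timedelta(days=10), now=today))
print(should_retry_crossref(today - timedelta(days=45), now=today))
```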

A special word of appreciation goes out to @codeZenon for taking the time to help us improve the documentation.

Development of new features and improvements has been 3x faster than documentation, and I don't expect it to improve. Our time is scarce. Which isn't the same as saying we don't care.

If you have any questions, please reach out by posting an issue or adding a thread in the discussion page.

Final note, remember to run ./manage.py migrate and pip install -r requirements.txt in the admin container when upgrading.

What's Changed

quick fix by @brunoamaral in #258
remove hardcoded information from crossref script by @brunoamaral in #259
removes utm parameters from urls in feedreader by @brunoamaral in #261
API returns authors as an object with first name, family name, and ORCID url by @brunoamaral in #264
add information from crossref.org upon fetching articles from the rss feeds by @brunoamaral in #269
Improves the way we fetch authors by using Django's ORM by @brunoamaral in #267
Refactor pipeline by @brunoamaral in #270
Apply method to avoid excessive queries to crossreforg by @brunoamaral in #273
partial fix for naive datetime warning by @brunoamaral in #275
fixed some grammer issues by @codeZenon in #276
Auth API by @antoniolopes by @brunoamaral in #271
Added two methods to SciencePaper class refresh() and clean_abstract() by @brunoamaral in #280
Make DOI optional when adding content from the API by @brunoamaral in #281
adds a new endpoint to list articles by journal name by @brunoamaral in #283
Adds a script to calculate the summary of abstracts by @brunoamaral in #284
fix variable name by @brunoamaral in #286
Clean up debug prints, limit results to 100 rows, fix #285 by @brunoamaral in #287
create shell command to process takeaways by @brunoamaral in #288
include takeaways in article json output by @brunoamaral in #289

New Contributors

@codeZenon made their first contribution in #276

Full Changelog: v12...v13

Contributors

antoniolopes, brunoamaral, and anachaba

11 Oct 15:47

brunoamaral

v12

158f402

The Ichabod Crane edition

Gregory lost its head, just in time for Halloween, the website code was shot dead.

With the AI engine now standalone, this can be the setting stone for many opportunities ahead.
On a fresh install you get an AI and an API, letting you build the frontend you wish. But there's more than this.

We already had the concept of sources from where we fetch the papers. Think of PubMed, the same source can bring us several journals from several publishers. Gregory now saves this information as strings in the database. (We may extend this in the future)

But there's more.

Gregory is an open book, but not all papers are open access. We are now using Unpaywall to tag which articles are free access and which are restricted.

You can get a list of open access papers through the API: https://api.gregory-ms.com/articles/open/
Or you can get it from an RSS feed: https://api.gregory-ms.com/feed/articles/open/

The RSS feeds were also missing the item pubDate field; we fixed that.

A lot of the information comes from crossref.org, so we made that code a lot cleaner and less scattered.

If you need the full details, here they are.

What's Changed

Check if key exists by @brunoamaral in #220
Docs by @brunoamaral in #222
Remove deprecated field from the database sent to twitter by @brunoamaral in #225
Admins should have the option to create new subjects by @brunoamaral in #227
Update documentation by @brunoamaral in #229
Wrong link in the rss feed for clinical trials by @brunoamaral in #231
API should list all relevant articles by @brunoamaral in #233
Add information of availability for science papers in the articles table by @brunoamaral in #239
Adds the option to fetch publication name from the DOI number by @brunoamaral in #240
Add journal title to database by @brunoamaral in #241
Use one script to fetch all the data we need from crossref by @brunoamaral in #244
Add item pubDate to rss feeds by @brunoamaral in #250
Sort by discovery date by @brunoamaral in #251
Add api endpoint and rss feed to list open access articles by @brunoamaral in #253
Website should be detached from the rest of the software by @brunoamaral in #246

Full Changelog: v11...v12

Still there? Here's the important bit. With the website detached from the AI we are opening the door to get much more done with Gregory and getting more flexibility in areas or fields where we can apply this method. Some thoughts include getting the takeaways from articles, finding biomedical entities in the abstracts, asking Gregory to answer specific questions within a subject.

Imagine asking, "what are the current disease modifying therapies available for MS?", and getting back a list of them all, sorted by category and date. It's a long shot, but we'll keep working on this ... 🙂

Thank you, and have a great Halloween!

Contributors

brunoamaral

02 Aug 15:35

brunoamaral

v11

d417f0e

The Flavio Amiel Edition

Take a seat, grab your favorite brew, there's a lot in this release for you.

1. SEO and Content

This release is a thank you to Flavio Amiel, who took some time to look at the website and offer some suggestions to improve the SEO and content.

This release implements some of those suggestions, both in the content and in the URLs of listed articles. The previous URL was domain.com/articles/<article_id>; the new URL is domain.com/articles/<article_id>/<slug>. The slug is taken from the noun_phrases found for each article title, to keep it a bit more relevant for search engines.
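Here is a minimal sketch of how a path in the new format could be assembled, assuming the noun phrases have already been extracted. The helper name is hypothetical, and applying the 250 character limit from #199 to the path (rather than the full URL) is an assumption.

```python
import re


def build_article_path(article_id, noun_phrases, max_length=250):
    """Build /articles/<article_id>/<slug> from a list of noun phrases."""
    words = " ".join(noun_phrases).lower()
    # Replace every run of non-alphanumeric characters with a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", words).strip("-")
    path = f"/articles/{article_id}/{slug}"
    # Truncate long paths and avoid ending on a dangling dash.
    return path[:max_length].rstrip("-")


print(build_article_path(42, ["Multiple Sclerosis", "MRI lesions"]))
```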

Content was also reviewed to include search keywords when possible.

Flavio was important not just because he took the time to look at our project but also because he took a fresh look at a part of it that was ... overlooked.

If you're facing the same issues, you can reach him on twitter or schedule a call.

2. Setting up Gregory

We like to automate the boring stuff and make things easier for everyone.

If you're installing Gregory for the first time, run setup.py and it will take care of 80% of the work in setting up the database and the containers. This little sentence took quite some time but it's worth it to help people get their research up and running faster.

3. Filter feeds and API endpoints

Gregory has the concept of 'subject'. In this case, Multiple Sclerosis is the only subject configured. A Subject is a group of Sources and their respective articles. There are also categories that can be created. A category is a group of articles whose title matches at least one keyword in the list for that category. Categories can include articles across subjects.
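The category rule can be illustrated with a few lines of Python. This is only a sketch: it assumes a case-insensitive substring match, which may differ from how Gregory actually compares titles and keywords, and both function names are made up for the example.

```python
def matches_category(title, keywords):
    """True if the title contains at least one of the category keywords."""
    lowered = title.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def articles_in_category(titles, keywords):
    """Keep only the titles that belong to the category."""
    return [title for title in titles if matches_category(title, keywords)]


titles = [
    "Gait and mobility outcomes in multiple sclerosis",
    "A review of MRI lesion counting",
]
print(articles_in_category(titles, ["mobility", "walking"]))
```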

The one thing it didn't have was a way to filter by subject and category. So we added those options to the API and RSS Feeds in the format articles/category/<category>/ and articles/subject/<subject>/, where <category> and <subject> are the lowercase name with spaces replaced by dashes.

RSS Feeds

Latest articles by subject, /feed/articles/subject/<subject>/, for example https://api.gregory-ms.com/feed/articles/subject/multiple-sclerosis/
Latest articles by category, /feed/articles/category/<category>/, for example https://api.gregory-ms.com/feed/articles/category/mobility/

API endpoints

/articles/subject/<subject>/, for example https://api.gregory-ms.com/articles/subject/multiple-sclerosis/
/articles/category/<category>/, for example https://api.gregory-ms.com/articles/category/mobility/
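The naming rule above (lowercase, spaces replaced by dashes) is easy to reproduce when building these URLs from a script. The helper names below are hypothetical; the base URL and path formats are the ones listed above.

```python
BASE_URL = "https://api.gregory-ms.com"


def to_slug(name):
    """Lowercase the name and replace spaces with dashes."""
    return "-".join(name.lower().split())


def listing_urls(kind, name):
    """Return the API and RSS feed URLs for a subject or a category."""
    if kind not in ("subject", "category"):
        raise ValueError("kind must be 'subject' or 'category'")
    slug = to_slug(name)
    return {
        "api": f"{BASE_URL}/articles/{kind}/{slug}/",
        "feed": f"{BASE_URL}/feed/articles/{kind}/{slug}/",
    }


print(listing_urls("subject", "Multiple Sclerosis"))
```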

4. Science paper, news, trials

What's in the news? Before this release Gregory could only understand science papers and clinical trials. We now have the option to include news articles without getting them mixed up with the other articles. You'll have to edit your current sources to make sure they have 'science paper' as the value of the 'source for' field.

We're doing this to help follow the full process of scientific discovery, from publishing a hypothesis, to running clinical trials, to making it known to people outside the scientific community.

5. Ignore SSL, if you must

In the past we had some issues reading RSS feeds whose web server didn't have the SSL certificate configured properly, and we were using a workaround that wasn't ideal because it turned off SSL verification for every request. This is now fixed, and each Source can be configured independently to bypass the certificate check if you really must.
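The idea of a per-source opt-out, instead of a global switch, can be sketched with Python's ssl module. The ignore_ssl flag below is a hypothetical name for the per-Source setting, and this is not the feed reader's actual code.

```python
import ssl


def ssl_context_for(ignore_ssl):
    """Return a verifying context unless this source opted out."""
    context = ssl.create_default_context()
    if ignore_ssl:
        # Order matters: hostname checking must be off before CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


strict = ssl_context_for(ignore_ssl=False)
relaxed = ssl_context_for(ignore_ssl=True)
print(strict.verify_mode, relaxed.verify_mode)
```

A context built this way can be passed to urllib.request.urlopen(url, context=...), so only the sources that opted out skip the certificate check.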

What's Changed

197 add noun phrases to url keeping redirect from old format by @brunoamaral in #198
truncate url to 250 maximum of characters by @brunoamaral in #199
sync with main by @brunoamaral in #202
179 error on first install because there are no subscribers by @brunoamaral in #204
192 articles should be of type paper and news by @brunoamaral in #200
send 1 email per admin, save status if at least one send = success by @brunoamaral in #203
191 create an rss feed and api endpoint per category and subject by @brunoamaral in #206
remove hardcoded site domain by @brunoamaral in #207
175 error running setuppy for django by @brunoamaral in #208
Add documentation about env file by @brunoamaral in #209
fix db host on setup.py by @brunoamaral in #210
Add documentation about env file and build script by @brunoamaral in #211
set env variables after user configures them by @brunoamaral in #214
remove relation to sitesettings and django.contrib.sites by @brunoamaral in #215
run django commands from setup script by @brunoamaral in #216
132 implement a better approach to ssl problems in feedreader indexer by @brunoamaral in #217

Full Changelog: v10.7...v11

Contributors

brunoamaral

24 Jul 12:23

brunoamaral

v10.7

3c18610

The Rock edition

Content is alive and should evolve, live, and thrive.

Because I don't want anything too set in stone, you can edit the email title and footer in the new custom settings.

Careful on the upgrade! Pull the changes and run the migrations:

sudo docker exec -it admin /bin/sh
./manage.py makemigrations && ./manage.py migrate

Visit the backoffice and edit the new settings.

This release also includes an example configuration for nginx.

Careful with your flows! We are also moving away from running a custom version of the @node-red container.

Run sudo docker-compose pull && sudo docker-compose up -d to make the change. You may need to install node-red packages you had installed previously.

What's Changed

resolve #187 docker container breaks with new env variables by @brunoamaral in #188
improve the README file by @brunoamaral in #189
use oficial node-red image by @brunoamaral in #194
example configuration for nginx + instructions by @brunoamaral in #195
190 the email template contains hardcoded information by @brunoamaral in #196

Full Changelog: v10.6.11...v10.7

Contributors

brunoamaral and node-red

11 Jul 15:59

brunoamaral

v10.6.11

5d22414

The "Mise en place" edition

You got it, this is for the setup.py script to get you up to speed without missing a beat.
Not going to be verbose because the changelog should be right on the nose.

Send me a note by email or comment with any questions.

What's Changed

make sure hugo_path is a string by @brunoamaral in #165
check if hugo_path is set or try to find where it's installed by @brunoamaral in #166
content review by @brunoamaral in #167
setup.py creates a .env file if needed by @brunoamaral in #168
Clarify setup instructions when running setup.py, and try running sudo to launch the containers by @brunoamaral in #169
Exclude author attribute for Trials by @brunoamaral in #170
include django configuration steps in setup.py by @brunoamaral in #172
clean up and make sure we use gunicorn in production by @brunoamaral in #173
make sure we configure the right db host by @brunoamaral in #178
create the metabase database in postgres (you should reload the container once finished) by @brunoamaral in #181
get domain from env variables by @brunoamaral in #182

Full Changelog: v10.6...v10.6.5

Contributors

brunoamaral

26 Jun 21:56

brunoamaral

v10.6

86855e9

v10.6

Just some polishes and fixes to issues and other near misses.

What's Changed

fix #151 by @brunoamaral in #159
add server requirement by @brunoamaral in #160
Organize npm files and update documentation by @brunoamaral in #163
get path of hugo command to run build by @brunoamaral in #153

Full Changelog: v10.5.4...v10.6

Contributors

brunoamaral

19 Jun 20:03

brunoamaral

v10.5.4

2c631bc

The happy birthday edition

Not much happening, but I got some wind under my wing, took a deep breath and looked at what was left.

@dippas took some weight off my shoulders by fixing a few bugs on the frontend. Meanwhile, today I looked at all the information Gregory likes to send. The emails and the rss feeds are now listed in the readme file.

Also took a look at the install instructions, because I don't like to be vile, added more info, clarified.

And while we are at it, there were some amazing donations, from at least three nations. We have enough budget to keep the site running for the next 12 months.

I don't like stunts, so I'm adding that information to the next annual review. Transparency is a goal that I'm more than happy to pursue.

That's it, thank you for reading up to this bit.

What's Changed

Allow users to subscribe on their own with a form on the frontend by @brunoamaral in #141
Loads the articles from json and adds to database by @brunoamaral in #143
New indexer for sagepub by @brunoamaral in #144
Get published date from crossreforg by @brunoamaral in #146
Fix missing autoprefixer dependency by @dippas in #147
Fix mobile navigation - class nav-open not compiling by @dippas in #148
Fix footer in authors page by @dippas in #149
Allow users to subscribe on their own with a form on the frontend by @brunoamaral in #150
Added more information on features and install procedure by @brunoamaral in #156
Update readme.md by @brunoamaral in #157

New Contributors

@dippas made their first contribution in #147

Full Changelog: v10...v10.5.4

Contributors

brunoamaral and dippas

27 May 13:28

brunoamaral

v10

9d9dd27

The Frankenstein edition

Gregory is made of several bits and we're trying to make sure everything fits.

So we moved the website files into their own directory, hugo, and updated the build script to match. Some directories were renamed to be more descriptive, and others deleted.

Things will break and you should be careful not to lose your database. Otherwise, this makes way for a system that is easier to understand and evolve.

Also, it seems that this release includes a bug: the mobile menu isn't working, and I will look into it sometime next week (#137).

What's Changed

Use psycopg2 to fetch data from PG during the build process by @brunoamaral in #125
fix build error when trial title contains ' character by @brunoamaral in #126
115 list relevant results in the last 30 days in the doctors page by @brunoamaral in #129
add categories to clinical trials by @brunoamaral in #131
134 move site to its own directory by @brunoamaral in #136

Full Changelog: v9.4...v10

Contributors

brunoamaral

27 Apr 21:42

brunoamaral

v9.4

67be97a

The Rosie Jetson edition

This is mostly some house cleaning.

What's Changed

fix listing of articles for physical therapists by @brunoamaral in #112
categories now use "terms" as a way to tag articles by @brunoamaral in #116
fixes excessive listing of articles in the weekly digest by @brunoamaral in #118
creates a new RSS feed to post articles and clinical trials on twitter by @brunoamaral in #119

Full Changelog: v9...v9.1

Contributors

brunoamaral

15 Apr 17:30

brunoamaral

69ef14f

Breaking the flow

I know, I know, we got to keep the flow.

What you don't want to miss for this release is the new subscription lists for alerts and the django-cron implementation to keep the database up to date and complete.

Subscriptions

You can now create lists to notify people of new clinical trials and to send admin digests. The following notifications are included using django-cron:

subscriptions.admin_summary
subscriptions.weekly_summary
subscriptions.trials_notification

Db maintenance

Previously, we had node-red flows to update authors and make sure articles were properly categorized. That is now done with django-cron as well using the following tasks:

db_maintenance.get_authors
db_maintenance.rebuild_categories

Same goes for the prediction of relevant articles and calculation of noun phrases:

gregory.noun_phrases
gregory.predict

Node-red was also fetching some rss feeds using a python script that read from the database. You guessed it, it's now a django-cron task:

gregory.feedreadertask

Building the system

The latest developments have slowly made the system easier to install with docker containers. Right now you should be up and running by setting up the correct .env variables and running docker-compose up -d.

What's broken

Training the Machine Learning models inside the container is not working; it seems to run out of memory. The workaround is to build locally and place the files in the ml_models directory.

What's Changed

add ko-fi link by @brunoamaral in #99
Manage subscriptions through django's admin by @brunoamaral in #100
fix to add migrations by @brunoamaral in #101
new template for the notification of new trials by @brunoamaral in #102
run weekly summary from django by @brunoamaral in #103
adds dbMaintenance tasks and moves ML and AI components into django by @brunoamaral in #105
Fix predictor by @brunoamaral in #107
Make Dockerfile use requirements.txt by @brunoamaral in #109

Full Changelog: v8.5...v9

Contributors

brunoamaral
