sample attribute accession links to relevant external DBs #17

Open
only1chunts opened this issue May 10, 2016 · 6 comments

Comments

@only1chunts
Member

only1chunts commented May 10, 2016

User story

As a website user
I want to be able to click a hyperlink directly from the sample or file tables to any cited accession number (including those to non-GigaDB databases)
So that it's easy to navigate between resources

Acceptance criteria

Given I am on a dataset page with samples citing an accession number
When I click that accession number
Then a new tab opens in my browser displaying the relevant database page for that accession number

Given I am on a dataset page with files citing an accession number
When I click that accession number
Then a new tab opens in my browser displaying the relevant database page for that accession number

Additional information

On the web page, when accession numbers are shown, link them to the appropriate endpoint in bioregistry.io.
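To illustrate (this is not existing GigaDB code), a minimal PHP sketch of how a CURIE-style accession could be turned into a bioregistry.io link; the helper name and the prefix whitelist are illustrative assumptions:

```php
<?php
/**
 * Illustrative helper (not part of the current codebase): build a
 * bioregistry.io resolver URL for a CURIE-style accession such as
 * "BioSample:SAMEA7631915". Returns null when no prefix is recognised,
 * so the caller can fall back to plain text.
 */
function accessionToBioregistryUrl(string $accession): ?string
{
    // Hypothetical whitelist of prefixes we expect to see in the sample/file tables
    $knownPrefixes = ['biosample', 'sra', 'taxonomy', 'pubmed'];

    $parts = explode(':', trim($accession), 2);
    if (count($parts) !== 2) {
        return null; // not a prefixed accession
    }
    [$prefix, $localId] = $parts;
    $prefix = strtolower($prefix);

    if (!in_array($prefix, $knownPrefixes, true)) {
        return null;
    }
    // bioregistry.io resolves <prefix>:<id> to the relevant external database page
    return 'https://bioregistry.io/' . rawurlencode($prefix) . ':' . rawurlencode($localId);
}

// Example: echo accessionToBioregistryUrl('BioSample:SAMEA7631915');
```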

Related issues

Update admin interface for link prefix to allow no source and with better database schema modeling E824 #903
link prefix reg-ex E824 #279
Enable validation of dataset links in submission wizard E824 #905
#597
curator's user story for this is: #824

pli888 pushed a commit that referenced this issue Jan 23, 2019
…e, security, usability, CI/CD (#273)

Performed a squash and merge of Rija's code commits from his gigadb-website/rija-nolegacy-dep-ux-php7 branch. These commits come from work done to implement the recommendations determined from a GigaDB code audit.

Details of the work are available in this Google Docs spreadsheet:

https://docs.google.com/spreadsheets/d/1WVPS5BUTIIKew-_r9auPxNPnZlpoxdNqeQXEiRs7jfw/edit#gid=0

List of work done:

* Run only the faulty functional test

* Print log for web and application container after all steps

* Import Composer autoload into Yii application in CI too

this is the reason recent tests were failing on CI.

* Re-enable all functional tests now that the problem has been resolved

* Make AutoCompleteService an application component

* Add Service for replacing keywords in database after sanitizing input

GitLab SAST report found three high vulnerabilities in DatasetController.

It's actually the same code snippet duplicated in three places in that class.

The snippet's function is to delete keywords in the database and replace them
with the keywords string from POST data.

So we create an AttributeService (a minimal sketch follows below) that will:
- encapsulate that function in one place to remove duplication
- remove the unsafe code and sanitize input
- stop DatasetController from dealing with storage concerns (that's not what controllers are for)
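A minimal sketch of what that service could look like, assuming Yii 1.1's CDbCommand parameter binding for sanitisation; the table, column, and constant names are illustrative assumptions:

```php
<?php
// Sketch only (not the actual implementation): replace a dataset's keyword
// attributes in one place, using bound parameters instead of interpolating
// POST data into SQL.
class AttributeService
{
    const KEYWORD_ATTRIBUTE_ID = 1; // illustrative value

    /** @var CDbConnection */
    private $db;

    public function __construct(CDbConnection $db)
    {
        $this->db = $db;
    }

    public function replaceKeywords($datasetId, $keywordsCsv)
    {
        // delete the existing keyword attributes for this dataset
        $this->db->createCommand(
            'DELETE FROM dataset_attributes WHERE dataset_id = :id AND attribute_id = :attr'
        )->execute([':id' => (int) $datasetId, ':attr' => self::KEYWORD_ATTRIBUTE_ID]);

        // re-insert the sanitised keywords coming from the form
        foreach (array_filter(array_map('trim', explode(',', $keywordsCsv))) as $keyword) {
            $this->db->createCommand(
                'INSERT INTO dataset_attributes (dataset_id, attribute_id, value) VALUES (:id, :attr, :value)'
            )->execute([':id' => (int) $datasetId, ':attr' => self::KEYWORD_ATTRIBUTE_ID, ':value' => $keyword]);
        }
    }
}
```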

To that end, I had to fix the way database fixtures work:
 - Removed the init.php file as it was in the way of database constraints
 - Instead added special initialisation sequence for dataset and attribute fixtures (see dataset.init.php and attribute.init.php)
 - Updated the RelationDAOTest to not require a pre-test reset  as datasets are only used for reference data in this test

Also created a DatasetDAO to manage Dataset and Dataset related persistence logic (currently, just the DatasetAttributes).

I had to use Dependency Injection and to create a DatasetAttributes factory class to enable TDD and clean design of the DAO layer.

Updated composer.json.dist so all the new classes can be found and used anywhere in the codebase.

The work is not finished yet, TODOs are updating the controllers to use the new service, test-driven with functional tests.

This is part of the work for ticket: rija#121
More specifically it covers the "Fix unsafe code" task (as well as falling under the "Remove duplicate code" task as a side effect)

It also lays the foundation for the future reworking of DatasetController.

* Resurrect global fixture initialisation script

as CI tests are failing with errors related to initialization

* Use AttributeService for updating keyword on dataset update form

It first required fixing JavaScript errors on the form by
pulling in the most recent versions of only the relevant JavaScript libraries.

It also required fixing a bug in DatasetAttributesFactory
(because that class wasn't test-driven as I thought it was simple enough)

this is part of issue: rija#121

* Use AttributeService in create1 and datasetManagement actions

in order to remove unsafe and duplicated code as part of ticket:
rija#121

I had to update the keyword replacement method in AttributeService
to detect whether it is already being called inside a database transaction,
as the DatasetController actions create1 and datasetManagement already
initiate a transaction while the update action (the first to use the new AttributeService)
does not.
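A minimal sketch of that detection, assuming Yii 1.1's CDbConnection::getCurrentTransaction(); the helper methods stand in for the keyword SQL from the earlier sketch:

```php
<?php
class AttributeService extends CApplicationComponent
{
    // Sketch: reuse the caller's transaction when one is already open,
    // otherwise manage our own. getCurrentTransaction() returns null
    // when no transaction is active on the connection.
    public function replaceKeywords($datasetId, $keywordsCsv)
    {
        $ownTransaction = Yii::app()->db->getCurrentTransaction() === null
            ? Yii::app()->db->beginTransaction()
            : null;
        try {
            $this->deleteExistingKeywords($datasetId);        // illustrative helper
            $this->insertKeywords($datasetId, $keywordsCsv);  // illustrative helper
            if ($ownTransaction !== null) {
                $ownTransaction->commit();
            }
        } catch (Exception $e) {
            if ($ownTransaction !== null) {
                $ownTransaction->rollback();
            }
            throw $e;
        }
    }
}
```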

Also I had to fix a bunch of Javascript errors on _form1.

* Update API docs

* Make AttributeService an application component so it's created only once

* Enable merging secret variables from both Gitlab Groups and projects

This is triggered by the need for the variables in local.php.dist
to be specific to a contractor's fork.

This is only for dev environment. On CI, this (merging) is done automatically.

* Manage Google API PHP client with Composer

* Remove Google Yii PathAlias from the main config as it's now autoloaded

* Upgrade Google API client to latest with modern json-based credentials

The use of "Google_Auth_AssertionCredentials"
(that require a binary P12 private key) is deprecated

We now authenticate to the API the recommended way
using the service account json private key.
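A minimal sketch of that flow with the Composer-managed google/apiclient (v2); the key path and the Analytics scope are illustrative assumptions:

```php
<?php
require __DIR__ . '/vendor/autoload.php';

// Sketch: authenticate with the service-account JSON key instead of the
// deprecated Google_Auth_AssertionCredentials / P12 flow.
$client = new Google_Client();
$client->setAuthConfig('/path/to/service-account.json'); // file generated from the secret variable
$client->setScopes([Google_Service_Analytics::ANALYTICS_READONLY]);

$analytics = new Google_Service_Analytics($client);
```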

Added a functional test to verify the Google client authenticates correctly.

In order for the test to run on CI, I had to store the JSON private key
in GitLab secret variables (at project level) and modify the config_generate.sh script
to generate the file dynamically.

In generate_config.sh, we need to process the ANALYTICS_PRIVATE_KEY variable
separately from the other secret variables because it's multiline with special characters
and it would break when sourcing all the shell variables.

That's also why, in the .gitlab-ci.yml config, we also exclude it from the .secrets generation

* Dump env variables to console for debug purpose

* Dump more upstream env variables to console for debug purpose

* Add ANALYTICS_PRIVATE_KEY to config's environment in compose file

* Move environment for config container from dev compose to ci compose

* Remove all code related to Sphinx and ElasticSearch

They are not used, and even if, hypothetically, we later want to use Elasticsearch again,
a different approach is necessary because:
- Elastica is not installed through Composer
- The integration code does not follow modern PHP, OO, or Yii 2 good practices
- it relies on old versions of ES and of the PHP client (Elastica)

* Replace hand-made "bubble" sort with PHP's built-in sort function

Also add a functional test for the sorted news item on home page.

This is preliminary work before replacing Efeed rss extension with a similar
Composer package.
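For illustration, the built-in replacement boils down to something like this (the array shape and the 'date' field are illustrative):

```php
<?php
// Sketch: order news items newest first with PHP's built-in usort()
// instead of a hand-written bubble sort.
$newsItems = [
    ['title' => 'Older item', 'date' => '2018-01-10'],
    ['title' => 'Newer item', 'date' => '2018-09-02'],
];

usort($newsItems, function (array $a, array $b) {
    return strtotime($b['date']) <=> strtotime($a['date']);
});
```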

* Replace Efeed with Suin\RSSWriter using Composer, fix the RSS feed

as part of the work to move all dependencies to Composer:

rija#123

I've replaced old Efeed library with an RSS writer library from Packagist.org

Also, it turns out that the RSS feed in GigaDB was actually not implemented:

 /site/feed returned a valid RSS feed for a dummy example.

I fixed that.

The new RSS library is modern and has 100% test coverage.
However, it doesn't seem to be actively maintained.

I've also created a new service (NewsAndFeedsService) to manage all the features around the RssMessage and News tables
as well as the latest dataset to be notified in the homepage and in the RSS feed.

* Manage lessphp with composer, set vendor autoload for console commands

Because when the webapp starts we need to both run composer update
and then compile the Less files, I created a new ops script that does both.

The version of leafo/lessphp is pinned to 0.3.9 because newer versions
break API compatibility (BC) with the version of Bootstrap
still used in the admin side of the app and depended on by the site.less file.

* Configure Composer to use HTTPS when downloading from packagist.org

* Change setup and test runners for more flexibility and quicker execution

* Setup Opauth with Composer, fix affiliate login acceptance tests

As part of work on rija#123

to have all PHP dependencies managed by Composer, I have removed the files
for the Opauth library and its Services specific Strategies from the codebase.

Instead I added the required libraries to the composer.json.dist file.

The version of the code is slightly different, so main.php.dist had to be changed
to reflect the proper way of configuring Opauth.

Improved security by defining a stronger security salt for Opauth
and storing it as a GitLab secret variable.

Removed the long redundant actionChooseLogin in SiteController
and the corresponding view.

* Dissuade from editing the generated composer.json with a warning

* Remove more redundant code

* Replace Mailchimp client with new one installed with Composer

This is part of the work to manage all PHP dependencies with Composer:
rija#123

Also create a new service (NewsletterService.php) to manage the list
(subscribe and unsubscribe) instead of using the grab-all Utils.php class.

NewsletterService.php is a Yii application component, so it is configured
in local.php.dist.

The functional tests for the service require the developer to set up
a Mailchimp account and an email service set up with a catch-all email address.
Although Mailinator.com fits that constraint, it cannot be used as it is banned by Mailchimp.
The info about the dev account should be stored in the following GitLab secret variables
for the developer's fork:
MAILCHIMP_API_KEY
MAILCHIMP_LIST_ID
MAILCHIMP_TEST_EMAIL

* Make class name references consistent for NewsletterService

* Upgrade Yii from 1.1.16 to Yii 1.1.20 to prepare for PHP 7 migration

see all PHP 7 compatibility fixes in the releases after 1.1.16:

https://raw.githubusercontent.com/yiisoft/yii/1.1.20/CHANGELOG

* Upgrade PHP version for web application to PHP 7.0

It fixes the issues with:

session_regenerate_id(): Cannot regenerate session id - headers already sent

when running in PHP 7

See: https://forum.yiiframework.com/t/session-regenerate-id-cannot-regenerate-session-id-headers-already-sent/42512/22

This is a temporary fix (adding ob_start() in index.php).
It's likely a minor update to Yii (1.1.21?) will fix this properly.

Also, the PHP runtime for the tests is still PHP 5.6 for now,
as phpunit-mink breaks the coverage report when running on PHP 7.0

Also remove unused controller and console command code related to the old Mailchimp extension

* Abstract browser mediation in functional tests so it's easier to change

As part of the work to migrate to PHP 7.0, I found an issue with phpunit-mink.
It has been known to its developers since 2016 and the project hasn't had any activity since then.
Before I can change to a more modern browser automation framework,
I need to corral as much of the browser automation as possible into specific abstractions.

In this case, FunctionalTesting, which contains the phpunit-mink dependency
and common setup and teardown operations, becomes the parent of all my functional tests,
and the Browsers*Steps.php traits contain all the code related to interacting with the browser.

Also created a CommonDataProviders trait for common data providers used by several tests.

Frustratingly, the traits need to be required once in bootstrap.php
despite a PSR-0 autoloader being added for them in composer.json.dist

* Exclude protected/extensions from coverage: they are not first-party code

* Enable PHP 7.0 in tests

* Enable PHP 7.0 in CI

* Create separate CI job for acceptance tests and phpunit tests

* Fix a dependency between jobs in CI

* Tweak CI config

* Revert CI config temporarily to check performances

* Tweak functional test parent to execute its parent setup and teardown

* Remove redundant reports from CI and replace brittle coverage check code

* Make minor tweak to composer.json.dist

* Fix the coverage check issue where previous coverage was not saved

* Add Yii 2 as Composer package and rename config template for yiic.php

* Fix autoload issue with functional test supporting classes

The supporting classes don't use namespaces,
but I had erroneously associated namespaces with their paths in composer.json,
which prevented the autoloader from finding them correctly
(I temporarily worked around this by manually require_once-ing them in PHPUnit's bootstrap.php)

* Workaround an issue with phpunit-mink on PHP 7.0 and PHPUnit 5.7

On our codebase, where classes are loaded with Composer's PSR-0 autoloaders, it causes the PHPUnit code coverage report to fail to generate

The fix is temporary until a new release of phpunit-mink happens (they know about the issue and have a fix) or we move to another solution

* Workaround an issue with phpunit-mink on PHP 7.0 and PHPUnit 5.7

On our codebase, where classes are loaded with Composer's PSR-0 autoloaders, it causes the PHPUnit code coverage report to fail to generate

The fix is temporary until a new release of phpunit-mink happens (they know about the issue and have a fix) or we move to another solution

This fix is better than the previous attempt in the previous commit as it works and only requires changing the config file for PHPUnit.

* Add support for Yii 2.0 in docker-compose

* Add support for Yii 2.0 in docker-compose

missed the change to gitlab config

* Set up Yii 2.0 and Yii 1.1 so Yii 2.0  can be used within Yii 1.1 webapp

For detailed explanation see:
https://www.yiiframework.com/doc/guide/2.0/en/tutorial-yii-integration#using-both-yii2-yii1

The additional changes we needed to make for it to work in this codebase are (a sketch follows below):
 * Register the modified Yii class's autoload function in addition to yii\BaseYii's (see Yii.php:872)
 * In our modified Yii class, only require YiiBase.php if the class is not loaded yet, so it doesn't crash the phpunit tests (see Yii.php:9)
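A minimal sketch of that bridging Yii class, following the linked guide; the paths are illustrative assumptions:

```php
<?php
// Sketch of the custom Yii.php bridging class described in the linked guide.
$yii2path = '/var/www/vendor/yiisoft/yii2'; // illustrative path
$yii1path = '/opt/yii-1.1/framework';       // illustrative path

require $yii2path . '/BaseYii.php'; // Yii 2.x

// Only pull in Yii 1.1's YiiBase if it is not loaded yet, so this file
// does not crash the phpunit tests when they bootstrap the framework themselves.
if (!class_exists('YiiBase', false)) {
    require $yii1path . '/YiiBase.php'; // Yii 1.x
}

class Yii extends \yii\BaseYii
{
    // per the guide, the code of Yii 1.1's Yii class is copied into this body
}

Yii::$classMap = require $yii2path . '/classes.php';
// Register Yii 2's autoload function in addition to the Yii 1.1 one
spl_autoload_register(['Yii', 'autoload']);
```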

* Fix Js issue preventing claiming author on dataset claim popup

* add verbosity to debug coverage check

* Attempt to fix the coverage check not being cached

* Fix a couple of javascript issues

The one in new_main.php caused:

 [ERROR - 2018-10-13T05:32:18.898Z] Session [d9562850-cea8-11e8-8217-37d1863df633] - page.onError - msg: TypeError: undefined is not a function (evaluating 'jQuery('#keyword').autocomplete({'minLength':'2','source':[]})')

[ERROR - 2018-10-12T07:35:29.076Z] Session [6380d4c0-cdf1-11e8-b5ec-67db5ae89842] - page.onError - msg: TypeError: undefined is not a function (evaluating '$("#btnCreateAccount").tooltip({'placement': 'left'})')

the one in new_datasetpage.php caused:

[ERROR - 2018-10-12T07:35:29.076Z] Session [6380d4c0-cdf1-11e8-b5ec-67db5ae89842] - page.onError - msg: TypeError: undefined is not a function (evaluating '$("#btnCreateAccount").tooltip({'placement': 'left'})')

* Fix issues with image ALT attributes on public dataset view and homepage

part of the ticket:
rija#127

* Make font-icons and form fields more accessible on home, dataset, search

part of the work on ticket:
rija#127

using for diagnostics:
http://wave.webaim.org/

* Remove redundant inclusion of javascript libraries and code

* Replace http://cdn.bootcss.com with https://cdnjs.cloudflare.com for js

bootcss.com doesn't support HTTPS, so replace it with a CDN that supports it,
for improved security when loading third-party JavaScript libraries

* Defer javascript execution to improve page rendering

First had to fix existing JavaScript errors, ensuring that we load
recent versions of jquery, jquery-ui and bootstrap only once on each page.

Then added the "defer" attribute to the <script> tags for external libraries, including jQuery,
so they are executed only once the DOM is fully loaded, to avoid blocking while resources load.

see:
- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/script#attr-defer
- https://html.spec.whatwg.org/multipage/scripting.html#the-script-element

That means that all JavaScript code on the pages that depends on jQuery needs to be wrapped in

document.addEventListener("DOMContentLoaded", function(event) {
...
});

otherwise it will error, because jQuery is not yet loaded at the time the script is parsed and executed by the browser.
The DOMContentLoaded event is fired after all the deferred scripts have executed.

That works fine for code in <script>, but I needed:
(1) to unwrap scripts wrapped in clientScript->registerScript(), because that function insists on automatically writing jQuery to the page
(2) to deconstruct ajaxButton and ajaxLink into a Bootstrap button and a standalone inline JavaScript function, for the same reason as (1)
(3) to subclass the CJuiAutoComplete Yii widget so that its run() method doesn't use registerScript()

Those efforts have yielded a significant reduction in loading time on the home page, the search results page and, most of all, on the dataset page

* Enable profiling of the web application using xhgui

* Fix the error "headers already sent"

Whenever attempting to log in, I get the following 500 error:

2018/10/18 04:05:33 [error] [php] session_regenerate_id(): Cannot regenerate session id - headers already sent (/opt/yii-1.1/framework/web/CHttpSession.php:190)
Stack trace:
#0 /opt/yii-1.1/framework/web/auth/CWebUser.php(717): CDbHttpSession->regenerateID()
#1 /opt/yii-1.1/framework/web/auth/CWebUser.php(235): WebUser->changeIdentity()
#2 /var/www/protected/models/LoginForm.php(51): WebUser->login()
#3 /opt/yii-1.1/framework/validators/CInlineValidator.php(42): LoginForm->authenticate()
#4 /opt/yii-1.1/framework/validators/CValidator.php(201): CInlineValidator->validateAttribute()
#5 /opt/yii-1.1/framework/base/CModel.php(159): CInlineValidator->validate()
#6 /var/www/protected/controllers/SiteController.php(377): LoginForm->validate()
#7 /opt/yii-1.1/framework/web/actions/CInlineAction.php(49): SiteController->actionLogin()
#8 /opt/yii-1.1/framework/web/CController.php(308): CInlineAction->runWithParams()
#9 /opt/yii-1.1/framework/web/filters/CFilterChain.php(134): SiteController->runAction()
#10 /opt/yii-1.1/framework/web/filters/CFilter.php(40): CFilterChain->run()
#11 /opt/yii-1.1/framework/web/CController.php(1148): CAccessControlFilter->filter()
#12 /opt/yii-1.1/framework/web/filters/CInlineFilter.php(58): SiteController->filterAccessControl()
#13 /opt/yii-1.1/framework/web/filters/CFilterChain.php(131): CInlineFilter->filter()
#14 /opt/yii-1.1/framework/web/CController.php(291): CFilterChain->run()
#15 /opt/yii-1.1/framework/web/CController.php(265): SiteController->runActionWithFilters()
#16 /opt/yii-1.1/framework/web/CWebApplication.php(282): SiteController->run()
#17 /opt/yii-1.1/framework/web/CWebApplication.php(141): CWebApplication->runController()
#18 /opt/yii-1.1/framework/base/CApplication.php(185): CWebApplication->processRequest()
#19 /var/www/index.php(27): CWebApplication->run()

I had previously worked around it by adding this at the top of index.dev.php.dist:
ob_start();

but that didn't get to the root cause (and it has a performance impact, as revealed in xhgui), so in order to get a more precise error, I added:
header("Content-type: text/html");

in the code mentioned in the stack trace, but before the point where the error occurred. Then I got the more precise error:

2018/10/18 04:13:03 [error] [php] Cannot modify header information - headers already sent by (output started at /var/www/protected/config/main.php:1) (/var/www/protected/controllers/SiteController.php:373)
Stack trace:
#0 /opt/yii-1.1/framework/web/CController.php(308): CInlineAction->runWithParams()
#1 /opt/yii-1.1/framework/web/filters/CFilterChain.php(134): SiteController->runAction()
#2 /opt/yii-1.1/framework/web/filters/CFilter.php(40): CFilterChain->run()
#3 /opt/yii-1.1/framework/web/CController.php(1148): CAccessControlFilter->filter()
#4 /opt/yii-1.1/framework/web/filters/CInlineFilter.php(58): SiteController->filterAccessControl()
#5 /opt/yii-1.1/framework/web/filters/CFilterChain.php(131): CInlineFilter->filter()
#6 /opt/yii-1.1/framework/web/CController.php(291): CFilterChain->run()
#7 /opt/yii-1.1/framework/web/CController.php(265): SiteController->runActionWithFilters()
#8 /opt/yii-1.1/framework/web/CWebApplication.php(282): SiteController->run()
#9 /opt/yii-1.1/framework/web/CWebApplication.php(141): CWebApplication->runController()
#10 /opt/yii-1.1/framework/base/CApplication.php(185): CWebApplication->processRequest()
#11 /var/www/index.php(27): CWebApplication->run()

So there was a rogue first line in main.php.dist (the template for main.php) before the <?php tag.

Now it's fixed properly. Also fixed a missing ?> tag in the components files for good measure.
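For illustration, this is the kind of stray output that triggers it: any bytes before the opening <?php tag are sent as the response body, so later calls that modify headers fail. The config content shown is illustrative.

```php

<?php
// The blank line above the opening tag is already "output", so a later
// header() or session_regenerate_id() call fails with
// "headers already sent (output started at .../main.php:1)".
return array(
    'name' => 'GigaDB', // illustrative config content
);
```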

* Remove redundant calls to ob_end_clean()

* Tweak autoloading for performance and correctness

* Create a php.ini for each php version/environment combination

* Enable PHP 7.1 and keep legacy PHP short_open_tag on for now

* Add more tolerance for slow hardware when running acceptance tests

* Build a production docker image in staging deployment stage in CI

* Fix an issue in staging deploy stage on CI with image name

* Add steps for stage deploy job to connect to remote docker host

* exclude docker certs variables from being in .secrets file

* won't use DOCKER_HOST variable as it confuses CI docker operations

* Correct the way variables are fetched for TLSAUTH to docker

* add debug info for stage deploy CI job

* Attempt to get the written credentials to be valid (line breaks missing)

* Fix missing newlines when bash echoing gitlab variables for PEM files

Without double quotes, the newlines were treated by bash as separators.

see:
https://stackoverflow.com/questions/22101778/how-do-i-preserve-line-breaks-when-storing-a-command-output-to-a-variable-in-bas#22101842

* Connect GL private registry to staging docker and pull gigadb production

* Remove Elastic Search configuration from config files as not used

* Remove ob_start as the issue has been fixed properly

See commit c60901b

* Build nginx container and only share assets between web & app containers

In order to deploy to a production environment, both the nginx and php-fpm
containers need to be fully configured and as immutable as possible.
Also, we may have limited access to the deployment host (no ssh),
so as much of the config and application files as possible need to be baked into the containers.

Also, nginx needs to be able to read the static files as well as the assets folder,
while the Yii application needs to write to the assets folder,
so assets need to be a shared volume.

Because of all that, the official nginx container cannot be used as is;
instead we derive a new image from it (see Web-Dockerfile and docker-compose.build.yml)

* Rename nginx site conf to match the value of GIGADB_ENV on GitLab CI

* Fix typo in Opauth configuration

* Fix affiliate login acceptance tests issues with ORCID

* Configure YII_PATH to default to the vendor directory inside /var/www

It is more correct and simpler, and will allow removing unnecessary volume
bind-mounts from containers.

* Configure YII_PATH to default to the vendor directory inside /var/www

It is more correct and simpler, and will allow removing unnecessary volume
bind-mounts from containers.

Addendum from last commit: I missed the changes in .gitlab-ci.yml.

* Make naming and use of environments more consistent

* Attempt to deploy to the staging server

* Fix name of docker services to be deployed together on staging

* Use local instead of local-persist volume driver for staging

* Fix typo in config script

* Use staging ports in staging compose file

* Move the processing of staging variables after loading secrets

because some of the staging variables are defined in .secrets

* Expose docker-compose config in ci log for debug purpose

* Bring the containers down before deploying to staging for now

* Fix an error preventing the web container from being pushed to the registry

* Remove cap_drop to debug why web cannot see application container

* Make explicit the order of container startup for debugging

* Fix nginx startup error: could not build server_names_hash

you should increase server_names_hash_bucket_size: 64

because the server name for staging is the very long, automatically generated AWS domain name

* Fix nginx startup error: could not build server_names_hash

you should increase server_names_hash_bucket_size: 64

because the server name for staging is the very long, automatically generated AWS domain name

set that property to 128

* Remove capabilities,add restart policy to production services on staging

* Remove instruction for shutting down containers on deployment

* Remove change made for debugging

* Add certbot to the web container and configure ssl with LetsEncrypt

* Invoke certbot before instantiating web container

* copying letsencrypt config is done by config service, not webapp

so that we don't need to call the webapp service before building production container

It is also aimed at fixing the current build failure on CI

* Won't try to copy letsencrypt config in CI environment

* Deallocate TTY from docker-compose exec: not needed, causes error in CI

* Prevent the certs from appearing in the bash variables file

* Remove volume for Letsencrypt webroot path and add debug in CI config

* Print installed package in web container

* Add more debug info on web container failure installing some packages

* Add debug info to web container docker file

* Reorder certbot install instructions

* Remove cap_drop from web container temporarily for debugging

* Reintroduce a certbot service as it cannot run inside web container

this is because certbot needs to be used before we start the web container

* Build a config container for production

This is because certbot needs its configuration
as on staging/production the application directory is not available in a volume.

* Instantiate the web application before running certbot

because certbot needs to access its webroot path on the web server for validation
so the web server has to be running and be up-to-date

* Separate the nginx configs for http and https for flexibility on staging

The https config can only be enabled after certbot has run, but the web server
needs to run on http before certbot runs

* Add a step to generate the vendor php dependencies before stage build

* Explicitly call docker-compose up to force restarting the containers

* Change pipeline to enable distinct jobs for new cert and renew

The cert will be renewed during normal deployment.

Both jobs are manually triggered so that:
- they don't get accidentally blocked by Let's Encrypt for hitting rate limits
- the team keeps control over what gets pushed to staging

There's also a new release stage in the pipeline for live deployment

Finally, removed the dry-run flag for certbot

* Pull latest production_config image during staging deployment

otherwise some config (currently only certbot) on staging may get stale

* Correct the nginx HTTPs configuration to serve PHP code

* Reorder security/test stage and rename staging deploy actions

* Fix GitLab config

* Install php-libsodium

* Enable strong hashing for users password

This work includes (a minimal sketch follows below):

* widening the password column in the database to fit the new hashing algorithm's output
* password verification that works with both old and new hash algorithms until all users have changed their password
* password verification for the old hash algorithm now protected against timing attacks
* hardening of random password generation for new users, password resets, and password changes, using a CSPRNG (cryptographically secure pseudo-random number generator)
* simplifying authentication and user identity management by clearly using two distinct paths for direct authentication and for OAuth authentication

This is part of the work on hardening GigaDB from ticket: rija#126
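A minimal sketch of that approach using PHP's built-in password_hash/password_verify, hash_equals for the legacy comparison, and random_int as the CSPRNG; the legacy hash function is an illustrative placeholder, not the actual old algorithm:

```php
<?php
// Sketch only: not the exact GigaDB implementation.

// Placeholder standing in for whatever the old hashing algorithm was.
function legacyHash(string $password): string
{
    return sha1($password);
}

// Verify against the new algorithm first, then fall back to the legacy hash
// with a timing-safe comparison until the user's password has been re-hashed.
function verifyPassword(string $password, string $storedHash): bool
{
    if (password_verify($password, $storedHash)) {
        return true;
    }
    return hash_equals($storedHash, legacyHash($password));
}

// New hashes need the widened password column.
function hashPassword(string $password): string
{
    return password_hash($password, PASSWORD_DEFAULT);
}

// CSPRNG-backed random password for new users, resets and changes.
function generateRandomPassword(int $length = 16): string
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    $password = '';
    for ($i = 0; $i < $length; $i++) {
        $password .= $alphabet[random_int(0, strlen($alphabet) - 1)];
    }
    return $password;
}
```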

* Add a DB migration for password column and have CI job to run migrations

* Run database migration in non-interactive mode in CI deploy job

* Change the CI job name for releasing to live environment

* Remove white space and add TODO for candidate for delegation to library

* Fix /site/admin so it's only accessible to admins

* Add table for Yii migrations in the database schema

* Apply PSR2 PHP codestyleguide fixes to DatasetController

./bin/php-cs-fixer fix /var/www/protected/controllers/DatasetController.php --rules=@psr2 --verbose --show-progress=estimating --no-interaction

* Remove development PHP dependencies from production containers

* Restore database prior to acceptance tests run with less constraint

* Backup main database when running acceptance tests only on dev

* Remove hardcoded http mention so that image link can pick up https

* Reinstate database backup and restore when running acceptance tests

* Minify jquery-ui and enable gzip compression of static assets

* Revert the last change to acceptance tests re backup/restore

* Remove redundant code from dataset view template

* Long expiration date for very static assets

* Remove some redundant lines

* Change the File size conversion from metric to binary
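For illustration, the difference is which base the byte count is divided by: metric (SI) units use 1000, binary (IEC) units use 1024. A minimal sketch, with the function name and unit labels as illustrative assumptions:

```php
<?php
// Sketch: format a byte count using binary (1024-based) units
// instead of metric (1000-based) ones.
function formatBytesBinary(int $bytes, int $precision = 2): string
{
    $units = ['B', 'KiB', 'MiB', 'GiB', 'TiB'];
    $power = $bytes > 0 ? (int) floor(log($bytes, 1024)) : 0;
    $power = min($power, count($units) - 1);
    return round($bytes / (1024 ** $power), $precision) . ' ' . $units[$power];
}

// 1,500,000 bytes is "1.5 MB" in metric units but about "1.43 MiB" in binary units.
echo formatBytesBinary(1500000);
```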

* Configure Coveralls for coverage check

* Remove redundant call to php-coveralls in gitlab config

* Make CI env variables part of artefacts

* Change coverage_check to use the Coveralls API to test against the main branch

We want to know how the current coverage value compares to the coverage of the last commit from the main branch

We use Gitlab API to get the commit SHA for that commit and Coveralls API to retrieve the coverage value.

* Parameterize the main branch to compare the coverage with

* Parameterize the main branch to compare the coverage with

missed passing the variable in gitlab config for when running in CI

* Make coverage check more robust

By default, it uses the GitLab API to provide the reference coverage value.

If the token is not set (e.g. a contractor who didn't set up a GitLab account, or an open-source contributor), it falls back to Coveralls.io

If there is no access to coverage info for the main branch, it falls back to a default value.

* Acceptance tests for the dataset view page

* Implement the step definitions of Core scenario for dataset view feature

* Improve some of the step definitions for dataset view acceptance tests

* Remove an old, legacy javascript library

* Run webapp_setup on deploy staging job but not on the deployment target

the css/site.css needs to be baked in the container

* Introduce a helper function to convert http urls to https when possible

* Trim urls before converting them to https
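A minimal sketch of such a helper, including the trim; the function name is illustrative:

```php
<?php
// Sketch: trim a URL and upgrade its scheme from http to https.
function toHttps(string $url): string
{
    $url = trim($url);
    if (stripos($url, 'http://') === 0) {
        return 'https://' . substr($url, strlen('http://'));
    }
    return $url;
}

// Example:
// toHttps(' http://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js ')
// => 'https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js'
```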

* Implement more steps of the dataset view acceptance tests

* Implement test scenario for non-tabbed external links on dataset view

* Add Content Security Policy to allow http loading of JBrowse in iframe

* Implement test scenario for JBrowse and fixed other @ok tests

* Implement test scenario for call to actions on dataset view

Also, I had to change the id of the test user from 346 to 681 (and change all dependent tests)
because the dataset view test scenarios use the production-like test database dump, whose gigadb_user table
already has a lot of users.

* Implement test scenarios for History and Protocols.io dataset view tab

* Implement test scenario for Funding tab of dataset view

* Implement test scenario for 3D models tab on dataset view

* Implement test scenario for Code Ocean tab of dataset view

* Implement the most basic test scenario for the Files tab of dataset view

* Implement scenario for showing files tables settings on dataset view

* Implement test scenario for page size personalisation in dataset view

* Improve core test scenario by loading test data from sql

* Implement table settings of columns test scenario for dataset view

* Implement test scenario for files pagination on dataset view

* Implement test scenario for Sample call to actions on dataset view

* Implements test scenarios for the samples tab on dataset view

It is a similar implementation to the already implemented Files tab scenarios,
so I could generalise the step definitions to work for both tabs.

However, I ran into the issue that the UI language on the table settings
is the same for both tabs, which is good for the user.

But it caused the tests for the Files tab to fail, because the code for both tabs is in the same HTML page,
with JavaScript used for routing to one or the other.
The Mink traversal functions are triggered by the first element they encounter with the form element name
(in this case the sample form elements), causing test failures with the following error:

"Element is not currently visible and may not be manipulated"

The solution, which also happens to be good practice, is to ensure all form elements have an "id" attribute
and that this id is unique. Those ids will be the targets of the step definitions.
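A minimal sketch of a step definition targeting an element by such a unique id with Mink; the trait, property, and id value are illustrative assumptions:

```php
<?php
// Sketch: target the samples-tab page-size control by its unique id, so the
// hidden files-tab element sharing the same form element name is never matched.
trait SamplesTabSteps
{
    /** @var \Behat\Mink\Session assumed to be set up by the parent test class */
    protected $session;

    public function changeSamplesPageSize($size)
    {
        $select = $this->session->getPage()->find('css', '#samples-page-size'); // unique id, illustrative
        $select->selectOption($size);
    }
}
```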

* Fix the File class to not raise an exception when the size is negative

* Hide curation log partial when creating brand new dataset object

This is a way around a 500 error because $dataset_id doesn't exist yet.

Also it seems it doesn't make sense to have it upon creation of the initial dataset object.

* Extract public action of DatasetController into PublicDatasetController

this is part of the work to refactor the fat controller DatasetController

rija#125

* Add application config to the artefacts to save after test job run

* Fix controller id's case as it breaks on case-sensitive OSes (linux)

* Fix config file controller id case

like last commit, but missed one file

* Fix public dataset views controller id case

like last commit, but missed one directory

* Implement acceptance tests for loading the dataset form and mint DOI

* Implement test scenario for keyword in dataset admin screen

* Implement test scenario for dataset view url to redirect and fix it

The new design of GigaDB broke the feature I had implemented in:

#41

I discovered that while migrating my CasperJS tests into acceptance tests
to complete the test harness for my refactoring work in

rija#125

* Fix layout of dataset form and allow testing on development container

* Extract admin actions from DatasetController into AdminDatasetController

* Remove javascript tag from test scenario as it's not necessary

* Fix crash when creating new curation log because id is null

* Fix link missed when creating AdminDatasetController

* Add explanation to the password column migration and allow reverting

* Add instructions for database migrations in README

* Move actions for submission and upload into DatasetSubmissionController

* Update urls to reflect move from Dataset to DatasetSubmissionController

These are URLs I missed in the initial work of moving submission actions into DatasetSubmissionController.

It will also fix a 500 error when connecting to the private dataset URL
for a dataset created through the wizard,
because the wizard doesn't set the publisher_id property of the dataset and the view tries to
show the publisher name without checking whether it is set.

* Resettle view actions from PublicDataset to Dataset

now that the submission actions have been taken out of Dataset and into DatasetSubmissions

Dataset can now only have the actions doable by the public (dataset view)

* Install APCu on application container, set up Yii CApcCache data caching

* Remove redundant cache settings that prevented the APCu cache from functioning

* Replace submitter email dataset view web component with refactored code

* Replace dataset accessions web component with refactored code

* Add documentation to the DatasetSubmitter interface

* Correct the type hint for a DOI in StoredDatasetSubmitter's constructor

* Regenerated the PHP apidocs

* Replace dataset page's main section web component with refactored code

* Add to CachedDatasetMainSection superclass and interface for caching

In addition to consistent interaction with and management of the cache,
it adds an invalidation mechanism that will expire the cached content for
a dataset if there's a new dataset_log entry for that dataset
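A minimal sketch of that invalidation with Yii 1.1's CDbCacheDependency, so a new dataset_log row for the dataset changes the dependency and expires the cached content; the names and TTL are illustrative assumptions:

```php
<?php
// Sketch: cache the rendered main section for a dataset, and let a new
// dataset_log entry for that dataset invalidate it.
$dependency = new CDbCacheDependency(
    'SELECT COUNT(*) FROM dataset_log WHERE dataset_id = :id'
);
$dependency->params = [':id' => $datasetId];

$cacheKey = 'dataset_main_section_' . $datasetId;
$content = Yii::app()->cache->get($cacheKey);
if ($content === false) {
    $content = $storedDatasetMainSection->render(); // illustrative storage-backed component
    Yii::app()->cache->set($cacheKey, $content, 3600, $dependency); // 1 hour TTL, illustrative
}
```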

* Fix unit tests for *DatasetAccessions classes to be more concise

A good unit test only tests the edge, the interface, and the direct dependency when test doubles are needed.
So fix these two tests so they don't stub/mock the entire dependency chain.

* Modernise *DatasetAccessions with good practice from *DatasetMainSection

cache abstraction and invalidation, as well as using the dataset ID instead of the identifier (DOI) as the main input variable

Also abstracted the DAO code for retrieving the DOI into the superclass DatasetComponents
(as that code is used in all the Stored* web component classes)

* Modernise *DatasetSubmitter with good practice from *DatasetMainSection

cache abstraction and invalidation, as well as using the dataset ID instead of the identifier (DOI) as the main input variable

also removed a redundant _doi variable in StoredDatasetAccessions

and made sure a dataset id (and not a doi/identifier) is passed when all these classes are used in DatasetController

* Add missing tests and remove dead code

* Improve management of cache invalidation in dataset web components

Also added the storage and cache components for the presentation of dataset relations.

* Fix incorrect phpdocs again

* Replace dataset page's related datasets component with refactored code

* Remove dead code from dataset view and dataset controller

* Replace dataset page's publications links component with refactored code

* Replace dataset page's projects links component with refactored code

* Replace dataset page's projects external links component with refactored code

* Replace dataset page's projects keywords component with refactored code

* Replace dataset page's projects history component with refactored code

* Replace dataset page's files tab component with refactored code

* update API doc, correct some phpdocs

* Remove code redundant since refactoring Files tab on dataset view

* Make cache invalidation and TTL configurable

* Change cache invalidation query to use both dataset_log and curation_log

* Make citation links configurable in config file

* Replace dataset page's samples tab component with refactored code

* Replace dataset page's Funding tab component with refactored code

* Clean up the tests, remove some inconsistencies

* Fix and secure /search

Fix the issue where clicking the search button with no keyword throws an error.

Add a rate limit to /search to mitigate DoS attacks

* Use dev branch of phpunit-mink to get the latest distributed version

* add debug log for test failure in CI

* configure nginx on CI to allow burst requests for functional tests

* Configure DAST (Dynamic Application Security Testing) on Gitlab CI

this is the bare minimum config: unauthenticated, no active attacks against the application

the config is straight out of the documentation, except that the job is run manually.

* Add before and after script for DAST job to deactivate the global ones

* Keep the secrets-sample file up-to-date

* Terraform configuration to instantiate Centos7 server on AWS for Docker

* Ansible script can deploy fail2ban and docker-ce and use terraform host

Progress with deployment automation for the Docker host servers using
Terraform and Ansible.

The Ansible script can use a dynamic inventory that connects to the Terraform state for staging.
Ansible can also use a manual IP address for production.

Modified the AWS EC2 Terraform script to include an allow-all egress rule, as
Terraform removes the implicit default AWS egress rule of the same permissions

* Ansible script can setup systemd service and certs for docker daemon

* Fix systemd role and add postgresql role for ansible script

* Fix certificates generation for docker daemon

* Ansible script installs postgres, sets up user and database from bootstrap

* Switch staging to the instance created with ansible/terraform

a prerequisite was to update GitLab CI variables with the CA, cert and key PEM files generated with Ansible

* Update url to match new staging server

* Use IP address for remote docker host variable to match certificate

* Update job for deploying the certificate the first time to match the regular job

* Added the vault to gitignore

* Make Ansible post certs to GitLab and set vault vars per environment

make sure you have your GitLab private token in a file (by default ~/.gitlab_private_token)

and the vault file expected at ops/infrastructure/group_vars/all/vault (not versioned, to be communicated out-of-band alongside its password)

* Remove *_tlsauth_* variables from .secrets

* publish private and public ip of server to Gitlab in Ansible script

* Update documentation for managing infrastructure and platform

* Prevent staging and production PEM certs from ending up in .secrets
only1chunts changed the title from "accessions link to relevant external DBs" to "sample attribute accession links to relevant external DBs" on Aug 6, 2021
kencho51 pushed a commit to kencho51/gigadb-website that referenced this issue Nov 25, 2021
On production environments, the target directory for the reference data feeds
in the production Dockerfile for the worker container service was incorrect.

There was an error in the logs:

```
> docker --tlsverify -H=x.x.x.x:2376 exec rija-gigadb-website_fuw-worker_1 bash -c "journalctl -xn"
-- Logs begin at Thu 2020-08-13 09:52:46 UTC, end at Thu 2020-08-13 11:05:28 UTC. --
Aug 13 11:05:28 b5538c73b3d8 php[21]: gigascience#13 /app/vendor/yiisoft/yii2/base/Controller.php(180): yii\base\InlineAction->runWithParams(Array)
Aug 13 11:05:28 b5538c73b3d8 php[21]: gigascience#14 /app/vendor/yiisoft/yii2/console/Controller.php(179): yii\base\Controller->runAction('exec', Array)
Aug 13 11:05:28 b5538c73b3d8 php[21]: gigascience#15 /app/vendor/yiisoft/yii2/base/Module.php(528): yii\console\Controller->runAction('exec', Array)
Aug 13 11:05:28 b5538c73b3d8 php[21]: gigascience#16 /app/vendor/yiisoft/yii2/console/Application.php(180): yii\base\Module->runAction('queue/exec', Array)
Aug 13 11:05:28 b5538c73b3d8 php[21]: gigascience#17 /app/vendor/yiisoft/yii2/console/Application.php(147): yii\console\Application->runAction('queue/exec', Array)
Aug 13 11:05:28 b5538c73b3d8 php[21]: gigascience#18 /app/vendor/yiisoft/yii2/base/Application.php(386): yii\console\Application->handleRequest(Object(yii\console\Request))
Aug 13 11:05:28 b5538c73b3d8 php[21]: gigascience#19 /app/yii(23): yii\base\Application->run()
Aug 13 11:05:28 b5538c73b3d8 php[21]: gigascience#20 {main}.
Aug 13 11:05:28 b5538c73b3d8 php[21]: 2020-08-13 11:05:28 [3] backend\models\MoveJob (attempt: 1, pid: 21) - Error (0.023 s)
Aug 13 11:05:28 b5538c73b3d8 php[21]: > yii\base\ErrorException: file_get_contents(/var/www/files/data/filetypes.json): failed to open stream: No such file or directory
```
@rija
Contributor

rija commented Nov 7, 2022

@only1chunts can't remember why we wanted this, so closing while he's thinking about it.

rija closed this as completed on Nov 7, 2022
@only1chunts
Member Author

This is a valid feature request, so reopening the ticket.
The link that we want to see on the dataset pages is WITHIN the Sample and/or File tabs. I think an image will help clarify what this ticket is actually asking for: the accession number highlighted in the image below is within the sample table but should still be hyperlinked to the relevant page in the SRA:
[screenshot: a GigaDB dataset page Samples table, with the accession number highlighted]

@luistoptal
Collaborator

luistoptal commented Nov 20, 2024

@only1chunts If sample attributes need to be converted to a link, I would need a specific logic that allows me to determine the link from the value. With that, I could just take the value, apply that logic in the frontend or backend, and wrap it in a link.

Incidentally, I think the process to create samples is not very user-friendly. Essentially, I figured it involves creating a species or writing down the taxon id of an existing species, then creating a sample for the species taxon id, and adding a list of "valid" sample attributes, but there is no indication of what sample attributes are valid.

If this is the flow editors use to create samples, it might benefit from a simple help box with information (or a more complex option that extends the Dataset:Samples form with attribute fields similar to the Dataset:files form)

@only1chunts
Member Author

An improved samples admin page has long been on the wish list: #54
With regards to the logic you mention, there are a number of related tickets (e.g. see #905) about using bioregistries (or similar), where we enable the link via the prefix. So where we write something like "BioSample:SAMEA7631915", the prefix "BioSample" can be looked up in Bioregistry to discover that it's present, and therefore we can use the URL "https://bioregistry.io/biosample:/SAMEA7631915" to resolve it to the final page "https://www.ebi.ac.uk/biosamples/samples/SAMEA7631915"
I guess this means we either need a list of common or expected prefixes that we know are in Bioregistry, or we need a tool to look up prefixes on the fly.
Issue #824 also includes more details on link prefixes

@luistoptal
Collaborator

luistoptal commented Nov 20, 2024

@only1chunts Looking at the tickets, this seems a bit complicated, with lots of different things to do, but what I get is that this would not solve all the issues; it could be a start, though.

What is unclear to me is: what is the "source" in the http://gigadb.gigasciencejournal.com/adminLinkPrefix/admin table, and does the source depend on some user setting?

@only1chunts
Member Author

The Prefix table is not exactly the right place to be looking for the CURIEs used by things like Bioregistry. It was/is our attempt at internalising some of the things we used regularly, but really it's not the right approach: we shouldn't be maintaining that sort of mapping internally, so we may want to think about replacing the external links prefix table with Bioregistry or something similar.
Given the complicated nature of this ticket, it might be best to leave it to be taken care of when the #824 ticket is worked on, as they are very closely related.
Instead, your time is probably better spent on #54 (working out how to enable easier input and editing of the sample attributes).
