Dev info
NOTE: Use Python 3.7! (Other versions may result in errors.) Some sort of Python environment management tool may be your friend (e.g., pyenv, virtualenv).
- Clone the RTX repo and navigate into it (`cd RTX`)
- Run `pip install -r requirements.txt`
- Give your public RSA key to another ARAX dev for authentication
  - If you don't have an RSA key already, you'll need to generate one
  - The dev will need to put your public key on `araxconfig.rtx.ai` and `arax.ncats.io`
- Navigate to `RTX/code/ARAX/test` and run `pytest -v`
  - This will trigger automatic downloads of `configv2.json` and all necessary databases
  - Note that this can take over an hour depending on your internet connection, as some of the databases that will be downloaded are quite large
NOTE: This section is slightly outdated and doesn't seem to work as-is; it needs updating.
If you are running ARAX_query and friends on your local machine and are generating nice JSON, but you want to be able to visualize these JSON beasts through the UI, here's how you can do that (at least it worked for me on my Windows box):
Step 1) Install one more needed module for CORS support, and make sure connexion[swagger-ui] is installed:
pip3 install flask_cors
pip3 install connexion[swagger-ui]
Step 2) Add a custom endpoint destination:
cd code/UI/interactive
cp config.js.example config.js
edit config.js to contain:
config.base = 'http://localhost:5001/';
config.baseAPI = config.base + "api/arax/v1.2";
Step 3) Start the Flask server (blocks this shell and runs until ^C)
cd code/UI/OpenAPI/python-flask-server
python3 -m openapi_server
Step 4) Point your web browser to the UI files on your local filesystem, something like:
file://G:/Repositories/GitHub/RTX/code/UI/interactive/index.html
or
file://G:/Repositories/GitHub/RTX/code/UI/interactive/index.html?r=1
(the r number is the response id that you want to view in the UI)
By changing the r number, you should be able to view the messages you are creating and storing via ARAX_Query. (Make sure you don't have return(store=false) in your DSL, otherwise there's no r number.) In theory, launching queries from the GUI should work too, but I haven't properly tested it.
Care should be taken that the code never just dies, because then there is no feedback about the problem in the API/UI. Use the `ARAXResponse.error` mechanism to log informative messages throughout your code (see the section below for more details):
- `DEBUG`: Only something an ARAX team member would want to see
- `INFO`: Something an API user might like to see to examine the steps behind the scenes. Good for innocuous assumptions.
- `WARNING`: Something that an API user should sit up and notice. Good for assumptions with impact.
- `ERROR`: A failure that prevents fulfilling the request. Note that logging an error may not halt processing; several can accumulate. If you need processing to terminate, either `return` or `raise` an `Exception`, depending on where this error occurs.
An `ARAXResponse` object is passed into each ARAX module's `apply()` method; among many things, this object serves as ARAX's log. You may either use this same response object throughout your module by passing it to different methods/classes as needed, OR you may instantiate new response objects and then merge them with the response object that is ultimately returned from the module's `apply()` method.
- Major methods (not little helper ones that can't fail) and calls to different ARAX classes should always:
  - Either instantiate a new `ARAXResponse` object or take one as an input parameter
  - Log with `response.debug`, `response.info`, `response.warning`, and `response.error`
  - Place returned data objects in the `response.data` envelope (a `dict`)
  - Return that response object
- Callers of major methods should call with `result = object.method()`
  - Then immediately merge the new result into the active response (if they are separate response objects)
  - Then immediately check `result.status` to make sure it is `'OK'`, and if not, return the response or take some other action to handle the method call failure
- The class may store the `ARAXResponse` object as an object variable and share it among its methods that way (this may be convenient)
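To make this pattern concrete, here is a minimal sketch. The module, method, and parameter names are hypothetical, and the exact `ARAXResponse` import path and `merge()` call should be checked against the real class, but the logging, `data`, and `status` conventions follow the guidelines above:

```python
from ARAX_response import ARAXResponse  # import path may differ in the actual codebase


class ExampleModule:  # hypothetical module, purely for illustration
    def apply(self, response: ARAXResponse, parameters: dict) -> ARAXResponse:
        response.debug("Starting ExampleModule.apply()")
        result = self._do_major_step(parameters)   # call a major method
        response.merge(result)                     # immediately fold its log into the active response
        if result.status != 'OK':                  # immediately check status
            return response                        # the error is already logged; just bail out
        response.data['example_results'] = result.data['example_results']
        response.info("ExampleModule finished successfully")
        return response

    def _do_major_step(self, parameters: dict) -> ARAXResponse:
        response = ARAXResponse()                  # major methods get their own response object
        if 'required_thing' not in parameters:
            # Logging an error does not halt processing by itself, so return explicitly
            response.error("Parameter 'required_thing' was not provided")
            return response
        response.data['example_results'] = [parameters['required_thing']]
        return response
```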
We generally manage all work (bug fixes, features, and enhancements) via GitHub issues. The general workflow for working on a GitHub issue is as follows:
- Create a branch for your issue (typically off of the `master` branch, but possibly another branch depending on your particular issue)
- Implement the necessary code changes for your issue in your branch
  - Ensure your commit messages are under 70 characters and always reference the issue in your commit (e.g., with '#1000', if your issue number was 1000)
  - It is generally ok to push commits to your branch that leave the system in a broken state, unless the branch is shared with other devs who do not expect the system to be broken (but you should never push breaking changes to `master`!)
  - If you are working on this issue for an extended period of time, you will likely want to periodically merge `master` (or whatever your parent branch was) into your branch (see the section on Branches and Merging)
- It is generally a good idea to add one or more pytests (see the Testing section) that test out your fix/changes, but please ensure the test completes speedily (within ~10 seconds) or mark it with `@pytest.mark.slow`!
- Once you believe you are done implementing changes, merge `master` into your branch and run the ARAX Pytest suite
  - If any tests are failing, you need to figure out why and address them
- Once all tests are passing, you can make a Pull Request to merge your branch into `master` (or whatever your parent branch was)
  - Be sure to reference the issue from your PR (same way as in commit messages)
  - Once you become more experienced, you may omit creating a PR and instead directly merge your branch into `master`, but it is best practice to issue a PR
- Next, add the `verify in next deployment` tag to your issue
- Once your PR is merged, please delete your branch (assuming you aren't using it for any other issues)
- After `master` has been rolled out to one of our ARAX endpoints (either test, beta, or production - see the Different Instances section), verify that your changes are working as expected on that endpoint
- After that, post a message in the GitHub issue letting whoever submitted the issue know that the changes are complete (and which endpoint(s) they have been rolled out to)
- If the person who submitted the issue is satisfied, they or you can close the issue
- In your code, do not assume a particular location for the "current working directory". In general, try to use `os.path.abspath` to find the location of `__file__` for your module and then construct a relative path to find other ARAX/RTX files/modules (see the sketch below).
- Always run the ARAX Pytest suite before pushing to `master`; do not push your changes to `master` if any pytests are failing!
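For example, a minimal sketch of that path-construction pattern (the relative path below is purely illustrative; adjust it for where your module actually lives in the repo):

```python
import os

# Directory containing this module, independent of the caller's working directory
THIS_DIR = os.path.dirname(os.path.abspath(__file__))

# Build paths to other ARAX/RTX files relative to this module's location
# (the number of '..' hops here is illustrative only)
config_dbs_path = os.path.join(THIS_DIR, "..", "..", "config_dbs.json")
```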
The ARAX Pytest suite lives at `RTX/code/ARAX/test/`. The README in that directory provides details on how to use the test suite, but some examples are provided below as well.
To run all (default) tests, `cd` to that folder and run:
pytest -v .
To run the tests in a specific file:
pytest -v <file.py>
To run a specific test:
pytest -v <file.py> -k <a test like test_example_3>
To run the slow tests:
pytest -v --runslow
To run the 'external' tests:
pytest -v --runexternal
To run all tests (including slow and external ones):
pytest -v --runslow --runexternal
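For reference, marking a test as slow (as mentioned in the workflow section above) looks roughly like this; the `slow` marker and the `--runslow`/`--runexternal` options are presumably wired up in the suite's existing `conftest.py`, and the test bodies here are just placeholders:

```python
import pytest


def test_fast_example():
    # Quick tests (roughly 10 seconds or less) run by default with `pytest -v .`
    assert 1 + 1 == 2


@pytest.mark.slow
def test_slow_example():
    # Tests marked 'slow' are only run when the suite is invoked with --runslow
    assert sum(range(1000)) == 499500
```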
The /asyncquery endpoint is a bit hard to test because you need to have a callback receiver that is Internet accessible or accessible to ARAX. There is a crude callback receiver available on ARAX itself.
How to use such a system is documented here: https://github.com/RTXteam/RTX/issues/1756
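For local experimentation, a callback receiver can be as simple as the following sketch. This is not the crude receiver mentioned above; it just assumes that /asyncquery delivers the finished response as a JSON POST to the callback URL you supplied, and that this URL (port 8080 here) is somehow reachable from ARAX:

```python
import json

from flask import Flask, request  # Flask is already used by the ARAX OpenAPI server

app = Flask(__name__)


@app.route("/callback", methods=["POST"])
def receive_callback():
    # Dump the posted response to a file so it can be inspected later
    payload = request.get_json(force=True)
    with open("asyncquery_callback.json", "w") as f:
        json.dump(payload, f, indent=2)
    return {"status": "received"}, 200


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```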
- "our" prod: arax.ncats.io
- "our" test: arax.ncats.io/test
- "our" beta: arax.ncats.io/beta
- ITRB production: arax.transltr.io
- ITRB test: arax.test.transltr.io
- ITRB CI/staging: arax.ci.transltr.io
See also this google doc with all endpoints and the branches they run.
The Jenkins dashboard for ITRB builds is here: https://deploy.transltr.io/.
ARAX has one config file that does not live in the RTX repo; it is called `config_secrets.json`. The 'master copy' of this file lives on `araxconfig.rtx.ai` at `/home/araxconfig/config_secrets.json`. ARAX developers' public RSA keys need to be listed in `authorized_keys` on this instance; this allows `config_secrets.json` to be automatically downloaded to their machines when queries are run (it auto-refreshes every 24 hours).
If desired, you may override `config_secrets.json` by creating a (local) copy of it at `RTX/code/config_secrets_local.json`, which you can tweak to contain whatever usernames/passwords you need. If a `config_secrets_local.json` file is present, it will always be used instead of the regular `config_secrets.json`.
NOTE: You should never push `config_secrets.json` or share its contents in a public space! (i.e., beyond our team)
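As a rough illustration of the precedence described above (ARAX's own config machinery handles this internally and may differ in detail), the lookup amounts to something like:

```python
import os

RTX_CODE_DIR = "/path/to/RTX/code"  # adjust to wherever your clone lives

local_path = os.path.join(RTX_CODE_DIR, "config_secrets_local.json")
default_path = os.path.join(RTX_CODE_DIR, "config_secrets.json")

# A local override, if present, always wins over the auto-downloaded copy
secrets_path = local_path if os.path.exists(local_path) else default_path
print(f"Using secrets file: {secrets_path}")
```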
The ARAX database config file lives in the RTX repo at `RTX/code/config_dbs.json`. This file specifies which versions of our various databases should be used. The `ARAXDatabaseManager` automatically takes care of downloading/removing databases from developers' machines as needed, according to what is specified in `config_dbs.json`.
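If you want to see which database versions are currently pinned, you can simply inspect the file (no assumptions about its internal structure are made here):

```python
import json

# Adjust the path to wherever your RTX clone lives
with open("/path/to/RTX/code/config_dbs.json") as f:
    config_dbs = json.load(f)

print(json.dumps(config_dbs, indent=2))
```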
- `production` and `itrb-test` should not be committed to, save for ITRB-specific changes
- `master` is to be merged into `production` and/or `itrb-test`, not the other way around
To merge `master` into `mybranch` (replace with your own branch name), do the following:
git checkout master
git pull origin master
git checkout mybranch
git pull origin mybranch
git merge --no-ff origin/master
[if any merge conflicts: fix them and commit]
git push origin mybranch
To merge `mybranch` into `master`, do the following:
WARNING: Be very careful when merging anything into `master`! Be sure your changes are fully tested, and always first merge `master` into your branch and test before doing this.
git checkout mybranch
git pull origin mybranch
git checkout master
git pull origin master
git merge --no-ff origin/mybranch
[if any merge conflicts: fix them and commit]
git push origin master
See this gist
- Install `gh` via these directions.
- Check out the PR locally: `gh pr checkout <PR number>`
- Edit, check, commit, etc.
- If everything looks good:
  - `git branch` to see what `<branch name>` you are on
  - `git checkout master` to switch to the master branch
  - `git pull origin master` to make sure master is up to date
  - `git checkout <branch name>` to switch back to the PR branch
  - `git merge --no-ff origin/master` to merge master into the PR branch
  - Fix any merge conflicts
  - `git checkout master` to switch to master
  - `git merge --no-ff origin/<branch name>` to merge the PR branch into master
- To switch back to master: `git checkout master`
To change the Neo4j password from the Neo4j Browser:
:server change-password
Or, to reset it from the command line:
sudo service neo4j stop
sudo rm -rf /var/lib/neo4j/data/dbms
sudo -u neo4j neo4j-admin set-initial-password PASSWORD
sudo service neo4j start
To set the MySQL password for the 'rt' user (used for the RTXFeedback database):
$ sudo mysql
> GRANT ALL ON RTXFeedback.* TO 'rt'@'localhost' IDENTIFIED BY 'PASSWORD';
If rejected, use:
$ sudo mysql
> SET PASSWORD FOR 'rt'@'localhost' = 'PASSWORD';
Note: The synonymizer should be automatically downloaded into your dev environment upon running the pytest suite (or `ARAX_database_manager.py`). But if you need to build one yourself for some reason, this section explains how to do so.
How to build from scratch:
git pull
If your `kg2_node_info.tsv`, `kg2_equivalencies.tsv`, and `kg2_synonyms.json` files are not already up to date (or you haven't created them yet), you should first do:
cd $RTX/code/ARAX/NodeSynonymizer
python3 dump_kg2_node_data.py
(This pulls down a lot of data over the network and takes 10+ minutes depending on network speed.)
Then build the NodeSynonymizer database (WARNING: the build process needs 25 GB of free RAM to work!):
cd $RTX/code/ARAX/NodeSynonymizer
python3 sri_node_normalizer.py --build
python3 node_synonymizer.py --build --kg_name=both
python3 node_synonymizer.py --lookup=rickets --kg_name=KG2
NOTE: If during a branch switch/merge/commit you get a complaint about `kg2_node_info.tsv`, `kg2_equivalencies.tsv`, or `kg2_synonyms.json` being untracked files that would be overwritten, it is safe to delete them. After building the new NodeSynonymizer database, you will not need those files around anymore.