You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I just stumbled over this project. Excellent work!
IMO you should publish this in an academic paper, since many scholars (in DH and beyond) will be interested...
I have a few suggestions on making this more readily available to users. (And I'd be happy to at least partially volunteer, if you are interested.)
requirements.txt
To maximize utilty for other users, it's necessary to minimize the assumptions/constraints of your explicit dependencies. Currently, that list requires the exact version you happened to have installed when you did pip freeze – but your actual requirements will almost always be broader. Also, it's not necessary to list dependent packages there. E.g. instead of ...
But most of these are already required by Mask-RCNN anyway (see next point), so perhaps this list could be drastically reduced to just:
mask-rcnn>=2.1
pandas
requests
image
Mask-RCNN redistribution
It's not good practice to dump an open source repo's code into your own, for several reasons:
you loose its history and source annotation (you have to dig it up externally)
you loose upstream changes/fixes (you have to sync them in manually)
you loose the possibility to (conveniently) contribute your own changes to upstream
Instead, you should make mask-rcnn an external dependency, like the others. If you do need to make changes of your own, then do that in a Github fork, which you then can either integrate here as a git submodule or reference in requirements.txt via the GH URL of your fork.
Currently, IIUC, the only change you made is your custom.py and Jupyter notebooks, right?
separate code and data
You (thankfully) provided all your raw and intermediate data files along with the converter scripts and tools, so one can re-produce your work. But for re-use it's usually better to strictly separate data and code files. At least in the directory structure, but preferably also with distinct repos (pointing to the data repo in a submodule only).
distribute on PyPI
The most straightforward and visible distribution option for Python projects is the Python Packaging Index. To publish there, all you have to do is register for an account, bring your repo into shape (following the Python packaging guidelines – most notably, create a setup.py with entry_points and linking to requirements.txt), then use twine to build and upload your package.
The text was updated successfully, but these errors were encountered:
@bertsky Wow, thank you for all the feedback! I've been putting off cleaning up this repo and merging in some updates, so this is just the motivation I needed -- and changes I hadn't considered.
I'll begin working on some of these updates, but of course would not say no to any volunteer work :)
but of course would not say no to any volunteer work :)
unfortunately I just ran out of GPU computing quota, so I can only commence testing your project in April.
AFAICT the base revision you used is matterport/Mask_RCNN@a49160a. There have been a couple of important fixes on master since then. But strictly, one should compare the impact of these on your training+evaluation. I could do that myself (first reproducing the current state of affairs, then moving to mask-rcnn master and repeating everything), but for you it's probably a lot less effort.
I just stumbled over this project. Excellent work!
IMO you should publish this in an academic paper, since many scholars (in DH and beyond) will be interested...
I have a few suggestions on making this more readily available to users. (And I'd be happy to at least partially volunteer, if you are interested.)
requirements.txt
To maximize utilty for other users, it's necessary to minimize the assumptions/constraints of your explicit dependencies. Currently, that list requires the exact version you happened to have installed when you did
pip freeze
– but your actual requirements will almost always be broader. Also, it's not necessary to list dependent packages there. E.g. instead of ......you could just write
keras<2.3
.But most of these are already required by Mask-RCNN anyway (see next point), so perhaps this list could be drastically reduced to just:
Mask-RCNN redistribution
It's not good practice to dump an open source repo's code into your own, for several reasons:
Instead, you should make mask-rcnn an external dependency, like the others. If you do need to make changes of your own, then do that in a Github fork, which you then can either integrate here as a git submodule or reference in
requirements.txt
via the GH URL of your fork.Currently, IIUC, the only change you made is your
custom.py
and Jupyter notebooks, right?separate code and data
You (thankfully) provided all your raw and intermediate data files along with the converter scripts and tools, so one can re-produce your work. But for re-use it's usually better to strictly separate data and code files. At least in the directory structure, but preferably also with distinct repos (pointing to the data repo in a submodule only).
distribute on PyPI
The most straightforward and visible distribution option for Python projects is the Python Packaging Index. To publish there, all you have to do is register for an account, bring your repo into shape (following the Python packaging guidelines – most notably, create a
setup.py
withentry_points
and linking torequirements.txt
), then usetwine
to build and upload your package.The text was updated successfully, but these errors were encountered: