Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add python bindings #35

Open
wants to merge 56 commits into
base: master
Choose a base branch
from
Open

Add python bindings #35

wants to merge 56 commits into from

Conversation

danpf
Copy link
Collaborator

@danpf danpf commented Oct 15, 2019

this may not ever be merged, but here I recreate the basics of mmtf-py using this library only.

it is currently significantly faster than the mmtf-py implementation.

time to load a single mmtf file 1000x
cpp bare 0.29s
this library 0.44s
python og 4.34s

the only 'question' that is really left to solve is how to handle the templating of the mapdecoder. But I think that will not be too difficult. Once i get some time to look at it.

not to get ahead of myself -- but TODO:

  • tests for round tripping (implement pytest CI as well)
  • add mapDecoder bindings
  • add python comparator classes + tests
  • add to pip
  • tests for bindings

Copy link
Collaborator

@gtauriello gtauriello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like where this is going. Thanks for the efforts. The TODO-list seems reasonable to me.

I had some concerns originally of whether the binding should be an external tool in a separate repository which simply uses mmtf-cpp. But actually, the way you are implementing it is very nicely separated anyways and it makes sense the way it is included here. So as long as everything stays in the separate "bindings" folder and is only included if requested in cmake, it's all good.

.gitmodules Outdated Show resolved Hide resolved
CMakeLists.txt Outdated
# set(PYTHON_INCLUDE_DIR_A python_include_A)
# set(PYTHON_INCLUDE_DIR_B python_include_B)

target_include_directories(mmtf_py PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/include)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Picky observation: there's a mix of tabs and spaces here. It would be preferred to stick to one style at least within the same file.

@danpf
Copy link
Collaborator Author

danpf commented Oct 15, 2019

@gtauriello thanks for your comments! I think it will be a relatively minor addition to the library when it's all said and done, hopefully totaling in at under 1000 lines (of course there's a lot of python init boilerplate that baloons that), but it's all quite simple.

@danpf
Copy link
Collaborator Author

danpf commented Mar 26, 2020

@gtauriello and @speleo3 could you review this when you get a chance? I think this is ready for a 1-4 week trial run in master, and then once we are happy with it I can publish it to pip.

what are your thoughts?

Copy link
Collaborator

@speleo3 speleo3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to split this into several pull requests

Can you add a README for the Python usage? It's not obvious what the suggested usage pattern is, the tests are the only hints.

The wrapping looks tedious and very explicit, I assume there is no way to do it automated? I hope this complexity is no burden for the long-term maintainability of mmtf-cpp.

[submodule "submodules/msgpack-c"]
path = submodules/msgpack-c
url = https://github.com/msgpack/msgpack-c.git
[submodule "submodules/Catch2"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catch2 listed twice?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will fix this once the other two PRs are merged

.gitmodules Outdated
Comment on lines 16 to 18
[submodule "submodules/--force"]
path = submodules/--force
url = https://github.com/msgpack/msgpack-c.git
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

submodules/--force? Looks like an accident

.travis.yml Outdated
addons: *linux64
- os: linux
compiler: clang
addons: *linux64
- os: linux
compiler: gcc
env: ARCH=x86 CMAKE_EXTRA=-DHAVE_LIBM=/lib32/libm.so.6
env: ARCH=x86 CMAKE_EXTRA=-DHAVE_LIBM=/lib32/libm.so.6 TEST_COMMAND=$TRAVIS_BUILD_DIR/ci/build_and_run_tests.sh
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TEST_COMMAND redundant with global value?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TEST_COMMAND environment variable is a way to run two different tests. I could not find a better way to run two different test suites. If you have any examples of repos that do this in a nicer way, I would be happy to learn a more efficient way and refactor this.

in this case TEST_COMMAND is not redundant because the env: block is overwriting the global value.

.travis.yml Outdated
compiler: gcc
addons: *linux64cpp17py36
dist: bionic
env: TEST_COMMAND=$TRAVIS_BUILD_DIR/ci/build_and_run_python_tests.sh CC=gcc
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a global TEST_COMMAND_PYTHON and use

      env : TEST_COMMAND=$TEST_COMMAND_PYTHON

And is CC needed? Looks redundant with compiler: gcc

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tested this, and apparently... yes it is required for some reason, even though you would think otherwise..

CMakeLists.txt Outdated Show resolved Hide resolved
include/mmtf/binary_decoder.hpp Outdated Show resolved Hide resolved
include/mmtf/binary_decoder.hpp Outdated Show resolved Hide resolved
@@ -163,7 +163,7 @@ struct StructureData {
std::string title;
std::string depositionDate;
std::string releaseDate;
std::vector<std::vector<float> > ncsOperatorList;
std::vector<std::vector<float>> ncsOperatorList;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is C++11 syntax. As far as I know this project is still C++03 (but you have my vote for dropping that).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing you meant to comment this somewhere else?

self.sourcedir = os.path.abspath(sourcedir)


class CMakeBuild(build_ext):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the advantages of using cmake here? Can't setuptools build an extension directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could, but we already have cmake baked into this repo and it was relatively simple to set it up to use cmake. If you are aware of some pitfalls of using this type of workflow (setup.py -> cmake) I'm open to changing it, but I found this method relatively painleses.

long_description=open("README.md").read(),
packages=find_packages("python_src", exclude=["tests", "python_src/tests"]),
package_dir={"": "python_src"},
ext_modules=[CMakeExtension("mmtf_cppy/")],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name mmtf_cppy is a bit odd. How about dropping the y?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's possible we could just kill the python implementation with this one too and just use mmtf. but that would make them incompatible with eachother

@danpf
Copy link
Collaborator Author

danpf commented Mar 26, 2020

@speleo3 Thank you for taking the time to review! I'll split this out into 3 requests and then I can address your concerns

@gtauriello
Copy link
Collaborator

Thanks to both of you for your efforts here and sorry if I've been quiet lately. At SWISS-MODEL, we are currently involved in some modeling efforts for SARS-CoV-2 which makes it hard for me to look at this code. That being said, I'll try to contribute what I can...

@@ -1,57 +1,96 @@
---
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I 'linted' this file using yamllint, sorry for the noise

@danpf
Copy link
Collaborator Author

danpf commented Mar 26, 2020

Can you add a README for the Python usage? It's not obvious what the suggested usage pattern is, the tests are the only hints.

done

The wrapping looks tedious and very explicit, I assume there is no way to do it automated? I hope this complexity is no burden for the long-term maintainability of mmtf-cpp.

I actually did use https://github.com/RosettaCommons/binder to generate the initial bindings.

but in this case binder is not a great fit because

  1. msgpack complicates things due to its variant typing system
  2. binder bindings were slower because they did not destroy the c++ data, they returned copies/lists.

it was actually very easy to do most of the bindings. char related data was the only difficult thing to use.

@danpf
Copy link
Collaborator Author

danpf commented Apr 30, 2020

reminder bump!
(if you're still busy w/ covid stuff don't feel obligated to respond) just making sure this isn't completely forgotten (like i just did for a month)

@danpf
Copy link
Collaborator Author

danpf commented Oct 21, 2020

Hi all,
sorry to bump again, but I also forgot about this (and the other PRs) for a few months. Hopefully everyone is healthy and doing well!

@gtauriello
Copy link
Collaborator

Ciao all. I am so sorry that I left this pending for so long. I now merged the other 2 PRs and did a 3rd one for the segfault issue (as @speleo3 had suggested).

Also: honestly, I simply don't find time for this project any more. Can one of you @danpf or @speleo3 take it over? I think I can pass admin rights to one (or both) of you for this...

@danpf for the code itself:
I think all 5 TODO items at the start of this PR are done, right?
Once the other PR is merged (I need a review there), you will have to merge master into this branch to resolve merge conflicts.
Other than that it seems to work just fine...

Comments on the README:

  • can you add some text that "pip install -r requirements.txt" is needed before "pip install ." (also can't the requirements be handled by pip somehow?)
  • can you add a short description on what the example code is expected to do?

@gtauriello
Copy link
Collaborator

@danpf One more thing for after the PR and master merges. Can you add an item to the CHANGELOG about the python bindings?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants