Atomic nibble instead of mutex #1601

Open
wants to merge 5 commits into
base: master

betatim
Member

@betatim betatim commented Jan 30, 2017

NibbleStorage uses a big (bad) mutex. This is pretty slow. This PR removes that and instead uses an array of atomic bytes.
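
To make the idea concrete, here is a minimal sketch of a nibble counter built on atomic bytes. It assumes two 4-bit counts per byte, a saturating maximum of 15, and a compare-and-swap retry loop in place of a lock. `AtomicNibbleTable`, `add`, and `get` are hypothetical names chosen for illustration; this is not the code in this PR.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only, not khmer's NibbleStorage.
// Two 4-bit counters share each atomic byte (assumed layout:
// even buckets in the low nibble, odd buckets in the high nibble).
class AtomicNibbleTable {
public:
    explicit AtomicNibbleTable(std::size_t n_buckets)
        : bytes_((n_buckets + 1) / 2) {
        for (auto& b : bytes_) {
            b.store(0, std::memory_order_relaxed);
        }
    }

    // Increment the counter for `bucket`, saturating at 15.
    // Returns true if the counter was incremented.
    bool add(std::size_t bucket) {
        std::atomic<std::uint8_t>& cell = bytes_[bucket / 2];
        const unsigned shift = (bucket % 2) * 4;
        const std::uint8_t mask = static_cast<std::uint8_t>(0x0F << shift);
        std::uint8_t old_byte = cell.load(std::memory_order_relaxed);
        for (;;) {
            const std::uint8_t nibble =
                static_cast<std::uint8_t>((old_byte & mask) >> shift);
            if (nibble == 15) {
                return false;  // already saturated, nothing to write
            }
            const std::uint8_t new_byte = static_cast<std::uint8_t>(
                (old_byte & ~mask) | ((nibble + 1) << shift));
            // On failure, old_byte is refreshed with the current value
            // and the loop retries, so a concurrent update to the
            // neighbouring nibble is never overwritten.
            if (cell.compare_exchange_weak(old_byte, new_byte,
                                           std::memory_order_relaxed)) {
                return true;
            }
        }
    }

    // Read the current 4-bit count for `bucket`.
    std::uint8_t get(std::size_t bucket) const {
        const unsigned shift = (bucket % 2) * 4;
        const std::uint8_t byte =
            bytes_[bucket / 2].load(std::memory_order_relaxed);
        return static_cast<std::uint8_t>((byte >> shift) & 0x0F);
    }

private:
    std::vector<std::atomic<std::uint8_t>> bytes_;
};
```

The retry loop is what replaces the mutex in this sketch: two threads touching different nibbles of the same byte can both succeed, at the cost of an occasional retry.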

  • Is it mergeable?
  • make test Did it pass the tests?
  • make clean diff-cover If it introduces new functionality in
    scripts/ is it tested?
  • make format diff_pylint_report cppcheck doc pydocstyle Is it well
    formatted?
  • Did it change the command-line interface? Only backwards-compatible
    additions are allowed without a major version increment. Changing file
    formats also requires a major version number increment.
  • For substantial changes or changes to the command-line interface, is it
    documented in CHANGELOG.md? See keepachangelog
    for more details.
  • Was a spellchecker run on the source code and documentation after
    changes were made?
  • Do the changes respect streaming IO? (Are they
    tested for streaming IO?)

lib/storage.hh Outdated
@@ -393,7 +403,7 @@ public:

 Byte ** get_raw_tables()
 {
-    return _counts;
+    return (Byte **)_counts;
Member Author

Completely evil :-/

Member Author

Not sure what to do about this. What is the use case for get_raw_tables() beyond nosy devs wanting to peek inside?

@ctb
Member

ctb commented Jan 31, 2017 via email

@betatim
Member Author

betatim commented Jan 31, 2017

Seems like this was the first time this came up: #667. From what I gather from the discussion there, get_raw_tables for anything using <8 bits per bucket does not return what they were expecting to get. Same goes for the Node* classes, no?

I don't think we can use the buffer interface to fake the "round up to nearest byte".

Should we move it up the inheritance tree to expose it only for Count* classes?
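
A small sketch of why the raw table is misleading when buckets are packed: with two 4-bit counts per byte, a consumer that reads one count per byte sees neither the right values nor the right number of entries. `unpack_nibbles` is a hypothetical helper, and the layout (even bucket in the low nibble, odd bucket in the high nibble) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expand a table that packs two 4-bit counts
// per byte into one count per byte.
// Assumed layout: even buckets in the low nibble, odd in the high.
std::vector<std::uint8_t> unpack_nibbles(
        const std::vector<std::uint8_t>& packed, std::size_t n_buckets) {
    std::vector<std::uint8_t> counts(n_buckets);
    for (std::size_t i = 0; i < n_buckets; ++i) {
        const std::uint8_t byte = packed[i / 2];
        counts[i] = (i % 2 == 0)
            ? static_cast<std::uint8_t>(byte & 0x0F)
            : static_cast<std::uint8_t>(byte >> 4);
    }
    return counts;
}
```

Under that assumed layout, three buckets holding counts 1, 2, 3 are stored as the two raw bytes 0x21 and 0x03, which is not what a caller expecting one byte per bucket would want to see.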

@betatim betatim force-pushed the feature/atomic-nibble branch 2 times, most recently from 5229519 to f422e07 January 31, 2017 17:57
@standage
Member

> Should we move it up the inheritance tree to expose it only for Count* classes?

I think this is reasonable.

@betatim betatim force-pushed the feature/atomic-nibble branch 2 times, most recently from a870206 to 641436f February 1, 2017 15:16
@betatim
Member Author

betatim commented Feb 1, 2017

get_raw_tables has been moved and tests adjusted.

Ready for review! @luizirber, @camillescott, @standage, or @ctb

@codecov-io

codecov-io commented Feb 14, 2017

Codecov Report

Merging #1601 into master will decrease coverage by <.01%.
The diff coverage is 52.94%.

@@            Coverage Diff            @@
##           master   #1601      +/-   ##
=========================================
- Coverage   70.11%   70.1%   -0.01%     
=========================================
  Files          66      66              
  Lines        8906    8877      -29     
  Branches     3009    2999      -10     
=========================================
- Hits         6244    6223      -21     
- Misses       1040    1041       +1     
+ Partials     1622    1613       -9
Impacted Files                  Coverage Δ
khmer/_khmer.cc                 57.1% <ø> (ø) ⬆️
lib/hashtable.hh                82.6% <ø> (-0.38%) ⬇️
khmer/_cpy_smallcountgraph.hh   52% <ø> (-2.77%) ⬇️
lib/storage.cc                  47.88% <0%> (ø) ⬆️
lib/hashgraph.hh                70.96% <100%> (+0.96%) ⬆️
lib/storage.hh                  86.86% <57.14%> (-2.83%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6dd8430...f47c07c.

Use individual bytes that can be updated atomically instead of
mutexes.
Node* or SmallCount* objects pack things into individual bytes
which makes the raw table pointers unsuitable for numpy.frombuffer
4 participants