Added a SimpleBunch C extension type for speeding the most common operations up #5

Open
wants to merge 9 commits into
base: master

Conversation

dsuch
Contributor

@dsuch dsuch commented Mar 2, 2012

Hi David,

care to take a look at this pull request?

The thing is that there are some core operations, the most common ones, that I'd like to speed up. If you compile the package now and run this gist for instance

https://gist.github.com/1961505

you'll notice that for what I consider the most commonly used access patterns, which are

import bunch

d = {1:11, 2:22, 3:33}
sb = bunch.SimpleBunch(d)

sb.aa = 'bb'
'aa' in sb
del sb['aa']

well, the difference is an order of magnitude. bunch.SimpleBunch is about 10-15x faster than bunch.Bunch and some 20-30% slower than the plain dict.
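For readers without access to the gist, a comparison of this kind can be sketched with the standard timeit module. This is only an illustration under stated assumptions: AttrDict below is a hypothetical pure-Python stand-in for a Bunch-like class, not bunch.Bunch or bunch.SimpleBunch, so the numbers it prints say nothing about the figures quoted above.

```python
import timeit


class AttrDict(dict):
    """Hypothetical stand-in for a pure-Python, Bunch-like class."""

    def __getattr__(self, key):
        # Only reached when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value


def dict_ops():
    # The same access pattern, expressed on a plain dict.
    d = {1: 11, 2: 22, 3: 33}
    d['aa'] = 'bb'
    'aa' in d
    del d['aa']


def attr_ops():
    # The access pattern from the snippet above, on the stand-in class.
    sb = AttrDict({1: 11, 2: 22, 3: 33})
    sb.aa = 'bb'
    'aa' in sb
    del sb['aa']


def compare(number=10000):
    """Return total seconds spent in each variant over `number` runs."""
    return {
        'dict': timeit.timeit(dict_ops, number=number),
        'AttrDict': timeit.timeit(attr_ops, number=number),
    }


if __name__ == '__main__':
    for name, seconds in compare().items():
        print(name, round(seconds, 4))
```

To time the real classes, swap AttrDict for bunch.Bunch or bunch.SimpleBunch in attr_ops once the package is built.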

There's a caveat though: the reason I'm calling it a 'simple' Bunch is that only getattr and setattr are implemented, and even then setattr assumes everything should be stored as the SimpleBunch's keys. In other words, I haven't seriously played with the whole getattribute machinery at the C API level. That might be a field for future contributors to explore, and then bunch.Bunch could possibly become a subclass of bunch.SimpleBunch, but that's for the future.
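To make the caveat concrete, here is a pure-Python model of what "setattr stores everything as keys" implies. It is a sketch based only on the description above, not the C code from this pull request; the class name SimpleBunchModel is hypothetical.

```python
class SimpleBunchModel(dict):
    """Hypothetical Python model of the 'simple' semantics described above."""

    def __getattr__(self, key):
        # Python calls this only after normal attribute lookup has failed,
        # so real attributes such as dict methods still win on reads.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        # Every assignment becomes a dict key, whatever the name is.
        self[key] = value


sb = SimpleBunchModel()
sb.aa = 'bb'          # stored as sb['aa']
sb.keys = 'a value'   # stored as sb['keys'], the method is untouched
```

In this model sb['keys'] holds the assigned value while sb.keys is still the dict method, which is the kind of edge case the fuller getattribute handling would have to deal with.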

The whole thing is imported optionally: if the extension isn't available, bunch will define SimpleBunch as an alias for the regular Bunch.
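The optional import can be pictured roughly as follows. This is a sketch, not the code in this pull request: the module name _hypothetical_simple_bunch and the HAVE_C_EXTENSION flag are made up for illustration, and Bunch here is a minimal stand-in so the snippet runs on its own.

```python
class Bunch(dict):
    """Minimal stand-in for the pure-Python Bunch (illustration only)."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value


try:
    # Hypothetical name for the compiled extension module.
    from _hypothetical_simple_bunch import SimpleBunch
    HAVE_C_EXTENSION = True
except ImportError:
    # No compiled extension available: fall back to an alias.
    SimpleBunch = Bunch
    HAVE_C_EXTENSION = False
```

Callers can then use SimpleBunch unconditionally and get the faster type only where the extension was built.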

You might want to have a look at '_simple_bunch_systems' in setup.py and add some other systems that are likely to have a C compiler handy. This of course assumes a C compiler will always be available on any such system, but I hope this isn't that bad an assumption.
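A check of this kind might look like the sketch below. The contents of the tuple and the helper name wants_c_extension are assumptions made for illustration; the actual value of '_simple_bunch_systems' in setup.py may well differ.

```python
import platform

# Hypothetical allow-list; the real '_simple_bunch_systems' may differ.
_simple_bunch_systems = ('Linux', 'Darwin', 'FreeBSD')


def wants_c_extension(system=None, allowed=_simple_bunch_systems):
    """Return True if the C extension should be attempted on `system`."""
    if system is None:
        # platform.system() returns names such as 'Linux' or 'Windows'.
        system = platform.system()
    return system in allowed
```

setup.py could then pass the extension to setup() only when wants_c_extension() is true, and ship the pure-Python package otherwise.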

What do you think of it?

Cheers!

@dsuch
Contributor Author

dsuch commented Mar 5, 2012

OK, I've added some more to it but I'll stop at that :-)

What I did was to make the C implementation of getattr and setattr actually match that in Python. That means the code is now about 4-5 times slower than built-in dictionaries, yet it's still 4-5 times faster than the pure-Python Bunch implementation.
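For context, the Python-level semantics being matched are roughly of the following shape. This is a sketch written from my understanding of how a Bunch-style class usually behaves, so treat the details as an assumption rather than a copy of bunch.Bunch; the class name BunchLikeModel is hypothetical.

```python
class BunchLikeModel(dict):
    """Hypothetical model: real attributes take priority over keys."""

    def __getattr__(self, key):
        # Reached only when normal lookup fails; fall back to the keys.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        try:
            # Succeeds only for real attributes (methods, instance attrs).
            object.__getattribute__(self, key)
        except AttributeError:
            # Not a real attribute: store it as a dict key.
            self[key] = value
        else:
            # A real attribute exists: set it as an attribute, not a key.
            object.__setattr__(self, key, value)


b = BunchLikeModel()
b.x = 1       # no attribute named 'x', so this becomes b['x']
b.keys = 5    # 'keys' is a real attribute, so no key is created
```

The extra lookup on every assignment is one plausible reason for the slowdown relative to the first, simpler version.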

I think the code can be merged in, although I certainly wouldn't make it the default for now. Let people use it and maybe spot things I've overlooked. Let there be more feedback.

Cheers!

@DannyGoodall

Can I put in a request that these C extensions are evaluated and perhaps incorporated? I love the code style improvements Bunch provides - especially when dealing with JSON-like structures. But some high-level speed testing using MongoDB through pymongo shows that Bunch s l o w s things down significantly.

Cheers.

@dsc
Owner

dsc commented Sep 20, 2024

I personally think this is very cool, even if it's 12y old. Any new comment?

@dsc dsc added the interesting Cool, but pending discussion label Sep 20, 2024
3 participants