Slow for big entries (SOLVED) #29

Open
GoogleCodeExporter opened this issue Oct 30, 2015 · 7 comments

Comments

@GoogleCodeExporter

What steps will reproduce the problem?
1. cat file | pyp "len(pp)"         # for a file with 500000 numbers

What is the expected output? What do you see instead?
the expected output is 500000

The problem is in the method flatten_list(self, iterables):
it is recursive and does too many list manipulations.

I've rewritten it:

    def flatten_list(self, iterables):
        '''
        returns a list of strings from nested lists
        @param iterables: nested list to flatten
        @type iterables: list<str>
        '''
        out = []
        # explicit stack of [iterable, position, length] frames replaces recursion
        stack = [[iterables, 0, len(iterables)]]
        while stack:
            curIter, pos, limit = stack[-1]
            if pos == limit:
                # current level is exhausted, go back to its parent
                stack.pop()
                continue
            stack[-1][1] += 1
            item = curIter[pos]
            if type(item) not in [str, PypStr]:
                # nested iterable: descend into it
                stack.append([item, 0, len(item)])
            else:
                out.append(item)

        return out
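
For anyone who wants to try the rewritten method outside of pyp, here is a minimal standalone sketch of the same logic as a plain function. `PypStr` below is only a stand-in (assumed here to be a `str` subclass), not pyp's real class, and this version pushes and pops at the end of the list, which does not change the result.

```python
class PypStr(str):
    """Stand-in for pyp's PypStr (assumption: a plain str subclass)."""


def flatten_list(iterables):
    """Flatten nested lists of strings without recursion."""
    out = []
    # Each frame is [iterable, next position, length].
    stack = [[iterables, 0, len(iterables)]]
    while stack:
        cur_iter, pos, limit = stack[-1]
        if pos == limit:
            # This level is done, resume the parent level.
            stack.pop()
            continue
        stack[-1][1] += 1
        item = cur_iter[pos]
        if type(item) not in [str, PypStr]:
            # Nested iterable: descend into it.
            stack.append([item, 0, len(item)])
        else:
            out.append(item)
    return out


nested = ["a", ["b", [PypStr("c"), "d"]], [], ["e"]]
flat = flatten_list(nested)
print(flat)  # ['a', 'b', 'c', 'd', 'e']
```

Because there is no recursion, nesting depth is not bounded by the interpreter's recursion limit. Items that are neither strings nor sized containers (an integer, for example) would raise a TypeError at `len(item)`, just as in the method above.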

Today is the first day I've used pyp. It's great, but I've found this issue
and I think it would be great to make it faster. Could anyone test this method
and maybe patch the source?

Thanks,
C

Original issue reported on code.google.com by deepbit on 16 Sep 2014 at 5:33

@GoogleCodeExporter

awesome! can you post some speed tests? we'll test it and put it in if it works as 
expected. We're doing a bunch of stuff next month.

thanks!

Toby

Original comment by [email protected] on 16 Sep 2014 at 6:03

@GoogleCodeExporter

$ for i in {0..100000}; do echo $i; done > /tmp/numbers
$ time cat numbers | pyp "len(pp)"
100001

real    1m27.045s
user    1m26.166s
sys     0m0.606s

(AFTER CODE PATCH)
$ time cat numbers | pyp "len(pp)"
100001

real    0m0.927s
user    0m0.871s
sys     0m0.050s


Best
C

Original comment by deepbit on 16 Sep 2014 at 9:12
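
The timings above go through the whole pyp command. To look at the flattening cost in isolation, something like the following sketch could be used. Both functions are hypothetical illustrations written for this comparison: `flatten_concat` is a deliberately naive recursive version that rebuilds its output list at every step, not pyp's original code, and `flatten_stack` is a generic stack-based flatten in the spirit of the patch.

```python
import time


def flatten_concat(items):
    """Naive recursive flatten (illustrative only, not pyp's code)."""
    out = []
    for item in items:
        if isinstance(item, str):
            out = out + [item]  # copies the whole list each time
        else:
            out = out + flatten_concat(item)
    return out


def flatten_stack(items):
    """Iterative flatten using an explicit stack of iterators."""
    out = []
    stack = [iter(items)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, str):
                out.append(item)
            else:
                # Descend; the parent iterator resumes later where it stopped.
                stack.append(iter(item))
                break
        else:
            # Iterator exhausted without descending: this level is done.
            stack.pop()
    return out


def timed(fn, data):
    """Return (result, elapsed seconds) for a single call of fn(data)."""
    start = time.perf_counter()
    result = fn(data)
    return result, time.perf_counter() - start


lines = [str(i) for i in range(10000)]
slow_result, slow_s = timed(flatten_concat, lines)
fast_result, fast_s = timed(flatten_stack, lines)
print("concat: %.4fs  stack: %.4fs" % (slow_s, fast_s))
```

The absolute numbers depend on the machine and are not comparable to the `time` output above; the point is only how the two approaches scale as the input grows.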

@GoogleCodeExporter

Because I haven't written complex PYP commands, I am not sure whether the patch 
is completely bug free, but I am pretty sure you can test it quickly.

Best
C

Original comment by deepbit on 16 Sep 2014 at 9:15

@GoogleCodeExporter

wow, truly impressive...100 times faster.  We'll test this and get back to 
you.

thanks again!


t

Original comment by [email protected] on 16 Sep 2014 at 9:23

@GoogleCodeExporter

testing this now. looks ok so far...going to release a big update to pyp 
including this...do you have any further revisions?

thanks!

toby

Original comment by [email protected] on 7 Feb 2015 at 12:17

@GoogleCodeExporter

it looks like this works great for lists...it increases speed by 100x...but 
line-by-line speed is still the same...let me know if you have any ideas...

cheers,

toby

Original comment by [email protected] on 7 Feb 2015 at 1:09

@GoogleCodeExporter

here is the latest if interested.  pretty close to releasing it.

Original comment by [email protected] on 7 Feb 2015 at 1:45

Attachments:
