performance improvements #60

marcelotduarte · 2016-02-16T07:19:24Z

With this speedup, the improvement is significant when a document has many
pages with PNG images.
As an example, I used an old dual-core machine and 104 PNG images (16MB per image) to
generate a PDF with 1 image per page. Before this patch it took 1 minute
and 28 seconds. After this patch, it takes 6 seconds!
Yes, 6 seconds, 14.7x faster!


vadmium · 2016-02-16T09:58:07Z

fpdf/fpdf.py

+            if PY3K:
+                if not isinstance(s, bytes):
+                    s = s.encode('latin1')
+                s += b"\n"


According to the readme, this project supports Python 2.5, which I suspect would choke when compiling this syntax.

I had no idea about 2.5 here. I have an app using this on py2.7 and py3.4. Do you have a suggestion?

If I am right about s always being str at this point, change it to

    s += "\n"
    if(self.state == 2):
        self.pages[self.page]["content"] += s
    else:
        if PY3K:
            s = s.encode('latin1')
        self.stream.write(s)

Otherwise, maybe you could use the b() function from the py3k module.

You are right!

vadmium · 2016-02-16T10:07:43Z

The general idea sounds good to me. Writing into BytesIO or similar should be well optimized, and concatenating strings in Python performs badly in this sort of scenario.
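
To illustrate the point above, here is a minimal sketch (not the code from this patch) comparing repeated concatenation with writing into an `io.BytesIO` buffer. The function names and chunk sizes are made up for the illustration; both functions produce identical output, and only the way the buffer grows differs.

```python
import io
import timeit


def build_with_concat(chunks):
    """Accumulate output by repeated concatenation (copies the buffer each time)."""
    out = b""
    for chunk in chunks:
        out += chunk + b"\n"
    return out


def build_with_bytesio(chunks):
    """Accumulate output by writing into an in-memory binary stream."""
    buf = io.BytesIO()
    for chunk in chunks:
        buf.write(chunk)
        buf.write(b"\n")
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical workload: many small chunks, as when emitting PDF content line by line.
    chunks = [b"x" * 256] * 500
    assert build_with_concat(chunks) == build_with_bytesio(chunks)
    for fn in (build_with_concat, build_with_bytesio):
        seconds = timeit.timeit(lambda: fn(chunks), number=5)
        print(fn.__name__, round(seconds, 4))
```

The gap between the two generally widens as the accumulated output grows, which would be consistent with the image-heavy documents reported in this thread; actual timings depend on the interpreter and the workload.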

marcelotduarte · 2016-08-02T17:22:58Z

@reingart: Is this patch good to be merged?

performance

aussig · 2019-01-10T08:59:57Z

Thank you for this performance work. Swapping in this version reduced the build time for a 112MB, 228 page PDF containing over 4500 images from 40 minutes to 30 seconds - an awesome improvement.

alallier · 2021-03-28T04:25:10Z

This offers a huge performance benefit. Any chance this will get merged?

alexp1917 · 2021-03-28T18:58:48Z

@alallier you may want to check out #171 and also check out this link to see that the fpdf2 fork uses a bytearray instead of a string (and is otherwise maintained):

https://github.com/PyFPDF/fpdf2/blob/master/fpdf/fpdf.py#L178
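
For readers who do not follow the link, the bytearray idea can be sketched roughly as below. This is an assumption-laden illustration, not fpdf2's actual code: the `OutputBuffer` class and its `out` method are hypothetical names chosen for this example.

```python
class OutputBuffer:
    """Hypothetical helper that collects output lines in a mutable bytearray."""

    def __init__(self):
        self.buffer = bytearray()

    def out(self, s):
        # Accept text or bytes; encode text once, at the point of writing.
        if isinstance(s, str):
            s = s.encode("latin1")
        # bytearray grows in place, so appending does not copy the whole buffer.
        self.buffer += s
        self.buffer += b"\n"


if __name__ == "__main__":
    buf = OutputBuffer()
    buf.out("%PDF-1.3")
    buf.out(b"raw bytes")
    print(len(buf.buffer))
```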

alallier · 2021-03-30T15:41:15Z

Thanks for the link. I noticed shortly after that this project was abandoned, which is unfortunate.

vadmium reviewed Feb 16, 2016

simplification as suggested by vadmium

3470b96

marcelotduarte mentioned this pull request Apr 17, 2018

Speed and memory issues #93

Open

Merge remote-tracking branch 'upstream/master' into performance

2dc2d4d

performance


vadmium Feb 16, 2016

marcelotduarte Feb 16, 2016

vadmium Feb 16, 2016

marcelotduarte Mar 20, 2016
