Releases · jsvine/pdfplumber

21 Dec 14:06

jsvine

v0.6.0

7163d29

v0.6.0

See CHANGELOG.md for a full list of additions, changes, and fixes. In some (hopefully) rare cases, this version may introduce breaking changes, which is why we're bumping to v0.6.0. Highlights from the changelog include:

Upgrade pdfminer.six from 20200517 to 20211012; see that library's changelog for details, but a key difference is an improvement in how it assigns line, rect, and curve objects. (Diagonal two-point lines, for instance, are now line objects instead of curve objects.) (#515)
Add .extract_text(layout=True), an experimental feature which attempts to mimic the structural layout of the text on the page. (#10)
Remove Decimal-ization of parsed object attributes, which are now represented with as much precision as is returned by pdfminer.six. (#346 + #520)
.extract_text(...) now returns "" instead of None when the character list is empty. (#482 + cb9900b) [h/t @tungph]
Add --precision argument to CLI. (#520)
Add snap_x_tolerance and snap_y_tolerance to table extraction settings. (#51 + #475) [h/t @dustindall]
Add join_x_tolerance and join_y_tolerance to table extraction settings. (cbb34ce)
.extract_words(...) now includes doctop among the attributes it returns for each word. (66fef89)
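
As a rough illustration of several of the items above, here is a minimal sketch of a helper that reads a page using the v0.6.0 behavior. The helper name `summarize_page` and the tolerance values are assumptions made for this example; the method names and setting names are the ones listed in the notes.

```python
# Table settings named in the notes above; the numeric values are arbitrary examples.
TABLE_SETTINGS = {
    "snap_x_tolerance": 3,
    "snap_y_tolerance": 3,
    "join_x_tolerance": 3,
    "join_y_tolerance": 3,
}


def summarize_page(page, table_settings=TABLE_SETTINGS):
    """Hypothetical helper: gather v0.6.0-style results from a pdfplumber page."""
    # layout=True is the experimental layout-preserving mode.
    text = page.extract_text(layout=True)
    words = page.extract_words()
    return {
        # From v0.6.0 on, a page with no characters yields "" rather than None.
        "is_empty": text == "",
        "text": text,
        # Each word dict now carries a doctop value.
        "doctops": [word["doctop"] for word in words],
        "table": page.extract_table(table_settings=table_settings),
    }
```

With pdfplumber installed, this could be called as `summarize_page(pdf.pages[0])` inside `with pdfplumber.open("example.pdf") as pdf:`, where `example.pdf` is a placeholder path.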

And many thanks to @samkit-jain for his feedback and review of contributions to this release. 🎉

Contributors

tungph, dustindall, and samkit-jain

08 May 21:50

jsvine

v0.5.28

1328a00

v0.5.28

From CHANGELOG.md:

Added

Add --laparams flag to CLI. (#407)

Changed

Change .convert_csv(...) to order objects first by page number, rather than object type. (#407)
Change .convert_csv(...), .convert_json(...), and CLI so that, by default, they return all available object types, rather than those in a predefined default list. (#407)
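
The ordering change can be pictured with a small, self-contained sketch. This is not the library's code: `order_for_export` is a hypothetical function, and breaking ties by object type is an assumption, since the notes only state that page number now comes first.

```python
def order_for_export(objects):
    """Sort object dicts by page number first, then by object type."""
    return sorted(objects, key=lambda obj: (obj["page_number"], obj["object_type"]))


# Made-up objects, shaped loosely like parsed PDF objects.
objects = [
    {"object_type": "char", "page_number": 2, "text": "b"},
    {"object_type": "rect", "page_number": 1},
    {"object_type": "char", "page_number": 1, "text": "a"},
]

ordered = order_for_export(objects)
# Page 1 objects (char, then rect) now precede the page 2 char.
```

Ordering by page first keeps everything belonging to one page in a contiguous run of rows, which tends to be easier to scan or stream than output grouped by type.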

Fixed

Fix .extract_text(...) so that it can accept generator objects as its main parameter. (#385) [h/t @alexreg]
Fix page-parsing so that LTAnno objects (which have no bounding-box coordinates) are not extracted. (Was only an issue when setting laparams.) (#388)
Fix Page.extract_table(...) so that it honors text tolerance settings. (#415) [h/t @trifling]
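
The generator fix touches a general Python pitfall: a generator has no length and can be consumed only once. The sketch below shows one common way to make a function tolerate generators. It is an illustration of the technique, not pdfplumber's actual patch, and `join_chars` is a hypothetical stand-in for a text extractor.

```python
def join_chars(chars):
    """Hypothetical stand-in for an extractor that inspects chars more than once."""
    # Materialize the input so that generators survive the length check below.
    chars = list(chars)
    if len(chars) == 0:
        return ""
    return "".join(char["text"] for char in chars)


# Made-up character dicts; only "text" and "size" matter for this example.
page_chars = [
    {"text": "a", "size": 10},
    {"text": "B", "size": 14},
    {"text": "c", "size": 10},
]

# A generator expression, as a caller might pass after filtering characters.
small_only = (char for char in page_chars if char["size"] < 12)
small_text = join_chars(small_only)
```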

28 Feb 19:37

jsvine

v0.5.27

3b6fee4

v0.5.27

From CHANGELOG.md:

Fixed

Fix regression (introduced in 0.5.26/b1849f4) in closing files opened by PDF.open.
Reinstate access to higher-level layout objects (such as textboxhorizontal) when laparams is passed to pdfplumber.open(...). Had been removed in 0.5.24 via 1f87898. (#359 + #364)
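
A minimal sketch of reading those higher-level layout objects follows. It assumes that a page exposes an `objects` dictionary keyed by object type; `layout_objects` is a hypothetical helper name, and `example.pdf` is a placeholder path.

```python
def layout_objects(page, object_type="textboxhorizontal"):
    """Return the page's objects of one layout type, or [] if none were parsed."""
    return page.objects.get(object_type, [])


# Assumed usage, with pdfplumber installed:
#
#     import pdfplumber
#
#     with pdfplumber.open("example.pdf", laparams={}) as pdf:
#         boxes = layout_objects(pdf.pages[0])
#
# Passing laparams (here an empty dict) is what requests layout analysis;
# without it, no textboxhorizontal objects are expected to be present.
```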

Development Changes

Add a python setup.py build sdist test to the main GitHub action. (#365)

11 Feb 02:54

jsvine

v0.5.26

64efb5c

v0.5.26

See CHANGELOG.md for details.

09 Dec 14:22

jsvine

v0.5.25

332f973

v0.5.25

See CHANGELOG.md for details.

20 Oct 13:50

jsvine

v0.5.24

954dc94

v0.5.24

See CHANGELOG.md for details.

15 Aug 17:08

jsvine

v0.5.23

d2e7cfd

v0.5.23

See changelog for details.

25 Jul 11:59

jsvine

v0.5.22

9eb5c4b

v0.5.22

[0.5.22] — 2020-07-18

Changed

Upgraded pdfminer.six requirement to ==20200517 (cddbff7) [h/t @youngquan]

Added

Add support for non_stroking_color attribute on char objects (0254da3) [h/t @idan-david]
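
One possible use of the new attribute is grouping characters by fill color. The sketch below works on plain dicts shaped like char objects; `text_by_fill_color` is a hypothetical helper, and the color values shown are made up for the example.

```python
from collections import defaultdict


def _hashable(color):
    """Convert list-like colors to tuples so they can serve as dict keys."""
    return tuple(color) if isinstance(color, (list, tuple)) else color


def text_by_fill_color(chars):
    """Group character text by each char's non_stroking_color value."""
    groups = defaultdict(list)
    for char in chars:
        groups[_hashable(char.get("non_stroking_color"))].append(char["text"])
    return {color: "".join(parts) for color, parts in groups.items()}


# Made-up chars: two black characters and one red one.
chars = [
    {"text": "O", "non_stroking_color": [0, 0, 0]},
    {"text": "!", "non_stroking_color": [1, 0, 0]},
    {"text": "K", "non_stroking_color": [0, 0, 0]},
]

grouped = text_by_fill_color(chars)
```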

06 Jan 02:38

jsvine

v0.5.15

01f7a7e

v0.5.15

Primarily: Upgrades pinned requirements for pdfminer.six and pillow.

06 Feb 14:39

jsvine

v0.6.0-alpha

26bf1c5

v0.6.0-alpha Pre-release

This release is a preview/alpha for pdfplumber v0.6.0. Among the more notable changes:

Revamps the table-extraction methods, to simplify them and make them more flexible.
Adds font size and font name to results of Page/utils.extract_words(...), based on @jsfenfen's suggestions in #28. (Thanks!)

Goals before v0.6.0-beta:

Add Page.find_text_gutters feature, bringing back that table-finding strategy from earlier versions of pdfplumber.
Attempt to fix/address as many extant GitHub issues as possible.
Update the example notebooks, so that they work.

Goals before v0.6.0 full release:

Reach full test coverage.
Add more robust documentation.
Add more/better docstrings.
