Release v0.6.0 · jsvine/pdfplumber

See CHANGELOG.md for a full list of additions, changes, and fixes. In some (hopefully) rare cases, this version may introduce breaking changes, which is why we're bumping to v0.6.0. Highlights from the changelog include:

Upgrade pdfminer.six from 20200517 to 20211012; see that library's changelog for details, but a key difference is an improvement in how it assigns line, rect, and curve objects. (Diagonal two-point lines, for instance, are now line objects instead of curve objects.) (#515)
Add .extract_text(layout=True), an experimental feature which attempts to mimic the structural layout of the text on the page. (#10)
Remove Decimal-ization of parsed object attributes, which are now represented with as much precision as is returned by pdfminer.six. (#346 + #520)
.extract_text(...) now returns "" instead of None when the character list is empty. (#482 + cb9900b) [h/t @tungph]
Add --precision argument to CLI (#520)
Add snap_x_tolerance and snap_y_tolerance to table extraction settings. (#51 + #475) [h/t @dustindall]
Add join_x_tolerance and join_y_tolerance to table extraction settings. (cbb34ce)
.extract_words(...) now includes doctop among the attributes it returns for each word. (66fef89)
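
As a rough illustration of how several of these changes might look in calling code, here is a minimal sketch. The helper names `build_table_settings` and `summarize_first_page`, the tolerance values, and the file name `example.pdf` are hypothetical and not part of pdfplumber; only the `layout=True` argument, the four tolerance keys, and the `doctop` word attribute come from the notes above.

```python
from typing import Any, Dict


def build_table_settings(snap: float = 3, join: float = 3) -> Dict[str, Any]:
    """Build a table-settings dict using the per-axis keys added in v0.6.0.

    The default values here are illustrative assumptions, not recommendations.
    """
    return {
        "snap_x_tolerance": snap,
        "snap_y_tolerance": snap,
        "join_x_tolerance": join,
        "join_y_tolerance": join,
    }


def summarize_first_page(pdf_path: str) -> Dict[str, Any]:
    """Collect a few v0.6.0-era outputs from the first page of a PDF."""
    # Imported lazily so build_table_settings works without pdfplumber installed.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[0]
        # Experimental: attempts to mimic the page's structural layout.
        layout_text = page.extract_text(layout=True)
        # Returns "" rather than None when the page has no characters,
        # so a plain truthiness check covers the empty case.
        plain_text = page.extract_text()
        # Each word dict now carries a "doctop" attribute.
        doctops = [word["doctop"] for word in page.extract_words()]
        tables = page.extract_tables(table_settings=build_table_settings())

    return {
        "layout_text": layout_text,
        "has_text": bool(plain_text),
        "doctops": doctops,
        "tables": tables,
    }
```

Calling `summarize_first_page("example.pdf")` would return a dict of these values for a PDF of your choosing. Note that, with Decimal-ization removed, numeric attributes such as `doctop` arrive as plain numbers, so code that previously compared them against `Decimal` values may need adjusting.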

And many thanks to @samkit-jain for his feedback and review of contributions to this release. 🎉
