Releases: extractus/article-extractor

v7.2.7

02 Dec 16:32
aabc0e7
  • Update dependencies
  • Fix CI issues
  • Update docs & links

v7.2.6 - Change name

30 Nov 14:32
f31c80f
  • Change package name from article-parser to @extractus/article-extractor
  • Move to new organization Extractus

v7.2.5

13 Nov 05:13
cf0508e
  • Update dependencies
  • Improve metadata extraction
  • Add security policy

v7.2.4

24 Sep 05:17
0e80ba3
  • Improve space/newline processing
    • line breaks are no longer all removed, but multiple empty lines are stripped
    • similarly, multiple spaces are replaced with a single space
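
The whitespace rules above can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the function name normalizeWhitespace is hypothetical, and collapsing runs of empty lines down to a single empty line is an assumption about what "stripped" means here.

```javascript
// Hypothetical helper for illustration only; the library's internals may differ.
function normalizeWhitespace(text) {
  return text
    .replace(/\r\n?/g, '\n')     // normalize Windows/old-Mac line endings
    .replace(/[ \t]+/g, ' ')     // replace runs of spaces/tabs with a single space
    .replace(/ ?\n ?/g, '\n')    // drop stray spaces around line breaks
    .replace(/\n{3,}/g, '\n\n')  // assumption: keep at most one empty line
    .trim()
}
```

Single line breaks survive, so paragraph structure is preserved while redundant whitespace is removed.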

v7.2.3

23 Sep 09:21
c938584
  • Optimize performance

Removing the HTML validation step made extraction about 4x to 5x faster.

Previously, article-parser checked whether the input to extract() was a URL or valid HTML before deciding the next step.
Now, if the input is not a URL, it is assumed to be an HTML string and extraction starts immediately.
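
The simplified decision described above might look something like this. It is a sketch under stated assumptions: isValidUrl and chooseStrategy are hypothetical names, and restricting URLs to http/https is an assumption rather than documented behavior.

```javascript
// Hypothetical helpers for illustration; not the library's actual internals.
function isValidUrl(input) {
  try {
    const { protocol } = new URL(input) // throws on anything that is not an absolute URL
    return protocol === 'http:' || protocol === 'https:'
  } catch {
    return false
  }
}

function chooseStrategy(input) {
  // No HTML validation: anything that is not a URL is treated as an HTML string.
  return isValidUrl(input) ? 'fetch-then-extract' : 'extract-html'
}
```

The point of the change is that the non-URL branch does no further checking, which is where the time is saved.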

(Images omitted: v7.2.2, before; v7.2.3, after.)

v7.2.2

23 Sep 07:39
22f4dab
  • Add options to the extract() method
    • Replace global config with per-request parserOptions
    • Add new param fetchOptions to extract() method
      • Allow passing requests through a proxy
  • Remove unnecessary dependencies to reduce bundle size
  • Fix a problem when building the ESM version for the browser
  • Add a demo for running in the browser

v7.2.1

20 Sep 08:42
64e308d
  • Use external string-similarity
  • Improve fetch control
  • Update build script
  • Fix typos in example packages

v7.2.0

17 Sep 15:39
90692fd
  • Refactor some parts to run on Deno, Bun and ts-node
    • Use an internal string-similarity file to bypass a Bun module resolution error
    • Stop depending on urlpattern-polyfill to bypass a Deno/Bun error
      • Replace URLPattern syntax with regular RegExp
  • Add some examples for each platform
  • Remove rarely used configuration methods
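
As an illustration of replacing URLPattern syntax with a plain RegExp, here is a minimal sketch. The pattern, the example.com host, and the matchPost name are all hypothetical and are not taken from the library; the regular expression is only similar in spirit to a URLPattern with pathname '/posts/:slug'.

```javascript
// Hypothetical pattern for illustration: matches http(s)://example.com/posts/<slug>,
// with an optional trailing slash, query string, or fragment.
const postPattern = /^https?:\/\/example\.com\/posts\/([^/?#]+)\/?(?:[?#].*)?$/

function matchPost(url) {
  const m = postPattern.exec(url)
  // The first capture group plays the role of the named ":slug" segment.
  return m ? { slug: m[1] } : null
}
```

A plain RegExp needs no polyfill, so the same code runs unchanged on Node.js, Deno and Bun.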

v7.1.0

17 Sep 07:30
2ab8a99

The first step toward getting it to work in Deno and Bun environments

  • Replace axios with cross-fetch
  • Remove 4 API methods relating to axios and htmlcrush

v7.0.3

16 Sep 09:21
da8df03
  • Update dependencies
  • Remove the dependency on tldts
  • Use conditional exports
  • Improve pre-defined options