chore(bench): add medium case
j-mendez committed Dec 4, 2023
1 parent 1837e08 commit 66b492d
Showing 10 changed files with 5,082 additions and 5,068 deletions.
18 changes: 9 additions & 9 deletions bench/README.md
@@ -10,18 +10,18 @@ Linux
Test url: `https://choosealicense.com` (small)
32 pages

-| `libraries` | `speed` |
-| :-------------------------------- | :-------------------- |
-| **`spider-rs: crawl 10 samples`** | `76ms`(✅ **1.00x**) |
-| **`crawlee: crawl 10 samples`** | `1s` (✅ **1.00x**) |
+| `libraries` | `speed` |
+| :-------------------------------- | :------------------- |
+| **`spider-rs: crawl 10 samples`** | `76ms`(✅ **1.00x**) |
+| **`crawlee: crawl 10 samples`** | `1s` (✅ **1.00x**) |

Test url: `https://rsseau.fr` (medium)
211 pages

-| `libraries` | `speed` |
-| :-------------------------------- | :------------------- |
-| **`spider-rs: crawl 10 samples`** | `0.5s` (✅ **1.00x**) |
-| **`crawlee: crawl 10 samples`** | `72s` (✅ **1.00x**) |
+| `libraries` | `speed` |
+| :-------------------------------- | :-------------------- |
+| **`spider-rs: crawl 10 samples`** | `0.5s` (✅ **1.00x**) |
+| **`crawlee: crawl 10 samples`** | `72s` (✅ **1.00x**) |

```sh
----------------------
```
@@ -47,4 +47,4 @@ Test url: `https://rsseau.fr` (medium)
| **`spider-rs: crawl 10 samples`** | `2.5s` (✅ **1.00x**) |
| **`crawlee: crawl 10 samples`** | `75s` (✅ **1.00x**) |

-The performance scales the larger the website and if throttling is needed. Linux benchmarks are about 10x faster than macOS for spider-rs.
+The performance scales the larger the website and if throttling is needed. Linux benchmarks are about 10x faster than macOS for spider-rs.
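
Every row in the tables above reports `1.00x`, since each library is measured on its own, so the relative gap has to be read from the raw timings. A minimal sketch of that arithmetic, using only the Linux figures quoted in the README hunk; `toMs` and `speedup` are hypothetical helper names and are not part of the benchmark code:

```ts
// Hypothetical helpers for illustration; not part of the bench/ directory.
// Convert a timing string such as "76ms" or "0.5s" to milliseconds.
function toMs(timing: string): number {
  const match = /^([\d.]+)(ms|s)$/.exec(timing.trim());
  if (!match) throw new Error(`unrecognized timing: ${timing}`);
  const value = Number(match[1]);
  return match[2] === "s" ? value * 1000 : value;
}

// How many times faster `fast` is than `slow`.
function speedup(fast: string, slow: string): number {
  return toMs(slow) / toMs(fast);
}

// Linux timings copied from the README tables.
const small = speedup("76ms", "1s"); // choosealicense.com, 32 pages
const medium = speedup("0.5s", "72s"); // rsseau.fr, 211 pages

console.log(small.toFixed(1), medium.toFixed(1)); // prints "13.2 144.0"
```

On these numbers the gap grows from roughly 13x on the small site to 144x on the medium one, which is consistent with the README's remark that the difference widens with site size.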
1 change: 1 addition & 0 deletions bench/base.ts
@@ -3,3 +3,4 @@ export const iterations = process.env.BENCH_COUNT
   : 20;

 export const TEST_URL = "https://choosealicense.com";
+export const TEST_URL_MEDIUM = "https://rsseau.fr";
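
The hunk shows only the start and the end of the `iterations` expression, so how `BENCH_COUNT` is parsed is not visible here. One way such an override could be read defensively is sketched below; `parseIterations` is a hypothetical helper, and the fallback of `20` simply mirrors the default visible in the diff.

```ts
// Hypothetical helper, not the code in bench/base.ts (that expression is
// truncated in the diff). Falls back when the value is unset or invalid.
function parseIterations(raw: string | undefined, fallback = 20): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

console.log(parseIterations("5")); // 5
console.log(parseIterations(undefined)); // 20
console.log(parseIterations("abc")); // 20
```

In the benchmark this would presumably be called as `parseIterations(process.env.BENCH_COUNT)`.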
24 changes: 12 additions & 12 deletions bench/case/crawlee.ts
@@ -1,21 +1,21 @@
-import { CheerioCrawler } from 'crawlee';
-import { TEST_URL, iterations } from "../base"
+import { CheerioCrawler } from "crawlee";

Check failure on line 1 in bench/case/crawlee.ts
GitHub Actions / stable - i686-pc-windows-msvc - node@18
Cannot find module 'crawlee' or its corresponding type declarations.

+import { TEST_URL, iterations } from "../base";

+export async function bench(url = TEST_URL) {
+  const crawler = new CheerioCrawler({
+    async requestHandler({ enqueueLinks }) {

Check failure on line 6 in bench/case/crawlee.ts
GitHub Actions / stable - i686-pc-windows-msvc - node@18
Binding element 'enqueueLinks' implicitly has an 'any' type.

+      await enqueueLinks();
+    },
+  });

-export async function bench() {
-  const crawler = new CheerioCrawler({
-    async requestHandler({ enqueueLinks, request }) {
-      await enqueueLinks();
-    }
-  });

   let duration = 0;

   const run = async () => {
     const startTime = performance.now();
-    await crawler.run([TEST_URL]);
+    await crawler.run([url]);
     duration += performance.now() - startTime;
   };

   const bm = async (cb: () => Promise<void>, i = 0) => {
     await cb();
     if (i < iterations) {
@@ -34,4 +34,4 @@ export async function bench() {
       },
     ]),
   );
-}
+}
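
The case file accumulates wall-clock time in `duration` around each `crawler.run` call and repeats the run through the recursive `bm` helper, whose body is cut off in the diff. The same idea can be sketched with a plain loop; `timeRuns` and the stand-in task are hypothetical, and reporting the mean is an assumption about how the accumulated total is used.

```ts
// Hypothetical sketch: time `task` `count` times, one run after another,
// and return the mean duration in milliseconds.
async function timeRuns(task: () => Promise<void>, count: number): Promise<number> {
  let total = 0;
  for (let i = 0; i < count; i++) {
    const start = performance.now();
    await task();
    total += performance.now() - start;
  }
  return total / count;
}

// Stand-in for `crawler.run([url])`: resolves after roughly 10 ms.
const fakeCrawl = () => new Promise<void>((resolve) => setTimeout(resolve, 10));

timeRuns(fakeCrawl, 5).then((mean) => {
  console.log(`mean: ${mean.toFixed(1)}ms over 5 runs`);
});
```

A loop keeps the number of runs explicit; the recursive form in the diff calls `cb` once before checking `i < iterations`, so its exact run count depends on the elided lines.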
12 changes: 6 additions & 6 deletions bench/case/spider.ts
@@ -1,8 +1,8 @@
-import { Website, NPage } from "../../index.js";
-import { TEST_URL, iterations } from "../base"
+import { Website } from "../../index.js";
+import { TEST_URL, iterations } from "../base";

-export async function bench() {
-  const website = new Website(TEST_URL);
+export async function bench(url = TEST_URL) {
+  const website = new Website(url);

   let duration = 0;

@@ -20,7 +20,7 @@ export async function bench() {
   };

   await bm(run);

   console.log(
     JSON.stringify([
       {
@@ -30,4 +30,4 @@ export async function bench() {
       },
     ]),
   );
-}
+}
6 changes: 5 additions & 1 deletion bench/compare.ts
@@ -1,3 +1,7 @@
-import { bench } from "./case/spider"
+import { TEST_URL_MEDIUM } from "./base";
+import { bench } from "./case/spider";

+// small
 bench();
+// small/medium
+bench(TEST_URL_MEDIUM)
6 changes: 5 additions & 1 deletion bench/crawlee.ts
@@ -1,3 +1,7 @@
-import { bench } from "./case/crawlee"
+import { TEST_URL_MEDIUM } from "./base";
+import { bench } from "./case/crawlee";

+// small
 bench();
+// small/medium
+bench(TEST_URL_MEDIUM)
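
In `bench/compare.ts` and `bench/crawlee.ts` the two `bench(...)` calls are not awaited, so the small and medium crawls may be in flight at the same time and compete for CPU and network while being timed. If isolated timings are wanted, one option is to run the cases back to back, as `bench/oss.ts` does. A minimal sketch, with a hypothetical stand-in for the real `bench` function:

```ts
// Hypothetical stand-in for the `bench(url)` exported by the case files.
const order: string[] = [];
async function bench(url = "https://choosealicense.com"): Promise<void> {
  order.push(`start ${url}`);
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  order.push(`end ${url}`);
}

// Await each case before starting the next so the timings do not overlap.
async function runSequentially(urls: string[]): Promise<void> {
  for (const url of urls) {
    await bench(url);
  }
}

runSequentially(["https://choosealicense.com", "https://rsseau.fr"]).then(() => {
  console.log(order.join("\n"));
});
```

With the stand-in, each `start` line is followed by its own `end` line before the next URL begins, which is the property the unawaited calls do not guarantee.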
13 changes: 8 additions & 5 deletions bench/oss.ts
@@ -1,7 +1,10 @@
-import { bench } from "./case/spider"
-import { bench as benchCrawlee } from "./case/crawlee"
+import { bench } from "./case/spider";
+import { bench as benchCrawlee } from "./case/crawlee";
+import { TEST_URL_MEDIUM } from "./base";

 (async () => {
-  await bench();
-  await benchCrawlee();
-})()
+  await bench();
+  await bench(TEST_URL_MEDIUM);
+  await benchCrawlee();
+  await benchCrawlee(TEST_URL_MEDIUM);
+})();