The Billion Row Challenge in Prolog #2344
Replies: 2 comments
All the overhead of tiny syscalls will disappear completely once files are mapped to the heap in their entirety by the OS, using a single syscall.
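To illustrate the idea outside of Scryer, here is a minimal Python sketch of processing a memory-mapped file. It is not how Scryer does or will implement this. The function name `min_max_mean_by_station` is hypothetical, and the `station;temperature` row format is assumed from the challenge description rather than stated in this thread.

```python
import mmap


def min_max_mean_by_station(path):
    """Aggregate 'station;temperature' rows from a memory-mapped file."""
    stats = {}  # station name (bytes) -> [min, max, total, count]
    with open(path, "rb") as f:
        # A single mapping covers the whole file; the OS pages data in lazily,
        # so no per-chunk read() calls are issued by this loop.
        # Note: mmap raises ValueError for an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for line in iter(mm.readline, b""):
                name, _, value = line.rstrip(b"\n").partition(b";")
                t = float(value)
                s = stats.get(name)
                if s is None:
                    stats[name] = [t, t, t, 1]
                else:
                    s[0] = min(s[0], t)
                    s[1] = max(s[1], t)
                    s[2] += t
                    s[3] += 1
    # Report (min, max, mean) per station.
    return {k.decode(): (v[0], v[1], v[2] / v[3]) for k, v in stats.items()}
```

The point of the sketch is only that the file contents are reached through memory accesses after one mapping call, rather than through repeated small reads.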
-
Hi all, I realise this is perhaps a bit of a distraction, but I thought I'd quickly attempt "The Billion Row Challenge" as described here: https://www.morling.dev/blog/one-billion-row-challenge/ to see what issues might come up with Scryer. I've hit this problem before when using Scryer on massive log files: even when my DCGs leave no choicepoints, huge amounts of memory accumulate until the Linux OOM killer terminates the process. The file with a billion rows is 13 GB in size, yet the same program in SWI-Prolog keeps memory usage nice and constant because of GC. Is there a rough timeline for when Scryer will get a GC?
Apart from that, I noticed with strace that SWI-Prolog reads from the file in 4 KB chunks, which is reasonable.
However, for some reason Scryer issues read() system calls of only 4 bytes at a time, which is not great for performance. Is there a reason for this?
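To make the cost of the chunk size concrete, here is a small Python sketch that counts how many read calls it takes to consume a file at a given chunk size. The helper name `count_read_calls` is hypothetical. It uses unbuffered `os.read`, so each call corresponds to one read on the file descriptor.

```python
import os


def count_read_calls(path, chunk_size):
    """Read a file with unbuffered os.read and count the calls issued."""
    fd = os.open(path, os.O_RDONLY)
    calls = 0
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            calls += 1
            if not chunk:  # an empty bytes object signals end of file
                break
    finally:
        os.close(fd)
    return calls
```

By the same arithmetic, a file of roughly 13 GB needs on the order of 3 billion calls at 4 bytes per read, against about 3 million at 4 KB per read.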