forked from benhoyt/scandir
-
Notifications
You must be signed in to change notification settings - Fork 0
/
PEP.txt
88 lines (60 loc) · 2.4 KB
/
PEP.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
PEP: XXX
Title: os.scandir() function -- a better and faster directory iterator
Version: $Revision$
Last-Modified: $Date$
Author: Ben Hoyt <[email protected]>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-May-2014
Python-Version: 3.5
Post-History: ***
Abstract
========
This PEP proposes including a new directory iteration function,
``os.scandir()``, in the standard library. This new function adds
useful functionality and increases the speed of ``os.walk()`` by 2-10
times (depending on the platform and file system) by significantly
reducing the number of times ``stat()`` has to be called.
Rationale
=========
Python's built-in ``os.walk()`` is significantly slower than it needs
to be, because -- in addition to calling ``os.listdir()`` on each
directory -- it calls ``os.stat()`` on each file to determine whether
the filename is a directory or not. But both ``FindFirstFile`` /
``FindNextFile`` on Windows and `readdir` on Linux and OS X already
tell you whether the files returned are directories or not, so no
further ``stat`` system calls are needed. In short, you can reduce the number of system calls from about 2N to N, where N is the total number of files and directories in the tree.
**In practice, removing all those extra system calls makes
``os.walk()`` about 8-9 times as fast on Windows, and about 2-3
times as fast on Linux and Mac OS X.** So we're not talking about
micro-optimizations. See more benchmarks [here].
Somewhat relatedly, many people [1] have also asked for a version of
``os.listdir()`` that yields filenames as it iterates instead of returning them as one big list. This improves memory efficiency for iterating very large directories.
So as well as providing a ``scandir()`` iterator function for other uses, Python's existing ``os.walk()`` function would be sped up a great deal.
Proposal for inclusion
======================
Use in the wild
===============
Previous discussion
===================
Open Issues
===========
*
References
==========
.. [1] PEP 1, PEP Purpose and Guidelines, Warsaw, Hylton
(http://www.python.org/dev/peps/pep-0001)
.. [2] PEP 9, Sample Plaintext PEP Template, Warsaw
(http://www.python.org/dev/peps/pep-0009)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: