From 42931aef380353a0352a00b8fe702c426e41ffb0 Mon Sep 17 00:00:00 2001
From: Gautam Mehta <66418526+coderGtm@users.noreply.github.com>
Date: Mon, 25 Dec 2023 13:26:57 +0530
Subject: [PATCH] Update README

Punctuation and grammatical fixes
---
 README | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/README b/README
index f47ffc4..410f0f0 100644
--- a/README
+++ b/README
@@ -8,13 +8,13 @@ your machine.
 
 #define FREQ 3.9
 
-.. and then later down..
+.. and then later down...
 
 // Hugepage size
 #define HUGEPAGE (2*1024*1024)
 
-if you don't change the FREQ define (it's in gigahertz) the tests still
-work, and the absolute time values in nanoseconds are correct, but the
+If you don't change the FREQ define (it's in gigahertz), the tests still
+work and the absolute time values in nanoseconds are correct, but the
 CPU cycle estimation will obviously be completely off.
 
 In addition to the tweakables in the code itself, there are other
@@ -33,23 +33,23 @@ but since the end result will be chasing a random chain it won't be a
 You can use odd stride values to see how unaligned loads change the
 picture, for example.
 
-Also note that the actual memory sizes used for testing will depend on
-how much cache you have, but also on how much memory you have. To
-actually get a largepage (when using the "-H" flag), not only does your
-architecture need to support it, you also have to have enough free
-memory that the OS will give you hugepage allocations in the first
+Also, note that the actual memory sizes used for testing will depend
+not only on how much cache you have, but also on how much memory you
+have. To actually get a largepage (when using the "-H" flag), not only
+does your architecture need to support it, you also have to have enough
+free memory that the OS will give you hugepage allocations in the first
 place.
 
 So if you are running on some s390 with a 384MB L4 cache, you should
-increase the largest memory area size to at least 1G, but you should
-also increase the stride to 256 to match the cache line size.
+increase the largest memory area size to at least 1G, along with
+increasing the stride to 256 to match the cache line size.
 
-Also not that the use of MADV_HUGEPAGE is obviously Linux-specific, but
+Also, note that the use of MADV_HUGEPAGE is obviously Linux-specific, but
 the use of madvise() means that it is *advisory* rather than some hard
 requirement, and depending on your situation, you may not actually see
 the hugepage case at all.
 
-For example MADV_HUGEPAGE obviously depends on your kernel being built
+For example, MADV_HUGEPAGE obviously depends on your kernel being built
 to support it, and not all architectures support large pages at all.
 You can still do the non-hugepage tests, of course, but then you'll not
 have the baseline that a bigger page size will get you.
@@ -58,21 +58,21 @@ have the baseline that a bigger page size will get you.
 
 Finally, there are a couple of gotchas you need to be aware of:
 
- * each timing test is run for just one second, and there is no noise
+ * each timing test is run for just one second, with no noise
    reduction code. If the machine is busy, that will obviously affect
    the result. But even more commonly, other effects will also affect
    the reported results, particularly the exact pattern of
-   randomization, and the virtual to physical mapping of the underlying
+   randomization, and the virtual-to-physical mapping of the underlying
    memory allocation.
 
 
    So the timings are "fairly stable", but if you want to really
    explore the latencies you needed to run the test multiple times, to get
-   different virtual-to-physical mappings, and to get different list
+   different virtual-to-physical mappings, and to get a different list
    randomization.
 
  * the hugetlb case helps avoid TLB misses, but it has another less
-   obvious secondary effect: it makes the memory area be contiguous in
+   obvious secondary effect: it makes the memory area contiguous in
    physical RAM in much bigger chunks. That in turn affects the caching
    in the normal data caches on a very fundamental level, since you will
    not see cacheline associativity conflicts within such a contiguous
@@ -93,7 +93,7 @@ Finally, there are a couple of gotchas you need to be aware of:
    quite a bit long before that, and indeed see higher latencies already
    with just a 128kB memory area.
 
-   In contrast, if you run a hugepage test (using as 2MB page on x86),
+   In contrast, if you run a hugepage test (using a 2MB page on x86),
    the contiguous memory allocation means that your 256kB area will be
    cached in its entirety.
 
@@ -105,8 +105,8 @@ Finally, there are a couple of gotchas you need to be aware of:
 
 
 Finally, I've made the license be GPLv2 (which is basically my default
-license), but this is a quick hack and if you have some reason to want
-to use this where another license would be preferable, email me and we
+license), but this is a quick hack, and if you have some reason to want
+to use this where another license would be preferable, email me, and we
 can discuss the issue. I will probably accommodate other alternatives
 in the very unlikely case that somebody actually cares.
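
A note on the FREQ define the README text above keeps coming back to: since
FREQ is the clock in gigahertz, i.e. cycles per nanosecond, the reported
cycle figure is just the measured nanoseconds multiplied by FREQ. The sketch
below spells out that arithmetic; estimated_cycles() is a hypothetical helper
for illustration, not a function from the test-tlb sources.

/* Sketch of the cycle estimate described above.  FREQ is the CPU
 * clock in GHz (cycles per nanosecond), so a wrong FREQ leaves the
 * nanosecond numbers intact but scales the cycle estimate by the
 * same factor.  estimated_cycles() is hypothetical, not test-tlb code. */
#define FREQ 3.9

static double estimated_cycles(double nanoseconds)
{
	return nanoseconds * FREQ;
}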
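
The *advisory* nature of MADV_HUGEPAGE mentioned above is easiest to see in
code. The sketch below shows the usual Linux pattern, assuming glibc on a
THP-capable kernel; hugepage_alloc() is a hypothetical helper for
illustration, not the allocation code from test-tlb. If the kernel cannot or
will not back the area with hugepages, the madvise() call simply has no
effect and the mapping keeps working with normal pages, which is exactly why
you may not see the hugepage case at all.

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

// Hugepage size
#define HUGEPAGE (2*1024*1024)

/* Hypothetical helper: map an anonymous area and *ask* for
 * transparent hugepages.  Not taken from the test-tlb sources. */
static void *hugepage_alloc(size_t size)
{
	void *map;

	/* Round up to a whole number of hugepages. */
	size = (size + HUGEPAGE - 1) & ~((size_t)HUGEPAGE - 1);

	map = mmap(NULL, size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED) {
		perror("mmap");
		exit(1);
	}

	/* Advisory only: ignore failure, the mapping stays usable
	 * with normal 4kB pages if hugepages are unavailable. */
	madvise(map, size, MADV_HUGEPAGE);

	return map;
}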