diff --git a/design-documents/bit-precise-types.rst b/design-documents/bit-precise-types.rst
index 4743daf..c674193 100644
--- a/design-documents/bit-precise-types.rst
+++ b/design-documents/bit-precise-types.rst
@@ -17,7 +17,7 @@ bit-precise integral types defined in C2x.
 These are ``_BitInt(N)`` and ``unsigned _BitInt(N)``. These are defined for
 integral ``N`` and each ``N`` is a different type.
 
-The proposal for these types can be found in following link.
+The proposal for these types can be found in the following link:
 https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2763.pdf
 
 As the rationale in that proposal mentioned, some applications have uses for a
@@ -38,20 +38,20 @@ The main trade-offs we have identified in this case are:
 
 - Size cost of storing values in memory.
 - General familiarity of programmers with the representation.
 
-Since this is a new type there is large uncertainty on how it will be used by
+Since this is a new type, there is large uncertainty on how it will be used by
 programmers in the future. Decisions we make here may also influence future
-usage. Nonetheless we must make trade-off decisions with this uncertainty. The
-below attempts to analyze possible use-cases to make our best guess as to how
-these types may be used when targeting Arm CPU's.
+usage. We must make trade-off decisions within this uncertainty. The following
+attempts to analyze possible use-cases to make our best guess as to how these
+types may be used when targeting Arm CPUs.
 
 
 Use-cases known of so far
 -------------------------
 
-There seem to be two different regimes for these types. The "small" regime
-where bit-precise types could be stored in a single general-purpose register,
-and the "large" regime where bit-precise types must span multiple
-general-purpose registers.
+We believe there are two regimes for these types: the "small" regime, where
+bit-precise types could be stored in a single general-purpose register, and the
+"large" regime, where bit-precise types must span multiple general-purpose
+registers.
 
 Here we discuss the use-cases for bit-precise integer types that we have
 identified or been alerted to so far.
@@ -72,19 +72,19 @@ to write code which directly expresses what is needed.
 This can ensure the FPGA description generated saves space and has better
 performance.
 
 The notable thing about this use-case is that though the C code may be run on an
-Arm architecture (e.g. for testing), the most critical use is when transferred
-to an FPGA (i.e. not an Arm architecture).
+Arm architecture for testing, the most critical use is when transferred to an
+FPGA (that is, not an Arm architecture).
 
-That said, if the operation that this FPGA performs becomes popular there may be
-a need to run the code directly on CPU's in the future.
+However, if the operation that this FPGA performs becomes popular, there may
+be a need to run the code directly on CPUs in the future.
 
-The requirements on Arm ABI's from this use-case are relatively small since the
-main focus is around running on an FPGA. We believe it adds weight to both the
-need for performance and familiarity of programmers. This belief comes from the
-estimate that this may lead to bit-precise types being used in performance
-critical code in the future, and that it may mean that bit-precise types are
-used on Arm architectures when testing FPGA descriptions (where ease of
-debugging can be prioritized).
+The requirements on the Arm ABI from this use-case are relatively small since
+the main focus is around running on an FPGA. We believe the use-case adds
+weight to both the need for performance and familiarity of programmers. This
+belief comes from the estimate that this may lead to bit-precise types being
+used in performance critical code in the future, and that it may mean that
+bit-precise types are used on Arm architectures when testing FPGA descriptions
+(where ease of debugging can be prioritized).
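+
+As a purely illustrative sketch (the widths, function name, and ADC scenario
+are invented rather than taken from any real HLS flow), such code might use
+exactly sized accumulators so that synthesis need not instantiate full 32-bit
+arithmetic::
+
+   // Hypothetical example: all widths are chosen for illustration only.
+   unsigned _BitInt(20) sum_samples(const unsigned _BitInt(12) *s, int n) {
+       unsigned _BitInt(20) acc = 0uwb;  // 20 bits holds 256 12-bit samples
+       for (int i = 0; i < n; i++)
+           acc += s[i];                  // no promotion to int takes place
+       return acc;
+   }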
 
 
 24-bit Color
@@ -119,8 +119,8 @@ performed.
 
 One negative of using bit-precise integral types for networking code would be
 that idioms like ``if (x + y > max_representable)`` where ``x`` and ``y`` have
 been loaded from small bit-fields would no longer be viable. We have seen such
 idioms for small values in networking code in the Linux kernel. These are
-intuitive to write but if ``x`` and ``y`` were to bit-precise types would not
-work as expected.
+intuitive to write, but if ``x`` and ``y`` were bit-precise types they would
+not work as expected.
@@ -134,8 +134,8 @@ Hence we believe that ease of debugging of values in registers may be more
 critical than performance concerns in this use-case.
 
 
-To help the compiler optimize (e.g. for auto vectorization)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+To help the compiler optimize (possibly for auto vectorization)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 The behavior that bit-precise types do not automatically promote to an ``int``
 during operations could remove some casts which are necessary for C semantics
@@ -146,11 +146,11 @@ casts in order to identify the operations being performed.
 
 The incentive for this use-case is an increased likelihood of the compiler
 generating optimal autovectorized code.
 
-Points which might imply less take-up of this use-case are that the option to
-use compiler intrinsics are there for programmers which want to put in extra
-effort to ensure good vectorization of a loop. This means that using
-bit-precise types would be a mid-range option providing less-guaranteed codegen
-improvement for less effort.
+One point which might imply less take-up of this use-case is that programmers
+willing to put in extra effort to ensure good vectorization of a loop have the
+option to use compiler intrinsics. This means that using bit-precise types
+would be a mid-range option providing less-guaranteed codegen improvement for
+less effort.
 
 The ABI should not have much of an effect on this use-case directly, since the
 optimization would be done in the target-independent part of compilers and the
@@ -165,7 +165,7 @@ choosing performance concerns.
 
 In this use-case the programmer would be converting a codebase using either 8
 bit integers or 16 bit integers to a bit-precise type of the same size. Such a
-codebase may include calls to variadic functions (like ``printf``) in
+codebase may include calls to variadic functions (such as ``printf``) in
 surrounding code. Variadic functions like this may be missed when changing
 types in a codebase, so it would be helpful if the bit-precise machine types
 passed matched what the relevant standard integral types looked like in order to
@@ -176,8 +176,7 @@ would benefit from having the representation of ``_BitInt(8)`` in the PCS match
 that of ``int`` and similar for the ``16`` bit and unsigned variants (which
 implies having them sign- or zero-extended).
 
-One further point around this use-case, is that decisions which do not affect 8
-and 16 bit types would not affect this use-case.
+Decisions which do not affect 8 and 16 bit types would not affect this
+use-case.
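+
+To make the hazard concrete, consider this sketch (the function, names and
+values are invented for illustration): after a mechanical conversion from
+``uint8_t``, a leftover ``printf`` call has undefined behavior, and whether it
+happens to keep working depends on the representation chosen later in this
+document::
+
+   #include <stdio.h>
+
+   void show(void) {
+       unsigned _BitInt(8) ttl = 64uwb;  // was: uint8_t ttl = 64;
+       // Undefined behavior: %u expects an unsigned int and, unlike uint8_t,
+       // _BitInt(8) does not promote. A sign- or zero-extended register
+       // representation would make this mistake likely to go unnoticed.
+       printf("ttl=%u\n", ttl);
+   }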
 
 
 For representing cryptography algorithms
@@ -222,9 +221,10 @@ We have heard of interest in using the new bit-precise integer types to
 implement transparent BigNum libraries in C.
 
 Such a use-case unfortunately does not directly correspond to what kind of code
-will be using this (e.g. would this be algorithmic code or I/O bound code).
-Given the mention of 512x512 matrices in the comment where we heard of this we
-assume that in general such a library would be CPU-bound code.
+will be using this (for example, it does not indicate whether this code would
+be algorithmic or I/O bound). Given the mention of 512x512 matrices in the
+discussion where we heard this use-case, we assume that in general such a
+library would be CPU-bound code.
 
 Hence we assume that the main consideration here would be performance.
 
@@ -279,9 +279,9 @@ greater than or equal to the size of the object in memory):
 
 - Avoid a performance hit since loading and storing of these "small" sized
   ``_BitInt``'s will not cross cache boundaries.
-- Atomic loads and stores can be made on these objects.
 - The representation of bit-precise types of the same size as standard integer
   types will have the same alignment and size in memory.
+- Atomic loads and stores can be made on these objects.
 
 In the use-cases we have identified above we did not notice any special need
 for tight packing. All of the use-cases we identified would benefit from better
@@ -309,33 +309,33 @@ Option ``A`` has the following benefits:
 
 - This would mean that the alignment of a ``_BitInt(128)`` on AArch64 matches
   that of other architectures which have already defined their ABI. This could
   reduce surprises when writing portable code.
-- Less space used for half of the values of ``N``.
-- Multiplications on large ``_BitInt(N)`` can be logically done on the limbs of
-  size ``M``, which should result in a neater compiler implementation. E.g.
-  for AArch64 there is a ``SMULH`` which could be used as part of a
-  multiplication on an entire limb.
+- Less space used for half of the large values of ``N``.
+- Multiplications on large ``_BitInt(N)`` can be performed using chunks of size
+  ``M``, which should result in a neater compiler implementation. For example
+  AArch64 has an ``SMULH`` instruction which could be used as part of a
+  multiplication of an entire chunk.
 
-Option ``B`` has the following benefit:
+Option ``B`` has the following benefits:
 
+- On AArch32 a ``_BitInt(64)`` would have the same alignment and size as an
+  ``int64_t``, and on AArch64 a ``_BitInt(128)`` would have the same alignment
+  and size as a ``__int128``.
+- Double-register sized integers match the largest Fundamental Data Types
+  defined in the relevant PCS architectures for both platforms. We believe that
+  developers familiar with the Arm ABI would find this mapping less surprising
+  and hence make fewer mistakes. This includes those working at FFI boundaries
+  interfacing to the C ABI.
 - Would allow atomic operations on types in the range between register and
   double-register sizes.
   This is due to the associated extra alignment allowing operations like
-  ``CASP`` on aarch64 and ``LDRD`` on aarch32. Similarly this would allow
+  ``CASP`` on AArch64 and ``LDRD`` on AArch32. Similarly this would allow
   ``LDP`` and ``STP`` single-copy atomicity on architectures with the LSE2
   extension.
-- On AArch32 a ``_BitInt(64)`` would have the same alignment and size as an
-  ``int64_t``, and on AArch64 a ``_BitInt(128)`` would have the same alignment
-  and size as a ``__int128``.
-- Double-register sized integers match the largest Fundamental Data Types
-  defined in the relevant PCS architectures for both platforms. We believe
-  that that developers familiar with the AArch64 ABI would find this mapping
-  less surprising and hence make less mistakes. This also includes those
-  working at FFI boundaries interfacing to the C ABI.
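+
+As a concrete illustration of the size difference (this is our reading of the
+two options, with invented helper functions, not normative ABI text), the two
+options would size a large ``_BitInt(N)`` on AArch64 as follows::
+
+   #include <stddef.h>
+
+   // Option A: "as if" an array of register-sized (64-bit) chunks.
+   size_t size_option_a(unsigned n) { return 8u * ((n + 63u) / 64u); }
+
+   // Option B: "as if" an array of double-register (128-bit) chunks.
+   size_t size_option_b(unsigned n) { return 16u * ((n + 127u) / 128u); }
+
+   // size_option_a(128) == 16 and size_option_b(128) == 16 (alignment 8
+   // versus 16), while size_option_a(129) == 24 and size_option_b(129) == 32,
+   // which is the space saving for "half of the large values of N" above.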
 
 The "large" size use-cases we have identified so far are of power-of-two sizes.
 These sizes would not benefit greatly from the positives of either of the
-options presented here, with the only difference being around the implementation
-of multiplication.
+options presented here, with the only difference being in the implementation of
+multiplication.
 
 Our estimate is that the benefits of option ``B`` are more useful for sizes
 between register and double-register than those from option ``A``. This is not
@@ -344,9 +344,10 @@ being a smaller difference from other architectures psABI choices.
 
 Other variants are available, such as choosing alignment and size based on
 register sized chunks except for the special case of the double-register sized
-_BitInt. Though such variants can provide a good combination of the properties
-above we judge them to have an extra complexity of definition and associated
-increased likelyhood of mistakes when developers code relies on ABI choices.
+``_BitInt``. Though such variants can provide a good combination of the
+properties above, we judge their extra complexity of definition to bring an
+increased likelihood of mistakes when developers' code relies on ABI choices.
 
 Based on the above reasoning, we would choose to define the size and alignment
 of ``_BitInt(N > [register-size])`` types by treating them "as if" they are an
@@ -358,9 +359,9 @@ Representation in bits
 
 There are two decisions around the representation of a "small" ``_BitInt`` that
 we have identified. (1) Whether required bits are stored in the least
 significant end or most significant end of a register or region in memory. (2)
-Whether the "remaining" bits after rounding up to the size specified in
-`Alignment and sizes`_ are specified or not. The choice of *how* "remaining"
-bits would be specified would tie in to the choice made for (1).
+Whether the "remaining" bits, left over after rounding up to the size specified
+in `Alignment and sizes`_, have a specified value. The choice of *how*
+"remaining" bits would be specified would tie in to the choice made for (1).
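+
+As a concrete illustration of these two decisions (our own example, not ABI
+text), consider an ``unsigned _BitInt(10)`` holding the value ``0x2A5`` in a
+32-bit register, with ``r`` marking the "remaining" bits of decision (2)::
+
+   unsigned _BitInt(10) v = 0x2A5uwb;
+   // Decision (1), least significant end:
+   //   rrrrrrrr rrrrrrrr rrrrrr10 10100101
+   // Decision (1), most significant end:
+   //   10101001 01rrrrrr rrrrrrrr rrrrrrrr
+   // Decision (2) is whether the r bits hold a specified value (zero, or a
+   // sign extension, say) or are left unspecified.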
 
 Options and their trade-offs
 
@@ -400,20 +401,20 @@ require updating every "chunk" in memory, hence we assume large values of option
 
 Option ``A`` has the following benefits:
 
+- Operations ``+,-,%,==,<=,>=,<,>,<<`` all work without any extra instructions
+  (which covers more of the common operations than the other representations).
+
 - For small values in memory, on AArch64, the operations like ``LDADD`` and
   ``LD{S,U}MAX`` both work (assuming the relevant register operand is
   appropriately shifted).
-- Operations ``+,-,%,==,<=,>=,<,>,<<`` all work without any extra instructions
-  (which is more of the common operations than other representations).
 
 It has the following negatives:
 
 - This would be a less familiar representation to programmers. Especially the
   fact that a ``_BitInt(8)`` would not have the same representation in a
-  register as a ``char`` could cause confusion (e.g. when debugging, or writing
-  assembly code). This would likely be increased if other architectures that
-  programmers may use have a more familiar representation.
+  register as a ``char`` could cause confusion (for example when debugging, or
+  writing assembly code). This would likely be increased if other
+  architectures that programmers may use have a more familiar representation.
 
 - Operations ``*,/``, saving and loading values to memory, and casting to
   another type would all require extra cost.
 
@@ -427,19 +428,17 @@ It has the following negatives:
 
 Option ``B`` has the following benefits:
 
-- For small values in memory, the AArch64 ``LDADD`` operations work naturally.
-
 - Operations ``+,-,*,<<``, narrowing conversions, and loading/storing to memory
   would all naturally work.
 
 - On AArch64 this would most likely match the expectation of developers, and
-  e.g. a ``_BitInt(8)`` would have the same representation as a ``char`` in
-  registers.
+  small power-of-two sizes would have the same representation as standard types
+  in registers. For example a ``_BitInt(8)`` would have the same representation
+  as a ``char`` in registers.
 
-It has the following negatives:
+- For small values in memory, the AArch64 ``LDADD`` operations work naturally.
 
-- The AArch64 ``LD{S,U}MAX`` operations would not work naturally on small values
-  of this representation.
+It has the following negatives:
 
 - Operations ``/,%,==,<,>,<=,>=,>>`` and widening conversions on operands coming
   from an ABI boundary would require masking the operands.
 
@@ -452,11 +451,11 @@ It has the following negatives:
 
 - If used in calls to variadic functions which were written for standard
   integral types this can give surprising results.
 
+- The AArch64 ``LD{S,U}MAX`` operations would not work naturally on small values
+  of this representation.
 
-Option ``C`` has the following benefits:
 
-- For small values in memory, the AArch64 ``LD{S,U}MAX`` operations work
-  naturally.
+Option ``C`` has the following benefits:
 
 - Operations ``==,<,<=,>=,>,>>``, widening conversions, and loading/storing to
   memory would all naturally work.
 
@@ -467,9 +466,10 @@ Option ``C`` has the following benefits:
 
 - If used in variadic function calls, mismatches between ``_BitInt`` types and
   standard integral types would not cause as much of a problem.
 
+- For small values in memory, the AArch64 ``LD{S,U}MAX`` operations work
+  naturally.
 
-It has the following negatives:
 
-- The AArch64 ``LDADD`` operations would not work naturally.
+It has the following negatives:
 
 - Operations ``+,-,*,<<`` would all cause the need for masking at an ABI
   boundary.
 
@@ -477,23 +477,26 @@ It has the following negatives:
 
 - On AArch64 this would not match the expectation of developers, with
   ``_BitInt(8)`` not matching the representation of a ``char``.
 
+- The AArch64 ``LDADD`` operations would not work naturally.
+
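+To make the masking costs above concrete, the following sketch (our own
+illustration with invented helper names, modelling a 32-bit register for
+brevity) shows what a compiler would conceptually insert for a signed
+``_BitInt(12)`` under each of the two least-significant-end options::
+
+   #include <stdint.h>
+
+   // Option B style: bits 0-11 hold the value and the upper bits are
+   // unspecified, so a comparison must first manufacture well-defined
+   // upper bits (assuming the usual arithmetic right shift):
+   static int32_t extend12(uint32_t r) {
+       return (int32_t)(r << 20) >> 20;     // sign-extend from bit 11
+   }
+
+   // Option C style: the register is kept sign-extended, so comparisons
+   // are free, but after an addition the invariant must be re-established
+   // before the value crosses an ABI boundary:
+   static int32_t add12(int32_t a, int32_t b) {
+       return extend12((uint32_t)(a + b));  // re-extend the 12-bit result
+   }
+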
 
 Summary, suggestion, and reasoning
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Overall it seems that option ``A`` is more performant for operations on small
-values. However, when acting on "large" values (i.e. greater than the size of
-one register) it loses some of that benefit. Storing to and from memory would
-also come at a cost for this representation. This is also likely to be the most
-surprising representation for developers on an Arm platform.
+values. However, when acting on "large" values (here defined as greater than
+the size of one register) it loses some of that benefit. Storing to and from
+memory would also come at a cost for this representation. This is also likely
+to be the most surprising representation for developers on an Arm platform.
 
 Between option ``B`` and option ``C`` there is not a great difference in
 performance characteristics. However it should be noted that option ``C`` is
 the most natural extension of the AArch32 PCS rules for unspecified bits in a
 register containing a small Fundamental Data Type, while option ``B`` is the
-most natural extension of the similar rules in AArch64 PCS. Furthermore, option
-``C`` would mean that accidental misuse of a bit-precise type instead of a
-standard integral type should not cause problems, while ``B`` could give strange
-values. This would be most visible with variadic functions.
+most natural extension of the similar rules in AArch64 PCS. Another distinction
+between the two is that option ``C`` would mean that accidental misuse of a
+bit-precise type instead of a standard integral type should not cause problems,
+while ``B`` could give strange values. This would be most visible with variadic
+functions.
 
 As mentioned above, both performance concerns and a familiar representation are
 valuable in the use-cases that we have identified. This has made the decision