[pypy-commit] extradoc extradoc: English language cleanups.

Tue Jan 10 19:22:58 CET 2012

Author: edelsohn
Branch: extradoc
Changeset: r4016:0d508d74845b
Date: 2012-01-10 13:22 -0500
http://bitbucket.org/pypy/extradoc/changeset/0d508d74845b/

Log:	English language cleanups.

diff --git a/blog/draft/laplace.rst b/blog/draft/laplace.rst
--- a/blog/draft/laplace.rst
+++ b/blog/draft/laplace.rst
@@ -4,9 +4,10 @@
 Hello.
 
 We're excited to let you know about some of the great progress we've made on
-NumPyPy -- both completeness and performance. Here we'll mostly talk about the
-performance side and how far we have come so far. **Word of warning:** this
-work isn't done - we're maybe half way to where we want to be and there are
+NumPyPy, in both completeness and performance. In this blog entry we will
+mostly talk about performance and how much progress we have made so far.
+**Word of warning:** this
+work isn't done -- we're maybe halfway to where we want to be and there are
 many trivial and not so trivial optimizations to be written. (For example, we
 haven't even started to implement important optimizations, like vectorization.)
 
@@ -27,10 +28,10 @@
 Numerically the algorithms used are identical, however exact data layout in
 memory differs between them.
 
-**A note about all the benchmarks:** they were each run once, but the
+**A note about all the benchmarks:** each was run once, but the
 performance is very stable across runs.
 
-Starting with the C version, it implements a dead simple laplace transform
+To start with, the C version implements a trivial Laplace transform
 using two loops and double-reference memory (array of ``int*``). The double
 reference does not matter for performance and the two algorithms are
 implemented in ``inline-laplace.c`` and ``laplace.c``. They were both compiled
@@ -55,13 +56,14 @@
 | inline_slow python    | 278                  | 23.7               |
 +-----------------------+----------------------+--------------------+
 
-An important thing to notice here is that the data dependency in the inline
-version causes a huge slowdown for the C versions. This is already not too bad
-for us though, the braindead Python version takes longer and PyPy is not able
-to take advantage of the knowledge that the data is independent, but it is in
-the same ballpark as the C versions - **15% - 170%** slower, but the algorithm
-you choose matters more than the language. By comparison, the slow versions
-take about **5.75s** each on CPython 2.6 per iteration, and by estimating,
+An important thing to notice is that the data dependency of the inline
+version causes a huge slowdown for the C versions. This is not a severe
+disadvantage for us, though -- the brain-dead Python version takes longer
+and PyPy is not able to take advantage of the knowledge that the data is
+independent. The results are in the same ballpark as the C versions --
+**15% - 170%** slower, but the algorithm
+one chooses matters more than the language. By comparison, the slow versions
+take about **5.75s** each on CPython 2.6 per iteration and, by estimation,
 are about **200x** slower than the PyPy equivalent, if I had the patience to
 measure the full run.
 
@@ -78,7 +80,7 @@
 
 We need 3 arrays here - one is an intermediate (PyPy only needs one, for all of
 those subexpressions), one is a copy for computing the error, and one is the
-result. This works automatically, since in NumPy ``+`` or ``*`` creates an
+result. This works automatically because in NumPy ``+`` or ``*`` creates an
 intermediate, while NumPyPy avoids allocating the intermediate if possible.
 
 ``numeric_2_time_step`` works in pretty much the same way::
@@ -90,7 +92,7 @@
 
 except the copy is now explicit rather than implicit.
 
-``numeric_3_time_step`` does the same thing, but notices you don't have to copy
+``numeric_3_time_step`` does the same thing, but notices that one doesn't have to copy
 the entire array, it's enough to copy the border pieces and fill rest with
 zeros::
 
@@ -104,12 +106,12 @@
                               (src[1:-1,0:-2] + src[1:-1, 2:])*dx2)*dnr_inv
 
 ``numeric_4_time_step`` is the one that tries hardest to resemble the C version.
-Instead of doing an array copy, it actually notices that you can alternate
+Instead of doing an array copy, it actually notices that one can alternate
 between two arrays. This is exactly what the C version does. The
 ``remove_invalidates`` call is a PyPy specific hack - we hope to remove this
-call in the near future, but in short it promises "I don't have any unbuilt
-intermediates that depend on the value of the argument", which means you don't
-have to compute sub-expressions you're not actually using::
+call in the near future, but, in short, it promises "I don't have any unbuilt
+intermediates that depend on the value of the argument", which means one doesn't
+have to compute sub-expressions one is not actually using::
 
         remove_invalidates(self.old_u)
         remove_invalidates(self.u)
@@ -120,7 +122,7 @@
 
 This one is the most comparable to the C version.
 
-``numeric_5_time_step`` does the same thing, but notices you don't have to copy
+``numeric_5_time_step`` does the same thing, but notices that one doesn't have to copy
 the entire array, it's enough to just copy the edges. This is an optimization
 that was not done in the C version::
 
@@ -158,9 +160,9 @@
 the C version (or as fast as we'd like them to be), but we're already much
 faster than NumPy on CPython, almost always by more than 2x on this relatively
 real-world example. This is not the end though, in fact it's hardly the
-beginning: as we continue work, we hope to make even much better use of the
+beginning! As we continue this work, we hope to make even better use of the
 high level information that we have. Looking at the generated assembler by
-gcc in this example it's pretty clear we can outperform it, thanks to better
+gcc in this example, it's pretty clear we can outperform it, thanks to better
 aliasing information and hence better possibilities for vectorization.
 Stay tuned.