[pypy-commit] extradoc extradoc: some more changes for my talk

plan_rich pypy.commits at gmail.com
Thu Oct 6 08:01:12 EDT 2016


Author: Richard Plangger <planrichi at gmail.com>
Branch: extradoc
Changeset: r5734:4fc29c6d0bc8
Date: 2016-10-06 14:00 +0200
http://bitbucket.org/pypy/extradoc/changeset/4fc29c6d0bc8/

Log:	some more changes for my talk

diff --git a/talk/pyconza2016/pypy/img/how-jit.png b/talk/pyconza2016/pypy/img/how-jit.png
new file mode 100644
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..ad9ce720343f9b461e202faacd566abc3a331b61
GIT binary patch

[cut]

diff --git a/talk/pyconza2016/pypy/index.html b/talk/pyconza2016/pypy/index.html
--- a/talk/pyconza2016/pypy/index.html
+++ b/talk/pyconza2016/pypy/index.html
@@ -32,16 +32,27 @@
                 </section>
                 <section>
                     <section>
-                        <h1>PyPy is ..</h1>
+                        <h1>More "general" PyPy talk</h1>
+			<p>Goals:</p>
+			<ul>
+				<li>An approach to optimize Python programs</li>
+				<li>Examples</li>
+				<li>How not to start optimizing</li>
+				<li>What is PyPy up to now?</li>
+			</ul>
                     </section>
+                </section>
+                <section>
                     <section>
-			<p>... a software project ... </p>
+                        <h1>PyPy is a ...</h1>
+			<p class="fragment">... <strong>fast virtual machine for Python</strong> </p>
+			<p class="fragment">developed by researchers, freelancers and many contributors.</p>
                     </section>
+                </section>
+                <section>
                     <section>
-			<p>... assembling a <strong>fast virtual machine for Python</strong> ... </p>
-                    </section>
-                    <section>
-			<p>developed by many researchers, freelancers and many contributors.</p>
+                        <p><code>$ python yourprogram.py</code></p>
+                        <p><code>$ pypy yourprogram.py</code></p>
                     </section>
                 </section>
                 <section>
@@ -66,7 +77,7 @@
                     <section>
                         <h1>About me</h1>
 			<p>Working on PyPy (+1,5y)</p>
-			<p>Master degree - Sticked with PyPy</p>
+			<p>Master thesis → GSoC 2015 → PyPy</p>
 			<p>living and working in Austria</p>
                     </section>
                 </section>
@@ -85,17 +96,18 @@
                         <p><strong>Neither</strong></p>
                     </section>
                     <section>
-                        <p>Run you program an measure your criteria</p>
+                        <p>Run your program and measure your <strong>criteria</strong></p>
                     </section>
                     <section>
-                        <h1>Criteria examples?</h1>
+                        <h1>For example?</h1>
 			<ul>
 				<li>CPU time</li>
 				<li>Peak Heap Memory</li>
 				<li>Requests per second</li>
+				<li>Latency</li>
 				<li>...</li>
 			</ul>
-			<p>Dissatisfaction with one attribute of your program!</p>
+			<p>Dissatisfaction with one criterion of your program!</p>
                     </section>
                 </section>
 		<section>
@@ -103,22 +115,23 @@
 			    <h1>Some theory ... </h1>
 			</section>
                 	<section>
-			    <h1>Hot spots</h1>
-			    <p>Loops!</p>
-			    <p>What kind program can you build without loops?</p>
-			</section>
-                	<section>
 			    <h1>Complexity</h1>
-			    <p>Big-O-Notation - Express how many steps a program to complete at most</p>
+			    <p>Big-O-Notation</p>
+			    <p>Classify e.g. a function and its processing time</p>
+			    <p>Increase input size to the function</p>
 			</section>
                 	<section>
 				<ul>
-			    		<li><code>a = 3</code> # runs in O(1)</li>
-			    		<li><code>[x+1 for x in range(n)]</code> # runs in O(n)</li>
-                                        <li><code>[[x+y for x in range(n)] for y in range(m)]</code> # O(n*m)</li>
+			    		<li><code>a = 3</code> # O(1)</li>
+			    		<li><code>[x+1 for x in range(n)]</code> # O(n)</li>
+                                        <li><code>[[x+y for x in range(n)] \ <br> for y in range(m)]</code> # O(n*m), at most O(n**2) if n > m</li>
 				</ul>
 			</section>
                 	<section>
+				Bubble sort vs Quick Sort
+				<p>O(n**2) vs O(n log n)</p>
+			</section>
+                	<section>
 				<h1>Complexity</h1>
 				<p>Yields the most gain, independent from the language</p>
                                 <p>E.g. prefer O(n) over O(n**2)</p>
@@ -144,28 +157,43 @@
 					<li>Written in Python</li>
 					<li>Moved to vmprof.com</li>
 					<li>Log files can easily take up to 40MB uncompressed</li>
-					<li>Takes ~14 seconds to parse with CPython</li>
+					<li>Takes ~10 seconds to parse with CPython</li>
 					<li>Complexity is linear to input size of the log file</li>
 				</ul>
 			</section>
 			<section>
+				<p><h3>Thanks to Python</h3></p>
 				<p class="advantage">+ Little development time</p>
 				<p class="advantage">+ Easy to test</p>
-				<p><h3>Thanks to Python</h3></p>
 			</section>
 			<section>
 				<p class="disadvantage">- Takes too long to parse</p>
-				<p>Our criteria: CPU time to long</p>
+				<p class="disadvantage">- Parsing is done on each request</p>
+				<p>Our criteria: CPU time too long + requests per second</p>
+				<p>(Many objects are allocated)</p>
 			</section>
 			<section>
-				<p class="">Several possible ways</p>
+				<h1>Suggestion</h1>
 				<p>Caching</p>
 				<p>Reduce CPU time</p>
 				<p>Let's have both</p>
 			</section>
 			<section>
-				<p>Caching - Easily done with django caching frame work</p>
-				<p>Reduce CPU time - Look at vmprof</p>
+				<p>Caching - Easily done with your favourite caching framework</p>
+				<p>Reduce CPU time - PyPy seems to be good at that?</p>
+			</section>
+			<section>
+				<h1>Let's run it...</h1>
+				<p><code>$ cpython2.7 parse.py 40mb.log<br>~ 10 seconds</code></p>
+				<p><code>$ pypy2 parse.py 40mb.log<br>~ 2 seconds</code></p>
+			</section>
+			<section>
+				<h1>Caching</h1>
+				<p>Requests really feel instant after the log has been loaded once</p>
+				<p>Precache</p>
+			</section>
+			<section>
+				<h1>The lazy approach of optimizing Python</h1>
 			</section>
 			<section>
 				<h1>VMProf</h1>
@@ -177,14 +205,16 @@
 			</section>
 			<section data-background="img/vmprof-screen-pypy.png">
 			</section>
-			<section>
-				<h1>~4 times faster on PyPy</h1>
-			</section>
                 </section>
 		<section>
 			<section>
 				<h1>Introducing PyPy's JIT</h1>
 			</section>
+                	<section>
+			    <h1>Hot spots</h1>
+			    <p>Loops / Repeat construct!</p>
+			    <p>What kind of program can you build without loops?</p>
+			</section>
 			<section>
 				<h1>A simplified view</h1>
 				<ol>
@@ -193,12 +223,17 @@
 					<li>Optimization stage</li>
 					<li>Machine code generation</li>
 				</ol>
-				<p>Cannot represent control flow as a graph (other than loop jumps)</p>
-				<p>Guards ensure correctness</p>
+				
 			</section>
 			<section>
 				<h1>Beyond the scope of loops</h1>
-				<p>Frequent guard failure trigger recording</p>
+				<p>Guards ensure correctness</p>
+				<p>Frequent guard failure triggers recording</p>
+			</section>
+			<section>
+				<h1>Perception</h1>
+				<img src="img/how-jit.png">
+				<small>http://abstrusegoose.com/secretarchives/under-the-hood - CC BY-NC 3.0 US</small>
 			</section>
 			<section data-background-image="img/jitlog.png">
 				<a href="http://vmprof.com/#/7930e1f54f9eee75084738aafa6cb612/traces">→ link</a>
@@ -209,6 +244,17 @@
 				<p>Helps you to learn and understand PyPy</p>
 				<p>Provided at vmprof.com</p>
 			</section>
+			<section>
+				<h1>Properties & Tricks</h1>
+				<ul>
+					<li>Type specialization</li>
+					<li>Object unboxing</li>
+					<li>GC scheme</li>
+					<li>Dicts</li>
+					<li>Dynamic class creation (Instance maps)</li>
+					<li>Function calls (+ Inlining)</li>
+				</ul>
+			</section>
                 </section>
 		<section>
 			<section>
@@ -218,11 +264,11 @@
 			</section>
 			<section>
 				<h1>Magnetic</h1>
-				<p>marketing tech company</p>
-				<p>switched to PyPy 3 years ago</p>
+				<p>Marketing tech company</p>
+				<p>Switched to PyPy 3 years ago</p>
 			</section>
 			<section>
-				<h1>Q: what does your service do?</h1>
+				<h1>Q: What does your service do?</h1>
 				<p>A: ... allow generally large companies to send targeted marketing (e.g. serve ads) to people based on data we have learned </p>
 			</section>
 			<section>
@@ -242,9 +288,36 @@
 				<p>So it spends lots of time blocking</p>
 			</section>
                 </section>
+		<section>
+			<section>
+				<h1>timeit</h1>
+				<p>Why not use perf?</p>
+				<p class="fragment">Try timeit on PyPy</p>
+			</section>
+			<section>
+				<h1>Python 3.5</h1>
+				<p>Progressed quite a bit</p>
+				<p class="fragment">async io</p>
+				<p class="fragment">Many more small details (sprint?)</p>
+			</section>
+			<section>
+				<h1>C-Extensions</h1>
+				<p>NumPy on top of the emulated layer</p>
+				<p>Boils down to managing PyPy & CPython objects</p>
+			</section>
+                </section>
+		<section>
+			<section>
+				<h1>Closing example</h1>
+				<p>How to move from CPU-limited to network-limited</p>
+				<a href="https://www.reddit.com/r/Python/comments/kt8bx/ask_rpython_whats_your_experience_with_pypy_and/">link</a>
+			</section>
+
+		</section>
                 <section>
                     <h4>Questions?</h4>
                     <a href="morepypy.blogspot.com">morepypy.blogspot.com</a><br>
+		    <a href="">software at vimloc.systems</a><br>
 		    Join on IRC <a href="">#pypy</a>
                 </section>
             </div>
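
Note on the "Bubble sort vs Quick Sort" slide in this changeset: the slide contrasts O(n**2) with O(n log n). A minimal sketch of how that gap could be measured is given here; it is not part of the commit. The names `bubble_sort` and `measure` are hypothetical, and Python's built-in `sorted` (Timsort, also O(n log n)) stands in for quick sort.

```python
import random
import timeit


def bubble_sort(items):
    """Return a sorted copy of items using bubble sort (O(n**2) comparisons)."""
    result = list(items)
    n = len(result)
    for i in range(n):
        swapped = False
        # After pass i, the last i elements are already in their final place.
        for j in range(n - 1 - i):
            if result[j] > result[j + 1]:
                result[j], result[j + 1] = result[j + 1], result[j]
                swapped = True
        if not swapped:
            # No swaps in a full pass means the list is sorted.
            break
    return result


def measure(func, data, repeat=3):
    """Best wall-clock time in seconds over `repeat` single calls of func(data)."""
    return min(timeit.repeat(lambda: func(data), number=1, repeat=repeat))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so every run sorts the same input
    for n in (250, 500, 1000):
        data = [rng.random() for _ in range(n)]
        # Columns: input size, bubble sort seconds, built-in sort seconds.
        print(n, measure(bubble_sort, data), measure(sorted, data))
```

Doubling `n` should roughly quadruple the bubble sort time, while the built-in sort grows only slightly faster than linearly. Running the same script under both `python` and `pypy` speeds up the bubble sort loop but does not change how either column scales.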

