Data Analysis with Python
In the last few weeks I needed to crunch some data. It was structured data, so I finally had a reason to jump into pivoting DataFrames in Pandas[1] – a thing I still knew (and know…) very little about.
I’ve been using Python for all kinds of visualization for quite some time now. It’s so versatile, productive, and handy! #♥
After finishing my paper, I wanted to briefly show my colleagues the basics of what they need to know to massage their data and make nice-looking plots from it. With Python. A kind of Data Analysis with Python 1-0-½.
Here are the slides, which scratch the surface of Matplotlib, Pandas, and Jupyter Notebooks. Also: Seaborn. Navigate with the space bar.
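For a taste of the kind of pivoting that got me started, here is a minimal sketch. The column names and numbers are made up for illustration and are not taken from the slides or from my paper; it only shows how `DataFrame.pivot` turns long-form rows into a wide table.

```python
import pandas as pd

# Hypothetical measurements in "long" form: one row per (machine, threads) run.
df = pd.DataFrame({
    "machine": ["A", "A", "B", "B"],
    "threads": [1, 2, 1, 2],
    "runtime_s": [10.0, 6.0, 8.0, 5.0],
})

# Pivot to "wide" form: one row per thread count, one column per machine.
wide = df.pivot(index="threads", columns="machine", values="runtime_s")
print(wide)
```

A wide table like this is usually what you want before plotting, since each column can become its own line or bar group.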
The presentation itself is done in a Jupyter Notebook. Hence the embedded HTML presentation with reveal.js, which Jupyter natively generates. If you’re looking for a more static version, there’s a PDF of it as well[2]. Also, the Notebook is available in this Gist, in case you’d like to see how it’s done.
Edit, 29 May: There’s a handy cheatsheet available in Pandas’ GitHub repository.
Let me know what you think of the slides. What would be your recommendations to further simplify or improve data analysis with Python? Tweet me!
[1] WTF you say? Well. Read on. Or just jump ahead to the presentation. It all makes sense. I promise.
[2] Which were hell to compile. That’s really not the strong suit of those HTML/JS presentation frameworks (and for me a show-stopper). I used the decktape method to get a PDF from the HTML and used pdfcrop to get rid of scrollbars.