Effective Pandas is an excellent opinionated guide to pandas. “Opinionated” matters in the context of pandas: it’s a very flexible library that gives you enough rope to hang yourself, and it often contradicts the Zen of Python’s maxim that “There should be one-- and preferably only one --obvious way to do it.” This can lead to some very messy code, in which a time-pressed data scientist ends up melding several different programming philosophies just to get an aggregation to work.

No doubt some of the people reading this will consider “effective pandas” to be an oxymoron. That’s reasonable. Pandas seldom feels like it was “designed” in the way that R’s tidyverse is. It is a collection of tricks, wherein fluency only arises from many hours spent pulling one’s hair out.

Effective Pandas is Harrison’s effort to define and encourage “idiomatic pandas”, emphasising chaining. It just so happens that chaining is the style of pandas that I have converged on due to its readability and elegance. Seeing a nice piece of chained pandas should mellow the complaints of tidyverse folks:

[Image: an example of chained pandas code]
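For readers who can't see the image, here is a minimal sketch of what chained pandas tends to look like. It is my own illustration rather than an excerpt from the book, and the `sales` data and its column names are invented for the example.

```python
import pandas as pd

# Hypothetical sales data, invented purely for illustration.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south", "east"],
        "units": [10, 3, 7, 12, 5],
        "unit_price": [2.5, 4.0, 2.5, 4.0, 3.0],
    }
)

# Each step returns a new DataFrame, so the whole transformation reads
# top to bottom without intermediate variables.
summary = (
    sales
    .assign(revenue=lambda df: df["units"] * df["unit_price"])  # derive a column
    .query("units >= 5")                                        # drop small orders
    .groupby("region", as_index=False)
    .agg(total_revenue=("revenue", "sum"), orders=("units", "count"))
    .sort_values("total_revenue", ascending=False)
    .reset_index(drop=True)
)

print(summary)
```

The parentheses around the chain are what allow one method per line, which is most of the readability win.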

One area that wasn’t properly addressed (which is why I give this 4 stars instead of 5) is memory usage and performance. This matters in practice, because some pandas methods return a copy of the object while others modify it in place, and it is not always obvious which is which.
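To make the copy versus in-place distinction concrete, here is a small sketch of my own (not taken from the book), using an invented `df`. It also shows `memory_usage(deep=True)`, which is one way to check how much memory a frame occupies. Exact byte counts depend on the pandas version and dtypes, so none are quoted here.

```python
import pandas as pd

# Hypothetical data, invented for illustration; repeated so sizes are non-trivial.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima"] * 500,
        "temp": [3, 24] * 500,
    }
)

# assign() returns a new DataFrame; the original is left untouched.
with_f = df.assign(temp_f=lambda d: d["temp"] * 9 / 5 + 32)
print("temp_f" in df.columns)  # False: df itself was not modified

# Column assignment with [] modifies the existing object in place.
df["temp_k"] = df["temp"] + 273.15

# Methods called with inplace=True mutate the object and return None.
result = df.sort_values("temp", inplace=True)
print(result)  # None

# memory_usage(deep=True) reports bytes per column, including string contents.
before = df.memory_usage(deep=True).sum()

# Converting a low-cardinality string column to category usually shrinks it.
compact = df.astype({"city": "category"})
after = compact.memory_usage(deep=True).sum()
print(before, after)
```

Chaining leans on the copy-returning methods, so it is worth measuring on large frames before assuming it is free.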

This aside, Effective Pandas is a useful and readable outline of an important tool; it has the flavour of a user guide rather than a documentation reference. I skimmed through some of the chapters because I’m reasonably familiar with pandas, but I’d recommend the book to anyone who uses pandas a lot.