Pandas is one of those libraries that suffers from the "guitar principle" (also known as the "Bushnell Principle" in video game circles): it is easy to use, but difficult to master.
Truly, it is one of the most straightforward and powerful data manipulation libraries, yet, because it is so easy to use, few people spend much time trying to understand the best, most Pythonic way to use the library to its full extent.
If you use Pandas and haven't read Matt Harrison's book, chances are you're like that Chad at the picnic or camping trip who pulls out his guitar and strums the same basic chords for an hour straight... Well, NO MORE!
Matt Harrison is ready to drop some knowledge on you and have you riffing your own data manipulation solos like you're Slash in "November Rain", or Prince in "Purple Rain"...
The book goes beyond explaining the data structures and methods that underpin Pandas: Harrison also provides a ton of practical advice on best practices in data manipulation and transformation.
For instance, by the time you're done you'll know which functions to use to leverage Pandas' vectorized structures so your code is fast and efficient, which data types provide huge savings in memory, how to chain operations so you're always working with the correct intermediate DataFrame, how to use indices to give you superpowers over your data, how to debug chains, how to merge, join, melt, and style, and more.
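To give a taste of what that looks like in practice, here is a minimal sketch of my own (not taken from the book). The toy data and the `peek` helper are made up for illustration; everything else is standard Pandas. It shows a memory-friendly dtype, a vectorized calculation, a chained pipeline, and one simple way to inspect a chain mid-flight.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
raw = pd.DataFrame({
    "city": ["Austin", "Boston", "Austin", "Denver"] * 1000,
    "units": [3, 5, 2, 7] * 1000,
    "price": [9.99, 14.50, 9.99, 20.00] * 1000,
})

# Repeated strings are usually much cheaper to store as a categorical.
before = raw["city"].memory_usage(deep=True)
after = raw["city"].astype("category").memory_usage(deep=True)


def peek(df, label):
    """Hypothetical debugging helper: report the shape, then pass the frame through."""
    print(f"{label}: {df.shape}")
    return df


tidy = (
    raw
    .astype({"city": "category", "units": "int16"})      # smaller dtypes
    .assign(revenue=lambda d: d["units"] * d["price"])   # vectorized, no Python loop
    .pipe(peek, "after assign")                          # look inside the chain
    .query("revenue > 20")                               # filter on the new column
    .set_index("city")                                   # index enables label lookups
)

denver = tidy.loc["Denver"]  # label-based selection through the index
```

Because every step returns a new DataFrame, there are no half-updated intermediate variables lying around, and dropping a `.pipe(peek, ...)` between steps is a cheap way to see what the chain is doing.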
It is by far the best book you can get yourself if you want to take your data science skills to the next level. After all, they say modern data science is 90% data cleaning. I mostly agree.
I have recommended this book to every member of my team. REQUIRED READING.
Highest possible recommendation.