Bluebird8203, I like your use of vert=False for this box plot. Regarding your question, from my perspective this is an atypical example of a box plot. There are over 11000 books in the dataset, and the majority of samples are represented by the box, but the box is overshadowed by the outliers, which does give a misleading visual depiction. The upper whisker extends to the largest value within Q3 + 1.5 * the interquartile range. The 75th percentile page count is 416 (from df.describe()), while the upper whisker is at about 750. The maximum is about 9 times the upper whisker value. So, while the plot is busy with outliers, they are still a small number of samples. It is the 9x range of those outliers that skews the box plot visual.
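To make the whisker rule concrete, here is a minimal sketch of the calculation using only the standard library. The page counts are made up for illustration (they are not the book dataset), and upper_whisker is a hypothetical helper name, not something from matplotlib or pandas.

```python
from statistics import quantiles

def upper_whisker(values):
    """Return (fence, whisker) for the usual 1.5 * IQR rule.

    fence   = Q3 + 1.5 * IQR
    whisker = largest data point at or below the fence
    """
    # method="inclusive" uses linear interpolation, like df.describe()
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    fence = q3 + 1.5 * (q3 - q1)
    whisker = max(v for v in values if v <= fence)
    return fence, whisker

# Made-up page counts, for illustration only
pages = [100, 200, 250, 300, 350, 400, 450, 3000]
fence, whisker = upper_whisker(pages)

# Anything above the fence is drawn as an individual outlier point
outliers = [p for p in pages if p > fence]
print(fence, whisker, outliers)
```

One extreme value is enough to stretch the axis far beyond the box, which is the effect you are seeing with the page counts.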
Try the box plot on other datasets, like the wine dataset. You can draw multiple box plots of one variable on a single figure, one per wine quality level, and you’ll see a set of staggered boxes.
for i in range(1,7):
or use seaborn - it does more with less code
import seaborn as sns
sns.boxplot(x=df['quality'], y=df['citric acid'])
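If you would rather stay with plain matplotlib, the per-quality idea could be sketched roughly like this. The tiny DataFrame is made up so the snippet runs on its own, and the column names 'quality' and 'citric acid' are assumed to match the wine dataset; swap in your real df.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values for illustration only; replace with the real wine DataFrame
df = pd.DataFrame({
    "quality": [5, 5, 5, 6, 6, 6, 7, 7, 7],
    "citric acid": [0.10, 0.20, 0.30, 0.20, 0.30, 0.40, 0.30, 0.40, 0.50],
})

# One array of citric acid values per quality level
levels = sorted(df["quality"].unique())
groups = [df.loc[df["quality"] == q, "citric acid"].to_numpy() for q in levels]

fig, ax = plt.subplots()
ax.boxplot(groups)  # one box per entry in groups, at positions 1..len(groups)
ax.set_xticks(range(1, len(levels) + 1))
ax.set_xticklabels([str(q) for q in levels])
ax.set_xlabel("quality")
ax.set_ylabel("citric acid")
fig.savefig("wine_boxplots.png")
```

This draws the same kind of picture as the seaborn call, just with the grouping done by hand.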