For any and all questions relating to challenge 13.

For a tutorial on how to use Jupyter Notebook, we put together this video:

Still have questions? Read all the FAQs here.

To continue to play around with the datasets in a Jupyter environment, click here.

Anyone have a cleaner sol?
I didn't really see how we had to use sort for this problem. Sort might have made grabbing the rows where quality was equal to 7 or 8 easier, but this seemed more like a filtering problem.

```python
# Q1: quality of 8 or higher and residual sugar above 5
quality_filter = wine_df['quality'] >= 8
qf = wine_df[quality_filter]
rsf = qf['residual sugar'] > 5
qrsf = qf[rsf]
print(qrsf.index.values)

# Q2: quality of 7 or 8 and citric acid below 0.4 (assumes pandas is imported as pd)
quality_filter2 = wine_df['quality'] == 8
qf2 = wine_df[quality_filter2]
quality_filter3 = wine_df['quality'] == 7
qf3 = wine_df[quality_filter3]
qf4 = pd.concat([qf2, qf3])
citric_filter = qf4['citric acid'] < 0.4
qcf4 = qf4[citric_filter]
qcf4.count()
```
4 Likes

Here's mine; I combined multiple filters using AND logic:

#Q1
`wine_df[(wine_df['quality']>=8) & (wine_df['residual sugar']>=5)]`
#Q2
`wine_df[(wine_df['quality'].isin([7,8])) & (wine_df['citric acid']<0.4)].count()`
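For anyone who wants to try the AND-logic approach outside the challenge notebook, here is a minimal sketch. The small `wine_df` below is a made-up stand-in (the values are hypothetical, not the real wine data), and the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for wine_df; the values are invented for illustration.
wine_df = pd.DataFrame({
    "quality": [8, 7, 5, 8, 7, 6],
    "residual sugar": [6.4, 2.0, 5.5, 1.8, 5.1, 7.0],
    "citric acid": [0.30, 0.45, 0.10, 0.50, 0.20, 0.35],
})

# Each comparison yields a boolean Series; & combines them element-wise.
high_quality = wine_df["quality"] >= 8
sugary = wine_df["residual sugar"] > 5
q1 = wine_df[high_quality & sugary]

# isin() tests membership in a list of allowed values.
# The parentheses around the comparison are needed because & binds tighter than <.
q2_mask = wine_df["quality"].isin([7, 8]) & (wine_df["citric acid"] < 0.4)
q2_count = int(q2_mask.sum())  # True counts as 1, so sum() counts matching rows

print(q1.index.tolist())  # [0]
print(q2_count)           # 2
```

Summing the mask gives a single number, whereas `.count()` on the filtered frame returns one count per column.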

18 Likes

Q1:
`wine_df.sort_values(by=['quality','residual sugar'], ascending = False)`

Q2:
`wine_df[(wine_df['quality']>6) & (wine_df['citric acid']<0.4)]['quality'].count()`

I thought the biggest block meant the hardest puzzle, but it turned out to be an easy one.

2 Likes

Video Solution: https://youtu.be/gFAM1o-5TEY

For Q1 I just used two filters:
`wine_df[ wine_df['quality']>=8 ][ wine_df['residual sugar']>5 ]`
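One thing to watch with two filters in a row: if the second mask is built from the full `wine_df`, pandas has to realign it to the already-filtered frame and usually warns that the Boolean Series key will be reindexed. A minimal sketch of two variants that avoid this, using a made-up stand-in for `wine_df` (hypothetical values):

```python
import pandas as pd

# Hypothetical stand-in for wine_df; the values are invented for illustration.
wine_df = pd.DataFrame({
    "quality": [8, 7, 5, 8, 7, 6],
    "residual sugar": [6.4, 2.0, 5.5, 1.8, 5.1, 7.0],
})

# Step 1: keep the high-quality rows.
high_quality = wine_df[wine_df["quality"] >= 8]
# Step 2: build the second mask from the filtered frame so the indexes match.
chained = high_quality[high_quality["residual sugar"] > 5]

# Equivalent single step with both conditions combined.
combined = wine_df[(wine_df["quality"] >= 8) & (wine_df["residual sugar"] > 5)]

print(chained.equals(combined))  # True
```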

For Q2 I filtered on citric acid content, then grouped by the condition quality >= 7 and called count().
This splits the rows into a False group and a True group; the True count is the number of wines needed.
`wine_df[ wine_df['citric acid']<0.4 ].groupby( wine_df['quality']>=7 ).count()`
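To see what grouping by a condition produces, here is a small sketch on made-up data (the `wine_df` values and variable names are hypothetical). It uses `size()` to get one number per group and compares that with simply summing the mask.

```python
import pandas as pd

# Hypothetical stand-in for wine_df; the values are invented for illustration.
wine_df = pd.DataFrame({
    "quality": [8, 7, 5, 8, 7, 6],
    "citric acid": [0.30, 0.45, 0.10, 0.50, 0.20, 0.35],
})

low_citric = wine_df[wine_df["citric acid"] < 0.4]

# Grouping by a boolean Series gives at most two groups: False and True.
is_good = low_citric["quality"] >= 7
group_sizes = low_citric.groupby(is_good).size()
print(group_sizes)

# The size of the True group equals the number of True values in the mask.
direct = int(is_good.sum())
print(direct)  # 2
```

`size()` returns one number per group, while `count()` repeats the non-null count for every column.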

1 Like

My intuition was to just use filtering for both questions but the suggestion to use `value_counts` may produce a cleaner solution for Q2.

Q1:

`wine_df[(wine_df['quality'] >= 8) & (wine_df['residual sugar'] >= 5)]`

You could sort by quality descending instead of filtering but that just seems silly.

Q2:

`wine_df[(wine_df['citric acid'] < 0.4)]['quality'].value_counts()`
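A quick sketch of how the `value_counts` result can be turned into a single answer, again on a made-up stand-in for `wine_df` (hypothetical values). `reindex` with `fill_value=0` is my own choice here so that a quality score with no matching rows does not raise an error.

```python
import pandas as pd

# Hypothetical stand-in for wine_df; the values are invented for illustration.
wine_df = pd.DataFrame({
    "quality": [8, 7, 5, 8, 7, 6],
    "citric acid": [0.30, 0.45, 0.10, 0.50, 0.20, 0.35],
})

# Count how many low-citric wines there are at each quality score.
counts = wine_df[wine_df["citric acid"] < 0.4]["quality"].value_counts()

# Pick out the scores of interest; a missing score counts as 0.
answer = int(counts.reindex([7, 8], fill_value=0).sum())
print(answer)  # 2
```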

2 Likes

My approach:

```python
#Q1
wine_df_quality_res_filter = wine_df['quality'] >= 8
wine_df_qa_res = wine_df[wine_df_quality_res_filter].sort_values(by=['residual sugar'], ascending=False)

#Q2
wine_quality_df = wine_df[wine_df['quality'] >= 7]
wine_df_quality_res_filter2 = wine_quality_df['citric acid'] < 0.4
wine_quality_df_acid = wine_quality_df[wine_df_quality_res_filter2]
print(wine_quality_df_acid['citric acid'].value_counts().sum())
```

```python
# Answer to question 1
filtered_df = wine_df.loc[(wine_df['quality'] >= 8) & (wine_df['residual sugar'] > 5)]
filtered_df.index.values

# Answer to question 2
filtered_df2 = wine_df.loc[((wine_df['quality'] == 8) | (wine_df['quality'] == 7)) & (wine_df['citric acid'] < 0.4)]
filtered_df2.index.value_counts().sum()
```
2 Likes

I tried to make a filter that would check for both criteria at the same time, but wasn't having any luck, so I originally kept them separate. After playing around a bit more, I found another method that gave me the exact same result.

```python
# Q1 - quality of 8 or higher and a residual sugar level above 5?
# separate filters
Q = wine_df['quality'] >= 8
Qdf = wine_df[Q]

# Build the second mask from the filtered frame, after Qdf exists
RS = Qdf["residual sugar"] > 5
QRSdf = Qdf[RS]

QRSdf

# Q1 - alternate
QRSdf = wine_df[(wine_df['quality'] >= 8) & (wine_df['residual sugar'] > 5)]

# Q2 - quality of 8 and 7 and a citric acid level below 0.4?
df = wine_df[wine_df["quality"] >= 7]
df = df[df["citric acid"] < 0.4]
print(df.count()["quality"])
```

Had no idea why I would need to use .value_counts() but the solution is actually quite clever. I just used three filters.

I like using `query` for filtering and agree that sorting is not a good fit for the problem as phrased.

```python
# Q1
wine_df.query('quality >= 8 and `residual sugar` > 5').index.tolist()

# Q2
wine_df.query('`citric acid` < 0.4')['quality'].value_counts()[[7,8]].sum()
```
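For readers new to `query`, a minimal sketch showing that the query string selects the same rows as the bracket-and-mask form. The data is a made-up stand-in for `wine_df` (hypothetical values); the backticks are what let `query` refer to a column name containing a space.

```python
import pandas as pd

# Hypothetical stand-in for wine_df; the values are invented for illustration.
wine_df = pd.DataFrame({
    "quality": [8, 7, 5, 8, 7, 6],
    "residual sugar": [6.4, 2.0, 5.5, 1.8, 5.1, 7.0],
})

# Backticks quote column names that are not valid Python identifiers.
via_query = wine_df.query("quality >= 8 and `residual sugar` > 5")

# The same selection written with boolean masks.
via_mask = wine_df[(wine_df["quality"] >= 8) & (wine_df["residual sugar"] > 5)]

print(via_query.index.tolist())    # [0]
print(via_query.equals(via_mask))  # True
```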
1 Like

My method:

Q1.

`wine_df.describe()`
This gives a sense of the data and shows that the maximum for quality is 8.

`wine_df.sort_values(by=['quality', 'residual sugar'], ascending=False)`
We can now see the top-quality wines that satisfy our "8 or more" condition (8 is the max, remember) and look for residual sugar > 5, of which there are two.
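A small sketch of the sort-and-inspect approach on made-up data (the `wine_df` values are hypothetical), showing how sorting by two columns puts the best candidates at the top:

```python
import pandas as pd

# Hypothetical stand-in for wine_df; the values are invented for illustration.
wine_df = pd.DataFrame({
    "quality": [8, 7, 5, 8, 7, 6],
    "residual sugar": [6.4, 2.0, 5.5, 1.8, 5.1, 7.0],
})

# Sort by quality first, then by residual sugar within each quality, both descending.
ranked = wine_df.sort_values(by=["quality", "residual sugar"], ascending=False)

# The top rows are the highest-quality wines, sweetest first.
print(ranked.head(3))
print(ranked.index.tolist())  # [0, 3, 4, 1, 5, 2]
```

Reading the answer off the sorted table works for a small dataset, but it relies on eyeballing the rows, which is why many replies here prefer a filter.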

Q2.

`df1 = wine_df[wine_df['quality']>=7]`
for quality of 7 or higher

`(df1['citric acid']<0.4).value_counts()`
returns the counts of `True` and `False` for that condition; the `True` count gives you the answer.
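A minimal sketch of counting a boolean condition with `value_counts`, using a made-up stand-in for `wine_df` (hypothetical values). It also shows `sum()` as a shortcut when only the `True` count is needed.

```python
import pandas as pd

# Hypothetical stand-in for wine_df; the values are invented for illustration.
wine_df = pd.DataFrame({
    "quality": [8, 7, 5, 8, 7, 6],
    "citric acid": [0.30, 0.45, 0.10, 0.50, 0.20, 0.35],
})

df1 = wine_df[wine_df["quality"] >= 7]

# The comparison is a boolean Series; value_counts tallies True and False.
mask = df1["citric acid"] < 0.4
tally = mask.value_counts()
print(tally)

# Summing the mask counts only the True values.
n_true = int(mask.sum())
print(n_true)  # 2
```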

```python
print(wine_df[(wine_df['quality']>=8) & (wine_df['residual sugar']>5)].index.values)
print(len(wine_df[(wine_df['quality']>=7) & (wine_df['quality']<=8) & (wine_df['citric acid']<0.4)].values))
```
1 Like

Is this challenge really about sort and value counts if many people used a faster, easier solution with less code that doesn't use either?

3 Likes

My Solution:
`df1 = wine_df[(wine_df['quality']>=8) & (wine_df['residual sugar']>5)]`
`print(df1)`
`df2 = wine_df[(wine_df['citric acid']<0.4)]`
`df2['quality'].value_counts()`

Hi All!

I am loving seeing all the different ways people are solving these questions and the discussion around different approaches! One of the great things about data analytics problems is the many different routes that can be taken to arrive at a solution. `.sort_values()` and `.value_counts()` are two functions that can be added to the toolbox as we work through this 21-day challenge. They can also be helpful in breaking up your code into digestible steps. Sometimes, even if I am able to achieve something in a single line of code, I will break it up into a few steps to make it easier to read for someone else stepping into my code. Remember, there's never just one right way to do something, so have fun and try what works for you in your problem-solving process!

Enjoy the rest of your Monday everyone!

3 Likes

The lesson is that there are many solutions, and the corollary is that the solution suggested by the problem is rarely the best one.

3 Likes