For any and all questions relating to challenge 6.

For a tutorial on how to use Jupyter Notebook, we put together this video:

Still have questions? Read all the FAQs here.

Don't need to do the work to answer Q2 & Q3 as only one response has the correct answer to Q1. Have thus far been able to do all the challenges with exceedingly little Python.

16 Likes

I will admit, that part was funny

To me, else is a great tool to be saved for when it's needed.

import pandas as pd
hole_series=pd.Series(hole_sizes)
total_cost = 0.0
for i in range (len(hole_sizes)):
    total_cost=total_cost+1.30
    if hole_sizes[i]>=20:
        total_cost=total_cost+0.30
    if hole_sizes[i]>=70:
        total_cost=total_cost+0.50
print ("Mean size:", hole_series.mean(), "mm, averaging \$", round(total_cost/len(hole_sizes), 3), ", total \$", round(total_cost, 2))
print ("From max", hole_series.max(), "mm down to ", hole_series.min(), "mm.")

1 Like

`total_cost=total_cost+1.30` can be rewritten as `total_cost += 1.30` for even more succinctness!

6 Likes

This challenge was alright!
But for something to introduce pandas we didn't really use any. Unless there's a nice map function I don't know about, I just used a for loop for Q2 & Q3.
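On the map question: pandas does have one. `Series.map` applies a function to every element. A minimal sketch, assuming the price tiers used elsewhere in this thread (1.30 under 20 mm, 1.60 from 20 up to 70 mm, 2.10 at 70 mm and above); the helper `price_for` and the sample sizes are made up for illustration and are not the challenge data:

```python
import pandas as pd


def price_for(size_mm):
    """Hypothetical helper: repair price for one hole, by size tier."""
    if size_mm >= 70:
        return 2.10
    if size_mm >= 20:
        return 1.60
    return 1.30


# Made-up sample sizes in mm, not the challenge data.
sizes = pd.Series([5, 20, 69, 70, 100])

# Series.map calls price_for once per element and returns a new Series.
prices = sizes.map(price_for)

print(prices.tolist())
print("Total:", round(prices.sum(), 2))
```

This still runs Python code per element under the hood, so it is mostly a readability win over the explicit for loop.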

3 Likes

The mean isn't what it actually is; they made a mistake. In case anyone here is confused, just do it on the website and you'll get it.

yeah … there is no value near 20ish for Q1

Well, the line with the five listed hole sizes threw me. At first I worked the whole thing using just those five (couldn't figure out why that line was there). Good news is once I figured out I needed all 100, since the code was already written, it solved itself pretty fast! And this is why we code! Whether 5 numbers or 100, the code runs exactly the same!
Now can anyone tell me why this line was included in the intro?

``hole_sizes = (hole_sizes[:5])``
1 Like

As per the hint, not sure if they wanted us to use pandas or not. I ended up using it… saves so many calculations.

``````
import random

import pandas as pd

random.seed(34)

hole_sizes = [random.randint(1, i) for i in range(1, 101)]
random.shuffle(hole_sizes)

series = pd.Series(hole_sizes)
print(f"Average sized hole {series.mean()}")
hole_prices = []
for i in range(len(hole_sizes)):
    if hole_sizes[i] < 20:
        hole_prices.append(1.30)
    elif hole_sizes[i] >= 20 and hole_sizes[i] < 70:
        hole_prices.append(1.6)
    else:
        hole_prices.append(2.10)
price_series = pd.Series(hole_prices)
print(f"Average cost to fix a hole is {price_series.mean()}")
print(f"Total cost to fix all holes is {sum(hole_prices)}")
print(f"Max size hole is {series.max()}")
print(f"Min size hole is {series.min()}")
``````
8 Likes

You can, but it's more for yourself to practice the lesson, if you wanted!

1 Like

I think they were just showing a sample of the data in the dataset. They just chose to show a sample rather than output a massive list of 100 items. I suppose it's also a reminder of a previous challenge to refresh our memory of list syntax.

Should really avoid for-loops when you're doing data science with Python or R. In fact, there's no need to use any for-loops or if-statements for this challenge at all if you know how to utilize pandas.
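One possible loop-free version, as a rough sketch rather than the intended solution: `pd.cut` can bin the sizes into price tiers in a single call. The tier boundaries (20 and 70 mm) and prices (1.30, 1.60, 2.10) are taken from the code posted in this thread; the sample sizes below are made up and are not the challenge data.

```python
import pandas as pd

# Made-up sample sizes in mm, not the challenge data.
sizes = pd.Series([5, 19, 20, 69, 70, 100])

# With right=False the bins are [0, 20), [20, 70) and [70, inf),
# so exactly 20 lands in the middle tier and exactly 70 in the top tier.
tier_prices = pd.cut(
    sizes,
    bins=[0, 20, 70, float("inf")],
    labels=[1.30, 1.60, 2.10],
    right=False,
).astype(float)

print("Mean size:", sizes.mean())
print("Largest / smallest:", sizes.max(), sizes.min())
print("Mean cost:", round(tier_prices.mean(), 2))
print("Total cost:", round(tier_prices.sum(), 2))
```

The `astype(float)` step converts the categorical labels back to plain numbers so that `mean()` and `sum()` work on them.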

3 Likes

That's it.
When you write [:5], it means that you want to show only the first 5 elements of the given list.
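A quick illustration of that slice, using a short stand-in list instead of the real 100 values:

```python
# Stand-in data; the challenge list has 100 values.
hole_sizes = list(range(10, 110, 10))  # [10, 20, ..., 100]

# [:5] takes everything from the start up to, but not including, index 5.
first_five = hole_sizes[:5]

print(first_five)       # [10, 20, 30, 40, 50]
print(len(hole_sizes))  # 10, because slicing builds a new list
```

Note that assigning the slice back to the same name, as in `hole_sizes = (hole_sizes[:5])`, throws away the rest of the list, so that line has to be skipped or removed before working with all 100 values.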

Wondering if there is a better way to do this?

``````
import pandas as pd

df = pd.DataFrame(hole_sizes, columns = ['size'])

def cost_per_hole(size):
    cost = 0
    if size < 20:
        cost = 1.3
    elif size >= 20 and size < 70:  # >= so that a 20 mm hole is not priced at 2.1
        cost = 1.6
    else:
        cost = 2.1
    return cost
``````

How can I substitute extensive if-else statements?

1 Like

I solved them all just for the sake of it, but yeah I immediately saw that loophole as well. Anyone who is practiced in taking multiple choice tests will immediately look for shortcuts like that.

1 Like

In terms of ordering the ifs, I would switch your second and third to limit the amount of typing necessary.

If you make the second statement conditional on >= 70, the double conditioned middle statement just becomes your final "else" instead of having to write two conditions.
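Roughly, the reordered version of the `cost_per_hole` function posted above could look like this (same tiers and prices, just a sketch):

```python
def cost_per_hole(size):
    """Price for one hole, checking the two outer tiers first."""
    if size < 20:
        return 1.3
    elif size >= 70:
        return 2.1
    else:
        # Everything left over satisfies 20 <= size < 70.
        return 1.6


print([cost_per_hole(s) for s in (19, 20, 69, 70)])
```

Because the middle tier is whatever remains, there is no second comparison to get wrong at the boundaries.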

More broadly, since you did this with pandas dataframes instead of arrays, I might have combined it with numpy and written the following code:

df['cost_per_hole'] = np.where(df['size'] < 20, 1.3, np.where(df['size'] >= 70, 2.1, 1.6))

^ using np.where is my preferred way to set column values based on conditional statements about other columns in dataframes. It functions a bit like a nested IF function in Excel, if that is familiar to you, but basically the syntax is
np.where([Condition], [Result if True], [Result if False])

You can nest multiple np.where statements by simply entering another np.where into the [Result if False] space
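Here is a small self-contained version of that nesting, with made-up sizes rather than the challenge data. It also shows `np.select`, which is a NumPy alternative (not something used earlier in this thread) that may read more easily once there are more than a couple of tiers:

```python
import numpy as np
import pandas as pd

# Made-up sample sizes in mm, not the challenge data.
df = pd.DataFrame({"size": [5, 20, 69, 70, 100]})

# Nested np.where: the inner call fills the "result if False" slot of the outer one.
df["cost_per_hole"] = np.where(
    df["size"] < 20, 1.3,
    np.where(df["size"] >= 70, 2.1, 1.6),
)

# Same result with np.select: the first matching condition wins,
# and `default` covers every row that matches none of them.
with_select = np.select(
    [df["size"] < 20, df["size"] >= 70],
    [1.3, 2.1],
    default=1.6,
)

print(df)
print(with_select)
```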

3 Likes