Challenge 6 Megathread

For any and all questions relating to challenge 6. :point_down:

For a tutorial on how to use Jupyter Notebook, we put together this video:

Still have questions? Read all the FAQs here.

Don’t need to do the work to answer Q2 & Q3 as only one response has the correct answer to Q1. :confused: Have thus far been able to do all the challenges with exceedingly little Python.

16 Likes

I will admit, that part was funny :sweat_smile:

To me, else is a great tool to be saved for when it’s needed.

import pandas as pd
hole_series=pd.Series(hole_sizes)
total_cost = 0.0
for i in range(len(hole_sizes)):
    total_cost=total_cost+1.30
    if hole_sizes[i]>=20:
        total_cost=total_cost+0.30
    if hole_sizes[i]>=70:
        total_cost=total_cost+0.50
print("Mean size:", hole_series.mean(), "mm, averaging $", round(total_cost/len(hole_sizes), 3), ", total $", round(total_cost, 2))
print("From max", hole_series.max(), "mm down to ", hole_series.min(), "mm.")

1 Like

total_cost=total_cost+1.30 can be rewritten as total_cost += 1.30 for even more succinctness!

6 Likes

This challenge was alright!
But for something to introduce pandas we didn’t really use any. Unless there’s a nice map function I don’t know about, I just used a for loop for Q2 & Q3
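
For what it's worth, pandas does have something close to a map function: `Series.apply` (and `Series.map`) call a plain Python function on every element. Here is a minimal sketch, assuming the price bands used elsewhere in this thread ($1.30 under 20 mm, $1.60 from 20 up to 70 mm, $2.10 at 70 mm and above). The five-item `sample_sizes` list and the `price_for` helper are made up for illustration and are not part of the challenge.

```python
import pandas as pd

# Hypothetical sample; the challenge's real hole_sizes list has 100 entries.
sample_sizes = [5, 19, 20, 69, 70]

def price_for(size):
    """Return the repair price for one hole, using the bands quoted in this thread."""
    if size < 20:
        return 1.30
    if size < 70:
        return 1.60
    return 2.10

sizes = pd.Series(sample_sizes)
prices = sizes.apply(price_for)  # calls price_for once per element

print("Prices:", prices.tolist())
print("Total:", round(prices.sum(), 2), "Mean:", round(prices.mean(), 3))
```

This still runs an if-chain per element under the hood, so it is a tidier loop rather than a faster one.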

3 Likes

The mean isn’t what it actually is; they made a mistake. In case anyone here is confused, just do it on the website and you’ll get it.

yeah … there is no near to 20ish value for Q1

Well, the line with the five listed hole sizes threw me. At first I worked the whole thing using just those five (couldn’t figure out why that line was there). Good news is, once I figured out I needed all 100, since the code was already written, it solved itself pretty fast! And this is why we code! Whether 5 numbers or 100, the code runs exactly the same!
Now can anyone tell me why this line was included in the intro?

hole_sizes = (hole_sizes[:5])
1 Like

As per the hint not sure if they wanted us to use pandas or not. I ended up using it…saves so many calculations

import random
import pandas as pd
random.seed(34)

hole_sizes = [random.randint(1, i) for i in range(1, 101)]
random.shuffle(hole_sizes)

series = pd.Series(hole_sizes)
print(f"Average sized hole {series.mean()}")
hole_prices = []
for i in range(len(hole_sizes)):
    if hole_sizes[i] < 20:
        hole_prices.append(1.30)
    elif hole_sizes[i] >= 20 and hole_sizes[i] < 70:
        hole_prices.append(1.6)
    else:
        hole_prices.append(2.10)
price_series = pd.Series(hole_prices)
print(f"Average cost to fix a hole is {price_series.mean()}")
print(f"Total cost to fix all holes is {sum(hole_prices)}")
print(f"Max size hole is {series.max()}")
print(f"Min size hole is {series.min()}")

8 Likes

you can, but it’s more for yourself to practice the lesson! if you wanted

1 Like

I think they were just showing a sample of the data in the dataset. They just chose to show a sample rather than output a massive list of 100 items. I suppose it’s also a reminder of a previous challenge to refresh our memory of list syntax.

Should really avoid for-loops when you’re doing data science with Python or R. In fact, there’s no need to use any for-loops or if-statements for this challenge at all if you know how to utilize pandas.
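
One possible loop-free version, as a sketch only: `pd.cut` can bin the sizes and attach a price label to each bin. This assumes the price bands quoted in other posts here ($1.30 under 20 mm, $1.60 from 20 up to 70 mm, $2.10 at 70 mm and above), and `sample_sizes` is a made-up stand-in for the challenge's `hole_sizes`.

```python
import pandas as pd

# Hypothetical sample; swap in the real hole_sizes list from the challenge.
sample_sizes = [5, 19, 20, 69, 70, 100]
sizes = pd.Series(sample_sizes)

# right=False makes each bin closed on the left: [0, 20), [20, 70), [70, inf)
bands = pd.cut(
    sizes,
    bins=[0, 20, 70, float("inf")],
    labels=[1.30, 1.60, 2.10],
    right=False,
)
prices = bands.astype(float)  # categorical labels back to plain numbers

print("Mean size:", sizes.mean())
print("Mean price:", round(prices.mean(), 3), "Total price:", round(prices.sum(), 2))
print("Largest:", sizes.max(), "Smallest:", sizes.min())
```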

3 Likes

No, the answer is correct.

That’s it.
When you write [:5], it means that you want to show only the first 5 elements of the given list.
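
A quick sketch of what the slice does, using a made-up list rather than the real data. Note that slicing on its own leaves the original list alone, whereas assigning the slice back to the same name, as in `hole_sizes = (hole_sizes[:5])`, rebinds that name to the five-item copy.

```python
# Hypothetical list, only for illustration.
hole_sizes = [12, 45, 7, 88, 23, 61, 3]

first_five = hole_sizes[:5]   # new list holding elements at index 0..4
print(first_five)
print(len(hole_sizes))        # slicing did not change the original list

# Assigning the slice back to the same name keeps only the first five.
hole_sizes = hole_sizes[:5]
print(len(hole_sizes))
```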

Wondering if there is a better way to do this?

df = pd.DataFrame(hole_sizes, columns = ['size'])

def cost_per_hole(size):
    cost = 0
    if size < 20:
        cost = 1.3
    elif size >= 20 and size < 70:
        cost = 1.6
    else:
        cost = 2.1
    return cost

How can I substitute extensive if-else statements?

1 Like

I solved them all just for the sake of it, but yeah I immediately saw that loophole as well. Anyone who is practiced in taking multiple choice tests will immediately look for shortcuts like that.

1 Like

In terms of ordering the ifs, I would switch your second and third to limit the amount of typing necessary.

If you make the second statement conditional on >= 70, the double conditioned middle statement just becomes your final “else” instead of having to write two conditions.
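
Roughly, the reordering described above could look like this. It is a sketch that reuses the `cost_per_hole` name and prices from the post being replied to; the boundary handling (20 counts as the middle band, 70 as the top band) is an assumption based on the other solutions in this thread.

```python
def cost_per_hole(size):
    """Price for one hole; each branch needs only a single comparison."""
    if size < 20:
        return 1.3
    elif size >= 70:
        return 2.1
    else:
        # Everything from 20 up to (but not including) 70 lands here.
        return 1.6

# Check the edges of each band.
for size in (19, 20, 69, 70):
    print(size, cost_per_hole(size))
```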

More broadly, since you did this with pandas dataframes instead of arrays, I might have combined it with numpy and written the following code:

df['cost_per_hole'] = np.where(df['size'] < 20, 1.3, np.where(df['size'] >= 70, 2.1, 1.6))

^ using np.where is my preferred way to set column values based on conditional statements about other columns in dataframes. It functions a bit like a nested IF function in Excel, if that is familiar to you, but basically the syntax is
np.where([Condition], [Result if True], [Result if False])

You can nest multiple np.where statements by simply entering another np.where into the [Result if False] space
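
If the nesting gets deep, `np.select` is a flatter alternative worth knowing: it takes a list of conditions, a matching list of results, and a default. The sketch below runs both forms on a small made-up dataframe (the five sizes are invented, and `cost_select` is just an illustrative column name) and checks that they agree.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the challenge's hole sizes.
df = pd.DataFrame({"size": [5, 19, 20, 69, 70]})

# Nested np.where: the inner call fills the "result if False" slot.
df["cost_per_hole"] = np.where(
    df["size"] < 20, 1.3, np.where(df["size"] >= 70, 2.1, 1.6)
)

# np.select: conditions are checked in order, the first match wins,
# and rows matching none of them get the default.
conditions = [df["size"] < 20, df["size"] >= 70]
choices = [1.3, 2.1]
df["cost_select"] = np.select(conditions, choices, default=1.6)

print(df)
print("Same result:", (df["cost_per_hole"] == df["cost_select"]).all())
```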

3 Likes