Challenge 18 Megathread

For any and all questions relating to Challenge 18 :point_down: post away!

not too shabby this one! :slight_smile:

I chose the right answer, but it shows as wrong :frowning:

Hello everyone, welcome to challenge 18.

Feel free to ask your questions here and discuss amongst yourselves, without sharing answers.

Thanks.

@Kelvine95 Can we share our code in this forum so that others can comment on why the code is not working?

No, you can’t share your code.

@YGW saw your comment in the other forum. Issue with sharing your code is that you might give others your solution which might just need a little fine-tuning. If there is anything you are not sure of about your code, then kindly ask with a small code snippet. But please, you can’t share your entire code.

Hope this is fine?

Sounds good @Kelvine95 . And if we need help with more than just a small snippet, we can DM you or another moderator?

For this question: 1. Which columns have the highest correlation?
Do you mean the relationship between the data being the strongest (largest number regardless of positive or negative) or do you mean having the highest positive correlation?

Yes, you can always DM any of the moderators. We are more than happy to assist you.

The magnitude of a correlation usually refers to its absolute value, irrespective of the sign.
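
To make that concrete, here is a minimal sketch with made-up numbers (the columns `a`, `b` and `c` are hypothetical and are not the challenge data). It only illustrates that a coefficient near -1 counts as a stronger relationship than a moderate positive one once you compare absolute values.

```python
import pandas as pd

# Made-up data for illustration only; not the challenge dataset.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [10, 8, 6, 4, 2],  # falls exactly as "a" rises
    "c": [2, 1, 4, 3, 5],   # loosely rises with "a"
})

r_ab = df["a"].corr(df["b"])  # Pearson correlation between two Series
r_ac = df["a"].corr(df["c"])

# Compare strength by magnitude, ignoring the sign.
stronger = "a-b" if abs(r_ab) > abs(r_ac) else "a-c"
print(r_ab, r_ac, stronger)
```

Here the negative pair wins, because its absolute value is larger.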

This is great because half of our team is doing this on GMT+8 time, and I did a previous challenge on the Wi-Fi of a restaurant at Chatuchak market one day after surfing on a cruise boat on the Chao Phraya river. We live literally a walk away from Wat Arun, by that river.

Over 90% of Thailand’s economy depends on foreign tourism, mainly from the West, China and, to some extent, Russia (interestingly enough, most of the Russian tourists now cannot leave Phuket because sanctions have limited their air travel options).

Hello there,
I wasn’t sure what counts as a strong or weak correlation coefficient, so I looked it up and found this article, “Interpreting Correlation Coefficients”. It may help somebody else. :slight_smile:

I am not sure how to approach the last question. I looked at the hint but can’t understand how finding the max value in the data frame will help with finding the highest correlation. For a correlation, don’t you need two features?

I googled how to find the correlation between all columns in the data frame. You need to use the corr() function too.
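
For anyone wondering how a "max value" fits in: the maximum is taken over the correlation table, not over the raw data. Below is a rough sketch on made-up data (the column names `visitors`, `revenue` and `rainfall` are hypothetical, not from the challenge file). It is one possible approach, not the official solution: build the table with corr(), drop each column's correlation with itself, and look for the largest absolute value that remains.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only; column names are hypothetical.
df = pd.DataFrame({
    "visitors": [10, 20, 30, 40, 50],
    "revenue": [12, 19, 33, 41, 48],
    "rainfall": [5, 3, 6, 2, 4],
})

corr = df.corr()  # pairwise Pearson correlation of all columns

# Keep only the cells above the diagonal: this drops the 1.0
# self-correlations and lists each pair of columns once.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.abs().where(mask).stack()

# Index label of the largest absolute correlation: (row, column).
row, col = pairs.idxmax()
print(row, col, corr.loc[row, col])
```

Using abs() here follows the "magnitude regardless of sign" reading; drop it if the question means the highest positive value.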

Further to @DianaMCR’s response, see this tutorial on how to use corr().

Watch Out For Numbers With Commas
This is great! Much better than what I was using haha.

I want to point out, though, that the .corr() method does not find the correlation coefficient between two series if one of them is a series of numbers with commas (e.g. 1,200). The numbers in that series have to be converted “manually” from strings into floats or ints, or whatever type the .corr() method can handle.

E.g. if you use df.corr(), the resulting table of coefficients will not show data for the "income per tourist" column of the csv file.
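
A minimal sketch of that conversion, on made-up data (the `income_per_tourist` column below is a hypothetical stand-in for a column that was read in as text). Depending on your pandas version, df.corr() either silently skips a text column or raises an error, so converting first avoids both.

```python
import pandas as pd

# Made-up data for illustration only; the second column is stored
# as text because of the thousands separators.
df = pd.DataFrame({
    "tourists": [100, 200, 300, 400],
    "income_per_tourist": ["1,200", "1,500", "1,900", "2,400"],
})

# Strip the commas, then convert the strings to numbers.
df["income_per_tourist"] = pd.to_numeric(
    df["income_per_tourist"].str.replace(",", "", regex=False)
)

print(df.corr())  # both columns now appear in the table
```

If the data comes straight from a csv file, passing thousands="," to pd.read_csv may do the same job at load time.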

When I run my code, all I can see are words instead of a plot:
image

My code follows a similar format to what is mentioned in the tutorial:

import pandas as pd
import matplotlib.pyplot as plt

df = ...
plt.figure()
plt.scatter(x = df['numerical_data'], y = df['numerical_data_2'])
plt.show()

How can I make the graph itself show up @Kelvine95 or anyone else who knows?
Thanks.

I had this problem, but I just ran the code again and the plot showed up.
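
If re-running does not help, one way to check whether the plotting code itself is fine is to write the figure to a file instead of showing it. This is only a sketch with made-up data: the file name `scatter_check.png` and the values are assumptions, and the Agg backend is chosen here just because it needs no display. In a Jupyter notebook, running `%matplotlib inline` in a cell first may also be worth a try.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, draws without a window
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "numerical_data": [1, 2, 3, 4],
    "numerical_data_2": [2, 4, 5, 8],
})

fig, ax = plt.subplots()
ax.scatter(df["numerical_data"], df["numerical_data_2"])
ax.set_xlabel("numerical_data")
ax.set_ylabel("numerical_data_2")

# Hypothetical output file name; open it to see the plot.
fig.savefig("scatter_check.png")
```

If the saved image looks right, the problem is probably with how the environment displays figures rather than with the scatter call.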