-
Notifications
You must be signed in to change notification settings - Fork 5
Review notebook 8 "Combine DataFrames" #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: review
Are you sure you want to change the base?
Conversation
Check out this pull request on You'll be able to see Jupyter notebook diff and discuss changes. Powered by ReviewNB. |
@@ -0,0 +1,1728 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest moving the read_csv for air_quality_stations
and air_quality_parameters
further down the notebook until right before they are used. Currently, it's hard for me to follow the lesson because there's a lot of objects you show me right at the top, and not all of them are relevant yet, so I get distracted by what I need to pay attention to.
Reply via ReviewNB
@@ -0,0 +1,1728 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest splitting this into 3 separate print statements (on 3 lines in the same cell). Currently, it's a bit hard for me to visually link each object name to its shape.
Reply via ReviewNB
@@ -0,0 +1,1728 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is out of order here. Just a few cells above, air_quality
had 7 columns, and now it has 12, even though nothing has been done to it in-between.
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think this is due to an error in the execution order of the cells, should be ok after rerun.
@@ -0,0 +1,1728 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I had a bit of a hard time following this lesson because there are so many columns, and so many DataFrames with similar names, plus some DataFrames get overwritten. It's information overload, and I don't know where to focus my attention.
In order for a beginner to follow, I would recommend removing most (but not all) of the irrelevant columns (perhaps during the file reading process), just so they can focus their attention on the relevant columns.
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'm going to clean up the example data sets, in order to limit the number of columns the user gets confronted with.
Following the general remark by @justmarkham about introducing the concept of axis of operations, I propose to add this when talking about concatenation. |
No description provided.