Skip to content

BUG: Fix json_normalize throwing TypeError when record_path has a sequence of dicts #22706 #22804

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Dec 13, 2018

Conversation

vuminhle
Copy link
Contributor

@pep8speaks
Copy link

pep8speaks commented Sep 22, 2018

Hello @vuminhle! Thanks for updating the PR.

Comment last updated on September 26, 2018 at 04:20 Hours UTC

@codecov
Copy link

codecov bot commented Sep 22, 2018

Codecov Report

Merging #22804 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22804      +/-   ##
==========================================
+ Coverage   92.21%   92.21%   +<.01%     
==========================================
  Files         162      162              
  Lines       51763    51770       +7     
==========================================
+ Hits        47733    47741       +8     
+ Misses       4030     4029       -1
Flag Coverage Δ
#multiple 90.62% <100%> (ø) ⬆️
#single 42.99% <0%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/io/json/normalize.py 96.93% <100%> (+0.06%) ⬆️
pandas/core/generic.py 96.65% <0%> (ø) ⬆️
pandas/core/groupby/generic.py 87.12% <0%> (+0.04%) ⬆️
pandas/core/groupby/groupby.py 96.67% <0%> (+0.16%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 91802fb...db24124. Read the comment docs.

@WillAyd WillAyd added the IO JSON read_json, to_json, json_normalize label Sep 22, 2018
@vuminhle vuminhle force-pushed the fix-json-normalize-typeerror branch from 9213f9f to da84997 Compare September 26, 2018 04:20
@vuminhle
Copy link
Contributor Author

Any other feedback @WillAyd @jreback ?

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes lgtm. Some minor nits around the docs

@vuminhle
Copy link
Contributor Author

I think this PR also fixes #21605

@WillAyd
Copy link
Member

WillAyd commented Nov 24, 2018

Can you merge master?

@vuminhle
Copy link
Contributor Author

Can you merge master?

Done.

@vuminhle vuminhle closed this Nov 26, 2018
@vuminhle vuminhle reopened this Nov 26, 2018
@WillAyd
Copy link
Member

WillAyd commented Dec 11, 2018

Is there a reason for both _extract and _recursive_extract? Pretty confusing IMO would be better served as one consolidated function I think

@vuminhle
Copy link
Contributor Author

vuminhle commented Dec 12, 2018

Is there a reason for both _extract and _recursive_extract? Pretty confusing IMO would be better served as one consolidated function I think

The new _extract performs the actual record extraction. I can consolidate the two but we need to convert data to list in case it is not. Then we can iterate on data as in the original code. It's the same hack as line 198. _recursive_extract will look roughly like this:

if len(path) > 1:
    ...
else:
    if not isinstance(data, list):
        data = [data]
    for obj in data:
        # perform actual extraction

Since we only have 1 function now, I think we can revert the change of returning tuples. IMO it looks ugly. What do you think?

@WillAyd
Copy link
Member

WillAyd commented Dec 12, 2018

I can consolidate the two but we need to convert data to list in case it is not

I'm pretty sure this is done quite a few other places so not out of the ordinary, though feel free to double check.

Since we only have 1 function now, I think we can revert the change of returning tuples. IMO it looks ugly. What do you think?

Not sure I fully understand what you have in mind but yea I don't think as is this is very readable. Any refactor / simplification here which does away with similar but separate functions and does not require mutation of an argument would be a great improvement

@vuminhle
Copy link
Contributor Author

I can consolidate the two but we need to convert data to list in case it is not

I'm pretty sure this is done quite a few other places so not out of the ordinary, though feel free to double check.

Since we only have 1 function now, I think we can revert the change of returning tuples. IMO it looks ugly. What do you think?

Not sure I fully understand what you have in mind but yea I don't think as is this is very readable. Any refactor / simplification here which does away with similar but separate functions and does not require mutation of an argument would be a great improvement

Can you take another look at the changes? Since _recursive_extract assumes that data is always a list, I added 2 lines to convert data to list if it is not.

@jreback
Copy link
Contributor

jreback commented Dec 13, 2018

this also fixed #21605? can you add a test and add to the whatsnew

@jreback jreback added the Bug label Dec 13, 2018
@jreback jreback added this to the 0.24.0 milestone Dec 13, 2018
@vuminhle
Copy link
Contributor Author

vuminhle commented Dec 13, 2018

this also fixed #21605? can you add a test and add to the whatsnew

Looks like #21605 is a duplicate (of #22804 ). We can just close #21605.

@jreback
Copy link
Contributor

jreback commented Dec 13, 2018

hmm, though that is a list of dicts?

@vuminhle
Copy link
Contributor Author

hmm, though that is a list of dicts?

list of dicts should work already.

@jreback
Copy link
Contributor

jreback commented Dec 13, 2018

ok this lgtm then. @WillAyd over to you.

@WillAyd WillAyd merged commit 31a3512 into pandas-dev:master Dec 13, 2018
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO JSON read_json, to_json, json_normalize
Projects
None yet
Development

Successfully merging this pull request may close these issues.

json_normalize raises TypeError exception
4 participants