Welcome toVigges Developer Community-Open, Learning,Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
183 views
in Technique[技术] by (71.8m points)

python - How to turn nested json or list inside dataframe?

I have set of repsonse from Elasticsearch using aggregation query , and the response is like

'aggregations': {'group': {'doc_count_error_upper_bound': 0,
   'sum_other_doc_count': 0,
   'buckets': [{'key': 1365,
     'doc_count': 518,
     'group_docs': {'hits': {'total': {'value': 518, 'relation': 'eq'},
       'max_score': None,
       'hits': [{'_index': 'mdata',
         '_type': 'ter',
         '_id': 'n1X04XYBlaUrIoJskq9q',
         '_score': None,
         '_source': {'hId': 1365,
          'Id': 5348,
          'type': 'data'},
         'sort': [1610108665027]}]}}},
    {'key': 1372,
     'doc_count': 517,
     'group_docs': {'hits': {'total': {'value': 517, 'relation': 'eq'},
       'max_score': None,
       'hits': [{'_index': 'mdata',
         '_type': 'ter',
         '_id': 'qFUw4nYBlaUrIoJs6rdz',
         '_score': None,
         '_source': {'hId': 1372,
          'Id': 5348,
          'type': 'data'},
         'sort': [1610112617581]}]}}},
    {'key': 1392,
     'doc_count': 491,
     'group_docs': {'hits': {'total': {'value': 491, 'relation': 'eq'},
       'max_score': None,
       'hits': [{'_index': 'mdata',
         '_type': 'ter',
         '_id': '8VXR4XYBlaUrIoJsYKrS',
         '_score': None,
         '_source': {'hId': 1392,
          'Id': 5348,
          'type': 'data'},
         'sort': [1610106358393]}]}}}]},
  'bucketcount': {'count': 3,
   'min': 491.0,
   'max': 518.0,
   'avg': 508.6666666666667,
   'sum': 1526.0}}}

so i try to get the dataframe using

df= pd.json_normalize(result['aggregations']['group']['buckets'])

key doc_count   group_docs.hits.total.value group_docs.hits.total.relation  group_docs.hits.max_score   group_docs.hits.hits
0   1365    518 518 eq  None    [{'_index': 'mdata', '_type': 'ter', '_...
1   1372    517 517 eq  None    [{'_index': 'mdata', '_type': 'ter', '_...
2   1392    491 491 eq  None    [{'_index': 'mdata', '_type': 'ter', '_...

i have apply method in here enter link description here

using forreal = pd.DataFrame(result.get('group_docs.hits.hits')) didnt work for me with empty return

and

works_data = pd.json_normalize(df,record_path ='group_docs.hits.hits') returning Error "TypeError: string indices must be integers"

a Slow method i have try out is using

df= pd.json_normalize(result['aggregations']['group']['buckets'])
df_1 = (df.hits[0]['hits'])

and then append the Dataframe , however it;s slow for me as i have lot of DF to concat or append, i would like if there;s much better method to do ?


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You don't specify what you are trying to achieve. The following will fully expand the sample JSON in your question

pd.json_normalize(
pd.json_normalize(results['aggregations']['group']['buckets']).explode("group_docs.hits.hits")
    .to_dict(orient="records")
).explode("group_docs.hits.hits.sort")

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to Vigges Developer Community for programmer and developer-Open, Learning and Share
...