For loop taking too long to run

I have a column named data2 in a DataFrame df1 that contains a list of nested dictionaries in each row. There are 50983 rows in total. Inside each data2 list there is an id that I need to copy into all the other dictionaries of that list, because the id is unique. After that I will have to group by id once the data is normalised into a DataFrame of key-value pairs. The code below works, but it takes too long to run (10 minutes). Here is the code that I used:

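import pandas as pd

dff = pd.DataFrame()

for i in range(len(df1['data2'])):
    l = df1['data2'][i]
    # the dictionary in this row's list that carries the id
    id_dict = [d for d in l if 'id' in d][0]
    # copy the id into every other dictionary of the row
    merge = [{**d, **id_dict} for d in l if 'id' not in d]
    # DataFrame.append returns a new frame on every call (removed in pandas 2.0)
    dff = dff.append(merge)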

Here is the input file: input data

My desired output should look like this: desired output
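
A likely cause of the slowdown is that the loop grows dff one row group at a time, so the accumulated frame is copied on every iteration. A minimal sketch of one alternative, assuming every list in data2 holds at least one dictionary with an 'id' key (as the original loop also assumes): collect the merged dictionaries in a plain Python list and build the DataFrame once at the end. The sample df1 and the helper name flatten_rows are hypothetical, since the real input data is not shown here.

```python
import pandas as pd

# Hypothetical sample standing in for df1; the real input is not shown in the post.
df1 = pd.DataFrame({
    "data2": [
        [{"id": 1}, {"key": "a", "value": 10}, {"key": "b", "value": 20}],
        [{"id": 2}, {"key": "a", "value": 30}],
    ]
})


def flatten_rows(cells):
    """Merge each list's id dictionary into the other dictionaries of that list."""
    rows = []
    for cell in cells:
        # First dictionary that carries the id (assumed to exist in every list).
        id_dict = next(d for d in cell if "id" in d)
        # Same merge as the original loop, accumulated in a plain list.
        rows.extend({**d, **id_dict} for d in cell if "id" not in d)
    return rows


# Build the DataFrame once instead of appending inside the loop.
dff = pd.DataFrame(flatten_rows(df1["data2"]))
```

Appending to a list is cheap, and the single pd.DataFrame(...) call at the end does the conversion in one pass. Whether this alone brings the 10 minutes down far enough depends on the real data, so it is worth timing on a slice of df1 first.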

Archive from: https://stackoverflow.com/questions/59050903/for-loop-taking-too-long-to-run
