DataFrame has a list in one cell of each row: how to expand it so each item gets its own row
I have a dataframe where some cells contain lists of multiple values. Rather than storing multiple values in a cell, I'd like to expand the dataframe so that each item in the list gets its own row (with the same values in all other columns). So if I have:
import pandas as pd
import numpy as np
df = pd.DataFrame(
{'trial_num': [1, 2, 3, 1, 2, 3],
'subject': [1, 1, 1, 2, 2, 2],
'samples': [list(np.random.randn(3).round(2)) for i in range(6)]
}
)
df
Out[10]:
                 samples  subject  trial_num
0    [0.57, -0.83, 1.44]        1          1
1    [-0.01, 1.13, 0.36]        1          2
2   [1.18, -1.46, -0.94]        1          3
3  [-0.08, -4.22, -2.05]        2          1
4     [0.72, 0.79, 0.53]        2          2
5    [0.4, -0.32, -0.13]        2          3
How do I convert to long form, e.g.:
   subject  trial_num  sample  sample_num
0        1          1    0.57           0
1        1          1   -0.83           1
2        1          1    1.44           2
3        1          2   -0.01           0
4        1          2    1.13           1
5        1          2    0.36           2
6        1          3    1.18           0
# etc.
The index is not important, it's OK to set existing columns as the index and the final ordering isn't important.
python
pandas
list
edited Jul 23 '19 at 14:37
cs95
asked Dec 3 '14 at 4:44
Marius
From pandas 0.25 you can also use df.explode('samples') to solve this. explode can only support exploding one column for now. – cs95 Aug 5 '19 at 20:50
10 Answers
68
UPDATE: the solution below was helpful for older pandas versions, where DataFrame.explode() wasn’t available. Starting from pandas 0.25.0 you can simply use DataFrame.explode().
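For illustration, here is a minimal sketch of how DataFrame.explode() could produce the long form asked for in the question. It assumes pandas 0.25 or later and uses a small fixed dataframe instead of the random one so the result is reproducible. The sample_num column is built with groupby(level=0).cumcount(), which is one possible way to number the items and relies on explode() repeating the original row index for each list element; the name long_df is just a placeholder.

```python
import pandas as pd

# Small fixed example (assumption: fixed values instead of np.random.randn)
df = pd.DataFrame({
    "trial_num": [1, 2, 3],
    "subject": [1, 1, 1],
    "samples": [[0.57, -0.83, 1.44],
                [-0.01, 1.13, 0.36],
                [1.18, -1.46, -0.94]],
})

# Each list element gets its own row; the other columns are repeated
# and the original row index is duplicated for every element.
long_df = df.explode("samples").rename(columns={"samples": "sample"})

# Number the items within each original row (0, 1, 2, ...) by counting
# within each repeated index value.
long_df["sample_num"] = long_df.groupby(level=0).cumcount()

# explode() leaves the column with object dtype, so convert it back to float
long_df["sample"] = long_df["sample"].astype(float)

# Replace the repeated index with a fresh 0..n-1 index
long_df = long_df.reset_index(drop=True)

print(long_df)
```

If the position of each item within its list is not needed, the groupby line can be dropped and df.explode("samples") alone is enough.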