Welcome toVigges Developer Community-Open, Learning,Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.0k views
in Technique[技术] by (71.8m points)

pandas - computing the mean for python datetime

I have a datetime attribute:

d = {
    'DOB': pd.Series([
        datetime.datetime(2014, 7, 9),
        datetime.datetime(2014, 7, 15),
        np.datetime64('NaT')
    ], index=['a', 'b', 'c'])
}
df_test = pd.DataFrame(d)

I would like to compute the mean for that attribute. Running mean() causes an error:

TypeError: reduction operation 'mean' not allowed for this dtype

I also tried the solution proposed elsewhere. It doesn't work as running the function proposed there causes

OverflowError: Python int too large to convert to C long

What would you propose? The result for the above dataframe should be equivalent to

datetime.datetime(2014, 7, 12).
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

You can take the mean of Timedelta. So find the minimum value and subtract it from the series to get a series of Timedelta. Then take the mean and add it back to the minimum.

dob = df_test.DOB
m = dob.min()
(m + (dob - m).mean()).to_pydatetime()

datetime.datetime(2014, 7, 12, 0, 0)

One-line

df_test.DOB.pipe(lambda d: (lambda m: m + (d - m).mean())(d.min())).to_pydatetime()

To @ALollz point

I use the epoch pd.Timestamp(0) instead of min

df_test.DOB.pipe(lambda d: (lambda m: m + (d - m).mean())(pd.Timestamp(0))).to_pydatetime()

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to Vigges Developer Community for programmer and developer-Open, Learning and Share
...