Python数据处理

Zhangwenniu 于 2022-10-02 发布

时间序列

时间戳

时间戳的转换,需要用到%y,%d等字符串转换的strformat,参考链接:strftime

在pd.to_datetime()中,如果出现时间戳的转换失败,可能是因为文本里面本身带有时区信息。如果时间文本中带有时区信息,需要设定utc=True。参见pandas.to_datetime

The following causes are responsible for datetime.datetime objects being returned (possibly inside an Index or a Series with object dtype) instead of a proper pandas designated type (Timestamp, DatetimeIndex or Series with datetime64 dtype):

  • when any input element is before Timestamp.min or after Timestamp.max, see timestamp limitations.
  • when utc=False (default) and the input is an array-like or Series containing mixed naive/aware datetime, or aware with mixed time offsets. Note that this happens in the (quite frequent) situation when the timezone has a daylight savings policy. In that case you may wish to use utc=True.

空值

空值检测

pd.isnan()
pd.isnull()

二者是同一个东西,参考pd.isnull()的函数说明。

DataFrame.isnull()
  DataFrame.isnull is an alias for DataFrame.isna.
  Detect missing values.