Python: Data Analytics and Visualization

Time series plotting

Pandas comes with great support for plotting, and this holds true for time series data as well.

As a first example, let's take some monthly data and plot it:

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> rng = pd.date_range(start='2000', periods=120, freq='MS')
>>> ts = pd.Series(np.random.randint(-10, 10, size=len(rng)), rng).cumsum()
>>> ts.head()
2000-01-01    -4
2000-02-01    -6
2000-03-01   -16
2000-04-01   -26
2000-05-01   -24
Freq: MS, dtype: int64

Since matplotlib is used under the hood, we can pass familiar parameters to plot, such as c for the color or title for the chart title:

>>> ts.plot(c='k', title='Example time series')
>>> plt.show()

The following figure shows an example time series plot:
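To illustrate how keyword arguments reach matplotlib, here is a minimal sketch written as a standalone script rather than an interactive session. The fixed seed and the Agg backend are assumptions made here so that the script is reproducible and runs without a display; neither comes from the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same construction as above, but with a fixed seed (an assumption for reproducibility).
rng = pd.date_range(start="2000", periods=120, freq="MS")
values = np.random.default_rng(0).integers(-10, 10, size=len(rng))
ts = pd.Series(values, index=rng).cumsum()

# color and ls are matplotlib line properties; title is applied by pandas.
ax = ts.plot(color="k", ls="--", title="Example time series")

# The returned Axes exposes what was drawn.
line = ax.get_lines()[0]
print(ax.get_title())        # title set through the plot call
print(line.get_linestyle())  # dash style forwarded to matplotlib
plt.close("all")
```

Inspecting the returned object this way is a quick check that an argument was actually applied to the line and not silently ignored.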

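We can overlay aggregate plots of the 2-year and 5-year means. In recent pandas versions, resample returns a Resampler object, so an aggregation such as mean() has to be applied before plotting (newer releases also deprecate the 'A' alias in favor of 'YE'):

>>> ts.resample('2A').mean().plot(c='0.75', ls='--')
>>> ts.resample('5A').mean().plot(c='0.25', ls='-.')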

The following figure shows the resampled 2-year plot:

The following figure shows the resampled 5-year plot:
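The overlay of the monthly series with its 2-year and 5-year means can be sketched as a standalone script. This is one possible arrangement, not the book's exact figure: it assumes the year-start alias 'YS' instead of 'A', draws the lines through matplotlib's Axes.plot so that all three share one plain date axis, and uses a fixed seed and the Agg backend for reproducibility.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = pd.date_range(start="2000", periods=120, freq="MS")
values = np.random.default_rng(0).integers(-10, 10, size=len(rng))
ts = pd.Series(values, index=rng).cumsum()

# resample() only groups the data; mean() reduces each group to one value.
# 'YS' labels each bin by the first day of its first year (assumed alias).
two_year = ts.resample("2YS").mean()
five_year = ts.resample("5YS").mean()

fig, ax = plt.subplots()
ax.plot(ts.index, ts.values, color="k", label="monthly")
ax.plot(two_year.index, two_year.values, color="0.75", ls="--", label="2-year mean")
ax.plot(five_year.index, five_year.values, color="0.25", ls="-.", label="5-year mean")
ax.set_title("Example time series with resampled means")
ax.legend()
# Call plt.show() or fig.savefig(...) here to view the result.
plt.close(fig)
```

Ten years of monthly data give five 2-year bins and two 5-year bins, so the aggregate lines are much coarser than the original series.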

We can pass the kind of chart to the plot method as well. The return value of the plot method is a matplotlib Axes object (displayed as AxesSubplot in older matplotlib versions), which allows us to customize many aspects of the plot. Here we set the tick labels on the x-axis to the year values from our time series:

>>> plt.clf()
>>> tsx = ts.resample('1A').mean()
>>> ax = tsx.plot(kind='bar', color='k')
>>> ax.set_xticklabels(tsx.index.year)
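The bar chart with year labels can be sketched as a self-contained script as follows. As assumptions, it uses the 'YS' alias for yearly bins, converts the years to strings before passing them as labels, and relies on a fixed seed and the Agg backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = pd.date_range(start="2000", periods=120, freq="MS")
values = np.random.default_rng(0).integers(-10, 10, size=len(rng))
ts = pd.Series(values, index=rng).cumsum()

# One mean value per calendar year.
annual = ts.resample("YS").mean()

fig, ax = plt.subplots()
annual.plot(kind="bar", color="k", ax=ax)

# By default the bars are labeled with full timestamps; replace them with the year.
year_labels = [str(year) for year in annual.index.year]
ax.set_xticklabels(year_labels)
plt.close(fig)
```

Without the final relabeling step, each bar would carry a full timestamp, which is usually too long for a bar chart.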

Let's imagine we have four time series that we would like to plot simultaneously. We generate a 1000 × 4 matrix of random values and treat each column as a separate time series:

>>> plt.clf()
>>> ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
>>> df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index, columns=['A', 'B', 'C', 'D'])
>>> df = df.cumsum()
>>> df.plot(color=['k', '0.75', '0.5', '0.25'], ls='--')
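As a variation on the call above, the four series can also be drawn in separate panels instead of on one shared Axes. The following is a minimal sketch using the subplots option of DataFrame.plot; the fixed seed and the Agg backend are assumptions made here for reproducibility.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

index = pd.date_range("1/1/2000", periods=1000)
data = np.random.default_rng(0).standard_normal((1000, 4))
df = pd.DataFrame(data, index=index, columns=["A", "B", "C", "D"]).cumsum()

# subplots=True gives every column its own Axes; the colors are assigned per column.
axes = df.plot(subplots=True, color=["k", "0.75", "0.5", "0.25"])

print(len(axes))  # one Axes per column
plt.close("all")
```

Separate panels are easier to read when the series drift far apart, while a single Axes makes direct comparison simpler.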