Description
Code Sample, a copy-pastable example if possible
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
#%matplotlib inline
#comment out any day to show the example with non-contiguous dates
d = {
'2017-10-01': 20,
'2017-10-02': 20,
'2017-10-03': 15,
'2017-10-04': 1,
'2017-10-05': 118,
'2017-10-06': 16,
}
df = pd.DataFrame(list(d.items()), columns=['DATE', 'TMIN'])
df['DATE'] = df['DATE'].astype('datetime64[ns]')
fig, ax = plt.subplots(figsize=(8, 4))
df.plot(x='DATE', y='TMIN', kind='line', ax=ax)
#One major tick per day, labelled like "Oct01" on a second line
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("\n%b%d"))
#Hide minor tick labels (the minor ticks themselves remain)
ax.xaxis.set_minor_formatter(mdates.DateFormatter(""))
ax.grid(True, which='both')
#Show the numeric tick positions used on the x axis
print(ax.get_xticks())
Problem description
Please note: I also reported this to Matplotlib. Please see the comments there, including a workaround that worked for me.
When using any of the Locators, the x-axis date labels are sometimes off by one day. This occurs when the x data is a list of contiguous dates. If the dates are not contiguous, the problem does not occur.
-Using data from a list of contiguous dates:
--problem occurs
--ax.get_xticks() shows that Epoch from Jan 01, 1970 is used
-Using data from a list with non-contiguous dates:
--problem does not occur
--ax.get_xticks() shows that Epoch from Jan 01, 0000 is used
This looks like a problem in how a date is calculated from a number of days when the Jan 01, 1970 Epoch is used. Perhaps some systems count Jan 01, 1970 as day 0 and others count it as day 1?
The output of print(ax.get_xticks()) shows the difference: contiguous data uses Epoch 1970 and non-contiguous data uses Epoch 0000.
Example output from print(ax.get_xticks()) (note the different Epochs used):
Contiguous data: [17440. 17441. 17442. 17443. 17444. 17445.]
Non-contiguous data: [736603. 736604. 736605. 736606. 736607. 736608.]
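The relationship between the two arrays can be checked with the standard library alone. The sketch below is only an illustration: the variable names are made up here, and it assumes the larger numbers follow the proleptic Gregorian ordinal convention (0001-01-01 is day 1), which is how Python's datetime.date.toordinal() counts and, as far as I can tell, how Matplotlib 3.0 numbers dates.

```python
import datetime

# Tick values copied from the two print(ax.get_xticks()) outputs above.
contiguous = [17440, 17441, 17442, 17443, 17444, 17445]
non_contiguous = [736603, 736604, 736605, 736606, 736607, 736608]

# Element-wise difference between the two arrays (a set, so one value
# means the offset is constant).
offsets = {b - a for a, b in zip(contiguous, non_contiguous)}

# Ordinal of 1970-01-01 when 0001-01-01 counts as day 1 (assumed convention).
unix_epoch_ordinal = datetime.date(1970, 1, 1).toordinal()

# The first data point, expressed in both numbering schemes.
first_day = datetime.date(2017, 10, 1)
days_since_1970 = (first_day - datetime.date(1970, 1, 1)).days

print(offsets)                # {719163}
print(unix_epoch_ordinal)     # 719163
print(days_since_1970)        # 17440
print(first_day.toordinal())  # 736603
```

So both arrays describe the same six days, Oct 01 to Oct 06, 2017, and differ only in which day the count starts from.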
Expected Output
The first tick label should read Oct 01, not Sep 30.
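One possible reading of where Sep 30 comes from, offered as a guess rather than a confirmed diagnosis: if the first tick value from the contiguous case (17440) is a count of days since Jan 01, 1970 but the formatter decodes it as a proleptic Gregorian ordinal, the result lands in the year 48, which is a leap year, so the same day of the year falls one calendar day earlier. The names below are hypothetical and only the standard library is used.

```python
import datetime

tick = 17440  # first tick value reported for the contiguous case

# Interpretation 1: days since 1970-01-01.
as_days_since_1970 = datetime.date(1970, 1, 1) + datetime.timedelta(days=tick)

# Interpretation 2 (assumed to be what the date formatter does):
# proleptic Gregorian ordinal, where 0001-01-01 is day 1.
as_ordinal = datetime.date.fromordinal(tick)

print(as_days_since_1970.isoformat())  # 2017-10-01
print(as_ordinal.isoformat())          # 0048-09-30
```

With a "%b%d" format only the month and day are shown, so the second interpretation would appear on the axis as Sep30 without any hint that the year is wrong.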
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Windows
OS-release: 8.1
machine: AMD64
processor: Intel64 Family 6 Model 15 Stepping 11, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.4
pytest: 4.0.2
pip: 18.1
setuptools: 40.6.3
Cython: 0.29.2
numpy: 1.15.4
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: 1.8.2
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 3.0.2
openpyxl: 2.5.12
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None