Commit 59a8b6c

AdamRJensen, cwhanse, wholmgren, and kandersolar authored
Add read_bsrn function (#1145)
* Add bsrn file to read bsrn files
  Related to issue #1015.
* simplified read_bsrn function
  Simplified how the start and end line of the data is determined. Improved documentation, e.g. moved constants outside of function.
* Simplified selection of rows in read_bsrn
* Added read_bsrn to api.rst
* Delete 2021_01_16_read_bsrn_pull_request_v2.py
* Improved format, e.g removed trailing white spaces
* Fixed spacing issues
* Update v0.9.0.rst
* Add iotools.bsrn and import read_bsrn
* Split multiple lines to obey 75 character limit
* Corrected indentation
* Fixed indentation again
* Remove bsrn email in description
  Co-authored-by: Cliff Hansen <[email protected]>
* Correct COL_SPEC variable
  The previous values in the COL_SPEC variables were not all correct, leading to incorrect parsing of the data.
* Changed air_temperature to temp_air
* Add test_bsrn file
  File is not complete, as I'm awaiting permission from BSRN to upload test file
* Reference to FTP updated
* Add zipped bsrn test file
* Update test filename
* Get file month/year from file instead of filename
  Previously the month and year of the file were determined from the filename. This has now been changed such that the month/year is found from within the file's metadata section (second line).
* Fixed formatting/stickler issues
* Fixed formatting/stickler issues
* Fixed formatting/stickler issues
* Fix to test_format_index
* Refactored file opening and utc localization
* Fixed indentation issue
* Fixed hyperlink
* Fixed doc error
  Air temperature was listed as air_temperature in the docstring instead of temp_air.
* Handle file start date explicitly
  Co-authored-by: Will Holmgren <[email protected]>
* Correct pytest fixture magic
  Co-authored-by: Will Holmgren <[email protected]>
* Fix indentation broken by previous commit
* Correct Dataframe to DataFrame in doc string
* Add offset to line num after explicitly handling start date
* Update test_bsrn.py
* Added compression='infer', fixed end line number issue
* Fixed test issue
* Changed timedelta unit from min to minute
* Add files via upload
  All logical records after LR0100 have been removed to reduce space (be below 25 MB), but also to test the functionality of files with few logical records.
* Changed to_timedelta unit from minute' to 'T'
* Updated test to cover unzipped and zipped files
* Removed error causing blank line in test file
* Change to Unix end of line character from file by wholmgren
* Remove extra line at end of file
* Fix typo in bsrn.py doc string
  Co-authored-by: Kevin Anderson <[email protected]>

Co-authored-by: Cliff Hansen <[email protected]>
Co-authored-by: Will Holmgren <[email protected]>
Co-authored-by: Kevin Anderson <[email protected]>
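
Several entries in the commit message concern how the month and year are read from the second line of the file and how the start and end rows of logical record LR0100 are found. The following is a minimal sketch of that idea using only the standard library. The sample text and the helper name `locate_lr0100` are hypothetical and exist only for illustration; the slice positions follow the ones used in `read_bsrn`.

```python
import io

# Synthetic stand-in for a station-to-archive file. The rows are made up
# for illustration and are not real BSRN data.
SAMPLE = "\n".join([
    "*U0001",
    " 21  6 2016  1",        # second line: month in [3:6], year in [7:11]
    "*U0002",
    "metadata placeholder",
    "*U0100",
    "row 1a", "row 1b", "row 2a", "row 2b",
    "*U0300",
    "other record",
]) + "\n"


def locate_lr0100(f):
    """Return (year, month, start_row, nrows) for logical record 0100.

    Hypothetical helper: it mirrors the scanning logic of read_bsrn but
    is not part of pvlib.
    """
    f.readline()  # line 0 is the "*U0001" marker, read and discard
    record_starts = {"0001": 0}
    date_line = f.readline()  # line 1 holds the month and year
    month = int(date_line[3:6])
    year = int(date_line[7:11])
    num = 1  # number of the last line read so far
    for num, line in enumerate(f, start=2):
        if line.startswith("*"):  # start of a logical record
            record_starts[line[2:6]] = num  # key is the 4-digit LR number
    start_row = record_starts["0100"] + 1  # first data row of LR0100
    later = [i for i in record_starts.values() if i > start_row]
    # Stop before the next record, or at the last line if LR0100 is last
    end_row = min(later) - 1 if later else num
    return year, month, start_row, end_row - start_row + 1


print(locate_lr0100(io.StringIO(SAMPLE)))  # prints (2016, 6, 5, 4)
```

The returned `start_row` and `nrows` correspond to the `skiprows` and `nrows` arguments that the function later passes to `pd.read_fwf`.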
1 parent b666520 commit 59a8b6c

7 files changed, with 87074 additions and 0 deletions

docs/sphinx/source/api.rst

Lines changed: 1 addition & 0 deletions
@@ -483,6 +483,7 @@ relevant to solar energy modeling.
    iotools.parse_psm3
    iotools.get_pvgis_tmy
    iotools.read_pvgis_tmy
+   iotools.read_bsrn

 A :py:class:`~pvlib.location.Location` object may be created from metadata
 in some files.

docs/sphinx/source/whatsnew/v0.9.0.rst

Lines changed: 3 additions & 0 deletions
@@ -62,6 +62,8 @@ Deprecations

 Enhancements
 ~~~~~~~~~~~~
+* Add :func:`~pvlib.iotools.read_bsrn` for reading BSRN solar radiation data
+  files. (:pull:`1145`, :issue:`1015`)
 * In :py:class:`~pvlib.modelchain.ModelChain`, attributes which contain
   output of models are now collected into ``ModelChain.results``.
   (:pull:`1076`, :issue:`1067`)
@@ -124,3 +126,4 @@ Contributors
 * Mark Mikofski (:ghuser:`mikofski`)
 * Nate Croft (:ghuser:`ncroft-b4`)
 * Kevin Anderson (:ghuser:`kanderso-nrel`)
+* Adam R. Jensen (:ghuser:`AdamRJensen`)

pvlib/data/bsrn-lr0100-pay0616.dat

Lines changed: 86901 additions & 0 deletions
Large diffs are not rendered by default.

pvlib/data/bsrn-pay0616.dat.gz

4.13 MB
Binary file not shown.

pvlib/iotools/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -13,3 +13,4 @@
 from pvlib.iotools.psm3 import read_psm3  # noqa: F401
 from pvlib.iotools.psm3 import parse_psm3  # noqa: F401
 from pvlib.iotools.pvgis import get_pvgis_tmy, read_pvgis_tmy  # noqa: F401
+from pvlib.iotools.bsrn import read_bsrn  # noqa: F401

pvlib/iotools/bsrn.py

Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
+"""Functions to read data from the Baseline Surface Radiation Network (BSRN).
+.. codeauthor:: Adam R. Jensen<[email protected]>
+"""
+
+import pandas as pd
+import gzip
+
+COL_SPECS = [(0, 3), (4, 9), (10, 16), (16, 22), (22, 27), (27, 32), (32, 39),
+             (39, 45), (45, 50), (50, 55), (55, 64), (64, 70), (70, 75)]
+
+BSRN_COLUMNS = ['day', 'minute',
+                'ghi', 'ghi_std', 'ghi_min', 'ghi_max',
+                'dni', 'dni_std', 'dni_min', 'dni_max',
+                'empty', 'empty', 'empty', 'empty', 'empty',
+                'dhi', 'dhi_std', 'dhi_min', 'dhi_max',
+                'lwd', 'lwd_std', 'lwd_min', 'lwd_max',
+                'temp_air', 'relative_humidity', 'pressure']
+
+
+def read_bsrn(filename):
+    """
+    Read a BSRN station-to-archive file into a DataFrame.
+
+    The BSRN (Baseline Surface Radiation Network) is a worldwide network
+    of high-quality solar radiation monitoring stations as described in [1]_.
+    The function only parses the basic measurements (LR0100), which include
+    global, diffuse, direct and downwelling long-wave radiation [2]_. Future
+    updates may include parsing of additional data and meta-data.
+
+    BSRN files are freely available and can be accessed via FTP [3]_. Required
+    username and password are easily obtainable as described in the BSRN's
+    Data Release Guidelines [4]_.
+
+
+
+
+    Parameters
+    ----------
+    filename: str
+        A relative or absolute file path.
+
+    Returns
+    -------
+    data: DataFrame
+        A DataFrame with the columns as described below. For more extensive
+        description of the variables, consult [2]_.
+
+    Notes
+    -----
+    The data DataFrame includes the following fields:
+
+    =======================  ======  ==========================================
+    Key                      Format  Description
+    =======================  ======  ==========================================
+    day                      int     Day of the month 1-31
+    minute                   int     Minute of the day 0-1439
+    ghi                      float   Mean global horizontal irradiance [W/m^2]
+    ghi_std                  float   Std. global horizontal irradiance [W/m^2]
+    ghi_min                  float   Min. global horizontal irradiance [W/m^2]
+    ghi_max                  float   Max. global horizontal irradiance [W/m^2]
+    dni                      float   Mean direct normal irradiance [W/m^2]
+    dni_std                  float   Std. direct normal irradiance [W/m^2]
+    dni_min                  float   Min. direct normal irradiance [W/m^2]
+    dni_max                  float   Max. direct normal irradiance [W/m^2]
+    dhi                      float   Mean diffuse horizontal irradiance [W/m^2]
+    dhi_std                  float   Std. diffuse horizontal irradiance [W/m^2]
+    dhi_min                  float   Min. diffuse horizontal irradiance [W/m^2]
+    dhi_max                  float   Max. diffuse horizontal irradiance [W/m^2]
+    lwd                      float   Mean downward long-wave radiation [W/m^2]
+    lwd_std                  float   Std. downward long-wave radiation [W/m^2]
+    lwd_min                  float   Min. downward long-wave radiation [W/m^2]
+    lwd_max                  float   Max. downward long-wave radiation [W/m^2]
+    temp_air                 float   Air temperature [°C]
+    relative_humidity        float   Relative humidity [%]
+    pressure                 float   Atmospheric pressure [hPa]
+    =======================  ======  ==========================================
+
+    References
+    ----------
+    .. [1] `World Radiation Monitoring Center - Baseline Surface Radiation
+       Network (BSRN)
+       <https://bsrn.awi.de/>`_
+    .. [2] `Update of the Technical Plan for BSRN Data Management, 2013,
+       Global Climate Observing System (GCOS) GCOS-174.
+       <https://bsrn.awi.de/fileadmin/user_upload/bsrn.awi.de/Publications/gcos-174.pdf>`_
+    .. [3] `BSRN Data Retrieval via FTP
+       <https://bsrn.awi.de/data/data-retrieval-via-ftp/>`_
+    .. [4] `BSRN Data Release Guidelines
+       <https://bsrn.awi.de/data/conditions-of-data-release/>`_
+    """
+
+    # Read file and store the starting line number for each logical record (LR)
+    line_no_dict = {}
+    if str(filename).endswith('.gz'):  # check if file is a gzipped (.gz) file
+        open_func, mode = gzip.open, 'rt'
+    else:
+        open_func, mode = open, 'r'
+    with open_func(filename, mode) as f:
+        f.readline()  # first line should be *U0001, so read it and discard
+        line_no_dict['0001'] = 0
+        date_line = f.readline()  # second line contains the year and month
+        start_date = pd.Timestamp(year=int(date_line[7:11]),
+                                  month=int(date_line[3:6]), day=1,
+                                  tz='UTC')  # BSRN timestamps are UTC
+        for num, line in enumerate(f, start=2):
+            if line.startswith('*'):  # Find start of all logical records
+                line_no_dict[line[2:6]] = num  # key is 4 digit LR number
+
+    # Determine start and end line of logical record LR0100 to be parsed
+    start_row = line_no_dict['0100'] + 1  # Start line number
+    # If LR0100 is the last logical record, then read rest of file
+    if start_row-1 == max(line_no_dict.values()):
+        end_row = num  # then parse rest of the file
+    else:  # otherwise parse until the beginning of the next logical record
+        end_row = min([i for i in line_no_dict.values() if i > start_row]) - 1
+    nrows = end_row-start_row+1
+
+    # Read file as a fixed width file (fwf)
+    data = pd.read_fwf(filename, skiprows=start_row, nrows=nrows, header=None,
+                       colspecs=COL_SPECS, na_values=[-999.0, -99.9],
+                       compression='infer')
+
+    # Create multi-index and unstack, resulting in one column for each variable
+    data = data.set_index([data.index // 2, data.index % 2])
+    data = data.unstack(level=1).swaplevel(i=0, j=1, axis='columns')
+
+    # Sort columns to match original order and assign column names
+    data = data.reindex(sorted(data.columns), axis='columns')
+    data.columns = BSRN_COLUMNS
+    # Drop empty columns
+    data = data.drop('empty', axis='columns')
+
+    # Change day and minute type to integer
+    data['day'] = data['day'].astype('Int64')
+    data['minute'] = data['minute'].astype('Int64')
+
+    # Set datetime index
+    data.index = (start_date
+                  + pd.to_timedelta(data['day']-1, unit='d')
+                  + pd.to_timedelta(data['minute'], unit='T'))
+
+    return data
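
The reshaping step in `read_bsrn` relies on each one-minute record in LR0100 occupying two consecutive physical rows, which are paired by integer division and modulo of the row number and then unstacked into a single wide row. The sketch below shows the same pandas calls on a made-up frame with three columns instead of thirteen. The values and the column names assigned at the end are hypothetical and do not reflect the real BSRN layout.

```python
import pandas as pd

# Made-up miniature: two records, each spread over two physical rows.
raw = pd.DataFrame([
    [1, 0, 100.0],     # record 0, first row
    [5.0, 6.0, 7.0],   # record 0, second row
    [1, 1, 110.0],     # record 1, first row
    [8.0, 9.0, 10.0],  # record 1, second row
])

# Level 0 of the new index is the record number, level 1 the row within it
paired = raw.set_index([raw.index // 2, raw.index % 2])

# Move the row-within-record level into the columns, then put it first
wide = paired.unstack(level=1).swaplevel(i=0, j=1, axis='columns')

# Sort so that all first-row fields precede all second-row fields
wide = wide.reindex(sorted(wide.columns), axis='columns')

# Hypothetical names, for illustration only
wide.columns = ['day', 'minute', 'first', 'second_a', 'second_b', 'second_c']

print(wide)
```

After the sort, the columns appear in file order, with the fields of the first row followed by the fields of the second row, which is why a flat list of names such as `BSRN_COLUMNS` can be assigned directly.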

pvlib/tests/iotools/test_bsrn.py

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+"""
+tests for :mod:`pvlib.iotools.bsrn`
+"""
+
+
+import pandas as pd
+import pytest
+
+from pvlib.iotools import bsrn
+from conftest import DATA_DIR, assert_index_equal
+
+
+@pytest.mark.parametrize('testfile,expected_index', [
+    ('bsrn-pay0616.dat.gz',
+     pd.date_range(start='20160601', periods=43200, freq='1min', tz='UTC')),
+    ('bsrn-lr0100-pay0616.dat',
+     pd.date_range(start='20160601', periods=43200, freq='1min', tz='UTC')),
+])
+def test_read_bsrn(testfile, expected_index):
+    data = bsrn.read_bsrn(DATA_DIR / testfile)
+    assert_index_equal(expected_index, data.index)
+    assert 'ghi' in data.columns
+    assert 'dni_std' in data.columns
+    assert 'dhi_min' in data.columns
+    assert 'lwd_max' in data.columns
+    assert 'relative_humidity' in data.columns
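
The expected index in the test has 43200 entries because June has 30 days of 1440 minutes each. As a rough cross-check, the sketch below rebuilds the same sequence with only the standard library, combining a month start with a day-of-month and minute-of-day in the way `read_bsrn` builds its index. The helper name `month_minutes` is hypothetical and not part of pvlib.

```python
import calendar
from datetime import datetime, timedelta, timezone


def month_minutes(year, month):
    """Return one UTC timestamp per minute of the given month.

    Hypothetical helper: each timestamp is
    month start + (day - 1) days + minute-of-day minutes.
    """
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    days_in_month = calendar.monthrange(year, month)[1]
    return [start + timedelta(days=day - 1, minutes=minute)
            for day in range(1, days_in_month + 1)
            for minute in range(1440)]


index = month_minutes(2016, 6)
print(len(index))  # prints 43200
print(index[0].isoformat(), index[-1].isoformat())
```

A file with missing or duplicated records would give a different length or a gap in the sequence, which is what comparing against a complete `pd.date_range` in the test would catch.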
