SURFRAD site & date-range download #1155

Open
@mikofski

Description

Is your feature request related to a problem? Please describe.
The current SURFRAD iotools function only reads a single-day .dat file from either a URL or the filesystem, e.g.:

# read from url
pvlib.iotools.read_surfrad('ftp://aftp.cmdl.noaa.gov/data/radiation/surfrad/Bondville_IL/2021/bon21001.dat')
# read from file
pvlib.iotools.read_surfrad('bon21001.dat')

Unfortunately, I can't quickly read an entire year or any arbitrarily large date range. I can use pvlib.iotools.read_surfrad in a loop, but it takes a long time to serially read an entire year. Maybe it would be faster if I already had the files downloaded. It takes about 1 second to read a single 111 kB file, so 10,000 files would take about 3 hours, which is too long if I have to read 7 sites.

%%timeit
bon95 = [
    pvl.iotools.read_surfrad(r'ftp://aftp.cmdl.noaa.gov/data/radiation/surfrad/Bondville_IL/1995/bon95%03d.dat' % (x+1))
    for x in range(16)]  # read in 16 files

14.4 s ± 295 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

That's 14.4 s / 16 files = 0.9 s per file. I tried to use threading, but then I got connection errors; I think there's a limit of 5 simultaneous connections to the NOAA FTP server from one computer. Capping at 5 connections should bring it down to about 30 minutes, hmm, maybe I didn't try hard enough? Anyway, I went a different way.
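For reference, a minimal sketch of what a capped thread pool around read_surfrad could look like, assuming concurrent.futures.ThreadPoolExecutor with max_workers=5 to stay under that apparent connection limit (the 5-worker cap and the read_day helper are my assumptions, not anything in pvlib):

# hypothetical sketch: read a range of daily files with at most 5
# concurrent connections, to respect the apparent NOAA FTP limit
from concurrent.futures import ThreadPoolExecutor
import pvlib

URL_FMT = ('ftp://aftp.cmdl.noaa.gov/data/radiation/surfrad/'
           'Bondville_IL/1995/bon95%03d.dat')

def read_day(day_of_year):
    # each call opens its own connection via read_surfrad
    return pvlib.iotools.read_surfrad(URL_FMT % day_of_year)

with ThreadPoolExecutor(max_workers=5) as pool:
    bon95 = list(pool.map(read_day, range(1, 17)))  # first 16 days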

Describe the solution you'd like
The current read_surfrad uses Python's urllib.request.urlopen for each connection. I have found that opening a long-lived FTP connection using Python's ftplib allows downloading many more files by reusing the same connection. However, this download is still serial, so in addition I have found that using Python threading lets me open up to 5 simultaneous connections; any more and I get a 421 FTP error, too many connections.
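A minimal sketch of that approach, assuming anonymous login, one persistent ftplib.FTP connection per worker thread, and 5 workers; the paths and the download_chunk helper here are illustrative, the actual script is in the gist linked below:

# sketch: download one site's files over a few persistent FTP connections
from concurrent.futures import ThreadPoolExecutor
from ftplib import FTP
import os

HOST = 'aftp.cmdl.noaa.gov'
REMOTE_DIR = 'data/radiation/surfrad/Bondville_IL/1995'

def download_chunk(filenames, dest='.'):
    # one long-lived connection reused for this thread's share of files
    with FTP(HOST) as ftp:
        ftp.login()  # anonymous login
        ftp.cwd(REMOTE_DIR)
        for fname in filenames:
            with open(os.path.join(dest, fname), 'wb') as f:
                ftp.retrbinary('RETR ' + fname, f.write)

filenames = ['bon95%03d.dat' % d for d in range(1, 366)]
chunks = [filenames[i::5] for i in range(5)]  # split files across 5 workers
with ThreadPoolExecutor(max_workers=5) as pool:
    list(pool.map(download_chunk, chunks))  # list() surfaces any errors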

Describe alternatives you've considered
I was able to open the FTP site directly in Windows, but that was also a serial connection, so downloading about 10,000 files (roughly 1 GB) would have taken about 4 hours. By contrast, using ftplib and threading I can download all of the data for a single site in about 25 minutes.

Additional context
#590
#595
gist of my working script: https://gist.github.com/mikofski/30455056b88a5d161598856cc4eedb2c
