Description
It is well known that digitalWrite() and friends are quite slow, taking dozens of clock cycles for something that can sometimes be done in just 1 or 2 when using direct I/O. Especially when the arguments are constants, it should be possible to optimize this heavily.
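To make the gap concrete, here is a minimal host-side sketch in plain C. Everything in it is a hypothetical stand-in: fake_port replaces the real PORTx registers, the two small tables replace the core's PROGMEM tables, and slow_write() only mimics the general shape of digitalWrite(), not its actual implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for hardware: on a real AVR these would be the
 * PORTx registers and the core's PROGMEM pin tables. */
enum { NUM_PINS = 4 };
static volatile uint8_t fake_port[2];                       /* two 8-bit "ports" */
static const uint8_t pin_to_port[NUM_PINS] = {0, 0, 1, 1};  /* pin -> port index */
static const uint8_t pin_to_mask[NUM_PINS] = {0x01, 0x02, 0x01, 0x20};

/* Generic path, shaped like digitalWrite(): two table lookups and a
 * branch happen before the port is touched. */
void slow_write(uint8_t pin, uint8_t value)
{
    assert(pin < NUM_PINS);
    volatile uint8_t *out = &fake_port[pin_to_port[pin]];
    uint8_t mask = pin_to_mask[pin];
    if (value)
        *out |= mask;
    else
        *out &= (uint8_t)~mask;
}

/* Direct path: port and mask are spelled out in the source, so this is a
 * single read-modify-write. On AVR, setting one bit in a low I/O register
 * can compile to a single sbi instruction. */
void fast_set_pin3(void)
{
    fake_port[1] |= 0x20;
}
```

Both functions end up setting the same bit for pin 3. The difference is how much work happens before the write, and that work is what constant arguments should allow the compiler to remove.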
In addition to the high-level functions like digitalWrite(), there are also lower-level (and AVR-specific) macros like digitalPinToPort(), which use a PROGMEM-stored table to perform translations. Currently, these always result in runtime lookups, even when the argument is constant, because:
- The compiler does not optimize through things like pgm_read_byte() (haven't verified this, though)
- The contents of these tables are only available in the compilation unit that defines ARDUINO_MAIN (wiring_digital.c IIRC)
It would be good to solve these two problems to optimize the low-level macros, and additionally to improve the higher-level functions so that they can be inlined (or are forced to be) when the arguments are constants. arduino/Arduino#1285 took a swing at the latter and contains a lot of useful experience. For the low-level problems, arduino/Arduino#1285 modified pins_arduino.h to put the lookup table contents into a macro, but that doesn't seem very elegant or robust to me.
An alternative I came up with was to define a static table in all compilation units (except for the ARDUINO_MAIN one) to use for compile-time lookups, and a non-static table in the ARDUINO_MAIN compilation unit to use for all runtime lookups. In short, this would look something like this:
Arduino.h:
// Use a function to ensure pin is only evaluated once. This might
// also work with a do while loop and/or local variable?
#define digitalPinToPort(P) digitalPinToPortFunction(P)
static inline uint8_t digitalPinToPortFunction(uint8_t pin) __attribute__((always_inline));
static inline uint8_t digitalPinToPortFunction(uint8_t pin) {
#if !defined(ARDUINO_MAIN)
    // Constant pin: index the per-unit static table so the lookup can be
    // folded at compile time.
    if (__builtin_constant_p(pin))
        return digital_pin_to_port[pin];
    else
#endif
    // Non-constant pin: runtime lookup in the single PROGMEM table.
    return pgm_read_byte( digital_pin_to_port_PGM + pin );
}
#ifdef ARDUINO_MAIN
#define DIGITAL_PIN_TO_PORT_TABLE const uint8_t PROGMEM digital_pin_to_port_PGM[]
#else
#define DIGITAL_PIN_TO_PORT_TABLE static const uint8_t PROGMEM digital_pin_to_port[]
#endif
pins_arduino.h:
DIGITAL_PIN_TO_PORT_TABLE = {
    PD, /* 0 */
    PD,
    /* etc. */
};
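The idea can be tried out on a desktop GCC or Clang with a minimal sketch like the one below. It is not AVR code: fake_pgm_read_byte() is a hypothetical stand-in for pgm_read_byte(), the table values are made up, and both tables live in one file, whereas in the proposal above they would be in different compilation units. port_of_pin2() and port_of() are illustrative helpers only.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_PINS = 4 };

/* Table visible in this compilation unit, so the optimizer can see its
 * contents. The values are made up for illustration. */
static const uint8_t digital_pin_to_port[NUM_PINS] = {4, 4, 2, 3};

/* Hypothetical stand-ins for the PROGMEM table and pgm_read_byte(). */
static const uint8_t digital_pin_to_port_PGM[NUM_PINS] = {4, 4, 2, 3};
static uint8_t fake_pgm_read_byte(const uint8_t *addr) { return *addr; }

static inline uint8_t digitalPinToPortFunction(uint8_t pin)
    __attribute__((always_inline));
static inline uint8_t digitalPinToPortFunction(uint8_t pin)
{
    assert(pin < NUM_PINS);
    /* After inlining with optimization enabled, this is true when the
     * caller passed a compile-time constant. */
    if (__builtin_constant_p(pin))
        return digital_pin_to_port[pin];
    return fake_pgm_read_byte(digital_pin_to_port_PGM + pin);
}

/* Wrapping the function in a macro keeps P evaluated exactly once. */
#define digitalPinToPort(P) digitalPinToPortFunction(P)

/* Constant argument: a candidate for compile-time folding. */
uint8_t port_of_pin2(void) { return digitalPinToPort(2); }

/* Non-constant argument: takes the runtime lookup path. */
uint8_t port_of(uint8_t pin) { return digitalPinToPort(pin); }
```

Both paths return the same values, so behavior does not depend on which branch the compiler picks. Whether port_of_pin2() really collapses to a constant depends on the compiler and optimization level; without optimization, __builtin_constant_p() on a function parameter generally evaluates to 0 and the runtime path is used. Inspecting the generated assembly is the way to confirm.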
This discussion was started after merging arduino/Arduino#121, where @PaulStoffregen pointed out the performance implications of that merge.