The SPI sharing utilities are broken for fallible chipselect pins #573

Open
@GrantM11235

This affects all the SPI sharing utilities in some way, but I would like to highlight the worst-case scenario:

  • You are using the mutex or critical section impl
  • In thread 1 you do a transaction to device A, but the cs pin fails to deassert
  • Before thread 1 even has a chance to react to the error, it is preempted by thread 2
  • Thread 2 does a transaction to device B while the cs for device A is still active
  • This leads to data corruption at best, and physical damage at worst

Option 1: Infallible only

CS: OutputPin<Error = Infallible>

This is the simplest option. In fact it is even simpler than the existing impls because it allows us to remove the DeviceError enum.

You can still use fallible chipselect pins if you wrap them in some sort of adapter that panics on failure. We could even provide such an adapter.
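A minimal sketch of what such an adapter could look like. The `OutputPin` trait below is a simplified local stand-in for the embedded-hal trait (the real one defines its error type through a separate `ErrorType` trait), and `PanicOnError`, `MockPin` and `requires_infallible_cs` are hypothetical names used only for illustration:

```rust
use core::convert::Infallible;
use core::fmt::Debug;

/// Simplified local stand-in for embedded-hal's `OutputPin` (assumption:
/// the real trait gets its error type from a separate `ErrorType` trait).
pub trait OutputPin {
    type Error: Debug;
    fn set_low(&mut self) -> Result<(), Self::Error>;
    fn set_high(&mut self) -> Result<(), Self::Error>;
}

/// Hypothetical adapter: wraps a fallible pin and panics on any failure,
/// so the wrapper itself can claim `Error = Infallible`.
pub struct PanicOnError<P>(pub P);

impl<P: OutputPin> OutputPin for PanicOnError<P> {
    type Error = Infallible;

    fn set_low(&mut self) -> Result<(), Infallible> {
        self.0.set_low().expect("chipselect pin failed to go low");
        Ok(())
    }

    fn set_high(&mut self) -> Result<(), Infallible> {
        self.0.set_high().expect("chipselect pin failed to go high");
        Ok(())
    }
}

/// Toy pin used only to exercise the adapter; fails on demand.
pub struct MockPin {
    pub is_high: bool,
    pub fail: bool,
}

impl OutputPin for MockPin {
    type Error = &'static str;

    fn set_low(&mut self) -> Result<(), Self::Error> {
        if self.fail {
            return Err("pin fault");
        }
        self.is_high = false;
        Ok(())
    }

    fn set_high(&mut self) -> Result<(), Self::Error> {
        if self.fail {
            return Err("pin fault");
        }
        self.is_high = true;
        Ok(())
    }
}

/// Stands in for a sharing utility that only accepts infallible chipselects.
pub fn requires_infallible_cs<CS: OutputPin<Error = Infallible>>(_cs: &CS) {}
```

With this shape, a bus-sharing impl bounded on `CS: OutputPin<Error = Infallible>` never sees a chipselect error at all; a failing pin aborts the program at the point of failure instead of leaving the bus in an unknown state.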

Option 2: track "poisoned" chipselects

I have some idea how to implement this. It would involve adding a shared flag to each bus, as well as another flag for each SpiDevice. The impl would refuse to do any SPI transactions until the offending chipselect is fixed.

This would be a lot of extra overhead and complexity that is completely unnecessary for 99.9% of users. I don't think the compiler will be able to remove the unnecessary overhead.
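One way the two flags could fit together, as a rough sketch only and not a worked-out design. Everything here is hypothetical: the `Bus`, `Device` and `FlakyPin` types, the `Poisoned` error variant, the local `OutputPin` stand-in trait, and the choice of `std::sync::Mutex` (a real embedded impl would use a critical section or similar, and the lock would also guard the SPI bus itself):

```rust
use std::sync::Mutex;

/// Simplified local stand-in for embedded-hal's `OutputPin`.
pub trait OutputPin {
    type Error;
    fn set_low(&mut self) -> Result<(), Self::Error>;
    fn set_high(&mut self) -> Result<(), Self::Error>;
}

#[derive(Debug, PartialEq)]
pub enum DeviceError {
    /// This device's own chipselect failed.
    Cs,
    /// Another device's chipselect is in an unknown state; bus unusable.
    Poisoned,
}

/// Shared bus state: a single "poisoned" flag behind the bus lock.
pub struct Bus {
    poisoned: Mutex<bool>,
}

impl Bus {
    pub fn new() -> Self {
        Bus { poisoned: Mutex::new(false) }
    }
}

/// Per-device state: remembers whether this device poisoned the bus.
pub struct Device<'a, CS> {
    bus: &'a Bus,
    pub cs: CS,
    cs_stuck: bool,
}

impl<'a, CS: OutputPin> Device<'a, CS> {
    pub fn new(bus: &'a Bus, cs: CS) -> Self {
        Device { bus, cs, cs_stuck: false }
    }

    /// Runs `f` with the chipselect asserted; `f` stands in for the transfer.
    pub fn transaction(&mut self, f: impl FnOnce()) -> Result<(), DeviceError> {
        let bus = self.bus;
        let mut poisoned = bus.poisoned.lock().unwrap();
        if *poisoned {
            if !self.cs_stuck {
                // Someone else's chipselect may still be active: refuse.
                return Err(DeviceError::Poisoned);
            }
            // We are the offender: retry the deassert before doing anything.
            self.cs.set_high().map_err(|_| DeviceError::Cs)?;
            self.cs_stuck = false;
            *poisoned = false;
        }
        if self.cs.set_low().is_err() {
            // Pin state is unknown after a failed assert, so poison as well.
            self.cs_stuck = true;
            *poisoned = true;
            return Err(DeviceError::Cs);
        }
        f();
        if self.cs.set_high().is_err() {
            self.cs_stuck = true;
            *poisoned = true;
            return Err(DeviceError::Cs);
        }
        Ok(())
    }
}

/// Toy pin that fails on demand, used only to exercise the sketch.
pub struct FlakyPin {
    pub fail: bool,
}

impl OutputPin for FlakyPin {
    type Error = ();
    fn set_low(&mut self) -> Result<(), ()> {
        if self.fail { Err(()) } else { Ok(()) }
    }
    fn set_high(&mut self) -> Result<(), ()> {
        if self.fail { Err(()) } else { Ok(()) }
    }
}
```

Because the flag is checked and set while the bus lock is held, a second thread cannot start a transaction between the failed deassert and the poisoning. The cost is visible even in this sketch: every transaction pays for the extra check and every device carries extra state.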

Option 3: do nothing

Document the broken error handling, how to work around it, and where it is impossible to work around.

If we want to be thorough, the amount of warnings and disclaimers will probably outweigh the amount of actual code:

  • how to detect a failed chipselect
    • DeviceError::Cs, obviously, but also
    • DeviceError::Spi, because it can "hide" chipselect errors
    • basically any SpiDevice error
    • downstream errors such as DisplayError::BusWriteError from display_interface (but ironically not DisplayError::CSError)
  • how to properly recover from a chipselect error
    • before using any device on the bus:
      • get access to the SpiDevice by destroying whichever driver was using it
      • get access to the chipselect by destroying the SpiDevice
      • do whatever you need to do to fix and deassert the chipselect pin
    • in multithreaded/concurrent code, it is impossible to ensure that this is done before another thread uses a device on the bus, so chipselect failure cannot be handled correctly in this case
  • what happens if you use a device on the bus before fixing the chipselect error
    • this is even a minor problem for ExclusiveDevice because the framing will be messed up
    • if two chipselects are active at once
      • the other device will receive garbage data
      • both devices will fight to drive MISO, causing data corruption and possibly even physical damage
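The single-threaded recovery steps listed above can be sketched roughly as follows. All types here (`CsPin`, `Device`, `Driver`) and their `release` methods are hypothetical local stand-ins, not the actual embedded-hal or driver APIs, and the "repair" step is a placeholder for whatever hardware-specific fix is needed:

```rust
#[derive(Debug, PartialEq)]
pub enum DeviceError {
    Spi,
    Cs,
}

/// Toy chipselect pin that can fail to deassert.
pub struct CsPin {
    pub faulty: bool,
}

impl CsPin {
    pub fn set_low(&mut self) -> Result<(), ()> {
        Ok(())
    }
    pub fn set_high(&mut self) -> Result<(), ()> {
        if self.faulty { Err(()) } else { Ok(()) }
    }
}

/// Hypothetical SPI device that owns its chipselect.
pub struct Device {
    cs: CsPin,
}

impl Device {
    pub fn new(cs: CsPin) -> Self {
        Device { cs }
    }
    /// Destroys the device and hands back the chipselect pin.
    pub fn release(self) -> CsPin {
        self.cs
    }
    pub fn write(&mut self, _data: &[u8]) -> Result<(), DeviceError> {
        self.cs.set_low().map_err(|_| DeviceError::Cs)?;
        // The actual bus transfer would happen here.
        self.cs.set_high().map_err(|_| DeviceError::Cs)
    }
}

/// Hypothetical driver that owns the device.
pub struct Driver {
    dev: Device,
}

impl Driver {
    pub fn new(dev: Device) -> Self {
        Driver { dev }
    }
    /// Destroys the driver and hands back the device.
    pub fn release(self) -> Device {
        self.dev
    }
    pub fn init(&mut self) -> Result<(), DeviceError> {
        self.dev.write(&[0x01])
    }
}

/// Unwinds ownership down to the pin, repairs and deasserts it, then
/// rebuilds the stack. Returns the pin if it still cannot be deasserted.
pub fn recover(driver: Driver) -> Result<Driver, CsPin> {
    let mut cs = driver.release().release();
    cs.faulty = false; // placeholder for the hardware-specific repair
    match cs.set_high() {
        Ok(()) => Ok(Driver::new(Device::new(cs))),
        Err(()) => Err(cs),
    }
}
```

Note that this only works if nothing else touches the bus between the failure and the call to `recover`, which is exactly the guarantee that concurrent code cannot provide.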

Conclusions

If you only have infallible chipselect pins, or if you want to panic on chipselect failure, option 1 is the best.

If you actually want to try to gracefully recover from chipselect errors, option 2 is the only real option.

Option 3 isn't good for either group. It is slightly inconvenient for the first group, and it is almost impossible to use correctly for the second group.

My recommendation is option 1 because it is the best option for most people. If someone really needs option 2, they can write it themselves; it doesn't need to be in our crate.
