Bug 200584 - [patch] threads can get stuck in bmc2835 SPI driver bcm_spi_transfer() function
Summary: [patch] threads can get stuck in bmc2835 SPI driver bcm_spi_transfer() function
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: arm (show other bugs)
Version: 10.1-RELEASE
Hardware: arm Any
: --- Affects Some People
Assignee: Luiz Otavio O Souza,+55 (14) 99772-1255
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2015-06-02 02:23 UTC by Dave Rush
Modified: 2015-06-02 16:13 UTC (History)
2 users (show)

See Also:


Attachments
BCM2835 SPI thread safety patch (1.52 KB, application/x-fossil-artifact)
2015-06-02 02:23 UTC, Dave Rush
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dave Rush 2015-06-02 02:23:57 UTC
Created attachment 157362 [details]
BCM2835 SPI thread safety patch

The SPI driver for the BCM2835 (Used in the Raspberry Pi) is not thread safe. If multiple threads are calling into the bcm_spi_transfer() function then one or more of them can get stuck sleeping on a mutex in the following code:

	/* If the controller is in use wait until it is available. */
	while (sc->sc_flags & BCM_SPI_BUSY)
		mtx_sleep(dev, &sc->sc_mtx, 0, "bcm_spi", 0);


The reason this happens is because there is a complex timing interaction between the wakup() call in bcm_spi_intr(), multiple threads waiting for the BUSY flag to clear in the loop mentioned above with a zero "timo" parameter, the releasing of the mutex during the call to mtx_sleep() while waiting for the transaction to complete, and the actual clearing of the BUSY flag in bcm_spi_transfer().


There's also an error in this code with the setting of the BUSY flag and then returning without clearing it if the chip select index is out of range. This will also cause other threads to get stuck waiting forever because the BUSY flag will never clear in this case. The code snippet is shown below.


	/* Now we have control over SPI controller. */
	sc->sc_flags = BCM_SPI_BUSY;

	/* Clear the FIFO. */
	bcm_spi_modifyreg(sc, SPI_CS,
	    SPI_CS_CLEAR_RXFIFO | SPI_CS_CLEAR_TXFIFO,
	    SPI_CS_CLEAR_RXFIFO | SPI_CS_CLEAR_TXFIFO);

	/* Get the proper chip select for this child. */
	spibus_get_cs(child, &cs);
	if (cs < 0 || cs > 2) {
		device_printf(dev,
		    "Invalid chip select %d requested by %s\n", cs,
		    device_get_nameunit(child));
		BCM_SPI_UNLOCK(sc);
		return (EINVAL);
	}


The attached patch corrects the first issue by adding a call to wakup() in the bcm_spi_transfer() function just after clearing the BUSY flag so that any threads that may be waiting here,

  	/* If the controller is in use wait until it is available. */
	while (sc->sc_flags & BCM_SPI_BUSY)
		mtx_sleep(dev, &sc->sc_mtx, 0, "bcm_spi", 0);

are woken up so they can actually reevaluate the BUSY flag and make forward progress if it's clear.

Note: This could also have been fixed simply by not sleeping forever in the mtx_sleep call while waiting for the flag to clear.


The patch corrects the 'never clearing the BUSY flag' issue if the chip select is out of range by moving that code up so it's evaluated before the busy flag is set.


This issue was discovered while stress testing a custom character device module for an external piece of hardware. The test starts up many processes that all do ioctl() calls every 10mS to 100mS which results in the calling of bcm_spi_transfer() to control their own slice of the external hardware. Without this patch in only takes a few minutes for one or more of the test processes to get stuck sleeping forever. With the patch the stress test runs successfully for hours.

This bug affects anyone who is using the SPI interface on their Raspberry Pi running FreeBSD to talk to the outside world.
Comment 1 commit-hook freebsd_committer freebsd_triage 2015-06-02 16:07:35 UTC
A commit references this bug:

Author: ian
Date: Tue Jun  2 16:07:29 UTC 2015
New revision: 283918
URL: https://svnweb.freebsd.org/changeset/base/283918

Log:
  Add a missing wakeup when releasing ownership of the SPI hardware.

  Also, validate the chipselect parameter before grabbing ownership of the
  hardware, and report timeout errors after releasing it.

  PR:		200584

Changes:
  head/sys/arm/broadcom/bcm2835/bcm2835_spi.c
Comment 2 Ian Lepore freebsd_committer freebsd_triage 2015-06-02 16:13:43 UTC
Patch applied (I changed wakeup() to wakeup_one()), thanks for contributing it.