Created attachment 181725 [details] proposed patch I using a dual core ARMADA 38X platform that contains a CESA module to accelerate cryptography operations. I have some crashes when using the cryptodev through OpenSSL. The crash only happen when the following conditions are met: - Multi-core system is used - Hardware acceleration is used - Hardware acceleration perform an interruption to signal the operations completion That are the scenario, first: - A process ask a cryptographic operation - The crypto operation is send to the CESA - msleep is called, but the process seems to be migrated on another CPU, and msleep return immediately (spurious wakeup) (I puted a KASSERT that check if the CRYPTO_F_DONE is present and it rise) - The operation looks like complete and memory is freed - The CESA performs the cryptographic operation on the freed memory that may cause corruption in some other parts of the kernel. Second: - The call to crypto_dispatch is done, but the scheduler prempt the process before the lock on ces->lock is done - The CESA perform the operation and crypto_done is called by the ithread - The flag CRYPTO_F_DONE is put and scheduler preempt the ithread - The user process continue, take the lock and check the flag, that is present, and doesn't call msleep. - It free the memory, and return to the userland - The ithread is resumed and calls the callbacks that crash by accessing the freed memory. I attach my proposed patch to fix that issue
I do not know the opencrypto code. From the brief look around the fragments that you patched, my first question is it possible for one session (struc csession) to have more than one requests (struct cryptop) active simultaneously ? If yes, I do not see how the completion flag could be moved into session. Also, from your description, it seems to be more of an issue in the interface between the driver and crypto infrastructure. I.e. crp must not be freed until consumer acknowledge.
It seems that the "struct csession" can only perform one request at a time, done by cryptodev_op function. I agree that the synchronisation is not well formed. I made my patch to be as small as possible to fix the issue. A re-work may be needed
There's certainly nothing to prevent multiple threads from issuing ioctls against the same session. (However, having 'error' in the cse is buggy, but it should also just be removed as all the uses of it check the crp->crp_etype it is copied from just before using it anyway.) The problem is that CRYPTO_F_DONE isn't really safe for how the /dev/crypto ioctl handlers are using it (and probably isn't safe to use at all and should arguably be removed). Instead, the sleeps need to be conditional on a flag that the callback function itself sets. Probably the right way to handle this is to allocate a separate structure that crp_opaque points to that contains the "done" flag and a pointer to the cse. The callback would set the done flag under the cse lock and then issue the wakeup, and the ioctl routines would sleep on this done flag in the new structure. Something like: struct crypt_op_data { bool done; struct csession *cse; } crypto_dev_op() { struct crypt_op_data *cod; ... cod = malloc(sizeof(*cod), M_WAITOK | M_ZERO); cod->cse = cse; ... crp->crp_opaque = cod; /* not 'cse' anymore */ ... crypto_dispatch(); mtx_lock(&cse->lock); while (error == 0 && !cod->done) mtx_sleep(crp, &cse->lock, ...); mtx_unlock(&cse->lock); ... /* be sure to clear done if retrying */ } cryptodev_cb() { struct cryptop *crp = op; struct crypto_op_data *cod = crp->crp_opaque; mtx_lock(&cod->cse->lock); cod->done = true; mtx_unlock(&cod->cse->lock); wakeup(crp); }
Nice catch ! Just one question, instead of mallocing crypt_op_data, can we use the stack, as a closure ?
(In reply to Alexandre martins from comment #4) It seems that the cession is not thread safe, for example in cryptodev_op: ... 712 cse->uio.uio_iov = &cse->iovec; 713 cse->uio.uio_iovcnt = 1; 714 cse->uio.uio_offset = 0; ... There is no lock around that. Should I propose a bigger patch to fix that ?
(In reply to Alexandre martins from comment #4) If you want to use on-stack memory from another thread or from an interrupt context, be aware that the thread's kstacks are generally swappable. Of course, the stack is swapped in when the thread is marked runnable, but otherwise you may get into a situation when other thread accessing unmapped kstack. You may work around this with PHOLD(), but that would keep whole stack in memory when system desperately needs it. In other words, using malloc(9) is less troublesome.
(In reply to Alexandre martins from comment #5) Well, hmm. If the uio is in the csession than perhaps we should just formalize "only one op at a time" by having the crypto ioctls detect that case and either fail or block. Alternatively we could move the uio into the 'cod'. Probably it is simpler to just enforce "one op at a time" and then your original patch should work I think.
FYI, I have a potential fix for this in the middle of a larger patch series I'm working on to enable AES-GCM offloading for the OpenSSL /dev/crypto engine and some other things. It uses malloc'd per-op buffer. While concurrent operations on a session are probably rare, we don't do anything to prevent them currently and this makes them safe. I also prefer to not muddy the per-session state with per-op state. The relevant commits are: https://github.com/freebsd/freebsd/commit/6d93ba89e16b137cca55b377c46430815cb68464 along with 3 followup fixes in this branch: https://github.com/freebsd/freebsd/compare/master...bsdjhb:openssl_engine I don't have an ETA for when this might reach HEAD apart from "soon-ish". I have some other open reviews for /dev/crypto changes that I haven't been able to get any review for already.
I've posted the commits I mentioned earlier in a review at https://reviews.freebsd.org/D13928
A commit references this bug: Author: jhb Date: Fri Jan 26 23:21:50 UTC 2018 New revision: 328453 URL: https://svnweb.freebsd.org/changeset/base/328453 Log: Move per-operation data out of the csession structure. Create a struct cryptop_data which contains state needed for a single symmetric crypto operation and move that state out of the session. This closes a race with the CRYPTO_F_DONE flag that can result in use after free. While here, remove the 'cse->error' member. It was just a copy of 'crp->crp_etype' and cryptodev_op() and cryptodev_aead() checked both 'crp->crp_etype' and 'cse->error'. Similarly, do not check for an error from mtx_sleep() since it is not used with PCATCH or a timeout so cannot fail with an error. PR: 218597 Reviewed by: kib Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D13928 Changes: head/sys/opencrypto/cryptodev.c
A commit references this bug: Author: jhb Date: Mon Mar 11 22:17:53 UTC 2019 New revision: 345033 URL: https://svnweb.freebsd.org/changeset/base/345033 Log: MFC 328453: Move per-operation data out of the csession structure. Create a struct cryptop_data which contains state needed for a single symmetric crypto operation and move that state out of the session. This closes a race with the CRYPTO_F_DONE flag that can result in use after free. While here, remove the 'cse->error' member. It was just a copy of 'crp->crp_etype' and cryptodev_op() and cryptodev_aead() checked both 'crp->crp_etype' and 'cse->error'. Similarly, do not check for an error from mtx_sleep() since it is not used with PCATCH or a timeout so cannot fail with an error. PR: 218597 Sponsored by: Chelsio Communications Changes: _U stable/11/ stable/11/sys/opencrypto/cryptodev.c