This may be an upstream issue, but I'm starting here in hopes that someone would know for sure. I have not yet tried to reproduce this on other platforms.

1. The clang man page states that -O is equivalent to -O2, which is clearly not the case for clang11 per the benchmark data below.

2. I wrote a very simple benchmark suite using selection sort to compare the performance of various compilers and interpreters. There are subscript and pointer versions for C. I just noticed that the pointer implementation takes more than twice as long as the subscript implementations using clang90 or later. On amd64 with a modern compiler, I would expect very little difference, maybe a slight advantage to the pointer implementation, which is what I see with clang80 and gcc10.

To reproduce:

git clone https://github.com/outpaddling/Lang-speed
cd Lang-speed
./clang-check

clang version 8.0.1 (tags/RELEASE_801/final)
Target: x86_64-portbld-freebsd13.0
Thread model: posix
InstalledDir: /usr/local/llvm80/bin

-O
Subscripts:  2.27 real  2.27 user  0.00 sys
Pointers:    2.05 real  2.05 user  0.00 sys
-O2
Subscripts:  2.10 real  2.10 user  0.00 sys
Pointers:    2.02 real  2.02 user  0.00 sys

clang version 9.0.1
Target: x86_64-portbld-freebsd13.0
Thread model: posix
InstalledDir: /usr/local/llvm90/bin

-O
Subscripts:  2.09 real  2.09 user  0.00 sys
Pointers:    4.67 real  4.67 user  0.00 sys
-O2
Subscripts:  2.05 real  2.05 user  0.00 sys
Pointers:    4.62 real  4.62 user  0.00 sys

FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe)
Target: x86_64-unknown-freebsd13.0
Thread model: posix
InstalledDir: /usr/bin

-O
Subscripts:  4.62 real  4.62 user  0.00 sys
Pointers:    4.67 real  4.67 user  0.00 sys
-O2
Subscripts:  2.04 real  2.04 user  0.00 sys
Pointers:    4.61 real  4.61 user  0.00 sys
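For anyone reading without the repository checked out, here is a rough sketch of the kind of code being compared: a subscript and a pointer version of selection sort. This is not copied from Lang-speed; the function names, the int element type, and the small self-check are my own assumptions for illustration, and the actual benchmark sources may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Subscript variant (hypothetical name): walks the array by index. */
void selection_sort_subscript(int list[], size_t size)
{
    for (size_t start = 0; start + 1 < size; ++start) {
        size_t low = start;

        /* Find the index of the smallest remaining element. */
        for (size_t c = start + 1; c < size; ++c)
            if (list[c] < list[low])
                low = c;

        int temp = list[start];
        list[start] = list[low];
        list[low] = temp;
    }
}

/* Pointer variant (hypothetical name): same algorithm, walking pointers. */
void selection_sort_pointer(int *list, size_t size)
{
    if (size < 2)
        return;

    int *end = list + size;

    for (int *start = list; start < end - 1; ++start) {
        int *low = start;

        /* Find a pointer to the smallest remaining element. */
        for (int *p = start + 1; p < end; ++p)
            if (*p < *low)
                low = p;

        int temp = *start;
        *start = *low;
        *low = temp;
    }
}

/* Sanity check: both variants must produce the same sorted result. */
void check_variants_agree(void)
{
    int a[] = {5, 2, 9, 1, 7};
    int b[] = {5, 2, 9, 1, 7};
    size_t n = sizeof a / sizeof a[0];

    selection_sort_subscript(a, n);
    selection_sort_pointer(b, n);

    for (size_t i = 0; i < n; ++i)
        assert(a[i] == b[i]);
    for (size_t i = 1; i < n; ++i)
        assert(a[i - 1] <= a[i]);
}
```

Both functions perform the same comparisons and swaps, so any large timing difference between them at the same optimization level would come from how the compiler handles the two loop forms rather than from the algorithm.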
I confirmed that the same issue exists (sort of) on macOS with clang 13. The performance gap between subscripts and pointers is much less pronounced there, but still significant.
Just submitted an upstream issue: https://github.com/llvm/llvm-project/issues/53205