Bug 241402

Summary: Unable to installworld from read-only /usr/src and /usr/obj
Product: Base System
Component: misc
Status: Closed FIXED
Severity: Affects Only Me
Priority: ---
Version: CURRENT
Hardware: Any
OS: Any
Reporter: Navdeep Parhar <np>
Assignee: Dimitry Andric <dim>
CC: avg, bdrewery, dim, emaste
Keywords: regression

Description Navdeep Parhar freebsd_committer freebsd_triage 2019-10-21 23:37:42 UTC
installworld no longer works if /usr/src and /usr/obj are mounted read-only.
This seems to be related to the recent import of clang/llvm 9.0 as it used to
work previously.  The exact error with a fresh tree @ r353875 is shown here:

===> lib/clang (install)
===> lib/clang/libllvm (install)
===> lib/clang/libclang (install)
===> lib/clang/liblldb (install)
===> lib/clang/headers (install)
clang-tblgen -gen-arm-fp16  -I /usr/src/contrib/llvm/tools/clang/include/clang/Basic -d arm_fp16.h.d  -o arm_fp16.h /usr/src/contrib/llvm/tools/clang/include/clang/Basic/arm_fp16.td
clang-tblgen: error opening arm_fp16.h.d:Read-only file system
*** Error code 1

Stop.
make[6]: stopped in /usr/src/lib/clang/headers
*** Error code 1
Comment 1 Dimitry Andric freebsd_committer freebsd_triage 2019-10-22 05:12:34 UTC
Andriy also saw this recently, as he reported on -current, but I was not able to reproduce it.  Could any of you provide more information on what state the system was in before upgrading?
Comment 2 Dimitry Andric freebsd_committer freebsd_triage 2019-10-25 17:42:34 UTC
Ok, I think I have the reproduction scenario:
* rm -rf /usr/obj/*
* Build world at base r353357 (so just before clang 9.0.0 import)
* svn update src to base r353358
* Build world, now with -DNO_CLEAN
* Remount /usr/src and /usr/obj read-only
* Attempt installworld (I did DESTDIR=/foobar)

This ends with:

===> lib/clang/headers (install)
clang-tblgen -gen-arm-fp16  -I /usr/src/contrib/llvm/tools/clang/include/clang/Basic -d arm_fp16.h.d  -o arm_fp16.h /usr/src/contrib/llvm/tools/clang/include/clang/Basic/arm_fp16.td
clang-tblgen: error opening arm_fp16.h.d:Read-only file system
*** Error code 1

For some reason, it decides to regenerate those two headers, which are in the Makefile as:

GENINCS+=       arm_fp16.h
GENINCS+=       arm_neon.h

and later:

arm_fp16.h: ${CLANG_SRCS}/include/clang/Basic/arm_fp16.td
        ${CLANG_TBLGEN} -gen-arm-fp16 \
            -I ${CLANG_SRCS}/include/clang/Basic -d ${.TARGET:C/$/.d/} \
            -o ${.TARGET} ${CLANG_SRCS}/include/clang/Basic/arm_fp16.td

arm_neon.h: ${CLANG_SRCS}/include/clang/Basic/arm_neon.td
        ${CLANG_TBLGEN} -gen-arm-neon \
            -I ${CLANG_SRCS}/include/clang/Basic -d ${.TARGET:C/$/.d/} \
            -o ${.TARGET} ${CLANG_SRCS}/include/clang/Basic/arm_neon.td

CLEANFILES=     ${GENINCS} ${GENINCS:C/$/.d/}

I'm unsure why make thinks the headers are not up-to-date.
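
For reference, the timestamp rule that decides whether a target such as arm_fp16.h gets rebuilt can be sketched roughly as follows. This is only a simplified model written in Python, not bmake's actual implementation: the function name is_out_of_date is made up for illustration, and it ignores META_MODE and any other criteria make may apply.

```python
import os


def is_out_of_date(target, prerequisites):
    """Simplified model of make's rule (hypothetical helper, not bmake code):
    a target is out of date if it does not exist or if any prerequisite
    has a newer modification time than the target."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)
```

Under this model, if arm_fp16.td ends up newer than arm_fp16.h and the header's timestamp is never refreshed, every later make invocation, including the install target, would run the clang-tblgen rule again.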
Comment 3 Navdeep Parhar freebsd_committer freebsd_triage 2019-10-25 18:35:10 UTC
I do not use NO_CLEAN but I do use META_MODE.  Here are the exact settings
on these systems:

# cat /etc/src-env.conf 
WITH_META_MODE=yes

# cat /etc/src.conf 
WITH_OFED=yes
WITH_DEBUG_FILES=yes
WITH_SYSTEM_COMPILER=yes
WITHOUT_SENDMAIL=yes
WITHOUT_REPRODUCIBLE_BUILD=yes
Comment 4 Dimitry Andric freebsd_committer freebsd_triage 2019-10-25 20:48:59 UTC
Okay, I think I know what is likely causing this problem.  These headers are generated with the clang-tblgen tool, and this tool has a common part in contrib/llvm/lib/TableGen/Main.cpp which reads:

  // Write output to memory.
  std::string OutString;
  raw_string_ostream Out(OutString);
  if (MainFn(Out, Records))
    return 1;

  // Always write the depfile, even if the main output hasn't changed.
  // If it's missing, Ninja considers the output dirty.  If this was below
  // the early exit below and someone deleted the .inc.d file but not the .inc
  // file, tablegen would never write the depfile.
  if (!DependFilename.empty()) {
    if (int Ret = createDependencyFile(Parser, argv0))
      return Ret;
  }

  // Only updates the real output file if there are any differences.
  // This prevents recompilation of all the files depending on it if there
  // aren't any.
  if (auto ExistingOrErr = MemoryBuffer::getFile(OutputFilename))
    if (std::move(ExistingOrErr.get())->getBuffer() == Out.str())
      return 0;

I.e., it always writes the .d file, but if the .h file would have exactly the same contents, it does not overwrite it, so the timestamp of the .h file is not updated.  I think this was done in order to save on unnecessary rebuilding in the LLVM build system, but it works out badly for us.

Probably the simplest solution is to touch the generated files after tblgen, or remove them beforehand.  I will try a few things.
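
To illustrate the effect and the "touch afterwards" idea, here is a rough Python sketch. It is not the tblgen code and not the fix that was eventually committed; the names write_if_changed and generate, and the touch_after parameter, are hypothetical.

```python
import os


def write_if_changed(path, new_text):
    """Write new_text to path unless the file already holds exactly that
    content. Returns True if the file was (re)written."""
    try:
        with open(path, "r") as f:
            if f.read() == new_text:
                return False  # identical content: file and its mtime are left alone
    except FileNotFoundError:
        pass  # no existing output, so fall through and create it
    with open(path, "w") as f:
        f.write(new_text)
    return True


def generate(path, new_text, touch_after=False):
    """Hypothetical wrapper: run the write-if-changed step and, if requested,
    bump the timestamp of an unchanged output, like touch(1) would."""
    written = write_if_changed(path, new_text)
    if touch_after and not written:
        os.utime(path, None)  # set atime and mtime to the current time
    return written
```

With touch_after=False an unchanged header keeps its old modification time, which is the situation described above; with touch_after=True the header is always at least as new as the generation step, at the cost of rebuilding everything that depends on it.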
Comment 5 commit-hook freebsd_committer freebsd_triage 2019-10-29 16:52:08 UTC
A commit references this bug:

Author: dim
Date: Tue Oct 29 16:51:12 UTC 2019
New revision: 354146
URL: https://svnweb.freebsd.org/changeset/base/354146

Log:
  Pull in r373338 from upstream llvm trunk (by Simon Pilgrim):

    Revert rL349624 : Let TableGen write output only if it changed,
    instead of doing so in cmake, attempt 2

    Differential Revision: https://reviews.llvm.org/D55842
    -----------------
    As discussed on PR43385 this is causing Visual Studio msbuilds to
    perpetually rebuild all tablegen generated files

  Pull in r373664 from upstream llvm trunk (by Nico Weber):

    Reland r349624: Let TableGen write output only if it changed, instead
    of doing so in cmake

    Move the write-if-changed logic behind a flag and don't pass it with
    the MSVC generator. msbuild doesn't have a restat optimization, so
    not doing write-if-change there doesn't have a cost, and it should
    fix whatever causes PR43385.

  This should fix the scenario where an incremental build from before
  r353358 (the clang 9.0.0 upgrade) to r353358 or later fails to update
  the timestamp of the generated lib/clang/headers/arm_fp16.h header.

  After such a build, installing world from read-only source and object
  directories would attempt to generate the header again, leading to
  "clang-tblgen: error opening arm_fp16.h.d:Read-only file system".

  Reported by:	avg, np
  PR:		241402
  MFC after:	1 month
  X-MFC-With:	r353358

Changes:
  head/contrib/llvm/lib/TableGen/Main.cpp
Comment 6 Andriy Gapon freebsd_committer freebsd_triage 2019-10-29 16:55:30 UTC
I suspected that it was something like that.

Thank you very much for finding and importing the fix!
Comment 7 Ed Maste freebsd_committer freebsd_triage 2019-10-31 14:29:38 UTC
We should add a CI job to make sure that this doesn't regress.
Comment 8 Dimitry Andric freebsd_committer freebsd_triage 2019-10-31 21:24:12 UTC
(In reply to Ed Maste from comment #7)
Yes, this is maybe something for Li-Wen Hsu?
Comment 9 Ed Maste freebsd_committer freebsd_triage 2019-11-03 20:02:40 UTC
(In reply to Dimitry Andric from comment #8)
Yep, Li-Wen has added it to an ideas list.