Bug 196326 - framework bug : building occurs in stage phase
Summary: framework bug : building occurs in stage phase
Status: Open
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Ports Framework (show other bugs)
Version: Latest
Hardware: Any Any
: --- Affects Only Me
Assignee: Port Management Team
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-28 13:00 UTC by John Marino
Modified: 2017-04-10 14:29 UTC (History)
6 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description John Marino freebsd_committer 2014-12-28 13:00:02 UTC
By definition, all the building of a port should be done prior to staging (the phase that is concerned with the actual installation).

The rubygem framework does not respect this define.  If rubygems need native extensions, these are currently built during the stage phase, e.g.

=======================<phase: stage          >============================
===>  Staging for rubygem-cairo-1.12.9
===>   Generating temporary packing list
Building native extensions.  This could take a while...
unable to convert "\x90" from ASCII-8BIT to UTF-8 for lib/cairo.so, skipping
Successfully installed cairo-1.12.9
1 gem installed
Installing RDoc documentation for cairo-1.12.9...
====> Compressing man pages (compress-man)


The rubygem framework needs to be adjusted to build native extensions during the build phases.  No building should occur during the stage phase.
Comment 1 John Marino freebsd_committer 2015-02-23 09:57:14 UTC
Can a ruby person comment on this PR?

I know rubygem's are not the only ports violating the "don't build in stage" rule, but it needs to be fixed.  Is anybody trying to make rubygems behave?
Comment 2 Antoine Brodin freebsd_committer 2015-02-23 10:26:14 UTC
I guess this is an upstream problem,  that "rubygem install" does build stuff.
Comment 3 John Marino freebsd_committer 2015-02-23 10:36:10 UTC
Granted, but that doesn't mean it's impossible to fix at the framework level.  Maybe it is, but has anyone looked? 

I've recently fixed a port that built in stage, but I agree it would have to be a fix that works for all gems.
Comment 4 John Marino freebsd_committer 2015-03-27 09:40:19 UTC
I took a look at ${PREFIX}/lib/ruby/${ruby version}/rubygems/installer.rb

Currently the extension building is part of the installer function, and there is no setting to make it work differently.

However, I think it would be rather straightforward to add a new argument, e.g. "extensions-only=true" that would alter what the install routine does.

So basically the idea would be to add "install" in the do-build with the "extension-only=true" argument set in the Mk/bsd.ruby.mk file.

I think it's possible to make rubygems behave properly.  I don't know ruby at all but I could probably figure this out in one sitting.  I would think that somebody that actually knows ruby can do it faster.

Who is the rubygem guru?  or core team members?
Comment 5 John Marino freebsd_committer 2015-05-10 14:34:13 UTC
I'm adding mmoll@ to CC: since he seems to care about rubygems.  Maybe he'll care enough to fix them so they don't build extensions during staging (i described problem and potential solution already in the PR)
Comment 6 John Marino freebsd_committer 2016-09-01 17:38:00 UTC
I propose that rubygem framework be told to install to temporary stage directory during the build phase and then use the install command to move the installed files to the true stage directory during the install phase.

This is a flagrant disregard for the ports framework and it *really* needs to be fixed.  We actually saw rubygems timeout on synth watchdog under extreme contention because it was building when it shouldn't.

speaking of "timeouts" technically, if I provide a solution to this then a PR timeout will be in effect. 

Is there a primary person responsible for rubygems?  We seem to get them all the time but who is actually in charge of them?  the framework?
Comment 7 Mathieu Arnold freebsd_committer 2016-09-01 22:43:24 UTC
(In reply to John Marino from comment #6)
> Is there a primary person responsible for rubygems?  We seem to get them all
> the time but who is actually in charge of them?  the framework?

As they are all built using USES=gem, I'll let you wonder who is in charge of the build thing.

> speaking of "timeouts" technically, if I provide a solution to this then a
> PR timeout will be in effect. 

Technically, there can be no timeout, ever, on issues regarding the framework.
Comment 8 John Marino freebsd_committer 2016-09-02 21:35:09 UTC
(In reply to Mathieu Arnold from comment #7)
"As they are all built using USES=gem, I'll let you wonder who is in charge of the build thing."

Why be coy?  I asked a serious question and I would have like the respect of a serious answer.  Mk/Uses/gem.mk is listed as "portmgr" for maintainer.  I know for a fact that portmgr as a unit doesn't think rubygem framework is behaving correctly.

"Technically, there can be no timeout, ever, on issues regarding the framework."

This is actually a major issue and potential source of abuse that I hope is rectified in the future.  Nobody should have the ability to block good work for "reasons".

This bug report is coming up on 2 years of age.

I would like to discuss the potential fixes of this but the person in charge should be revealed.  If not, I will (begrudgingly) offer a solid fix for this 2 year old PR and I would like it not to be blocked without a good reason.  Aren't I doing everything correctly so far?  If not, what should I be doing instead?
Comment 9 Mathieu Arnold freebsd_committer 2016-09-02 21:42:56 UTC
I never said anything about blocking good work, I said no timeout can occur in this case.
This means that nobody can commit a patch that has not been approved, so that nobody breaks stuff.
Comment 10 John Marino freebsd_committer 2016-09-02 21:52:59 UTC
I've got several active examples of submissions that haven't been approved nor denied, but are in limbo and there's no timeout.  A person can use this rule to passively block any work.  It's a very bad situation.
Comment 11 Mathieu Arnold freebsd_committer 2016-09-02 22:07:26 UTC
If portmgr was one person, yes, it could, turns out, it is not.
If patches are in limbo because they have neither been approved or denied, it is mostly because they are not needed, or the submitter (or nobody else) cares about the issue.
Comment 12 John Marino freebsd_committer 2016-09-02 23:14:42 UTC
THis is unacceptable in my opinion.
All submissions deserve a disposition.  Somebody put in the work, a timely answer is more than just polite, they should be policy requiring such.

e.g. outright reject, needs work (and explicitly list the work), provisionally accepted, accepted.

No answer at all should NEVER been an option.  If it is, that policy needs to be changed.  Frankly I'm perturbed if you think limbo/no response is okay.
Comment 13 Mathieu Arnold freebsd_committer 2016-09-03 09:23:28 UTC
Good to know your opinion on the matter.

Now, the point is, you do not touch the ports framework, ever, without portmgr's approval.
Comment 14 John Marino freebsd_committer 2016-09-03 12:32:38 UTC
yes, we all know for a long time that portmgr makes itself exempt for the same rules it imposes on others, and the only way to get it changed is through portmgr so there's zero checks and balances here.  It's been a sore subject for a long time.  

Now that you iterated something that wasn't in question:

I'm VERY serious about fixing this.  What is the ACCEPTABLE way forward in doing so?  If I submit a patch to fix this, I EXPECT that it will be evaluated fairly and in a timely manner.  (the normal definitions of those words)

I also EXPECT, after all this talk, that some feedback about how this patch could look would be provided NOW.  Right now you are flexing your muscles but not providing any sort of help here.  Are you going to obstruct this or help?
Comment 15 John Marino freebsd_committer 2016-09-03 12:45:17 UTC
I just check SVN; it seems swills was the most involved with Mk/bsd.ruby.mk (the precessor) as well as gem.mk, so he probably he's the "person in charge" that I've been asking to be identified.

Steve, can we work on this and finally close this PR?
Comment 16 Mathieu Arnold freebsd_committer 2016-09-04 10:03:44 UTC
As a side note, if all you do is complain that this does not go forward, it will not go forward.  If you provide a patch that is acceptable, it may.
Comment 17 John Marino freebsd_committer 2016-09-04 12:31:25 UTC
if you re-read it carefully, I said that I would work on a patch *after* the concept had been discussed so I that wouldn't waste my time.

For that to happen, the "Person in charge" would have had to be revealed and the patch concept be discussed.  So far, nobody, and I mean nobody, on the python team has made any sort of indication that they want to fix this flaw or at least want to spent any time doing it.

It's the python's team responsibility to fix it.  I offered to do the work because python team won't.  I don't want to do it.  I shouldn't have to do it.  Once the flaw was reported, that should have been enough to get a fix started.

As I said before, the safest fix appears to be to move the "rubygem install" install command into the build stage and use a temporary staging directory (not $STAGEDIR).  The install target would move the files from that temporary dir to $stagedir.

That should solve the issue without having to fix the awful rubygem native extension code.
Comment 18 Mathieu Arnold freebsd_committer 2016-09-04 13:20:17 UTC
Like I said a few times already, if something is done, something may happen.

Now, I'll stop wasting my time on this PR.
Comment 19 John Marino freebsd_committer 2016-09-04 13:31:31 UTC
Or, as could happen MANY MANY times before, I could submit something that just gets ignored for no good reason.  My time is just as valuable as yours.

All I asked for is:
1) identify somebody on the python team to "take lead" on this issue
2) have that person discuss the fix concept
3) have that person approve the patch (together with portmgr)

You have done NONE OF THAT.  I don't get why you think you've being so helpful here.  You're not really helping at all.  I don't know why you responded to the PR at all if you are going to take this tact.
Comment 20 John Marino freebsd_committer 2016-09-04 21:11:22 UTC
i keep saying "python", obviously I meant "ruby".
Apparently my brain lumps these together.
Comment 21 Steve Wills freebsd_committer 2017-01-04 22:36:59 UTC
I agree that building during install is an issue, but I don't consider it a major one. A patch to fix it would be a good place to start discussion. We would also want to discuss with upstream.

Given this bug has been open so long without a patch, going to close for now until there is something more concrete to discuss.
Comment 22 John Marino freebsd_committer 2017-01-04 23:25:27 UTC
For the record, I find this disposition UNACCEPTABLE.

If anybody is responsible for a patch, it would be the RUBY TEAM which up until now has completely shirked this duty.

Now, there is a possibility to move this issue above RUBY because there are other ports that are guilty of the same infraction.  Thus, there could be a ports-level fix that the ruby.mk incorporates automatically.

However, this issue is ongoing and must be documented.  please reopen it. Just because a patch hasn't been contributed doesn't mean it won't.

Please do not sweep this fault under the rug.  It must be kept visible.
Comment 23 John Marino freebsd_committer 2017-01-06 14:52:33 UTC
Reopen the PR.

This PR is as valid today as the day it was originally submitted.  We don't close accurate PRs that are technically possible to fix.

However, the core issue is swills, a member of both the ruby team and portmgr doesn't "consider [this to be] a major [problem]" (paraphrased).  This is absolutely something the portmgr should be enforcing.  I'm going to figure out where this is documented and if necessary get it documented so there is some traceability here.

In the meantime, remove ruby@ from assignment since it's clear that the solution will not come from that team.
Comment 24 Adam Weinberger freebsd_committer 2017-01-06 15:34:31 UTC
Why? What's the benefit?

As you mentioned earlier, it's not just Ruby that builds during the install phase. Even many libtool-based ports appear to relink during the install phase. Patching that behaviour out sounds pretty fragile to me, especially given that I don't see what the gain is.
Comment 25 John Marino freebsd_committer 2017-01-06 15:39:15 UTC
the true fix is to add a new phase to the build sequence, one that is off by default but can be turned on via port makefile or bsd.ruby.mk.

Basically it would consider "staging" as a second install phase that installs in a second DESTDIR, and stage would move the installed files from DESTDIR2 to the expected DESTDIR.

(s/DESTDIR/STAGEDIR/)

Divisions is phases must be respected.  

Ports that fetch during building are considered illegal.  This is no different.

IMO, portmgr must reinforce this as a rule and technically if possible.  I think it's possible but requires a modest update to bsd.port.mk, something requiring some testing.
Comment 26 Adam Weinberger freebsd_committer 2017-01-06 15:46:41 UTC
There's a practical reason for fetching being disallowed during build. The builders can't dial out at all, which makes sure that nonsense like BitchX dialing home during build is prevented, and it makes sure that all the builders are using the exact same distfiles for build.

However, what is the benefit to inserting a second install phase? It's still building during an install phase, but now we're just calling the first install phase a mulligan.

I'm assigning this PR to you, John, as you're the primary interested party in this.
Comment 27 Steve Wills freebsd_committer 2017-01-06 15:50:58 UTC
Just changed title so that it doesn't show up in my search for ruby stuff since this isn't really related to ruby at all.
Comment 28 John Marino freebsd_committer 2017-01-06 15:52:31 UTC
No, I wrote the PR.
I may or may not contribute a patch, when I have time, but I don't have time.

I do not want to block somebody else from contributing a fix.
It is fine as "generally open".

It is a shame that I am the primary interested party because it should be portmgr that is primary interested.  It's in their job description as THEY describe it.

Correct division of tasks should be obvious.  Hopefully we can just agree that not only that it "should" work as designed, but that it "must" work as designed.

no building or linking should happen during staging, period.  If it does, it's wrong, and it should be fixed.

Many individual ports have violated this and have been fixed (e.g. lang/racket).  The fact that ruby violates this hundreds of times is a problem.  The fact that there are other violators doesn't alleviate rubygems from being a very bad ports citizen.
Comment 29 John Marino freebsd_committer 2017-01-06 15:55:29 UTC
(In reply to Steve Wills from comment #27)
Oh yes it is.  It's very related to ruby.  The primary driver for a ports-wide fix is directly caused by ruby being a bad citizen (being a big enough problem to invoke change).
Comment 30 Steve Wills freebsd_committer 2017-01-06 16:26:58 UTC
No, it is not.

The framework or the builder could just as well make the build dir read-only during the install phase, allowing only changes to the install dir during install. This would enforce the requirement globally without any change to ruby. It would also break a lot of stuff, including rubygem install. Which would then be a reason to create bug reports for the things that break, including rubygem ports.

And I re-iterate Adams question of what real practical benefit this has? You say "no building or linking should happen during staging, period." I want to know why? Where is the rule written that I'm unable to find at the moment? And was the rule created because that was sort of the expectation at some point, but the rule wasn't enforced and sort of got ignored?

So what is the real problem that is happening that we are trying to solve here in this problem report? And given we are all working with limited time and resources, what is the benefit to users of putting effort into solving this problem?
Comment 31 John Marino freebsd_committer 2017-01-06 16:35:05 UTC
> No, it is not.

I've explained my position. 
I've accused the ruby team of shirking their duty and I truly believe it.
I also realize that you disagree that it's a problem and thus the team refused to come up with a ruby-specific fix for the problem.
There's no more point in arguing if ruby's position is bad or not.  I think it is and you won't change my mind and apparently you've dug your heels in too.

> The framework or the builder could just as well make the build dir read-only during the install phase, allowing only changes to the install dir during install. This would enforce the requirement globally without any change to ruby. It would also break a lot of stuff, including rubygem install. Which would then be a reason to create bug reports for the things that break, including rubygem ports.

Yes, that's possible but I didn't ask for that.
I was, however, considering improving QA tests by adding detection would report when a port is violating it. It's probably better to have an easy makefile solution available before that though, so people can actually easily fix it.

There are real practical benefits for keeping to the design of the framework.  I'll leave it as a mental exercise to figure it out.  Just because you don't see them doesn't mean there aren't very good reasons to enforce these things.  Or do you think I just pick random obscure things to dig my heels on?
Comment 32 Steve Wills freebsd_committer 2017-01-06 16:44:42 UTC
(In reply to John Marino from comment #31)

"Or do you think I just pick random obscure things to dig my heels on?" -- Without an answer to my questions, I have no other choice.
Comment 33 John Marino freebsd_committer 2017-01-06 16:52:11 UTC
(In reply to Steve Wills from comment #32)
Sure you have.  Good reasons for things can exist even if you don't have the background to see them.  Don't put ruby's failure to follow good design on my reluctance to spoonfeed you reasons why the rubygem maintainers should care about this.

The big question is whether such a requirement is clearly documents.  I think it's obvious implicitly but I don't know if it's been explicitly documented.  If it hasn't, it definitely needs to be so we can:
1) justify the framework-level fix
2) justify the associated QA test
3) justify modifying build tools to detect it (after the tree has been fixed the first time with EXP-RUNS)

without such documentation, I'm basically dead in the water.  I've asked bapt but he hasn't responded yet.
Comment 34 Steve Wills freebsd_committer 2017-01-06 17:03:07 UTC
(In reply to John Marino from comment #33)
It's not just ruby that doesn't follow the clear build/install separation. Lots of things, such as libtool as Adam already mentioned, or other language specific package managers do the same.

So, why are they wrong?

We can't document a rule that has no reason to exist.
Comment 35 John Marino freebsd_committer 2017-01-06 17:07:36 UTC
why don't you answer why are they correct first?
As a bonus, you can answer why the ports tree should not enforce the separation of activities.  It's obviously technically possible to do, so why shouldn't it be done?

Don't use "volunteer effort" as a reason. Assume somebody would spend time fixing the ports tree if it were clear it needed to happen.

Also note that I'm not excluding myself from doing it.  I just can't do it in the near future.
Comment 36 Steve Wills freebsd_committer 2017-01-06 17:09:42 UTC
(In reply to John Marino from comment #35)
They are correct because they are not incorrect. Building during install causes no problem that I know of.
Comment 37 John Marino freebsd_committer 2017-01-06 17:16:28 UTC
Just drop out of the bug report if you're going to be like that.
The title has been changed so you aren't bothered with your personal bug searches and it assignment has been moved away from both portmgr and ruby team.  You don't have to participate involuntarily anymore.

Any port that builds during installation is wrong, period.  If you're seriously not going to agree with that, 1) we're too far apart to continue discussion and 2) I'm seriously worried about all technical proposals that have to get approved.  That's not a troll, that's a true queasiness.
Comment 38 Adam Weinberger freebsd_committer 2017-01-06 17:18:46 UTC
John I'm still looking for an explanation of what the benefit is.
Comment 39 Steve Wills freebsd_committer 2017-01-06 17:22:28 UTC
(In reply to John Marino from comment #37)
But I really want to know the answer to the question of why this is a problem, because I just can't figure it out.
Comment 40 John Marino freebsd_committer 2017-01-06 17:28:26 UTC
Here's a small example.
Take the official poudriere builder as an example (synth is the same):

--------------------------------------------------------------------------------
--  Phase: stage
--------------------------------------------------------------------------------
===>  Staging for rubygem-rbtrace-0.4.8_1
===>   Generating temporary packing list
(cd /construction/devel/rubygem-rbtrace/rbtrace-0.4.8; /usr/bin/env RB_USER_INSTALL=yes LANG=C LC_ALL=en_US.UTF-8 LC_CTYPE=UTF-8 /usr/local/bin/gem22 install -l --no-update-sources --install-dir /construction/devel/rubygem-rbtrace/stage/usr/local/lib/ruby/gems/2.2 --ignore-dependencies --bindir=/construction/devel/rubygem-rbtrace/stage/usr/local/bin --rdoc --ri rbtrace-0.4.8.gem -- --build-args )
Building native extensions with: '--build-args'
This could take a while...
ERROR:  Error installing rbtrace-0.4.8.gem:
	ERROR: Failed to build gem native extension.

    current directory: /construction/devel/rubygem-rbtrace/stage/usr/local/lib/ruby/gems/2.2/gems/rbtrace-0.4.8/ext
/usr/local/bin/ruby22 -r ./siteconf20170106-942188-1jwk22i.rb extconf.rb --build-args
  -- tar zxvf msgpack-1.1.0.tar.gz
  -- env CFLAGS=nil LDFLAGS=nil CC=nil
  -- ./configure --disable-dependency-tracking --disable-shared --with-pic --prefix=/construction/devel/rubygem-rbtrace/stage/usr/local/lib/ruby/gems/2.2/gems/rbtrace-0.4.8/ext/dst/ --libdir=/construction/devel/rubygem-rbtrace/stage/usr/local/lib/ruby/gems/2.2/gems/rbtrace-0.4.8/ext/dst/lib
  -- make install
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of necessary
libraries and/or headers.  Check the mkmf.log file for more details.  You may
need configuration options.

Provided configuration options:
	--with-opt-dir
	--without-opt-dir
	--with-opt-include
	--without-opt-include=${opt-dir}/include
	--with-opt-lib
	--without-opt-lib=${opt-dir}/lib
	--with-make-prog
	--without-make-prog
	--srcdir=.
	--curdir
	--ruby=/usr/local/bin/$(RUBY_BASE_NAME)22
extconf.rb:6:in `sys': make install failed, please report to https://github.com/tmm1/rbtrace/issues (RuntimeError)
	from extconf.rb:43:in `block (2 levels) in <main>'
	from extconf.rb:34:in `chdir'
	from extconf.rb:34:in `block in <main>'
	from extconf.rb:30:in `chdir'
	from extconf.rb:30:in `<main>'

To see why this extension failed to compile, please check the mkmf.log which can be found here:

  /construction/devel/rubygem-rbtrace/stage/usr/local/lib/ruby/gems/2.2/extensions/x86_64-unknown/2.2/rbtrace-0.4.8/mkmf.log

extconf failed, exit code 1

Gem files will remain installed in /construction/devel/rubygem-rbtrace/stage/usr/local/lib/ruby/gems/2.2/gems/rbtrace-0.4.8 for inspection.
Results logged to /construction/devel/rubygem-rbtrace/stage/usr/local/lib/ruby/gems/2.2/extensions/x86_64-unknown/2.2/rbtrace-0.4.8/gem_make.out
*** Error code 1

Stop.
make: stopped in /xports/devel/rubygem-rbtrace




Both builders list the failed phase.
They list it as stage failure.
stage failures normally imply missing files or files installed in the wrong place.

They should not EVER imply a build failure.

If you think this is okay, then you should also agree that build failures during configure phase is fine, or configure failures in fetch phase is fine.

There are phases that exist for a reason and the activities that those phases do should be strictly enforced.


otherwise you guys might as well argue for combining all the phases into a single one.
Comment 41 Steve Wills freebsd_committer 2017-01-06 17:46:18 UTC
(In reply to John Marino from comment #40)
Thanks, that's really helpful. But I can't seem to reproduce this. Here's the relevant part of my log:

===>  Staging for rubygem-rbtrace-0.4.8_1
===>   Generating temporary packing list
(cd /wrkdirs/usr/ports/devel/rubygem-rbtrace/work/rbtrace-0.4.8; /usr/bin/env RB_USER_INSTALL=yes LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_CTYPE=UTF-8 /usr/local/bin/gem22 install -l --no-update-sources --install-dir /wrkdirs/usr/ports/devel/rubygem-rbtrace/work/stage/usr/local/lib/ruby/gems/2.2 --ignore-dependencies --bindir=/wrkdirs/usr/ports/devel/rubygem-rbtrace/work/stage/usr/local/bin --rdoc --ri rbtrace-0.4.8.gem -- --build-args )
Building native extensions with: '--build-args'
This could take a while...
(eval):8: warning: regular expression has ']' without escape: /\\]cl | cl.exe | *[/
(eval):22: warning: string literal in condition
(eval):22: warning: string literal in condition
Successfully installed rbtrace-0.4.8
Parsing documentation for rbtrace-0.4.8
Installing ri documentation for rbtrace-0.4.8
Installing darkfish documentation for rbtrace-0.4.8
Done installing documentation for rbtrace after 70 seconds
1 gem installed
/bin/rm -f -r /wrkdirs/usr/ports/devel/rubygem-rbtrace/work/stage/usr/local/lib/ruby/gems/2.2/build_info/
/usr/bin/find /wrkdirs/usr/ports/devel/rubygem-rbtrace/work/stage/usr/local/lib/ruby/gems/2.2 -type f -name '*.so' -exec /usr/bin/strip {} +
/usr/bin/find /wrkdirs/usr/ports/devel/rubygem-rbtrace/work/stage/usr/local/lib/ruby/gems/2.2 -type f \( -name mkmf.log -or -name gem_make.out \) -delete
/bin/rm -f -r /wrkdirs/usr/ports/devel/rubygem-rbtrace/work/stage/usr/local/lib/ruby/gems/2.2/gems/rbtrace-0.4.8/ext  /wrkdirs/usr/ports/devel/rubygem-rbtrace/work/stage/usr/local/lib/ruby/gems/2.2/cache 2> /dev/null || true
/bin/rmdir /wrkdirs/usr/ports/devel/rubygem-rbtrace/work/stage/usr/local/lib/ruby/gems/2.2/extensions 2> /dev/null || true
====> Compressing man pages (compress-man)

What could be causing yours to fail where mine is not?
Comment 42 John Marino freebsd_committer 2017-01-06 17:52:40 UTC
I'm not trying to solve individual rubygem failures, I was just giving an example of a port that fails during the stage phase due to building and not doing typical staging activities.

For rubygems, they are building "native extensions".  

This is just an example of a native extension failing to build on DF.

When I see a list of failed ports on a poudriere report, I should see the rubygem-rbtrace failed during building, not during staging.

It's not the only reason.  building in staging causing problems for build watchdogs.  Something that should take less than 5 minutes now has to extend a watchdog timeout as long as a build phase.

In other words, how long does it take to copy a bunch of files vs build libraries and executables?  detecting build timeouts during staging is now severely impacted because the watchdog has to consider that the port is building.  So that's a second, very legitimate reason for enforcing activity separation.
Comment 43 Steve Wills freebsd_committer 2017-01-06 17:59:25 UTC
(In reply to John Marino from comment #42)
But this port doesn't fail during stage phase for me. Maybe you have another example I could try? If it only fails on DF then that's a problem for their bug tracker, not ours.

I'm not sure I understand what you're saying about timeouts, sorry.
Comment 44 John Marino freebsd_committer 2017-01-06 18:05:23 UTC
swills, you are gravely missing the point.

ALL rubygems that build native extensions build in stage.
It is not pertinent to this discussion if the build succeeds or not.
to iterate, I AM NOT REPORTING INDIVIDUAL RUBYGEM failures.

I am stating that if there is any building during stage, regardless if it fails or succeeds, this is wrong.  That's it.  Or in other words: succeeding to build during the stage phase doesn't mean building in stage phase is okay.


Regarding watchdogs:
It's a daemon that monitors a build so it can kill it if it stalls.  It's based on inactivity and measuring time.  If no building in staging were 100% respected, then staging would be a relatively short phase.

In other words, you can see a legitimate no-activity for 40 minutes during build phase (e.g. linking gcc or libreoffice on a heavily loaded or weak machine) but 40 minutes of inactivity for typical staging activities means it stalled.  But because building has leaked into staging, the watchdog can no longer put appropriate timeout limit on it.
Comment 45 Steve Wills freebsd_committer 2017-01-06 18:12:01 UTC
(In reply to John Marino from comment #44)
If they are all doing something wrong (or at least all the ones that build native extensions, which this one does)then there should be a problem you can demonstrate in any of them (including this one). I don't see a problem here.

But if they are wrong even though they are building then I must be misunderstanding. I thought you were saying that build log you posted was evidence of a problem. So if that's not the case, then what am I meant to see there? Or what is the reason that they are wrong even though they build and work properly as far as I can tell?

As for the watchdogs, OK, timeouts are useful, but I don't see an issue.
Comment 46 John Marino freebsd_committer 2017-01-06 18:19:36 UTC
any building == wrong.

I showed a random rubygem failing in the stage phase where it was obvious it was building.  Obviously if the build succeeds, it's hard to detect.  I imagine it would take looking at timestamps to figure it out.

It was also illustrating that it's hindering diagnosis.  Build issues should not be marked as stage failures, period.  That's wrong to do.

I really can't be more clear than this.
You asked for examples, I've given you a couple of them.  If you continue to think building in stage is okay then that's our disconnect.


The easiest solution is to create a variable e.g.
BUILDS_IN_STAGE=yes
for the makefile that the rubygems mk would set which would call "install" target during the build phase and change the location of STAGEDIR.  Then the stage phase would simply copy the contents of STAGEDIR2 to STAGEDIR and whatever post-stage targets do.

It's not particularly hard to fix.  For those s/w that fail to separate install from build, we split install into 2 parts and move the first part to build phase.  That is a solution that works for all ports that violate this premise.
Comment 47 Steve Wills freebsd_committer 2017-01-06 18:24:57 UTC
(In reply to John Marino from comment #46)
What makes building in stage phase a problem? You may call it wrong, but I'm asking why it is a problem.
Comment 48 John Marino freebsd_committer 2017-01-06 18:27:23 UTC
re-read what I wrote, as many times as it takes.  Hopefully you'll finally understand what's been repeatedly said.  I can't do any more.  You've been given enough evidence already.
Comment 49 Steve Wills freebsd_committer 2017-01-06 18:35:52 UTC
(In reply to John Marino from comment #48)

I think I get what you're saying as far as it being wrong. I don't see why you think it's a problem.

If we create a second stage phase and then work around the gem issue by adding BUILDS_IN_STAGE=yes to all of gems that build native extensions, and everything else that breaks the rule, what problem have we solved?
Comment 50 John Marino freebsd_committer 2017-01-06 18:43:42 UTC
it would be added to ALL rubygems.  Nobody is going to waste time figuring out which ones might build native extensions.  Have BUILDS_IN_STAGE=yes when it's not needed doesn't hurt anything, other than some extraneous copying (or perhaps moving) that's not worth trying to optimize.

As for what it's solves, that has definitely been answered.
It fixes both the wrong failure diagnostics (poudriere and synth will properly indicate build failures, not stage failures) and it allows watchdog timeout refinements.

The builders should not be spending 3 hours in stage (as we saw for lang/racket).  This is good for package builders too.
Comment 51 Steve Wills freebsd_committer 2017-01-06 18:49:39 UTC
(In reply to John Marino from comment #50)

Adding extra disk IO that slows down all builds for the sake of improving failure diagnostics and improving time-outs that shouldn't be hit anyway seems like a bad trade-off to me.
Comment 52 John Marino freebsd_committer 2017-01-06 18:56:54 UTC
1) You, as a portmgr, should be striving for correctness IMO.  If you (as portmgr) go down the shortcut road now after portmgr has been shouting for correctness for these last few years, you've undercut all the (unpopular) stances taken wrt to older freebsd release support and staging.

2) at worst we're talking about moving directories which are metadata changes.  Technically you could just move the pre- and main install steps to building stage and leave post-stage as the sole actively for staging.  In that case there is zero IO difference.

I have to point out the obvious:
You're part of the team that first missed this violation and then refused to fix it.  You're biased; it's in your interest to downplay this as a problem.  Right?  Hopefully someone less biased will be more prone to side with correctness.
Comment 53 Steve Wills freebsd_committer 2017-01-06 19:03:50 UTC
(In reply to John Marino from comment #52)
No, I support correctness, but engineering is about making the right trade-offs. If we had a patch that did what you suggested, I'd say lets consider it. If there's no downside, and it makes things more correct and solved the problem, great. I was well aware of ruby doing this long before this PR was created.
Comment 54 John Marino freebsd_committer 2017-01-06 19:13:59 UTC
without looking into details, it seems that if 
BUILDS_IN_STAGE 
were defined and it caused pre-install and do-install to be run as part of build target (and obviously skipped pre-install and do-install for install target), it would solve everything and it wouldn't be a performance hit.  You're just shifting the work to an earlier target.

I don't think the patch would be that big but it would be in bpm so obviously it would require review.

Follow up work:
1) make rubygems set this variable unconditionally
2) add a QA-method so violators can be detected as part of QA checks

extended follow-up work
3) dedicated EXP runs to identify violators (either through grepping logs after a full run, or modifying build tools or framework to set STAGEDIR to READ-ONLY during build phase so that builds fail obviously
4) after tree fully fixed, incorporate READ-ONLY STAGEDIR into tree to catch future violations immediately.
Comment 55 Steve Wills freebsd_committer 2017-01-06 19:25:37 UTC
(In reply to John Marino from comment #54)
It would require review and exp-run, but adding the exp-run is no problem.

I would rather see BUILDS_IN_STAGE added on a case by case basis rather than wholesale to all rubygem- (or other category of) ports. I'm willing to do that work at least for rubygem, there aren't that many really.

QA method, fixing and then making it mandatory by read-only stagedir sounds fine and like the standard way of doing things. But we still have no patch so this is all theoretical.
Comment 56 John Marino freebsd_committer 2017-01-06 19:51:06 UTC
> I would rather see BUILDS_IN_STAGE added on a case by case basis rather than wholesale 

Given that rubygems don't respect building/install lines, there's no benefit for spending a second of manpower to do this.  Do you have an idea what portion of rubygems might build native extensions (which might in itself be platform-specific)?  I wouldn't be surprised if it's 25-60% (although you seem to think that number is much less).  You mentioned trade-offs and I don't think per-rubygem tweaks are a good investment of time.

> But we still have no patch so this is all theoretical.

The patch is not that hard to conceive. Yes, it's "in theory" but there's no reason to believe this can't easily be implemented. I don't see any technical barrier.
Comment 57 Bryan Drewery freebsd_committer 2017-01-06 20:36:37 UTC
(In reply to John Marino from comment #6)
> I propose that rubygem framework be told to install to temporary stage
> directory during the build phase and then use the install command to move
> the installed files to the true stage directory during the install phase.
> 

I don't like that something is building in the install/stage phase, but double staging it would just increase the build time and space used for little benefit.  While building in stage is pedantically wrong, so is staging in build phase.

This is just an inherit problem in trying to use a 3rd party package manager to build the port.
Comment 58 Bryan Drewery freebsd_committer 2017-01-06 20:42:56 UTC
(In reply to Steve Wills from comment #53)
> (In reply to John Marino from comment #52)
> No, I support correctness, but engineering is about making the right
> trade-offs. [...] I was well aware of ruby doing this long before
> this PR was created.

Same here. This problem doesn't seem worth fixing to me. It's just overengineering. While the integration with rubygems is not perfect, it works. 

I don't see a need for a new BUILD_IN_STAGE kind of thing either as this is only impacting an external package builder (meaning not part of the ports framework) and likely only in the rubygem case.  It could easily just check for 'gem' is in the USES list and make its own exception and use a higher timeout for the install phase.
Comment 59 John Marino freebsd_committer 2017-01-06 20:51:45 UTC
>  While building in stage is pedantically wrong, so is staging in build phase.

Yet I gave you 2 very real examples of why this is more than pedantic.

Bryan, I can count on you to oppose every damn thing I say, automatically.  Like GEICO, it's what you do.  You even go out of your way to do this stuff to me (I will forever remind of the fortune incident)

It's an issue.
It's an issue that should be fixed.
It's not the most pressing issue anywhere.

The proposed fix has NO drawbacks.
If somebody has to choose between:
A) building during stage -or-
B) partial installing during building
the smart choice is B.
Comment 60 Steve Wills freebsd_committer 2017-01-06 21:26:15 UTC
(In reply to John Marino from comment #59)

I think we can't really evaluate drawbacks until we have a patch.

If it just comes down to either build while staging or install while building, what's the difference?
Comment 61 John Marino freebsd_committer 2017-01-06 21:35:55 UTC
seriously?  "what's the difference"?

There are 60 comments on this PR already.  The "difference" has been answered a few times in different ways.  There's no point in repeating the answer again and again and again.

And if you consider both to be the same, then you shouldn't care (the path of least resistance would be the go with correctness).
Comment 62 Bryan Drewery freebsd_committer 2017-01-06 21:55:38 UTC
(In reply to John Marino from comment #59)
> >  While building in stage is pedantically wrong, so is staging in build phase.
> 
> Yet I gave you 2 very real examples of why this is more than pedantic.
> 

The examples I see from you are of build failures in stage. That is not disputed at all. What is disputed is the cost-benefit tradeoff of double staging just to make the build error show in "build" phase rather than "stage" in a log.  That's what I refer to as pedantic; what word shows in the log.

> Bryan, I can count on you to oppose every damn thing I say, automatically.

It's just my philosophy and nothing personal.
 
> Like GEICO, it's what you do.  You even go out of your way to do this stuff
> to me (I will forever remind of the fortune incident)

That was not a unilateral decision, it was based on discussion with other portmgr at the time.

> 
> It's an issue.

Yes

> It's an issue that should be fixed.

Sure, but the priority is very low. I don't like how rubygems (and other external packages) are integrated in the framework.

> It's not the most pressing issue anywhere.

Yes

> 
> The proposed fix has NO drawbacks.

Double staging means extra space used, possibly extra ram on tmps, and extra time.

> If somebody has to choose between:
> A) building during stage -or-
> B) partial installing during building
> the smart choice is B.
Comment 63 John Marino freebsd_committer 2017-01-06 21:56:22 UTC
for the record, waiting for a patch is fine with me.

All I wanted was the PR to be reopened, not solved today.
Comment 64 Steve Wills freebsd_committer 2017-01-06 21:59:56 UTC
(In reply to John Marino from comment #61)

I hadn't really considered that BUILDS_IN_STAGE would really just be moving the problem around, but when you put it that way I realized it. I think until upstream changes that's really all we'd be doing, and neither build while staging or install while building, is more correct, afaict. There was mention of a change in gem earlier in the discussion.
Comment 65 Bryan Drewery freebsd_committer 2017-01-06 22:01:19 UTC
Outside of package builders, I don't like that 'make build' in the port directory does not actually build the port for gems.  I see that as a workflow problem.  It has confused me in the past.  At least 'make' defaults to 'make stage', so the problem is hidden by default.
Comment 66 John Marino freebsd_committer 2017-01-06 22:03:09 UTC
> "double staging"
> Double staging means extra space used, possibly extra ram on tmps, and extra time.

That was a early spitball.  It was changed to shifted staged.  No redundant work.  You are responding to the wrong thing.

> It's just my philosophy and nothing personal.
> That was not a unilateral decision, it was based on discussion with other portmgr at the time.

If the "other portmgr" was the one that actually reverted the commit including good non-related changes (while here stuff), that was equally bush-league and not supported by precedent.

The bottom line is that I hope you thoroughly enjoyed that.  I'm am actively avoiding engaging with you as a result; it was the last straw.  I feel the other portmanagers can handle any PRs or reviews or whatnot.  We don't have to interact.
Comment 67 Bryan Drewery freebsd_committer 2017-01-06 23:39:47 UTC
(In reply to John Marino from comment #66)
> > "double staging"
> > Double staging means extra space used, possibly extra ram on tmps, and extra time.
> 
> That was a early spitball.  It was changed to shifted staged.  No redundant
> work.  You are responding to the wrong thing.

Ah you're right. I missed that... the thread is too long.

Moving build+stage to build and making stage a NOP for these ports seems like a reasonable tradeoff to me vs the current situation.


> 
> > It's just my philosophy and nothing personal.
> > That was not a unilateral decision, it was based on discussion with other portmgr at the time.
> 
> If the "other portmgr" was the one that actually reverted the commit

"other portmgr" was plural, sorry for the ambiguity.  It was not in a vaccuum either, it was after a very lengthy email thread about it.

> including good non-related changes (while here stuff), that was equally
> bush-league and not supported by precedent.
> 
> The bottom line is that I hope you thoroughly enjoyed that.  I'm am actively
> avoiding engaging with you as a result; it was the last straw.  I feel the
> other portmanagers can handle any PRs or reviews or whatnot.  We don't have
> to interact.

I think you frequently miss how much we actually agree on, but get caught up
thinking we have a conflict or end up causing one.  In this thread (bug) for example, I have been nothing but professional and trying to stick to the
technical facts and merits of the discussion.  Just because someone disagrees with something you say does not immediately invalidate all of your side of the conversation and mean it needs to be escalated; it's just that, a conversation/discussion.  There's no need to jump to emotions and non-technical aspects.

As for BUILDS_IN_STAGE, I just prefer to keep things simple and would rather
just do this for all of rubygems and case-by-case otherwise.  Note that means I am in disagreement with Steve.  It's normal to have disagreements.  I also just don't see the benefit in mandating that things don't build in stage through some QA check.  It doesn't mean I'm right though.