Rodrigo Vivi: Challenges in Upstream vs. Embargoed Development in Intel Graphics
September 9, 2018
Related Material:
- Handling of embargoed security issues
(albeit a somewhat different type of embargo).
- The hidden costs of embargoes (Red Hat Security Blog).
- Addressing Meltdown and Spectre in the kernel.
- CVE-2018-5390 and “embargoes”.
Additional Participants:
Arnd Bergmann,
Daniel Vetter,
Greg KH,
Jani Nikula,
Jon Masters,
Leon Romanovsky,
Linus Walleij,
Mark Brown, and
Sean Paul.
Rodrigo
Vivi
suggested a discussion on embargo vs. upstream development, hoping to
promote an upstream-always mentality.
Leon Romanovsky
asked for clarification, wondering why knowledge of internal Intel development
was useful outside of Intel.
Rodrigo
stated that he was concerned not about government restrictions or security
embargoes, but rather embargoes due to internal Intel restrictions.
He also suspects that Intel is not the only organization with this sort
of challenge.
Leon
was happy that Rodrigo favored upstream-first development, but based on
his own experience with Mellanox believes that the required changes are
inside Intel rather than outside
(Greg KH
agrees).
Leon also suspects that Rodrigo is struggling with the business
justification for reducing the embargo.
Daniel Vetter
believes that this problem will persist until open-source hardware becomes
ubiquitous, and states that this is not just an Intel problem.
Greg KH
wondered why this problem was specific to graphics, but agreed that a
discussion on how to handle pre-release hardware and upstream drivers
would be a nice proposal.
Daniel
agreed that it might be good to also look outside of graphics.
Mark Brown and
Leon
agreed that this would be a good topic, and Leon further asked that
Rodrigo or Daniel share their pain points.
Rodrigo
responded with the following:
- Rebasing from mainline on top of LTS is problematic because DRM
moves quickly, which will likely eventually require a rewrite
of LTS's version of DRM.
Leon
suggests basing code on the latest -rc from Linus.
- Everything is to be upstreamed as soon as the embargo lifts,
which requires tracking not only the required patches, but also
their history.
Leon
suggests that several internal developers be responsible for
upstreaming, but that original patch authors be encouraged
to respond to external mailing-list discussions.
- Code-review quality suffers due to the big-bang patch-release
to upstream.
Leon
suggested avoiding staging, citing “constant nightmare
with lustre”.
- Demanding good internal reviews causes problems due to the
patches having “Reviewed-by” when they first appear
in public.
Leon
believes that such “Reviewed-by” clauses are a
good thing, and are in fact a way to reward internal developers
for their time and effort.
Rodrigo believes that they have a good understanding of potential solutions
for the first three items above, but are especially interested in discussions
on the fourth.
Mark
said that in the past, although he used an LTS backport as an integration
point for internal testing and development, he also simultaneously maintained
corresponding patches against -next as the primary development platform,
upstreaming anything that can be upstreamed as soon as feasible.
When the marketing people gave the go-ahead, Mark would push out all the
patches that were held back.
Mark feels that this approach worked pretty well.
Linus Walleij
believes that pretty much any company working with SoCs for routers, handsets,
and so on will have the same problem.
When Linus was responsible for such a situation, he took an ad-hoc approach
that included the following points:
- Classify components so as to let anything non-embargoed go
upstream immediately.
Mark
noted that this is easier to do for SoCs or standalone chips
than for more complex devices such as the i915 graphics driver.
- Get management to pre-approve a cut-off date for the embargo,
so that when that date arrives, developers can immediately
start pushing code upstream.
Arnd Bergmann
suggests also having a deadline for when the patches
must be publicly posted.
Linus
pointed out that such a deadline would need to be imposed further
up the supply chain, and that he had had good results bringing
vendors around to this requirement over time, with which
Sean Paul
and
Jon Masters
agreed, and which
Rodrigo
applauded.
Mark
worked this from the supply-chain side, providing such a deadline
and suggesting that their customers demand similar deadlines from
their other suppliers.
- Use internal developers with good upstream skills and experience,
so that upstream code-quality problems can be anticipated and
fixed as soon as possible.
- Rebase internal development frequently so as to minimize the
Hamming distance to mainline.
Linus states that anything else really isn't upstream first,
but that this point was always the most controversial.
Rodrigo
agreed, but stated that he himself had once believed frequent
rebasing to be insane.
However, Rodrigo said that the “rebase” to -rc1
was usually more like a forward port than a rebase.
Rodrigo called out CI as an important component of a successful
constant-rebasing strategy.
Jani Nikula
agreed with Linus, but noted that reviewing the first version does not
always constitute a review of that same code after it has been rebased
a dozen times.
Jani would like to see discussion on this point.
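The classification step in Linus Walleij's list above (letting anything
non-embargoed go upstream immediately) could be sketched roughly as follows.
This is only an illustration, not a tool anyone in the discussion described:
the path patterns, the patch subjects, and the function names
is_embargoed() and classify_patches() are all hypothetical, and a real
workflow would presumably take the touched paths from the version-control
system rather than from a hand-written dictionary.

```python
from fnmatch import fnmatch

# Hypothetical path patterns for components still under embargo.
# These are invented for illustration and match no real driver.
EMBARGOED_PATTERNS = [
    "drivers/example/newchip_*",
    "drivers/example/secret/*",
]


def is_embargoed(path, patterns=EMBARGOED_PATTERNS):
    """Return True if the path matches any embargoed pattern."""
    return any(fnmatch(path, pattern) for pattern in patterns)


def classify_patches(patches, patterns=EMBARGOED_PATTERNS):
    """Split patches into (upstream_now, held_back).

    `patches` maps a patch subject to the list of file paths it touches.
    A patch is held back if any file it touches is embargoed; everything
    else is considered safe to send upstream immediately.
    """
    upstream_now, held_back = [], []
    for subject, paths in patches.items():
        if any(is_embargoed(path, patterns) for path in paths):
            held_back.append(subject)
        else:
            upstream_now.append(subject)
    return upstream_now, held_back


if __name__ == "__main__":
    example = {
        "example: fix locking in common code": ["drivers/example/core.c"],
        "example: add newchip support": [
            "drivers/example/newchip_hw.c",
            "drivers/example/core.c",
        ],
    }
    upstream_now, held_back = classify_patches(example)
    print("upstream now:", upstream_now)
    print("held back:", held_back)
```

A patch touching both embargoed and non-embargoed files is held back as a
whole in this sketch, which echoes Mark's remark that the split is harder
for complex drivers where such patches are common.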