Flag day: sccscheck, bringovercheck, and ctfmerge failures


Date: Thu, 13 Apr 2006 20:46:54 -0400
From: Bill Sommerfeld <sommerfeld at sun dot com>
To: on-all at sun dot com, onnv-gate at onnv dot eng dot sun dot com
Subject: Flag day: sccscheck, bringovercheck, and ctfmerge failures

This message is a flag day for all build machine maintainers, as well
as for all engineers who build ON-Nevada.

Build Machine Maintainers
=========================

You must update your build tools immediately to pick up the two new
build tools and the new version of nightly.  You may install SUNWonbld
(strongly preferred) from /ws/onnv-gate/public/packages/`uname -p` or
pull nightly, sccscheck, and bringovercheck directly from
/net/onnv.eng/opt/onbld/bin/`uname -p`.

Summary
=======

There have been a growing number of poorly understood build failures
which have generally involved failures of the "ctfmerge" tool.

Some of these failures have been root-caused to build races triggered
by the automatic use of "sccs get" by make.

The putback of the fix for

	6407796 "sccs get" rule considered harmful

disables this automatic use of SCCS and thus should reduce the
likelihood of these failures.

It builds on yesterday's integration of

	6410447 Need tool to repair workspaces damaged by 6407791
	6411799 Need standin for "sccs get"

It is important to understand that this is not believed to be a
complete fix to this incompletely-understood problem; a number of
other bugs have been opened to track aspects of this problem,
including 6252653, 6401187, and 6407791.

With this fix, builds may see errors or warnings from a new
"sccscheck" tool, indicating that, without this change, they would have
been at significant risk of ctfmerge races and similar carnage.

All Engineers
=============

The new version of nightly MUST be used to build the new source tree; an
older version of nightly is likely to abort early on while it is
preparing the "Build environment" section of the build log.

In addition, today's putback substitutes the new "sccscheck" tool in
place of the "sccs get" defined by the default system make rules when
make detects that a source file may be out of date with respect to the
SCCS/s. delta file.

Incremental builds will see a one-time burst of "sccscheck"
invocations due to the change from "sccs get" to "sccscheck"; if this
disturbs you, do a clobber build and blow away your .make.state files.

You can run "bringovercheck $CODEMGR_WS" to correct out-of-date source
files in your workspace, which should make any remaining "sccscheck"
invocations go away.  The "nightly" script does this immediately after
running bringover.

In its default mode of operation, sccscheck will only complain if a
source file disappears:

	sccscheck: error: Source file $file has gone missing!
	sccscheck: error: Check for overenthusiastic clobber rules

which most likely indicates that a bug similar to 6328950 or 6374221
has been reintroduced.

If the source file is out of date with respect to the SCCS/s. file, by
default sccscheck will just exit successfully without any side
effects; as a result, object files dependent on the out-of-date source
file should not be rebuilt, and ctfmerge races such as the ones
described in 6401187 should be significantly less likely.
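
The default-mode behaviour described above can be sketched roughly as
follows.  This is an illustration only, not the actual sccscheck
implementation: the function name sccscheck_default() is hypothetical,
and the non-zero return for a missing file is an assumption (the text
above only says that sccscheck complains in that case).

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical sketch of sccscheck's default mode as described in this
 * message.  Returns the exit status the tool might use: 0 when the
 * source file exists (even if it is older than its SCCS/s. file), and
 * 1 (an assumed value) when the source file is missing.
 */
int
sccscheck_default(const char *file)
{
	assert(file != NULL);

	if (access(file, F_OK) != 0) {
		/* The only case that produces a complaint by default. */
		(void) fprintf(stderr,
		    "sccscheck: error: Source file %s has gone missing!\n",
		    file);
		(void) fprintf(stderr,
		    "sccscheck: error: Check for overenthusiastic "
		    "clobber rules\n");
		return (1);
	}

	/*
	 * The file exists.  Whether or not it is out of date, succeed
	 * silently and touch nothing, so make sees no new cleartext
	 * file and has no reason to rebuild dependent objects.
	 */
	return (0);
}
```

The point of the design is what the sketch does not do: it never
rewrites the source file, so two parallel makes that both notice the
same stale file cannot collide the way two "sccs get"s can.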

Build Tools
-----------

If your build machine has not yet been updated, you must either use
the nightly.sh from usr/src/tools/scripts and build a copy of the
tools containing my changes (such as with nightly's 't' option) or use
the tools in /ws/onnv-tools/onbld.

Background
==========

There are several interlocking bugs in play here.

sccs get during make not parallel-safe
--------------------------------------

The version of make on Solaris is SCCS aware, and the default ruleset
includes a family of rules which will automatically check out a source
file if the cleartext file is missing or if the SCCS/s. file is newer
than the cleartext file.

However, when a single source file is referenced from multiple source
directories, and two or more "make" processes are independently
started in these directories, it may be the case that multiple
processes notice that a particular source file is out of date with
respect to its SCCS/s. file; two "sccs get"s can be started, most
likely leading to one of them failing.

As simply disabling make's SCCS support is difficult to do without
introducing several non-local side effects, this putback introduces a
new top-level included makefile, Makefile.noget, included from
Makefile.master, which overrides the default make rules relating to
SCCS files, changing them to instead invoke "sccscheck".

mmap-for-write without msync not portable
-----------------------------------------

See 6407791 and the teamware bug:

	6410566 mmap(... PROT_WRITE ...)/munmap without msync results in screwed up mtimes

In short, in some cases, the code sequence:

	fd = open("file", O_RDWR);
	addr = mmap(...., PROT_WRITE, ..., fd, off);
	< ... modify something inside *addr ... >
	munmap(addr, len);
	close(fd);

and close variants of it do not have an intuitive effect on the
last-modified time of "file" unless a call to msync() is made before
the munmap(), because updates to pages mapped for write may not be
pushed to the filesystem until seconds or minutes after the file is
closed and the mapping is removed.

The impact here is limited to file attributes -- other processes will
see the correct file contents, but on some filesystems, the
modification time reported by stat() may be unchanged until some time
well after the process which made the modification exits.  This rarely
if ever happens with UFS, but happened frequently with ZFS before the
fix to 6407791 and is also suspected to happen in some cases with NFS.

This delayed timestamp update plays havoc with make's assumptions
about relative modification times, amplifying the havoc caused by the
automatic "sccs get"; portable code intended for use with "make"
should always use msync() before munmap().
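
A minimal sketch of the safer sequence follows.  The helper name
patch_file() and its signature are hypothetical; MAP_SHARED and
MS_SYNC are the choices assumed here, and the caller is assumed to
pass a file that is already at least len bytes long.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: overwrite the first len bytes of an existing
 * file through a shared writable mapping, calling msync() before
 * munmap().  Returns 0 on success, -1 on any failure.
 */
int
patch_file(const char *path, const void *data, size_t len)
{
	int fd, rc = -1;
	void *addr;

	assert(path != NULL && data != NULL && len > 0);

	if ((fd = open(path, O_RDWR)) < 0)
		return (-1);

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (addr != MAP_FAILED) {
		(void) memcpy(addr, data, len);

		/* Push the dirty pages out before removing the mapping. */
		if (msync(addr, len, MS_SYNC) == 0)
			rc = 0;
		if (munmap(addr, len) != 0)
			rc = -1;
	}

	if (close(fd) != 0)
		rc = -1;
	return (rc);
}
```

With MS_SYNC the call does not return until the write-back has
completed, so the modification time should be settled by the time the
process exits rather than seconds or minutes later.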

last modified by danmcd on 2009/11/24 14:22