From sac-list-owner Mon Apr 9 16:52:53 2001
Date: Mon, 9 Apr 2001 16:54:55 -0700 (PDT)
From: Tim Marsland
To: psarc@eng.sun.com
Subject: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
Cc: raf@lp64.eng.sun.com, tpm@lp64.eng.sun.com
Content-Length: 2412
Status: RO
X-Status: $$$$
X-UID: 0000000001

I'm sponsoring the following fast-track case for Roger Faulkner.
If approved, this project would time out 4/16/01.
Release binding is minor (only.)

tim
------

EOL Two-level Threads Implementation
====================================

Background
----------

PSARC/1999/481 ('Alternate libthread') introduced a new directory,
/usr/lib/lwp, to contain an alternate implementation of libthread.  The
alternate libthread eliminates the userland thread scheduler, and
simply makes all threads be lwps - and thus provides a "One-level"
thread model.

The interface was introduced as Stable for the following pathnames:

        /usr/lib/lwp/{,sparcv9/}/libthread{,_db}.so{,.1}

Over the past two years, this implementation has been bug fixed, and
performance tuned, and is now judged equivalent or better performance
and scalability than the most highly tuned version of the existing
implementation on significant benchmarks, including Oracle Express
(4x), Volano (2x) and SPECjbb (unchanged).  The implementation is also
the only one that correctly passes all the POSIX conformance suites,
and has many fewer bugs than the original libthread in /usr/lib -- in
large part because the new implementation is so much simpler than the
old -- some of the most tricky issues around supporting the 2-level
model no longer apply.

Proposal
--------

This project proposes to make the alternate libthread implementation be
the default, and to remove the existing implementation completely.
Thus there will once again be only one libthread in the system.

Trailing symlinks will be left behind to preserve runtime linkages i.e.
        /usr/lib/lwp/{,sparcv9/}/libthread{,_db}.so.1

though strictly speaking this is unnecessary because both ld and
run-time linker "fall back" on the version in /usr/lib without
additional guidance.  However, the symlinks will allow explicit
dlopen's to work, as well as scripts that check for the presence of
/usr/lib/lwp etc.

No warning of the intent to EOL the 2-level threads implementation
itself is planned, as the interfaces remain unchanged; the quality of
the implementation simply improves.

The existence and content of /usr/lib/lwp is declared Obsolete, and
will be removed in a (future) minor release of Solaris.  An EOL
notification will be issued in the release notes to inform the S9
customer base of this EOL intention.  A subsequent PSARC case will be
used to record the act of removal.

From sac-list-owner Wed Apr 11 10:45:45 2001
Date: Wed, 11 Apr 2001 10:48:12 -0700
From: Richard M.
X-Accept-Language: en
MIME-Version: 1.0
To: Tim Marsland
CC: psarc@eng.sun.com, raf@lp64.eng.sun.com, phil.harman@uk.sun.com
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Length: 3093
Status: RO
X-Status: $$$$
X-UID: 0000000002

I agree this is the right thing to do, going forward, but I have some
questions for clarification:

- What are the behavior changes that could cause incompatibility with
  existing applications - e.g. locking wakeup preference for LIFO/FIFO?
- How do we propose to deal with applications which may rely on the old
  (broken) behaviour?
- There were some environment variables to control the above in an early
  prototype - did these go away?
- If not, are they documented or do they have stability levels?

Thanks,
Richard.

Tim Marsland wrote:
>
> I'm sponsoring the following fast-track case for Roger Faulkner.
> If approved, this project would time out 4/16/01.
> Release binding is minor (only.)
>
> tim
> ------
>
> EOL Two-level Threads Implementation
> ====================================
>
> Background
> ----------
>
> PSARC/1999/481 ('Alternate libthread') introduced a new directory,
> /usr/lib/lwp, to contain an alternate implementation of libthread.  The
> alternate libthread eliminates the userland thread scheduler, and
> simply makes all threads be lwps - and thus provides a "One-level"
> thread model.
>
> The interface was introduced as Stable for the following pathnames:
>
>         /usr/lib/lwp/{,sparcv9/}/libthread{,_db}.so{,.1}
>
> Over the past two years, this implementation has been bug fixed, and
> performance tuned, and is now judged equivalent or better performance
> and scalability than the most highly tuned version of the existing
> implementation on significant benchmarks, including Oracle Express
> (4x), Volano (2x) and SPECjbb (unchanged).  The implementation is also
> the only one that correctly passes all the POSIX conformance suites,
> and has many fewer bugs than the original libthread in /usr/lib -- in
> large part because the new implementation is so much simpler than the
> old -- some of the most tricky issues around supporting the 2-level
> model no longer apply.
>
> Proposal
> --------
>
> This project proposes to make the alternate libthread implementation be
> the default, and to remove the existing implementation completely.
> Thus there will once again be only one libthread in the system.
>
> Trailing symlinks will be left behind to preserve runtime linkages i.e.
>
>         /usr/lib/lwp/{,sparcv9/}/libthread{,_db}.so.1
>
> though strictly speaking this is unnecessary because both ld and
> run-time linker "fall back" on the version in /usr/lib without
> additional guidance.  However, the symlinks will allow explicit
> dlopen's to work, as well as scripts that check for the presence of
> /usr/lib/lwp etc.
>
> No warning of the intent to EOL the 2-level threads implementation
> itself is planned, as the interfaces remain unchanged; the quality of
> the implementation simply improves.
>
> The existence and content of /usr/lib/lwp is declared Obsolete, and
> will be removed in a (future) minor release of Solaris.  An EOL
> notification will be issued in the release notes to inform the S9
> customer base of this EOL intention.  A subsequent PSARC case will be
> used to record the act of removal.

From sac-list-owner Sun Apr 29 10:55:38 2001
Date: Sun, 29 Apr 2001 10:56:23 -0700 (PDT)
From: Andrew T.
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
To: psarc@eng.sun.com
Cc: raf@eng.sun.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Content-MD5: QowEuypWwZT9jsVxhH4EbA==
Content-Length: 645

Since this case seems to still be open waiting for an answer to
Richard's questions, I'll toss in mine:

Over the years a number of documented interfaces have been added to the
system to support the two-level threads model.  These include the
SIGWAITING and SIGLWP signals, the SA_WAITSIG flag for sigaction(2),
and the __LWP_ASLWP flag for _lwp_create(2).  There's also
_signotifywait(2), _lwp_sigredirect(2), and probably others I'm not
remembering right now.  Given that this case proposes to EOL the
two-level threads implementation, and thus the need for these
interfaces, what's the plan for cleaning up these vestiges of the
past?

Andy

From sac-list-owner Thu May 3 02:04:30 2001
Date: Thu, 3 May 2001 02:05:35 -0700 (PDT)
From: "Roger A. Faulkner"
To: Richard.M.
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
Cc: psarc@eng.sun.com, phil.harman@uk.sun.com, tpm@lp64.eng.sun.com
Content-Length: 4329

> From Richard.M.
> To: Tim Marsland
> CC: psarc@eng.sun.com, raf@lp64.eng.sun.com, phil.harman@uk.sun.com
> Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
>
> I agree this is the right thing to do, going forward, but I have some
> questions for clarification:
>
> - What are the behavior changes that could cause incompatibility with
>   existing applications - e.g. locking wakeup preference for LIFO/FIFO?

Fairness is the main issue.  Some multithreaded applications are
written with the unconscious assumption of fairness.  That is, that
threads will automatically get an equal share of cpu time regardless
of what locking strategy is used.

Experience has shown that fairness is antithetical to performance,
so the library attempts unfairness for the most part.  Adaptive mutex
locking is the most obvious example, for it favors a thread that has
just started trying to grab a contended mutex, and leaves threads
already on the sleep queue languishing.

Applications that rely on fairness can suffer from live locks,
that is, they can end up with two threads ping-ponging between
each other when neither has any work to do, while other threads
asleep on the same queue could be doing useful work.  We discovered
that Oracle Express suffers from this.

To allow such live locks to be broken while not impacting performance
excessively, the library adopts a compromise by default:  One out of
every 16 queue operations will put the queueing thread at the tail
of the queue (dequeueing always comes from the front of the queue
[within the same priority]).

Other behavior changes have not yet been discovered.
Hopefully they will be few and easy to deal with...

> - How do we propose to deal with applications which may rely on the old
>   (broken) behaviour?
> - There were some environment variables to control the above in an early
>   prototype - did these go away?
> - If not, are they documented or do they have stability levels?
The environment variables examined by the new libthread are not to be
published, since they can change at any time, including in a patch.
They are like the old choke and spark advance/retard controls on
automobiles.  No modern car has such controls.  Likewise, in the
fullness of time, these controls will disappear from the library
when we have learned enough to make it self-tuning.

CTE will be informed of these variables, so that they can use
them when investigating customer problems.

These are the environment variables examined by the new libthread,
along with their associated meanings.  Every one is a name=value pair,
where the value is a decimal number.  The default values are shown
in each heading:

LIBTHREAD_ADAPTIVE_SPIN=1000
        Specifies the number of iterations (spin count) for adaptive
        mutex locking before giving up and going to sleep.
        LIBTHREAD_RELEASE_SPIN is set to half this value.

LIBTHREAD_RELEASE_SPIN=500
        Can be set independently to override the default value of
        half the value of LIBTHREAD_ADAPTIVE_SPIN.

LIBTHREAD_MAX_SPINNERS=100
        Limits the number of simultaneously spinning threads
        attempting to do adaptive locking on one mutex.

LIBTHREAD_MUTEX_HANDOFF=0
        If set to 1, specifies direct mutex handoff.
        (This suppresses adaptive locking).

LIBTHREAD_QUEUE_FIFO=4
        Can be set to a value in [0..8].
        Specifies the degree of FIFO queueing.
        Put a blocking thread on the tail of a sleep queue:
                0 : every 256th time (almost never FIFO)
                1 : every 128th time
                2 : every 64th time
                3 : every 32nd time
                4 : every 16th time (the default value, some FIFO)
                5 : every 8th time
                6 : every 4th time
                7 : every 2nd time
                8 : every time (always FIFO)

LIBTHREAD_STACK_CACHE=10
        Specifies the maximum number of stacks the library retains
        after threads exit for re-use when more threads are created.

LIBTHREAD_ERROR_DETECTION=0
        Can be set to 1 or 2:
        1: Issue messages about illegal application locking operations.
        2: Issue the warning message and dump core.
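
As a rough illustration of the list above (a sketch in Python, not
libthread code), the LIBTHREAD_QUEUE_FIFO table works out to a period
of 2^(8 - value), and LIBTHREAD_RELEASE_SPIN falls back to half of
LIBTHREAD_ADAPTIVE_SPIN when unset.  The names read_settings,
fifo_period and SleepQueue are hypothetical.  The sketch also assumes
that a thread not sent to the tail is placed at the head, and it
ignores thread priorities; neither detail is spelled out in this
thread.

```python
import os
from collections import deque

# Default values exactly as listed in the message above.
DEFAULTS = {
    "LIBTHREAD_ADAPTIVE_SPIN": 1000,
    "LIBTHREAD_MAX_SPINNERS": 100,
    "LIBTHREAD_MUTEX_HANDOFF": 0,
    "LIBTHREAD_QUEUE_FIFO": 4,
    "LIBTHREAD_STACK_CACHE": 10,
    "LIBTHREAD_ERROR_DETECTION": 0,
}


def read_settings(environ=None):
    """Read the name=value settings (decimal values), applying defaults."""
    env = os.environ if environ is None else environ
    settings = {name: int(env.get(name, default))
                for name, default in DEFAULTS.items()}
    # RELEASE_SPIN is half of ADAPTIVE_SPIN unless set independently.
    release = env.get("LIBTHREAD_RELEASE_SPIN")
    if release is None:
        settings["LIBTHREAD_RELEASE_SPIN"] = (
            settings["LIBTHREAD_ADAPTIVE_SPIN"] // 2)
    else:
        settings["LIBTHREAD_RELEASE_SPIN"] = int(release)
    return settings


def fifo_period(level):
    """Map LIBTHREAD_QUEUE_FIFO (0..8) to 'tail every Nth time'."""
    if not 0 <= level <= 8:
        raise ValueError("LIBTHREAD_QUEUE_FIFO must be in [0..8]")
    return 1 << (8 - level)  # 0 -> 256, 4 -> 16, 8 -> 1


class SleepQueue:
    """Toy sleep queue: every Nth blocking thread goes to the tail."""

    def __init__(self, fifo_level=4):
        self.period = fifo_period(fifo_level)
        self.operations = 0
        self.waiters = deque()

    def enqueue(self, thread):
        self.operations += 1
        if self.operations % self.period == 0:
            self.waiters.append(thread)      # tail: FIFO behaviour
        else:
            self.waiters.appendleft(thread)  # head (assumed): LIFO

    def dequeue(self):
        # Dequeueing always comes from the front of the queue.
        return self.waiters.popleft()
```

With fifo_level=8 every waiter goes to the tail, so the queue is
strictly first-in first-out; with fifo_level=0 only every 256th waiter
does, so the most recent arrival is almost always served first.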

Specifying both LIBTHREAD_MUTEX_HANDOFF=1 and LIBTHREAD_QUEUE_FIFO=8
yields perfect fairness (and almost perfect slowness).

LIBTHREAD_ERROR_DETECTION is good help in tracking down locking bugs
in multithreaded applications.  This is one variable that I would
not object to advertising.

Roger

From sac-list-owner Thu May 3 02:11:43 2001
Date: Thu, 3 May 2001 02:12:50 -0700 (PDT)
From: "Roger A. Faulkner"
To: Andy T.
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
Cc: psarc@eng.sun.com
Content-Length: 1334

> From Andy T.
> Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
> To: psarc@eng.sun.com
> Cc: raf@eng.sun.com
>
> Since this case seems to still be open waiting for an answer to
> Richard's questions, I'll toss in mine:
>
> Over the years a number of documented interfaces have been added to the
> system to support the two-level threads model.  These include the
> SIGWAITING and SIGLWP signals, the SA_WAITSIG flag for sigaction(2),
> and the __LWP_ASLWP flag for _lwp_create(2).  There's also
> _signotifywait(2), _lwp_sigredirect(2), and probably others I'm not
> remembering right now.  Given that this case proposes to EOL the
> two-level threads implementation, and thus the need for these
> interfaces, what's the plan for cleaning up these vestiges of the
> past?

The plan is to leave all of these elements of the infrastructure
to support the old libthread in Solaris 9, and even to continue
to build and test the old libthread for the duration of Solaris 9
after it enters maintenance mode, so that CTE will have emergency
binary relief for melting-down customers while we figure out what
to do about their particular problem with the new libthread.

Early in Solaris 10, we plan to remove all of this infrastructure
from the system.

Roger

From sac-list-owner Thu May 3 07:58:39 2001
Date: Thu, 3 May 2001 07:59:06 -0700 (PDT)
From: "Joseph E.
    Kowalski III"
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
To: Richard.M., Roger.Faulkner@eng.sun.com
Cc: psarc@eng.sun.com, phil.harman@uk.sun.com, tpm@lp64.eng.sun.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Content-MD5: q8xAsfKoMHVT02/SiKrQqQ==
Content-Length: 642

> CTE will be informed of these variables, so that they can use
> them when investigating customer problems.

Note that if CTE tells enough people and doesn't tell them that they
are temporary (Unstable), you may have difficulty removing them in the
future.  This isn't architecture.  Just a suggestion that you do your
best to make sure that CTE knows the restrictions and knows to disseminate
the restrictions.

> These are the environment variables examined by the new libthread,

This isn't "C", so it doesn't have any real consequence, but would these
be better starting with an "_"?  Again, just a suggestion (and a weak
one).

- jek3

From sac-list-owner Thu May 3 08:02:12 2001
Date: Thu, 3 May 2001 08:02:39 -0700 (PDT)
From: "Joseph E. Kowalski III"
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
To: Andy T., Roger.Faulkner@eng.sun.com
Cc: psarc@eng.sun.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Content-MD5: dOjh2W7jMiaiBGJIK7/kqw==
Content-Length: 427

> Early in Solaris 10, we plan to remove all of this infrastructure
> from the system.

I kinda wonder how you will be able to do this.  You can deprecate them,
but truthfully I don't see how you will be able to remove them any time
before you retire 8^).

Anyway, if I read this case correctly, this case doesn't propose the
removal of any of these.  It's the later case I'm suggesting might have
difficulty.  Right?

- jek3

From sac-list-owner Thu May 3 08:05:03 2001
Date: Thu, 3 May 2001 08:05:31 -0700 (PDT)
From: "Joseph E.
    Kowalski III"
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
To: Richard.M., Roger.Faulkner@eng.sun.com, Joseph.Kowalski@eng.sun.com
Cc: psarc@eng.sun.com, phil.harman@uk.sun.com, tpm@lp64.eng.sun.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Content-MD5: wFFV3X/zfq4uGzE4nv3kJQ==
Content-Length: 604

Re: Environment Variables

> > CTE will be informed of these variables, so that they can use
> > them when investigating customer problems.
>
> Note that if CTE tells enough people and doesn't tell them that they
> are temporary (Unstable), you may have difficulty removing them in the
> future.  This isn't architecture.  Just a suggestion that you do your
> best to make sure that CTE knows the restrictions and knows to disseminate
> the restrictions.

Hummm,... what is the Taxonomy level for these?  By telling CTE I think
"Project Private" doesn't fit.  Perhaps "Unstable" or "Obsolete"?

- jek3

From sac-list-owner Thu May 3 10:29:23 2001
Date: Thu, 03 May 2001 10:30:14 -0700
From: Jordan B.
X-Accept-Language: en
MIME-Version: 1.0
To: "Joseph E. Kowalski III"
CC: Roger.Faulkner@eng.sun.com, psarc@eng.sun.com, phil.harman@uk.sun.com
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Length: 488

Roger said:
> The environment variables examined by the new libthread are not to be
> published, since they can change at any time, including in a patch.

Joe asked:
> Hummm,... what is the Taxonomy level for these?  By telling CTE I think
> "Project Private" doesn't fit.  Perhaps "Unstable" or "Obsolete"?

There is no non-private interface that can "legally" be changed in a
patch.  I suspect that this is quite deliberate, since patches aren't
supposed to ever cause incompatibility.

From sac-list-owner Thu May 3 10:41:25 2001
Date: Thu, 3 May 2001 10:41:52 -0700 (PDT)
From: "Joseph E.
    Kowalski III"
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
To: Joseph.Kowalski@eng.sun.com, Jordan.B.
Cc: Roger.Faulkner@eng.sun.com, psarc@eng.sun.com, phil.harman@uk.sun.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Content-MD5: IUxDoko3mHWO+gBHXMPqOg==
Content-Length: 1139

> Roger said:
> > The environment variables examined by the new libthread are not to be
> > published, since they can change at any time, including in a patch.
>
> Joe asked:
> > Hummm,... what is the Taxonomy level for these?  By telling CTE I think
> > "Project Private" doesn't fit.  Perhaps "Unstable" or "Obsolete"?
>
> There is no non-private interface that can "legally" be changed in a
> patch.  I suspect that this is quite deliberate, since patches aren't
> supposed to ever cause incompatibility.

Hummm,... true.  I think this is a case of "trying to have your cake
and eat it too".

I guess "Project Private" is the closest fit we have.  I guess I would
only advise the project team to make sure that CTE knows this is the
classification and that they should let anybody they tell about these
know that as well.

My concern is that regardless of what we officially call them, there
may be difficulty in removing them should customers use them extensively
without their expectations being appropriately set.  I know there is no
way we can guarantee this, but a little effort at expectation setting
can go a long way.

- jek3

From sac-list-owner Thu May 3 10:44:57 2001
From: Lee D.
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Thu, 3 May 2001 11:45:35 -0600
Cc: Roger.Faulkner@eng.sun.com, psarc@eng.sun.com, phil.harman@uk.sun.com
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
X-Disclaimer: My opinions are so off the wall that nobody claims them
Content-Length: 745

I've always thought that if you tell CTE, the interfaces *will* get
out (aka /etc/system).

So that would mean something like Evolving or Unstable, right?

----------------------------------------------------------------

Roger said:
> The environment variables examined by the new libthread are not to be
> published, since they can change at any time, including in a patch.

Joe asked:
> Hummm,... what is the Taxonomy level for these?  By telling CTE I think
> "Project Private" doesn't fit.  Perhaps "Unstable" or "Obsolete"?

There is no non-private interface that can "legally" be changed in a
patch.  I suspect that this is quite deliberate, since patches aren't
supposed to ever cause incompatibility.

--
*Lee++
Just Say No To MIME

From sac-list-owner Thu May 3 10:59:03 2001
Date: Thu, 3 May 2001 10:59:26 -0700 (PDT)
From: "Joseph E. Kowalski III"
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
To: Lee D.
Cc: Roger.Faulkner@eng.sun.com, psarc@eng.sun.com, phil.harman@uk.sun.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Content-MD5: L+e27b7VqMPWaynnaTzAHg==
Content-Length: 746

> I've always thought that if you tell CTE, the interfaces *will* get
> out (aka /etc/system).

I think we can only go so far to combat that reality.  All the project
team can do is to make a reasonable effort to educate CTE.

Also, I think we've had reasonable success in moving things in and out
of /etc/system (by leaving null variables to avoid syntax errors).
Environment variables don't even have this vestige problem.

> So that would mean something like Evolving or Unstable, right?

It could, but I don't think it has to.

Of course, I also wonder why Roger is concerned about not supporting this
list in patches.  That seems to indicate that he is willing to "rebreak"
somebody who used one of these to work around a problem.

- jek3

From sac-list-owner Sun May 6 22:26:32 2001
Date: Sun, 6 May 2001 22:27:36 -0700 (PDT)
From: "Roger A.
    Faulkner"
To: Lee D., Joseph.Kowalski@eng.sun.com
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
Cc: psarc@eng.sun.com, phil.harman@uk.sun.com
Content-Length: 1521

> From Joseph.Kowalski@eng.sun.com Thu May 3 11:00:15 2001
> Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
> To: Lee D.
> Cc: Roger.Faulkner@eng.sun.com, psarc@eng.sun.com, phil.harman@uk.sun.com
>
> > I've always thought that if you tell CTE, the interfaces *will* get
> > out (aka /etc/system).
>
> I think we can only go so far to combat that reality.  All the project
> team can do is to make a reasonable effort to educate CTE.
>
> Also, I think we've had reasonable success in moving things in and out
> of /etc/system (by leaving null variables to avoid syntax errors).
> Environment variables don't even have this vestige problem.
>
> > So that would mean something like Evolving or Unstable, right?
>
> It could, but I don't think it has to.

If you have to call them something, call them the most deprecated
thing you can think of.

> Of course, I also wonder why Roger is concerned about not supporting this
> list in patches.  That seems to indicate that he is willing to "rebreak"
> somebody who used one of these to work around a problem.
>
> - jek3

My position is that if a customer has to use one of them,
then there is a bug, either in the customer's application
or in the new libthread.  Recommending to use one of the
environment variables should be done only to work around
a problem and to help in discovering and fixing the problem.
They should not be used as a permanent solution to anything.
That is why I want them deprecated.

Roger

From sac-list-owner Sun May 6 22:38:46 2001
Date: Sun, 6 May 2001 22:39:53 -0700 (PDT)
From: "Roger A.
    Faulkner"
To: Andy T., Joseph.Kowalski@eng.sun.com
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
Cc: psarc@eng.sun.com
Content-Length: 1089

> From Joseph.Kowalski@eng.sun.com Thu May 3 08:03:21 2001
> Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
> To: Andy T., Roger.Faulkner@eng.sun.com
> Cc: psarc@eng.sun.com
>
> > Early in Solaris 10, we plan to remove all of this infrastructure
> > from the system.
>
> I kinda wonder how you will be able to do this.  You can deprecate them,
> but truthfully I don't see how you will be able to remove them any time
> before you retire 8^).

The old infrastructure can be removed without removing the #defines
and signal numbers.  These things just become no-ops.  No application
will be broken if we remove things that they can never reach.

> Anyway, if I read this case correctly, this case doesn't propose the
> removal of any of these.  It's the later case I'm suggesting might have
> difficulty.  Right?

Back at you, I wonder why PSARC has to be involved at all.
The only reason to have this old infrastructure in the system
is to support the old libthread, and the old libthread will be
fully expunged in Solaris 10.

Roger

From sac-list-owner Tue May 8 10:26:25 2001
Date: Tue, 8 May 2001 10:27:34 -0700 (PDT)
From: Andrew T.
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
To: Roger.Faulkner@eng.sun.com
Cc: psarc@eng.sun.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Content-MD5: PMmp4JKLE4u03ayXpDuvUg==
Content-Length: 603

[talking about removal of interfaces created to support the two-level model]

> > Anyway, if I read this case correctly, this case doesn't propose the
> > removal of any of these.  It's the later case I'm suggesting might have
> > difficulty.  Right?
>
> Back at you, I wonder why PSARC has to be involved at all.

You need PSARC approval to deprecate the existing interfaces to make it
clear to the customer that they are no longer relevant.  This can (and
should) be done whether or not the interfaces will actually be removed
in a future release.

But I agree that this can be a separate case.

Andy

From sac-list-owner Wed May 9 10:57:37 2001
Date: Wed, 9 May 2001 11:58:14 -0600 (MDT)
From: Andy R.
Subject: Re: psarc/2001/287 - EOL Two-level Threads Implementation - fast-track
To: psarc@sac.eng.sun.com
Cc: raf@eng.sun.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Content-MD5: 6wkoY+SWq2/ScloqMqfXBA==
Content-Length: 205

This fast-track was approved at today's PSARC meeting.

For the record, the ARC agrees with the classification of Project
Private for the environment variables intended only for emergency CTE
use.

-andy