Discussion:
(bash) How (really!) does the "current job" get determined?
Kenny McCormack
2024-10-03 23:08:52 UTC
Note: This is a "How do things really work - in the real world?", rather
than a "What does the manual say about how things work?" sort of thread.

The manual says the answer is "The job most recently started in the
background or most recently stopped." This is not always the case.

Observe (this is bash version 5.2.15) (and "j" is aliased to "jobs -l"):

$ j
[1]+ 20914 Stopped (signal) sudo bash
$ sleep 100 & j
[2] 12914
[1]+ 20914 Stopped (signal) sudo bash
[2]- 12914 Running sleep 100 &
$ fg
sudo bash
# suspend

[1]+ Stopped sudo bash
Status: 147
$ %2

Note that I start with one background job (the "sudo"). I launch a second
one, but, according to the "jobs" listing, job #1 is still the "current"
job (denoted with the "+"). Further, when I do "fg", I get back to job #1.

Two comments:
1) You generally would like it to work the way it should work, since
you generally want to manipulate the most recent job (the sleep in the
above example, not the sudo). Getting the job id from the pid ($!) is
possible and is my chosen workaround, but it is not trivial.
2) I could not find mention of what exactly the output of the jobs
command means; i.e., what the + and - mean - in "man bash".

Also note: I googled this question and found something on unix.stackexchange.
There is a post there by our own "Stephan Chaz...", that basically just
quotes the manual. As I said, this info is incorrect (as seen above).
--
That's the Trump playbook. Every action by Trump or his supporters can
be categorized as one (or more) of:

outrageous, incompetent, or mentally ill.
Kaz Kylheku
2024-10-04 02:40:06 UTC
Post by Kenny McCormack
Note: This is a "How do things really work - in the real world?", rather
than a "What does the manual say about how things work?" sort of thread.
The manual says the answer is "The job most recently started in the
background or most recently stopped." This is not always the case.
It looks buggered.
Post by Kenny McCormack
$ j
[1]+ 20914 Stopped (signal) sudo bash
This is now most recently stopped.
Post by Kenny McCormack
$ sleep 100 & j
This is now most recently started in the background, therefore the
documentation specifies that it is now the current job.

It must be that Bash has no test cases covering the documented
requirements in this area adequate enough to catch what you have found.

Is this automatically tested at all?

Testing interactive job control features pretty much requires Bash to be
behind a pseudo-tty; driven by expect or something like it.

(Or else at least a unit test is required where the function that
identifies the current job is tested in isolation, with the various
conditions mocked up: suspended job introduced while existing job is
stopped, etc.)
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
Kenny McCormack
2024-10-04 13:13:59 UTC
Post by Kaz Kylheku
Post by Kenny McCormack
Note: This is a "How do things really work - in the real world?", rather
than a "What does the manual say about how things work?" sort of thread.
The manual says the answer is "The job most recently started in the
background or most recently stopped." This is not always the case.
It looks buggered.
Indeed it does. What this means from a scripting programmer's
point-of-view is that you can't count on it. You can't rely on the job you
just launched being the "current job". Thus, you have to convert $! into a
job id, via something like:

jobs -l | awk $!' == $2 { print substr($0,2)+0 }'

From the bash developers' point of view, the question becomes: What specific
set of circumstances triggers this?

Note also that the underlying problem here is that, while most of the
job-related commands that take a "job spec" will accept either something like
%1 or an actual pid, the "fg" command only takes %n. So, if you want to fg
the most recent job, you need to obtain the job id (via the command line
above) before passing it to "fg". Note that "fg" with no arg at all would
fg the wrong job.
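
A slightly more defensive form of the one-liner above, as a sketch only: it
passes the pid to awk with -v instead of splicing $! into the program text,
and takes the job number from the first field. The function name job_of_pid
is made up for illustration, and the parsing assumes the "jobs -l" layout
shown earlier in the thread ("[N]+ PID State command").

```bash
#!/usr/bin/env bash

# Hypothetical helper (not a bash builtin): print the job number that
# "jobs -l" reports for the given pid; return non-zero if no job matches.
job_of_pid() {
    jobs -l | awk -v pid="$1" '
        $2 == pid { print substr($1, 2) + 0; found = 1 }
        END       { exit !found }
    '
}

sleep 30 &
pid=$!
if n=$(job_of_pid "$pid"); then
    echo "pid $pid is job %$n"
    kill "%$n"      # explicit jobspec; in an interactive shell: fg "%$n"
fi
```

The same "%$n" can be handed to fg, bg, wait or disown, so nothing depends
on which job the shell currently marks with "+".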
Post by Kaz Kylheku
It must be that Bash has no test cases covering the documented
requirements in this area adequate enough to catch what you have found.
Is this automatically tested at all?
Testing interactive job control features pretty much requires Bash to be
behind a pseudo-tty; driven by expect or something like it.
Indeed. Good point.
Post by Kaz Kylheku
(Or else at least a unit test is required where the function that
identifies the current job is tested in isolation, with the various
conditions mocked up: suspended job introduced while existing job is
stopped, etc.)
Yes.
--
Trump could say he invented gravity, and 40% of the country would believe him...
This is where we are at, ladies and gentlemen.
Kaz Kylheku
2024-10-04 14:29:39 UTC
Post by Kenny McCormack
Post by Kaz Kylheku
Post by Kenny McCormack
Note: This is a "How do things really work - in the real world?", rather
than a "What does the manual say about how things work?" sort of thread.
The manual says the answer is "The job most recently started in the
background or most recently stopped." This is not always the case.
It looks buggered.
Indeed it does. What this means from a scripting programmer's
point-of-view is that you can't count on it.
What? Scripting should not go anywhere near POSIX job control, which is
an interactive feature that requires a terminal session.

Unless you mean scripting that is peripheral to an interactive session,
for automating some interactive job control use cases? Like making
a more friendly job control system, or whatever?

If you'd like to build some kind of layer over job control, it looks as
if indeed you cannot rely on job control's implicit selection of the
most recent process. If you want your job control layer to have that,
you need your own global variable for it, and always pass down a %n
argument to the job control cruft below you.

Even if the current job variable worked reliably as documented, it would
be unreliable to you because any background job can become the current job
at any time, asynchronously to you, due to being suddenly stopped on a
signal.
Post by Kenny McCormack
You can't rely on the job you
just launched being the "current job".
This is what I'm saying: even if it works as documented. There is a race
condition. A fraction of a second after you launch the job, some
existing, executing background job tries to do TTY input and is stopped.
Oops! It is now the "current job".
Post by Kenny McCormack
Note also that the underlying problem here is that while most of the "job
related" commands that take a "job spec" will take either something like %1
or an actual pid, but the "fg" command only takes %n. So, if you want to
fg the most recent job, you need to obtain the job id (via the command line
above), before passing it to "fg". Note that "fg" with no arg at all would
fg the wrong job.
Yes; so if you're writing your own cruft on top of job control, it's
probably a good idea to never call anything below without a %n argument.
Kenny McCormack
2024-10-04 15:57:04 UTC
In article <***@kylheku.com>,
Kaz Kylheku <643-408-***@kylheku.com> mysteriously wrote:
...
Post by Kaz Kylheku
What? Scripting should not go anywhere near POSIX job control, which is
an interactive feature that requires a terminal session.
I obviously don't want to get into a pissing match over this, but you need
to expand your mind. Scripting comes in many different forms and you
should not be so narrow-minded.

That is all.
--
When someone tells me he/she is a Christian I check to see if I'm
still in possession of my wallet.
Kaz Kylheku
2024-10-05 02:17:21 UTC
Post by Kenny McCormack
...
Post by Kaz Kylheku
What? Scripting should not go anywhere near POSIX job control, which is
an interactive feature that requires a terminal session.
I obviously don't want to get into a pissing match over this, but you need
to expand your mind.
Compared to who, you?
Christian Weisgerber
2024-10-04 17:49:32 UTC
Post by Kaz Kylheku
What? Scripting should not go anywhere near POSIX job control, which is
an interactive feature that requires a terminal session.
Well, there _is_ set -m.
--
Christian "naddy" Weisgerber ***@mips.inka.de
Janis Papanagnou
2024-10-04 19:42:12 UTC
Post by Christian Weisgerber
Post by Kaz Kylheku
What? Scripting should not go anywhere near POSIX job control, which is
an interactive feature that requires a terminal session.
Well, there _is_ set -m.
And how will that devalue what Kaz has said? Please elaborate.

See also Bolsky/Korn (chapter "Job Control") about some details
on implicit and explicit activation, implementation dependencies,
and use of option 'monitor' (-m) [for interactive invocations]
for systems that don't support ("complete") job control.

If you have other information (facts, rationales, or insights),
or if you know of any useful and sensible application contexts
for non-interactive usages I'd certainly be curious to know.[*]

Janis

[*] The job-control "layering" that Kaz mentioned was the only
thing that appeared somewhat obvious to me. I also don't expect
any insights from the OP (who obviously was in insult-mode), so
feel free to jump in.
Christian Weisgerber
2024-10-04 22:48:15 UTC
Post by Janis Papanagnou
Post by Christian Weisgerber
Post by Kaz Kylheku
What? Scripting should not go anywhere near POSIX job control, which is
an interactive feature that requires a terminal session.
Well, there _is_ set -m.
And how will that devaluate what Kaz has said? Please elaborate.
Job control does not require an interactive shell or a terminal
session. It can be used in scripting. That's the facts.
Post by Janis Papanagnou
or if you know of any useful and sensible application contexts
for non-interactive usages I'd certainly be curious to know.[*]
I'm curious myself. That said, here's something I stumbled across
recently:

background job &
...
kill %1 # clean up

What happens if the background job has already terminated of its
own accord before we reach the kill(1)? Not much, because with job
control, the shell knows that no such job exists. If you do this
with "kill $!", you signal that PID, which no longer refers to the
intended process and may in fact have been reused for a different
process.
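
One way to get a similar effect without hard-coding %1 is to ask the shell
which of its jobs it still considers running before signalling anything. A
rough sketch, assuming bash ("jobs -pr" prints the pids of running jobs); the
name kill_if_running is invented here, and a small window between the check
and the kill remains.

```bash
#!/usr/bin/env bash

# Hypothetical helper: send SIGTERM to pid only if the shell still lists it
# among its running jobs; return 1 without signalling anything otherwise.
# Assumes the job is a single command, so that $! is the pid "jobs -p" prints.
kill_if_running() {
    local pid=$1 p
    for p in $(jobs -pr); do
        if [ "$p" = "$pid" ]; then
            kill "$pid"
            return
        fi
    done
    return 1
}

sleep 30 &
pid=$!
# ... other work ...
kill_if_running "$pid" || echo "job already gone, nothing signalled"
```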
--
Christian "naddy" Weisgerber ***@mips.inka.de
Kaz Kylheku
2024-10-05 02:23:09 UTC
Post by Christian Weisgerber
Post by Janis Papanagnou
Post by Christian Weisgerber
Post by Kaz Kylheku
What? Scripting should not go anywhere near POSIX job control, which is
an interactive feature that requires a terminal session.
Well, there _is_ set -m.
And how will that devaluate what Kaz has said? Please elaborate.
Job control does not require an interactive shell or a terminal
session.
It can be used in scripting. That's the facts.
An example of what you mean would help.
Kaz Kylheku
2024-10-10 19:15:08 UTC
Post by Kaz Kylheku
Post by Christian Weisgerber
Post by Janis Papanagnou
Post by Christian Weisgerber
Post by Kaz Kylheku
What? Scripting should not go anywhere near POSIX job control, which is
an interactive feature that requires a terminal session.
Well, there _is_ set -m.
And how will that devaluate what Kaz has said? Please elaborate.
Job control does not require an interactive shell or a terminal
session.
It can be used in scripting. That's the facts.
An example of what you mean would help.
*crickets*
Keith Thompson
2024-10-05 06:10:00 UTC
Christian Weisgerber <***@mips.inka.de> writes:
[...]
Post by Christian Weisgerber
I'm curious myself. That said, here's something I stumbled across
background job &
...
kill %1 # clean up
What happens if the background job has already terminated on its
own accord before we reach the kill(1)? Not much, because with job
control, the shell knows that no such job exists. If you do this
with "kill $!", you signal that PID, which no longer refers to the
intended process and may in fact have been reused for a different
process.
That's possible, but in my experience the system avoids reusing PIDs
of dead jobs for as long as possible. On typical Linux systems, the
max PID is about 2**22, and newly allocated PIDs cycle through that
range. (If you're using all the PIDs in that range simultaneously,
you've got bigger problems than running out of PIDs.)

I once ran a test to see how quickly I could make the system reuse
a PID, by forking and terminating as many processes as I could as
quickly as I could. I don't remember the details, but I think it
took the better part of an hour.

On the other hand, I think a contrived pattern of process creation
and termination could result in a PID being reused quickly. I might
try the experiment again.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
Kenny McCormack
2024-10-05 15:02:45 UTC
In article <***@lorvorc.mips.inka.de>,
Christian Weisgerber <***@mips.inka.de> wrote:
...
Post by Christian Weisgerber
Job control does not require an interactive shell or a terminal
session. It can be used in scripting. That's the facts.
True. But as they say, there are none so blind as those that will not see.
Post by Christian Weisgerber
I'm curious myself. That said, here's something I stumbled across
background job &
...
kill %1 # clean up
What happens if the background job has already terminated on its
own accord before we reach the kill(1)? Not much, because with job
control, the shell knows that no such job exists. If you do this
with "kill $!", you signal that PID, which no longer refers to the
intended process and may in fact have been reused for a different
process.
The problem of re-used pids is something people frequently worry about, but
which is (for all practical purposes) never seen in real life. For one
thing, even in the old days of 15 bit pids, it is still basically
impossible for it to cycle all the way through in any sort of reasonable
time frame. Nowadays, we have 22 bit pids, so it is even less likely (*).

Some other notes about this:
1) As far as I know, all "normal" Unixes use the simple cycle method of
allocating pids - i.e., just keep going up by 1 until you reach the max,
then start over again at 1 (or 2). But I think at one point, it was
thought that having "predictable" pids was somehow bad for "security",
so they had a random assignment method.
2) Other non-Unix, but Unix-like, environments, such as Windows, treat
pids differently. I think Windows aggressively re-uses them, so one
probably needs to be more careful there than in regular Unix/Linux.
3) As I said, this is more of a problem in theory than in practice, but
the pidfd*() functions were inspired by a perceived need for being able
to be sure.

(*) Actually this kinda begs the question, though, why 22 bits? Why not all 32?
Or 64? Incidentally, there are comments in the kernel to the effect of "22
bits has to be enough; 4 million pids should be enough for anyone" (just
like 640K, I suppose...)
--
Kenny, I'll ask you to stop using quotes of mine as taglines.

- Rick C Hodgin -
Christian Weisgerber
2024-10-05 15:38:42 UTC
Post by Kenny McCormack
The problem of re-used pids is something people frequently worry about, but
which is (for all practical purposes) never seen in real life. For one
thing, even in the old days of 15 bit pids, it is still basically
impossible for it to cycle all the way through in any sort of reasonable
time frame. Nowadays, we have 22 bit pids, so it is even less likely (*).
"We" do? Offhand, I don't know the size of pid_t, much less how
much of its numerical range is actually used. There are trivial
concerns, such as how many columns PIDs take up in the output of
ps(1).
Post by Kenny McCormack
1) As far as I know, all "normal" Unixes use the simple cycle method of
allocating pids - i.e., just keep going up by 1 until you reach the max,
then start over again at 1 (or 2). But I think at one point, it was
thought that having "predictable" pids was somehow bad for "security",
so they had a random assignment method.
I thought random assignment of PIDs was standard by now.
Okay, on FreeBSD it isn't but can be enabled.
--
Christian "naddy" Weisgerber ***@mips.inka.de
Kenny McCormack
2024-10-05 17:13:02 UTC
In article <***@lorvorc.mips.inka.de>,
Christian Weisgerber <***@mips.inka.de> wrote:
...
Post by Christian Weisgerber
Post by Kenny McCormack
time frame. Nowadays, we have 22 bit pids, so it is even less likely (*).
"We" do? Offhand, I don't know the size of pid_t, much less how
much of its numerical range is actually used.
On Linux, check out: /proc/sys/kernel/pid_max

As I read the various posts on the subject, 15 bit is still the limit on 32
bit systems, 22 bit on 64 bit systems. But, of course, YMMV.

Also, the size of pid_t isn't dispositive. Obviously, it has to be >= the
maximum pid, but the max pid might be (and usually is) less than "pid_t MAX".
On the system I just tested on, pid_t is the same as int, which is 31 bits
(not counting the sign bit). Note that pid_t does have to be a signed
type, since some of the functions (e.g., kill(2)) can take negative pids.
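
To see where a given machine stands, something like the following can be
used. It is only a sketch: the pid_bits helper is a made-up name, and the
/proc path is Linux-specific. Since pids at or above pid_max are not handed
out, the helper counts the bits needed for pid_max - 1.

```bash
#!/usr/bin/env bash

# Hypothetical helper: number of bits needed to represent pids below pid_max.
pid_bits() {
    local n=$(( $1 - 1 )) bits=0
    while [ "$n" -gt 0 ]; do
        n=$(( n >> 1 ))
        bits=$(( bits + 1 ))
    done
    echo "$bits"
}

f=/proc/sys/kernel/pid_max      # Linux-specific
if [ -r "$f" ]; then
    pid_max=$(cat "$f")
    echo "pid_max=$pid_max ($(pid_bits "$pid_max") bits)"
else
    echo "$f not readable; probably not Linux"
fi
```

With the two values mentioned in this thread, pid_bits 32768 gives 15 and
pid_bits 4194304 gives 22.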
Post by Christian Weisgerber
There are trivial concerns, such as how many columns PIDs take up in the
output of ps(1).
Heh. Yeah, I've noticed that on systems with large pids (7 digits), ps
listings sometimes look a little funky.
Post by Christian Weisgerber
I thought random assignment of PIDs was standard by now.
I'm pretty sure that on all of my systems, it is still incremental.
(Maybe we have different understandings of what the word "random" means)

But it is a wide, wide world. Maybe things are different outside of my
little corner of it.
--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/InsaneParty
Janis Papanagnou
2024-10-07 02:58:56 UTC
Post by Kenny McCormack
...
Post by Christian Weisgerber
Post by Kenny McCormack
time frame. Nowadays, we have 22 bit pids, so it is even less likely (*).
"We" do? Offhand, I don't know the size of pid_t, much less how
much of its numerical range is actually used.
On Linux, check out: /proc/sys/kernel/pid_max
As I read the various posts on the subject, 15 bit is still the limit on 32
bit systems, 22 bit on 64 bit systems. But, of course, YMMV.
Hmm.. - my [a bit rusty] 64 bit Linux displays 32768 (15 bit).

(Not that I have ever needed more than a fraction of these;
currently "only" about 2%.)

Janis
Post by Kenny McCormack
[...]
marrgol
2024-10-07 10:22:44 UTC
Post by Janis Papanagnou
Post by Kenny McCormack
On Linux, check out: /proc/sys/kernel/pid_max
As I read the various posts on the subject, 15 bit is still the limit on 32
bit systems, 22 bit on 64 bit systems. But, of course, YMMV.
Hmm.. - my [a bit rusty] 64 bit Linux displays 32768 (15 bit).
32768 is just the kernel's default; it can be changed using sysctl
or by writing directly to the file mentioned above.
Kenny McCormack
2024-10-07 11:53:42 UTC
Post by Janis Papanagnou
Post by Kenny McCormack
...
Post by Christian Weisgerber
Post by Kenny McCormack
time frame. Nowadays, we have 22 bit pids, so it is even less likely (*).
"We" do? Offhand, I don't know the size of pid_t, much less how
much of its numerical range is actually used.
On Linux, check out: /proc/sys/kernel/pid_max
As I read the various posts on the subject, 15 bit is still the limit on 32
bit systems, 22 bit on 64 bit systems. But, of course, YMMV.
Hmm.. - my [a bit rusty] 64 bit Linux displays 32768 (15 bit).
As another poster has already noted, 15 bits is still the default, but the
limits are as shown above. Presumably, this can be set in some startup
file (which seems to be the case on the machine I am typing on now).

Personally, I don't really see the point (in increasing the max pid #).
For practical purposes, it seems unlikely you'll ever need it (*), but then
again, maybe there really are big systems out there that need to be running
more than 32K processes at a time.

(*) And, as noted earlier, having big pid numbers makes "ps" listings weird.
--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/Noam
Janis Papanagnou
2024-10-07 12:42:17 UTC
Post by Kenny McCormack
[...]
Personally, I don't really see the point (in increasing the max pid #).
For practical purposes, it seems unlikely you'll ever need it (*), but then
again, maybe there really are big systems out there that need to be running
more than 32K processes at a time.
Yes, any [multi-user] server system will likely need larger limits
(nowadays). I forgot that my (de facto) "single-user" system is in
principle able to serve hundreds of users (or more) and must be
able to handle all their processes.
Post by Kenny McCormack
(*) And, as noted earlier, having big pid numbers makes "ps" listings weird.
IMO the 'ps' options and output are a pain anyway. I wonder
why I haven't yet written my own ps-frontend (based on option '-o');
probably because I don't use it often, only in case of problems.

Janis
Richard Kettlewell
2024-10-07 13:31:51 UTC
Post by Janis Papanagnou
Post by Kenny McCormack
Personally, I don't really see the point (in increasing the max pid
#). For practical purposes, it seems unlikely you'll ever need it
(*), but then again, maybe there really are big systems out there
that need to be running more than 32K processes at a time.
Yes, any [multi-user] server system will likely need larger limits
(nowadays). I forgot that my (de facto) "single-user" system is in
principle able to serve hundreds of users (or more) and must be
able to handle all their processes.
My computer has a baseline of about 360 PIDs, rising to about 430 after
logging in and starting a few typical applications. 15-bit PIDs could
handle a hundred of me with plenty of room to spare.

_On Linux_ process IDs and thread IDs share the same number space which
changes the picture quite a bit: 15 bits would only support a handful of
users (with my profile) since some of those typical applications create
many threads.

All this assumes desktop users (a system serving terminal-only users
might have a lot more headroom) and that no other resource runs out
first.
--
https://www.greenend.org.uk/rjk/
Janis Papanagnou
2024-10-07 16:29:00 UTC
[ This is getting off-topic, sorry. ]
Post by Richard Kettlewell
_On Linux_ process IDs and thread IDs share the same number space which
changes the picture quite a bit: [...]
This is interesting.
Since processes are handled by the OS kernel what does that imply...?
A common process/thread interface in Linux?
Is that defined by POSIX threads, or is it something specific?

Is there any good link to read more about that?

(My thread times have long passed; I used it from C++, and there were
a lot of things to consider when programming with threads back then.)

Janis
Kenny McCormack
2024-10-07 16:41:49 UTC
Post by Janis Papanagnou
[ This is getting off-topic, sorry. ]
Post by Richard Kettlewell
_On Linux_ process IDs and thread IDs share the same number space which
changes the picture quite a bit: [...]
This is interesting.
Since processes are handled by the OS kernel what does that imply...?
A common process/thread interface in Linux?
Is that defined by POSIX threads, or is it something specific?
Is there any good link to read more about that?
"man clone" is a good starting place.

In Linux, threads are almost the same thing as processes. Obviously, there
are differences, but basically, that is true. Both are created via clone(2).
(fork() is implemented on top of clone()).
--
Post by Janis Papanagnou
No, I haven't, that's why I'm asking questions. If you won't help me,
why don't you just go find your lost manhood elsewhere.
CLC in a nutshell.
Lew Pitcher
2024-10-07 16:47:26 UTC
Post by Janis Papanagnou
[ This is getting off-topic, sorry. ]
Post by Richard Kettlewell
_On Linux_ process IDs and thread IDs share the same number space which
changes the picture quite a bit: [...]
This is interesting.
Since processes are handled by the OS kernel what does that imply...?
A common process/thread interface in Linux?
Exactly.
In the early 2000's, the Linux kernel moved to supporting 1:1 threads,
and provided the NPTL ("Native Posix Threading Library") to provide the
POSIX application-level API to this new kernel capability.
Post by Janis Papanagnou
Is that defined by POSIX threads, or is it something specific?
It is how Linux implements the kernel-level responsibilities of POSIX
threads.
Post by Janis Papanagnou
Is there any good link to read more about that?
Plenty. Google "NPTL and Linux"
Post by Janis Papanagnou
(My thread times have long passed; I used it from C++, and there were
a lot of things to consider when programming with threads back then.)
Janis
--
Lew Pitcher
"In Skills We Trust"
Lawrence D'Oliveiro
2024-10-07 20:23:17 UTC
Post by Janis Papanagnou
Since processes are handled by the OS kernel what does that imply...?
A common process/thread interface in Linux?
Both fork(2) and POSIX thread creation are essentially wrappers around the
underlying Linux-specific clone(2) call.

<https://manpages.debian.org/2/clone.2.en.html>

You’ll notice there are a lot more options in there besides creating pure
POSIX-style processes and pure POSIX-style threads.
Kaz Kylheku
2024-10-05 23:41:35 UTC
Post by Kenny McCormack
...
Post by Christian Weisgerber
Job control does not require an interactive shell or a terminal
session. It can be used in scripting. That's the facts.
True. But as they say, there are none so blind as those that will not see.
"They" being mainly billionaire televangelists.
Janis Papanagnou
2024-10-07 02:50:31 UTC
Post by Christian Weisgerber
Post by Janis Papanagnou
Post by Christian Weisgerber
Post by Kaz Kylheku
What? Scripting should not go anywhere near POSIX job control, which is
an interactive feature that requires a terminal session.
Well, there _is_ set -m.
And how will that devaluate what Kaz has said? Please elaborate.
Job control does not require an interactive shell or a terminal
session. It can be used in scripting. That's the facts.
Yes, but for one thing that doesn't explain why you emphasized 'set -m',
and your example below - certainly reasonable for discussion! - I
don't find convincing. In contrast to '$!', which you get and can work
with, there's no (no easy?) way to obtain the job number that the
shell assigns! And (concerning your question below) you always
have 'wait' available, for both PIDs and job numbers (at least
in Kornshell; don't know about Bash or what POSIX says about it).
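
For what it's worth, bash documents "wait" as taking either a process ID or
a job specification. A small sketch, assuming bash; %% (the current job) is
only safe here because exactly one job is alive at that point, which is the
very thing this thread says not to lean on in general.

```bash
#!/usr/bin/env bash

# Wait by PID and collect the exit status.
( sleep 1; exit 3 ) &
pid=$!
status_by_pid=0
wait "$pid" || status_by_pid=$?

# Wait by job specification; %% names the current job, which is unambiguous
# here because no other job is running.
( sleep 1; exit 4 ) &
status_by_job=0
wait %% || status_by_job=$?

echo "by PID: $status_by_pid, by jobspec: $status_by_job"
```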

Janis
Post by Christian Weisgerber
Post by Janis Papanagnou
or if you know of any useful and sensible application contexts
for non-interactive usages I'd certainly be curious to know.[*]
I'm curious myself. That said, here's something I stumbled across
background job &
...
kill %1 # clean up
What happens if the background job has already terminated on its
own accord before we reach the kill(1)? Not much, because with job
control, the shell knows that no such job exists. If you do this
with "kill $!", you signal that PID, which no longer refers to the
intended process and may in fact have been reused for a different
process.
Kenny McCormack
2024-10-07 11:48:06 UTC
In article <vdvi9o$1imug$***@dont-email.me>,
Janis Papanagnou <janis_papanagnou+***@hotmail.com> wrote:
...
Post by Janis Papanagnou
In contrast to '$!' that you get and work
with there's no (no easy?) way to obtain the job number that the
shell assigns!
I showed a method in an earlier post; it consists of piping the output of
"jobs -l" into an AWK script (that matches on $!). It isn't pretty, but it
works.
Post by Janis Papanagnou
And (for concerning your question below) you have
alway 'wait' available, for both, PIDs or job numbers (at least
in Kornshell; don't know about Bash or what POSIX says about it).
What annoys me is that (in bash), most, but not all, of the job control
related commands take either a pid or a job number. To be clear, what
annoys me is that they don't *all* do so. In particular, "fg" only takes a
job number. "disown" takes either, which is a very good thing. Wish they
all did.
--
If Jeb is Charlie Brown kicking a football-pulled-away, Mitt is a '50s
housewife with a black eye who insists to her friends the roast wasn't
dry.
Janis Papanagnou
2024-10-07 12:54:54 UTC
Post by Kenny McCormack
What annoys me is that (in bash), most, but not all, of the job control
related commands take either a pid or a job number. To be clear, what
annoys me is that they don't *all* do. In particular, "fg" only takes a
job number. "disown" takes either, which is a very good thing. Wish they
all did.
I think that shell's job control purpose is to make job handling
simpler than using PIDs (even though PIDs are also displayed when
a background job gets started). But, yes, a consistent interface
accepting both would be a good thing [for all shell's job control
commands]. Incidentally Bolsky/Korn notes: "When a command in this
section [Job Control] takes an argument called /job/, /job/ can be
a process id." - I don't know about Bash, but Kornshell at least
seems to have done it right.

Janis
Kenny McCormack
2024-10-07 13:19:17 UTC
In article <ve0lmv$1nkmd$***@dont-email.me>,
Janis Papanagnou <janis_papanagnou+***@hotmail.com> wrote:
...
Post by Janis Papanagnou
Incidentally Bolky/Korn notes: "When a command in this
section [Job Control] takes an argument called /job/, /job/ can be
a process id." - I don't know about Bash, but Kornshell at least
seems to have done it right.
FWIW, I just tested with the "ksh" on this system, and fg does accept a pid
argument.

So, yes, ksh seems to have gotten it right. Note that ksh seems to have
lots of different versions and forks, so I have no idea what exactly I was
testing. I assume you could test on whatever version you normally use.

So, it sounds like I actually have two possible bugs to report to the bash
maintainers:
1) The original thread topic - why the most recently launched job
doesn't (always) become the "current job" (under certain
circumstances). The problem is I haven't really identified what those
circumstances are.

2) Why fg doesn't take a pid arg. Note that a lot of the bash design
is based on features originally implemented in ksh and so they do pay
attention to how ksh does things.
--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/CLCtopics
Janis Papanagnou
2024-10-07 16:12:40 UTC
Post by Kenny McCormack
...
Incidentally Bolsky/Korn notes: "When a command in this
section [Job Control] takes an argument called /job/, /job/ can be
Correction: the chapter was "Operating System - Job Control" (and not
the also existing chapter "Job Control").
Post by Kenny McCormack
a process id." - I don't know about Bash, but Kornshell at least
seems to have done it right.
FWIW, I just tested with the "ksh" on this system, and fg does accept a pid
argument.
So, yes, ksh seems to have gotten it right. Note that ksh seems to have
lots of different versions and forks, so I have no idea what exactly I was
testing.
Newer versions support ksh --version to get that information.
I'm used to typing <Esc> <Ctrl-V> (in vi mode, with set -o vi
enabled).
Post by Kenny McCormack
I assume you could test on whatever version you normally use.
The book I mentioned is old, so I expect any ksh93 version
(and all derived versions) to behave that way. Moreover, there's
also not the typical remark about "availability in newer systems
only", so there's a good chance that it was already existing in
the ksh88 versions - I wouldn't bet on any early clone, though.

Janis
Helmut Waitzmann
2024-10-07 17:26:22 UTC
Post by Christian Weisgerber
background job &
...
kill %1 # clean up
What happens if the background job has already terminated of its
own accord before we reach the kill(1)? Not much, because with job
control, the shell knows that no such job exists. If you do this
with "kill $!", you signal that PID, which no longer refers to the
intended process and may in fact have been reused for a different
process.
In order for the pid "$!" to have been reused for a different
process, the shell would have needed to call "wait()" (or
"waitpid()") beforehand. (Otherwise the terminated process would
remain a zombie, i.e. an unwaited-for process.) Does the shell even
call "wait()" or "waitpid()" if given the "set" option "+b"?
Richard Harnden
2024-10-07 22:32:58 UTC
Post by Christian Weisgerber
I'm curious myself. That said, here's something I stumbled across
background job &
...
kill %1 # clean up
What happens if the background job has already terminated of its
own accord before we reach the kill(1)? Not much, because with job
control, the shell knows that no such job exists. If you do this
with "kill $!", you signal that PID, which no longer refers to the
intended process and may in fact have been reused for a different
process.
There would have to be a very long time in your '...' part for a pid to
get reused. I guess you could ps | grep and check the command name is
what you expect.

You can 'kill -0 <pid>' (or %<job>) for 'could I signal <pid>', rather
than actually sending a signal - it returns non-zero if it can't.

If the job finishes before you wait then the process is gone, ie not
zombied, so ksh/bash must keep track of the wait status. The job is also
gone, so only wait <pid> will get you the correct exit status.
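
A minimal sketch of the two points above, assuming a short-lived job and an arbitrary exit status of 7. The outcome of the 'kill -0' probe depends on whether the shell has already reaped the child, so the sketch only reports it; the exit status from 'wait <pid>' is the part that can be relied on.

```sh
# A job that exits almost immediately with a known status (7 is arbitrary).
(exit 7) &
pid=$!

sleep 1    # by now the job has finished

# Probe without sending a signal. Whether this succeeds depends on whether
# the shell has already reaped the child, so the result is only reported.
if kill -0 "$pid" 2>/dev/null; then
    probe="still signalable"
else
    probe="gone"
fi

# The shell remembers the exit status for this PID until it is waited for.
status=0
wait "$pid" || status=$?

echo "probe: $probe, exit status: $status"
```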
Kaz Kylheku
2024-10-08 17:37:20 UTC
Post by Christian Weisgerber
What happens if the background job has already terminated of its
own accord before we reach the kill(1)? Not much, because with job
control, the shell knows that no such job exists.
In Unix, when a child process terminates, it does not go away. The parent
process has to call one of the wait functions like waitpid in order to "reap"
that process. It can be notified of children in this state asynchronously via
the SIGCHLD signal.

The problem of PIDs suddenly disappearing and being recycled behind the parent
process' back does not exist in the operating system.

We can imagine a shell which does nothing when a child coprocess launched with
& terminates spontaneously, so that the script /must/ use the wait command.

In that shell, the process ID of that child will remain reliably available
until that wait.

Only if the shell reaps terminated coprocesses behind the script's back, so to
speak, do you have the reuse problem.

What does POSIX say? Something between those two alternatives:

When an element of an asynchronous list (the portion of the list ended
by an <ampersand>, such as command1, above) is started by the shell, the
process ID of the last command in the asynchronous list element shall
become known in the current shell execution environment; see Shell
Execution Environment. This process ID shall remain known until:

The command terminates and the application waits for the process ID.

Another asynchronous list is invoked before "$!" (corresponding to
the previous asynchronous list) is expanded in the current execution
environment.

The implementation need not retain more than the {CHILD_MAX} most
recent entries in its list of known process IDs in the current shell
execution environment.

It seems as if what POSIX is saying is that scripts which fire off
asynchronous jobs one after another receive automatic clean up.
A script which does not refer to the $! variable from one ampersand
job, before firing off more ampersand jobs, will not clog the system
with uncollected zombie processes. But a script which does reference $!
after launching an ampersand job (before launching another one) will not
have that process cleaned up behind its back: it takes on the responsibility
for doing the wait which recycles the PID.

Anyway, that's what I'd like to believe that the quoted passage means.
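
Under that reading, a script that saves "$!" after each ampersand job keeps every one of those process IDs "known" and can collect each exit status later, in any order. A small sketch of the pattern; the variable names and the exit codes 3, 5 and 0 are arbitrary choices for illustration:

```sh
# Start three jobs that finish in a different order than they were started.
# Each $! is expanded before the next asynchronous list is invoked.
(sleep 1; exit 3) &
pid_a=$!
(exit 5) &
pid_b=$!
(exit 0) &
pid_c=$!

# Waiting on a specific PID returns that job's exit status, even if the
# job terminated long before the wait.
status_a=0; wait "$pid_a" || status_a=$?
status_b=0; wait "$pid_b" || status_b=$?
status_c=0; wait "$pid_c" || status_c=$?

echo "a=$status_a b=$status_b c=$status_c"
```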
Post by Christian Weisgerber
If you do this
with "kill $!", you signal that PID, which no longer refers to the
intended process and may in fact have been reused for a different
process.
At the system call level, that's not what kill means. It means to
pass a certain signal (fatal or not, catchable or not) to the process.
Even if the signal is uncatchable and fatal, kill does not mean
"make the target process disappear, so that its PID may be reused".

The waitpid system call will do that (in the situation when the process
is a zombie, and so subsequently the returned status indicates that
it exited, and with what exit code or on what signal).
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca