Discussion:
Which shell and how to get started handling arguments
James Harris
2024-04-15 12:22:14 UTC
For someone who is relatively new to Unix shell scripting (me) some
advice would be more than welcome on where to begin.

I have two main queries:


Q1) How can one write a script which is maximally compatible with
different systems?

I am thinking to write in /the language of/ the Bourne shell, if
feasible, so that it could be run by either the Bourne shell or Bash,
etc? (Ideally, the shebang line would be #!/bin/sh.)

Or is Bash now so universal that there's no point any longer in writing
for anything else?


Q2) How does one go about handling arguments in preferably a simple but
universal way?

My first idea was to iterate over arguments with such as

while [ $# -gt 0 ]
do
...
shift
done;

and from that to (a) note any switches and (b) build up an array of
positional parameters. However, I gather the Bourne shell has no arrays
(other than the parameters themselves) so that won't work.

I read up on getopts but from tests it seems to require that switches
precede arguments rather than allowing them to be specified after, so
that doesn't seem very good, either.


Online tutorials show different ways to handle this and few talk about
which shell to use for this case so I thought I would ask you guys for
suggestions.

My requirement just now is, in fact, so simple that I don't need a
universal way to handle things but ISTM best to start with an approach
that will scale over time, if there is one.

So any guidance on how to get started would be appreciated!
--
James Harris
Janis Papanagnou
2024-04-15 14:03:34 UTC
Post by James Harris
For someone who is relatively new to Unix shell scripting (me) some
advice would be more than welcome on where to begin.
Q1) How can one write a script which is maximally compatible with
different systems?
There are various grades of "portable". These days I would - when
striving for portability - use the _POSIX_ features as the base of
your programming. That means: not Bourne shell. If you don't
want to learn what's defined in POSIX you should probably use a
shell that most closely resembles the POSIX subset, maybe 'dash'.
Or get the book from Bolsky/Korn that has an appendix with the
feature comparisons Bourne/POSIX/ksh88/ksh93/...

Personally, I have a less restricted view on "portability". I
don't want to miss the modern features, specifically those that
can all be found in the prominent shells (ksh, zsh, bash). From
those I'd pick what will be available in your systems' contexts.
On Linux you usually have all these shells available, but in a
commercial context (from those three shells) you may have only
ksh available. One "problem" with those shells is that each
provides its own features that the other two don't support. So
either you'll have to spend some time learning the differences
or, when you write your scripts, run them through all three shells
and let the shells tell you what they don't accept.
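
As a rough first pass at running a script through several shells, something like the following sketch could be used. The function name check_shells and the list of shells are assumptions made for illustration. Note that -n only parses the script without running it, so it catches syntax that a shell rejects but not differences in run-time behavior.

```shell
# check_shells SCRIPT: syntax-check SCRIPT with each installed shell.
# The name and the shell list are assumptions; adjust them as needed.
check_shells() {
    script=$1
    status=0
    for shell in sh dash bash ksh zsh
    do
        # Skip shells that are not installed on this system.
        command -v "$shell" >/dev/null 2>&1 || continue
        if "$shell" -n "$script" 2>/dev/null
        then
            echo "$shell: syntax ok"
        else
            echo "$shell: syntax error"
            status=1
        fi
    done
    return "$status"
}
```

For example, a script containing an array assignment such as a=( "$@" ) would typically be reported as a syntax error by dash while bash, ksh and zsh accept it.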

There are other factors, such as execution speed (ksh), consistent new
[non-standard] concepts (zsh), and a large community (bash), that may
influence your decision.

Personally I use Kornshell which has the richest feature set and
is the fastest; specifically Martijn Dekker's branch (of the
original AT&T) "ksh93u+m". I try to use mostly features also
available in bash and zsh, but wouldn't take that too strictly.
Post by James Harris
I am thinking to write in /the language of/ the Bourne shell, if
feasible, so that it could be run by either the Bourne shell or Bash,
etc? (Ideally, the shebang line would be #!/bin/sh.)
Usually "/bin/sh" nowadays refers to a POSIX shell.

But the '#!' line's purpose is to define any available interpreter.
Just be sure that you specify the shell language used whenever
deviating from POSIX features.
Post by James Harris
Or is Bash now so universal that there's no point any longer in writing
for anything else?
See above.
Post by James Harris
Q2) How does one go about handling arguments in preferably a simple but
universal way?
My first idea was to iterate over arguments with such as
while [ $# -gt 0 ]
do
...
shift
done;
and from that to (a) note any switches and (b) build up an array of
positional parameters. However, I gather the Bourne shell has no arrays
(other than the parameters themselves) so that won't work.
Arrays can be populated by the argument list in one go, e.g. by

a=( "$@" )

(but that may not be what you want).
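
In a shell without arrays, the positional parameters themselves can serve as the list. Here is a minimal sketch, assuming a hypothetical helper named keep_operands and assuming that every word starting with "-" is a switch. Each argument is shifted off the front and operands are re-appended at the back with set --, so after one full pass "$@" holds only the operands, in their original order.

```shell
# keep_operands is a hypothetical name; treating every "-*" word as a
# switch is an assumption made for this sketch.
keep_operands() {
    n=$#                         # number of arguments still to examine
    while [ "$n" -gt 0 ]
    do
        arg=$1
        shift
        case $arg in
            -*) echo "switch: $arg" ;;
            *)  set -- "$@" "$arg" ;;   # re-append the operand at the end
        esac
        n=$((n - 1))
    done
    # "$@" now holds only the operands, in their original order.
    for operand in "$@"
    do
        echo "operand: $operand"
    done
}

keep_operands -x "a b" -y c
```

This prints the two switches first, then "operand: a b" and "operand: c". Because the operands stay in "$@", arguments containing spaces or newlines survive intact.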
Post by James Harris
I read up on getopts
It's the right tool.
Post by James Harris
but from tests it seems to require that switches
precede arguments rather than allowing them to be specified after, so
that doesn't seem very good, either.
That's the usual convention, first come the options (with optional
arguments), then the non-option arguments. Here's a syntax example

yagol [-s] [-w width] [-h height] [-g[ngen]] [-d density]
[-i infile] [-o outfile] [-r random-seed] [-u rule]
[-k|-t[sec]|-l|-f] [-p|-n|-c] [-a[gen]] [-m[rate]]

(this one just with options and no further arguments).

Of course it uses ksh's getopts - but note that ksh's getopts is
not portable - because it simplifies processing a lot (and it also
implies a usage and help information).
Post by James Harris
Online tutorials show different ways to handle this and few talk about
which shell to use for this case so I thought I would ask you guys for
suggestions.
My clear getopts favorite (but not only for this feature) is the
original AT&T Kornshell (in form of above mentioned "u+m" version).
Post by James Harris
My requirement just now is, in fact, so simple that I don't need a
universal way to handle things but ISTM best to start with an approach
that will scale over time, if there is one.
This is an excellent thought; it's very typical that you start
with trivial samples, and then (own and foreign) demands come up
to extend it, and at some point you have to do some refactoring
(unless you started with a more general approach). (Above yagol
example also started with only four options.)
Post by James Harris
So any guidance on how to get started would be appreciated!
Hope it helps. - Feel free to come back with more questions.

Janis
Janis Papanagnou
2024-04-15 14:34:41 UTC
Post by Janis Papanagnou
Post by James Harris
I read up on getopts
It's the right tool.
Post by James Harris
but from tests it seems to require that switches
precede arguments rather than allowing them to be specified after, so
that doesn't seem very good, either.
That's the usual convention, first come the options (with optional
arguments), then the non-option arguments. Here's a syntax example
yagol [-s] [-w width] [-h height] [-g[ngen]] [-d density]
[-i infile] [-o outfile] [-r random-seed] [-u rule]
[-k|-t[sec]|-l|-f] [-p|-n|-c] [-a[gen]] [-m[rate]]
Please ignore that example (I forgot it's C code using 'getopt()'
from the GNU C library). The features differ in some ways.

For ksh's 'getopts', type 'getopts --man' in a ksh terminal to get
more extensive manual information than you find in 'man ksh'.

Janis
Christian Weisgerber
2024-04-15 13:35:40 UTC
Post by James Harris
Q1) How can one write a script which is maximally compatible with
different systems?
I am thinking to write in /the language of/ the Bourne shell, if
feasible, so that it could be run by either the Bourne shell or Bash,
etc? (Ideally, the shebang line would be #!/bin/sh.)
Yes. POSIX shell, more specifically. That is the easy part. The
difficult part is that your script will likely call various external
commands and those have a lot of variation as well.
Post by James Harris
Q2) How does one go about handling arguments in preferably a simple but
universal way?
That's too vague...
Post by James Harris
I read up on getopts
If you want to handle option flags, getopts is the way to go.
Post by James Harris
but from tests it seems to require that switches precede arguments
rather than allowing them to be specified after, so that doesn't
seem very good, either.
But that's the way Unix commands work. You cannot specify flags
after the first non-flag argument.

$ touch foo -l
$ ls foo -l
-l foo
$ ls -l foo -l
-rw-r--r-- 1 naddy naddy 0 Apr 15 15:28 -l
-rw-r--r-- 1 naddy naddy 0 Apr 15 15:28 foo

Apparently GNU implementations deviate from this, which makes for
a bad surprise and is incompatible with other implementations as
well as historical practice.
--
Christian "naddy" Weisgerber ***@mips.inka.de
Helmut Waitzmann
2024-04-15 21:03:52 UTC
Post by Christian Weisgerber
Post by James Harris
Q1) How can one write a script which is maximally compatible
with different systems?
I am thinking to write in /the language of/ the Bourne shell,
if feasible, so that it could be run by either the Bourne shell
or Bash, etc? (Ideally, the shebang line would be #!/bin/sh.)
Yes. POSIX shell, more specifically. That is the easy part. The
difficult part is that your script will likely call various external
commands and those have a lot of variation as well.
Post by James Harris
Q2) How does one go about handling arguments in preferably a
simple but universal way?
That's too vague...
Post by James Harris
I read up on getopts
If you want to handle option flags, getopts is the way to go.
Post by James Harris
but from tests it seems to require that switches precede
arguments rather than allowing them to be specified after, so
that doesn't seem very good, either.
But that's the way Unix commands work. You cannot specify flags
after the first non-flag argument.
$ touch foo -l
$ ls foo -l
-l foo
$ ls -l foo -l
-rw-r--r-- 1 naddy naddy 0 Apr 15 15:28 -l
-rw-r--r-- 1 naddy naddy 0 Apr 15 15:28 foo
Apparently GNU implementations deviate from this, which makes for
a bad surprise and is incompatible with other implementations as
well as historical practice.
To handle this deviation, always put an end‐of‐flags marker
("--", as specified by POSIX) before the first non‐flag argument;
then even the GNU implementations will behave well, i. e. behave
as specified by POSIX:


Compare (using GNU ls) with Christian's well‐behaving "ls":


$ touch -- foo -l

$ ls foo -l
-rw------- 1 helmut helmut 0 Apr 15 15:28 foo

which deviates from POSIX,



$ ls -l foo -l
-rw------- 1 helmut helmut 0 Apr 15 15:28 foo

which deviates from POSIX,



$ ls -- foo -l
-l
foo

which behaves as specified by POSIX,



$ ls -l -- foo -l
-rw------- 1 helmut helmut 0 Apr 15 15:28 -l
-rw------- 1 helmut helmut 0 Apr 15 15:28 foo

which behaves as specified by POSIX.
Kaz Kylheku
2024-04-16 01:14:38 UTC
Post by Helmut Waitzmann
touch -- foo -l
$ ls foo -l
-rw------- 1 helmut helmut 0 Apr 15 15:28 foo
touch deviates in the first place; omit the -- and you get

$ touch foo -l
touch: invalid option -- 'l'

That's crazy. foo is a non-option argument, so the options
have ended at that point.

I see where it is documented in "2 Common options" (Coreutils manual):

Normally options and operands can appear in any order, and programs act
as if all the options appear before any operands. For example, ‘sort -r
passwd -t :’ acts like ‘sort -r -t : passwd’, since ‘:’ is an
option-argument of -t. However, if the POSIXLY_CORRECT environment
variable is set, options must appear before operands, unless otherwise
specified for a particular command.

It is disingenuous to call it "POSIXly correct", because in fact the
POSIX rules are how everyone understands it and how other implementors
of utilities implement it. (Does anyone else do this crazy thing?)

If all the vendors feature a given extension, so that it is portable,
but POSIX refuses to adopt it, then, sure: the mode which takes the
extension away can be flippantly called "POSIXly correct".

Also the claim "options must appear before operands [in POSIX]" is
misleading, because "must" is usually interpreted as an imposed
requirement, which can be violated and diagnosed. But in fact it is
*logically* impossible for options to appear elsewhere because arguments
that look like options placed in the non-option part of the command line
are operands. It's the logical "must", not the requirements "must".
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
Helmut Waitzmann
2024-04-16 20:23:46 UTC
Post by Kaz Kylheku
Post by Helmut Waitzmann
touch -- foo -l
$ ls foo -l
-rw------- 1 helmut helmut 0 Apr 15 15:28 foo
touch deviates in the first place; omit the -- and you get
$ touch foo -l
touch: invalid option -- 'l'
That's crazy. foo is a non-option argument, so the options
have ended at that point.
Yes, that's the same problem as with GNU "ls". (I decided to
silently avoid it by making use of the end‐of‐option marker
without commenting on it.)
Post by Kaz Kylheku
I see where it is documented in "2 Common options" (Coreutils manual):
Normally options and operands can appear in any order, and
programs act as if all the options appear before any operands.
For example, ‘sort -r passwd -t :’ acts like ‘sort -r -t :
passwd’, since ‘:’ is an option-argument of -t. However, if
the POSIXLY_CORRECT environment variable is set, options must
appear before operands, unless otherwise specified for a
particular command.
If I understand the last sentence correctly, setting the
POSIXLY_CORRECT environment variable forces the GNU utilities to
stop option processing before the first non‐option argument
(i. e. an argument beginning not with a "-") as if that argument
had been preceded by the end‐of‐option argument "--".
Post by Kaz Kylheku
It is disingenuous to call it "POSIXly correct", because in fact
the POSIX rules are how everyone understands it and how other
implementors of utilities implement it. (Does anyone else do
this crazy thing?)
If all the vendors feature a given extension, so that it is
portable, but POSIX refuses to adopt it, then, sure: the mode
which takes the extension away can be flippantly called "POSIXly
correct".
Maybe my knowledge of the English language is not good enough to
understand you correctly.  Under the premise, that the
POSIXLY_CORRECT environment variable has been set, do you see the
GNU utilities behaving in any different way from what is the
behavior specified by POSIX?
Post by Kaz Kylheku
Also the claim "options must appear before operands [in POSIX]"
is misleading, because "must" is usually interpreted as an
imposed requirement, which can be violated and diagnosed. But in
fact it is *logically* impossible for options to appear
elsewhere because arguments that look like options placed in the
non-option part of the command line are operands. It's the
logical "must", not the requirements "must".
The GNU manual does not say "in POSIX" but "if the
POSIXLY_CORRECT environment variable is set"; otherwise I
agree with you. This is why I recommend not relying on the
GNU behavior of looking for options to the right of the first
non‐option operand (which applies when neither the end‐of‐option
marker is used nor POSIXLY_CORRECT is set). Instead,
always put any options (if present) first, then, regardless of
whether any options are actually present, supply the
end‐of‐option marker ("--"), and finally any non‐option operands.
This way both POSIX utilities and GNU utilities will behave the
same, even if the POSIXLY_CORRECT environment variable happens
not to be set.
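
The recommendation can be tried out with a sketch like the following. The function name end_of_options_demo is hypothetical, and mktemp -d is assumed to be available (it is common but not specified by POSIX). The function body runs in a subshell, so the caller's working directory is left alone.

```shell
# end_of_options_demo is a hypothetical name; mktemp -d is an assumption.
end_of_options_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    # Options (none here), then "--", then the operands.
    touch -- foo -l              # creates two files: "foo" and "-l"
    # After "--" both names are operands, so ls lists two entries.
    count=$(ls -- foo -l | wc -l | tr -d ' ')
    echo "ls listed $count names"
    cd / && rm -rf -- "$dir"
)

end_of_options_demo
```

With the "--" in place this should print "ls listed 2 names" regardless of whether the utilities permute their arguments by default.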
Ben Bacarisse
2024-04-15 14:45:25 UTC
Post by James Harris
For someone who is relatively new to Unix shell scripting (me) some advice
would be more than welcome on where to begin.
Q1) How can one write a script which is maximally compatible with different
systems?
Use only the features described for POSIX sh.
Post by James Harris
I am thinking to write in /the language of/ the Bourne shell, if feasible,
so that it could be run by either the Bourne shell or Bash, etc? (Ideally,
the shebang line would be #!/bin/sh.)
The term "Bourne shell" is a little ambiguous. Many people take it to
mean "POSIX shell" but some people would go further and take it to mean
an older shell lacking some of the more recent features, which you then
cannot rely on.
Post by James Harris
Or is Bash now so universal that there's no point any longer in writing for
anything else?
That depends on your audience. For Linux users, pretty much yes.
Post by James Harris
Q2) How does one go about handling arguments in preferably a simple but
universal way?
My first idea was to iterate over arguments with such as
while [ $# -gt 0 ]
do
...
shift
done;
and from that to (a) note any switches and (b) build up an array of
positional parameters. However, I gather the Bourne shell has no arrays
(other than the parameters themselves) so that won't work.
I read up on getopts but from tests it seems to require that switches
precede arguments rather than allowing them to be specified after, so that
doesn't seem very good, either.
Well that's what most people will be used to. I would want

command -o out1 file1 -z -i out2 file2

to use out1 for the first file, out2 for the second and for the -z to
apply only to the second file.

If you can accept that this is a reasonable way of working, then you can
use your previously written loop. Every non-flag argument is just
processed from inside the loop at the point it is seen.

If you have to save them for later, you could consider building a
string of saved arguments using an "unlikely" separator string:

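#!/bin/sh

args=""
# A newline serves as the "unlikely" separator between saved arguments.
sep='
'
while [ $# -gt 0 ]
do
    case "$1" in
        -*) echo "flag: $1"
            ;;
        *)  args="$1$sep$args"   # prepend the argument to the saved string
            ;;
    esac
    shift
done
# Peel the saved arguments off the front of the string, one at a time.
while [ -n "${args}" ]
do
    echo "Arg is '${args%%$sep*}'"
    args="${args#*$sep}"
done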

This reverses the order. You can preserve the order with slightly
different string fiddling.
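
One possible form of that string fiddling, as a sketch only: append each saved argument instead of prepending it. The function name split_args is made up for the example, and the approach still assumes that no argument contains a newline.

```shell
# split_args is a hypothetical name. Arguments containing a newline
# would be split, because newline is the separator.
split_args() {
    args=
    sep='
'
    while [ $# -gt 0 ]
    do
        case $1 in
            -*) echo "flag: $1" ;;
            *)  args="$args$1$sep" ;;   # append, so the order is preserved
        esac
        shift
    done
    while [ -n "$args" ]
    do
        first=${args%%"$sep"*}          # text before the first separator
        echo "Arg is '$first'"
        args=${args#*"$sep"}            # drop that text and its separator
    done
}

split_args -a one -b "two words" three
```

The flags are reported as they are seen, followed by 'one', 'two words' and 'three' in the order given on the command line.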
--
Ben.
Lew Pitcher
2024-04-15 15:06:31 UTC
Post by James Harris
For someone who is relatively new to Unix shell scripting (me) some
advice would be more than welcome on where to begin.
Q1) How can one write a script which is maximally compatible with
different systems?
As others have said, write your script to the POSIX shell language
standards. (see
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html)

Most shells support this restricted dialect.
Post by James Harris
Q2) How does one go about handling arguments in preferably a simple but
universal way?
The "simple but universal way" is to sequentially parse your argument list.
But, this leads to complications that may not sit with your script design,
in that you (the programmer) have to decide on whether or not you want
to impose a specific order to the argument list, and, following that
decision, how you want to handle "unflagged" arguments.

Then, there is getopts (which is /not/ a universally-supported extension
to the shell language), which will handle the argument list for you, but
with caveats and argument list order decisions that you might not agree
with.

For the most part, the "simple but universal" rule is "KISS" (Keep It Simple
& Sequential), with flags first, and non-flag arguments in a fixed order,
after the flags.


HTH
--
Lew Pitcher
"In Skills We Trust"
Christian Weisgerber
2024-04-15 15:36:09 UTC
Post by Lew Pitcher
As others have said, write your script to the POSIX shell language
standards.
Then, there is getopts (which is /not/ a universally-supported extension
to the shell language),
It is part of POSIX sh.
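
For reference, a minimal getopts loop might look like the sketch below. The function name parse_opts and the options -v and -o are invented for the example; only getopts, OPTIND and OPTARG come from the standard.

```shell
# parse_opts is a hypothetical helper: -v is a plain flag and
# -o takes an argument. Both option letters are assumptions.
parse_opts() {
    verbose=0
    outfile=
    OPTIND=1                     # reset so the function can be called again
    while getopts 'vo:' opt
    do
        case $opt in
            v) verbose=1 ;;
            o) outfile=$OPTARG ;;
            *) echo "usage: parse_opts [-v] [-o file] [operand...]" >&2
               return 2 ;;
        esac
    done
    shift $((OPTIND - 1))        # drop the parsed options, keep the operands
    echo "verbose=$verbose outfile=$outfile operands=$*"
}

parse_opts -v -o out.txt file1 file2
```

This prints "verbose=1 outfile=out.txt operands=file1 file2". Parsing stops at the first operand or at "--", which is the options-first convention discussed in this thread.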
--
Christian "naddy" Weisgerber ***@mips.inka.de
Kenny McCormack
2024-04-15 17:38:34 UTC
Post by Christian Weisgerber
Post by Lew Pitcher
As others have said, write your script to the POSIX shell language
standards.
Then, there is getopts (which is /not/ a universally-supported extension
to the shell language),
It is part of POSIX sh.
It would be useful to know exactly *why* OP wants to stay "as portable as
possible". Yes, I know it is against the creed here to question such
things, but it needs to be done, nevertheless.

I say this as someone who does occasionally program in dash (Debian's
version of the "POSIX shell" paradigm), just to see and to remind myself
about how limited it is. But I would urge OP to think long and hard about
whether or not it matters. I find that when I do program in dash, I rather
quickly run into the limitations and end up regretting the choice.

For example, many things exist in both bash (my preferred shell programming
language, just in case such had not been made clear by now) and in "POSIX"
shell, but have limited functionality in the POSIX version. So, if you are
used to the full functionality you get in the bash version, you can get
bitten if you assume that the "POSIX" version has the same functionality.
I mention this specifically because you mentioned "getopts" - which does
indeed exist in both bash and dash, but I'd be willing to bet, has limited
functionality in the dash version. I never use "getopts", so I don't know
this for a fact; I am just speculating. I know it *is* true for some other
shell commands/functions.

In general, I just don't find it worth the bother to limit myself to a
crippled shell.

But, mind you, it is possible, though unlikely, that OP actually has a
good/valid reason for doing so. Mostly, I think it is just virtue
signalling.
--
When I was growing up we called them "retards", but that's not PC anymore.
Now, we just call them "Trump Voters".

The question is, of course, how much longer it will be until that term is also un-PC.
Kaz Kylheku
2024-04-15 17:57:27 UTC
Post by Kenny McCormack
Post by Christian Weisgerber
Post by Lew Pitcher
As others have said, write your script to the POSIX shell language
standards.
Then, there is getopts (which is /not/ a universally-supported extension
to the shell language),
It is part of POSIX sh.
It would be useful to know exactly *why* OP wants to stay "as portable as
possible". Yes, I know it is against the creed here to question such
things, but it needs to be done, nevertheless.
OP probably doesn't have a good feeling for the shell script portability
landscape, and just wants their scripts to work on multiple systems,
with whatever shell is installed by default?

Wild-assed guess; but he did mention multiple systems.
Post by Kenny McCormack
I say this as someone who does occasionally program in dash (Debian's
version of the "POSIX shell" paradigm), just to see and to remind myself
about how limited it is. But I would urge OP to think long and hard about
whether or not it matters. I find that when I do program in dash, I rather
quickly run into the limitations and end up regretting the choice.
Whereas if you program in Bash, it's like you're mounted on a steed that
is galloping over vast, open plains of software engineering techniques ...
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
Lew Pitcher
2024-04-15 20:37:15 UTC
Post by Christian Weisgerber
Post by Lew Pitcher
As others have said, write your script to the POSIX shell language
standards.
Then, there is getopts (which is /not/ a universally-supported extension
to the shell language),
It is part of POSIX sh.
I did not know that. I've learned something new today.
Thanks :-)
--
Lew Pitcher
"In Skills We Trust"
Keith Thompson
2024-04-15 21:31:38 UTC
Post by Lew Pitcher
Post by James Harris
For someone who is relatively new to Unix shell scripting (me) some
advice would be more than welcome on where to begin.
Q1) How can one write a script which is maximally compatible with
different systems?
As others have said, write your script to the POSIX shell language
standards. (see
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html)
Most shells support this restricted dialect.
Bash has an option that tells it to (attempt to) restrict itself to
POSIX semantics:

Starting Bash with the '--posix' command-line option or executing
'set -o posix' while Bash is running will cause Bash to conform more
closely to the POSIX standard by changing the behavior to match that
specified by POSIX in areas where the Bash default differs.

I haven't used this option myself, and I don't know just how closely it
actually conforms to POSIX.

I'm less familiar with ksh and zsh, but they probably have similar
options. At least the "MirBSD Korn shell" has "set -o posix".
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */
Janis Papanagnou
2024-04-16 08:19:19 UTC
Post by Keith Thompson
Bash has an option that tells it to (attempt to) restrict itself to
POSIX semantics:
Starting Bash with the '--posix' command-line option or executing
'set -o posix' while Bash is running will cause Bash to conform more
closely to the POSIX standard by changing the behavior to match that
specified by POSIX in areas where the Bash default differs.
I haven't used this option myself, and I don't know just how closely it
actually conforms to POSIX.
I'm less familiar with ksh and zsh, but they probably have similar
options. At least the "MirBSD Korn shell" has "set -o posix".
The branch ksh93u+m has it...

$ set -o | grep -i posix
posix off

But original ksh93u+ doesn't show such an option.

Janis
Christian Weisgerber
2024-04-16 11:11:16 UTC
Post by Keith Thompson
Bash has an option that tells it to (attempt to) restrict itself to
POSIX semantics:
Starting Bash with the '--posix' command-line option or executing
'set -o posix' while Bash is running will cause Bash to conform more
closely to the POSIX standard by changing the behavior to match that
specified by POSIX in areas where the Bash default differs.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This only tweaks bash's behavior where it otherwise differs from
POSIX. It does not disable the myriad extensions.
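
A quick way to see this for oneself, sketched under the assumption that bash is installed (the function name posix_mode_demo is made up for the example): arrays are a bash extension, yet they keep working in POSIX mode.

```shell
# posix_mode_demo is a hypothetical name. Arrays are not part of POSIX sh,
# but bash still accepts them when started with --posix.
posix_mode_demo() {
    if command -v bash >/dev/null 2>&1
    then
        bash --posix -c 'a=(one two); echo "${a[1]}"'
    else
        echo "bash not installed" >&2
        return 127
    fi
}
```

Calling posix_mode_demo prints "two", the second array element, rather than a syntax error.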

There is a tool ShellCheck that among other things can be used to
warn about unportable code in shell scripts.
https://www.shellcheck.net/

I haven't used it myself yet. It is written in Haskell, so it
suffers itself from portability concerns.
--
Christian "naddy" Weisgerber ***@mips.inka.de
Keith Thompson
2024-04-16 18:54:54 UTC
Post by Christian Weisgerber
Post by Keith Thompson
Bash has an option that tells it to (attempt to) restrict itself to
POSIX semantics:
Starting Bash with the '--posix' command-line option or executing
'set -o posix' while Bash is running will cause Bash to conform more
closely to the POSIX standard by changing the behavior to match that
specified by POSIX in areas where the Bash default differs.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This only tweaks bash's behavior where it otherwise differs from
POSIX. It does not disable the myriad extensions.
I stand corrected.
Post by Christian Weisgerber
There is a tool ShellCheck that among other things can be used to
warn about unportable code in shell scripts.
https://www.shellcheck.net/
I haven't used it myself yet. It is written in Haskell, so it
suffers itself from portability concerns.
There's also a "shellcheck" application, installable as a package on
Ubuntu and presumably many other systems. I use it regularly, almost
always on scripts with "#!/bin/bash" so it doesn't check POSIX rules.
But when invoked on a script with "#!/bin/sh" or with "--shell=sh" it
claims to warn about POSIX portability issues.
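
A small wrapper along these lines could make that check routine. This is a sketch: the name lint_posix is hypothetical, and the only shellcheck option used is the --shell=sh mentioned above.

```shell
# lint_posix is a hypothetical wrapper around the shellcheck command.
lint_posix() {
    if command -v shellcheck >/dev/null 2>&1
    then
        # Check the given scripts against the sh dialect, whatever
        # their own "#!" line says.
        shellcheck --shell=sh "$@"
    else
        echo "shellcheck not installed; skipping" >&2
        return 127
    fi
}
```

Usage would be something like lint_posix myscript.sh, where myscript.sh stands for any script of your own.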

https://github.com/koalaman/shellcheck
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */
Kenny McCormack
2024-04-16 19:59:03 UTC
Post by Keith Thompson
Post by Christian Weisgerber
Post by Keith Thompson
Bash has an option that tells it to (attempt to) restrict itself to
POSIX semantics:
Starting Bash with the '--posix' command-line option or executing
'set -o posix' while Bash is running will cause Bash to conform more
closely to the POSIX standard by changing the behavior to match that
specified by POSIX in areas where the Bash default differs.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This only tweaks bash's behavior where it otherwise differs from
POSIX. It does not disable the myriad extensions.
I stand corrected.
ISTR (which is to say, I can't prove it or point to an example at the
moment), that there were some systems under some circumstances where if
bash was copied/linked as "sh" (and then run as "sh" instead of "bash"),
then it did indeed behave like a plain "POSIX" shell (i.e., extensions were
disabled).

It is to be noted that bash is an evolving (i.e., changing) program and
there is no written "standard" for it - like Perl, it is just whatever its
current maintainers make it out to be at any particular moment. This
makes it hard to make the kind of hard-and-fast statements about it that
people in newsgroups like this one like so much to do.
--
Elect a clown, expect a circus.
Christian Weisgerber
2024-04-16 21:57:34 UTC
Post by Kenny McCormack
ISTR (which is to say, I can't prove it or point to an example at the
moment), that there were some systems under some circumstances where if
bash was copied/linked as "sh" (and then run as "sh" instead of "bash"),
then it did indeed behave like a plain "POSIX" shell (i.e., extensions were
disabled).
From the man page:
If bash is invoked with the name sh, it tries to mimic the startup
behavior of historical versions of sh as closely as possible, while
conforming to the POSIX standard as well. [...]
When invoked as sh, bash enters posix mode after the startup files are
read.

However, it is possible to disable a lot of features at build time.
In particular, configuring bash with --enable-minimal-config produces
a much reduced feature set:

dnl a minimal configuration turns everything off, but features can be
dnl added individually
if test $opt_minimal_config = yes; then
opt_job_control=no opt_alias=no opt_readline=no
opt_history=no opt_bang_history=no opt_dirstack=no
opt_restricted=no opt_process_subst=no opt_prompt_decoding=no
opt_select=no opt_help=no opt_array_variables=no opt_dparen_arith=no
opt_brace_expansion=no opt_disabled_builtins=no opt_command_timing=no
opt_extended_glob=no opt_cond_command=no opt_arith_for_command=no
opt_net_redirs=no opt_progcomp=no opt_separate_help=no
opt_multibyte=yes opt_cond_regexp=no opt_coproc=no
opt_casemod_attrs=no opt_casemod_expansions=no opt_extglob_default=no
opt_translatable_strings=no
opt_globascii_default=yes
fi
--
Christian "naddy" Weisgerber ***@mips.inka.de
Kaz Kylheku
2024-04-16 20:45:50 UTC
Post by Christian Weisgerber
Post by Keith Thompson
Bash has an option that tells it to (attempt to) restrict itself to
POSIX semantics:
Starting Bash with the '--posix' command-line option or executing
'set -o posix' while Bash is running will cause Bash to conform more
closely to the POSIX standard by changing the behavior to match that
specified by POSIX in areas where the Bash default differs.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This only tweaks bash's behavior where it otherwise differs from
POSIX. It does not disable the myriad extensions.
"set -o posix" disabling conforming extensions would be a GCC-grade
stupidity.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @***@mstdn.ca
Keith Thompson
2024-04-16 23:38:17 UTC
Post by Kaz Kylheku
Post by Christian Weisgerber
Post by Keith Thompson
Bash has an option that tells it to (attempt to) restrict itself to
POSIX semantics:
Starting Bash with the '--posix' command-line option or executing
'set -o posix' while Bash is running will cause Bash to conform more
closely to the POSIX standard by changing the behavior to match that
specified by POSIX in areas where the Bash default differs.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This only tweaks bash's behavior where it otherwise differs from
POSIX. It does not disable the myriad extensions.
"set -o posix" disabling conforming extensions would be a GCC-grade
stupidity.
An option to disable conforming extensions, whether it's spelled "set -o
posix" or in some other way, could be very useful for people who would
like to write scripts that don't (perhaps unintentionally) depend on any
extensions.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+***@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */
Lawrence D'Oliveiro
2024-04-17 00:52:26 UTC
Permalink
There is a tool ...
Or just use a basic POSIX shell. Such things exist, you know.
... ShellCheck that among other things can be used to warn
about unportable code in shell scripts. https://www.shellcheck.net/
I don’t see any mention of “unportability”, only about “bugs”.

Just for fun, I tried this:

#!/bin/bash

collect_expand()
{
    local -n arr="$2"
    arr=()
    coproc expander { find . -maxdepth 1 -name "$1" -print0; }
    # must ensure while-loop runs in this process
    while read -u ${expander[0]} -rd '' line; do
        arr[${#arr[*]}]="$line"
    done
    wait $expander_PID
} # collect_expand

#+
# Mainline
#-

test_filenames=('file 1.dat' $'file number\n2.dat' $'file\t3 dat')
# test multiple spaces in a row and newlines, among any other odd
# things you can think of!

tmpdir=$(mktemp -d -p '' collect-work.XXXXXXXXXX)
echo "tmpdir =" $(printf %q "$tmpdir")

cd "$tmpdir"
for f in "${test_filenames[@]}"; do
    echo "create" $(printf %q "$f")
    touch "$f"
done

collect_expand '*dat' found_filenames
for f in "${found_filenames[@]}"; do
    echo "found" $(printf %q "$f")
done

cd
rm -rfv "$tmpdir"

and it came back with

Line 9:
while read -u ${expander[0]} -rd '' line; do
^-- SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean: (apply this, apply all SC2086)
while read -u "${expander[0]}" -rd '' line; do

Line 12:
wait $expander_PID
^-- SC2154 (warning): expander_PID is referenced but not assigned.
^-- SC2086 (info): Double quote to prevent globbing and word splitting.

Did you mean: (apply this, apply all SC2086)
wait "$expander_PID"

Line 24:
echo "tmpdir =" $(printf %q "$tmpdir")
^-- SC2046 (warning): Quote this to prevent word splitting.

Line 26:
cd "$tmpdir"
^-- SC2164 (warning): Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

Did you mean: (apply this, apply all SC2164)
cd "$tmpdir" || exit

Line 28:
echo "create" $(printf %q "$f")
^-- SC2046 (warning): Quote this to prevent word splitting.

Line 33:
for f in "${found_filenames[@]}"; do
^-- SC2154 (warning): found_filenames is referenced but not assigned.

Line 34:
echo "found" $(printf %q "$f")
^-- SC2046 (warning): Quote this to prevent word splitting.

Line 37:
cd
^-- SC2164 (warning): Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

Did you mean: (apply this, apply all SC2164)
cd || exit

I would have to count every single one of those messages as spurious.
Kenny McCormack
2024-04-17 09:02:36 UTC
Permalink
Post by Lawrence D'Oliveiro
There is a tool ...
Or just use a basic POSIX shell. Such things exist, you know.
... ShellCheck that among other things can be used to warn
about unportable code in shell scripts. https://www.shellcheck.net/
I don’t see any mention of “unportability”, only about “bugs”.
ShellCheck reads the #! line and figures out which flavor of shell you are
using, but this can be overridden by a command line switch. So, presumably,
you could tell it to parse your bash script as if it were plain old
"POSIX" (i.e., crippled) sh.
...
Post by Lawrence D'Oliveiro
I would have to count every single one of those messages as spurious.
Yes, ShellCheck complains about a lot of things, and most of its complaints
can and should be ignored. I still find it interesting and useful, but you
have to take (most of) what it says with a (big) grain of salt.
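
For reference, the switch in question is -s (long form --shell=). A
minimal sketch follows, assuming ShellCheck may or may not be
installed; the helper name and the three-line sample script are
invented for the example. The sample has a bash shebang and uses an
array, and the helper asks ShellCheck to lint it as plain sh anyway.

```shell
# Hypothetical helper: lint a script as POSIX sh, whatever its #! line says.
check_as_posix_sh() {
    if ! command -v shellcheck >/dev/null 2>&1; then
        echo "shellcheck not found" >&2
        return 127
    fi
    shellcheck --shell=sh "$1"
}

# Throwaway sample script that relies on a Bash-only feature (an array).
sample=$(mktemp) || exit 1
cat >"$sample" <<'EOF'
#!/bin/bash
arr=(one two three)
echo "${arr[1]}"
EOF

check_as_posix_sh "$sample" || echo "non-zero exit status: $?"
rm -f "$sample"
```

A non-zero status here means either that ShellCheck found something to
complain about or that it is not installed (127 from the helper).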
--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/WeekendAwayFromHome
Christian Weisgerber
2024-04-17 14:23:35 UTC
Permalink
Post by Lawrence D'Oliveiro
... ShellCheck that among other things can be used to warn
about unportable code in shell scripts. https://www.shellcheck.net/
I don’t see any mention of “unportability”, only about “bugs”.
https://www.shellcheck.net/wiki/
The long list of items also has just short of sixty about things
that are undefined in POSIX sh.
--
Christian "naddy" Weisgerber ***@mips.inka.de
Helmut Waitzmann
2024-04-15 21:49:33 UTC
Permalink
Post by James Harris
I read up on getopts but from tests it seems to require that
switches precede arguments
Yes, that's true.
Post by James Harris
rather than allowing them to be specified after, so that doesn't
seem very good, either.
The problem with specifying options after non‐option arguments is
that non‐option arguments may take any form: they may even start
with a "-", that is, look like options even when they aren't
meant to be used as options.


So, if you have some command "some_command" with one non‐option
argument "-a" followed by an option "-b"


$ some_command -a -b


and parse the arguments from left to right, then there is no way
for "some_command" to determine that "-a" is to be taken as a
non‐option argument while "-b" is to be taken as an option.


Whereas, when "some_command" expects options before any
non‐option argument and recognizes the end‐of‐options marker
("--") then there is no ambiguity:


$ some_command -b -- -a


"-b" is an option, while "-a" (because the series of options is
terminated by the end‐of‐options marker) is the (first)
non‐option argument "-a".


That's how the POSIX‐shell builtin "getopts" works.
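
The behaviour described above can be sketched in plain POSIX sh as
follows. The option set is an assumption made up for the example: "-b"
is a flag and "-o" takes an argument; the function and variable names
are likewise hypothetical.

```shell
# Hypothetical option set: -b is a flag, -o takes an argument.
parse_with_getopts() {
    b_flag=0
    out=
    OPTIND=1    # reset so the function can be called more than once
    while getopts 'bo:' opt; do
        case $opt in
            b) b_flag=1 ;;
            o) out=$OPTARG ;;
            *) echo "usage: some_command [-b] [-o file] [--] [operand...]" >&2
               return 2 ;;
        esac
    done
    shift $((OPTIND - 1))   # drop the options and the "--" marker, if any

    printf 'b_flag=%s out=%s\n' "$b_flag" "$out"
    for operand in "$@"; do
        printf 'operand: %s\n' "$operand"
    done
}

# "-b" is parsed as an option; "--" ends the options; "-a" is an operand.
parse_with_getopts -b -- -a
```

Called as "parse_with_getopts -a -b" instead, the same function treats
"-a" as an (unknown) option and returns with a usage message, which is
the ambiguity discussed above.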
Lawrence D'Oliveiro
2024-04-15 22:20:34 UTC
Permalink
Post by James Harris
I am thinking to write in /the language of/ the Bourne shell, if
feasible, so that it could be run by either the Bourne shell or Bash,
etc? (Ideally, the shebang line would be #!/bin/sh.)
There is such a thing as a standardized “POSIX shell”. On Debian, for
example, /bin/sh will launch Dash, which is a minimal POSIX-compliant
shell.

It’s certainly a safe, boring choice. ;)
Post by James Harris
Or is Bash now so universal that there's no point any longer in writing
for anything else?
This is where we get into “Unix®” the trade mark, versus “Unix” as an
informal description of a collection of traditional OS behaviour.

I say this because the only current “Unix®” trade mark licensee still
seeing any significant use is Apple’s macOS, and that does not offer
Bash--at least, not any reasonably recent version. This is for ideological
reasons or something.

So if you are targeting “Unix” in the latter sense, then Bash is quite
widespread, yes.
Post by James Harris
I read up on getopts but from tests it seems to require that switches
precede arguments rather than allowing them to be specified after, so
that doesn't seem very good, either.
One reason for that convention is that it is possible for file/directory
names to begin with “-”. To minimize the confusion this causes, there is
an additional common convention among command-line tools that a plain “--”
option means “don’t look for any more options after this”. That is to say,
treat the remaining items as file names (or whatever else the program does
with them), even if they begin with “-”.
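
With that convention in place, a script can accept options after other
arguments without needing arrays. A minimal sketch in POSIX sh, under
the assumption of a single hypothetical option "-v": each argument is
read once, options are noted, and non-option arguments are appended
back onto the positional parameters, so "$@" ends up holding only the
non-options, in their original order.

```shell
# Hypothetical example: -v is the only option; everything else is an operand.
parse_anywhere() {
    verbose=0
    options_done=0
    remaining=$#            # how many original arguments are still unread
    while [ "$remaining" -gt 0 ]; do
        arg=$1
        shift
        remaining=$((remaining - 1))
        if [ "$options_done" -eq 0 ]; then
            case $arg in
                --)  options_done=1; continue ;;   # stop looking for options
                -v)  verbose=1; continue ;;
                -?*) echo "unknown option: $arg" >&2; return 2 ;;
            esac
        fi
        set -- "$@" "$arg"  # operand: re-append it behind the unread arguments
    done

    printf 'verbose=%s\n' "$verbose"
    for operand in "$@"; do
        printf 'operand: %s\n' "$operand"
    done
}

parse_anywhere one -v two -- -three
```

Here "-v" is recognised although it follows "one", and "-three" is kept
as an ordinary argument because it comes after "--".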