
As someone who has to occasionally modify 100+ line bash scripts written by Coworkers from Christmas Past which matched your spec in terms of what they had to do, please please just use Python (or similar).

Yes, you will have a few extra lines but it will be vastly more readable and maintainable.

And yes, I know I will get the standard "the person who wrote the script did a bad job" reply, but at some point it should be okay to blame the tools instead of the workman if workmen disproportionately create worse results with a set of tools.



As someone who also has to semi-frequently modify 100+ line bash scripts written by others, I'd suggest every serious bash scripter read the bash man page. It is much smaller than any book on Python.

For scripts written in Python, I'd use a similar argument and suggest that every serious Python scripter learn Python. The same goes for Perl, or Ruby, or Julia, or anything really. It's just that learning bash from its man page is, IMO, much easier than learning any of those languages to the same degree.

Of course, and it goes without saying, some things are just not suited to bash, and in those cases a suitable language/framework must be used, and learned if not already known. As far as process calling, environment management, or stdio stream management are concerned, bash is better suited than the other languages I've tried (Python, Ruby, and Go).
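To make the stdio point concrete, here is a rough sketch of what a two-stage pipeline (the shell's `first | second`) takes in Python. The two commands are stand-ins invented for illustration (small Python one-liners run through `sys.executable`), and the helper name `pipe` is hypothetical.

```python
import subprocess
import sys

# Stand-in commands (assumptions for illustration): one prints two lines,
# the other upper-cases whatever arrives on stdin.
PRODUCER = [sys.executable, "-c", "print('alpha'); print('beta')"]
CONSUMER = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]

def pipe(first, second):
    """Rough equivalent of `first | second`; returns second's stdout as text."""
    p1 = subprocess.Popen(first, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(second, stdin=p1.stdout, stdout=subprocess.PIPE,
                          universal_newlines=True)
    # Close our copy of the pipe so p1 sees a broken pipe if p2 exits early.
    p1.stdout.close()
    output, _ = p2.communicate()
    p1.wait()
    return output

if __name__ == "__main__":
    print(pipe(PRODUCER, CONSUMER))
```

Note that this sketch does not check either exit status; doing so would add a few more lines, where the shell needs only the `|` character and an option or two.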


The problem is most people don't want to consider themselves "serious bash scripters" but still think they can write bash, which always results in unstable and vulnerable scripts. Python on the other hand can be written by unserious python scripters and more often be at least accidentally correct. Another big issue is that the quoting, escaping and expansion rules in bash can be daunting even for serious bash scripters, with silent errors or completely different behaviour just because you forgot a : before (or was it after?) the $-
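As a small illustration of the "accidentally correct" point: when arguments are passed to `subprocess` as a list, a value containing a space reaches the child as one argument with no quoting at all. The sketch below assumes a made-up file name and uses a Python one-liner as the child; `shlex.split` stands in (approximately) for the word splitting an unquoted shell variable would undergo.

```python
import shlex
import subprocess
import sys

# Child process that reports how many arguments it received.
COUNT_ARGS = [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)"]

def count_args_received(*args):
    """Run the child with the given arguments and return the count it saw."""
    out = subprocess.check_output(COUNT_ARGS + list(args),
                                  universal_newlines=True)
    return int(out)

filename = "quarterly report.txt"  # hypothetical name containing a space

if __name__ == "__main__":
    # Passed as a list element: arrives intact as a single argument.
    print(count_args_received(filename))
    # Split on whitespace first, roughly what an unquoted $filename does.
    print(count_args_received(*shlex.split(filename)))
```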

As a test, run shellcheck on any random shell script written by these Coworkers from Christmas Past (or your own past) and it will spew serious warnings on almost every single line. Run an equivalent analyzer on an equivalently unserious Python file and you might, in bad cases, get 2 or 3 minor warnings per 100 lines.

The comparison is a bit unfair because just by getting these compiler errors and exceptions from a real language you force yourself into a more serious mental mode of programming instead of happy scripting, but that is just another argument in favor of not using bash IMO.


I agree with you, mostly, and let me point out where I don't.

1. People who don't want to consider themselves "serious bash scripters" shouldn't be writing non-trivial bash scripts unless they're okay with it turning out buggy. I agree this is a personal standards thing, and reality is often less simple and more lenient than that.

2. The "compiler errors and exceptions" that you speak of in regards to Python also have equivalents in Bash. Agreed, they're still optional and non-"serious bash scripters" don't often know of them. Which is why I make my coworkers use them when I review their code.

3. There are classes of bugs that would happen in Python (and Go, from my experience) that wouldn't happen in bash, simply because they are in areas where bash shines. At work, I've seen process management and stdio management bugs, some of which have bitten us in the field, simply because the non-bash language (Go) has a weird affinity to its child processes, or because it (both Python and Go) defaults to not wiring up the stdio of child processes. In the latter case, the proper thing for the developer to do was the same as with bash: read the documentation. Most of our process management code is now in bash (because it's simple and safe) and systemd (because it's thorough and absolute).


In my experience, those kinds of shell scripts get written by people who needed to automate something quickly and didn't have (or at least perceive) any other language available besides maybe Perl. They're either Unix admins without much programming experience or they're part-time programmers who primarily work in some other language and just don't have the familiarity with Python or Ruby needed to write a good glue script.

But they know how to accomplish this task interactively in the shell, so scripting what they're already doing (or already know how to do) seems like the natural next step. So you wind up with an imperative shell script that's basically a long, flat sequence of commands with some logic and variables sprinkled in haphazardly as they realized they needed it.

Due to the organic way these scripts often emerge, it's not like advocating Python is an easy sell. By the time they think to consider alternatives, the shell version already exists.


I agree. And I agree with your bash 'whitelist'. I'd also add a hard blacklist for any serious string manipulation. A series of awks and seds look clever but they are quite annoying to deal with.

If it's just a bunch of cuts or tr, sure.


And it's SO MUCH MORE expensive to fork all those processes. Just use a real language, please.


"A series of awks and seds look clever but they are annoying to deal with."

Would the following be annoying for you to deal with?

   #!/bin/sh
   sed 's/#.*//' \
   | sed 's/:/#/g' \
   | cat AMD64 - \
   | ./qhasm-ops \
   | ./qhasm-regs \
   | ./qhasm-fp \
   | ./qhasm-as \
   | sed 's/%32/d/g' \
   | sed 's/%raxd/%eax/g' \
   | sed 's/%rbxd/%ebx/g' \
   | sed 's/%rcxd/%ecx/g' \
   | sed 's/%rdxd/%edx/g' \
   | sed 's/%rsid/%esi/g' \
   | sed 's/%rdid/%edi/g' \
   | sed 's/%rbpd/%ebp/g'

where qhasm-as and qhasm-fp are each awk scripts (222 and 427 lines, respectively).

source: http://cr.yp.to/qhasm/qhasm-20061116.tar.gz qhasm-20061116/qhasm-amd64
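For comparison with the fork-cost objection raised earlier in the thread, the eight trailing sed stages of that script could be folded into a single in-process pass. This is only a sketch: the function and table names are made up, the mapping is copied from the sed expressions shown above, and it covers just the post-processing stages, not the qhasm-* programs themselves.

```python
import re

# Register renames copied from the final seven sed stages of the script.
REGISTER_RENAMES = {
    "%raxd": "%eax",
    "%rbxd": "%ebx",
    "%rcxd": "%ecx",
    "%rdxd": "%edx",
    "%rsid": "%esi",
    "%rdid": "%edi",
    "%rbpd": "%ebp",
}

# One alternation matching any of the keys above.
_RENAME_PATTERN = re.compile("|".join(re.escape(k) for k in REGISTER_RENAMES))

def rename_registers(text):
    """Apply the %32 -> d rewrite, then all register renames in one pass."""
    text = text.replace("%32", "d")
    return _RENAME_PATTERN.sub(lambda m: REGISTER_RENAMES[m.group(0)], text)

if __name__ == "__main__":
    print(rename_registers("mov %rax%32, %rbxd"))
```

Whether this is easier to read than the sed chain is exactly the matter of taste being argued here; the sketch only shows that the work fits in one process.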


I'd argue that it's much easier to write bad Bash than bad Python


Assuming that "bad" doesn't mean merely "ugly to glance at." I find that depends largely on the problem at hand.


I think this might help others, but ShellCheck[0] is a good place to start to help eliminate poor shell scripting.

I would also argue, though, that even large shell scripts in bash have their place.

I often write scripts in either Node or Python, but only when I need things bash is bad about (any sort of proper data structure beyond strings or arrays).

But there are just so many things bash makes insanely easy, especially with operating on files and directories.

And functions that are used as completions or need access to aliases or functions in the current process are also better in bash.

I wish there were a scripting language like bash, but enhanced with at least some hash maps and proper array manipulation, and maybe some formal IPC to allow scripts to request info from the parent process.



Whoops! Thanks for that


My favorite advice was from a search giant's dev infra engineer who said that "any Python script over 100 lines should be rewritten in Bash, because at least that way you're not kidding yourself into thinking it's production quality"


> As someone who has to occasionally modify 100+ line bash scripts written by Coworkers from Christmas Past which matched your spec in terms of what they had to do, please please just use Python (or similar).

As someone who has inherited thousand-line shell scripts, and had to debug many 3rd party scripts, I stand by my assertion.

> Yes, you will have a few extra lines but it will be vastly more readable and maintainable.

Readability is important but it's not the only aspect of maintainability, nor is maintainability the sole concern of a tool. A low bug rate helps maintainability, and actually having the features you need, in an acceptable timeframe, is also important.

For example, the OP mentioned the 'set -e' option that causes the script to exit if any command returns a non-zero exit code. In Python, you'd either have to remember to check the return code for every subprocess or define a wrapper, which adds complexity, reduces readability, and can lead to bugs and errors. Nor is Python always the best answer for readability anyway. In many cases, it's not like it's just a few lines you're saving. Here are some functions I've used when scripting in Python:

    import subprocess, shlex
    def process_run(cmd_string, stdin=None):
        return subprocess.Popen(shlex.split(cmd_string),
                                stdin=stdin,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    
    def process_results(process_object):
        (stdout, stderr)=process_object.communicate()
        return (process_object.returncode, stdout, stderr)
    
    def process(cmd_string, stdin=None):
        return process_results(process_run(cmd_string, stdin=stdin))

It's 10 lines of boilerplate to set up an approximation of behavior that is trivial to achieve in any shell language. There are actually 7 more functions I use to handle different common subprocess execution patterns. For example, the "stdin" in that process_run function needs to be a filehandle (at least in Python 2.7, I'm not sure about Python 3). To pass a string to standard input you'll need something like this:

    from tempfile import SpooledTemporaryFile

    # Function name is illustrative; the original snippet showed only the body.
    def process_string(cmd_string, stdin_string):
        f=SpooledTemporaryFile()
        f.write(stdin_string)
        f.seek(0)
        results=process(cmd_string, stdin=f)
        f.close()
        return results

> And yes, I know I will get the standard the person who wrote the script did a bad job but at some point it should be okay to blame the tools instead of the workman if workmen disproportionately create worse results with a set of tools.

Actually what I'd say first is that it's quite possible the person writing the script knew what they were doing. I've inherited bad code in my life, I've inherited some real gems, and I've inherited a lot of code in between. One thing I've learned is that I tend to be unfairly critical of average code. It's hard to read unfamiliar code and easy to criticize inconvenient design choices when you have to adapt their code to some new problem that they never anticipated. Usually I'll be better off just buckling down and untangling the spaghetti.


subprocess.run does exactly what your wrapper does. subprocess.check_output returns only stdout and automatically raises an exception on a non-zero return code; this is the function you should be using 99% of the time. Both functions accept the data for stdin directly through the input parameter.
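A minimal sketch of that suggestion, assuming Python 3.5 or later (where `subprocess.run` exists). The child commands are Python one-liners chosen only so the example runs anywhere, and the helper names are made up.

```python
import subprocess
import sys

# Stand-in child (an assumption for illustration): upper-cases its stdin.
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
# Stand-in child that fails with exit status 3.
FAILING = [sys.executable, "-c", "import sys; sys.exit(3)"]

def run_with_input(cmd, text):
    """Feed text to cmd's stdin and return its stdout; raise on failure."""
    done = subprocess.run(cmd, input=text,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True, check=True)
    return done.stdout

def failing_exit_status():
    """Show that check_output raises instead of silently ignoring failure."""
    try:
        subprocess.check_output(FAILING)
    except subprocess.CalledProcessError as err:
        return err.returncode
    return 0

if __name__ == "__main__":
    print(run_with_input(UPPERCASE, "hello"))
    print(failing_exit_status())
```

With `check=True` (or `check_output`), a non-zero exit raises `CalledProcessError`, which is the closest analogue to `set -e` in this setting.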


subprocess.run is Python 3.5+ only. If we're talking about replacing bash, Python 2.7 (possibly with 2.6 compatibility) is the more reasonable target. CentOS 7 and Debian 8 (I've not used 9 yet) still ship with Python 2.7.

Also, who is to say what I should be using "99% of the time?" Each problem has different constraints and different priorities.


Please have a look at https://pythonclock.org/ and stop riding dead horses.


The domain under discussion is scripting and specifically comparisons with Bash. Python 3 has not achieved anywhere close to the platform deployment that Python 2.7 has. When the common OS distributions you're likely to need to script on ship with Python 3 as the default rather than Python 2.7, we can start using Python 3 in random comparisons with shell scripts. Until then, Python 2.7 is the language for comparison no matter what rhetoric you want to employ.


Most distros ship with python3 (though `python` will refer to python2).

Is there a reason you cannot just say `#!/usr/bin/env python3` in your scripts? I don't see why you require python3 to be the default, am I missing something here?


The first OS I've used that includes Python 3 in the base install is Debian 8 (Jessie), and I no longer use Debian in production. CentOS 7 does not include Python 3 in the base install. You can install it, sure, but why bother when you can just use the Python that is already there? Or better yet, /bin/bash...

Again, in the context of this discussion, the whole argument is yet another point in shell's favor. There's no major backwards-incompatible change in the language. With Bash, you just decide whether POSIX compliance is something you need, and that's basically it. Both versions are still supported and no one interrupts discussions to announce that beatings will continue until morale improves whenever the deprecated version of Python comes up.


If you are using ten-year-old software and can’t install a package, that’s your problem. Quit acting like it is the default situation.

Or you could put the wrapper functions in a module and call it a day, either way problem solved.


He's not acting like it's the default situation. For the distributions he mentioned, it is the default situation.


The problem is that it is difficult to anticipate: it may look like a simple 'glue the commands together' task, but it may turn out to be trickier.



