Am I crazy in thinking that bash is good enough for production?
This may make some people pull their hair out, but I’d love to hear some arguments. I’ve had the impression that people really don’t like bash, not from here, but just from people I’ve worked with.
There was a task at work where we wanted something that’ll run on a regular basis, and doesn’t do anything complex aside from reading from the database and sending the output to some web API. Pretty common these days.
I can’t think of a simpler scripting language to use than bash. Here are my reasons:
Reading from the environment is easy, and so is falling back to some value; just do ${VAR:-fallback}; no need to write another if-statement to check for nullity. Wanna check if a variable’s set to something expected? if [[ <test goes here> ]]; then <handle>; fi
Reading from arguments is also straightforward; instead of import sys; sys.argv[1] in Python, you just do $1.
Sending a file via HTTP as part of an application/x-www-form-urlencoded request is super easy with curl. In most programming languages, you’d have to manually open the file, read it into bytes, and then put it into the request for the HTTP library that you need to import. curl already does all that.
Need to read from a curl response and it’s JSON? Reach for jq.
Instead of having to set up a connection object/instance to your database, give sqlite, psql, duckdb or whichever cli db client a connection string with your query and be on your way.
Shipping is… fairly easy? Especially if Docker is common in your infrastructure. Pull Ubuntu or Debian or Alpine, install your dependencies through the package manager, and you’re good to go. If you stay within Linux and don’t have to deal with differences in bash and core utilities between different OSes (looking at you, macOS), and assuming you tried not to do anything too crazy and bring in the necessary dependencies by calling them as external commands, it should be fairly portable.
Sure, there can be security vulnerability concerns, but you’d still have to deal with the same problems with your Pythons your Rubies etc.
For most bash gotchas, shellcheck does a great job of warning you about them and telling you how to address them.
There are probably a bunch of other considerations that I can’t think of off the top of my head, though I’ve addressed a bunch before.
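A minimal sketch of the environment-fallback and value-check idioms mentioned in the list above. The names API_URL, RUN_MODE and INPUT_FILE, and the default values, are made up for illustration; nothing here comes from the actual job.

```bash
#!/usr/bin/env bash
# Sketch: read settings from the environment with fallbacks, then validate.
# API_URL, RUN_MODE and INPUT_FILE are hypothetical names.

load_config() {
  # ${VAR:-fallback} expands to the fallback when VAR is unset or empty.
  API_URL=${API_URL:-https://example.invalid/ingest}
  RUN_MODE=${RUN_MODE:-dry-run}

  # Reject anything other than the two modes this imaginary job knows about.
  if [[ $RUN_MODE != "dry-run" && $RUN_MODE != "live" ]]; then
    echo "RUN_MODE must be 'dry-run' or 'live', got '$RUN_MODE'" >&2
    return 1
  fi

  # First positional argument, again with a fallback.
  INPUT_FILE=${1:-/dev/stdin}
}
```

Calling load_config "$@" at the top of a script would then either populate the three variables or return non-zero with a message on stderr.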
So what’s the dealeo? What am I missing that may not actually be addressable?
What gave you the impression that this was just for development? Bash is widely used in production environments for scripting all over enterprises. I would think the people you work with just haven’t worked at many shops.
It’s just not wise to write an entire system in bash. Just simple little tasks to do quick things. Yes, in production. The devops world runs on bash scripts.
We are not talking about using Bash in dev versus using Bash in production. That is, IMHO, the wrong question, and it skirts around the real problem in software development. Bash is fine for simple tasks where the code is rarely changed (if not written once and thrown away) and where any primitive language or DSL would do. But when it comes to building medium-sized or complex software systems, where decomposition, support for complex data structures, unit tests, error handling, concurrency, etc. are a big deal, Bash really sucks because it does not let you deal with the challenges of scale. By scale I mean having to change a huge code base rapidly as requirements change while still maintaining good quality across the entire code base. Bash is just not designed for that.
Honestly, if a script grows to more than a few tens of lines I'm off to a different scripting language because I've written enough shell script to know that it's hard to get right.
Shellcheck is great, but what's greater is a language that doesn't have as many gotchas from the get go.
One thing that I don't think anyone else has mentioned is data structures. Bash does have arrays and hashmaps at least, but I've found that working with them is significantly more awkward than in e.g. Python. This is one of several reasons why bash doesn't scale up well, but sure, for small enough scripts it can be fine (if you don't care about Windows).
Just make certain the robustness issues of bash do not have security implications. Variable, shell, and path evaluations can have security issues depending on the situation.
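To make the awkwardness concrete, here is a small sketch of indexed and associative arrays. The file names and owners are invented; associative arrays need bash 4 or newer, so the stock bash 3.2 on macOS will not run the second half.

```bash
#!/usr/bin/env bash
# Indexed array: quote "${files[@]}" so elements containing spaces stay intact.
files=("report 2024.csv" "summary.csv")
files+=("notes.txt")

count=0
for f in "${files[@]}"; do
  count=$((count + 1))
done
echo "files: $count"

# Associative array ("hashmap"): has to be declared with -A first.
declare -A owner
owner["summary.csv"]="alice"
owner["notes.txt"]="bob"

# Key-existence check: ${var+word} expands to word only when the key is set.
if [[ -n ${owner[notes.txt]+set} ]]; then
  echo "notes.txt belongs to ${owner[notes.txt]}"
fi

# Iterate over the keys with ${!owner[@]}; the order is unspecified.
for key in "${!owner[@]}"; do
  printf '%s -> %s\n' "$key" "${owner[$key]}"
done
```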
I've worked in bash. I've written tools in bash that ended up having a significant lifetime.
Personally, you lost me at
reading from the database
Database drivers exist for a reason. Shelling out to a database cli interface is full of potential pitfalls that don't exist in any language with a programmatic interface to the database. Dealing with query parameterization in bash sounds un-fun and that's table stakes, security-wise.
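For reference, a sketch of what parameter handling can look like when shelling out to psql, assuming PostgreSQL and a made-up orders table. The value is passed as a psql variable and referenced as :'since', which psql quotes as an SQL literal on the client side; this is not a server-side prepared statement. psql only interpolates variables in scripts read from a file or stdin, not in -c strings, hence the here-document.

```bash
#!/usr/bin/env bash
# Sketch only: DATABASE_URL, the orders table and its columns are hypothetical.
fetch_orders_since() {
  local since=$1
  # -X                    skip ~/.psqlrc
  # -v ON_ERROR_STOP=1    exit non-zero when a statement fails
  # -v since=...          hand the value to psql as a variable, never as SQL text
  psql "${DATABASE_URL:?DATABASE_URL is not set}" -X --csv \
    -v ON_ERROR_STOP=1 \
    -v "since=$since" <<'SQL'
SELECT id, total FROM orders WHERE created_at >= :'since';
SQL
}
```

The quoted 'SQL' delimiter keeps bash from expanding anything inside the query, so the only route for the value into the statement is the psql variable.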
Same with making web API calls. Error handling in particular is going to require a lot of boilerplate code that you would get mostly for free in languages like Python or Ruby or Go, especially if there's an existing library that wraps the API you want to use in native language constructs.
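A rough idea of the boilerplate in question, as a sketch. The function name, URL and file names are assumptions; the curl flags themselves are standard.

```bash
#!/usr/bin/env bash
# Sketch: POST a JSON file and surface failures through the exit status.
post_json() {
  local url=$1 payload_file=$2
  # --fail          exit non-zero on HTTP 4xx/5xx instead of printing the body
  # --silent        no progress meter
  # --show-error    but still print an error message on failure
  # --retry 3       retry transient failures up to three times
  # --max-time 30   give up on any single attempt after 30 seconds
  curl --fail --silent --show-error \
       --retry 3 --max-time 30 \
       --header 'Content-Type: application/json' \
       --data-binary "@$payload_file" \
       "$url"
}

# Hypothetical usage:
#   if ! response=$(post_json "https://example.invalid/api" payload.json); then
#     echo "upload failed" >&2
#     exit 1
#   fi
```

Even with this in place, telling a timeout from a 4xx from a malformed response body still takes extra code, which is the point the comment is making.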
I'm afraid your colleagues are completely right and you are wrong, but it sounds like you genuinely are curious so I'll try to answer.
I think the fundamental thing you're forgetting is robustness. Yes Bash is convenient for making something that works once, in the same way that duct tape is convenient for fixes that work for a bit. But for production use you want something reliable and robust that is going to work all the time.
I suspect you just haven't used Bash enough to hit some of the many many footguns. Or maybe when you did hit them you thought "oops I made a mistake", rather than "this is dumb; I wouldn't have had this issue in a proper programming language".
The main footguns are:
Quoting. Trust me you've got this wrong even with shellcheck. I have too. That's not a criticism. It's basically impossible to get quoting completely right in any vaguely complex Bash script.
Error handling. Sure you can set -e, but then that breaks pipelines and conditionals, and you end up with really monstrous pipelines full of pipefail noise. It's also extremely easy to forget set -e.
General robustness. Bash silently does the wrong thing a lot.
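A small runnable illustration of the first two footguns; the function and file names are made up. shellcheck will flag the unquoted expansion (SC2086), which is deliberate here.

```bash
#!/usr/bin/env bash
# 1. Quoting: an unquoted expansion is split on whitespace (and glob-expanded).
count_args() { echo "$#"; }
name="quarterly report.csv"
count_args $name      # prints 2: the value was split into two words
count_args "$name"    # prints 1

# 2. set -e is ignored while a command runs as a condition, and that
#    includes every command inside a function called from the condition.
step() {
  false               # fails, yet nothing aborts
  echo "step kept going"
}
( set -e; if step; then echo "step reported success"; fi )

# 3. Without pipefail a pipeline only reports the status of its last command.
( false | true && echo "no pipefail: pipeline reported success" )
( set -o pipefail; false | true || echo "pipefail: pipeline reported failure" )
```

Every echo in this sketch runs, including "step reported success", even though step contains a failing command under set -e.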
instead of import sys; sys.argv[1] in Python, you just do $1
No. If it's missing, $1 will silently become an empty string. sys.argv[1] will raise an IndexError. Much more robust.
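The behaviour is easy to reproduce, and bash does have opt-in ways to make a missing argument fatal. A short sketch with invented function names:

```bash
#!/usr/bin/env bash
greet() { echo "hello, $1"; }
greet      # prints "hello, " and carries on without complaint

# With nounset (set -u), expanding the missing argument is an error.
( set -u; greet ) 2>/dev/null || echo "set -u: missing argument rejected"

# ${1:?message} fails with a readable message whether or not set -u is on.
greet_strict() {
  local name=${1:?usage: greet_strict NAME}
  echo "hello, $name"
}
( greet_strict ) 2>/dev/null || echo "greet_strict: missing argument rejected"
```

Both safeguards have to be asked for; the default is the silent empty string described above.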
Sure, there can be security vulnerability concerns, but you’d still have to deal with the same problems with your Pythons your Rubies etc.
Absolutely not. Python is strongly typed, and even statically typed if you want. Light years ahead of Bash's mess. Quoting is pretty easy to get right in Python.
I actually started keeping a list of bugs at work that were caused directly by people using Bash. I'll dig it out tomorrow and give you some real world examples.
As I've matured in my career, I write more and more bash. It is absolutely appropriate for production in the right scenarios. Just make sure the people who might have to maintain it in the future won't come knocking down your door with torches and pitchforks...
I'm fine with bash for CI/CD activities. For what you're talking about, I'd maybe use bash to control/schedule the running of a script in something like Python that queries and pushes to an API, but I do totally get using the tools you have available.
I use bash a lot for automation, but PowerShell is really nice for tasks like this and has been available on Linux for a while. I've seen it deployed into production for more or less this task: grabbing data from a SQL Server table and passing it to SharePoint. It's more powerful than a shell language probably needs to be, but it's legitimately one of the nicer products MS has done.
End of the day, use the right tool for the job at hand and be aware of the risks. You can totally make web requests from SQL Server using OLE Automation procedures: set up a trigger to fire on update and send data to an API from a stored proc. If I recall, there's a reason they're disabled by default (it's been a very long time), but you can do it.
Run checkbashisms over your $PATH (grep for #!/bin/sh). That's the problem with Bash. #!/bin/sh is for POSIX-compliant shell scripts only; use #!/bin/bash if you use bash syntax.
Over the last ten to fifteen years, I've written lots of scripts for production in bash. They've all served their purposes (after thorough testing) and not failed. Pretty sure one of my oldest (and biggest) is called temporary_fixes.sh and is still in use today. Another one (admittedly not in production) was partially responsible for getting me my current job, I guess because the interviewers wanted to see what kind of person would solve a coding challenge in bash.
However, I would generally agree that - while bash is good for many things and perhaps even "good enough" - any moderately complex problem is probably better solved using a different language.
Pretty much all languages are middleware, and most of the original code was shell/bash. All new employees in platform/devops want to immediately push their preferred language; they want Java and Rust environments. It's a pretty safe bet that if they insist on using a specific language, they don't know awk or sed. Bash has all the tools you need, but good developers understand you write libraries for functionality that's missing. Modern languages like Python have been widely adopted, have friendlier onboarding, and will save you time, though.
Saw this guy's post in another thread, he's strawmanning because of lack of knowledge.
Wanna check if a variable’s set to something expected? if [[ <test goes here> ]]; then <handle>; fi
Hey, you can't just leave out "test goes here". That's the worst part by a long shot.
The rest of the syntax, I will have to look up every time I try to write it, but at least I can mostly guess what it does when reading. The test syntax on the other hand is just impossible to read without looking it up.
I also don't actually know how to look that up for the double brackets, so that's fun. For the single bracket, it took me years to learn that that's actually a command and you can do man [ to view the documentation.
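For what it's worth, the double-bracket form is documented inside bash itself. The sketch below lists where to look and a handful of the more common operators; the variable names and values are only examples.

```bash
#!/usr/bin/env bash
# Where the documentation lives:
#   help [[      bash's built-in help for the double-bracket form
#   help test    the operators shared by test and [
#   man bash     see the section "CONDITIONAL EXPRESSIONS"

dir=/                 # example values only
count=7
answer="yes please"

[[ -d $dir ]]                    && echo "-d: directory exists"
[[ -z ${missing:-} ]]            && echo "-z: string is empty"
[[ -n $answer ]]                 && echo "-n: string is non-empty"
[[ $count -gt 5 ]]               && echo "-gt: integer greater-than"
[[ $answer == yes* ]]            && echo "==: glob match when the right side is unquoted"
[[ $answer =~ ^yes[[:space:]] ]] && echo "=~: regular-expression match"
```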
I deeply despise bash (edit: this was hyperbole. I also deeply appreciate bash, as is appropriate for something that has made my life better for free!). That Linux shell defaults settled on it is an embarrassment to the entire open source community. (Edit: but Lexers and Parsers are hard! You don't see me fixing it, so yes, I'll give it a break. I still have to be discerning for production use, of course.)
Yes, Bash is good enough for production. It is the world's current default shell. As long as we avoid its fancier features (which all suck for production use), a quick bash script is often the most reasonable choice.
For the love of all that is holy, put your own personal phone number and no one else's in the script, if you choose to use bash to access a database. There are thousands of routine ways that database access can hiccup, and bash is suitable to help you diagnose approximately 0% of them.
If I found out a colleague had used bash for database access in a context that I would be expected to co-maintain, I would start by plotting their demise, and then talk myself down to having a severe conversation with them - after I changed it immediately to something else, in production, ignoring all change protocols. (Invoking emergency change protocols.)
Edit: I can't even respond to the security concerns aspect of this. Choice of security tool affects the quality of protection. In this unfortunate analogy, Bash is "the pull out method". Don't do that anywhere that it matters, or anywhere that one can be fired for security violations.
(Edit 2: Others have mentioned invoking SQL DB cleanup scripts from bash. I have no problem with that. Letting bash or cron tell the DB and a static bit of SQL to do their usual thing has been fine for me, as well. The nightmare scenario I was imagining was bash gathering various inputs to the SQL and then invoking it. I've had that pattern blow up in my face, and had a devil of a time putting together what went wrong. It also comes with security concerns, as bash is normally a completely trusted running environment, and database inputs often come from untrusted sources.)
In your own description you added a bunch of considerations, requirements of following specific practices, having specific knowledge, and a ton of environmental requirements.
For simple scripts or duct-tape schedules all of that is fine. For anything else, I would be at least mindful if not skeptical of bash being a good tool for the job.
Bash is installed on almost all Linux systems. I would not be very concerned about some dependencies like sqlite, if that is what you're using. But I would be very concerned about others, like jq, which is an additional tool and requirement where you or others will eventually struggle with diffuse dependencies or with maintaining a managed environment.
Even if you query sqlite or whatever tool with the command line query tool, you have to be aware that getting a value like that into bash means you lose a lot of typing and structure information. That's fine if you get only one or very few values. But I would have strong aversions when it goes beyond that.
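A sketch of what that loss looks like in practice. sample_rows stands in for something like sqlite3 -separator '|' app.db 'SELECT id, name, score FROM users'; the table, columns and rows are invented for illustration.

```bash
#!/usr/bin/env bash
# Stand-in for the output of a CLI database client in pipe-separated mode.
sample_rows() {
  printf '%s\n' '1|alice|9.5' '2|bob|' '3|carol smith|7'
}

summarise_rows() {
  local id name score
  while IFS='|' read -r id name score; do
    # Everything is a string by now: an SQL NULL and an empty string both
    # arrive as "", and 9.5 is not something bash arithmetic can work with.
    if [[ -z $score ]]; then
      echo "$name: no score (NULL or empty, cannot tell which)"
    else
      echo "$name: score=$score"
    fi
  done
}

sample_rows | summarise_rows
```

A value that itself contains the separator or a newline would break this parsing entirely, which is one more reason to keep such queries to a few simple columns.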
You seem to be familiar with Bash syntax. But others may not be. It's not a simple syntax to get into and intuitively understand without mistakes. There are too many alternative ways of if-ing and comparing values. It ends up as magic. In your example, if you read the code, you may guess that :- means fallback, but it's not necessarily obvious. And that certainly goes for other magic flags and operators.
As an anecdote, I guess the most complex thing I have done with Bash was scripting a deployment and starting test-runs onto a distributed system (and I think collecting results? I don't remember). Bash was available and copying and starting processes via ssh was simple and robust enough. Notably, the scope and env requirements were very limited.
May I introduce you to rust-script? Basically a wrapper to run Rust scripts right from the command line. They can access the Rust stdlib, crates, and so on, plus do error handling and much more.