Is it better to start from scratch rather than refactoring?

Refactoring gets really bad reviews, but from where I'm sitting as a hobby programmer in relative ignorance it seems like it should be easier, because you could potentially reuse a lot of code. Can someone break it down for me?

I'm thinking of a situation where the code is ugly but still legible here. I completely understand that actual reverse engineering is harder than coding on a blank slate.

23 comments
  • I'm almost always of the opinion that refactoring is better than a rewrite as long as the tech stack is supportable.

    Everyone wants to rewrite stuff because the old system is 'needlessly complicated'. 90% of the time, though, they end up finding it was complicated for a reason, and it all ends up going back in. A rewrite does let you build the system with full knowledge of its scope, instead of living with an old system that has been repeatedly bodged and expanded. And if your old tech stack is unsupportable (not just uncool, unsupportable), a rewrite can be the most feasible way. It will take ages, though, with little or no return until it's all finished.

    Refactoring is more difficult, because developers need to understand the existing codebase better to be able to safely upgrade it in situ. It does mean you get continuous improvement throughout the process, as you update things bit by bit. You do need to test that each change has no unexpected impact, and that can be difficult to do in badly written systems.

    Most devs hate working on other people's code, though, so they prefer rewrites.

    (Ran out of time to go into more detail)

  • Refactoring is always easier.

    Both methods really need acceptance tests before you can start, but with refactoring you at least have a working system for the duration of development.
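
    To make the acceptance-test point concrete, here is a minimal sketch in Java of pinning down what the old code does before touching it. The pricing routine and all class names are invented for illustration; the only idea taken from the comment above is that the working system stays available as the reference while you refactor.

    ```java
    // Hypothetical legacy routine: ugly, but it works and is the reference behaviour.
    final class LegacyPricing {
        static int priceInCents(int quantity, boolean member) {
            int total = quantity * 250;
            if (quantity >= 10) { total = total - total / 10; }
            if (member) { total = total - 100; }
            if (total < 0) { total = 0; }
            return total;
        }
    }

    // Hypothetical refactored version that is supposed to behave identically.
    final class RefactoredPricing {
        private static final int UNIT_PRICE_CENTS = 250;
        private static final int BULK_THRESHOLD = 10;
        private static final int MEMBER_DISCOUNT_CENTS = 100;

        static int priceInCents(int quantity, boolean member) {
            int total = quantity * UNIT_PRICE_CENTS;
            if (quantity >= BULK_THRESHOLD) total -= total / 10; // 10% bulk discount
            if (member) total -= MEMBER_DISCOUNT_CENTS;
            return Math.max(total, 0);
        }
    }

    final class CharacterizationCheck {
        // Counts the inputs on which old and new code disagree.
        static int countMismatches() {
            int mismatches = 0;
            for (int quantity = 0; quantity <= 50; quantity++) {
                for (boolean member : new boolean[] {false, true}) {
                    int expected = LegacyPricing.priceInCents(quantity, member);
                    int actual = RefactoredPricing.priceInCents(quantity, member);
                    if (expected != actual) mismatches++;
                }
            }
            return mismatches;
        }

        public static void main(String[] args) {
            System.out.println("mismatches: " + countMismatches());
        }
    }
    ```

    With a rewrite you would need the same checks, but there would be nothing to run them against until the new system is far enough along.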

  • Responding as a Java/Kotlin maintainer of one single large system with frequent requirement changes, what I call a "high entropy" program. Other developers have different priorities and may answer differently based on the kind of system they work in. Their answers are also valid, but you do need to consider what kind of system they work on when you decide whether or not to follow their advice.

    In my experience, if the builder of the original system didn't care about maintainability, then it's probably faster to rewrite it.

    Of course, then you'd have to be able to tell what maintainable code looks like, which is the tricky part. Good signs include things like:

    • Interfaces
    • Dependency injection
    • Avoidance of static or const functions
    • Avoidance of "indirect recursion" or what I call spaghetti jank that makes class internals really hard to understand.
    • Class names indicate design patterns being used. Such as "Facade". This indicates that the original builder was doing some top-down software design in an effort to write maintainable code.
    • Data has one, and only one, source of truth. A lot of refactoring pain comes from trying to align multiple sources of truth, since disagreements cause mayhem in the program state.

    Bad signs:

    • Oops, all concrete classes.
    • Inheritance. You get one base class, and only one, before you should give the code the death glare. It's extremely difficult for a programmer to tell a true "is a" relationship from a false one; for starters, you need rock-solid class definitions. If the presence of inheritance smells like the original builder was only using it to save time building the feature, burn it with fire! It's anti-maintainable.
    • Too much organizing - you have to open 20 files to find out what one algorithm does. That's a sign that the original builder didn't know the difference between organizing for organizing's sake and keeping code together that changes together.
    • Too little organizing - the original builder shoved everything into one God class so they could use a bunch of global variables. You'd probably have a hard time replacing a component that big. Also, it probably won't let you replace parts of itself - this style forces you to burn down the whole thing to make a change.
    • Multiple sources of truth for data: classes that keep their own copies of data as member variables are a prime example of this kind of mistake.
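
    A rough sketch of what a few of the good signs look like together (an interface, constructor-based dependency injection, and one source of truth). The names StockSource, InMemoryStock and OrderService are hypothetical, made up for this example and not taken from any real system.

    ```java
    import java.util.HashMap;
    import java.util.Map;

    // Interface: callers depend on this, not on a concrete class.
    interface StockSource {
        int unitsInStock(String sku);
    }

    // The single source of truth for stock levels (hypothetical in-memory version).
    final class InMemoryStock implements StockSource {
        private final Map<String, Integer> units = new HashMap<>();

        void set(String sku, int count) {
            units.put(sku, count);
        }

        @Override
        public int unitsInStock(String sku) {
            return units.getOrDefault(sku, 0);
        }
    }

    // Dependency injection: the stock source is handed in through the constructor.
    // The service keeps no copy of the count as a member variable, so it cannot
    // drift out of sync with the real value.
    final class OrderService {
        private final StockSource stock;

        OrderService(StockSource stock) {
            this.stock = stock;
        }

        boolean canFulfil(String sku, int quantity) {
            return stock.unitsInStock(sku) >= quantity;
        }
    }

    final class InjectionDemo {
        public static void main(String[] args) {
            InMemoryStock stock = new InMemoryStock();
            stock.set("widget", 3);
            OrderService orders = new OrderService(stock);
            System.out.println(orders.canFulfil("widget", 3)); // true
            stock.set("widget", 1);
            System.out.println(orders.canFulfil("widget", 3)); // false: the change is seen immediately
        }
    }
    ```

    Because OrderService only knows the interface, a test can pass in a stub such as `sku -> 5`, and the storage behind it can be replaced without touching the service.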
  • Neither.

    If you can code it in a week (1), start from scratch. You'll have a working prototype in a month, then can decide whether it was worth the effort.

    If it's a larger codebase, start by splitting it into week-sized chunks (1), then try rewriting them one by one. Keep good test coverage, linked to particular issues, and from time to time go over the tests to see what can be trimmed or deprecated.

    Legible code should not require "reverse engineering": there should be comments linking to issues, use cases, an architecture overview, and so on. If you're lacking those, start there, no matter which path you pick.

    (1) As a rule of thumb, starting from scratch you can expect a single person to write 1 clean line of code per minute on a good day. Keep those week-sized chunks between 1k and 10k lines, if you don't want nasty surprises.
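
    As a back-of-the-envelope check of that rule of thumb, here is a small Java sketch. The 8-hour day and 5-day week are my own assumptions, not part of the rule; only the rate of 1 clean line per minute comes from the footnote.

    ```java
    final class ChunkEstimate {
        // Assumption (not from the footnote): 5 working days of 8 productive hours each.
        static final int MINUTES_PER_WEEK = 5 * 8 * 60;

        // Weeks needed for one person at the footnote's rate of 1 clean line per minute.
        static double weeksFor(int lines) {
            return (double) lines / MINUTES_PER_WEEK;
        }

        public static void main(String[] args) {
            for (int lines : new int[] {1_000, 2_400, 10_000}) {
                System.out.printf("%d lines: about %.1f weeks%n", lines, weeksFor(lines));
            }
        }
    }
    ```

    Under those assumptions a full week of good days comes to 2,400 lines, a 1k-line chunk is a bit under half a week, and a 10k-line chunk is a little over four weeks.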
