In defense of reinventing wheels

One of the first things a software engineer learns is “don’t reinvent the wheel”. If something is already made, use that instead of writing your own. “Stand on the shoulders of giants, they know what they’re doing better than you”. Writing your own tools and libraries, even when they already exist, is labelled “NIH (Not Invented Here) syndrome” and is considered quite bad.

“But what if my version is better?”. Surely, reinventing the wheel can’t be bad when your new wheel improves on existing wheel designs, right? Well, apparently it still is when the existing software is open source, which is usually the case in our industry. “Just contribute to it”, you’ll be told. However, contributing to an open source project is basically teamwork. The success of any team depends on how well its members work together, which is not a given. Sometimes, your vision for the tool might be vastly different from that of the core members, and it might be wiser to create your own prototype than to try to change the minds of all these people.

However, Open Source politics is not what I wanted to discuss today. It’s not the biggest potential benefit of reinventing the wheel. Minimizing overhead is. You hardly ever need 100% of a project. Given enough time to study its inner workings, you could always delete quite a large chunk of it and it would still fit your needs perfectly. However, the effort needed to do that or to rewrite the percentage you actually need is big enough that you are willing to add redundant code to your codebase.

Redundant code is bad. It still needs to get parsed and usually at least parts of it still need to be executed. Redundant code hinders performance. The more code, the slower your app, especially when we are dealing with backend code, where every line might end up being executed hundreds or even thousands of times per second. The slower your app becomes, the bigger the need to seriously address performance. The result is even more code (e.g. caching) that could have been avoided in the first place by just running what you need. This is the reason software like Joomla, Drupal or vBulletin is so extremely bloated and brings servers to their knees if a site becomes mildly successful. It’s the cost of code that tries to match everyone’s needs.

Performance is not the only drawback of redundant code. A big one is maintainability. This code won’t only need to be parsed by the machine, it will also be parsed by humans, who don’t know what’s actually needed and what isn’t until they understand what every part does. Therefore, even the simplest of changes become hard.

I’m not saying that using existing software or libraries is bad. I’m saying that it’s always a tradeoff between minimizing effort on one side and minimizing redundant code on the other side. I’m saying that you should consider writing your own code when the percentage of features you need from existing libraries is tiny (let’s say less than 20%). It might not be worth carrying the extra 80% forever.

For example, in a project I’m currently working on, I needed to make a simple localization system so that the site can be multilingual. I chose to use JSON files to contain the phrases. I didn’t want the phrases to include HTML, since I didn’t want to have to escape certain symbols. However, they had to include simple formatting like bold and links, otherwise the number of phrases would have to be huge. The obvious solution is Markdown.
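To make the setup concrete, here is a minimal sketch of what such a phrase file and lookup could look like. The phrase keys, the sample text and the phrase() helper are all hypothetical and only illustrate the idea of JSON phrases carrying light Markdown instead of HTML; they are not the code from the actual project.

```php
<?php
// Hypothetical contents of a phrase file (e.g. one JSON file per language).
// The phrases carry light Markdown (bold, links) instead of raw HTML.
$json = '{
    "welcome": "Welcome to **our site**!",
    "help": "Stuck? Read the [FAQ](#faq)."
}';

// Decode into an associative array (second argument true).
$phrases = json_decode($json, true);

// Hypothetical lookup helper: falls back to the key itself
// when a phrase has not been translated yet.
function phrase(array $phrases, $key) {
    return isset($phrases[$key]) ? $phrases[$key] : $key;
}

echo phrase($phrases, 'welcome'), "\n"; // Welcome to **our site**!
echo phrase($phrases, 'missing'), "\n"; // missing
```

Falling back to the key is just one possible design choice; it keeps a missing translation visible on the page without causing an error.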

My first thought was to use an existing library, which for PHP is PHP Markdown. By digging a bit deeper I found that it’s actually considered pretty good and it seems to be well maintained (last update in January 2012) and mature (it has existed for over 2 years). I should happily use it then, right?

That’s what I was planning to do. And then it struck me: I’m the only person writing these phrases. Even if more people write translations in the future, they will still go through me. So far, the only need for such formatting is links and bold. Everything else (e.g. lists) is handled by the HTML templates. That’s literally two lines of PHP! So, I wrote my own function. It’s a bit bigger, since I also added emphasis, just in case:

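function markdown($text) {
    // Links: [label](#target). Only targets starting with # are matched.
    $text = preg_replace('@\\[(.+?)\\]\\((#.+?)\\)@', '<a href="$2">$1</a>', $text);

    // Bold: **text**. The lookbehinds skip asterisks escaped with a backslash.
    $text = preg_replace('@(?<!\\\\)\\*(?<!\\\\)\\*(.+?)(?<!\\\\)\\*(?<!\\\\)\\*@', '<strong>$1</strong>', $text);

    // Emphasis: *text*. Runs after bold, so ** pairs are already consumed.
    $text = preg_replace('@(?<!\\\\)\\*(.+?)(?<!\\\\)\\*@', '<em>$1</em>', $text);

    return $text;
}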

Since PHP regular expressions also support negative lookbehind, I can even skip escaped asterisks (those preceded by a backslash) in the same expression. Unfortunately, since PHP lacks regular expression literals, backslashes have to be doubled (\\ instead of \ so \\\\ instead of \\, which is pretty horrible).
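As a quick illustration of the lookbehind and the doubled backslashes, the sketch below applies the emphasis pattern from the function above to two sample strings. The sample strings and variable names are made up for the demonstration.

```php
<?php
// Emphasis pattern copied from the function above. In a single-quoted PHP
// string, \\\\ reaches the regex engine as \\ (one literal backslash)
// and \\* reaches it as \* (a literal asterisk).
$pattern = '@(?<!\\\\)\\*(.+?)(?<!\\\\)\\*@';

// Unescaped asterisks are converted to emphasis.
$plain = preg_replace($pattern, '<em>$1</em>', 'This is *important*.');

// Asterisks preceded by a backslash fail the lookbehind and are left alone.
// The backslashes themselves stay in the output; removing them would need
// an extra step that the function above does not include.
$escaped = preg_replace($pattern, '<em>$1</em>', 'Rated 5\\* out of 5\\*.');

echo $plain, "\n";   // This is <em>important</em>.
echo $escaped, "\n"; // Rated 5\* out of 5\*.
```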

For comparison, PHP Markdown is about 1.7K lines of code. It’s great, if you need the full power of Markdown (e.g. for a comment system) and I’m glad Michel Fortin wrote it. However, for super simple, controlled use cases, is it really worth the extra code? I say no.

Rachel Andrew recently wrote about something tangentially similar, in her blog post titled “Stop solving problems you don’t yet have”. It’s a great read and I’d advise you to read that too.

  • Anonymous

    I wholeheartedly agree with you, Lea! Instead of blindly including frameworks, libraries, plugins or boilerplates, we should focus more on whether we really need them and whether a specific problem in a project could be solved with a few lines of code instead. It’s so hard to find stripped-down, focused code. Instead, developers tend to get influenced too much by other people’s feedback and build in thousands of features nobody ever needs. We should focus more on writing code that solves a small set of problems and is rather extendable, so people can use it as a starting point for their own projects. Maybe all we need is smaller, more flexible wheels :)

  • Anonymous

    This is exactly why I wrote my own JavaScript library. I only need simple Ajax, DOM, and event handling. So jQuery, while great, is a bit large for anything I do. Therefore, I created kis-js. https://github.com/aviat4ion/kis-js

  • Anonymous

    The wheel is never reinvented, only reengineered. 

  • http://twitter.com/phalasz Peter Halasz

    Good points and I agree with you on this!

    There is no need to include everything if you do not have a need for it. Just make sure it is easy to extend or replace later on.

  • Johnny Cardy

    For client-side code, yes. You’re reducing what the user downloads. For server code, like PHP, what exactly is the overhead? Surely CPU time and disk space are sufficiently insignificant in comparison to developer time?

    • Johann

      But CPU time increases with amount of requests, developer time doesn’t…
      After having all sorts of problems with RSS parser libs for PHP, I wrote my own, based on simplexml_load_string() — it’s a joke, but it gets me the info I want from 98% of the feeds I throw at it (certainly atom ones). When it fails to parse a feed I really do want it to parse, I can extend/fix it, and unlike what I derped around with before, it never ever leads to a fatal error, it’s much smaller, and took much less time to make from scratch than I had wasted on trying to make other stuff work, all of which having a trillion features I’ll never need. I haven’t looked back yet!

      Not because what I coded is so great, but because it’s simple and returns *exactly* the fields I want exactly in the format I want, without any fixes that don’t apply to my web stack etc. That just *feels* nicer, you know? Developer productivity could be argued to be a product of developer motivation/happiness and time, not just time :) Sometimes getting results quickly by using 3rd party code is (very) nice, sometimes being perfectionist (or even being crappy, but learning something new) is nice, too…

      • Kay

        Just to ping in, I think you’re both wrong. Hardware is cheap. Developer time is expensive. If your business case attempts to include your mental state as a line item, then your billrate logically needs to be based on that of your psychiatrist.

        • Daniel Paull

          Oh boy… Kay, you are also wrong. What if cheap hardware is not fast enough? Your generalisation of the economic situation is oversimplified.

        • http://twitter.com/bhwu98 helen wu

          hardware is always cheaper but difficult to make changes. That’s why software comes in.

        • Johann

          Heh. Wrong about what, exactly? Seeing how you basically repeat what the first poster said, with which you claim to disagree and all that ^^ I’m certainly not wrong about my little anecdote, and if you call that a business case, the joke’s on you there, too :P

  • http://twitter.com/aldo_mx Aldo Fregoso

    100% agreed with you. It might not be related to Web Development, but right now I’m trying to design/develop a rhythm-based game intended to replace StepMania (the most popular and used one), and your article described exactly many of the reasons I decided to start a new game from scratch. I felt like you read my mind.

  • Anonymous

    Good thoughts, and I’m so relieved to hear someone just come out an’ say it. To your reasons for reinventing wheels, I would add: when you write your own stuff, you know how it works, and it’s shaped and tuned for your cognitive processes. Kinda like custom-carved golf club handles.

    Now, some libraries (e.g., jQuery) are so well thought out that they require almost no adjustment to learn, but I have written libraries of my own simply because the existing solutions had interfaces that were obviously the product of a different sort of brain from mine.

    In these cases, I guess rolling your own is a calculated trade-off — more work up front, but better maintainability in the long run.

  • Emilio

    Thanks for making a point in such a clear and eloquent manner. I’m starting to think, more and more, that the future for front-end (especially) frameworks and libraries somehow contains some sort of build-chain for their API consumers.

    I’m imagining either a declaratively defined build-specification (basically a way of cherry-picking parts of a greater library, slimming that which is included) or something that does static analysis to see which parts of the included files/functions/whatever are _actually_ used.

    Keep up your great work!

    Peace out,
    Emilio

  • David

    Interesting read, valid points you made – but I’m wondering how much of a problem this is. I mean, how many people actually use a library (let’s use the one you mentioned as an example) that’s 1.7k lines when they are able to do it in a few? It makes me think: why would anyone use jQuery to select an element and do some basic adjustments to it, instead of just using the native getElementById etc …? I don’t think anyone would. (Maybe a very small number of people would.)

    • http://leaverou.me Lea Verou

      Oh you’d be surprised.

      • David

         I guess I was hoping humanity was better than this :-(.

  • Joel Caballero

    Sorry, disagree with almost everything you said.  The overhead comes in maintaining the code you write.  Please don’t forget to document, and unit test where possible.  Or the next poor sap who starts where you left off ( usually me ) has to dig thru a framework that only one person understands (you).  Libraries with redundant code are bad and should be avoided, but there are real good ones out there that frankly nail it.  And if you find the need to write your own, do it as a last resort and stick to Design Patterns that are tested and proven.  By doing this you’ll live a much less frustrating life :)

  • Anonymous

    The first thing you learn in computer science is critical thinking and although I do agree with you on some parts, others I don’t. If you’re doing a small website then (maybe) that’s fine, do your own thing and you’ll probably be quicker. If you’re working on very large projects then you’ll see that most of the time you’ll need some sort of framework/lib to get you started. I guess the most important thing is to know what you’re doing and what you’re working with.
    About frameworks, it’s not only the code, per se, that comes with them, it’s also a way of programming: some sort of protocol that will kinda “force” everyone to do things in a certain way, and that’s really important when you’re dealing with a lot of coders. I don’t get the redundant code bit, though. If you’re not using something then it’s not being called. So other than disk space there’s not much to worry about, is there?

  • Dieter Müller

    Focusing on a single developer, you are right (80%). But if more people work on one project, it is difficult to use only the needed pieces of frameworks; you had better use all the code.
    20% wrong, because when you work alone and you have to change a project after one year, it is easier to use the complete code of e.g. PHP Markdown than only a small piece of it. Because when you use it in many projects, you know the tools and frameworks and you don’t have to think about “did the function exist or not”, “what if I change this, what happens then”…

  • http://www.coldfusionjedi.com Raymond Camden

    I know the focus here is on “projects” (or, Real World Stuff), but I’m a huge believer in reinventing the wheel when learning. Let’s say I wanted to learn PHP. I’d probably build a blog. Why? So I could focus on the language instead. I know what features a blog needs so I’d be laser focused on figuring out how to implement it in the language I’m learning.

  • DigDag

    If you want to re-write something that already exists, then publish it.  This is especially important if you are reinventing a wheel in an enterprise environment.  You need to make sure that x years down the line developers are familiar with the framework you have built.

    Also if you think that what you are doing is really better than what exists, have you tried contributing those ideas to make the current wheels better?

    • http://leaverou.me Lea Verou

      Actually, yeah, I have. I’ve published lots of open source code, just visit my github profile.

  • Johan

    Interesting post – I would however say that the main reasons for rewriting instead of reusing are more than just technical. The reason multiple implementations are a good thing is that they create competition, which in turn leads to better quality products. Think about it – Linux itself is the ultimate NIH project. An open source Unix already existed but with some legal problems – BSD. Minix was also out there. Then Linux came in, with fewer features at the time, and it came to dominate.

  • Stan1026

    If no one had ever reinvented the real wheel – our cars would be rolling around on big wooden logs…

  • http://www.facebook.com/ted.weissgerber Ted Weissgerber

    I guess it all depends on what kind of wheel you are re-inventing?

  • Kay

    Not quite “in defense of elitism,” but I generally agree.

  • Kay

    BTW, in regards to reinventing wheels and templating via regular expressions: In my experience, language agnostic, regular expression classes/etc are far too heavy for how you’re using them (lack of verbosity and complexity). In this case I think you would probably see performance increases from simply using an injection pattern and string or stringcollection.Replace().

    Rule of thumb being that string replace algorithms see diminishing returns as the observable data’s complexity increases, whereas a larger amount of data would show exponential returns with regular expressions. The exception to this would be stringbuilder.replace and a fixed key set in .NET and Java, which would outperform regex every single time. That’s a specific exception though.

    Just a thought, you’d need to performance test it in PHP, but I’ve never seen a regex lib perform better than a string replace operation on a small set. Unless you intend for this to be infinitely scalable in terms of the template’s extent, in which case disregard and soldier on. =D

    • http://leaverou.me Lea Verou

      Keep in mind that the average string length is a few words and most of them don’t even contain Markdown. Also, the site is unlikely to have to scale, due to its nature.
      Therefore, that would be premature optimization. Especially since there are absolutely no performance issues at the moment.

      • Matt Aybara

        I’m with Kay in regards to regex v string replace. I’ll also add, that string replace isn’t just about performance. It is also about simplicity and readability. Most people have to spend more brain cycles reading a regex than they do a string replace.

        In regards to your post,

        Yes, sometimes it is the right choice to reinvent the wheel; however, most of the time it is not. It is a form of premature optimization to say that you shouldn’t use 3rd party libs unless you would use $arbitraryPercentOfCode. Try somebody else’s wheel and use it as long as it works. If the maintenance of their wheel is greater than re-engineering it yourself, make your own. If it doesn’t scale the way you need it to, make your own. It takes experience to be able to evaluate a 3rd party lib and make the right call upfront. When you don’t have that experience, it is almost always the right call to go with the 3rd party lib.

        • http://leaverou.me Lea Verou

          Are you sure you’re talking about the same thing as Kay? This can’t be done with simple str_replace()s.

  • Laxator2

    Never heard the phrase “Why don’t you just use ?” said by people who don’t have any idea what code you write? I prefer to re-invent and re-engineer the wheel every time I come across that.

  • http://twitter.com/dafsverige DafrallahKhan

    I landed on this article almost accidentally, and I guess that inventing the wheel again after the Mesopotamians is not a good idea. The problem was a bit twisted by an unknown cause. By analogy, take the case of great software like Oracle, SAP, MERISE, etc. They all work according to your needs; even though they contain millions of strings and lines of code, this does not hinder or decrease performance in any way. The issue is solved when the software is optimized by the staff who implement it in the organisation. In reality their work is not to edit any script but to adapt the wheel to the terrain and environment they are in.
    In conclusion, what is the use of inventing another new wheel when we already have it done, and perfectly by the way?

  • Ken

    The biggest problem I have in not re-inventing the wheel is a reliable process for finding the wheel in the first place. The second problem after finding the wheel may be $. They say time is money, so how much money do you want to spend looking. Should I spend more by looking more?

    • http://twitter.com/bhwu98 helen wu

      “The biggest problem I have in not re-inventing the wheel is a reliable process for finding the wheel in the first place.” If you do not trust a programmer to do just that, you should not let them program for you.

  • Martin Omander

    I agree with the article. The emergence of higher level languages and frameworks has tilted the scale for “buy vs build” toward the “build” side. (I guess it’s “import vs build” when it comes to open source.) Why learn the quirks of a new software library if I can hack together the exact sub-set of functionality I need in a dozen lines of Python? In the days of C, the trade-off was quite different.

  • http://gatesvp.blogspot.com Gaëtan Voyer-Perrault

    “Redundant code is bad. It still needs to get parsed and usually at least parts of it still need to be executed. Redundant code hinders performance.”

    Redundant code == multiple pieces of code that perform the same operation

    Clearly this is bad. 

    But your article is not describing redundant code. You are describing “code I do not intend to use”. This code is not redundant, just unused.

    The eternal pitfall with “unused code” and “re-inventing the wheel” is simply that the (supposedly) unused code is probably there for a reason. Somebody else has made a mistake and learned a lesson that you can simply avoid by using their code.

    As much as you are defending “wheel re-invention”, you did not re-invent the wheel at all. You simply duct-taped a one-off solution to a very specific problem.

  • http://www.facebook.com/DaveMaiden Dave Maiden

    I personally believe that the re-invention of the wheel as it is phrased is a positive motion and should be done every few years. This applies to programming but also to organisational operations. I have worked with many people (including those in organisations) who lead the project (and those beneath), and they are not aware of exactly what is happening.

    I have been in organisations where the process flows are that messed up that they barely work, but nobody has the insight to look into it. If we started again and documented it (and maybe did again in a few years) everyone would have a reference and would have different input.

    Some Programmers will ensure their code simply works to the specification, others will ensure the most efficient approach, now the problem is if that code was viewed five years later the chances are the compiler would have improved or could have given a quicker option.

    It’s a never-ending battle to ensure everything is made future compliant, but we cannot predict the future technologies, so provided everything is understandable, maintainable and can be followed by anybody with some knowledge, then it is going to be changed.

    Dave

  • Hagen

    Good to see that not all developers blindly “frankenstein together” code from various libraries and samples. It is important to analyze, look into tradeoffs, and in most cases to choose the simplest working solution for a problem.
    If this involves using a library – fine.
    But if the full library is “overkill”, pruning it down or writing your own code makes the whole system more robust and better maintainable.

  • http://WebWizArt.be/ Younes Baghor

    Great article, I think the same way about libraries. I still use vanilla JS and am not intending to change that soon. I have had the same discussion with a lot of people, but still they couldn’t persuade me. People don’t even know plain JavaScript anymore; if you do a search on Google you have to add -jquery to find some decent code examples.

    To all the people that want to stick with the old wheel and are mumbling about adding to code,
    you do it wrong, others are so good, bla bla bla. The worst reply I got: “So the UI looks the same everywhere.” Sorry, I don’t think it should be like that; the UI should be different and device specific.
    When Jobs created a new music player, I heard nobody say to him he had to stick with the old wheel. And when the iPhone was released I didn’t hear anything either.

    The breakthroughs in all fields in history came from people who reinvented/reengineered stuff, not because they followed the other sheep. Look at the WHATWG and W3C about HTML5: WHATWG is leading the sheep now.

    • http://twitter.com/bhwu98 helen wu

      Arguing about code, right or wrong, is the most expensive way to show one’s ego. No one in their right mind would do that.

      • http://WebWizArt.be/ Younes Baghor

        My bad, but I don’t get your answer. Can you explain, please?

  • Nerd in a Can

    Agreed. I’m a developer and I may just be an arrogant schmuck, but I have yet to see a “wheel” produced by any programmer, professional or otherwise, that couldn’t use some improvement. Most “wheels” in development come with so many ridiculous assumptions that they don’t port well to multiple places making the concept of code reuse ridiculous in practice.

  • http://twitter.com/bhwu98 helen wu

    I am with you, Lea. “reinventing wheel” is the only way to prolong software’s life.

  • Jeff Edsell

    I think two problems come into play: perception and time. Writing your own functions almost always takes longer than using a plugin (though sometimes, bending a plugin to your will can take just as long). But it’s easy to fall into the trap of thinking that rolling your own always takes so long that it’s not going to be worth the effort.

    Sometimes, of course, you simply have no time. It has to be done yesterday, and loading a plugin with 90% more functionality than you need is a necessary evil. But if a project isn’t a hair-on-fire emergency, it’s almost always worth at least considering writing your own functions, perhaps even knocking out a little proof-of-concept code, especially if you’re already using a framework to take care of the heavy lifting.

  • Gergo Buchholcz

    I think this kind of mentality leads to unmaintainable spaghetti code. I would never ever want to touch that 3-line regexp nightmare that you introduced.
    Markdown might be a bit of an overkill at first glance, but it is well documented, supported and feature rich. Countless times I have had to add new features to code like this that was “reinvented” by a long lost developer. It’s not the way to go. If you have the time (chance, opportunity, whatever) then never go the Fast And Dirty ™ way.

  • http://twitter.com/philipobenito Phil Bennett

    Good read. To add another reason in favour of reinventing the wheel: learning. What better way to understand something than writing it yourself? Look at the Laravel framework, for example. This was started as an exercise for a predominantly .NET developer to learn more about PHP and the best ways to implement design patterns in PHP, and look at it now, on its way to becoming the dominant non-enterprise framework in the industry.
