In defense of reinventing wheels
One of the first things a software engineer learns is âdonât reinvent the wheelâ. If something is already made, use that instead of writing your own. âStand on the shoulders of giants, they know what theyâre doing better than youâ. Writing your own tools and libraries, even when one already exists, is labelled âNIH syndromeâ Â and is considered quite bad. âBut what if my version is better?â. Surely, reinventing the wheel canât be bad when your new wheel improves existing wheel designs, right? Well, not if the software is open source, which is usually the case in our industry. âJust contribute to itâ youâll be told. However, contributing to an open source project is basically teamwork. The success of any team depends on how well its members work together, which is not a given. Sometimes, your vision about the tool might be vastly different from that of the core members and it might be wiser to create your own prototype than to try and change the minds of all these people.
However, Open Source politics is not what I wanted to discuss today. Itâs not the biggest potential benefit of reinventing the wheel. Minimizing overhead is. You hardly ever need 100% of a project. Given enough time to study its inner workings, you could always delete quite a large chunk of it and it would still fit your needs perfectly. However, the effort needed to do that or to rewrite the percentage you actually need is big enough that you are willing to add redundant code to your codebase.
Redundant code is bad. It still needs to get parsed and usually at least parts of it still need to be executed. Redundant code hinders performance. The more code, the slower your app. Especially when we are dealing with backend code, when every line might end up being executed hundreds or even thousands of times per second. The slower your app becomes, the bigger the need to seriously address performance. The result of that is even more code (e.g. caching stuff) that could have been saved in the first place, by just running what you need. This is the reason software like Joomla, Drupal or vBulletin is so extremely bloated and brings servers to their knees if a site becomes mildly successful. Itâs the cost of code that tries to match everyoneâs needs.
Performance is not the only drawback involved in redundant code. A big one is maintainability. This code wonât only need to be parsed by the machine, it will also be parsed by humans, that donât know whatâs actually needed and what isnât until they understand what every part does. Therefore, even the simplest of changes become hard.
Iâm not saying that using existing software or libraries is bad. Iâm saying that itâs always a tradeoff between minimizing effort on one side and minimizing redundant code on the other side. Iâm saying that you should consider writing your own code when the percentage of features you need from existing libraries is tiny (lets say less than  20%). It might not be worth carrying the extra 80% forever.
For example, in a project Iâm currently working on, I needed to make a simple localization system so that the site can be multilingual. I chose to use JSON files to contain the phrases. I didnât want the phrases to include HTML, since I didnât want to have to escape certain symbols. However, they had to include simple formatting like bold and links, otherwise the number of phrases would have to be huge. The obvious solution is Markdown.
My first thought was to use an existing library, which for PHP is PHP Markdown. By digging a bit deeper I found that itâs actually considered pretty good and it seems to be well maintained (last update in January 2012) and mature (exists for over 2 years). I should happily use it then, right?
Thatâs what I was planning to do. And then it struck me: Iâm the only person writing these phrases. Even if more people write translations in the future, they will still go through me. So far, the only need for such formatting is links and bold. Everything else (e.g. lists) is handled by the HTML templates. Thatâs literally two lines of PHP! So, I wrote my own function. Itâs a bit bigger, since I also added emphasis, just in case:
function markdown($text) {
// Links
$text = preg\_replace('@\\\\\[(.+?)\\\\\]\\\\((#.+?)\\\\)@', '<a href="$2">$1</a>', $text);
// Bold
$text = preg_replace(â@(?<!\\\\)\\*(?<!\\\\)\\*(.+?)(?<!\\\\)\\*(?<!\\\\)\\*@â, â<strong>$1</strong>â, $text);
// Emphasis
$text = preg_replace(â@(?<!\\\\)\\*(.+?)(?<!\\\\)\\*@â, â<em>$1</em>â, $text);
return $text;
}
Since PHP regular expressions also support negative lookbehind, I can even avoid escaped characters, in the same line. Unfortunately, since PHP lacks regular expression literals, backslashes have to be doubled (\\
instead of \
so \\\\
instead of \\
, which is pretty horrible).
For comparison, PHP Markdown is about 1.7K lines of code. Itâs great, if you need the full power of Markdown (e.g. for a comment system) and Iâm glad Michel Fortin wrote it. However, for super simple, controlled use cases, is it really worth the extra code? I say no.
Rachel Andrew recently wrote about something tangentially similar, in her blog post titled âStop solving problems you donât yet haveâ. Itâs a great read and Iâd advise you to read that too.