A Minimalist Approach to Software∗
Ammar H. Hakim
March 25th 2021

1. What is minimalism? Minimalism does not mean writing less code, but writing code that counts. Minimalist programs are elegant, have a tight code structure, and do one thing well.

2. Brutal minimalism means removing all features and code that are superfluous, that is, not needed to achieve the minimum viable program (MVP). Some features are “good to have”, but if they are not “must have” they should be eliminated when aiming for a brutally minimalistic design.

3. Programming is never about lines of code, less typing, or other such superficial measures (though all of these are typically the outcome of minimalist design). Programming is about expressing executable ideas cleanly. It is a very difficult art: it requires shedding the fear of (full or partial) rewrites and keeping a sharp, mathematical and axiomatic focus on the minimal concepts required to implement features efficiently.

4. In our notation an object is a unit in which related data are kept together. Example: instances of a C struct containing plain-old-data (POD). To achieve a clean design, data and the operators on those data must be kept separate. This is the mathematically correct thing to do, as it allows constructing different systems of functions to manipulate the same data in an independent and non-intrusive manner.

5. Data in objects should be treated as read-only and not directly modified. Data modification should only happen via functions.

6. Though flexibility and extensibility are sometimes important, very often they are not. In such situations only the special case should be handled, but handled well.

7. Flexibility and extensibility should not be based on the existence of common terms in ordinary language to describe two otherwise disparate systems. Ordinary language is not precise enough to express commonality, and only through very careful analysis does one discover commonality (or the lack thereof).

8.
Different systems should not be shoehorned into one without significant analysis. In fact, flexibility is usually increased when systems are cleanly separated but allow structured communication between them. Consider Unix command-line tools and their simple and elegant chaining mechanisms via pipes and output/input redirection.

∗Updated April 16th 2021

9. When flexibility and extensibility are required, they should be implemented with an elegant and minimalist design, without the need for complicated class hierarchies and fat interfaces. The hierarchical class structure that first suggests itself usually does not work cleanly in practice. Object (data) nesting is fine; class inheritance usually is not, as it leads to incestuously shared state.

10. Inheritance should not be used to implement feature extensions, as it leads to code bloat and brittle class hierarchies. Some code duplication is fine (and duplicate code should be refactored into functions).

11. Separation of data and operators allows dispatch on multiple object types. That is, functions can be written that take two or more objects to perform an action. This removes the incestuous state sharing that occurs when data and operators are mixed.

12. Dependencies should be minimized, and dependencies that one does not understand in a deep manner should especially be avoided. Exponential dependencies (where each dependency adds two more) should be avoided at all costs. If dependency management leads to the adoption of a complex package manager that does “magical things”, like installing everything under the sun from scratch, then the situation should be re-examined very carefully and simplification undertaken.

13. Modern scripting languages are very flexible and powerful. Some, like Lua, are specially designed for embedding in larger applications and have a very tiny footprint. C code (or a C API) is very easy to bind to from multiple languages.
Hence a good architectural motif (used in redis, haproxy and most games) is to write the low-level, performance-critical code in C (or a carefully curated subset of C++) and use scripting to provide higher-level control.

14. It should be remembered that not all control structures need be possible in C or the Curated-C++. Higher-level scripting languages allow more complex and elegant control structures (like lexical closures or coroutines) even when they are missing from the low-level language used to implement the performance-critical aspects of the code.

15. The API exposed to the scripting language should be fine-grained enough to allow the use of complex control structures like lexical closures, coroutines and iterators. Give users the ability to pass structured data between the script and the compiled layer.

16. Proper use of C structs and function pointers can lead to surprisingly elegant designs and clean code. Memory management is not the burden it is made out to be. Recall that highly robust and reliable software like the Linux kernel, redis, haproxy, sqlite, etc. is written in C. Static analysis tools and valgrind are your friends. Remember: at first one wants results, but very soon one wants control. C and Curated-C++ give you complete control.

17. Last year’s problems should not be papered over with yet another layer of code. Layered software design is good, but layers should be used in the sense of indirections and not bandages.

18. When evaluating an existing software library or framework, popularity should not be used as a metric. Some popular libraries may have high-quality code, but more often popularity is simply an indicator of good marketing (funding pressures or corporate greed to establish platform tie-in). Further, popularity, especially when it comes with a promise of quick initial returns, often indicates mediocrity, as popularity can only be achieved by targeting people who can’t be bothered to develop a deep understanding and create minimalist programs.
Typical minimalist applications do not have extensive enough needs to require including everything-under-the-sun frameworks. In fact, it is a good idea to avoid anything that has the word “framework” or other buzzwords in it.

19. Minimalist programs and MVPs should be quick to build. Incremental builds should not take more than a few seconds, and a full-system distclean rebuild should not take more than several seconds. Note that using some heavily (infernally) templated C++ libraries slows builds notoriously. These infernal templated libraries (ITLs) should be avoided.¹

20. There is no need for the development and deployment build systems to be the same. In fact, for deployment a simpler build system (even plain make) can be a good option. Recall that at deployment one does not need full dependency tracking, so it is sufficient to simply build everything. Efficient development, on the other hand, requires fast incremental builds, and hence a fast minimum-dependency build system (preferably with a nice scripting language) is desirable.

21. Consider sqlite, which takes the extreme step of creating a single monster C file that can be built simply with “cc -c -O sqlite3.c”. This is not always possible or desirable for all projects, and perhaps an “amalgamated directory” approach is better. In this approach a script or build target generates a deployment directory, constructs Makefiles and/or shell scripts to compile all the code, and packs everything into a tar.gz archive. Note that cmake-generated Makefiles are not stand-alone and so can’t be used in amalgamated deployment archives. Obviously, this amalgamation approach does not apply to script code, but it is not needed there either: amalgamation is meant to ease builds, and scripts do not require building.

22. In summary: creating efficient and innovative software requires a minimalist or even brutally minimalist approach. The goal should be to construct one or more MVPs that have structured data exchange protocols instead of giant monolithic programs.
Frequent rewrites and refactoring may be needed before one discovers the correct design of an MVP. Monolithic programs and over-engineered systems are almost invariably slower, harder to maintain (despite their developers having used the latest OOP and “Agile” fads to make them extensible) and more difficult to understand.

¹ Templates are sometimes needed, but often after first discovering templates one tends to go into an orgiastic frenzy to use templates everywhere. Templating often becomes like a viral infection: once it is introduced it spreads everywhere. Insane use of templates should be avoided at all costs.
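
As a rough illustration of items 4, 5, 11 and 16, the C sketch below keeps data in POD structs and puts every operator in a free function. The shapes (circle, rect) and all function names are hypothetical, chosen only for this example; they are not taken from any particular code base, and this is one possible arrangement rather than a prescribed one.

```c
#include <assert.h>
#include <stddef.h>

/* Item 4: objects are plain-old-data; no behavior is attached to them.
   (Hypothetical example types.) */
struct circle { double radius; };
struct rect   { double width, height; };

/* Operators live outside the data and read it through const pointers. */
double circle_area(const struct circle *c)
{
  return 3.14159265358979323846 * c->radius * c->radius;
}

double rect_area(const struct rect *r)
{
  return r->width * r->height;
}

/* Item 5: modification happens only through a function, which is also
   the single place where the precondition is checked. */
void rect_scale(struct rect *r, double factor)
{
  assert(factor > 0.0);
  r->width  *= factor;
  r->height *= factor;
}

/* Item 11: a function that acts on two different object types. Neither
   struct needs to know about the other. Returns 1 if the circle fits
   inside the rectangle, 0 otherwise. */
int circle_fits_in_rect(const struct circle *c, const struct rect *r)
{
  double diameter = 2.0 * c->radius;
  return diameter <= r->width && diameter <= r->height;
}

/* Item 16: extensibility through a struct holding a function pointer and
   the object it applies to, with no class hierarchy involved. */
struct area_op {
  const void *obj;                  /* object the operator acts on */
  double (*area)(const void *obj);  /* operator bound to that object */
};

/* Thin adapters from the generic signature to the typed operators. */
double circle_area_op(const void *obj) { return circle_area(obj); }
double rect_area_op(const void *obj)   { return rect_area(obj); }

/* Works on any mix of objects without knowing their concrete types. */
double total_area(const struct area_op *ops, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; ++i)
    sum += ops[i].area(ops[i].obj);
  return sum;
}
```

A caller would build an array of struct area_op entries, for instance { &r, rect_area_op } and { &c, circle_area_op }, and pass it to total_area. Adding a new shape then means adding one struct and one adapter function, and none of the existing code has to change.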