My 10 Years of Programming Experience

Whether or not it's the end of the calendar decade, it's the end of a programming decade for me. I started early in 2010, and since then I've been programming almost every day, including weekends and vacations. This was a really exciting period in my life, and only recently did I realize how much time has passed since 2010. So I decided to put into words some of my learnings from that time. Warning: the content of this article is highly opinionated and extremely subjective.

Combine object-oriented, procedural, and functional programming techniques

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.
–– Dr. Alan Kay

Object-oriented programming seems to be the prevalent technique these days. Different people hear different things when one says object-oriented. I personally don't treat inheritance and polymorphism as necessary attributes of object-orientation. However, hiding and protecting the state behind public methods is a mandatory attribute of the approach to me. Entities combining hidden state with public methods are called objects, and they communicate with each other by sending messages (i.e. invoking the public methods of other objects). Most probably, this definition of OOP is too narrow, but these aspects seem the most practical and useful to me. I try to employ them to express any suitable problem in code.

However, the keyword in the previous paragraph is "suitable". Yes, there are problems that fit well into the OO-paradigm. For instance, consider the vector, an abstract data structure solving the dynamic array problem:

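import java.util.Arrays;

class Vector<T> {
    private int size;
    private int capacity;
    private T[] items;

    @SuppressWarnings("unchecked")
    public Vector(int capacity) {
        this.size = 0;
        this.capacity = capacity;
        // Java cannot create a generic array directly, hence the cast.
        this.items = (T[]) new Object[capacity];
    }

    public int size() {
        return this.size;
    }

    public void push(T elem) {
        if (this.size == this.capacity) {
            this.realloc();
        }
        this.items[this.size++] = elem;
    }

    // One possible growth policy: double the capacity (at least 1).
    private void realloc() {
        this.capacity = Math.max(1, this.capacity * 2);
        this.items = Arrays.copyOf(this.items, this.capacity);
    }

    // ...
}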

The vector hides its state (current size, current capacity, etc) and allows the modification only through its public interface. Thus, assuming the proper implementation, the consistency of the state is maintained constantly.

However, not every problem can be easily designed in terms of objects and their communications. And I'm not talking here about some specific domains. Consider the very common problem of registering a new user:

def register_user(name, email):
    user = User(name, email)

    user_repo.save(user)
    start_trial_period(user)
    schedule_welcome_email(user)

    # etc, etc...

    return user

What kind of object should be the owner of the register_user() method? Early in my programming career, I would have argued that there should be a registration service class. However, what about its state? What attributes should the registration service possess? Are we modifying its state by registering a new user?

At other times, you may find yourself writing a method that performs some computation related to the current class without accessing any of its attributes:

class HttpClient:
    def __init__(self, foo, bar, baz):  # ...plus any other dependencies
        pass

    def request(self, url, query, headers, etc, attempts=1):
        for c in range(attempts):
            resp = self._do_request(url, query, headers, etc)
            if is_fine(resp):
                return resp

            self.sleep(self._backoff_delay(c))

        raise Exception('HTTP request failed')

    def _backoff_delay(self, attempt_no):
        return min(1000, 100 * (attempt_no + 1))

The _backoff_delay method could just as easily be a self-sufficient function. And there is a huge benefit in keeping this function separate from the class: testing it becomes much simpler. If the function stayed a method, then in order to test the backoff delay computation we would need to create an instance of HttpClient, supplying all its dependencies (and probably the dependencies of the dependencies) as well as fake HTTP request parameters. With a pure function we can just do:

assert _backoff_delay(0) == 100
assert _backoff_delay(1) == 200
assert _backoff_delay(10) == 1000

So, eventually, I came to the conclusion that from time to time it's absolutely fine to have just a function performing some logic. Quoting the gorgeous Zen of Python: "[...] practicality beats purity." Stop writing classes for the sake of having classes. Use classes (or objects) only if the problem fits well into the OO-paradigm. Procedural programming is a valid and very powerful technique.

I confess that for a long time I was biased toward pure OO, ignoring and avoiding other approaches. Accepting the procedural style helped me to become a better programmer. At least I hope so.

However, during my career, I have met quite a few developers with the opposite bias. They apply procedural programming even to tasks that would benefit a lot from the OO-paradigm:

interface Player {
    cursed: boolean;
    scores: number;
}

interface Monster {
    rank: number;
}

function collectCoin(p: Player) {
    if (!p.cursed) {
        p.scores += 100;
    }
}

function killMonster(p: Player, m: Monster) {
    if (!p.cursed) {
        p.scores += m.rank;
    }
}

We have a struct Player here and all its attributes are public. We also have a bunch of business logic functions performing some game actions. These functions are very easy to test: forge a fresh player object, pass it into the function, and check the attributes afterward. However, there is a problem with this approach. Imagine we add a new function:

function savePrincess(p: Player) {
    p.scores += 100500;
}

It's so easy to forget about the cursed check. Thus, we have broken the game logic by violating the consistency of the player's state. An OO implementation of the player would let us avoid this pitfall:

class Player {
    private cursed = false;
    private scores = 0;

    addScores(score: number) {
        if (!this.cursed) {
            this.scores += score;
        }
    }
}
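
To show how such a class closes the hole, here is a runnable sketch of the same idea in Python. The snake_case names are transliterations of the ones above, and the constructor argument is an assumption made for the example.

```python
class Player:
    """Keeps the cursed check in one place, next to the state it protects."""

    def __init__(self, cursed=False):
        self._cursed = cursed
        self._scores = 0

    @property
    def scores(self):
        return self._scores

    def add_scores(self, points):
        # The only way to change the score, so the rule cannot be forgotten.
        if not self._cursed:
            self._scores += points


def save_princess(player):
    # No cursed check here, and none is needed.
    player.add_scores(100500)


cursed = Player(cursed=True)
save_princess(cursed)
assert cursed.scores == 0

lucky = Player()
save_princess(lucky)
assert lucky.scores == 100500
```

A new game action can no longer bypass the rule, because it has no direct access to the score.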

So, how to decide between these two techniques? My personal waymark is a potential violation of encapsulation. If I see publicly exposed state and a number of functions here and there modifying it, I immediately factor it out into a class. In all other cases, the procedural approach is superior due to its simplicity.

But what about the functional programming paradigm? I have incorporated it into my style at a lower level. If my strategy is based on a mix of object-oriented and procedural approaches, my tactics are based on functional methods. Compare the following implementations of a hypothetical revenue function:

function revenue(orders) {
    let total = 0;
    for (const o of orders) {
        if (o.paid) {
            total += o.price;
        }
    }
    return total * fee;
}

// vs

function revenue(orders) {
    return _.sum(orders.filter(x => x.paid).map(x => x.price)) * fee;
}

Due to its declarative form, the second version is much shorter. It even reads like plain text: the sum of all paid order prices multiplied by the fee rate.

Let's try to go one step further:

function promoEligible(user, orders, etc...) {
    let total = 0;
    for (const o of orders) {
        if (o.paid) {
            total += o.price;
        }
    }

    if (total == 0) {
        return false;
    }


    // ...
    // do 42 more checks before returning true.
    // ...
}

If we follow the imperative way here, the total variable will stay around until the very end of our function even though it was needed only at the very beginning. This increases the mental load, especially when I need to read this code a month later. I'd rewrite it as follows:

function promoEligible(user, orders, etc...) {
    return orders.some(x => x.paid)
        && anotherCheck(user)
        && oneMoreCheck(etc);
}

Not a single extra local variable has been introduced at the promoEligible() level.
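
The same shape can be sketched in Python with `any` and `all`. The `Order` tuple and the `checks` parameter are hypothetical stand-ins for the elided arguments and predicates above.

```python
from collections import namedtuple

Order = namedtuple("Order", ["price", "paid"])  # hypothetical order shape


def has_paid_order(orders):
    return any(o.paid for o in orders)


def promo_eligible(user, orders, checks=()):
    # `checks` stands in for the remaining predicates; each one takes the user.
    return has_paid_order(orders) and all(check(user) for check in checks)


orders = [Order(price=10, paid=False), Order(price=20, paid=True)]
assert promo_eligible("alice", orders)
assert not promo_eligible("alice", orders, checks=[lambda u: u == "bob"])
assert not promo_eligible("alice", [])
```

Both `any` and `all` short-circuit, so later checks are skipped as soon as the outcome is known.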

I wouldn't call this technique full-fledged functional programming. It's rather a selection of some of its techniques. Since I started writing code this way, I can rarely find for-loops in my code, my utility functions tend to be smaller and pure (i.e. avoiding mutation of their arguments, as well as any other side effects), and my variables tend to be immutable. And for some reason, I feel safer and sleep better.

I came to this technique rather intuitively, but it seems that modern languages like Rust are trying to make these patterns first-class citizens. Variables in Rust are immutable by default and almost every statement in Rust is an expression, e.g.:

let res = match compute_something() {
    VeryGood(x) => 5 * x,
    GoodEnough(x) => 2 * x,
};

Take a look at the std::iter::Iterator trait; it's full of map-reduce-like methods:

let a = [0i32, 1, 2];
let mut iter = a.iter().filter(|x| x.is_positive());

let a = [1, 2, 3];
let mut iter = a.iter().map(|x| 2 * x);

let a = [1, 2, 3];
let (even, odd): (Vec<i32>, Vec<i32>) = a
    .iter()
    .partition(|&n| n % 2 == 0);

let a = [1, 2, 3];
let sum: i32 = a.iter().sum();

Error handling and optional values also support a functional-like style, not only in Rust, but in modern Java as well:

Optional.ofNullable(smth).orElseThrow(NullPointerException::new);

Use fewer local variables

In my opinion, variables increase the mental load by introducing extra state, and they decrease the expressiveness of the code by making it lower-level. Eventually, they also complicate refactoring.

Imagine you have a function with two local variables:

function doStuff() {
    // ...

    const color = pickColor();
    const temperature = measureTemperature();

    if (temperature > 0) {
        if (color == Colors.RED) {
            // do something
        }
    }
    if (temperature == 0) {
        if (color == Colors.GREEN) {
            // do something else
        }
    }

    // ...
}

The domain of the color's type consists of 3 distinct values {RED, GREEN, BLUE}, and the temperature can be positive, negative, or zero. In total, we have |color| x |temperature| = 9 unique states. When I write code, I tend to think about all possible edge cases. Thus, I need to validate every possible state in my head (or better, by putting explicit assertions in the source code). Now, imagine a third variable appears due to new feature development and its domain consists of 4 distinct values. The total number of states surges from 9 to 36. The complexity grows exponentially with the number of variables. Yes, most probably not all the combinations are valid; for instance, a red color with a non-positive temperature could be an illegal combination, and most probably it will never occur at runtime. But that doesn't reduce the mental effort needed to read someone else's code. Convert the code to something like this and you'll keep the total number of states under 10:

function doStuff() {
    return match measureTemperature() {
        t if t > 0 => doPositiveStuff(),
        t if t == 0 => doZeroStuff(),
        t if t < 0 => doNegativeStuff(),
        _ => assert(0, 'unreachable'),
    };
}

function doPositiveStuff() {
    const color = pickColor();
    // ^ the total number of states in this function is just 3.
}
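
The counting argument made before the rewrite can be double-checked with a few lines of Python, used here only as a calculator. The third variable and its four values are hypothetical.

```python
from itertools import product

colors = ["RED", "GREEN", "BLUE"]
temperatures = ["negative", "zero", "positive"]
extra = ["a", "b", "c", "d"]  # hypothetical third variable with 4 values

# Every combination of values is a state the reader has to keep in mind.
two_vars = list(product(colors, temperatures))
three_vars = list(product(colors, temperatures, extra))

print(len(two_vars))    # 9
print(len(three_vars))  # 36
```

Each additional variable multiplies the total by the size of its domain, which is why keeping variables in separate small functions pays off.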

Luckily, functional programming techniques eliminate variables by their nature. Just one more reason to use them to the highest possible extent.

Another problem with variables is their effect on the expressiveness of the code. With a bunch of local variables, the top-level goal of the function becomes blurred behind the implementation details.

Consider the following code:

def pretty_vague_name(users, orders, promos):
    order_by_user = {}
    for o in orders:
        order_by_user[o.user_id] = order_by_user.get(o.user_id, [])
        order_by_user[o.user_id].append(o)

    promos_by_user = {}
    for u in users:
        total = 0
        for o in order_by_user.get(u.id, []):
            total += o.price

        if total > 42:
            promos_by_user[u.id] = Promo()

    # a hundred more lines of nested for loops doing some groupings

Usually, by the end of such a function, we have tens of local variables. Some of them are used only near the corresponding for loop, while others are reused throughout the rest of the function. Validating the correctness of the possible states is already out of the question, and you are just cursing the author (maybe even yourself from a month ago). To avoid such situations, I usually try to split functions with too many local variables into sub-functions, using functional expressions to limit the scope of the temporary variables:

def still_a_vague_name(users, orders, promos):
    user_orders = group_orders_by_user(users, orders)
    user_promos = {u.id: Promo() for u in users
                   if total_sum(user_orders[u.id]) > 42}
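
The two helpers are not shown above, so here is one way they might be written. This is a sketch under assumptions: the `User`, `Order`, and `Promo` shapes, the `promos_for` name, and the `threshold` parameter are all hypothetical.

```python
from collections import namedtuple

User = namedtuple("User", ["id"])
Order = namedtuple("Order", ["user_id", "price"])


class Promo:
    """Placeholder for the real promo object."""


def group_orders_by_user(users, orders):
    # Every user gets an entry, so lookups by user id never fail.
    grouped = {u.id: [] for u in users}
    for o in orders:
        grouped.setdefault(o.user_id, []).append(o)
    return grouped


def total_sum(orders):
    return sum(o.price for o in orders)


def promos_for(users, orders, threshold=42):
    user_orders = group_orders_by_user(users, orders)
    return {u.id: Promo() for u in users
            if total_sum(user_orders[u.id]) > threshold}


users = [User(1), User(2), User(3)]
orders = [Order(1, 40), Order(1, 5), Order(2, 42)]
assert set(promos_for(users, orders)) == {1}
```

The loop counters and running totals still exist, but each one now lives inside a helper a few lines long.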

Last but not least, if I finally decide to use a variable, I strive to reduce the distance between its declaration and its usage: declare it right before its first use rather than at the top of the function.
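
A hypothetical illustration of that distance (the function and variable names are invented for this sketch), first with the declaration far from its use, then right next to it:

```python
def report_far(lines):
    separator = ", "  # declared here...
    cleaned = [line.strip() for line in lines]
    unique = sorted({line for line in cleaned if line})
    return separator.join(unique)  # ...but first used only here


def report_near(lines):
    cleaned = [line.strip() for line in lines]
    unique = sorted({line for line in cleaned if line})
    separator = ", "  # declared right next to its only use
    return separator.join(unique)


assert report_far([" b ", "a", "", "a"]) == report_near([" b ", "a", "", "a"])
```

Both functions behave identically; the second one simply gives the reader less to remember while scanning the body.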

Finally, having fewer variables with relatively short lifetimes makes refactoring simpler for me, because parts of the code are easier to move around.

Pass through only as much information as needed

Developers tend to group related attributes in data structures like tuples, structs, dictionaries, etc. That's perfectly fine, especially if the grouping represents a valid abstraction. Then we pass such an object, let's say with 7 fields, to a function that needs to access only two of them. Then this function passes the object further downstream through another function to a third function, which finally accesses one of those two. That's a pretty common situation, unfortunately. Sometimes, instead of an object, we just have a scalar variable being passed through tiers of functions only to be decremented by one somewhere downstream. It's extremely hard to follow, debug, or modify such code.

I usually try to follow the approach of providing the least required amount of information to every given component in the code. If a function needs only two out of seven attributes of an object, it’s either a reason to pass them in separately, or to introduce a more fine-grained abstraction, or to restructure the code. It may be tedious at the beginning because the immediate benefits aren't clear. However, over time I find the codebase written in this manner much easier to maintain. It's like paying a little extra fee every time you need to accomplish a task during the whole lifecycle of the codebase instead of having a period of rapid development and then unbearable occasional payouts every time a new feature or a bug fix needs to be introduced. Probably this approach is somewhat related to the Law of Demeter.
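
A small sketch of the first option, passing the attributes in separately. The `Customer` class and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Customer:  # hypothetical seven-field object
    id: int
    name: str
    email: str
    phone: str
    country: str
    locale: str
    loyalty_points: int


def greeting_wide(customer):
    # Receives all seven fields, although it reads only two of them.
    return f"Hello {customer.name} <{customer.email}>"


def greeting_narrow(name, email):
    # The signature alone tells the reader what the function depends on.
    return f"Hello {name} <{email}>"


c = Customer(1, "Ann", "ann@example.com", "555-0100", "NL", "nl_NL", 10)
assert greeting_wide(c) == greeting_narrow(c.name, c.email)
```

The narrow version can also be tested without constructing a full `Customer`.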

Code in plain text by introducing abstractions and notations

Quite often I see code iterating over a hashmap of lists of tuples:

def handle_request(data):
    result = {}
    for key in data:
        datum = data[key]
        for item in datum:
            left, right = item
            r = db.find_by_id(left)
            if r and r.status == right:
                result[r.id] = True

    return result

This code is not only too low-level to grasp quickly, it also lacks abstractions. Poor naming (data, item, etc.) is usually an indicator that some extra abstractions failed to be introduced early on, during the development or even the design phase.

There are always two ways to develop something new. We can focus on the algorithms, or we can focus on the data structures. If we focus on the algorithms, we can eventually find ourselves crunching low-level hashes and lists here and there. At the time of writing, it may make perfect sense. However, every time I need to read such code later on, I have to decode the logic and match it with the higher-level design intention (if I remember one, of course; otherwise I need to rebuild it from scratch and hope that it's correct). Alternatively, I can focus on data structures, introducing the necessary abstractions every time I notice a pattern and adding a thin layer of code on top of them.

This technique allows me to write code which is closer to the plain text expression of the idea:

class Request:
    pass

class Record:
    pass

def handle_request(req):
    return {r.tx_id: found_in_status(r.tx_id, r.status) for r in req.records}

def found_in_status(tx_id, status):
    tx = db.find_by_id(tx_id)
    return tx and tx.status == status
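
To make the sketch above executable, the abstractions need some body. The following version is one possible filling under stated assumptions: the field names, the `Tx` class, and the in-memory `InMemoryDb` are hypothetical, and the database is passed in explicitly instead of being a global.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tx:
    id: int
    status: str


@dataclass
class Record:
    tx_id: int
    status: str


@dataclass
class Request:
    records: List[Record] = field(default_factory=list)


class InMemoryDb:
    """Hypothetical stand-in for the real storage."""

    def __init__(self, txs):
        self._txs = {tx.id: tx for tx in txs}

    def find_by_id(self, tx_id):
        return self._txs.get(tx_id)


def found_in_status(db, tx_id, status):
    tx = db.find_by_id(tx_id)
    return tx is not None and tx.status == status


def handle_request(db, req):
    return {r.tx_id: found_in_status(db, r.tx_id, r.status)
            for r in req.records}


db = InMemoryDb([Tx(1, "paid"), Tx(2, "failed")])
req = Request([Record(1, "paid"), Record(2, "paid"), Record(3, "paid")])
assert handle_request(db, req) == {1: True, 2: False, 3: False}
```

With named records in place, the body of `handle_request` states the intent directly: for every record, was the transaction found in the expected status?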

If the domain is well understood and we follow the abstractions way of development, we eventually end up with a new notation, a kind of domain-specific language, that lets us express new use cases in higher-level terms and thus keeps the code readable and maintainable.

Make the design driven by tests

I'm keen on unit testing. And the easiest thing to test is a pure function. This makes me obsessed with applying functional programming techniques every time it's possible. When I see a piece of code which can be extracted from its current context and tested independently, I apply the refactoring.

Obviously, not every piece of code can be extracted like this. If I have a class, and there is a part of the method I want to test, I need to mock all the dependencies of the class and make the surrounding code of the method runnable as well. It's rarely a simple task. Thus, I have another motivation to split the method and extract the part I want to test:

class MagicService {
    public void castTheSpell() {
        this.fetchSpellsFromDb();
        this.validateSpells();

        // do the magical mambo-jambo
        // I really want to test

        this.saveOutcomeInDb();
        this.reportToTheMinistryOfMagic();
    }
}

// becomes

class MagicService {
    public void castTheSpell() {
        this.fetchSpellsFromDb();
        this.validateSpells();

        this.doTheMagicalMamboJambo();

        this.saveOutcomeInDb();
        this.reportToTheMinistryOfMagic();
    }

    private void doTheMagicalMamboJambo() {
        // Test me please!
    }
}

Unsurprisingly, things are much easier to extract when they are self-contained, so the extraction often refines the abstractions along the way. Eventually, the design of a program driven by this testing obsession becomes granular and its abstractions well thought out. The funniest part is that you don't really need the tests! With time, this way of thinking becomes a habit, and the testability of the architecture gets checked somewhere in the background.

Unit testing is not the only thing that helps here. Keeping end-to-end tests constantly passing is not a trivial task because of third-party dependencies, time-dependent tests, etc. But abstracting such components away in order to mock them during integration test runs makes the architecture even better.
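
As an illustration of abstracting a time dependency away, here is a minimal sketch in which the clock is injected. The `TrialPeriod` class and its parameters are hypothetical.

```python
import time


class TrialPeriod:
    """`clock` is any zero-argument callable returning seconds since the epoch."""

    def __init__(self, started_at, length_seconds, clock=time.time):
        self._started_at = started_at
        self._length = length_seconds
        self._clock = clock

    def expired(self):
        return self._clock() - self._started_at >= self._length


# In a test, a fixed fake clock replaces the real one.
trial = TrialPeriod(started_at=1000, length_seconds=60, clock=lambda: 1059)
assert not trial.expired()

trial = TrialPeriod(started_at=1000, length_seconds=60, clock=lambda: 1060)
assert trial.expired()
```

Production code keeps the default `time.time`, while tests control time completely and never need to sleep.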

Limit the size of everything

I use (rather artificial) limits for different components of my code. Every time I exceed some of them, I eventually find the code violating the single responsibility principle.

Abiding by these limits helps me keep the coupling of my modules low and the cohesion of the code high. I also think that these limits are somehow related to The Magical Number Seven, Plus or Minus Two.

Deliver features by removing code

There is a common belief that the best code is the code you don't write. Hence, when I work on an existing codebase, I think ten times before writing new code. Pretty often, while discussing a new requirement, we come to the conclusion that deleting and reorganizing the existing code can give us the desired effect.

Pick a tool for a task

Some of us are specialists, while others are generalists. I'm rather a generalist because I tend to spot similarities between languages, approaches, architectures, etc. and then extrapolate this knowledge to new domains in order to tackle them faster. But if I've decided to be a generalist, why stick with a single programming language? Some languages are more suitable for one kind of task and some for another. We can choose the best available tool out there before approaching a new task.

Learn the structure of the project first

When I start working on a new codebase, I always start from its structure. It gives me an understanding of the current project state and sometimes even of its evolution. There are plenty of tools like tree, cloc, or even flamegraph to help with this task.

Don't be stuck (analysis paralysis)

There are multiple reasons to get stuck while programming. Over-architecting, overcomplicating, procrastinating... But the bane of my life is to constantly look for the only right way to solve a problem. Obviously, it's rarely possible. Almost any solution is a trade-off (space vs time, completeness vs simplicity, portability vs efficiency, etc). While higher-level trade-offs were somehow easier to me, I could spend a day trying to find the only right way to iterate over a list of numbers in C++, or figure out the proper usage of inheritance in JavaScript (hint: there is more than one equally valid way), or decide between classes and modules in Ruby. This could stop me from enjoying programming.

And then I met Python with its ideology of having "one-- and preferably only one --obvious way to do it". Python eased the problem of being stuck for me. I programmed in Python for years and it was a pretty productive time. Luckily enough, when I returned to other languages, I realized that my absolutism had gone. I finally accepted the fact that programming languages are different: some of them are like LEGO, while others are more like Play-Doh. It's perfectly fine to have more than one way to solve a problem. The real best practices are cross-disciplinary. Things like DRY, SOLID, or TDD can be applied almost universally. Relax and keep writing code.

Read books

It's kinda obvious. But some books are better (or more useful) than others for you at this particular stage of your career. Unfortunately, you don't know the exact list upfront. The books with the biggest impact on my understanding of programming were:

  • Refactoring by Martin Fowler, Kent Beck, John Brant, William Opdyke, and Don Roberts. I read an early edition, something that was available in 2010. Surprisingly or not, the most useful part of the book to me was not the catalog itself, but the first few chapters focusing on the software design principles.

  • Design Patterns: Elements of Reusable Object-Oriented Software by the famous "Gang of Four" (Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides). As with the Refactoring book, the first few chapters became a game-changer for me. Together with the opening chapters of the first book, that's under 200 pages, and they shaped my vision of the topic for years.

  • The Pragmatic Programmer: From Journeyman to Master by Andrew Hunt and David Thomas. Even in 2010, some war stories from this book sounded a bit outdated. However, the practices remain relevant to this day. I read about hard skills (DRY, building orthogonal systems, prototyping, using tracing bullets) and soft skills (catalyzing change, avoiding broken windows, etc.), and to be honest it was pretty hard to digest in the early stages of my career. I forgot about the book and focused on coding. I returned to it recently and was amazed by how many of its tricks I had actually been applying unconsciously throughout my journey as a software engineer. With all these years of experience behind me, I can confirm that the book truly teaches you pragmatic things.

  • The C Programming Language by Brian Kernighan and Dennis Ritchie, and Structure and Interpretation of Computer Programs (SICP) by Harold Abelson and Gerald Jay Sussman of the Massachusetts Institute of Technology. Regardless of your years of experience or professional domain, these two books must be read. The first teaches you how simple and at the same time complete and powerful the C programming language is. It eliminates the fear of the computer spirit, the low-level beast scaring you every time you write the next line in your favorite high-level programming language. The second approaches the programming craft from the opposite side. It eradicates the admiration for high-level programming paradigms by revealing how everything can be implemented with only functions and the assignment operator at your disposal (starting from simple lists and loops and going up to complex data structures and even an object-oriented language implementation).

Learn algorithms and data structures

No, really, learn them, finally! It's fun, it's useful for cracking interviews. But most importantly, they are the fundamentals of our craft.

Most probably you don't code algorithms on a daily basis. But every day you rely on them indirectly. The most common form of database index is a tree. If you have only a rough idea of what a tree looks like, how can you write efficient SQL queries? Do you need to search for some records in a huge file? Sublinear search is possible only in ordered data. Maybe you'll figure it out on your own, but that would be a basic lesson from any algorithms tutorial. Do you need to implement a task system where task A can be performed only after task B, which depends on task C? Probably you could write some naive code to implement such a system, but if you were familiar with graph algorithms beforehand, you'd immediately recognize an incarnation of the topological sorting problem. Do you need to implement a discrete-event simulation system? Probably you'd need a priority queue for that, and the most widely available data structure for it is a binary heap. But the binary heap is just another form of a tree.
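
The task-system example can be made concrete with a short sketch of topological sorting using Kahn's algorithm. The `topological_order` name and the shape of the `depends_on` mapping are assumptions made for this example.

```python
from collections import deque


def topological_order(depends_on):
    """Kahn's algorithm. `depends_on` maps a task to the tasks it needs first."""
    tasks = set(depends_on)
    for deps in depends_on.values():
        tasks.update(deps)

    # Number of unfinished prerequisites per task, and the reverse edges.
    remaining = {t: len(set(depends_on.get(t, ()))) for t in tasks}
    dependents = {t: [] for t in tasks}
    for task, deps in depends_on.items():
        for dep in set(deps):
            dependents[dep].append(task)

    ready = deque(sorted(t for t in tasks if remaining[t] == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in dependents[task]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(tasks):
        raise ValueError("cyclic dependency")
    return order


print(topological_order({"A": ["B"], "B": ["C"]}))  # ['C', 'B', 'A']
```

A naive implementation tends to rediscover this loop piece by piece; knowing the name of the problem gets you the cycle check for free.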

After spending a reasonable amount of time studying algorithms, data structures, and discrete math, I started noticing their applications around me. It works like pattern matching: you see a problem and something triggers inside, signaling the name of a suitable algorithm. You see a feature and something triggers inside, saying how it could be implemented internally. And if you know the properties of the algorithms or data structures you spotted, you can predict the behavior or use the software more efficiently. I wish I had studied algorithms thoroughly much earlier in my career.

Here is my personal list of online resources to train the topic:

  • Leetcode has a nice problem set to crack the coding interview. The problems can be hard to solve, but they are rarely convoluted. Has nice auto-tests. Can be a fast lane to practice coding for interviews.

  • Geeksforgeeks has a very similar problem set to leetcode, but every problem usually starts from an explanatory article. Good for learning about algorithms and new techniques.

  • HackerRank - has a lot of programming problems, as well as math, AI, etc problems. But programming problems there are rather intentionally convoluted since they are primarily for the programming contests. People doing contests regularly are very good at pattern matching and can map a problem to a solution very fast, so they need to be distracted a bit by introducing some indirect problem statements. So, IMO hackerrank is not that helpful in cracking the coding interviews but is very valuable just for improving problem-solving skills.

  • Codeforces - very similar to hackerrank, but with more focus on conducting programming contests. But it does have a huge problems archive and auto-tests.

  • Topcoder - for programming contests professionals.

  • Rosetta Code - how to implement X in language Y. Very interesting compilation of standard problems and corresponding solutions.

  • CodinGame - automated platform for writing game bots and the like. Can be very fun, with nice visualization and competitive features. Definitely improves your coding and math skills, but solutions are often heuristic-based.

Conclusion

Maybe a decade is not that much time after all? I wish I could have fit more learnings into it. But anyway, it was an amazing journey for me, and I hope you are having at least as much fun as I do when writing code. As a conclusion, I'll put some meta-thoughts in a list here as a reminder for future me:

  • When writing code, be as concise as hell.
  • Strive to eliminate uncertainties.
  • Cover all the branches, think about all the edge-cases.
  • Read source code (and don't trust documentation and comments).
  • Be passionate and patient.

Make code, not war!