Historically, programs were structured as input-processing-output. Object-oriented programming (OOP) emphasizes the data instead: processing logic is bound to data objects and mostly exists as the interaction of objects. This article presents what OOP is and why it is worthwhile.

While Simula was the first programming language to introduce features like classes, inheritance and virtual methods, Smalltalk is generally considered the first "object-oriented" language. Today almost all major programming languages are (at least optionally) object-oriented, and that includes COBOL and Fortran.

Object-oriented means polymorphism

While many people know OOP, it is hard to devise a clear definition. The term itself was coined by Alan Kay, while building Smalltalk.

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them. ― Alan Kay

This quote wasn't intended as a clear definition. It also demonstrates Alan Kay's distaste for static types (extreme late-binding), which I consider an orthogonal concept. "Local retention and protection" refers to the common concept of objects having private data. Messaging in Smalltalk roughly corresponds to method invocation for Java or C++ programmers.

Jonathan Rees collected a list of nine properties of OO-languages. Subsets are often used for a definition, yet no programming language implements all properties.

  1. Encapsulation
  2. Protection of private data
  3. Polymorphism
  4. Parametric polymorphism (generics)
  5. Everything is an object
  6. All you can do is send a message (actors model)
  7. Specification inheritance (subtyping)
  8. Implementation inheritance (reuse)
  9. Finite set of methods/fields

An interesting game is to remove properties by finding a language that is considered object-oriented, yet doesn't have a given property. For example, the class concept is unnecessary, as seen in prototype-based languages like JavaScript. Subtyping is not a requirement, since it doesn't make sense in dynamically typed languages like Python. A finite set of methods/fields is necessary for static type checks, but dynamically typed languages do useful things without this restriction (consider Ruby's method_missing).
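To make the last point concrete, here is a minimal sketch in Python. It uses `__getattr__` as a rough analogue of Ruby's method_missing (the two hooks are similar in spirit, not identical), and the `Recorder` class is a hypothetical name made up for this illustration.

```python
class Recorder:
    """Accepts any method name; its set of methods is not fixed in advance."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so 'calls' (set in __init__) is found the usual way.
        def method(*args):
            self.calls.append((name, args))
            return f"handled {name}"
        return method


r = Recorder()
result = r.greet("world")  # 'greet' is defined nowhere, yet the call works
```

No static type checker could list the methods of such an object, which is exactly why property 9 can be dropped in dynamically typed languages.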

To categorize some languages according to the nine properties: Lisp/CLOS does {3,4,5,7}, while Java has {1,2,3,4,7,8,9}, Python has {1,3,4,5,8}, and the original Simula had {1,3,7,9}. The intersection of these sets is {3}, so polymorphism is the essence of OOP.
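The intersection can be checked mechanically. A small sketch in Python, using the property numbers from the list above and the sets exactly as given in the previous paragraph:

```python
# Property numbers refer to the list of nine properties above.
languages = {
    "Lisp/CLOS": {3, 4, 5, 7},
    "Java": {1, 2, 3, 4, 7, 8, 9},
    "Python": {1, 3, 4, 5, 8},
    "Simula": {1, 3, 7, 9},
}

# Properties shared by every language in the table.
common = set.intersection(*languages.values())
print(common)  # {3}
```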

The argument for OOP

Why is OOP considered superior to other paradigms? Typical lists of advantages mention things like simplicity, modularity and reusability.

These points are the most common, but they are shallow. For example, modularity doesn't come naturally: the programmer has to abide by complex rules (like the Law of Demeter and the SOLID principles) to write decoupled programs. Catchwords like "simplicity" set a goal for what good code should look like, but reaching it depends mostly on the programmer instead of the language. As another example, Paul Rogers wrote an article about how encapsulation doesn't imply (and isn't the same as) information hiding. Encapsulation only means coupling data and program logic, while information hiding also means constructing classes that don't reveal internal design decisions. OOP nudges the programmer in the right direction, but doesn't prevent mistakes. One can write bad code in any language, and OOP doesn't help a bad programmer that much.

Another (more convincing) argument uses the principle of locality, which can be satisfied via polymorphism. For example, consider data objects of two types of lists (maybe a linked list and an array list) and the append functionality. A procedural approach needs to apply different procedures (linked_append or array_append) to the data objects explicitly, while an object-oriented approach calls a method (list.append) and dynamic dispatch implicitly applies the correct procedure. Now suppose we would like to create a third type of list (a tree list). OOP abides by the principle of locality, since we only need to create a new class and can pass its objects around. In the procedural code, every explicit procedure selection needs to be changed to incorporate the new type.
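The contrast might look roughly like this in Python. This is only a sketch: the tagged-dictionary layout, the `fill_procedural` and `fill_oo` helpers and the class names are assumptions invented for illustration; only linked_append, array_append and append come from the text.

```python
# --- Procedural style: plain tagged records, callers select the procedure ---

def array_append(lst, value):
    lst["items"].append(value)

def linked_append(lst, value):
    node = {"value": value, "next": None}
    if lst["head"] is None:
        lst["head"] = node
        return
    cur = lst["head"]
    while cur["next"] is not None:  # walk to the last node
        cur = cur["next"]
    cur["next"] = node

def fill_procedural(lst, values):
    # Every call site like this must be edited when a new list type appears.
    for v in values:
        if lst["kind"] == "array":
            array_append(lst, v)
        elif lst["kind"] == "linked":
            linked_append(lst, v)
        else:
            raise TypeError(f"unknown list kind: {lst['kind']}")

# --- Object-oriented style: each class brings its own append ---

class ArrayList:
    def __init__(self):
        self.items = []

    def append(self, value):
        self.items.append(value)

class LinkedList:
    def __init__(self):
        self.head = None

    def append(self, value):
        node = {"value": value, "next": None}
        if self.head is None:
            self.head = node
            return
        cur = self.head
        while cur["next"] is not None:
            cur = cur["next"]
        cur["next"] = node

def fill_oo(lst, values):
    # No type check: dynamic dispatch picks the matching append.
    for v in values:
        lst.append(v)
```

A third list class with its own append method would work with fill_oo unchanged, whereas fill_procedural raises a TypeError until another branch is added.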

One could refactor the explicit selection into a single procedure (append) to have only one location to change, but where does this procedure belong? It can't belong to one of the lists, since they don't know of each other. Every list user would need to implement their own selection procedure, depending on which types of lists are used. The clean solution is to implement an indirection mechanism, which amounts to implementing a vtable mechanism. So following the principle of locality leads to an architecture that is inherently object-oriented in style, since it implements polymorphic data objects.
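Such an indirection mechanism can be sketched by hand, without using classes at all. In this Python sketch every record carries a reference to a table of procedures; the names `ARRAY_VTABLE`, `LINKED_VTABLE`, `new_array_list` and `new_linked_list` are hypothetical and chosen only for illustration.

```python
def _array_append(lst, value):
    lst["items"].append(value)

def _linked_append(lst, value):
    node = [value, None]  # [value, next node]
    if lst["tail"] is None:
        lst["head"] = node
    else:
        lst["tail"][1] = node
    lst["tail"] = node

# One table of procedures per list type.
ARRAY_VTABLE = {"append": _array_append}
LINKED_VTABLE = {"append": _linked_append}

def new_array_list():
    return {"vtable": ARRAY_VTABLE, "items": []}

def new_linked_list():
    return {"vtable": LINKED_VTABLE, "head": None, "tail": None}

def append(lst, value):
    # The single generic procedure: it knows no concrete list type and
    # looks up the implementation through the record itself.
    lst["vtable"]["append"](lst, value)
```

Adding a tree list now means writing one new table and one new constructor in a single place; append and all of its callers stay untouched. This is essentially what a class-based language generates behind the scenes.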

Conclusion

One point that is quite subjective and still needs discussion is the tradeoff in language complexity. If a C program needs to use OOP, for example, you could switch to the more complex C++ language or use a library like GObject. On the other hand, it seems unlikely that a popular language without object-oriented influences will ever emerge, so at least minimal syntax-level support is desirable.

The principle of locality demands polymorphism, and polymorphism is the essence of OOP. Other paradigms like functional programming may complement OOP, but will never replace it.

Thanks to Sebastian Gregorzyk, Christian Jülg, David Förster and Markus Herhoffer for proof-reading and to everybody contributing to the Hacker News discussion.
© 2009-12-26