
I have a similar approach. I only use classes when I find myself passing around an object that is starting to look suspiciously class-like. While this adds a refactoring burden, it turns out that most of the time OO is a waste of time, and a few arguments is all you need. Some things are objects, in and of their nature, so they transform into classes very nicely. But most of the time you have data, and you have things that work on that data, and it really does you no good to nounize the world too far. You wind up going off into this ontological modelling foolishness, and as a rule of thumb, you've wasted your time.


In what language do you find it easiest to make classes only when you feel like it?

most of the time, OO is a waste of time, and a few arguments is all you need

That reminds me of another key point. I try not to let function signatures have more than a few arguments, and I try to keep those arguments primitive. (A good litmus test is how easy it is to call from the REPL.) When my code starts to break these guidelines, that's a sign of design weakness, and I do what it takes to break up the complexity. It still amazes me how far you can get in terms of simple, decoupled design just by doing this.

The widespread OO practice of factoring some of those arguments into a new class, so that now you need to pass only one thing (the new composite object) instead of several old ones, does nothing to solve the problem, but instead makes it worse: you've both added complexity and lost transparency. It's like a kid saying yes when his mother asks "did you clean your room", having shoved all the mess under the bed.
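To make that concrete, here is a minimal Python sketch (the function and class names are hypothetical, not from the thread): the "parameter object" refactor shortens the call site, but the same four pieces of data still have to be assembled somewhere, and now there is an extra type to read and keep in sync.

    # A few primitive arguments -- trivial to call from a REPL.
    def send_report(recipient: str, subject: str, body: str, retries: int = 3) -> None:
        print(f"sending {subject!r} to {recipient} (retries={retries})")

    send_report("ops@example.com", "nightly build", "all green")

    # The "parameter object" refactor: one argument at the call site, but the
    # same data still has to be gathered, plus a new class to maintain.
    class ReportRequest:
        def __init__(self, recipient: str, subject: str, body: str, retries: int = 3):
            self.recipient = recipient
            self.subject = subject
            self.body = body
            self.retries = retries

    def send_report_v2(request: ReportRequest) -> None:
        print(f"sending {request.subject!r} to {request.recipient} (retries={request.retries})")

    send_report_v2(ReportRequest("ops@example.com", "nightly build", "all green"))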


I feel all this talk against OO is missing the fundamental idea behind objects: you are in fact strictly reducing the solution complexity by the proper use of classes. They are a mechanism of data hiding. I know data within a class cannot be accessed outside of my defined interface, thus my "interaction space" has been reduced. That is, there is data I can't access and code that cannot access my data. This is a marked reduction in solution complexity.

The same can be said about functional programming. But once you degenerate to passing a ton of parameters around in each function call, you're probably better off defining an explicit object anyway.

Measuring complexity by number of lines and saying defining a class is adding complexity is off the mark.


My intuition tells me that hiding data in objects does not really help as long as the data is still mutable. Immutability is the only way to guarantee true encapsulation for objects that don't actually represent something inherently stateful (e.g. a file on disk).
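One concrete way to get that guarantee in Python (a sketch of the idea, not something from the parent comment) is a frozen dataclass: nothing you hand the object to can mutate it behind your back, which is a stronger promise than any private-by-convention attribute.

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Point:
        x: float
        y: float

    p = Point(1.0, 2.0)

    # Mutation is rejected outright, so callers can't change p behind your back.
    try:
        p.x = 5.0
    except Exception as e:
        print(type(e).__name__)   # FrozenInstanceError

    # "Changing" a value means building a new object; the old one is untouched.
    p2 = replace(p, x=5.0)
    print(p, p2)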

Coupling data and operations is also a problem. It's non-extensible in two directions: you can't add more data and still use the same operations, and you can't add more operations that work on the added data. Inheritance works around some of this, but it doesn't solve the general problem. What if it makes sense to add the same piece of data to objects of classes from different parts of the hierarchy?

The thing about functions is that you can always write a new one that works with anything you want. You don't need to add it to a class or an interface (you can, if the language allows that), and if possible, with immutable data that essentially makes the function a unit of perfect encapsulation.
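A small sketch of that last point in Python (the record types are hypothetical): a new free function can be written over data you don't own, without touching any class or hierarchy, and with immutable inputs it can't perturb anything it is handed.

    from collections import namedtuple

    # Two unrelated, immutable record types -- imagine they come from libraries you don't control.
    Circle = namedtuple("Circle", ["x", "y", "radius"])
    Sensor = namedtuple("Sensor", ["x", "y", "label"])

    # A new operation over both, added without editing either type.
    def distance_from_origin(thing) -> float:
        return (thing.x ** 2 + thing.y ** 2) ** 0.5

    print(distance_from_origin(Circle(3, 4, 1)))    # 5.0
    print(distance_from_origin(Sensor(6, 8, "a")))  # 10.0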


You're going to have to offer some evidence for this, because (as many threads on HN have discussed) all the evidence I've seen points to precisely the opposite of what you say: the best measure of complexity is simply the overall size of the program. If that's true, defining a class certainly does add complexity, and your argument is plausible but wrong.

Not content with repeating myself in this thread, I will repeat myself from threads of the past as well: it's astonishing that the evidence is so consistent on this, yet almost nobody takes it seriously. But I do! And I have a really interesting question, too: what would it mean, not just for a programmer to organize their work around this principle, but for an entire company to do so?


You're probably referring to the study that showed lines of code correlated with error rate. I completely believe it. But I think there's more to it than that.

We all know that lines of code is a poor measure of code complexity, but it does correlate with it. It is far more reasonable that error rate is proportional to code complexity, which is weakly measured by LOC.

Unfortunately I don't have any studies to back up my intuition here. But I would bet that my definition of complexity, aka "interaction space", would correlate much more strongly with error rate than LOC does. Also, that decently good usage of classes would in fact reduce the complexity, and thus the error rate.


Unfortunately I don't have any studies to back up my intuition here. But I would bet that my definition of complexity, aka "interaction space", would correlate much more strongly with error rate than LOC does.

Intuition is often surprisingly wrong - lesswrong.com has plenty of examples. That doesn't mean it's wrong in this case, but suppose you do hide code so that the interaction space is limited. There are costs:

1) Now you have to add complexity to get information into and out of this interaction space. Similar to how it's easier to walk onto a field than it is to walk into a castle - if the walls protect the contents, they also contort your path around them.

2) Leaky abstractions. It might be the case that your protected code is needed elsewhere. Now you have to duplicate it, or break through the protection and expose a new way in, none of which would be necessary for a free function. At the very least you have the overhead of considering this. All of a sudden your neatly wrapped class is two classes, or one class used in two places, where changing code can again have an effect far away if you're not careful.

3) In a language such as Python, you don't have your code inside a class protected by anything except convention. So it seems that you either must: a) accept that what protects groups of code is programmer attentive care, not the system of classes itself, and thus that any other programmer paying the right kind of attention could have the same benefits and less code, or b) declare languages like Python to be excluded from the benefits of classes by not implementing them properly. Do you agree?
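For what it's worth, this is what point 3 looks like in practice (a small sketch with a hypothetical class): a leading underscore is pure convention, and even double-underscore name mangling only renames the attribute rather than protecting it.

    class Account:
        def __init__(self, balance):
            self._balance = balance    # "private" by convention only
            self.__token = "secret"    # name-mangled, but still reachable

    acct = Account(100)
    acct._balance = -1                 # nothing stops an inattentive caller
    print(acct._Account__token)        # mangling just renames it to _Account__token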


First: not badgering you :)

We all know that lines of code is a poor measure of code complexity

We certainly don't! Program length is the only good measure we have. The best measure of program length is up for grabs, but LOC is probably as good as any (there was a big thread about this a few months ago).

But I would bet that my definition of complexity, aka "interaction space", would correlate much more strongly with error rate than LOC does

But the best work on this points so exactly to the opposite of what you say that I would really recommend you take a look at it: http://www.neverworkintheory.org/?p=58 - and tell us what you think.

I'm going away now.


Good link. The thing that stands out to me about the standard code metrics used in studies like this is that they attempt to measure actual complexity, rather than perceived complexity. LOC is perhaps the only measure that directly captures perceived complexity. If I add 100 lines of code to my project, it's possible that cyclomatic complexity stays constant (I didn't add any branches, just a straight fall-through), but the perception of complexity no doubt increased.
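A toy illustration of the gap (hypothetical code): both functions below have a cyclomatic complexity of 1, since neither contains a branch, but the second clearly feels more complex to a reader, and LOC is the metric that notices.

    # Cyclomatic complexity 1, two lines.
    def total(price, tax):
        return price + tax

    # Still cyclomatic complexity 1 -- a straight fall-through with no branches --
    # but far more lines, and correspondingly more perceived complexity.
    def total_verbose(price, tax):
        subtotal = price
        tax_amount = tax
        running = 0.0
        running = running + subtotal
        running = running + tax_amount
        result = running
        return result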

If we can come up with more metrics that are based on our perception of a codebase's complexity, I think we would see interesting results.


> In what language do you find it easiest to make classes only when you feel like it?

Any language that isn't a fairly close C descendant. Haskell, Lisp, Python, Perl are languages I've done some work in with minimal OOness.

When I used Java a few years ago, it required classes. C++ winds up being very class-ish. C of course is limited.

I have come to be very pleased with not using classes unless the solution really demands it. It's sort of a "grow the solution" idea, instead of "waterfall the solution".


>Any language that isn't a fairly close C descendant. Haskell, Lisp, Python, Perl are languages I've done some work in with minimal OOness.

At least in Python - which is the only one of those four I can speak for - all you are doing is basically OO (in fact, everything in Python is an object). The beautiful thing about Python is how it abstracts a lot of this away by providing functions which implicitly call "magic methods" that are either generated by the interpreter or can be defined by the programmer (e.g. __iter__(), __next__(), __init__()), and by virtue of duck typing, which means an object defines itself by what it can do, not its identity or origin.
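A small illustration of those magic methods (not from the parent comment, just a sketch): any object that defines __iter__ and __next__ works in a for loop exactly like a built-in sequence, which is the duck typing described above.

    class Countdown:
        def __init__(self, start):
            self.current = start

        def __iter__(self):
            return self

        def __next__(self):
            if self.current <= 0:
                raise StopIteration
            self.current -= 1
            return self.current + 1

    for n in Countdown(3):
        print(n)   # 3, 2, 1 -- the for statement calls the magic methods implicitly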

You'd never even notice a lot of this unless you dip into the internals of Python a bit. And ultimately, I don't think this is actually possible without using OO or some close equivalent. So in the end, a lot of the "solutions" people have proposed in this thread ultimately depend on the very thing they want to "solve": Object Orientation.


I'm reminded of 100 horrible C APIs that take a "struct foo_opts" instead of just conceding that they need 7 or 8 silly arguments.

Or of "sockaddr_in".


The sockets API would be much more future proof if C better supported polymorphism. That's usually the reason for using structs instead of arguments; in Windows, most API structs start with the size of the struct, effectively indicating the version. Makes binary compatibility easier when you add arguments in version n+1.

sockaddr_in is the way it is because of sockaddr, sockaddr_in6 etc. Judicious polymorphism could have led to fewer headaches converting IPv4 apps to IPv6.



