Recently I built a document search for my current company. As part of the crawling and indexing process I worked on scoring the documents. The project was done during a hack week, so the scope was limited to what you can achieve in a few days. Because of that we started with a trivial approach and incrementally improved it from there.

V1: Most trivial Approach

Concept: To score a document for a given query, we create a vector of all possible words (i.e. the union of all unique words across all documents) and define this as our alphabet of size N. Next we define a counting function c with c(w,q) ∈ {0,1} and c(w,d) ∈ {0,1}, which returns 0 or 1 based on whether the given term w occurs in the given query or document, respectively.

We can think of this as creating two bit vectors of size N: one document vector and one query vector. We can then compute the score simply as the dot product of the two vectors.

This can also be expressed in sum notation:

score(q, d) = Σ (i = 1..N) c(w_i, q) * c(w_i, d)

Naive sum of occurrences

For all possible words w(1) to w(N) we compute c(w,q) and c(w,d) and multiply the results. This means if the term is not present in the query or in the document, the product is 0, which is the neutral element of addition. Then we just sum all these products.
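The sum above can be sketched in a few lines of Ruby (all names here are illustrative, not from the original project):

```ruby
# V1: binary match scoring — a minimal sketch of the idea above.
# The vocabulary is the union of all unique words across all documents.
def score_v1(query_terms, doc_terms, vocabulary)
  vocabulary.sum do |w|
    c_q = query_terms.include?(w) ? 1 : 0  # c(w, q)
    c_d = doc_terms.include?(w) ? 1 : 0    # c(w, d)
    c_q * c_d
  end
end

score_v1(%w[ruby search], %w[search engine in ruby], %w[ruby search engine in])
# => 2
```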

Solved problems:

  • [x] score for matches
  • [x] no score for non matches

Open problems: If c(w,d) results in x, where x ∈ {0,1}, then the similarity of the document and the query is not expressed strongly enough. In other words: our 0/1 score for matches is too simple to make a statement about how well a query term matches a given document. We can only say that it occurred at least once.

V2: Term frequency instead of a bit vector

Improvement: Instead of returning 0 or 1 for each possible word (bit vector), we return the count of occurrences of the given term. This is called the term frequency (TF). Our counting function c is now defined as:

c(w,q) and c(w,d) = x , x >= 0
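A sketch of the TF variant; since c(w,q) is 0 for every word outside the query, it suffices to iterate over the unique query terms (again with illustrative names):

```ruby
# V2: term frequency scoring — c(w, ·) now counts occurrences.
def tf(w, terms)
  terms.count(w)
end

def score_v2(query_terms, doc_terms)
  query_terms.uniq.sum { |w| tf(w, query_terms) * tf(w, doc_terms) }
end

score_v2(%w[ruby], %w[ruby on rails ruby])
# => 2
```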

Solved problems:

  • [x] higher score for more matches
  • [x] no score for non matches
  • [x] c(w,d) now expresses the similarity of the word and the document better

Open problems:

  • The term frequency has no upper bound, so repeating the same word over and over again results in huge scores. This produces bad matches and lets people fool our search engine.
  • Stop words and rare words have the same impact on the scoring function, so documents with many common English words but no matches for the meaningful nouns and verbs might still score very high.

V3: Inverse Document frequency

Prioritize rare terms to common terms by introducing document frequency

In order to increase the importance of rare (meaningful) terms compared to common terms (stop words), we will penalize terms which occur in many documents (high document frequency). Therefore we define:

M = Total number of all documents

and document frequency:

df(w) = number of documents that contain w

This approach is based on the idea of term specificity.

Now we can represent the inverse document frequency of w as:

idf(w) = log((M + 1) / df(w))

Inverse document frequency

Leading us to this new scoring function:

score(q, d) = Σ (i = 1..N) c(w_i, q) * c(w_i, d) * log((M + 1) / df(w_i))

Scoring function with inverse document frequency

With c(w,q) and c(w,d) = x , x >= 0
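As a sketch in Ruby, using the common smoothed IDF form log((M + 1) / df(w)) (an assumption consistent with the definitions above; names are illustrative):

```ruby
# V3: TF-IDF scoring. `docs` is the full corpus, given as arrays of terms.
def idf(w, docs)
  df = docs.count { |d| d.include?(w) }  # document frequency of w
  return 0.0 if df.zero?                 # w appears nowhere: no contribution
  Math.log((docs.size + 1.0) / df)       # M + 1 over df(w)
end

def score_v3(query_terms, doc_terms, docs)
  query_terms.uniq.sum do |w|
    query_terms.count(w) * doc_terms.count(w) * idf(w, docs)
  end
end
```

A stop word occurring in every document now contributes far less than a rare, meaningful term.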

Solved problems:

  • [x] score for matches
  • [x] no score for non matches
  • [x] c(w,d) now expresses the similarity of the word and the document better
  • [x] rare terms are more important than stop words

Open problems:

  • Term frequency does not have an upper bound, so repeating the same word over and over again results in huge scores.

V4: An upper bound for the Term frequency

In order to address the final open issue, let's take a look at BM25, one of the available options to ensure an upper bound on the term frequency.

TF_BM25(x) = ((k + 1) * x) / (x + k), with k >= 0

TF transformation with BM25

BM25 is a well-established algorithm for this task. For k = 0 it also falls back to our initial binary 1/0 logic.

Using our c(w,d) as the x in the BM25 definition, we just need to change one part of our scoring function and keep c(w,q) as well as the IDF term at the end. Our new equation looks like this:

score(q, d) = Σ (i = 1..N) c(w_i, q) * ((k + 1) * c(w_i, d)) / (c(w_i, d) + k) * log((M + 1) / df(w_i))

New equation including BM25
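The transformation on its own is tiny; a sketch (the guard for x = 0 avoids a 0/0 when k = 0):

```ruby
# BM25 TF transformation: bounded above by k + 1, binary for k = 0.
def bm25_tf(x, k)
  return 0.0 if x.zero?
  ((k + 1.0) * x) / (x + k)
end

bm25_tf(100, 1.2)  # ≈ 2.17 — close to the upper bound k + 1 = 2.2
bm25_tf(5, 0)      # => 1.0 — falls back to the binary scoring of V1
```

No matter how often a term is repeated, its contribution can never exceed k + 1.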

Solved problems:

  • [x] higher score for more matches
  • [x] no score for non matches (multiplying query count by document count)
  • [x] c(w,d) now expresses the similarity of the word and the document better
  • [x] rare terms are more important than stop words (IDF)
  • [x] Term frequency has an upper bound so that repeating a single word many times does not fool our algorithm.

Open problems:

  • When handling documents of different lengths, longer documents always have an advantage: they contain more content and are thus more likely to contain the term we search for.

V5: Introduce Pivoted length normalization

Introducing document length normalization

To solve the final problem we are left with, we introduce a way to penalize very long documents to a certain degree, to get a better balance between long and short documents. We define a “normalizer” as follows:

normalizer = 1 - b + b * (|d| / avdl)

Pivoted document length normalizer, with |d| the length of document d and avdl the average document length over all documents

with b ∈ [0,1]. Here b controls the impact of the normalizer: with b = 0 the normalizer is always 1, and as we increase b towards 1, the penalty for long documents grows.

We can now replace the simple use of k in our denominator with this normalizer, which leads us to our final equation.

score(q, d) = Σ (i = 1..N) c(w_i, q) * ((k + 1) * c(w_i, d)) / (c(w_i, d) + k * (1 - b + b * |d| / avdl)) * log((M + 1) / df(w_i))

New scoring function including the document length normalizer
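Putting all the pieces together, the final scoring function might look like this in Ruby (a sketch under the assumptions above — smoothed IDF, illustrative names; avdl is the average document length):

```ruby
# V5: BM25-style TF with IDF and pivoted document length normalization.
def score_v5(query_terms, doc_terms, docs, k: 1.2, b: 0.75)
  avdl = docs.sum(&:size) / docs.size.to_f          # average document length
  norm = 1 - b + b * (doc_terms.size / avdl)        # pivoted length normalizer
  query_terms.uniq.sum do |w|
    x  = doc_terms.count(w)                         # c(w, d)
    df = docs.count { |d| d.include?(w) }           # document frequency
    next 0.0 if x.zero? || df.zero?
    tf = ((k + 1) * x) / (x + k * norm)             # bounded BM25 TF
    query_terms.count(w) * tf * Math.log((docs.size + 1.0) / df)
  end
end
```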

Solved problems:

  • [x] score matches
    • use of the sum of products
  • [x] do not score non-matches
    • multiplication of query count by document count
  • [x] c(w,d) expresses the similarity of the term and the document
    • we replaced a bit vector with term frequency (TF)
  • [x] rare terms are more important than common words (stop words)
    • we added IDF
  • [x] Upper bound for term frequency, to handle repeated words properly
    • we added BM25
  • [x] handle a mix of short and long documents well
    • Introduction of document length normalization


We were able to arrive at a state-of-the-art scoring function by incrementally improving our initial approach, which was very naive and simple to implement. The similarity to the Okapi/BM25 family can be seen on the Wikipedia page or in the publications from Robertson and Walker about Okapi BM25.

This article used equations based on the “vector space model”, which is for example used in Apache Lucene — and thus, quite impressively, even Elasticsearch uses it under the hood. However, other real-world engines and the “official” BM25 definition use a probabilistic relevance model instead, so that might be interesting for further reading.

Second attempt

In my recent Don’t blame rails post, I talked about my opinion on most of the rails blaming that is happening in the recent past.

It was interesting how much some of the people I consider good software engineers disagreed with me. I am happy that I was finally able to distill the difference in opinions so that I can now share it. In addition to that, I am now able to make a more precise statement on my opinion, than I was in the last blog post.

Skill level

The first important distinction is the skill level of the person complaining, which blurs into my perception of their statements. I have no issue discussing the problems rails has with somebody who really understands those problems and the benefits of the alternatives. I am just annoyed by criticism that is based on repeating other people’s statements without own experience or expertise.

Replacing frameworks with frameworks

I guess one thing that makes me skeptical is that people criticize rails, its complexity, or its implications on the apps they build, and still want to switch to phoenix or a framework in some other language. I find it confusing to treat swapping frameworks as a solution to architecture or complexity problems. I am afraid one is destined to experience the same problems with the new framework that one originally had with rails.

Two distinct topics

My last post and the above reflection on it were mainly about addressing the way people blame and criticize rails, which partly bothered me.

Today I also want to take the time to express my own opinion.

My opinion

I really appreciate what rails did for the web development community. I think it made this part of our industry a better place. (Just thinking of the fact that I had to use PHP before Ruby and Rails)

However don’t like it’s complexity, the missing support for separation of concerns and also the direction rails is heading, even though, I really appreciate the fact the the core members are really doing a great job in simplifying parts of it.

Frameworks and rails’ design choices

I think part of this complexity and its implications simply comes from the fact that it is a framework. However, other parts just don’t fit my idea of software design. Some examples are:

  • I prefer having a per-action abstraction instead of one controller with a method per action, which then might be compensated by having a service/interactor per action.
  • Having a distinction between domain models (entities) and database object mappings (in OO contexts) makes things easier. It’s easy to do manually, but it’s additional boilerplate.
  • The fact that one has to use hooks instead of instantiation is a bad thing in itself. This is the case with controllers, but also with the way DHH prefers to expose the attributes API, which otherwise would be just great.
  • Having a repository abstraction seems valuable to me, and it’s one of the parts where I prefer Ecto over ActiveRecord.

These are just some things that bother me. I guess one can compensate for most of it in rails quite easily, but I just don’t like working with it. It no longer fits my needs and the way I like to write and build software systems. Maybe ActiveRecord no longer provides the abstractions and APIs that match the learnings of the last years and current best practices.

My future way

By default I personally would not use rails for new projects. I am aiming for simpler and less coupled components to create properly maintainable systems, which also implies I would prefer libraries over frameworks. This means working on top of rack, WAI, ring and so on seems to be a better fit for the kind of API-only microservices I am creating at the moment.

I guess there will be some use-cases or sets of requirements in the business context that will make rails the proper choice, but unless that is the case, I will go with smaller and simpler solutions.

Another reason is that I would not use Ruby by default anymore, but instead go with Elixir, Clojure or Haskell, depending on the runtime constraints. The reason behind that is simply my belief in the benefits of functional programming compared to imperative programming (let’s avoid the ‘definition of OO’ discussion here). And even though ruby offers some functional concepts, it is always a pain to go in that direction in a language that is not designed to be a functional programming language.

Ruby and Rails are great

However I don’t want to take anything away from Rails or Ruby. They have their benefits and they have made great achievements possible. So Ruby and Rails were solving problems at that time and might haves surpassed that time and need.

UPDATE: As a funny coincidence I just saw this talk from Justin Searls at RailsConf, which also goes into some details about people leaving rails, from quite a different angle that I had not considered so far. Nice to have additional ideas on the topic.

It is usually not rails’ fault

In the recent past, blaming rails for productivity issues in larger projects seems to have become quite popular. Usually I do not care much about these trends. However, this time I came across a blog post that is fairly representative of most of the criticism towards rails these days.

My background

Before I comment, I want to briefly say that I have been working with rails since 2006 as part of my job as a software engineer. I believe I have a solid understanding of the trade-offs and most of the pros and cons of certain aspects of rails and of its biggest component, ActiveRecord.

That said, I hope it is clear that my goal is not to protect rails (as a project or as a community). I think we should focus on the actual downsides and not get lost in speculation and irrational or inconsistent arguments.

Rails, frameworks and architecture

Rails is not your architecture

I think it is important to understand that rails does not force a certain architecture. Rails is a framework and thus pushes you in a certain direction. However, no matter which frameworks or libraries you use, or whether you decide to write all the code yourself, it is your (team’s) responsibility to ensure a proper design and software quality. This includes your architecture as well as coding styles and guidelines.

There is also some great information on thoughtbot’s Upcase and other thoughtbot sources that shows that rails and a proper architecture can go hand in hand.

I just want to bring this point up briefly, because I will come back to it later.

Rails quality and your app’s quality

I think we should really distinguish between issues with the rails code base and its internals, and problems the rails design creates in applications built with it. For example, the number of methods on ActiveRecord is not really a problem in projects I work on, even though I still don’t like it as a developer.

On the other hand there are issues with the API that rails offers. One example is the inability to instantiate controller classes (or actions / services / interactors) directly, which leads to either before/after actions or rack middleware being added. The same goes for the way templates and controllers are tied together. However, I think the latter issue can be properly addressed by using proper OO abstractions like form objects.

Other issues like ActiveSupport magic might indicate bad design or software engineering practices, but they do not force you into a bad design, since it is the developer’s decision which tools to use, even within a framework.

Finally, rails is a framework and it locks you into a certain structure, as any other framework would. Thus there are also issues that simply arise from the fact of using a framework. These issues should be discussed separately, or at least be understood as what they are.

My opinion on most of the blaming

My comments will mainly focus on this article. However, as I already mentioned, there are various sources that make almost the same points.

Some good points

I think the article makes some points I totally agree with. But these make up only a small part of the article.

  • I really discourage monkey patching. However, the fact that ActiveSupport features get migrated into the language might show issues with ruby as a language. This is true for try and the existence of nil, as well as for Enumerable#pluck and the missing type safety in ruby.
  • Having no control over the instantiation of the controller, and thus having to use before/after hooks, is not great.
  • The fact that associations can be loaded lazily has caused tons of N+1 query bugs; it should at least be possible to disable this.

There are some more points that can’t be avoided and annoy me, but these are some examples, some of which can be found in the article.

The Complexity caused by rails

The article points out that rails introduces unnecessary complexity, which of course is a bad thing. There are some examples for it:

  • Monkey patching
  • The number of public methods on ActiveRecord classes
  • Focus on adding features

Most of the other sentences hardly make any points.

These are indeed issues that I also do not like. However, I am not sure why they should really be a problem because, as I mentioned above, it’s your team that makes the call. Monkey patches (the ActiveSupport additions to core classes) don’t have to be used; if you do not like monkey patching, just don’t use it.

There is also no reason to expose ActiveRecord methods to controllers. You can happily create service objects or use ActiveRecord to implement a repository yourself.

In addition to that, it is unfortunate that ActiveRecord classes provide too many public methods. However, this indicates issues within ActiveRecord and should not cause issues in your application. Usually there are multiple ways to do a certain thing, and within a project the team should be consistent about it. This eliminates a large number of public methods from actual use. It is again the responsibility of the developers to ensure a proper code base and a consistent use of the libraries and tools the project is using.

When it comes to views, I really think there are some design issues in rails that create an API that pushes towards coupled code. However, experienced developers should notice that and use a proper solution. Simply following Sandi Metz’s rule to only pass one object to your view solves most of the problems less experienced developers have.

With regards to ActiveRecord, I don’t really expect a skilled software developer to make massive use of ActiveRecord callbacks or to implement business logic in the persistence layer.

Doing so would quickly cause trouble. However, I think the times when the community or the rails developers advocated this way of programming have been over for years.

The rails way

Often people refer to ‘the rails way’ or to how ‘the rails community’ does things. As I just said, I think things really shifted around 2010 or so, which kind of proved Bob Martin’s statement about the lost 10 years correct. In the last year I listened to all episodes of:

and watched all the ruby-related content on thoughtbot’s weekly iteration as well as the Upcase ruby content from thoughtbot.

This was really a good time investment, since I learned that a huge part of the ruby community, the rails community and the rails core team really care about the quality of their projects, and that the time of ‘fat models’ passed a long time ago. In addition, the core team does a great job of simplifying the internals and providing easier and more explicit APIs for rails components.

I am really happy to learn that thoughtbot and other companies with a lot of rails experience offer advice on how to design your rails application to ensure a good architecture, as well as things like fast test suites and much more.

This really made me change my opinion about how hard it is to build proper (well designed / maintainable) applications with rails.

Mixing framework and language issues

When comparing frameworks it is easy to take language differences and project them onto frameworks and libraries. This often happens in the rails vs phoenix discussion, where many points are really about ruby vs elixir or even about the benefits of functional programming.

This might affect your choice of ruby+rails vs elixir+phoenix or similar, but it is not something that can be attributed to rails. This is true for things like Object#try, which exists because ruby supports nil. It is also true that large method interfaces are a general ruby issue. Of course ActiveRecord still adds far too many methods, but this is more a topic for discussing rails internals or improving rails, and not a reason your application might be in a bad state.

A word on “The core team”

Throughout the article the author repeats his opinion on the ‘philosophy’ of the core team over and over. I am not sure which members he is actively communicating with, but these statements are just generalizations of statements and decisions that mostly DHH made, who of course has a special opinion about rails. However, his task is more or less to sell rails, not to be a rails architecture consultant. If somebody disagrees with that direction, he should at least refer to DHH and not always talk about the ‘core team’. That shows a lack of respect towards the core members who spent months and years simplifying ActiveRecord, the rails router and other rails components.

It is even worse to bring the DHH/TDD discussion, or his focus on easy over simple, into an article about rails. It might also be a bad thing to have in mind while deciding whether rails is a good choice or not. This is simply a topic that is totally unrelated to a proper decision on which framework to choose (if any).

Rails killed merb and data mapper

Original quote:

> These projects were killed by rails

I think this point is hardly worth commenting on. However, it is important to point it out anyway, so that readers who might buy this argument have a chance to rethink it.

Though it is true that merb kind of disappeared after the merge, this is what one would expect when two projects merge. In addition, the merge caused quite a lot of changes in rails. On the other side, the fate of merb or datamapper (which is not dead at all, it just did not take off) is hardly rails’ fault, but a decision by the ruby community about which tools they want to use. I am really not buying that it is rails’ fault that

> building anything new in the ruby ecosystem turned out to be extremely difficult. Since peoples’ attention is Rails-focused, new projects have been highly influenced by Rails

I think bigger companies and ruby shops did shift to other libraries and web frameworks. Also, sinatra gained more and more attention and adoption, and those projects usually do not use ActiveRecord as a data source.


This article, as well as this year’s RailsConf talk about ecto, mentioned the trailblazer project. I don’t want to rant too much, but I can hardly take seriously any rails criticism that also promotes this project.

If complexity is an issue, then adding a framework on top of an already too-big framework just makes things worse. It also makes things more fragile and makes it harder to try out new rails versions early, since it is highly coupled to rails.

But it gets worse


In my opinion this component is the worst attempt at implementing a serializer I ever had to work with. It is overly complicated, and inheritance is misused and overused. Especially the attempt to mix a presenter into a model instance is totally bizarre, while the other option, wrapping a model instance, also introduces a lot of complexity. In addition, the API is way too big: everything can be adjusted via metaprogramming and DSLs. Besides that, it is by far the slowest serializer I have worked with. I had to dig deep into it for months, and I really discourage anybody from using it.

OO Design and inheritance

I watched a video about using trailblazer and got confused when inheritance was introduced for the controller-related content, especially in a way where the purpose was to share code and the subtypes also stubbed out methods from the original class. Calling this ‘returning to good OO practices’ was the worst part of it. Obviously this kind of decision does not sound like a well-designed framework that reduces the application’s complexity. Again I think trailblazer introduces a worse architecture on top of the architecture being criticized.

I could go into much more detail, but I can only say that I would think twice before adding this additional indirection and complexity to my project. And it is kind of intuitively clear: why would adding a set of components on top of something that might already be too big and complex solve the issue?

So is everything nice and good?

Rails really has some issues when it comes to its implementation and its design. However, these are orthogonal to the problems rails users have in their own applications. And if you are really concerned about them, you can always contribute.

On the other side there are also some API and design issues that push parts of the application towards a tightly coupled design. I think most of these issues can be addressed with proper design and a good architecture.

The last thing that might be important, even though it is obvious, is that rails is not a good match for every problem. It surely has its sweet spot, and there are problems where it does not shine. I think this is especially the case where elixir is currently taking off.

Ruby itself has limitations that are thus also limitations for rails. And when it comes to performance or large-scale requirements, you should really check how much of the rails stack you actually need.

If that is not at least ‘most of it’, it might be a bad choice in the first place.

Bad projects I worked on

The worst code base that paralyzed a company was a php project I joined, where it was impossible to introduce permission management even after more than 6 months of work. This project suffered from many issues, and it did not use any framework at all.

The worst rails projects I worked on were mostly implemented by me, mostly at a stage of my career when I would not consider myself able to make proper architectural decisions, or at a time when I was still listening to management when they wanted me to “go faster”. This has totally changed, and the above video from Bob Martin about professionalism is a great resource on this topic.


To summarize, I would say there really are some things I don’t like about rails. However, I have to be aware that I am no longer excited about ruby either, and this blurs my opinion. But there are other voices in the ruby community that are really experienced, and they agree with these points.

That is the reason why there have been such great improvements to the state of ActiveRecord, including the adequate record refactoring and the attributes API, which is even nicer when used without symbols so that object creation is explicit.

That said, there are way too many blog posts and voices out there at the moment complaining and blaming rails for their own bad architecture. We should really start by thinking about what we, the people who wrote the code, did wrong, and then analyze the role of the framework.

Most of the complaints mix comparisons of languages with comparisons of frameworks, or compare frameworks with different feature sets. In addition, issues with the rails implementation are mixed up with issues that really impact the design of the application.

However, there are things I do not like about rails and about the direction DHH is moving rails in. I really do not like monkey patching, callbacks, implicit object management, redundant interfaces, the focus on easy over simple, and preferring features over improvements.

And still I am able to write rails applications where all these things have almost no impact on me, because I can choose what to use and what not.

I personally prefer having my framework (be it open source or company-internal) default to a proper architecture, so that I don’t have to write all the boilerplate to set up repositories, interactors and all the other components I use. However, if I don’t know these concepts, such frameworks won’t help me create an application that stays maintainable in the long term.

Attempting to send xx_ids: []

I have recently been working on a REST API (with rails 4.x), and a client wanted to send an empty array to indicate truncation of an association. The resource already existed, and she wanted to send a PUT (with all attributes) including an empty array for a related collection that should now be empty. This could be the case if the author of a blog post decides to delete all its comments and sends comments: []. From a REST perspective I found this reasonable, and we went for adding myassoc_ids: [] to the relevant controller. Everything looked fine and we deployed the code to staging (which runs in a production environment).

Next, she complained that the API requests failed with a 422 error. I took a look at the logs and found an entry where strong_parameters was complaining:

found unpermitted parameters: myassoc_ids

This was really confusing since it had been added to the controller. Re-checking the log we found the cause.

Value for params[:object][:myassoc_ids] was set to nil, because it was one of [], [null] or [null, null, …]. Go to for more information.

I took a look at that section of the documentation, and it says that empty arrays are turned into nil to make sure the developer does not perform a nil check as sanitization, which would not be sufficient for the example cases in the warning.

However, the fact that rails turns empty arrays into nil completely removes the ability to accept empty collections to indicate truncation of a collection. It turns out that other people have already discussed this behavior, and the rails team added this flag: config.action_dispatch.perform_deep_munge. As a note: deep_munge is the method that performs this parameter conversion. So everything is fine: just add a small test, disable deep_munge via config, and this should be fixed quickly.
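Disabling deep_munge is a one-line configuration change (rails 4.x; the application name here is a placeholder):

```ruby
# config/application.rb
module MyApp
  class Application < Rails::Application
    # Stop converting [] (and [nil, ...]) parameters to nil.
    # Note: this disables the CVE-2013-0155-related sanitization, so make
    # sure your queries handle untrusted nil/empty values themselves.
    config.action_dispatch.perform_deep_munge = false
  end
end
```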

Well, it should be, but some hours later I found myself failing to reproduce the problem in my test suite. I am using rspec, and I wrote a request spec to make sure all middleware parts and parameter interactions happen as they do in reality. However, when I looked at the parameters in my controller, they were always missing the empty associations. After trying over and over again, I decided to run a debugger, since I had a feeling this might be a problem in rspec or in rails’ integration tests, given that rails added deep_munge in the first place.

After stepping down through rspec and action_dispatch into rack-test’s utils, I finally found the methods that build a rack input from the given request.

There is a method called build_nested_query in rack-test (lib/rack/test/utils.rb) that handles arrays in a case branch by iterating over their elements. Thus the processing of the request emits nothing at all for an empty array.
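To illustrate the effect, here is a simplified recreation of such a query builder (a sketch, not the actual rack-test source): iterating over an empty array produces no key/value pairs, so the parameter silently disappears from the request body.

```ruby
# Simplified sketch of how nested params become a form-encoded string.
def build_nested_query(value, prefix = nil)
  case value
  when Array
    value.map { |v| build_nested_query(v, "#{prefix}[]") }.join("&")
  when Hash
    value.map { |k, v| build_nested_query(v, prefix ? "#{prefix}[#{k}]" : k.to_s) }.join("&")
  else
    "#{prefix}=#{value}"
  end
end

build_nested_query(myassoc_ids: [1, 2])  # => "myassoc_ids[]=1&myassoc_ids[]=2"
build_nested_query(myassoc_ids: [])      # => "" — the empty array is lost
```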

After searching on github I found this bug, which is exactly the bug I hit. Unfortunately the issue is more than a year old, and it appears the maintainer is not working on fixing it.

Now I have to change the configuration without writing a test to prevent regression. Hopefully there will be some feedback for this bug soon.

I skipped through everything quite fast, so let’s try to understand in detail what happened and how the different oddities and bugs worked together.

Understanding the problem

So what happened in detail and how did the different parts interact with one another?

StrongParameter per environment configuration

We only noticed the problem in the staging (production) environment, since strong_parameters is usually configured to raise in production only, and not in development or testing. Raising in testing would require some heavy refactoring, since usually attributes_for or similar is used for the controller and request specs, and the parameter validation would fail. However, since this is a fork in logic, I think every user of strong parameters, including myself, should check how to DRYly make the permitted parameters available to both the controller and the tests. One suggestion I thought of is that the parameter specification could be moved into a dedicated object. I talked to somebody in the freenode #ruby channel, and he told me they do something similar by using pundit policies for parameter validation, which really sounds interesting. I might write a followup on this topic.

So I just recommend trying to configure strong_parameters the same way for all environments.

Strong Parameter types

By default strong parameters only permits scalar types, as described in the readme. To allow an array, like I needed to, I had to permit it explicitly as myassoc_ids: []
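In the controller this looks roughly like the following (object and myassoc_ids follow the example above; :name is a made-up extra attribute, and this fragment only runs inside a rails controller):

```ruby
# Strong-parameters whitelist: `myassoc_ids: []` permits an array of
# scalar values, whereas a bare `:myassoc_ids` would only permit a scalar.
def object_params
  params.require(:object).permit(:name, myassoc_ids: [])
end
```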


However, since deep_munge converts empty arrays into nil, the type of the permitted value no longer matches, and thus the resulting error is

unpermitted parameter …

even though the parameter is permitted. Due to deep_munge’s interference the type no longer matches, so the error got really confusing and let us search in the wrong place for quite some time.
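The behavior that bit us can be sketched like this (a hypothetical reimplementation of what Rails' deep_munge did around the 4.x era, not the actual Rails source):

```ruby
# Simplified sketch of Rails' deep_munge behavior (hypothetical)
def deep_munge(hash)
  hash.each do |key, value|
    case value
    when Array
      value.grep(Hash) { |h| deep_munge(h) } # recurse into nested hashes
      value.compact!                         # drop injected nil elements
      hash[key] = nil if value.empty?        # an empty array becomes nil!
    when Hash
      deep_munge(value)
    end
  end
  hash
end

deep_munge("myassoc_ids" => [])      # => { "myassoc_ids" => nil }
deep_munge("myassoc_ids" => [1, 2])  # => { "myassoc_ids" => [1, 2] }
```

The last line of the Array branch is exactly the step that turns our permitted empty array into a nil, which then no longer matches the declared array type.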

It seems as if deep_munge is gone in the latest 5.x versions of Rails and the security fixes have been moved into ActiveRecord. I am not sure whether that is the right way, but at least sending empty arrays will work.

TDD and rack-test

After we understood all those interactions, it was bad luck that this case is also not supported by rack-test at all. This means nobody is able to write a test that verifies the behavior of sending an empty array as a parameter.

I wonder how this can be. Is there really NOBODY sending empty collections to indicate an empty set? If somebody has an idea for a better way to do this, please let me know.

However, merging the change into master feels uncomfortable without regression tests being added to the project.

Hopefully the rack-test maintainer will respond to the open issues and pull requests, or to my email in which I offered to support him in case he has no time to work on the project anymore.

Wrap up

I don’t want to complain too much about decisions the Rails team made out of security concerns, or about the fact that open source software like rack-test has bugs. Maybe there is just too much magic and indirection in this basic rails-api stack I was working on, or maybe it really was just bad luck.

Hopefully sharing this experience helps someone else find the solution with less effort, in case they run into the same situation.

ISP in Ruby

I think most of the people writing code in Ruby know the SOLID principles to a certain degree.

Today I want to share my thoughts about the interface segregation principle in the context of Ruby. Originally the SOLID principles were formalized with more statically typed languages in mind. In languages like Java and Go explicit interfaces are required for a certain type of polymorphism.

Thus, to pass a different object to a method, the client would have to implement all methods defined by the interface, and it totally makes sense to minimize the interface’s methods to what’s necessary, in order to ensure that no client is forced to depend on methods it does not use.

Duck typing solves the problem automatically

In Ruby and other languages where duck typing is possible, a client only has to implement the smallest possible interface, which could be just the one method that is expected to be called.
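A small sketch illustrates this (the names render_line, Post, and title are hypothetical):

```ruby
# The method requires exactly one thing: the argument responds to #title
def render_line(item)
  "* #{item.title}"
end

# Anything with #title works; no interface declaration is needed
Post = Struct.new(:title, :body)

render_line(Post.new("Duck typing", "..."))  # => "* Duck typing"
render_line(Struct.new(:title).new("Small")) # => "* Small"
```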

Thus one might think this principle hardly applies to Ruby, which seems kind of true. However, I think there is a certain danger most Ruby developers are not aware of: the public API of the object that is passed around.

Duck typing creates other problems

Thinking about ISP in the context of Ruby, I noticed that there is a negative side to it. Duck typing lets us accept everything that at least responds to the required methods. However, this usually means Ruby programmers, including myself, pass objects that have a much bigger public API, like:

  • active record models
  • decorated models (in case of a full delegation)
  • or any other object that has additional methods

Whereas in the Java case we make sure the required and available interface is minimal, in Ruby we only make sure the required interface is minimal and tend to ignore the available interface. This can easily lead other developers, ourselves, or library users to hook into those additional methods our object provides, which causes tight coupling and unwanted dependencies.
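A sketch of that accidental coupling (FakeRecord stands in for a full ActiveRecord model; all names are hypothetical):

```ruby
# FakeRecord stands in for an ActiveRecord model with a large public API
FakeRecord = Struct.new(:name, :updated_at) do
  def touch  # stands in for a persistence method the helper should not use
    self.updated_at = Time.now
  end
end

# The helper only *needs* #name, but because the whole record is handed
# over, nothing stops it from quietly coupling to the rest of the API
def heading(record)
  record.touch          # accidental dependency and hidden side effect
  record.name.upcase
end

heading(FakeRecord.new("report")) # => "REPORT", plus a hidden write
```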

This of course is a violation of the ISP.

I think one can regard passing an object with an unwanted API as publishing an API: there is no longer any control over whether a client uses the additional methods or not. Thus changing the type of the object, or any public method it has, should be handled as an incompatible change.

So while it may appear as if duck typing solves the problems addressed by the ISP, it makes it easy to believe the interface is small when in reality it is huge, causing unnecessary dependencies on up to the complete public API of a given object.

A solution

To solve this issue, an obvious approach is to pass objects with minimal interfaces. This can be done via façades, decorators (that don’t automatically delegate all methods), or by simple composition. Of course even a blank class’ interface in Ruby is still large, but we can at least make sure that we don’t expose additional methods the client should not be coupled to.
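A minimal façade via simple composition could look like this (User and MailRecipient are hypothetical stand-ins, not from any library):

```ruby
# Stand-in for a full ActiveRecord model with a large, dangerous API
class User
  attr_reader :name, :email

  def initialize(name, email)
    @name  = name
    @email = email
  end

  def destroy!  # an extra method no mail-sending client should touch
    raise "boom"
  end
end

# Façade: expose only the two methods the client actually needs
class MailRecipient
  def initialize(user)
    @user = user
  end

  def name
    @user.name
  end

  def email
    @user.email
  end
end

recipient = MailRecipient.new(User.new("Ada", "ada@example.com"))
recipient.email                    # => "ada@example.com"
recipient.respond_to?(:destroy!)   # => false, coupling is impossible
```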

So please don’t pass objects that expose an unnecessarily large API to others.

A note on inheritance

This observation includes the usage of inheritance, mixins, and decorators that delegate_all, since all these approaches are effectively inheritance and expose all methods when only some of them are needed.

Final thought

I find it interesting how, at first glance, Ruby seems to make sure that the client automatically depends on the smallest interface possible, when in the worst case they may really depend on the largest interface possible.