ISP in Ruby

I think most people writing Ruby code know the SOLID principles to some degree.

Today I want to share my thoughts about the interface segregation principle (ISP) in the context of Ruby. Originally, the SOLID principles were formalized with statically typed languages in mind. In languages like Java and Go, explicit interfaces are required for a certain type of polymorphism.

Thus, to pass a different object to a method, the client would have to implement all methods defined by the interface, and it makes perfect sense to minimize the interface's methods to what is necessary in order to guarantee that no client is forced to depend on methods it does not use.

Duck typing solves the problem automatically

In Ruby and other languages where duck typing is possible, a client only has to implement the smallest possible interface, which could be the single method that is expected to be called.
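To make this concrete, here is a minimal sketch (the class names are made up) of what duck typing buys us: the client method only cares that its argument responds to one message.

```ruby
# Hypothetical example: the printer only relies on #name being present.
class GreetingPrinter
  def print_greeting(person)
    puts "Hello, #{person.name}!"
  end
end

# Any object that responds to #name will do -- no explicit interface required.
Guest = Struct.new(:name)
GreetingPrinter.new.print_greeting(Guest.new("Alice"))
```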

Thus one might think this principle hardly applies to Ruby, and that seems somewhat true. However, I think there is a certain danger most Ruby developers are not aware of: the public API of the object that is passed in.

Duck typing creates other problems

Thinking about ISP in the context of Ruby, I noticed that there is a negative side to it. Duck typing lets us accept everything that at least responds to the required methods. In practice, this usually means Ruby programmers, including myself, pass objects that have a much bigger public API, like:

  • ActiveRecord models
  • decorated models (in case of full delegation)
  • or any other object that has additional methods

Whereas in the Java case we make sure the required and the available interface are minimal, in Ruby we only make sure the required interface is minimal and tend to ignore the available interface. This can easily lead other developers, ourselves, or library users to hook into those additional methods our object provides, which causes tight coupling and unwanted dependencies.

This, of course, is a violation of the ISP.
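To illustrate the danger, here is a hypothetical sketch (the names are made up, and a plain Ruby class stands in for an ActiveRecord model): the notifier only needs #email, but because it receives the full record, nothing prevents later code from coupling to the rest of the available API.

```ruby
# FakeUserRecord stands in for an ActiveRecord model with a large public API.
class FakeUserRecord
  attr_accessor :email, :admin

  def initialize(email)
    @email = email
    @admin = false
  end

  # Stand-ins for the persistence methods a real model would expose.
  def save!;   puts "saved";     end
  def destroy; puts "destroyed"; end
end

class Notifier
  # Only #email is required ...
  def notify(user)
    puts "Sending mail to #{user.email}"
    # ... but nothing stops a later change from reaching into the rest of the
    # available API, e.g. user.save! or user.destroy -- accidental coupling.
  end
end

Notifier.new.notify(FakeUserRecord.new("alice@example.com"))
```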

One can think of passing an object with an unwanted API as publishing an API: there is no longer any control over whether a client uses the additional methods or not. Consequently, changing the type of the object, or any public method it has, should be treated as an incompatible change.

So it may appear as if duck typing solves the problems addressed by the ISP, but it makes it easy to believe the interface is small when in reality it is huge, causing unnecessary dependencies on up to the complete public API of a given object.

A solution

To solve this issue, the obvious approach is to pass objects with minimal interfaces. This can be done via façades, decorators (that don't automatically delegate all methods), or simple composition. Of course, even a blank class' interface in Ruby is still fairly large, but we can at least make sure that we don't expose additional methods that the client should not be coupled to.
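Here is a minimal sketch of that idea (names are made up): wrap the large object in a small facade that exposes only the methods a client is allowed to depend on.

```ruby
# A tiny facade over a hypothetical user record with a much larger API.
class MailRecipient
  def initialize(user)
    @user = user
  end

  # The only method clients may couple to.
  def email
    @user.email
  end
end

# Stand-in for a record that also exposes persistence methods.
User = Struct.new(:email, :admin) do
  def save!; end
  def destroy; end
end

recipient = MailRecipient.new(User.new("alice@example.com", false))
puts recipient.email   # works
# recipient.destroy    # NoMethodError -- the unwanted coupling is now impossible
```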

So please don't pass objects that expose an unnecessarily large API to others.

A note on inheritance

This observation includes the use of inheritance, mixins, and decorators that delegate_all, since all these approaches are effectively inheritance and expose all methods when only some of them are needed.

Final thought

I find it interesting how, at first glance, Ruby seems to make sure that the client automatically depends on the smallest interface possible, when in the worst case it may really depend on the largest interface possible.

Update

A few days after I finally wrote this small article about distributed systems and databases, I had the time to watch another video from my "watch later" playlist on YouTube.

The video I am talking about is Applying the Saga Pattern by Caitie McCaffrey at GOTO 2015. The content is strongly related to the content of my former blog post, so I decided to update the old one instead of repeating everything I just said.

Here is a link to the new section: saga pattern.

Thanks to Caitie McCaffrey for her great talk.

Serializability

I recently skimmed through the YouTube playlist of the "GOTO Conferences 2015" when I noticed a talk named "Don't Give Up on Serializability Just Yet" by Neha Narula.

In this video she talks about the property of serializability and the benefits transactional databases give software system engineers. She talks a little bit about the CAP theorem and the FLP theorem and how the two concepts relate to one another. After that she continues with her recent work on parallel processing of serialized operations.

I highly recommend this talk; it is available on YouTube.

CAP and FLP

By coincidence, a few days after I saw the talk, a friend of mine pointed me to a blog post arguing that the FLP and CAP theorems are not equivalent.

The author points out that the two are mathematically not equivalent. However, I found the blog post not detailed enough, so I searched for a more scientific treatment and found this post on Quora: Distributed Systems: Are the FLP impossibility result and Brewer's CAP theorem basically equivalent? The post points to a paper by Gilbert and Lynch and even quotes the part that addresses the CAP vs. FLP question. Here are some quotes that specify the two concepts:

FLP theorem:

The FLP theorem states that in an asynchronous network where messages may be delayed but not lost, there is no consensus algorithm that is guaranteed to terminate in every execution for all starting conditions, if at least one node may fail-stop.

CAP theorem:

The CAP theorem states that in an asynchronous network where messages may be lost, it is impossible to implement a sequentially consistent atomic read / write register that responds eventually to every request under every pattern of message loss.

The most important part of the quoted material is:

achieving agreement is (provably) harder than simply implementing an atomic read/write register.

I highly recommend reading these articles and the paper, or at least the relevant section. I found these new insights interesting, and they certainly improved my knowledge of both theorems and their relation to each other.

Parallel execution of conflicting transactions

Within the talk about serializability, Mrs. Narula mentioned that she was currently working on improving the performance of serialized operations. I noticed that there already is a paper on that topic and found her thesis, named Parallel execution of conflicting transactions. This paper is really interesting and shows that there is still a lot of potential to improve the parallelization of serialized operations, even in the case of conflicting transactions. She also includes an in-memory database that serves as a kind of PoC implementation.

Commutativity

In her paper Mrs. Narula mentions the importance of commutativity. Since this is a general concept, I looked further and found another interesting video on YouTube: The Scalable Commutativity Rule at Papers We Love. Mrs. Narula presents a paper that shows the connection between commutativity and scalability. The authors of the paper show that if a set of operations commute, there exists an implementation that has the property of scalability. The video goes into way more detail and presents various examples from operating systems where commutativity can improve scalability and parallelization.
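As a toy illustration (my own, not taken from the paper), here is what commutativity means for operations on shared state: if the operations commute, every execution order ends in the same state, which is what leaves an implementation room to reorder or parallelize them.

```ruby
# Toy illustration: increments commute with each other, an absolute "set" does not.
def apply(ops, initial = 0)
  ops.reduce(initial) { |state, op| op.call(state) }
end

increment_by_2 = ->(n) { n + 2 }
increment_by_5 = ->(n) { n + 5 }
set_to_10      = ->(_) { 10 }

# Commutative: both orders end in the same state, so the order can be relaxed.
p apply([increment_by_2, increment_by_5])  # => 7
p apply([increment_by_5, increment_by_2])  # => 7

# Non-commutative: the final state depends on the order.
p apply([increment_by_2, set_to_10])       # => 10
p apply([set_to_10, increment_by_2])       # => 12
```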

I highly recommend watching this video, and maybe even reading the paper.

I just thought I'd share this set of really interesting sources, since I learned a lot by spending a few days on this topic. As a software engineer, these insights will help me in practice to use not only idempotency but also commutativity as concepts when implementing concurrent and scalable systems.

Thanks to Neha Narula for her presentation of her own work and her colleagues' paper.

UPDATE (2016-01-11)

The saga pattern

I have to admit I only just learned about the so-called saga pattern. I have been working with similar concepts in practice, but it is always great to learn about the formal background.

I saw Applying the Saga Pattern by Caitie McCaffrey a few days ago and I want to thank her for the great talk. She managed to explain the pattern and its use in distributed systems in around 30 minutes, which I find awesome.

For those who don't know it yet, the concept is about breaking long-running and/or distributed transactions into smaller sub-transactions. A worklog is kept, and if any error occurs, compensating transactions are applied to all sub-transactions that have already been executed. In addition, the forward-recovery concept uses save-points after successful sub-transactions so that the operation can be retried, or fixed by alternative algorithms or even manually, so that the complete saga finishes successfully and no rollback (with potential data loss) happens.
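To make the backward-recovery half of this a bit more tangible, here is a minimal, hypothetical Ruby sketch (step names and structure are my own, and a real saga would persist its worklog durably): each sub-transaction carries a compensating action, and on failure the completed steps are compensated in reverse order.

```ruby
# Each sub-transaction has an action and a compensating transaction.
Step = Struct.new(:name, :action, :compensation)

def run_saga(steps)
  worklog = []                 # records which sub-transactions completed
  steps.each do |step|
    step.action.call
    worklog << step
  end
  puts "saga completed"
rescue StandardError => e
  puts "saga failed (#{e.message}), compensating..."
  worklog.reverse_each { |step| step.compensation.call }
end

run_saga([
  Step.new(:reserve_hotel,  -> { puts "hotel reserved" },  -> { puts "hotel cancelled" }),
  Step.new(:reserve_flight, -> { puts "flight reserved" }, -> { puts "flight cancelled" }),
  Step.new(:charge_card,    -> { raise "charge_card" },    -> { puts "charge refunded" })
])
```

Forward recovery would instead retry or substitute the failing step from the last save-point until the saga completes.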

This is oversimplified and just a short introduction. I encourage anyone to watch the video. It's a great presentation with a lot of practical considerations between the lines.

After the talk I took the time to read the original paper, called SAGAS, by Garcia-Molina and Salem. It's a short and interesting read, so I recommend taking a few minutes for it.

I really have to say that Mrs. McCaffrey managed to put all the relevant content into her talk, so that watching the video teaches almost the complete concept explained in the paper.

It was great to hear about the constraints that sub-transactions must satisfy in order to allow backward and forward recovery. When building the next distributed transaction, I will have clearer knowledge about the requirements and constraints.

So thanks to Mr. Garcia-Molina and Mrs. McCaffrey (@caitie on Twitter).

I finally had the time to clean up this blog. I removed jekyll-bootstrap entirely and switched to plain old Jekyll. This has been on my mind for quite some time, but I finally did it. YAY.

Besides removing a big dependency and unnecessary complexity, I also had the chance to switch to a brand new theme, with proper mobile support.

Hopefully the new version inspires me to write some more posts, so that this page carries some up-to-date and more software-related information.

It has now been about two weeks since a very critical software bug in the encryption library OpenSSL became public. The vulnerability is known as the Heartbleed bug, or simply Heartbleed. Besides a short summary of the problem, I want to note down some recommendations for internet users. Jump straight to the recommendations

Background

The vulnerability became public on April 8, 2014 with the release of a patch that fixes it. However, it has existed since 2012 and makes it possible to decrypt all data that anyone has transmitted to an affected server. This applies both to data recorded in the past and to current traffic. According to Schneier on Security, at least about 500,000 sites were affected, among them GMX, web.de, Google, Facebook, Gmail, Twitter, Instagram, Dropbox, Yahoo, and GoDaddy. There is an overview of this here. Anyone who wants to understand the vulnerability should have a look at the following articles:

After about a week, Cloudflare started a so-called Heartbleed Challenge, whose result showed that the worst case I described above is practically possible. Once it was finally clear how seriously the vulnerability has to be taken, the first hints appeared that it had already been exploited in 2013. On April 14, 2014 there were indications that data had been extracted from the Canadian tax agency. Since not all servers had been updated yet, it turned out on April 17, 2014 that 1000 Tor exit nodes were still affected. Unfortunately, not all servers have been updated to this day. Not only the Tor nodes mentioned above, but a great number of servers, especially those of small companies, still have not been patched. On April 11, 2014, Schneier.com pointed to this article, which explains how, besides servers, clients affected by this vulnerability can also be attacked.

Recommendations

Recommendations for every user

  • Change the credentials you used with the affected services. Besides HTTP, this also applies to email and VPN access.
  • If you are unsure whether a website you use over SSL is safe, you can run a live test. If it is still affected, be aware that anyone can read your credentials now or in the future.
  • Check whether your browser supports certificate revocation. Your browser behaves correctly if you get an error on this website. If you can view it normally, please update your browser or use a different one.
  • Linux and Unix users should check whether an affected version of OpenSSL is installed. If your system is affected, you should update OpenSSL urgently.

Additional recommendations for technically experienced users

  • Additionally, check whether the site was affected by Heartbleed and whether its SSL certificate has been renewed.
  • Take IT security really seriously.

Additional recommendations for server admins

Since by far not all servers have been updated, I also want to state the obvious:

  • Update affected OpenSSL versions to a safe version (e.g. 1.0.1g instead of 1.0.1a-f).
  • Replace all certificates that were used on the affected servers, and do so with a new private key.
  • Inform all affected users that they should change their credentials.

Note for GMX, web.de, … users

By changing your credentials, you respond not only to the Heartbleed problem but also to the recent surfacing of 18,000,000 email passwords.

Sources