Wednesday, May 25, 2011

...on a mote of dust suspended in a sunbeam

The famous "Pale Blue Dot" photo showing Earth as seen from the edge of the solar system:


"Look again at that dot. That’s here. That’s home. That’s us. On it everyone you love, everyone you know, everyone you ever heard of, every human being who ever was, lived out their lives. The aggregate of our joy and suffering, thousands of confident religions, ideologies, and economic doctrines, every hunter and forager, every hero and coward, every creator and destroyer of civilization, every king and peasant, every young couple in love, every mother and father, hopeful child, inventor and explorer, every teacher of morals, every corrupt politician, every ‘superstar,’ every ‘supreme leader,’ every saint and sinner in the history of our species lived there — on a mote of dust suspended in a sunbeam." Carl Sagan 


The photo also illustrates the superiority of imagination over knowledge: knowledge sent Voyager to the edge of the solar system and beyond, but imagination (Sagan's) turned its camera toward Earth to take this photo - perhaps one of the most impactful photos of all time.



Sunday, May 15, 2011

To OSGi or not to OSGi ... that is NOT the question


The topic of OSGi is attracting some attention these days, at least in my neck of the woods. The short of it is that a lot of web application developers are asking whether they should use OSGi or not. My answer: this is not the question you should be asking!

Yes, I know the popular narrative: OSGi makes your system more modular, makes dependency management a thing of the past, solves all CLASSPATH issues, lets you run multiple versions of the same bundle at the same time, and enables you to start, stop, and deploy new bundles without restarting your framework (or server).

The reality, though, is that OSGi is a component technology (much like SOA or EJB) and a tool. It may help you implement a well-designed system with more ease and with more fidelity to its original design, but it can NOT do anything for a poorly designed or organically grown system; in fact, it makes such a system more complex. So the right question to consider is: what qualities must my system have so that OSGi can actually help me?

To me, the answer is the same as for an SOA system, or any finely componentized system:

1- Granularity and boundaries of components: This is perhaps the most important aspect of a distributed system. The OSGi unit of componentization is a "bundle". Physically, a bundle is a jar file; logically, it can be a single Java class, a full subsystem (such as Jetty or a large web application), or anything in between. OSGi does not offer any hint here - nor should it. Granularity of modules is an architectural matter. For an existing system, breaking the system apart into logical modules is almost always the most difficult step toward any modularization. If your system does not already have clear modules with defined boundaries, and if it has grown organically, there is no easy or automated way to decide what the modules are. Needless to say, simply creating one massive OSGi module does not help you at all; it just adds one more layer of useless abstraction on top of everything else. Your best bet here is to use a dependency graph tool and try to isolate and bundle packages/jar files based on some topological sort method. This often requires refactoring existing code to remove bad dependencies and transform the graph into the architectural layering you intend. This brings me to the second aspect of a well-designed modular system.
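To make the "topological sort" idea concrete, here is a minimal sketch in Python (3.9 or later, for the standard graphlib module). The package names and the dependency map are made up for illustration; in practice a dependency analysis tool would produce the graph. The sketch only proposes candidate layers. Deciding whether those layers make architectural sense is still up to you.

```python
from graphlib import TopologicalSorter

# Hypothetical package-level dependency graph: each key depends on the
# packages in its set. A real graph would come from a dependency tool.
DEPENDENCIES = {
    "app.web": {"service.orders", "core.model"},
    "service.orders": {"core.model", "core.persistence"},
    "core.persistence": {"kernel.util"},
    "core.model": {"kernel.util"},
    "kernel.util": set(),
}


def candidate_layers(dependencies):
    """Group packages into layers; layer 0 depends on nothing in the graph."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    layers = []
    while sorter.is_active():
        # Everything whose dependencies have all been placed in earlier layers.
        ready = sorted(sorter.get_ready())
        layers.append(ready)
        sorter.done(*ready)
    return layers


if __name__ == "__main__":
    for depth, packages in enumerate(candidate_layers(DEPENDENCIES)):
        print(depth, packages)
```

If the graph contains a cycle, prepare() raises an error instead of returning layers, which is usually the first sign that some refactoring is needed before any bundling can start.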

2- Layering and velocity: In order to define your logical modules correctly, you need to define some form of layering that informs your dependency management, i.e. a lowest layer (let's call it Kernel) with no dependencies but the standard runtime, and one layer above it (say Core) that depends on Kernel. Core is a layer and may include multiple logical modules (jars/packages); above it you may have Service and Application layers, etc. You need to decompose your entire code base and map it to your own pre-defined layers. In addition, you need to decide on a velocity (release cycle) for each layer, and whether releases of a lower layer will be forever backward compatible or will impact higher layers (more on this in 4).
Again, OSGi does not offer help here - nor should it - it is simply a technology.

3- Dependency management: Defining layers does not guarantee that the dependency scheme will be enforced. You still need to manage it (preferably automatically, using tools) to make sure that, for example, your Kernel does not depend on your Core layer and that there are no cyclical dependencies among your logical modules. Trickier yet is the nature of the dependencies. If dependencies are not managed, you may notice a very large module in, say, the Core layer, with a large number of services and applications depending on it. At first glance it may look like a useful module! But its large size should make you suspicious: often the upper layers simply leak what should be located in the application or service layer down to a lower layer, through lack of engineering discipline, knowledge, time, or all of the above.
In this case, OSGi would help you capture and discover the dependencies, but it does not tell you that they should not be there to begin with.
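The kind of automated check meant here can be sketched in a few lines of Python (3.9 or later). Everything in the sketch is hypothetical: the layer names follow the Kernel/Core/Service/Application example, and the module names and dependency map are invented. It reports upward dependencies, finds a cycle if there is one, and counts how many modules depend on each module, so that suspiciously popular modules stand out.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical layer order, lowest first, and a module-to-layer assignment.
LAYER_ORDER = ["kernel", "core", "service", "application"]
MODULE_LAYER = {
    "kernel.util": "kernel",
    "core.model": "core",
    "service.orders": "service",
    "app.web": "application",
}

# Hypothetical dependencies: module -> modules it uses.
DEPENDENCIES = {
    "kernel.util": {"core.model"},  # a violation: Kernel reaching up into Core
    "core.model": set(),
    "service.orders": {"core.model", "kernel.util"},
    "app.web": {"service.orders"},
}


def layer_violations(dependencies, module_layer, layer_order):
    """Return (module, dependency) pairs where a module uses a higher layer."""
    rank = {name: index for index, name in enumerate(layer_order)}
    violations = []
    for module, used in sorted(dependencies.items()):
        for dependency in sorted(used):
            if rank[module_layer[dependency]] > rank[module_layer[module]]:
                violations.append((module, dependency))
    return violations


def find_cycle(dependencies):
    """Return one dependency cycle as a list of modules, or None if acyclic."""
    try:
        TopologicalSorter(dependencies).prepare()
    except CycleError as error:
        return error.args[1]  # graphlib stores the cycle as the second argument
    return None


def fan_in(dependencies):
    """Count how many modules depend on each module."""
    counts = {module: 0 for module in dependencies}
    for used in dependencies.values():
        for dependency in used:
            counts[dependency] = counts.get(dependency, 0) + 1
    return counts


if __name__ == "__main__":
    print(layer_violations(DEPENDENCIES, MODULE_LAYER, LAYER_ORDER))
    print(find_cycle(DEPENDENCIES))
    print(fan_in(DEPENDENCIES))
```

A high fan-in count is only a hint. Whether a heavily used module is a genuinely shared abstraction or a dumping ground for leaked upper-layer code is still a judgment call.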

4- Versioning policy: As I said in (2), a well-designed system has layers, from “lower” layers to “upper” layers, based on a topological sort of the dependency graph. Typically each layer has a version number visible to other layers (some may choose to give each module in a layer its own version number, visible to all modules in upper layers; this makes life a bit more difficult for modules in upper layers). You should decide how many versions of each layer (or module) are active at any given time. This seemingly straightforward decision has significant implications. The options are:
-        If at any given time you maintain only one version, everything is a bit easier, but you either have to maintain perpetual backward compatibility or force all the upper layers to change at the same pace as the lower layers.
-        If you maintain multiple versions, you don’t need to be backward compatible and you may transition upper layers gradually - very desirable. But you have to deal with two versions at the same time (not only at runtime, but also in development branches, testing ...)

For most web applications, people maintain one version and deal with the downsides - often in the form of backward compatible changes. OSGi can help with maintaining multiple versions at the same time - something that is certainly useful for client-side applications; for web applications, most people I have talked to are not planning to use this feature.
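The trade-off between the two options can be illustrated with a small Python sketch. It is not how the OSGi resolver works. It is a simplified model in which each consumer accepts a half-open version range, in the spirit of an OSGi range such as "[1.0,2.0)", and gets the highest installed version inside that range. The consumer names and version numbers are invented.

```python
# Hypothetical installed versions of one lower-layer module: (major, minor, micro).
INSTALLED = [(1, 4, 0), (2, 0, 0)]

# Hypothetical upper-layer consumers, each accepting the range [minimum, maximum).
CONSUMERS = {
    "legacy-reports": ((1, 0, 0), (2, 0, 0)),
    "new-checkout": ((2, 0, 0), (3, 0, 0)),
}


def resolve(installed, minimum, maximum):
    """Pick the highest installed version v with minimum <= v < maximum, or None."""
    matches = [version for version in installed if minimum <= version < maximum]
    return max(matches) if matches else None


def wiring(installed, consumers):
    """Map each consumer to the version it would be wired to (None if unresolved)."""
    return {
        name: resolve(installed, low, high)
        for name, (low, high) in consumers.items()
    }


if __name__ == "__main__":
    # Two active versions: both consumers resolve, each at its own pace.
    print(wiring(INSTALLED, CONSUMERS))
    # One active version: the consumer that has not migrated cannot resolve.
    print(wiring([(2, 0, 0)], CONSUMERS))
```

With a single active version, the second call leaves the older consumer unresolved. That is the "force all the upper layers to change" half of the trade-off, unless the new version stays backward compatible.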


5- Testing strategy: Distributed systems are tough to test. A monolithic system is one large binary: you can build it, deploy it, and test it. For a distributed system, a test environment has to be set up: each developer builds only his own module, and the other modules it depends on must be ready (either as out-of-process services in SOA, or as bundles in OSGi) and have the right version. If you are using, say, five modules and each of them has two active versions, there are 2^5 = 32 possible combinations you need to test (to be exhaustive) - one reason having only one version at a time is often preferable. Again, OSGi does not help you design your test strategy; you should have one regardless of the technology you use to modularize your system.
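The arithmetic behind the 32 combinations is easy to check with a short Python sketch. The module names and version strings are placeholders. The point is only that the test matrix is the product of the number of active versions per module.

```python
from itertools import product

# Hypothetical modules, each with two active versions.
ACTIVE_VERSIONS = {
    "kernel": ["1.0", "2.0"],
    "core": ["1.0", "2.0"],
    "billing": ["1.0", "2.0"],
    "catalog": ["1.0", "2.0"],
    "search": ["1.0", "2.0"],
}


def version_combinations(active_versions):
    """Enumerate every way to pick one active version per module."""
    names = sorted(active_versions)
    choices = product(*(active_versions[name] for name in names))
    return [dict(zip(names, choice)) for choice in choices]


if __name__ == "__main__":
    combinations = version_combinations(ACTIVE_VERSIONS)
    print(len(combinations))  # 2 * 2 * 2 * 2 * 2 = 32
    print(combinations[0])
```

Keeping a single active version per module collapses the matrix to one combination, which is why the one-version policy is so much cheaper to test.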

6- Deployment: Last but not least is the deployment of your system/web application. You need to decide whether to deploy the OSGi framework as a web application under your servlet container, or to deploy your servlet container as a bundle in your OSGi framework. If you are using SOA, you need to decide where to deploy each service and how to bundle service stubs with your application (if any stubs are needed); or you may use a combination, by making each service stub an OSGi bundle. In any case, there should be a clear design for the correct deployment of a distributed/modular system.

If you design a system in a way that takes these aspects into account, then OSGi probably helps you implement it more easily - although for web applications the issue of "two runtimes" is a bit too much for my taste. But then again, if all the above aspects are taken into account, you may not have an urgent need for OSGi anyway (banks offer credit to people with good credit, but they probably don't need it anyway...). Often, engineers and managers who work on poorly designed or organically grown systems stumble upon OSGi in an effort to reduce complexity and increase the productivity of the people working on those systems. If you fit into this category, my recommendation is to focus on fixing the underlying issues that make your system complex, inter-dependent and coupled. Until you do that, OSGi (or any other alphabet soup of technology) will not help you.