I have thought a lot in recent months about how best to leverage the cloud computing, or utility computing, resources that are increasingly available to the general development community, and one issue in particular makes me cringe: LICENSING.
It just so happens that in my field, proteomics, the open source tools gather a lot of press, but most research submitted for publication still uses commercial algorithms for the initial data analysis, even though plenty of research has shown (and been published showing) that results from commercial and open source algorithms are comparable. Manuscript reviewers have an inherent level of trust in the commercial offerings that is hard to overcome, so most researchers still treat the commercial algorithms as the gold standard.
Not that this is a bad thing, mind you. As someone who supports the informatics efforts of many researchers, I find that the commercial offerings are much more stable, and are consequently easier to support and maintain than most of the current crop of open source offerings.
The trouble lies with the rigid licensing models of commercial offerings. Specifically, you must purchase perpetual licenses for a fixed number of compute cores. Such a model is just not compatible with what I would like to do as a service center, namely provide software-as-a-service billing to researchers. True, the high up-front licensing cost can be amortized over the life of the support contract, but that assumes the computers running the algorithm are already procured and dedicated to the software (not a bad assumption in most cases, since the hardware costs pale in comparison to the licensing).
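To make the amortization point concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (license price, core count, job volume, hours per job, metered rate) is a made-up placeholder for illustration, not real vendor pricing.

# Hypothetical comparison of a perpetual per-core license versus
# pay-per-use billing. All figures are placeholders, not vendor prices.

license_cost_per_core = 5000.0   # hypothetical up-front perpetual license, per core
support_years = 3                # hypothetical support-contract length
cores = 16                       # cores the license is locked to
jobs_per_year = 200              # hypothetical analysis jobs run per year

# Amortized cost per job under the perpetual model: the up-front spend
# is sunk whether the cores sit idle or not.
perpetual_cost_per_job = (license_cost_per_core * cores) / (support_years * jobs_per_year)

# Under a usage-based model, you would instead pay only while a job runs.
hours_per_job = 4
rate_per_core_hour = 1.50        # hypothetical metered software rate
metered_cost_per_job = cores * hours_per_job * rate_per_core_hour

print(f"Perpetual license, amortized: ${perpetual_cost_per_job:.2f} per job")
print(f"Metered (pay-per-use):        ${metered_cost_per_job:.2f} per job")

The point is not which number comes out lower under these invented figures; it is that the perpetual model forces me to guess job volume years in advance, while a metered model would let the cost track the actual work billed to each researcher.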
In effect, there is no way to make a utility computing model, such as the one offered by Amazon Web Services, work with these sorts of license restrictions. The cost of setting up and tearing down a compute job is too high for it to be a viable full-time solution.
What I would like to do is augment my current computing capacity during crunch times. Dedicated licensing prevents this, as does the way most networked algorithms work, but that's another post.