July 30, 2007
Each trading day is a perfect storm. Every month, every quarter, the volume of data increases, the sophistication of algorithms and business processes grows, and the competitive pressure to get things done as quickly and efficiently as possible mounts. In the past, Moore’s Law has rescued us from drowning in computing demand, but the pace of progress alone can no longer stem the tide. While we previously had been able to recompile applications with new processor optimizations and deploy them on bigger, faster systems to keep ourselves afloat, new systems offer greater concurrency instead of greater speed, and simple recompilation and deployment cannot take advantage of them. Our appetite for computing power isn’t satisfied with lone, uncoordinated machines. For financial services, distributed computing isn’t a luxury: it puts food on the table.
The ongoing adoption of distributed computing within financial services has not been easy, though. Retrofitting applications to benefit from distributed architectures has required significant knowledge, resources and effort, often more than we have had on hand. For example, many financial applications begin with prototypes developed in Microsoft Excel by quantitative analysts or other business managers. Typically, those prototypes have not lent themselves to concurrent or distributed implementations. Software engineers, who in the past could afford to do relatively literal translations of spreadsheet logic into production code, have since needed to transform solutions into something more amenable to running in parallel. This transformation generally requires developers with greater technical skill and a finer ability to engage with and understand the underlying business problem. While individual developers or small teams of average developers can, with very light impact on IT, implement monolithic or standard n-tier applications, implementing distributed applications generally requires more sophisticated developers and specialized knowledge in networking, security, concurrency and performance.
Furthermore, these projects tend to require significantly more IT involvement, coordination and production management in order to support the physical infrastructure of distributed applications. Before the current crop of distributed computing tools came to market, many engineering teams rolled their own distributed application infrastructures. Even when relying on message-passing libraries such as MPI, teams had to invest heavily in provisioning, deployment and data distribution. Such teams spent an inordinate amount of time developing distributed computing infrastructures instead of the custom business logic that generates unique value.
Fortunately, the past several years have brought many enabling developments in distributed computing, including a number of high-quality vendor products that significantly reduce the complexity and cost of delivering distributed applications. It is now much easier than ever before to develop distributed applications across a range of different platforms. Nonetheless, distributed application development still requires a fair amount of architectural skill and understanding, IT involvement, and nontrivial transformation of business logic.
Distributed Computing Today
The key aspect of distributed computing today is that it is no longer just theoretical. You actually can write certain types of distributed applications (such as those that are embarrassingly parallel) with off-the-shelf products, and with minimal time, effort and cost. The range of stable, usable distributed computing platforms -- such as those from Platform Computing, GigaSpaces and Digipede Technologies -- is impressive, as are the other supporting technologies -- such as distributed data frameworks from GemStone, Tangosol and ScaleOut Software, and event processing systems from Progress Apama and BEA -- that enable more sophisticated distributed designs and architectures. Thus, it is becoming much rarer to find software development teams in financial services working on this type of plumbing.
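To make "embarrassingly parallel" concrete, here is a minimal Python sketch rather than any vendor's API. It assumes a portfolio whose positions can each be valued without data from any other position, so the work splits cleanly across workers. The function names, the sample positions and the simple discounting formula are all hypothetical, and a local thread pool stands in for a real grid.

```python
from concurrent.futures import ThreadPoolExecutor


def present_value(cash_flow, rate, years):
    """Discount a single future cash flow back to today."""
    return cash_flow / (1.0 + rate) ** years


def value_position(position):
    """Value one position; it needs no data from any other position."""
    cash_flow, rate, years = position
    return present_value(cash_flow, rate, years)


def value_portfolio(positions, workers=4):
    """Fan the independent valuations out to a pool, then sum the results."""
    # Assumption: a local thread pool stands in for a grid of machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(value_position, positions))


# Hypothetical sample data: (cash flow, annual rate, years to payment).
positions = [(100.0, 0.0, 1), (110.0, 0.10, 1), (121.0, 0.10, 2)]
total = value_portfolio(positions)
print(f"Portfolio value: {total:.2f}")
```

Because no valuation depends on another, the only coordination is the final sum. That absence of shared state is what lets off-the-shelf platforms handle this class of problem with so little effort.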
Additionally, there has been a significant rise in conferences, articles and blog entries on distributed computing in financial services and in enterprises at large. While there have been several notable distributed computing projects in the past -- everything from key cracking and searches for Mersenne primes to genome/proteome mapping and signal analysis for SETI -- few were structured in a way that represented how financial services needed to use distributed computing. There was a dearth of information and dialogue about the unique demands of distributed computing in finance, and a lack of live projects from which the community could learn. Now, we are seeing a growing number of financial institutions, from global investment banks to hedge funds, not only piloting distributed computing projects, but also talking about them in public and semi-public forums.
On the other hand, the current state of the world presents a number of serious obstacles. For example, while it is positive that there is a wave of vendor products that solve different parts of the distributed computing puzzle, few of them treat distributed application development as a holistic endeavor that encompasses many problems (e.g., job scheduling, event processing, data distribution and caching, security, deployment, APIs and IDEs) at once. Except for GigaSpaces, most distributed computing architectures require the assembly of infrastructure from several different vendors. While this does permit architectures built from best-of-breed solutions, it can be challenging to stitch the various pieces together into a coherent developer framework.
Another obstacle is that the organizations designing business logic have not been thinking of business logic in a form amenable to distributed computing. Most algorithms, prototypes and problem descriptions exhibit a serial bias and usually require significant transformation to adapt the design to a distributed model. For example, many designs assume a canonical database, a master process and reliable determinism, and these assumptions get subtly baked into the requirements. That means the software engineering process must reach back into the business to search for equivalent, distributed solutions. This, unfortunately, puts significant pressure on perhaps the weakest interaction in many financial organizations: the interaction between subject matter expert and software engineer.
Things are also somewhat bleak on the developer side. From a programming language perspective, we are still in the assembly language era of distributed computing. Most distributed programs are intimately involved on a line-by-line basis in concurrency, synchronization, coherency and other plumbing. Design patterns and language concepts have not sufficiently formed and stabilized to migrate into our mainstay programming languages, although there are some interesting indications of the things to come in technologies such as Erlang and Microsoft’s CCR/DSS.