Design Goals

In my last post (Wide open spaces) we discussed the elegance of space-based architecture and the simplicity and power behind it. Compared to other models for developing distributed applications, it offers a simpler design, savings in development and debugging effort, and more robust results that are easier to maintain and integrate. Recall that this model combines and integrates distributed caching, content-based distributed messaging, and parallel processing into a powerful architecture within a grid computing framework.

That was a mouthful. You may want to read that last sentence again carefully and think about what it means to you as a professional practitioner; more importantly, think about how it may change the way you view application platforms in general.

Before diving into this important concept, I think it is always a good idea to state our design goals up front and use them to guide the inevitable trade-offs and decisions that will need to be made along this journey. So let's get started with a few design goals I'm comfortable with. I'm sure there are more, but this represents a good start.

The platform’s ability to scale must be completely transparent

The architecture should be based on technology that can be deployed across a grid of commodity hardware nodes, providing a scalable and adaptable platform that supports high-volume, high-performance processing. The resulting platform should be tolerant of failure in individual nodes, should be easy to match to changing volumes by increasing (or decreasing) the number of processing nodes, and, by virtue of its decoupled business logic, should be extensible and adaptable as the business landscape changes.

Unlike conventional application server models, our elastic application platform should not require application developers to do anything different in their code in order to scale. The developer uses a simple API that provides a vast key-value data store that looks like one large shared memory space. Under the covers, the distributed caching features of the application platform spread the data across multiple servers (e.g. using a sophisticated hash algorithm). The application developer should remain unaware of the underlying implementation that distributes the data across the servers on their behalf. In brief, the grid-enabled middleware is designed to hide the complexities of partitioning, distribution, and load balancing.
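
To make that concrete, here is a toy Java sketch of the idea. The PartitionedMap class and its hash-based routing are purely illustrative, not any vendor's API; a real platform would spread the partitions across servers and rebalance them as nodes come and go, while the caller still sees nothing but put and get.

import java.util.HashMap;
import java.util.Map;

/**
 * Toy illustration of the "looks like one big shared map" idea.
 * A real elastic platform would spread partitions across servers;
 * here each "node" is just an in-memory HashMap. All names are
 * hypothetical, not a vendor API.
 */
public class PartitionedMap<K, V> {
    private final Map<K, V>[] nodes;

    @SuppressWarnings("unchecked")
    public PartitionedMap(int nodeCount) {
        nodes = new Map[nodeCount];
        for (int i = 0; i < nodeCount; i++) nodes[i] = new HashMap<>();
    }

    // The hash-based routing is hidden here; callers never see it.
    private Map<K, V> nodeFor(K key) {
        return nodes[Math.floorMod(key.hashCode(), nodes.length)];
    }

    public void put(K key, V value) { nodeFor(key).put(key, value); }
    public V get(K key)             { return nodeFor(key).get(key); }

    public static void main(String[] args) {
        PartitionedMap<String, String> cache = new PartitionedMap<>(4);
        cache.put("order-42", "PENDING");          // developer just sees put/get
        System.out.println(cache.get("order-42")); // PENDING
    }
}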

The platform provides resiliency by design

Applications must be available to customers, and expected service level objectives must be met. The business cannot afford to have a single point of failure impact customer access to the other features and functions of the application suite. The platform should operate continuously and needs to be highly resilient to avoid any interruption in processing. This means the application suite cannot have a single point of failure in software, hardware, or network. High Availability (HA) is a basic requirement: failing services and application components must fail over to backup servers without service disruption.

Distributed data caches are resilient by design because they automatically replicate data stored in the cache to one or more backup servers, guided by policies defined by an administrator and executed in a consistent, controlled manner. If one server fails, another server provides the data (the more replicas, the more resilient the cache). Note that distributed data caches are still vulnerable to data center outages if all the compute servers are located in the same physical data center. To address this weakness, the distributed caching mechanism should offer WAN features to replicate and recover data across multiple physical locations. The improvement in resilience reduces the risk of expensive system downtime due to hardware or software failure, allowing the business to continue operating, albeit with reduced performance, during partial outages. An added benefit of an architecture composed of discrete units working together is that it enables rapid development and a controlled introduction of new features in response to changing requirements, without the need for a big-bang rollout.
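
As a minimal sketch of what synchronous backup replication buys us, consider this toy Java example. Everything here (the ReplicatedStore name, the "next node holds the backup" placement rule) is hypothetical; real products let an administrator set the replica count and placement as policy.

import java.util.HashMap;
import java.util.Map;

/**
 * Toy sketch of synchronous backup replication: every write goes to a
 * primary node and one backup, so a single node failure loses nothing.
 * Real platforms make replica count and placement an admin-defined policy.
 */
public class ReplicatedStore {
    private final Map<String, String>[] nodes;

    @SuppressWarnings("unchecked")
    public ReplicatedStore(int nodeCount) {
        nodes = new Map[nodeCount];
        for (int i = 0; i < nodeCount; i++) nodes[i] = new HashMap<>();
    }

    public void put(String key, String value) {
        int primary = Math.floorMod(key.hashCode(), nodes.length);
        int backup  = (primary + 1) % nodes.length; // next node holds the replica
        nodes[primary].put(key, value);
        nodes[backup].put(key, value);
    }

    public String get(String key) {
        int primary = Math.floorMod(key.hashCode(), nodes.length);
        String v = nodes[primary].get(key);
        if (v != null) return v;
        // Simulated failover: the primary lost the entry, so read the backup.
        return nodes[(primary + 1) % nodes.length].get(key);
    }

    public static void main(String[] args) {
        ReplicatedStore store = new ReplicatedStore(3);
        store.put("cust-7", "GOLD");
        int primary = Math.floorMod("cust-7".hashCode(), 3);
        store.nodes[primary].clear();            // simulate a node failure
        System.out.println(store.get("cust-7")); // GOLD, served from the backup
    }
}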

The platform is prepared to meet demanding performance requirements

A key performance characteristic of distributed caches is that they store data in fast-access memory rather than on disk (although a disk-backed store may be an option). Because the data spans multiple servers, there is no single bottleneck or point of failure. An elastic application platform can also ensure that cached data tends to reside on the same server where the application code using it is running, reducing network latency. We can do this by implementing a “near-cache” that keeps data on the server running the application that uses it, or by having the platform manage application code execution directly, co-locating code and data in cache nodes on the same server.
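
The near-cache idea can be sketched in a few lines of Java. The NearCache class below is a toy, with a loader function standing in for the remote fetch; real near-caches also handle invalidation and eviction, which are omitted here.

import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Toy near-cache sketch: a small local map in front of the (slower,
 * remote) distributed cache. Real platforms add invalidation and
 * eviction policies; those concerns are omitted here.
 */
public class NearCache<K, V> {
    private final Map<K, V> local = new HashMap<>(); // lives in the app's JVM
    private final Function<K, V> remote;             // stand-in for a network fetch

    public NearCache(Function<K, V> remote) { this.remote = remote; }

    public V get(K key) {
        // The first hit pays the network cost; later hits are local memory reads.
        return local.computeIfAbsent(key, remote);
    }

    public static void main(String[] args) {
        NearCache<String, String> cache = new NearCache<>(key -> {
            System.out.println("remote fetch for " + key);
            return key.toUpperCase();
        });
        System.out.println(cache.get("price-book")); // remote fetch, then PRICE-BOOK
        System.out.println(cache.get("price-book")); // served locally, no fetch
    }
}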

The platform needs to support robust integration with other data sources

Most distributed caching platforms offer read-through, write-through, and write-behind features to synchronize data in the cache with external data sources. Rather than the developer having to write this code, an administrator configures the cache to automatically read from or write to a database or other external data source whenever an application performs a data operation in the cache. Data is a valuable asset, and sharing it across the platform improves data enrichment and accuracy and helps meet business goals.
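
Here is a toy Java sketch of read-through and write-through, with a plain map standing in for the external database. The class and method names are illustrative only; in a real product, the loader and writer are supplied as configuration rather than written inline.

import java.util.HashMap;
import java.util.Map;

/**
 * Toy read-through / write-through sketch. The application only talks to
 * the cache; the cache consults the backing "database" (here, just a map)
 * so the two stay in sync. Write-behind would queue the database write
 * and flush it asynchronously instead of persisting inline.
 */
public class ReadWriteThroughCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> database = new HashMap<>(); // stand-in for an RDBMS

    public String get(String key) {
        // Read-through: on a cache miss, load from the database transparently.
        return cache.computeIfAbsent(key, database::get);
    }

    public void put(String key, String value) {
        cache.put(key, value);
        database.put(key, value); // write-through: persist on every cache write
    }

    public static void main(String[] args) {
        ReadWriteThroughCache c = new ReadWriteThroughCache();
        c.database.put("sku-1", "widget");           // pre-existing database row
        System.out.println(c.get("sku-1"));          // loaded through the cache: widget
        c.put("sku-2", "gadget");                    // lands in both cache and database
        System.out.println(c.database.get("sku-2")); // gadget
    }
}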

The platform’s application workload is by nature distributed

For elastic application platforms offering distributed code execution, we should consider the nature of the workload the applications will present to the servers. If we can divide the workload into units that fit naturally into the distribution schemes on offer, then the greater sophistication of the distributed code execution capability can be just what is needed to turn a troublesome, resource-intensive application into one that performs well and meets expectations.
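
A single-JVM sketch of this "divide into natural units" idea, using only the standard Java executor framework: a master splits the work across partitions, workers process their partitions in parallel, and the results are combined. On a real grid, each task would instead be routed to the node that already holds its partition's data.

import java.util.*;
import java.util.concurrent.*;

/**
 * Toy master/worker partitioning. Each worker handles one partition of
 * the workload (assigned here by a simple modulo rule) and the master
 * combines the partial results.
 */
public class PartitionedWork {
    public static void main(String[] args) throws Exception {
        List<Integer> orders = List.of(12, 7, 40, 3, 25, 18, 9, 31);
        int partitions = 4;
        ExecutorService workers = Executors.newFixedThreadPool(partitions);

        List<Future<Integer>> results = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            final int part = p;
            results.add(workers.submit(() ->
                // Each worker sums only the orders assigned to its partition.
                orders.stream().filter(o -> o % partitions == part)
                      .mapToInt(Integer::intValue).sum()));
        }

        int total = 0;
        for (Future<Integer> f : results) total += f.get(); // combine step
        System.out.println("total = " + total);             // 145
        workers.shutdown();
    }
}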

Specific application responsibilities that repeat (or are redundant) across the application architecture should be factored out. Shared, common-use application functions are sometimes referred to as “cross-cutting concerns” and advance the key principle of separation of concerns. The platform should support component designs that minimize coupling; the Law of Demeter (the Principle of Least Knowledge: only talk to your immediate neighbors) applies. The platform should promote loose coupling by minimizing:

  • dependencies between modules (e.g. shared global variables)
  • content coupling (one module relying on another’s internal content)
  • protocol or format dependencies
  • control coupling, where one program directs another’s behavior
  • non-traceable message coupling, which can lead to dynamic, spaghetti-like results that are impossible to manage

There are other goals I have not addressed here with which we should all be familiar, including:

  • Desire to BUY vs. Build and Maintain
  • Remain Technology and Vendor Independent
  • Promote Interoperability
  • Meet security and privacy needs

So now we have a better idea of the design goals we are trying to achieve. The next step is to carry these goals into the high-level specification and begin turning them into quantifiable, actionable objectives. Remember the original strategy that drove these design goals; the design goals should now be used to create quantifiable objectives we can plan against and measure progress toward.

Wide open spaces

Wide open spaces - Wyoming

Okay, okay – I know I should keep this blog more up to date; I have just been a little busy with my day job. Now, after a much needed rest in the last weeks of August, I can share a few things you may find especially interesting and timely. It is no coincidence that the image accompanying this post is of wide open spaces. This is in fact where I spent the most satisfying part of my “summer vacation”. And spaces (tuple spaces) are what I intend to share with you in the coming weeks.

As architects we have a professional responsibility to remain on the lookout for new (and sometimes revisited) ideas about how to improve what we build, especially when our business needs to invest in key technology changes to remain competitive and deliver the distinctive quality of service and value customers will continue to seek.

I have been pretty busy over the last year, engaged in a variety of industries where road map development and execution of next-generation platforms and paradigm shifts were needed. Many of the more difficult challenges were solved by adopting the Space-Based Architecture (SBA) pattern. This is a demonstrated pattern for achieving near-linear scalability of stateful, high-performance applications using tuple spaces. This is not a new idea; the tuple space model was developed by David Gelernter over thirty years ago at Yale University, and implementations have since been developed for Smalltalk, Java (JavaSpaces), and the .NET framework. A tuple space is an implementation of the associative memory model for parallel (distributed) computing, providing a repository of tuples that can be accessed concurrently.

I know, this is a mouthful and a little too academic for me too. What it really means is that we can group processors that produce pieces of data and group processors that use the data: producers post their data as tuples in the space, and consumers then retrieve from the space the tuples that match a certain pattern. This is also known as the blackboard metaphor, and tuple spaces may be thought of as a form of distributed shared memory. The model is closely related to other patterns that have proved successful in addressing the application scalability challenge at Google and Amazon.com (EC2), for example, and it has been applied by many firms in the securities industry to implement scalable electronic securities trading applications.
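
To make the blackboard metaphor concrete, here is a minimal in-memory sketch in plain Java. It is deliberately simplified: a real tuple space (JavaSpaces, GigaSpaces) matches tuples associatively on their field values and adds leases, transactions, and distribution, while this toy just shows producers and consumers decoupled through a shared space.

import java.util.concurrent.LinkedBlockingQueue;

/**
 * Minimal in-memory sketch of the tuple-space "blackboard" idea:
 * producers write tuples into a shared space and consumers take them
 * out, with no direct knowledge of each other. A real tuple space would
 * match on the tuple's fields (e.g. its type) rather than queue order.
 */
public class TinySpace {
    record Tuple(String type, Object payload) {}

    private final LinkedBlockingQueue<Tuple> space = new LinkedBlockingQueue<>();

    void write(Tuple t) throws InterruptedException { space.put(t); }
    Tuple take() throws InterruptedException { return space.take(); } // blocks until a tuple arrives

    public static void main(String[] args) throws Exception {
        TinySpace space = new TinySpace();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 3; i++)
                    space.write(new Tuple("trade", "order-" + i));
            } catch (InterruptedException ignored) {}
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++)
                    System.out.println("processed " + space.take().payload());
            } catch (InterruptedException ignored) {}
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}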

Before you think I have gone daft on you, I recommend you look at a commercial implementation of this at Gigaspaces. Review the site and developer documentation and you will see how this platform embraces many of the principles of Representational State Transfer (REST), service-oriented architecture (SOA), and event-driven architecture (EDA), as well as elements of grid computing. The beauty of the space-based architecture lies in its combination of simplicity and power. Compared to other models for developing distributed applications, it offers a simpler design, savings in development and debugging effort, and more robust results that are easier to maintain and integrate.

The pattern combines and integrates distributed caching (the Data Grid), content-based distributed messaging (the Messaging Grid), and parallel processing (the Processing Grid) into a powerful service-oriented architecture built on shared spaces within a grid computing framework. Research results and commercial use have shown that a large number of problems in parallel and distributed computing can be solved with this architecture, and the implications of its adoption extend well beyond high-performance On-Line Transaction Processing into other uses, including Master Data Management, Complex Event Processing, and Rules Processing.

And this is what I intend to share with you in the coming weeks. 
Wide open spaces…