Big Data Analytics – Unlock Breakthrough Results: (Step 2)

This post is part of a larger series providing a detailed set of steps you can take to unlock breakthrough results in Big Data Analytics. The simple use case used to illustrate this method addresses the perplexing management challenge of platform and tool optimization. This step identifies the types and nature of the operating models used within the analytic community. I’m using a proven approach for solving platform and tool optimization in the same manner that proven practice suggests every analytic decision be made. Here we are simply using an organizing principle to group and categorize our findings in what can quickly become a bewildering experience (much like herding cats) in its complexity and nuance.

Recall the nine steps to take as summarized in a prior post.

1) Gather current state analytic portfolio, interview stakeholders, and compile findings.
2) Determine the Analytic Operating Models in use.
3) Refine Critical Analytic Capabilities as defined.
4) Weight Critical Analytic Capabilities according to each operating model.
5) Gather user profiles and simple population counts for each form of use.
6) Gather platform characteristics profiles.
7) Develop platform and tool signatures.
8) Gather data points and align with the findings.
9) Assemble a decision model for platform and tooling optimization.

Let’s start by examining the type and nature of the analytic operating models in use. Note that an organization of any size will most likely use two or more of these models for very good reasons; I have seen all of them employed at the same organization in my own practice. As we move on to the remaining steps, a keen understanding of the strategy, organization, technology footprint, and culture that drives each model’s adoption will become invaluable. First, let’s define our terms.

What is an operating model? 
Wikipedia defines an operating model as an abstract representation of how an organization operates across a range of domains in order to accomplish its function. An operating model breaks this system into components, showing how each works together. It helps us understand the whole. In our case we are going to focus on the analytic community and use this understanding to evaluate fit when making changes, so we can be sure the enabling models will still work after the recommended optimization. Thanks to Gartner, who published Critical Capabilities for Business Intelligence and Analytics Platforms this summer (12 May 2015, ID:G00270381), we have a reasonably good way to think about form and function across the different operating models, which Gartner refers to as baseline use cases:

– Centralized Provisioning
– Decentralized Analytics
– Governed Data Discovery
– OEM/Embedded Analytics

You may think what you will about Gartner, but I believe they have done a good job of grouping and characterizing the signatures around the four (4) operating models, using fifteen (15) critical analytic capabilities to further decompose the form and function found within each. At a summary level the capabilities are grouped as follows.

– Traditional Styles of Analysis
– Analytic Dashboards and Content
– IT-Developed Reports and Dashboards
– Platform Administration
– Metadata Management
– Business User Data Mashup
– Cloud Deployment
– Collaboration and Social Integration
– Customer Services
– Development and Integration
– Ease of Use
– Embedded Analytics
– Free Form Interactive Exploration
– Internal Platform Integration
– Mobile

Note: Detailed descriptions and characteristics of each of the fifteen (15) critical capabilities can be found in step three (3) where I will refine the Gartner definitions of Critical Analytic Capabilities to add additional context.

Why is this important?
Each of the four models has very different needs influenced by the strategy, footprint, and culture of the organization. Any optimization will have to recognize their differences and accommodate them to remain meaningful. A set of tools and platforms which are ideal for Centralized Provisioning are usually terrible and completely unsuited for use within a Decentralized Analytics operating model. Critical capability essential to Embedded Analytics is very different from Governed Data Discovery. Yes, there are some capabilities that cross operating models (e.g. metadata), and some that are far more important than others. In general this is a truly sound way to determine where your investment in capability should be occurring – and where it is not. Along the way you will surely stumble across very clever professionals who have solved for their own operating model limitations in ways that will surprise you. And some downright silliness; remember, culture plays a real and present role in this exercise. At a minimum I would think carefully about what you uncover across the following facets or dimensions.

  • Structure means drawing boundaries for each analytic community, defining the horizontal mechanisms that ensure coordination and scale, and evaluating the resource levels that reflect the roles of each. If form follows function, it should define the high-level organization chart. If you look carefully, the clues to help understand and classify each model are there. Note that some overlap and redundancy is expected between the models.
  • Accountability describes the roles and responsibilities of the organizational entities within each model and clarifies how organizational units come together to make effective cross-enterprise analytic decisions. This is where a lot of organizational friction can occur, resulting in undefined behaviors and unnecessary ambiguity.
  • Governance refers to the configuration and cadence for discussing and resolving issues of strategy, resource allocation (including talent), performance management and other matters under each model. Note the wide variety of skills and competencies needed under each model and the potential for a rapid proliferation of tools and methods.
  • Working describes how people collaborate across the seams that lie between different models. Behavior that’s consistent with intended values is critical to effective execution. Less well understood by many: you really can’t do effective predictive or prescriptive analytic work without the descriptive or diagnostic data sets usually prepared by others under what is typically a very different operating model.
  • Critical Capability can be determined by using the collection referred to above to balance people, process, and technology investment. The choice of operating models has implications for the type of talent and the platform and tool optimization required. This collection is a suggestion only (and a good one at that); in step three I will refine it further to illustrate how to extend and refine this set of capabilities.

Step Two – Determine the operating models in use
In this step we are going to gather a deep understanding of the characteristics within each operating model, where they differ, and what common components and critical capability are shared. If you read the Gartner reference, they consider metadata to be most heavily weighted in the Centralized Provisioning and Governed Data Discovery models. Based on my experience it is just as critical (and perhaps even more so) in the Decentralized model as well, especially in the Big Data world where tools like Alation, Adaptive, and Tamr are becoming essential to supporting discovery and self-service capability. The rest of this post will briefly describe the key characteristics for each operating model and their signature attributes, and highlight a few differences to help determine which operating models are employed.

Centralized Provisioning

The classic model used for years in delivery of what has been referred to as business intelligence. Typically we would find tight management controls to push through centralized strategy and efficiency, usually at a high cost. Tightly managed processes for collecting and cleaning data before consumption can be found in the classic patterns associated with Extract, Transform, and Load operations into a data warehouse or mart. This model is most often characterized by formal processes where a developer or specialist collects business requirements from the users and then creates sanctioned reports and dashboards for them on trusted data. Centralized provisioning enables an information consumer to access their Key Performance Indicators (KPIs) from an information portal — increasingly on a mobile device or embedded in an analytic application — to measure the performance of the business. Interactivity and discovery in centrally developed content is limited to what is designed in by the content author. Seven of the fifteen critical capabilities are most important in this model:

– IT-Developed Reports and Dashboards
– Traditional Styles of Analysis
– Platform Administration
– Development and Integration
– Metadata Management
– Ease of Use
– Customer Services

Decentralized Analytics

The opposite of centralized provisioning, this model (or loose confederation) encourages local optimization and entrepreneurial drive. Look for a community that rapidly and interactively explores trends or detects patterns in data sets, often from multiple sources, to identify opportunities or risks with minimal support from the IT development community. Interactivity and discovery in this model is NOT limited to what is designed in by the content authors we find in the Centralized Provisioning model; the users are the content authors. Users of platforms and tools that excel at the decentralized analytics model can explore data using highly interactive descriptive analytics (“what happened” or “what is happening”) or diagnostic analytics (“Why did something happen?”, “Where are areas of opportunity or risk?”, and “What if?”). Because of embedded advanced analytic functions offered by many vendors, users can extend their analysis to some advanced descriptive analysis (for example, clustering, segmenting, and correlations) and to a basic level of predictive analytics (for example, forecasting and trends); a minimal sketch of this progression follows the capability list below. They can also prepare their own data for analysis, reducing their reliance on IT and improving time to insight. As decentralized analytics becomes more pervasive, the risk of multiple sources of the truth grows and information governance itself becomes a real challenge. Six of the fifteen critical capabilities are most important in this model:

– Analytic Dashboards and Content
– Free Form Interactive Exploration
– Business User Data Mashup and Modeling
– Metadata Management
– Ease of Use
– Customer Services
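
To make this progression concrete, here is a minimal sketch in Python (pandas and NumPy) of the kind of self-service analysis described above, moving from descriptive to diagnostic to a naive trend forecast. The file and column names (monthly_sales.csv, region, sales, promo_spend) are hypothetical, used only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical self-service data set: monthly sales by region.
df = pd.read_csv("monthly_sales.csv", parse_dates=["month"])

# Descriptive ("what happened"): summary statistics per region.
print(df.groupby("region")["sales"].describe())

# Diagnostic ("why did it happen"): correlate sales with promo spend.
print(df[["sales", "promo_spend"]].corr())

# Basic predictive: fit a linear trend and project the next period.
t = np.arange(len(df))
slope, intercept = np.polyfit(t, df["sales"], deg=1)
print(f"Naive trend forecast for next period: {slope * len(df) + intercept:,.0f}")
```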

Governed Data Discovery

A hybrid of centralized and decentralized, this model is best characterized by offering freedom within a framework to enhance transparency and effectiveness. This model features business users’ ability to prepare and combine data, and to explore and interact visually with this data, enabling discovery to be deployed and managed across the enterprise. With the success of data discovery tools in driving business value, there is an increasing demand to use data discovery capabilities for a broader range of analysis and an expanded set of users than previously addressed by traditional reporting and dashboards. Governed data discovery enables users to access, blend and prepare data, then visually explore, find and share patterns with minimal IT support using their own technical and statistical skills. At the same time, this model must also satisfy enterprise requirements for business-user-generated model standards, data reuse and governance. In particular, users should be able to reuse sanctioned and approved business-user-created data or data sets, derived relationships, derived business models, derived KPIs, and metrics that support analyses.

Governed data discovery can enable pervasive deployment of data discovery in the enterprise at scale without uncontrolled tool sprawl. The expanded adoption of data discovery also requires analytic leaders to redesign analytics deployment models and practices, moving from an IT-centric to an agile and decentralized, yet governed and managed, approach. This would include putting in place a prototype, pilot and production process in which user-generated content is created as a prototype. Some of these prototypes will be used in recurring analysis and promoted to a pilot phase. Successful pilots are promoted to production and operationalized for regular analysis as part of the system of record. Each step provides more rigor and structure in governance and Quality Assurance testing. Business user data mashup and modeling, administration, and metadata capabilities should be evaluated against the following characteristics, which differentiate a Governed model from the Decentralized Analytics model discussed earlier. Pursuing the following questions will help define the differences (a simple scoring sketch follows the list).

– Where are permissions enabled on business models?
– Who can access shared data connections and data sets?
– Who can create and publish data sets?
– Who can access shared user work spaces to publish visualizations?
– Is there shared metadata about usage, connections and queries?
– Are usage, connections and queries monitored?
– Is there an information catalog available to enable discovery?
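
A minimal sketch, assuming the questions above are answered as simple yes/no observations during fieldwork, of how the answers might be scored to separate the two models; the thresholds are arbitrary placeholders, not part of the Gartner work.

```python
# Checklist derived from the questions above; True means governed-model
# behavior was observed at the site.
GOVERNED_CHECKS = [
    "permissions enabled on business models",
    "shared data connections and data sets are access-controlled",
    "data set creation and publication restricted to known roles",
    "shared work spaces for publishing visualizations are managed",
    "shared metadata about usage, connections and queries",
    "usage, connections and queries are monitored",
    "information catalog available to enable discovery",
]

def classify(observations: dict[str, bool]) -> str:
    """Rough call: mostly True looks governed, mostly False decentralized."""
    ratio = sum(observations.get(c, False) for c in GOVERNED_CHECKS) / len(GOVERNED_CHECKS)
    if ratio >= 0.7:
        return "Governed Data Discovery"
    if ratio >= 0.3:
        return "Mixed / transitional"
    return "Decentralized Analytics"

print(classify({"usage, connections and queries are monitored": True}))
```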

Eight of the fifteen critical capabilities are most important in this model:
– Analytic Dashboards and Content
– Free Form Interactive Exploration
– Business User Data Mashup and Modeling
– Internal Platform Integration
– Platform Administration
– Metadata Management
– Ease of Use
– Customer Services

Embedded Analytics

In this model analytics (decisions, business rules, and processes) are integrated into the organization to capture economies of scale and consistency across planning, operations, and customer experience. It is most typically found where developers are using software development kits (SDKs) and related APIs to include advanced analytics and statistical functions within application products. These capabilities are used to create and modify analytic content, visualizations and applications and embed them into a business process, application or portal. Analytic functions can reside outside the application, reusing the infrastructure, but should be easily and seamlessly accessible from inside the application, without forcing users to switch between systems. The ability to integrate analytics with the application architecture will enable the analytic community to choose where in the business process the analytics should be embedded. One example of a critical capability for embedding advanced analytics would be consuming a SAS/R or PMML model to create advanced models embedded in dashboards, reports or data discovery views (a scoring sketch follows the capability list below). Six of the fifteen critical capabilities are most important in this model:

– Embedded Analytics (includes both developer and embedded advanced analytics)
– Cloud Deployment
– Development and Integration
– Mobile
– Ease of Use
– Customer Services
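
For the PMML case mentioned above, here is a minimal scoring sketch assuming the third-party pypmml package and a hypothetical model.pmml exported from SAS or R; the field names are invented, so verify the API against the pypmml documentation before relying on it.

```python
from pypmml import Model  # assumes: pip install pypmml (requires a JVM)

# Load a model exported to PMML from SAS, R, or another training tool.
model = Model.fromFile("model.pmml")

# Score a single record; the input field names are hypothetical and must
# match the data dictionary inside the PMML document.
print(model.predict({"tenure_months": 14, "monthly_spend": 72.50}))
```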

Putting It All Together
Believing form really does follow function, it should be clear after this step which operating models are driving the platforms and tools that are enabling (or inhibiting) effective performance. Using the Gartner work and the refinements I have extended it with, we can now see at a glance which core capabilities are most important to each model, as illustrated in the following diagram. This will become a key input when assembling the decision model and discovering platform and tooling optimization in the later steps.

Now that this step is completed, it is time to turn our attention to further refining the critical analytic capabilities as defined and begin weighting each according to its relative importance to each operating model. It will become increasingly clear why certain critical capabilities essential to one model are less important to another when this task is completed.
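
As a preview of that weighting exercise and the step-nine decision model, here is a minimal sketch in Python (pandas) of a capability-by-model weight matrix scoring a single tool. The weights and the tool's capability ratings are illustrative placeholders only; the method derives the real values in steps 4 through 8.

```python
import pandas as pd

# Illustrative weights (0-5) for a few capabilities per operating model.
weights = pd.DataFrame({
    "Centralized Provisioning": {"Metadata Management": 5, "Platform Administration": 5,
                                 "Free Form Interactive Exploration": 1},
    "Decentralized Analytics":  {"Metadata Management": 4, "Platform Administration": 1,
                                 "Free Form Interactive Exploration": 5},
    "Governed Data Discovery":  {"Metadata Management": 5, "Platform Administration": 4,
                                 "Free Form Interactive Exploration": 4},
})

# Hypothetical capability ratings (0-5) for one tool, gathered in steps 6-7.
tool = pd.Series({"Metadata Management": 3, "Platform Administration": 5,
                  "Free Form Interactive Exploration": 2})

# Weighted fit of this tool under each operating model.
fit = weights.mul(tool, axis=0).sum() / weights.sum()
print(fit.sort_values(ascending=False))
```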

If you enjoyed this post, please share it with anyone who may benefit from reading it. And don’t forget to click the follow button to be sure you don’t miss future posts. I’m planning on compiling all the materials and tools used in this series in one place, but I’m still unsure what form and content would be best for your professional use. Please take a few minutes and let me know what form and format you would find most valuable.

Suggested content for premium subscribers: 
Big Data Analytics - Unlock Breakthrough Results: Step Two (2) 
Operating Model Mind Map (for use with Mind Jet - see https://www.mindjet.com/ for more)
Analytic Core Capability Mind Map
Enterprise Analytics Mind Map 
Analytics Critical Capability Workbooks
Analytics Critical Capability Glossary, detailed descriptions, and cross-reference
Logical Data Model (XMI - use with your favorite tool)
Reference Library with Supporting Documents

Big Data Analytics – Unlock Breakthrough Results: (Step 1)

You’ve made the big data investment. You believe Nucleus Research when it says that an investment in analytics returns a whopping thirteen (13) dollars for every one (1) dollar spent. Now it’s time to realize value. This series of posts provides a detailed set of steps you can take to unlock this value in a number of ways. As a simple use case I’m going to address the perplexing management challenge of platform and tool optimization across the analytic community as an example to illustrate each step. This post addresses the first of nine practical steps to take. Although lengthy, please stick with me; I think you will find this valuable. I’m going to use a proven approach for solving platform and tool optimization in the same manner that proven practice suggests every analytic decision be made. In this case I will leverage the CRISP-DM method (there are others I have used, like SEMMA from SAS) to put business understanding front and center at the beginning of this example.

Yes, I will be eating my own dog food now (this is why a cute puppy is included in a technical post and not the Hadoop elephant) and getting a real taste of what proven practice should look like across the analytic community.  Recall the nine steps to take summarized in a prior post.

1) Gather current state analytics portfolio, interview stakeholders, and compile findings.
2) Determine the analytic operating models in use.
3) Refine Critical Analytic Capabilities as defined to meet site specific needs.
4) Weight Critical Analytic Capability according to each operating model in use.
5) Gather user profiles and simple population counts for each form of use.
6) Gather platform characteristics profiles.
7) Develop platform and tool signatures.
8) Gather data points and align with the findings.
9) Assemble findings and prepare a decision model for platform and tooling optimization.

Each of the nine steps maps to a phase of the CRISP-DM method as illustrated in the following diagram.

CRISP_StepAlignment

Note there is some overlap between understanding the business and the data. The models we will be preparing will use a combination of working papers, logical models, databases, and the Decision Model and Notation (DMN) standard from the OMG to wrap everything together. In this example the output product is less about deploying or embedding an analytic decision and more about taking action based on the results of this work.

Step One – Gather Current State Portfolio
In this first step we are going to gather a deep understanding of what exists already within the enterprise and learn how the work effort is organized. Each examination should include at a minimum:

  • Organization (including its primary and supporting processes)
  • Significant Data Sources
  • Analytic Environments
  • Analytic Tools
  • Underlying technologies in use

The goal is to gather the current state analytics portfolio, interview stakeholders, and document our findings. In brief, this will become an integral part of the working papers we can build on in the steps to follow.  This is an important piece of the puzzle we are solving for. Do not even think about proceeding until this is complete. Note the following diagram (click to enlarge) illustrates the dependencies between accomplishing this field work and each component of the solution.

UMLDependencyDiagram

Unlocking Breakthrough Results – Dependency Diagram

Organization
If form follows function, this is where we begin to uncover the underlying analytic processes and how the business is organized. Understanding the business by evaluating the organization will provide invaluable clues to uncover what operating models are in use.  For example, if there is a business unit organized outside of IT and reporting to the business stakeholder, you will most likely have a decentralized analytics model in addition to the centralized provisioning most analytic communities already have in place.

Start with the organization charts, but do not stop there. I recommend you get a little closer to reality in the interview process to really understand what is occurring in the community. Examining the underlying processes will make this clear. For example, what is the analytic community really doing? Do they use a standard method (CRISP-DM) or something else? An effective way to uncover this beyond the simple organization charts (which are never up-to-date and notorious for mislabeling what people are actually doing) is using a generally accepted model (like CRISP-DM) to organize the stakeholder interviews. This means we can truly understand what is typically performed by whom, using what processes to accomplish their work, and where boundary conditions exist or, in the worst case, are undefined. An example is in order. Using the CRISP-DM model we see there are a couple of clear activities that typically occur across all analytic communities. This set of processes is summarized in the following diagram (click to enlarge).

CRISP_DM_MindMap

Gathering the analytic inventory and organizing the interviews now becomes an exercise in knowing what to look for using this process model. For example, diving a little deeper we can now explore how modeling is performed during our interviews, guided by a generally accepted method. We can structure questions around how, by whom, and what is performed for each expected process or supporting activity. Following up on this line of questioning should normally lead to samples of the significant assets which are collected and managed within an analytic inventory. Let’s just start with the modeling effort and a few directed questions (a sketch of a simple inventory record follows the list).

  • Which organization is responsible for the design, development, testing, and deployment of the models?
  • How do you select which modeling techniques to use? Where are the assumptions used captured?
  • How do you build the models?
  • Where do I find the following information about each model?
    •     Parameter, Variable Pooling Settings
    •     Model Descriptions
    •     Objectives
    •     Authoritative Knowledge Sources Used
    •     Business rules
    •     Anticipated processes used
    •     Expected Events
    •     Information Sources
    •     Data sets used
    •     Description of any Implementation Components needed
    •     A Summary of Organizations Impacted
    •     Description of any Analytic Insight and Effort needed
  • Are anticipated reporting requirements identified?
  • How is model testing designed and performed?
  • Is a regular assessment of the model performed to recognize decay?
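
Here is the inventory-record sketch promised above: a minimal Python dataclass whose fields mirror the interview questions, offered as a starting point rather than a finished schema; extend or rename fields to suit the site.

```python
from dataclasses import dataclass, field

@dataclass
class ModelInventoryRecord:
    """One entry in an analytic model inventory; fields mirror the questions above."""
    name: str
    owner_org: str              # design, development, testing, deployment responsibility
    objectives: str
    technique: str              # modeling technique chosen, and why
    assumptions: list[str] = field(default_factory=list)
    knowledge_sources: list[str] = field(default_factory=list)
    business_rules: list[str] = field(default_factory=list)
    data_sets: list[str] = field(default_factory=list)
    reporting_requirements: list[str] = field(default_factory=list)
    decay_review_cadence: str = "unscheduled"   # how often model decay is assessed

# Hypothetical example entry.
record = ModelInventoryRecord(
    name="churn_model_v3", owner_org="Customer Analytics",
    objectives="Predict 90-day churn", technique="gradient boosting",
    data_sets=["crm.accounts", "billing.invoices"])
```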

When you hear the uncomfortable silence and eyes point to the floor, you have just uncovered one meaningful challenge. Most organizations I have consulted with DO NOT have an analytic inventory, much less the metadata repository (or even a simple information catalog) I would expect to support a consistent, repeatable process. This is a key finding for another kind of work effort that is outside the scope of this discussion. All we are doing here is trying to understand what is being used to produce and deploy information products within the analytic community. And is form really following function as the organization charts have tried to depict? Really?

An important note: we are not in a process improvement effort; not just yet. Our objective is focused on platform and tool optimization across the analytic community. Believing form really does follow function, it should be clear after this step what platforms and tools are enabling (or inhibiting) an effective response to this important and urgent problem across the organization.

Significant Data Sources
The next activity in this step is to gain a deeper understanding of what data is needed to meet the new demands and business opportunities made possible with big data. Let’s begin with understanding how the raw materials or data stores can be categorized. Data may come from any number of sources, including one or more of the following:

  • Structured data (from tables, records)
  • Demographic data
  • Time series data
  • Web log data
  • Geospatial data
  • Clickstream data from websites
  • Real-time event data
  • Internal text data (e.g. e-mails, call center notes, claims)
  • External social media text data

If you are lucky there will be an enterprise data model or someone in enterprise architecture who can point to the major data sources and where the system of record resides. These are most likely organized by subject area (Customer, Account, Location, etc.) and almost always include schema-on-write structures. Although the focus is big data, it is still important to recognize that the vast majority of data collected originates in transactional systems (e.g. Point of Sale). Look for curated data sets and information catalogs (better yet, an up-to-date metadata repository like Adaptive or Alation) to accelerate this task if present.

Data in and of itself is not very useful until it is converted or processed into information. So here is a practical way to think about how this is viewed or characterized in general. The flow of information across applications and the analytic community from sources external to the organization can take on many forms. Major data sources can be grouped into three (3) major categories:

  • Structured Information
  • Semi-Structured Information
  • Unstructured Information

While modeling techniques for structured information have been around for some time, semi-structured and unstructured information formats are growing in importance. Unstructured data presents a more challenging effort. Many believe up to 80% of the information in a typical organization is unstructured, so this must be an important area of focus as part of an overall information management strategy. It is an area, however, where the accepted best practices are not nearly as well-defined. Data standards provide an important mechanism for structuring information. Controlled vocabularies are also helpful (if available) to focus on the use of standards to reduce complexity and improve reusability. When we get to modeling platform characteristics and signatures in the later steps, the output of this work will become increasingly valuable.
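
A minimal sketch of how an inventoried source list might be tagged with these three groups during fieldwork; the source-to-category mapping is my own illustrative assumption, not a standard.

```python
from collections import Counter

# Illustrative mapping from the source types listed earlier to the
# three major categories.
CATEGORY_BY_SOURCE = {
    "tables/records": "structured",
    "demographic": "structured",
    "time series": "structured",
    "geospatial": "semi-structured",
    "web logs": "semi-structured",
    "clickstream": "semi-structured",
    "real-time events": "semi-structured",
    "internal text": "unstructured",
    "social media text": "unstructured",
}

def profile(inventory: list[str]) -> Counter:
    """Count sources per category to see where modeling effort will land."""
    return Counter(CATEGORY_BY_SOURCE.get(s, "unknown") for s in inventory)

print(profile(["clickstream", "internal text", "tables/records", "web logs"]))
```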

Analytic Landscape
I have grouped the analytic environments, tools, and underlying technologies together in this step because they are usually the easiest data points to gather and compile.

  • Environments
    Environments are usually described as platforms and can take several different forms. For example, you can group these according to intended use as follows:
    – Enterprise Data Warehouse
    – Specialized Data Marts
    – Hadoop (Big Data)
    – Operational Data Stores
    – Special Purpose Appliances (MPP)
    – Online Analytical Processor (OLAP)
    – Data Visualization and Discovery
    – Data Science (Advanced Platforms such as the SAS Data Grid)
    – NLP and Text Engineering
    – Desktop (Individual Contributor; yes think how pervasive Excel and Access are)
  • Analytic Tools
    Gathering and compiling tools is a little more interesting. There is such a wide variety of tools designed to meet several different needs, and significant overlap in delivered functions exists among them. One way to approach this is to group by intended use. Try using the INFORMS taxonomy, for example, to group the analytic tools you find. Their work identified three hierarchical but sometimes overlapping groupings for analytics categories: descriptive, predictive, and prescriptive analytics. These three groups are hierarchical and can be viewed in terms of the level of analytics maturity of the organization. Recognize there are three types of data analysis:

    • Descriptive (some have split Diagnostic into its own category)
    • Predictive (forecasting)
    • Prescriptive (optimization and simulation)

This simple classification scheme can be extended to include lower level nodes and improved granularity if needed. The following diagram illustrates a graphical depiction of the simple taxonomy developed by INFORMS and widely adopted by most industry leaders as well as academic institutions.

INFORMS_Taxonomy

Source: INFORMS (Institute for Operations Research and Management Science)

Even though these three groupings of analytics are hierarchical in complexity and sophistication, moving from one to another is not clearly separable. That is, the analytics community may be using tools to support descriptive analytics (e.g. dashboards, standard reporting) while at the same time using other tools for predictive and even prescriptive analytics capability in a somewhat piecemeal fashion. And don’t forget to include the supporting tools which may include metadata functions, modeling notation, and collaborative workspaces for use within the analytic community.
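
If it helps to make the taxonomy extensible, here is a minimal sketch representing it as a nested structure so lower-level nodes can be added as granularity is needed; the leaf techniques are my own examples, not part of the INFORMS material.

```python
# INFORMS-style taxonomy as a nested dict; extend nodes as needed.
TAXONOMY = {
    "descriptive": {
        "reporting": ["standard reports", "dashboards"],
        "diagnostic": ["drill-down", "correlation analysis"],  # often split out
    },
    "predictive": {
        "forecasting": ["time series", "regression"],
        "classification": ["scoring", "segmentation"],
    },
    "prescriptive": {
        "optimization": ["linear programming"],
        "simulation": ["monte carlo", "what-if"],
    },
}

def leaves(node) -> list[str]:
    """Flatten a taxonomy branch to its leaf techniques for tool tagging."""
    if isinstance(node, list):
        return node
    return [leaf for child in node.values() for leaf in leaves(child)]

print(leaves(TAXONOMY["predictive"]))  # ['time series', 'regression', 'scoring', 'segmentation']
```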

  • Underlying technologies in use
    Technologies in use can be described and grouped as follows (this is just a simple example and is not intended to be an exhaustive compilation).

    • Relational Databases
    • MPP Databases
    • NoSQL databases
      • Key-value stores
      • Document store
      • Graph
      • Object database
      • Tabular
      • Tuple store, Triple/quad store (RDF) database
      • Multi-Value
      • Multi-model database
    • Semi and Unstructured Data Handlers
    • ETL or ELT Tools
    • Data Synchronization
    • Data Integration – Access and Delivery

Putting It All Together
Now that we have compiled the important information needed, where do we put this for the later stages of the work effort? In an organization of any size this can be quite a challenge, just due to the sheer size and number of critical facets we will need later, the number of data points, and the need to re-purpose and leverage this in a number of views and perspectives.

Here is what has worked for me. First use a mind or concept map (Mind Jet for example) to organize and store URIs to the underlying assets. Structure, flexibility, and the ability to export and consume data from a wide variety of sources are a real plus. The following diagram illustrates an example template I use to organize an effort like this. Note the icons (notepad, paperclip, and MS-Office): even at this high level they point to a wide variety of content gathered and compiled in the fieldwork (including interview notes and observations).

EA_MindMap

Enterprise Analytics – Mind Map Example

For larger organizations without an existing Project Portfolio Management (PPM) tool or metadata repository that supports customizations (extensions, flexible data structures) it is sometimes best to augment the maps with a logical and physical database populated with the values already collected and organized in specific nodes of the map.  A partial fragment of a logical model would look something like this, where some sample values are captured in the yellow notes.

Logical

Logical Model Fragment

Armed with the current state analytics landscape (processes and portfolio), stakeholders’ contributions, and the findings compiled, we are now ready to move on to the real work at hand. In step (2) we will use this information to determine the analytics operating models in use, supported by the facts.

Suggested content for premium subscribers: 
Big Data Analytics - Unlock Breakthrough Results:(Step 1) 
CRISP-DM Mind Map (for use with Mind Jet, see https://www.mindjet.com/ for more)
UML for dependency diagrams.  Use with yUML (see http://yuml.me/)
Enterprise Analytics Mind Map (for use with Mind Jet)
Logical Data Model (DDL; use with your favorite tool)
Analytics Taxonomy, Glossary (MS-Office)
Reference Library with Supporting Documents

Big Data Analytics – Nine Easy Steps to Unlock Breakthrough Results

An earlier post addressed one of the more perplexing challenges in managing an analytic community of any size: the irresistible urge to cling to what everyone else seems to be doing without thinking carefully about what is needed, not just wanted. This has become more important and urgent with the breath-taking speed of Big Data adoption in the analytic community. Older management styles and obsolete thinking have created needless friction between the business and their supporting IT organizations. Unlocking breakthrough results requires a deep understanding of why this friction is occurring and what can be done to reduce this unnecessary effort so everyone can get back to the work at hand.

There are two very real and conflicting views that we need to balance carefully. The first, driven by the business, is concerned with just getting the job done and lends itself to an environment where tools (and even methods) proliferate rapidly. In most cases this results in overlapping, redundant, and expensive functionality. Less concerned with solving problems once, the analytic community is characterized by many independent efforts where significant intellectual property (analytic insight) is not captured and is inadvertently placed at risk.

The second view, in contrast, is driven by the supporting IT organization, charged with managing and delivering supporting services across a technology portfolio that values efficiency and effectiveness. The ruthless pursuit of eliminating redundancy, leveraging the benefits of standardization, and optimizing investment drives this behavior. So this is where the friction is introduced. Until you understand this dynamic, be prepared to experience organizational behavior that seems puzzling and downright silly at times. Questions like these (yes, they are real) seem never to be resolved.

– Why do we need another data visualization tool when we already have five in the portfolio?
– Why can’t we just settle on one NoSQL alternative?
– Is the data lake really a place to worry about data redundancy?
– Should we use the same Data Quality tools and principles in our Big Data environment?

What to Do
So I’m going to share a method to help resolve this challenge and focus on what is important, so you can expend your nervous system solving problems rather than creating them. Armed with a true understanding of the organizational dynamics, it is now a good time to revisit the first principle that form follows function to help resolve and mitigate what is an important and urgent problem. For more on this important principle see Big Data Analytics and Cheap Suits.

This method knits together several key components and tools to craft an approach that you may find useful. The goal is to organize and focus the right resources to ensure successful Big Data Analytic programs meet expectations. Because of the amount of content delivered, I will break this down into several posts, each building on the other, to keep the relative size and readability manageable. This approach seemed to work with the earlier series on Modeling the MDM Blueprint and How to Build a Roadmap, so I think I will stick to this method for now.

The Method
First let’s see what the approach looks like independent of any specific tools or methods. This includes nine (9) steps which can be performed concurrently by both business and technical professionals working together to arrive at the suggested platform and tooling optimization decisions. Each of the nine (9) steps in this method will be accompanied by a suggested tool or method to help you prepare your findings in a meaningful way. Most of these tools are quite simple; some will be a little more sophisticated. This represents a starting point on your journey and can be extended in any number of ways to create more refined uses to re-purpose the data and facts collected in this effort. The important point is all steps are designed to organize and focus the right resources to ensure successful Big Data Analytic programs meet expectations. Executed properly you will find a seemingly effortless way to help:

– Reduce unnecessary effort
– Capture, manage, and operationally use analytic insight
– Uncover inefficient tools and processes and take action to remedy them
– Tie results directly to business goals constrained by scope and objectives

So presented here is a simplified method to follow to compile an important body of work, supported by facts and data to arrive at any number of important decisions in your analytics portfolio.

1) Gather current state analytic portfolio, interview stakeholders, and document findings
2) Determine the analytic operating model in use (will have more than one, most likely)
3) Refine Critical Analytic Capabilities as defined to meet site specific needs
4) Weight Critical Analytic Capability according to each operating model in use
5) Gather user profiles and simple population counts for each form of use
6) Gather platform characteristics profiles
7) Develop platform and tool signatures
8) Gather data points and align with the findings
9) Assemble findings and prepare a decision model for platform and tooling optimization

The following diagram illustrates the method graphically (click to enlarge).

MethodSummary

In a follow-up post I will dive into each step starting with gathering current state analytic portfolio, interviewing stakeholders, and documenting your findings.  Along the way I will provide examples and tools you can use to help make your decisions and unlock breakthrough results. Stay tuned…

Big Data Analytics and Cheap Suits

Sometimes I just want to staple my head to the carpet and wonder how to help others manage the seemingly irresistible urge to cling to what everyone else seems to be doing without thinking carefully about what is needed, not just wanted. I will be discussing a topic I have been buried in for the last couple of years: the Big Data Analytics space, which most everyone by now is familiar with. The technology is sound, evolving quickly, and solves for problems I could not imagine attacking a decade ago. On the other hand, the breath-taking speed of this platform’s adoption has left many scratching their heads and wondering why the old familiar rules of thumb and proven practice just don’t seem to work well anymore. Outdated management styles and obsolete thinking have created needless friction between the business and their supporting IT organizations. This never ends well, but it does keep me very busy.

First let’s put this challenge in perspective with a little context.  Over my career there have been a number of times when the need for efficient, cost effective data analysis has forced a change in existing technologies. The move to a relational model occurred when older methods to reliably handle changes to structured data led to the shift toward a data storage paradigm that was modeled on relational algebra. This created a fundamental shift in data handling, introducing a variety of tools and techniques that made all of our lives more rewarding. The current revolution in technology referred to as Big Data has happened because the relational data model can no longer efficiently handle the current needs for analysis of large and unstructured data sets. It is not just that data is bigger than before, or any of the other Vs (Variety, Volume, Velocity, Veracity, and Volatility) others have written about.  All of these data characteristics have been steadily growing for decades. The Big Data revolution is really a fundamental shift in architecture, just as the shift to the relational model was a shift that changed all of us. This shift means building new capabilities, adopting new tools, and thinking clearly about solving the right problems with the right tools the right way.  This means we need to truly understand what critical analytic capability is needed and make a focused investment in time and energy to realize this opportunity.  This should sound familiar to any of you working in this space. Many are already answering some of the obvious questions we should address at a minimum.

– When do we use a big data platform as opposed to the other platforms available?
– What are the platform drivers or key characteristics beyond storage and advanced analytics?
– Is low latency, real time application access required?
– How about availability and consistency requirements (see the CAP theorem for more on this)?
– Workload characteristics – consistent flows or spikes?
– What is the shape of the data (e.g. structured, unstructured, and streaming)?
– Is there a need to integrate with existing data warehouse or other analytic platforms?
– How will the data be accessed by the analytic community and supporting applications?

Note that last question carefully; this is where the fun starts.

Why? There are two very real and conflicting views that we need to balance carefully.

The first, driven by the business, is concerned with just getting the job done and lends itself to an environment where tools (and even methods) proliferate rapidly. In most cases this results in overlapping, redundant, and expensive functionality. Less concerned with solving problems once, the analytic community is characterized by many independent efforts where significant intellectual property (analytic insight) is not captured and is likely put at risk. And it is not even re-used across the organization by others solving the same question. There are very good reasons for this; it is completely understandable when the end justifies the means, and getting to the end game is the rewarded behavior. Like a cheap suit, the analytic community simply doesn’t believe one size fits all. And I agree.

The second view, in contrast, is driven by the supporting IT organization, charged with managing and delivering supporting services across a technology portfolio that values efficiency and effectiveness. The ruthless pursuit of eliminating redundancy, leveraging the benefits of standardization, and optimizing investment drives this behavior. I think it is easy to see where the means becomes the critical behavioral driver and the end is just assumed to resolve itself. Cheap suits are designed to be mass-produced, use standard materials, and provide just enough (and no more) detail to get by with the average consumer (if there really is such a thing). Is there really an average analytic consumer? No, there is not (see the user profile tool in the next post for more). And I do agree with this view as well; there are very sound reasons why this view remains valid.

So this is where the friction is introduced. Until you understand this dynamic get ready for endless meetings, repeated discussions about capability (and what it means), and organizational behavior that seems puzzling and downright silly at times.  Questions like these (yes these are real) seem to never be resolved.

– Why do we need another data visualization tool when we already have five in the portfolio?
– Why can’t we just settle on one NoSQL alternative?
– Is the data lake really a place to worry about data redundancy?
– Should we use the same Data Quality tools and principles in our Big Data environment?

What to Do

So I’m going to share a method to help resolve this challenge and help focus on what is important so you can expend your nervous system solving problems rather than creating them. Armed with a true understanding of the organizational dynamics, it is now a good time to revisit a first principle to help resolve what is an important and urgent problem.

First Principle: Form follows function.

The American architect Louis Sullivan coined the phrase, saying “It is the pervading law of all things organic and inorganic, of all things physical and metaphysical, of all things human and all things superhuman, of all true manifestations of the head, of the heart, of the soul, that the life is recognizable in its expression, that form ever follows function. This is the law”. And this has since become known by its more familiar phrase, “form follows function”.

It is truly interesting that Sullivan developed the shape of the tall steel skyscraper in late 19th Century Chicago at the very moment when technology, taste and economic forces converged and made it necessary to drop the established styles of the past. If the shape of the building was not going to be chosen out of the old pattern book something had to determine form, and according to Sullivan it was going to be the purpose of the building. It was “form follows function”, as opposed to “form follows precedent”. Sullivan’s assistant Frank Lloyd Wright adopted and professed the same principle in slightly different form perhaps because shaking off the old styles gave them more freedom and latitude.

Sound familiar? It should, for any of us actively adopting this technology. This is where the challenge of using tried and true proven practice meets the reality of shaking off the old styles and innovating where and when it is needed in a meaningful, controlled, and measured manner.

So if form follows function, let’s see what makes sense. Thanks to Gartner, who published Critical Capabilities for Business Intelligence and Analytics Platforms this summer (12 May 2015, ID:G00270381), we have a reasonably good way to think about form and function. You may think what you will about Gartner, but I believe they have done a good job of grouping and characterizing fifteen (15) critical capabilities for analytics across four (4) different operating models (Gartner referred to them as baseline use cases) as follows.

– Centralized Provisioning
– Decentralized Analytics
– Governed Data Discovery
– OEM/Embedded Analytics

In this case capabilities are defined as “the ability to perform or achieve certain actions or outcomes through a set of controllable and measurable faculties, features, functions, processes, or services”. They grouped the capabilities in question into fifteen (15) major categories:

– Analytic Dashboards and Content
– Platform Administration
– Business User Data Mashup
– Cloud Deployment
– Collaboration and Social Integration
– Customer Services
– Development and Integration
– Ease of Use
– Embedded Analytics
– Free Form Interactive Exploration
– Internal Platform Integration
– IT-Developed Reports and Dashboards
– Metadata Management
– Mobile
– Traditional Styles of Analysis

Note there may be more than one operating model or baseline use case delivery scenario in use at your organization. I just completed an engagement where three of the four operating models are in use. This is exactly where the friction and confusion are created between IT Management and the Analytic Community. Not every problem represents a nail for which a hammer is useful. A set of tools and platforms which are ideal for Centralized Provisioning are usually terrible and completely unsuited for use within a Decentralized Analytics operating model. Critical capability essential to Embedded Analytics is very different from Governed Data Discovery. Yes, there are some essentials that cross operating models (e.g. metadata), and in general this is a truly sound way to determine where your investment in capability should be occurring – and where it is not. In short, form follows function. This is extremely helpful in establishing a common vocabulary where all stakeholders can understand the essentials when making analytic portfolio investments or simply selecting the right tool for the right job.

In a follow-up post I will provide an example and some simple tools you can use to help make these decisions. And remain committed to delivering value. After all, there is another principle we should always remember: analysis for analysis’ sake is just plain ridiculous. Or as Tom Davenport said, “…If we can’t turn that data into better decision making through quantitative analysis, we are both wasting data and probably creating suboptimal performance”.

Stay tuned…

How to build a Roadmap – Publish

This post represents the last of the Road Map series I have shared with over 60,000 readers since it was introduced in March 2011 at this humble little site alone. I never would have thought this subject would attract so much interest and help so many over the last three years. Quite frankly I’m astonished at the interest, and of course grateful for all the kind words and thoughts so many have shared with me.

The original intent was to share a time-tested method to develop, refine, and deliver a professional roadmap producing consistent and repeatable results. This should be true no matter how deep, wide, or narrow the scope and subject area we are working with. I think I have succeeded in describing the overall patterns employed. The only regret I have is not having enough time and patience within the constraints of this medium to dive deeper into some of the more complex and trickier aspects of the delivery techniques. I remain pleased with the results given the little time I have had to share with all of you, and sincerely hope that what has worked for me with great success over the years may help you make your next roadmap better.

This method works well across most transformation programs. As I noted earlier, most will struggle to find this in textbooks, classrooms, or your local bookstore (I have looked; maybe not hard enough). The method is based loosely on the SEI IDEAL model used to guide development of long-range integrated planning for managing software process improvement programs. Although the SEI focus is software process improvement, the same overall pattern can be applied easily to other subject areas like organization dynamics and strategy planning in general.

The Overall Pattern
At the risk of over-simplifying things, recall the overall pattern most roadmaps should follow is illustrated in the following diagram.

RoadMap_Pattern

This may look overwhelming at first, but it represents a complete and balanced approach to understanding the implications of each action undertaken across the enterprise as part of a larger program. You can argue (and I would agree) that this may not be needed for simple or relatively straightforward projects. What is more likely in this case is that the project or activity we are discussing represents a piece or component of a much larger initiative and will most certainly not need its own roadmap at all. This post is focused on the bigger picture: a collection of projects and activities gathered together to organize and guide multiple efforts in a clear, directed manner.

Earlier posts in this series (How to Build a Roadmap) summarized the specific steps required to develop a well thought out road map. The method identified specific actions using an overall pattern all roadmaps should follow. The following steps (and related links to other posts) are required to complete this work:

  1. Develop a clear and unambiguous understanding of the current state
  2. Define the desired end state
  3. Conduct a Gap Analysis exercise
  4. Prioritize the findings from the Gap Analysis exercise into a series of gap closure strategies
  5. Discover the optimum sequence of actions, recognizing predecessor – successor relationships (a minimal sequencing sketch follows this list)
  6. Develop and Publish the Road Map
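
For step 5, here is the promised sequencing sketch: Python’s standard-library graphlib resolving predecessor – successor relationships into a valid order. The initiatives and dependencies are invented for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical gap-closure initiatives mapped to their predecessors.
dependencies = {
    "complete gap analysis": set(),
    "select catalog tool": {"complete gap analysis"},
    "stand up governance board": {"complete gap analysis"},
    "deploy data catalog": {"select catalog tool"},
    "migrate legacy marts": {"deploy data catalog", "stand up governance board"},
}

# One valid execution order honoring every predecessor - successor link.
print(list(TopologicalSorter(dependencies).static_order()))
```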

This post wraps up all the hard work to date and assembles the road map to begin sharing the results with stakeholders. Assuming all the prior work is completed, we are now ready to develop and publish the road map. How this is communicated is critical now. We have the facts, we have the path outlined, and we have a defensible position to share with our peers. We have the details readily available to support our position. Now the really difficult exercise rears its ugly head. Somehow, we need to distill and simplify our message to what I call the “Duckies and Goats” view of the world. In other words, we need to distill all of this work into a simplified yet compelling vision of how we transform an organization, or enabling technology, to accomplish what is needed. Do not underestimate the difficulty of this task. After all the hard work put into an exercise like this, the last thing we need to do is confuse our stakeholders with mind-numbing detail. Yes, we need this detail for ourselves, to exhaust any possibility we have missed something and to ensure we haven’t overlooked the obvious, because sometimes “when something is obvious, it may be obviously wrong”. So what I recommend is a graphical one or two page view of the overall program where each project is linked to successive layers of detail. Each of these successive layers can be decomposed further if needed to the detailed planning products and supporting schedules. For an example of this see the accompanying diagram, which illustrates the concept.

RoadMapExploded

Develop the Road Map
Armed with the DELTA (current vs. desired end state), the prioritization effort (what should be done), and the optimum sequence (in what order), we can begin to assemble a sensible, defensible road map describing what should be done in what order. Most of the hard work has already been completed, so at this point we should only be concerned with the careful presentation of the results in a way our stakeholders will quickly grasp and understand.

Begin by organizing the high-level tasks and what needs to be accomplished using a relative time scale, usually more fine-grained for the first set of tasks and typically grouped into quarters. Recall each set of recommended initiatives or projects has already been prioritized and sequenced (each of the recommended actions recognizes predecessor – successor relationships, for example). If this gets too out-of-hand, use a simple indexing scheme to order the program using groupings of dimension, priority, sequence, and date related values with your favorite tool of choice. Microsoft Excel pivot tables work just fine for this, and will help organize this work quickly. I use the MindJet MindManager product to organize the results into maps I can prune and graft at will. Using this tool has some real advantages we can use later when we are ready to publish the results and create our detailed program plans.
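
A minimal sketch of that indexing scheme in Python (pandas) rather than Excel, assuming one row per recommended initiative; the rows shown are invented for illustration.

```python
import pandas as pd

# Hypothetical program index: one row per recommended initiative.
tasks = pd.DataFrame([
    {"dimension": "Technology", "priority": 1, "sequence": 1,
     "quarter": "Q4-2009", "task": "Prepare infrastructure"},
    {"dimension": "Process", "priority": 1, "sequence": 2,
     "quarter": "Q4-2009", "task": "Align global strategy"},
    {"dimension": "People", "priority": 2, "sequence": 3,
     "quarter": "Q1-2010", "task": "Staff governance roles"},
])

# Program at a glance: task counts by dimension and quarter (a pivot table).
print(pd.pivot_table(tasks, index="dimension", columns="quarter",
                     values="task", aggfunc="count", fill_value=0))

# Ordered work list: dimension, then priority, then sequence.
print(tasks.sort_values(["dimension", "priority", "sequence"]))
```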

Each project (or set of tasks) should be defined by its goals, milestone deliveries, dependencies, and expected duration across relevant dimensions. For example, the dimensions you group by can include People and Organization, Processes, Technology and Tools, and External Dependencies. The following illustrates a high-level view of an example Master Data Management roadmap organized across a multi-year planning horizon.

PIM_Hub_01

I think it is a good idea to assemble the larger picture first and then focus on the near-term work proposed in the road map. For example, taking the first quarter view of what needs to be accomplished from the executive summary above, we can see the first calendar quarter (in this case Q4 2009) of the road map is dedicated to completing the business case, aligning the global strategy, preparing the technical infrastructure for an MDM Product project, and gaining a better understanding of product attribution. The following illustrates the tasks exploded from the summary into the near-term map of what is needed in Q4 2009 (the first quarter of this program).

PIM_Hub_02

Publish the Road Map
At this stage everything we need is ready for publication, review by the stakeholders, and the inevitable refinements to the plan. I mentioned earlier using the MindJet MindManager tool to organize the program initiatives into maps.  This tool really comes in handy now to accelerate some key deliverables. Especially useful is the ability to link working papers, schedules, documentation, and any URL hyperlinks needed to support the road map elements. Many still prefer traditional documents (which we can produce with this tool easily), but the real power is quickly assembling the work into a context-aware web site: a powerful way to drill from high-level concepts to as much supporting detail as needed.  This is easily accessible without the need for source tools (it is a zero-footprint solution) by any stakeholder with a browser. The supporting documentation and URL content can be revised and updated easily without breaking the presentation surface when revisions or refinements are made to the original plan.  I also use the same tool and content to generate skeleton program project plans for use with MS Project. The plans generated can be further refined and used to organize the detailed planning products when ready. Your Program Management Office (PMO) will love you for this.

I think you would agree this is an extremely powerful way to organize and maintain a significant amount of related program content to meet the needs of a wide variety of stakeholders.   An example of a road map web site is illustrated in the snapshot below (note, the client name has been blocked to protect their privacy).

SiteImage_02

Results
So, we have assembled the road map using a basic pattern that works across any discipline (business or technology) to ensure an effective planning effort. This work is not an exercise to be taken lightly. We are, after all, discussing real-world impacts in order to come up with a set of actionable steps that just make sense. Communicating the findings clearly through the road map meets the intent of the program management team, and the road map will be used in a variety of different ways.  For example, beyond the obvious management uses, consider the following.

First, the road map is a vehicle for communicating the program’s overall intent to interested stakeholders at each stage of its planned execution.

  • For downstream designers and implementers, the map provides overall policy and design guidance. The map can be used to establish inviolable constraints (plus exploitable freedoms) on downstream development activities to promote flexibility and innovation if needed.
  • For project managers, the road map serves as the basis for work, product, and organization breakdown structures, planning, allocation of project resources, and tracking of progress by the various teams.
  • For technical managers, the road map provides the basis for forming development teams corresponding to the work streams identified.
  • For designers of other systems with which this program must interoperate, the map defines the set of operations provided and required, and the protocols that allow the interoperation to take place at the right time.
  • For resource managers, testers and integrators, the road map dictates the correct black-box behavior of the pieces that must fit together.

Secondly, the road map can be used as a basis for performing up-front analysis to validate (or uncover deficiencies in) design decisions and for refining or altering those decisions where necessary.

  • For the architect and requirements engineers who represent the customer(s), the road map is a framework and a forum for negotiating and making trade-offs among competing requirements.
  • For the architect and component designers, the road map can be a vehicle for arbitrating resource contention and establishing performance and other kinds of run-time resource consumption budgets.
  • For those wanting to develop using vendor-provided products from the commercial marketplace, the road map establishes the possibilities for commercial off-the-shelf (COTS) component integration by setting system and component boundaries.
  • For performance engineers, the map can provide the formal model guidance that drives analytical tools such as rate schedulers, simulations, and simulation generators to meet expected demands at the right time.
  • For development product line managers, the map can help determine whether a potential new member of a product family is in or out of scope, and if out, by how much.

Thirdly, the road map is the first artifact used to achieve program and systems understanding.

  • For technical managers, the map becomes a basis for conformance checking, for assurance that implementations have in fact been faithful to the program and architectural prescriptions.
  • For maintainers, the map becomes a starting point for maintenance activities, revealing when and where a prospective change is planned to take place.
  • For new project members, the map should be the first artifact for becoming familiar with a program and system’s design intent.

This post wraps up all the hard work to date and assembles the road map to begin sharing with the stakeholders impacted by the program as planned.  The original intent was to share a time-tested method to develop, refine, and deliver a professional roadmap producing consistent and repeatable results.  I want to thank all of you for embarking on this adventure with me over the last couple of years. My sincere hope is that what has worked for me time after time may work just as well for you.

How to build a Roadmap – Sequence

An earlier post in this series (How to Build a Roadmap) summarized the specific steps required to develop a well-thought-out road map, identifying specific actions using an overall pattern ALL roadmaps should follow. The steps required to complete this work are:

  1. Develop a clear and unambiguous understanding of the current state
  2. Define the desired end state
  3. Conduct a Gap Analysis exercise
  4. Prioritize the findings from the Gap Analysis exercise into a series of gap closure strategies
  5. Discover the optimum sequence of actions (recognizing predecessor – successor relationships)
  6. Develop and Publish the Road Map

This post explores the step where we discover the optimum sequence of actions, recognizing predecessor – successor relationships. This is undertaken now that the initiatives have been identified and prioritized. What things do we have to accomplish first, before others? Do any dependencies we have identified need to be satisfied before moving forward? What about the capacity of the organization to absorb change? Not to be overlooked, this is where a clear understanding of the organizational dynamics is critical (see step number 1; we need to truly understand where we are). The goal is to collect and group the set of activities (projects) into a cohesive view of the work, ordered in a typical leaf, branch, and trunk pattern, so we can begin to assemble the road map with a good understanding of what needs to be accomplished in what order.

Finding or discovering the optimal sequence of projects within the larger program demands that you be able to think across multiple dimensions, recognizing that there may be second- and third-order consequences for every action undertaken (see the Futures Wheel method for more on this). More a craft than a science, the heuristic methods I use are experience-based techniques for problem solving, learning, and discovery, which means the solution is not guaranteed to be optimal. In our case, where an exhaustive search is impractical, the heuristics that speed up the process of finding a satisfactory solution include rules of thumb, educated assumptions, intuitive judgment, some stereotyping, and a lot of common sense. Here, optimization means finding the “best available” ordered set of actions subject to the set of priorities and constraints already agreed upon.  The goal is to order each finding from the Gap Analysis so that no one project within the program is undertaken without the prerequisite capability in place. Some of you will recognize the tree diagram image as a Work Breakdown Structure.  And you are right. This is, after all, our goal and represents the end-game objective. The sequencing is re-purposed directly into a set of detailed planning products organized to guide the effort.
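
To make the predecessor – successor ordering concrete, here is a minimal sketch using the topological sort in Python’s standard-library graphlib. The project names and dependencies are hypothetical; the point is that a valid sequence never starts a project before its prerequisite capability is in place, and a circular dependency is flagged immediately.

```python
# Minimal sketch: ordering projects by predecessor - successor relationships
# with a topological sort. Project names and dependencies are hypothetical.
from graphlib import TopologicalSorter

# Each key lists the projects that must complete before it can start.
dependencies = {
    "Data Quality Baseline": set(),
    "Infrastructure Prep":   set(),
    "Metadata Standards":    {"Data Quality Baseline"},
    "MDM Hub Build":         {"Metadata Standards", "Infrastructure Prep"},
    "Analytics Rollout":     {"MDM Hub Build"},
}

ts = TopologicalSorter(dependencies)
# static_order() yields one valid sequence; any cycle raises CycleError,
# flagging a circular dependency that must be resolved before planning.
print(list(ts.static_order()))
```

In practice the dependency sets come straight out of the Gap Analysis working papers; the sort simply keeps us honest about what must come first.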

What to think about
Developing a balanced, well-crafted road map is based on the “best available” ordered set of actions subject to the set of priorities (and constraints) already agreed upon.  When you think about this across multiple dimensions it does get a little overwhelming, so I think the best thing to do is break the larger components into pieces we can attack separately and merge together later. After this set of tasks is accomplished we should always validate and verify our assumptions and assertions to ensure we have not overlooked a key dependency.

Technology layer overview

I’m not going to focus on the relatively straightforward tasks in the technology (infrastructure) domains many of us are familiar with. Following a simple meta-model like this one from the Essential Architecture project (simplified here to illustrate the key concepts) means we can quickly grasp the true impact adopting a new capability would have, and what is needed to realize it in the environment as planned. Where I will focus is on the less obvious: the relationships across the other three domains (Business, Information, and Applications), which will depend on this enabling infrastructure.  No less important is the supporting organization’s ability to continue to deliver this capability when the “consultants” and vendors are long gone.

So if you look at the simple road map fragment illustrated below, note the technology focus. Notice also what is missing or not fully developed. Although we have a bubble labeled “Build Organizational Capability”, it is not specific or detailed enough to convey how important this weakness is, precisely because of the technology focus.

RoadMap_Example_01

Understanding this one domain (technology) becomes a little trickier when we take into account and couple it with something like an Enterprise Business Motivation Model, for example. Now we can extend our simple sequence optimization with a very thoughtful understanding of all four architecture domains to truly understand what the organization is capable of supporting. So I’m going to reach into organizational dynamics now, assuming all of us as architects have a very good understanding of what needs to be considered in the information, application, and supporting technology domains.

Organization Maturity Models
Before diving into the sequence optimization we should step back and review some of our earlier work. As part of understanding current and desired end states we developed a pretty acute awareness of the organization profile, or collection of characteristics that describe ourselves. There are many maturity models available (CMMI, the Information Management Maturity Model at Mike 2.0, and the NASCIO EA Maturity Model) that all share the same concepts, usually expressed in five or six levels or profiles ranging from initiating (sometimes referred to as chaos) to optimized. Why is this important? Each of these profiles includes any number of attributes or values that can be quantified to gain a deeper understanding of capability. This is where the heuristics of common sense and intuitive judgment are needed, and overlooked only at the risk of failure.  For example, how many of us have been engaged in an SOA-based adoption only to discover the technology is readily achievable, but the organization’s ability to support widespread adoption and realize the value is not, or produces unintended consequences?  Why does this happen so often? One of the key pieces of insight I can share is that there seems to be a widespread assumption you can jump or hurdle from one maturity profile to another without ensuring all prerequisite needs are adopted and in use.  Here is a real-world case to illustrate this thought.

SOA_Architecture_Capability

Note there were thirteen key attributes evaluated as part of the global SOA adoption for this organization. What is so striking about this profile (this is not my subjective evaluation, this is based on the client’s data) is the relative maturity of Portfolio Management and integration mechanisms alongside the low maturity found in some core competencies related to architecture in general.  Core competency in architecture is a key building block to adopting SOA on the size and scale planned.   The real message here is that this organization (at least in the primary architecture function) is still a Profile 1, or Initiating. Any attempt to jump to a Defined or even Managed profile cannot happen until ALL key characteristics are addressed and improved together.  The needle only moves when the prerequisites are met. Sounds simple enough, yet I can’t count how often something this fundamental is overlooked or ignored.
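
One back-of-the-envelope way to express the “needle only moves” rule is to score each key attribute and let the minimum, not the average, determine the effective profile. The attribute names and scores below are hypothetical, not the client data from the case above.

```python
# Back-of-the-envelope expression of the rule above: an organization's
# effective profile is gated by its weakest key attribute, not its average.
# Attribute names and scores are hypothetical.
attribute_scores = {
    "Portfolio Management":    3,  # relatively mature
    "Integration Mechanisms":  3,
    "Architecture Competency": 1,  # low maturity in a core competency
    "Governance":              2,
}

average = sum(attribute_scores.values()) / len(attribute_scores)
effective_profile = min(attribute_scores.values())

print(f"Average maturity:  {average:.2f}")   # flatters the organization
print(f"Effective profile: {effective_profile}")  # still Profile 1 (Initiating)
```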

Organization Profile
This is a little more sophisticated in its approach and complexity. In this case we are going to use a maturity model that attempts to clarify and uncover some of the organizational dynamics referred to earlier in our Gap Analysis. This model is based on the classic triangle, with profiles ranging from Operational to Innovating (with a stop at Optimizing along the way).  Each profile has some distinct characteristics or markers we should have recognized when baselining the Current State (if you used the Galbraith Star Model to collect your findings, its results can always be folded or grouped into this model as well).  Within each of the profiles there are several dimensions that represent the key instrumentation to measure the characteristics and attributes of the organization’s operating model. This is useful to measure as the organization evolves from one profile to another and begins to leverage the benefits of each improvement in reach and capability.

Profile Triangle

We use this model to guide the selection of the proper sequence. The dimensions used within each of the six profiles in this model include:

Infrastructure (Technology)
The computing platforms, software, networking tools, and technologies (naming services for example) used to create, manage, store, distribute and apply information.  This is the Technology architecture domain I referred to earlier.

Processes
Policies, best practices, standards, and governance that define how information is generated, validated, and used; how information is tied to performance metrics and reward systems; and how the company supports its commitment to the strategic use of information.

Human capital
The information skills of individuals within the organization and the quantifiable aspects of their: 

    • capability, 
    • recruitment, 
    • training, 
    • assessment and 
    • alignment with enterprise goals.

Culture
Organizational and human influences on information flow — the moral, social and behavioral norms of corporate culture (as shown in the attitudes, beliefs and priorities of its members), as related to the use and value of information as a long-term strategic corporate asset.

What this reveals about our sequencing is vital to understanding key predecessor – successor relationships. For example, Profile 1 (Initiating) organizations are usually successful due to visionary leaders, ambitious mavericks, and plain good old-fashioned luck. Common characteristics include:

  • Individual leaders or mavericks with authority over information usage 
  • Information infrastructure (technology and governance processes) that is nonexistent, limited, highly variable or subjective 
  • Individual methods of finding and analyzing information 
  • Individual results adopted as “corporate truth” without the necessary validation

Significant markers you can look for are usually found in the natural self-interests of individuals or “information mavericks” who will often leverage information to their own personal benefit. Individuals flourish at the expense of the organization. The silo orientation tends to reward individual or product-level success even as it cannibalizes other products or undermines enterprise profitability. Because success in this environment depends on individual heroics, there is little capability for repeating successful processes unless the key individuals remain the same. This dynamic never scales, and in the worst cases the organization is impaired every time one of these “key employees” leaves, taking their expertise with them.

By contrast, a Profile 3 (Integrated) organization has acknowledged the strategic and competitive value of information. This organization has defined an information management culture (framework) to satisfy business objectives. Initiatives have been undertaken to enhance the organization’s ability to create value for customers rather than catering to individuals, departments, and business units. Architecture integrates data from every corner of the enterprise — from operational/transactional systems, multiple databases in different formats, and multiple channels. The key is the ability to have multiple applications share common “metadata” — the information about how data is derived and managed. The result is a collaborative domain that links previously isolated specialists in statistics, finance, marketing and logistics and gives the whole community access to company-standard analytic routines, cleansed data and appropriate presentation interfaces. Common characteristics include:

  • Cross-enterprise information. 
  • Decisions made in the context of enterprise goals. 
  • Enterprise information-governance process. 
  • Enterprise data frameworks. 
  • Information-management concepts applied and accepted. 
  • Institutional awareness of data quality.

So how well do you think adopting an enterprise analytic environment is going to work in a Profile 1 (Initiating) organization?  Do you think we have a higher probability of success if we know the organization is closer to a Profile 3 (Integrated) or higher?

I thought so.

In this case, the sequencing should reflect a focus on actions that will move and evolve the organization from a 1 (Initiating) to a 2 (Consolidated), eventually arriving at a 3 (Integrated) profile or higher.

Understanding the key characteristics and distinct markers found in the organization you are planning for at this stage will be a key success factor. We want to craft a thoughtful sequence of actions that addresses organization- and process-related gaps in a way that is meaningful.  Remember, as much as we might wish otherwise, any attempt to jump to a higher profile cannot happen until ALL key characteristics are addressed and improved together.  So if we are embarking on a truly valuable program, remember this fundamental.  Of the many roadmaps I have reviewed or evaluated, very few seek to understand this critical perspective. I’m not sure why, since an essential element of success for any initiative is ensuring specific, actionable activities and management initiatives that include:

  • Building organizational capability, 
  • Driving organizational commitment, and 
  • Ensuring the program does not exceed the organization’s capability to deliver.

Results
Now that we have gathered all the prioritized actions, profiled the organization’s ability to deliver, and have the right enabling technology decisions in the right order, it’s time to group and arrange the findings. While the details should remain in your working papers, there are a variety of presentation approaches you can take here. I prefer to group the results into easy-to-understand categories and then present the high-level projects together to the team as a whole.  In this example (simplified and edited for confidentiality) I have gathered the findings into two visual diagrams illustrating the move from a Profile 1 (Operational), labeled Foundation Building, to a Profile 2 (Consolidated), labeled Build Out.  I have also grouped the major dimensions (Infrastructure, Processes, Human Capital, and Culture) into three categories, Process, Technology, and Organization (the last combining Human Capital and Culture), to keep things simple.

First, the Profile 1 diagram is illustrated below. Note we are “foundation building”, or identifying the optimal sequence to build essential capability first.

Profile 1 Roadmap

The move to a Profile 2 (Consolidated) is illustrated below. In this stage we build on the foundation laid by the earlier actions to begin to leverage what is important and take advantage of the capability being realized.

Profile 2 Roadmap

Note that at this time there is no time scale used, only a relative size denoting an early estimate of the level of effort from the Gap Analysis findings.  There is, however, a clear representation of what actions should be taken in what order.  With this information we are now ready to move to the next step: develop and publish the completed Road Map.

How to build a Roadmap – Prioritize (Part II)

An earlier post in this series (How to Build a Roadmap) summarized the specific steps required to develop a well-thought-out road map, identifying specific actions using an overall pattern ALL roadmaps should follow. The steps required to complete this work are:

  1. Develop a clear and unambiguous understanding of the current state
  2. Define the desired end state
  3. Conduct a Gap Analysis exercise
  4. Prioritize the findings from the Gap Analysis exercise into a series of gap closure strategies
  5. Discover the optimum sequence of actions (recognizing predecessor – successor relationships)
  6. Develop and Publish the Road Map

This post continues with the prioritization steps we started in How to build a Roadmap – Prioritize (Part I). The activity will use the results from steps 1 – 3 to prioritize the actions we have identified to close the gap or difference (the delta) between where we are and what we aspire to become.   In short, we want to IDENTIFY what is feasible and what has the highest business value, balancing business need with the capability to execute.

Prioritize Gap Analysis Findings
After the gap analysis is completed, further investigation should be performed by a professional with subject matter expertise and knowledge of generally accepted best practices.  It is best to use prepared schedules in the early field work (if possible) to begin gathering and compiling the facts needed during the interview process and to mitigate “churn” and unnecessary rework. This schedule will be populated with our findings, suggested initiatives or projects, expected deliverables, level of effort, and a preliminary priority ranking of each gap identified.  At this point we have a “to-do” list of actionable activities or projects we can use to draft the working papers for the prioritization review process, using a variety of tools and techniques.

The prioritization review is an absolutely essential step prior to defining the proposed program plans in the road map. Scheduled review sessions with key management, line managers, and other subject matter experts (SMEs) serve to focus all the major activities for which the function or subject area is responsible.  We are asking this group of stakeholders to verify and confirm the business goals and objectives as value criteria and to score the relative importance of each gap identified.  The following sections describe several useful techniques but are by no means an exhaustive survey.

Prioritization Methods
There are a number of approaches to prioritization that are useful. There are also some special cases where you’ll need other tools if you’re going to be truly effective.  Many of these techniques are often used together to understand the subtle aspects and nuances of organizational dynamics. The tools and methods include:

Paired Comparison Analysis
Paired Comparison Analysis (sometimes referred to as Pairwise Comparison) is most useful when the decision criteria are vague, subjective or inconsistent (like we find most of the time). This is not a precision exercise. The analysis is useful for weighing up the relative importance of different options. It’s particularly helpful where:

  • priorities aren’t clear,
  • the options are completely different,
  • evaluation criteria are subjective, or they’re competing in importance.

We prioritize options by comparing each item on a list with every other item on the list individually. By deciding in each case which of the two is more important, we can consolidate the results into a prioritized list. This can be especially challenging when the choices are quite different from one another, decision criteria are subjective, or there is no objective data to use, yet the technique makes it easy to choose the most important problem to solve, or to pick the solution that will be most effective. It also helps set priorities where there are conflicting demands or resource constraints that need to be considered.
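
For those who like to tally the session results programmatically, here is a minimal sketch of Paired Comparison Analysis in Python. The items and recorded judgments are hypothetical stand-ins for what a facilitated stakeholder session would produce.

```python
# Minimal sketch of Paired Comparison Analysis: every item is compared
# with every other item, and each win scores a point. Items and judgments
# are hypothetical stand-ins for a facilitated session.
from itertools import combinations

items = ["Data Quality", "Self-Service BI", "Governance", "Cloud Migration"]

# In practice each pair is decided by stakeholders; here a table records
# hypothetical judgments, keyed by the pair, valued by the winner.
judgments = {
    ("Data Quality", "Self-Service BI"): "Data Quality",
    ("Data Quality", "Governance"): "Governance",
    ("Data Quality", "Cloud Migration"): "Data Quality",
    ("Self-Service BI", "Governance"): "Governance",
    ("Self-Service BI", "Cloud Migration"): "Self-Service BI",
    ("Governance", "Cloud Migration"): "Governance",
}

scores = {item: 0 for item in items}
for pair in combinations(items, 2):
    scores[judgments[pair]] += 1

# Consolidated, prioritized list: most wins first.
for item, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{item}: {score}")
```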

Grid Analysis
Grid Analysis helps prioritize a list of tasks where you need to take many different factors into consideration. Grid Analysis is the simplest form of Multiple Criteria Decision Analysis (MCDA), also known as Multiple Criteria Decision Aid or Multiple Criteria Decision Management (MCDM). Sophisticated MCDA can involve highly complex modeling of different potential scenarios, using advanced mathematics. Grid Analysis is particularly powerful where you have a number of good alternatives to choose from and many different factors to take into account. This is a good technique to use in almost any important decision where there isn’t a clear and obvious preferred option. Grid Analysis works by listing options as rows in a table and the factors you need to consider as columns. Alternatively, you can populate the rows with business drivers and the columns with technical factors. The intent is to score each factor combination, weight each score by the relative importance of its factor, and add the scores up to give an overall score for each option.
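
A minimal sketch of the weighted-scoring arithmetic follows; the options, factors, weights, and raw scores are all illustrative assumptions.

```python
# Minimal sketch of Grid Analysis (weighted scoring). Options, factors,
# weights, and raw 0-5 scores are illustrative assumptions.
factors = {"Business Value": 5, "Feasibility": 3, "Cost": 2}  # factor -> weight

options = {
    "Option A": {"Business Value": 4, "Feasibility": 3, "Cost": 2},
    "Option B": {"Business Value": 3, "Feasibility": 5, "Cost": 4},
    "Option C": {"Business Value": 5, "Feasibility": 2, "Cost": 1},
}

# Weight each raw score by its factor's importance and sum per option.
totals = {
    name: sum(scores[f] * w for f, w in factors.items())
    for name, scores in options.items()
}

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {total}")
```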

Action Priority Matrix
Closely related to Grid Analysis is the Action Priority Matrix. This quick and simple diagramming technique plots the value of the candidate tasks against the level of effort or technical feasibility to deliver. By doing this we can quickly spot the “quick wins” that give the greatest rewards in the shortest possible time, and avoid the “hard rocks” that soak up time for little eventual reward. This is an ingenious approach for making highly efficient prioritization decisions.

I illustrated this approach in detail in the first part of this step, How to build a Roadmap – Prioritize (Part I). This earlier post described how we take a relatively “subjective” set of inputs and prepare a simple quadrant graph where technical feasibility and business value represent the X and Y axes respectively. This quantifies the relative priority of each candidate set of impacted processes, subject to technical feasibility constraints.  The result is an easy-to-see visual representation of what is important (relative business value) and technically feasible. In short, the method (as summarized here) helps us prioritize where to focus our efforts and aligns business need with the technical capability to deliver.
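
A minimal sketch of the quadrant bookkeeping is shown below. The candidates, scores, and cutoff are hypothetical; “quick wins” and “hard rocks” follow the language used above, and the other two quadrant labels follow the common Action Priority Matrix convention.

```python
# Minimal sketch of the Action Priority Matrix: bucket each candidate by
# business value (Y axis) and feasibility (X axis). Candidates, scores,
# and the cutoff are hypothetical.
candidates = {
    # name: (business_value, feasibility) on a 0-10 scale
    "Standardize reporting":  (8, 8),
    "Replatform warehouse":   (9, 3),
    "Tidy legacy extracts":   (3, 7),
    "Rewrite billing engine": (2, 2),
}

def quadrant(value, feasibility, cutoff=5):
    if value >= cutoff and feasibility >= cutoff:
        return "Quick win"
    if value >= cutoff:
        return "Major project"  # high value, hard to deliver
    if feasibility >= cutoff:
        return "Fill-in"        # easy, but low value
    return "Hard rock"          # low value, high effort: avoid

for name, (v, f) in candidates.items():
    print(f"{name}: {quadrant(v, f)}")
```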

Urgent/Important Matrix
Similar to the Action Priority Matrix, this technique considers whether candidate tasks or projects are urgent or important. Sometimes, seemingly urgent gaps really aren’t that important. The Urgent/Important Matrix helps by using the gap analysis task list to quickly identify the activities we should focus on. By prioritizing with the matrix, we can deal with truly urgent issues while working towards important goals.  The distinction is subtle but important. Urgent activities are often the ones we concentrate on; they demand attention because the consequences of not dealing with them are immediate.  “What is important is seldom urgent and what is urgent is seldom important” sums up the concept of the matrix perfectly. This so-called “Eisenhower Principle” is said to be how Eisenhower organized his tasks, and as a result the matrix is sometimes called the Eisenhower Matrix.
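
The same bookkeeping works for the Urgent/Important Matrix. In this minimal sketch the gaps, their urgent/important tags, and the cell labels are hypothetical illustrations of the idea.

```python
# Minimal sketch of the Urgent/Important Matrix: each gap carries two
# booleans and falls into one of four cells. Gaps and labels are
# hypothetical.
gaps = [
    ("Production outage reports", True,  True),
    ("Metadata strategy",         False, True),
    ("Ad hoc extract requests",   True,  False),
    ("Legacy format debates",     False, False),
]

def cell(urgent, important):
    if urgent and important:
        return "Do now"
    if important:
        return "Plan and schedule"   # important, not yet urgent
    if urgent:
        return "Delegate or contain" # urgent, but not important
    return "Defer or drop"

for name, urgent, important in gaps:
    print(f"{name}: {cell(urgent, important)}")
```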

Somewhat related to this technique is the MoSCoW method, named for the acronym formed by grouping the alternative actions into four discrete types.

  • M – MUST: Describes an action (or requirement) that must be satisfied in the final solution for the solution to be considered a success. In other words, a non-negotiable, highest priority.
  • S – SHOULD: Represents a high-priority item that should be included in the solution if it is possible. This is often a critical action (requirement) but one which can be satisfied in other ways if necessary.
  • C – COULD: Describes an action (requirement) which is considered desirable but not necessary. This will be included if time, dependencies, and resources permit.
  • W – WOULD: Represents a requirement that stakeholders have agreed will not be implemented in a given release, but may be considered for the future (sometimes the word “Won’t” is substituted for “Would” to give a clearer understanding of this choice). This is usually associated with low priority actions.

The o’s in MoSCoW are added simply to make the word pronounceable, and are often left lower case to indicate that they don’t stand for anything. The method is most often used with timeboxing, where a deadline is fixed so that the focus can be on the most important actions or requirements. It is a core aspect of rapid application development (RAD) processes, such as the Dynamic Systems Development Method (DSDM), and of agile software development techniques. In this context we are “borrowing” from application development to quickly arrive at a priority scheme, or relative measure of how important each proposed action is to the business.

Ansoff Matrix and the Boston Matrices
These give you quick “rules of thumb” for prioritizing opportunities uncovered in the gap analysis. The Ansoff Matrix helps evaluate and prioritize opportunities by risk. The Boston Matrix does a similar job, helping to prioritize opportunities based on the attractiveness of a market and our ability to take advantage of it. Much more information about both matrices is readily available.

Pareto Analysis
Pareto Analysis helps identify the most important of the changes identified in the gap analysis.  It is a simple technique for prioritizing problem-solving work so that the first piece of work proposed resolves the greatest number of gaps or challenges uncovered. The Pareto Principle (known as the “80/20 Rule”) is the idea that 20% of the gap candidates may generate 80% of the results. In other words, we are seeking the 20% of closure actions that will generate 80% of the results that attacking all of the identified work would deliver. While this approach is great for identifying the most important root cause to deal with, it doesn’t take into account the cost of doing so. Where costs are significant, you’ll need to use techniques such as Cost/Benefit Analysis, or use IRR and NPV methods, to determine which priority-based changes you should move ahead with. To use Pareto Analysis, identify and list the gap candidates and their causes. Score each problem and group the problems together by their cause, then add up the score for each group. Finally, work on finding a solution to the cause of the problems in the group with the highest score. Pareto Analysis not only shows you the most important gap to solve, it also gives you a score showing how severe the gap is relative to the others.
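
A minimal sketch of the scoring and grouping steps just described follows; the gaps, causes, and severity scores are hypothetical.

```python
# Minimal sketch of Pareto Analysis: group gaps by root cause, total the
# scores per cause, and watch the cumulative share approach 80%. The
# gaps, causes, and scores are hypothetical.
from collections import Counter

# gap -> (root cause, severity score)
gaps = {
    "Inconsistent KPIs":    ("No common metadata", 8),
    "Duplicate extracts":   ("No common metadata", 6),
    "Slow month-end close": ("Manual processes",   7),
    "Report rework":        ("Manual processes",   3),
    "Access disputes":      ("Unclear governance", 2),
}

by_cause = Counter()
for cause, score in gaps.values():
    by_cause[cause] += score

total = sum(by_cause.values())
running = 0
# Highest-scoring cause first: the top group is where work starts.
for cause, score in by_cause.most_common():
    running += score
    print(f"{cause}: {score} ({running / total:.0%} cumulative)")
```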

Analytic Hierarchy Process (AHP)
The Analytic Hierarchy Process is useful when there are difficult choices to make in a complex, subjective situation with more than a few options.  The method (AHP) is included in most operations research and management science textbooks and is taught in numerous universities. While the general consensus is that it is both technically valid and practically useful, the method does have its critics; most of the criticisms involve a phenomenon called rank reversal.

To address these kinds of difficult choices, Thomas Saaty created the Analytic Hierarchy Process (AHP), combining qualitative and quantitative analysis. This technique is useful because it combines two approaches – mathematics, and the subjectivity of psychology – to evaluate information and make priority decisions that are easy to defend.

AHP_01

Simply put, a ranking is a relationship between a set of items such that, for any two items, the first is either “ranked higher than”, “ranked lower than”, or “ranked equal to” the second. In mathematics this is known as a weak order or total preorder of objects. It is not necessarily a total order of objects, because two different objects can have the same ranking; the rankings themselves are totally ordered.  By reducing detailed measures to a sequence of ordinal numbers, rankings make it possible to evaluate complex information according to certain criteria. This is how an Internet search engine may rank the pages it finds according to an estimation of their relevance, making it possible for us to quickly select the pages we are most likely interested in.
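
As a minimal sketch of how AHP turns pairwise judgments into priority weights, the example below uses the row geometric mean, a common approximation of Saaty’s principal-eigenvector calculation. The criteria and the 1–9 scale judgments are hypothetical.

```python
# Minimal sketch of deriving AHP priority weights from a pairwise
# comparison matrix via the row geometric mean, a standard approximation
# of Saaty's principal-eigenvector method. Criteria and judgments on the
# 1-9 scale are hypothetical.
import math

criteria = ["Business Value", "Risk", "Cost"]

# matrix[i][j] = how strongly criterion i is preferred over criterion j;
# entries below the diagonal are the reciprocals of those above it.
matrix = [
    [1,     3,    5],
    [1 / 3, 1,    2],
    [1 / 5, 1 / 2, 1],
]

geo_means = [math.prod(row) ** (1 / len(row)) for row in matrix]
total = sum(geo_means)
weights = [g / total for g in geo_means]  # normalized priority weights

for name, w in zip(criteria, weights):
    print(f"{name}: {w:.3f}")
```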

Results
I think you can see there are several techniques we can use to prioritize the findings from our gap analysis into a coherent, thoughtful set of ordered actions. Armed with this consensus view we can now proceed to step five (5): assemble and discover the optimum sequence of actions (recognizing predecessor – successor relationships) as we move to developing the road map. Many of the techniques discussed here can (and should) be combined to capture consensus on priority decisions. My professional favorite is the Action Priority Matrix, due to its structure and group involvement for making highly efficient prioritization decisions. Now that we have the priorities in hand, we have the raw materials needed to move forward and produce the road map.