Sunday, January 12, 2014

REDUCE RISK THROUGH QUANTITATIVE COMPLEXITY SCIENCES

In my previous blog post I speculated about the practical application of the complexity sciences to business management. In this post I demonstrate how we use complexity science to create additional quantitative insight into business systems, in support of our mission at IAM: "To see better. To understand better. To do better". As an entrepreneur, I have learned over the years that investment goes hand in hand with risk – understand the risk in order to mitigate potential issues and subsequently gain in system performance.

The Anscombe quartet consists of four data sets that share nearly identical statistical properties but show very different patterns when inspected visually. I provide the scatter plots at the end of this post to support the use of complexity science in discovering better insight into system behavior.

Complexity is hard to define, but put simply, complexity is the result of a combination of uncertainty, volatility and relationships in a system. This also underpins the key requirement for any complexity science approach: the ability to measure entropy and chaos in an unsupervised, non-linear manner. In the case of the Anscombe quartet there are only 11 observations – not enough to create meaningful insight into characteristics such as robustness, self-organised criticality and small-world behavior, but still enough to demonstrate the concept.

The Anscombe quartet

The Anscombe quartet is four data sets that share common statistical attributes. In this example the four pairs can represent four business units, departments, products or processes, each with 11 observations.


Statistical Analysis

Descriptive statistics show strong similarity between the X and Y pairs, with near-identical correlations between the variables. At face value these four units operate and perform in a similar fashion.
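To make this statistical similarity concrete, here is a minimal Python sketch (using numpy) that reproduces the classic result from Anscombe's published values: all four sets give practically the same means, variances, correlation (about 0.816) and regression line (roughly y = 3.00 + 0.500x).

    import numpy as np

    # Anscombe's published quartet: four (x, y) pairs with 11 observations each
    x = {
        1: [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
        2: [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
        3: [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
        4: [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
    }
    y = {
        1: [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
        2: [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74],
        3: [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
        4: [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
    }

    for i in range(1, 5):
        xi, yi = np.array(x[i]), np.array(y[i])
        r = np.corrcoef(xi, yi)[0, 1]
        slope, intercept = np.polyfit(xi, yi, 1)
        print(f"Set {i}: mean x={xi.mean():.2f}, mean y={yi.mean():.2f}, "
              f"var x={xi.var(ddof=1):.2f}, var y={yi.var(ddof=1):.2f}, "
              f"r={r:.3f}, fit y={intercept:.2f}+{slope:.3f}x")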




COMPLEXITY SCIENCE INSIGHT

What else can we see or understand from this system? How would we go about optimizing it? What is the overall risk of this system? If we optimize it, where should we start? What risks do we face if we do?

With complexity science we can get quantitative answers to these questions. To summarize the sections that follow, we can state the following about the system, as derived from the quantitative analysis:

" This system is fairly integrated and shielded from either random failures or structural failures. This is caused by the high interaction of system components between all business units. However, performance optimization of this system do pose a greater challenge. From the quantitative insight the strategy should be to address and correct the high uncertainty of x1,x2 and x3 performance. The cause-effect model of the system should be used to study the impact of changing x1,x2, and x3 to understand what implications will effect the overall system. Improving x1, x2, and x3 will have significant impact on performance indicators by reducing customer lead times, work-in-progress and waiting times. It is important to protect the stability of BU 4 (x4 and y4) as they have significant impact on 60% of all observations".


ANALYSIS: VOLATILITY, UNCERTAINTY & IDEAL CAPACITY

Volatility = measured deviation from the indicator mean. 0–50% indicates a relatively stable indicator, 51–100% a less stable one, and above 100% a highly volatile one.
Uncertainty = the level of uncertainty in the indicator measurement, with 0% meaning no uncertainty and 100% total uncertainty in the measurement.
Ideal Capacity = the ideal capacity required to process the indicator values, given the level of uncertainty in the indicator values at the selected frequency of observations.
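The exact formulas behind these measures are not published here, so the sketch below is one plausible reading rather than the IAM implementation: volatility as the coefficient of variation, uncertainty as the normalised Shannon entropy of a binned histogram, and a purely hypothetical capacity rule that inflates the mean by the uncertainty level.

    import numpy as np

    def volatility(series):
        # Assumed reading: coefficient of variation as a percentage
        s = np.asarray(series, dtype=float)
        return 100.0 * s.std(ddof=1) / s.mean()

    def uncertainty(series, bins=5):
        # Assumed reading: Shannon entropy of a binned histogram, normalised
        # so that 0% = no uncertainty and 100% = maximum uncertainty
        counts, _ = np.histogram(series, bins=bins)
        p = counts[counts > 0] / counts.sum()
        h = -(p * np.log2(p)).sum()
        return 100.0 * h / np.log2(bins)

    def ideal_capacity(series, bins=5):
        # Hypothetical buffer rule: mean demand inflated by the uncertainty level
        s = np.asarray(series, dtype=float)
        return s.mean() * (1.0 + uncertainty(s, bins) / 100.0)

    y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
    print(volatility(y1), uncertainty(y1), ideal_capacity(y1))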





Analysis: Systemic Risk Score

Complexity Score = a relative value calculated from the currently measured complexity, positioned within the range between the minimum and maximum possible complexity of the system.
Systemic Risk Score = a value between 0 and 100%. Complexity is the result of the volatility in the behavior of objects within a system and the level of uncertainty in the relationships between those objects. For a given system, 100% represents the maximum risk due to complexity – this is an absolute value and can be used to compare different systems against each other.
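As a minimal sketch of the scoring idea (with illustrative numbers only, not IAM's actual calculation): given a measured complexity value and the system's minimum and maximum possible complexity, the relative score is a simple min-max normalisation, expressed as a percentage.

    def complexity_score(c, c_min, c_max):
        # Relative position of the measured complexity within the system's
        # theoretical minimum and maximum (0..1)
        return (c - c_min) / (c_max - c_min)

    def systemic_risk(c, c_min, c_max):
        # As a percentage; 100% is the maximum risk due to complexity
        return 100.0 * complexity_score(c, c_min, c_max)

    # Illustrative values only; the quartet system scores 68% in this analysis
    print(systemic_risk(6.8, 0.0, 10.0))  # -> 68.0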




Analysis: Small World Evaluation

Small-world networks are typical of scale-free systems in which the average path through the network is short (the mean geodesic) and transitivity is high. This means that relatively few nodes act as hubs with many relationships, with weak links between these hubs. In this case the mean path is shorter than that of a comparable random network, but there is no transitivity due to the small number of nodes.
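A sketch of this benchmark idea, using networkx and a hypothetical 8-node indicator network standing in for the quartet's relationship graph (not the actual measured one): a small-world network should show a mean geodesic close to that of an equivalent random graph, but much higher transitivity.

    import networkx as nx

    def small_world_check(G, trials=20, seed=42):
        # Compare mean geodesic (L) and transitivity (C) against random
        # graphs of the same size and density
        n, m = G.number_of_nodes(), G.number_of_edges()
        L = nx.average_shortest_path_length(G)
        C = nx.transitivity(G)
        rnd_L, rnd_C = [], []
        for t in range(trials):
            R = nx.gnm_random_graph(n, m, seed=seed + t)
            if nx.is_connected(R):
                rnd_L.append(nx.average_shortest_path_length(R))
                rnd_C.append(nx.transitivity(R))
        return L, C, sum(rnd_L) / len(rnd_L), sum(rnd_C) / len(rnd_C)

    # Hypothetical indicator network: 8 nodes (x1..x4, y1..y4), densely linked
    G = nx.complete_graph(8)
    G.remove_edges_from([(0, 7), (1, 7), (2, 7), (3, 6)])  # drop a few weak links
    print(small_world_check(G))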



Analysis: Additional Network Insight

Given the small number of observations in the Anscombe quartet, the following analysis indicates the potential of the complexity sciences to uncover more hidden facts about the quartet. By extending the network model with Bayesian probabilities, potential directional cause-effect relationships in the data can be identified. Understanding these adds greatly to building a predictive model with dependent and independent variables.
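One way to sketch this step is score-based Bayesian network structure learning, for example with the pgmpy library. This is an assumption about tooling – the software used here is not named – and "anscombe.csv" is a hypothetical file holding the 11 observations as columns x1..y4.

    import pandas as pd
    from pgmpy.estimators import HillClimbSearch, BicScore

    # Hypothetical input: 11 rows (observations), 8 columns (x1..x4, y1..y4)
    df = pd.read_csv("anscombe.csv")

    # BIC scoring assumes discrete data, so bin the continuous indicators first
    binned = df.apply(lambda col: pd.cut(col, bins=3, labels=False))

    # Hill-climb search proposes a directed acyclic graph whose edge directions
    # hint at potential cause-effect relationships between the indicators
    dag = HillClimbSearch(binned).estimate(scoring_method=BicScore(binned))
    print(sorted(dag.edges()))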

The scale-free test gives insight into whether the system is in a self-organised critical state (in this case it is not).
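A rough version of that test, as a sketch rather than a rigorous power-law fit: scale-free networks have a degree distribution that falls on a roughly straight line in a log-log plot, with a slope of about -2 to -3. The quartet's dense 8-node network has too few distinct degrees to fit at all, which matches the negative result reported here.

    import numpy as np
    import networkx as nx
    from collections import Counter

    def scale_free_slope(G):
        # Fit log(frequency) against log(degree); scale-free networks give a
        # roughly straight line with slope around -2 to -3
        counts = Counter(d for _, d in G.degree() if d > 0)
        if len(counts) < 2:
            return None  # too few distinct degrees to fit (the quartet's case)
        k = np.log(sorted(counts))
        f = np.log([counts[d] for d in sorted(counts)])
        slope, _ = np.polyfit(k, f, 1)
        return slope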

To test the robustness of the system, "random attacks" and "structured attacks" are carried out on the network nodes (vertices). This means nodes are removed in random order, as well as in a structured order, while observing the mean distance of the network. In this case, removing 25% of the vertices increased the mean distance by only 1.5%, which indicates a fairly robust network. This is consistent with the network being fairly dense at 85% – many relationships between the different indicators and no significant hubs.
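The attack simulation itself is straightforward to sketch with networkx. The 8-node, 85%-dense random graph below is a stand-in for the quartet's indicator network, not the actual measured one.

    import random
    import networkx as nx

    def mean_distance(G):
        # Average shortest path over the largest connected component
        H = G.subgraph(max(nx.connected_components(G), key=len))
        return nx.average_shortest_path_length(H)

    def attack(G, fraction=0.25, targeted=False, seed=1):
        # Remove a fraction of nodes, either at random ("random attack") or
        # by descending degree ("structured attack"), and report the
        # percentage change in mean geodesic distance
        H = G.copy()
        k = int(fraction * H.number_of_nodes())
        if targeted:
            victims = [n for n, _ in sorted(H.degree, key=lambda nd: nd[1],
                                            reverse=True)[:k]]
        else:
            victims = random.Random(seed).sample(list(H.nodes()), k)
        before = mean_distance(H)
        H.remove_nodes_from(victims)
        return 100.0 * (mean_distance(H) - before) / before

    G = nx.gnp_random_graph(8, 0.85, seed=7)  # dense network, like the 85% here
    print(attack(G, targeted=False), attack(G, targeted=True))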



Analysis: Self-Organised Clustering

The time-based observations (events) are not sufficient for temporal investigation, hence self-organised clustering is used to investigate potential correlations between events. In this case, six of the 11 observations cluster around significant Business Unit 4 inputs and outputs.
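The clustering algorithm is not named here; as a stand-in, the sketch below uses hierarchical (Ward) clustering from scipy on a hypothetical 11x8 observation matrix to group events that behave similarly across the indicators.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    # Hypothetical 11x8 matrix: rows are the 11 observations (events),
    # columns are the 8 indicators (x1..x4, y1..y4)
    rng = np.random.default_rng(0)
    obs = rng.random((11, 8))  # placeholder for the real quartet values

    # Ward linkage merges events with similar indicator profiles
    Z = linkage(obs, method="ward")
    labels = fcluster(Z, t=2, criterion="maxclust")
    print(labels)  # cluster membership for each of the 11 events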



INSIGHT DERIVED FROM USING COMPLEXITY SCIENCES

To summarize the insight derived from the complexity analysis, we can see and understand the Anscombe quartet better by adding the following observations:

a) The volatility indicators show similar volatility in the X and Y groups. Volatility describes the deviations from a mean; in this case the outliers were left in place for demonstration purposes.
b) The uncertainty measures show different levels of uncertainty for each indicator – it becomes clear that the indicators are not as similar as the standard statistical measurements suggest.
c) The ideal capacity calculation combines volatility and uncertainty to indicate the different levels of capacity required by each business unit.
d) The systemic risk score is relatively high at 68%. The level of uncertainty in the indicators, as well as the high system density (85%), supports this level of systemic risk and indicates that the four business units do not operate in isolation but have cross-relationships that increase the complexity between them.
e) The system's relationships are not randomly driven, as a benchmark against a comparable random network shows significant differences in network characteristics. The strongest relationships are between x1, x2 and x3, at 86%.
f) The system does not measure as a scale-free system and does not support a self-organised critical state.
g) The system is fairly robust against both random and targeted attacks. A simulated removal of 25% of the vertices resulted in only a 1.5% increase in average network distance.
h) From a temporal view, 6 of the 11 events can be clustered around Business Unit 4 events – mainly because the level of uncertainty around Unit 4 is low and Unit 4 is not significantly impacted by the other units.

This can be summarised in the following risk approach and improvement strategy:

"  This system is fairly integrated and shielded from either random failures or structural failures. This is caused by the high interaction of system components between all business units. However, performance optimization of this system do pose a greater challenge. From the quantitative insight the strategy should be to address and correct the high uncertainty of x1,x2 and x3 performance. The cause-effect model of the system should be used to study the impact of changing x1,x2, and x3 to understand what implications will effect the overall system. Improving x1, x2, and x3 will have significant impact on performance indicators such as customer lead times, reduction in work-in-progress and waiting times. It is important to protect the stability of BU 4 (x4 and y4) as they have significant impact of 60% on all observations made in the system".

Anscombe Quartet Scatterplot


Conclusion
In summary, the above analysis shows that the Anscombe quartet is a fairly complex system, with relationships between most variables and a high degree of uncertainty in the data sets. Each business unit, although appearing very similar under standard statistical measurements, is quite different in operation, and the units cannot be viewed or treated as independent of one another.

In conclusion, this approach should still be applied as preached in any data science methodology – use your common sense to analyze, construct and predict!





