The limits of agreement estimate the interval within which a given proportion of the differences between measurements is expected to lie. To compare two measurement systems with the Bland-Altman method, the differences between the paired measurements from the two systems are calculated, together with the mean and the standard deviation of those differences. The 95% "limits of agreement" are calculated as the mean difference minus and plus 1.96 standard deviations of the differences. These 95% limits of agreement should contain the difference between the two measurement systems for about 95% of future measurement pairs. The Bland-Altman plot is a scatter plot of the differences against the means of the two measurements. Horizontal lines are drawn at the mean difference and at the limits of agreement. Bland-Altman plots are widely used to assess the agreement between two instruments or two measurement techniques. They help to identify systematic differences between the measurements (i.e. fixed bias) and potential outliers.
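The calculation described above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names `limits_of_agreement` and `bland_altman_points` are hypothetical, and the sample SD (n - 1 denominator) is assumed.

```python
from statistics import mean, stdev


def limits_of_agreement(a, b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements a and b.

    a and b are equal-length sequences holding the readings of the two
    measurement systems on the same subjects. Names are illustrative only.
    """
    if len(a) != len(b):
        raise ValueError("a and b must contain the same number of measurements")
    diffs = [x - y for x, y in zip(a, b)]  # paired differences
    bias = mean(diffs)                     # mean difference
    sd = stdev(diffs)                      # sample SD of the differences
    return bias, bias - z * sd, bias + z * sd


def bland_altman_points(a, b):
    """Return (mean of pair, difference) coordinates for the scatter plot."""
    return [((x + y) / 2, x - y) for x, y in zip(a, b)]


# Toy data, invented purely for illustration.
method_a = [11.0, 22.0, 33.0]
method_b = [10.0, 20.0, 30.0]
bias, lower, upper = limits_of_agreement(method_a, method_b)
points = bland_altman_points(method_a, method_b)
```

Plotting `points` and adding horizontal lines at `bias`, `lower` and `upper` gives the diagram described above.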

The mean difference is the estimated bias, and the SD of the differences measures the random fluctuations around this mean. If the mean difference differs significantly from 0 on a one-sample t-test, this indicates the presence of a fixed bias. If there is a consistent bias, it can be adjusted for by subtracting the mean difference from the new method. It is common to compute 95% limits of agreement for each comparison (mean difference ± 1.96 standard deviations of the differences), which tell us how far apart measurements by the two methods are likely to be for most individuals. If differences within the mean ± 1.96 SD are not clinically important, the two methods can be used interchangeably. The 95% limits of agreement can be unreliable estimates of the population parameters, especially for small sample sizes, so it is important to calculate confidence intervals for the 95% limits of agreement when comparing methods or assessing repeatability. This can be done by Bland and Altman's approximate method [3] or by more precise methods. [6] We see that the limits do not fit the data well. They are too wide at the low-glucose end and too narrow at the high-glucose end. They are correct in the sense that they contain about 95% of the differences (here 84/88 = 95.5%), but all the differences outside the limits are at one end, and one of them is far away. The limits of agreement (LoA) are defined as the mean difference ± 1.96 SD of the differences.
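As a rough sketch of the approximate approach, the standard error of the mean difference is s/√n and the standard error of each limit is about √(3s²/n), where s is the SD of the differences. The function name `loa_with_ci` is hypothetical, and the critical value `t_crit` (the 97.5% point of the t distribution with n - 1 degrees of freedom) is assumed to be supplied by the caller, for example from a t table.

```python
from math import sqrt
from statistics import mean, stdev


def loa_with_ci(diffs, t_crit, z=1.96):
    """Bias, limits of agreement and approximate 95% confidence intervals.

    diffs  : paired differences between the two methods
    t_crit : two-sided 95% critical value of t with len(diffs) - 1 degrees
             of freedom, supplied by the caller (an assumption of this sketch)
    """
    n = len(diffs)
    bias = mean(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; SD is zero")

    se_bias = sd / sqrt(n)            # standard error of the mean difference
    se_limit = sqrt(3 * sd ** 2 / n)  # approximate standard error of each limit
    lower, upper = bias - z * sd, bias + z * sd

    return {
        "bias": bias,
        "t_stat": bias / se_bias,  # one-sample t statistic against 0
        "bias_ci": (bias - t_crit * se_bias, bias + t_crit * se_bias),
        "lower": lower,
        "lower_ci": (lower - t_crit * se_limit, lower + t_crit * se_limit),
        "upper": upper,
        "upper_ci": (upper - t_crit * se_limit, upper + t_crit * se_limit),
    }


# Toy differences, invented for illustration; 4.303 is the tabulated
# 97.5% point of t with 2 degrees of freedom.
result = loa_with_ci([1.0, 2.0, 3.0], t_crit=4.303)
```

With so few observations the intervals around the limits are very wide, which is the reason for reporting them when samples are small.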

If these limits do not exceed the maximum allowed difference between methods (that is, differences within the mean ± 1.96 SD are not clinically important), the two methods are considered to be in agreement and can be used interchangeably. The simple 95% limits of agreement method rests on the assumption that the mean and standard deviation of the differences are constant, i.e. that they do not depend on the magnitude of the measurement. In our original papers, we described the common situation where the standard deviation is proportional to the magnitude, and described a method using a logarithmic transformation of the data. In our 1999 review paper (Bland and Altman 1999), we described a method for dealing with any relationship between the mean and SD of the differences and the magnitude of the measurement. (It was Doug Altman's idea, I can't take the credit.) Especially for small sample sizes, the sample mean and sample SD may not be close to the true population mean and SD. To account for this possible discrepancy, 95% prediction bands can be calculated for the difference between the two assay methods. These 95% prediction bands are wider than the 95% limits of agreement (especially for small samples) and therefore provide a more accurate prediction of future differences between the two assay methods.
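Both ideas can be illustrated with a short sketch. The first function assumes strictly positive measurements and computes the limits on the natural-log scale, then back-transforms them so that they read as ratios between the methods. The second uses the usual prediction interval for a single future observation, mean ± t·s·√(1 + 1/n), which is an assumption about how the prediction bands are formed. The function names are hypothetical and `t_crit` is again supplied by the caller.

```python
from math import exp, log, sqrt
from statistics import mean, stdev


def ratio_limits_of_agreement(a, b, z=1.96):
    """Limits of agreement on the log scale, back-transformed to ratios a/b.

    Assumes every measurement is strictly positive. Returns
    (geometric mean ratio, lower ratio limit, upper ratio limit).
    """
    log_diffs = [log(x) - log(y) for x, y in zip(a, b)]
    m, s = mean(log_diffs), stdev(log_diffs)
    return exp(m), exp(m - z * s), exp(m + z * s)


def prediction_limits(diffs, t_crit):
    """95% prediction limits for one future difference.

    t_crit is the two-sided 95% critical value of t with len(diffs) - 1
    degrees of freedom, supplied by the caller.
    """
    n = len(diffs)
    m, s = mean(diffs), stdev(diffs)
    half_width = t_crit * s * sqrt(1 + 1 / n)
    return m - half_width, m + half_width


# Toy data, invented for illustration only.
ratio, ratio_low, ratio_high = ratio_limits_of_agreement([2.0, 4.0, 8.0], [1.0, 2.0, 4.0])
pred_low, pred_high = prediction_limits([1.0, 2.0, 3.0], t_crit=4.303)
```

Because the critical value of t exceeds 1.96 and the factor √(1 + 1/n) exceeds 1, the prediction limits are always wider than the simple limits, and the gap shrinks as n grows.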