Methods Comparison 5th Sept'18 DR Priya


Analytical Evaluation of Methods and Method Comparison

Dr. R. S. Logapriya
Date ??????
Specific learning objectives
At the end of this session, one should be able to

1. Discuss the criteria involved in validating a method


2. Describe the parameters involved in analytical evaluation of a
method
3. Explain how these parameters are estimated
4. Explain the concept of error - systematic (constant and
proportional) and random error
5. Discuss in detail comparison-of-methods studies (regression,
Bland-Altman plot)
PROCESS OF METHOD SELECTION, EVALUATION &
MONITORING
• Major requirements for test validation are driven by the following
rules

• CLIA - Clinical Laboratory Improvement Amendments

• TJC - The Joint Commission

• CAP - College of American Pathologists

• They require the same type of experiments to be performed, with
a few additions.

• It is these rules that guide the way tests in the clinical chemistry
laboratory are selected and validated.
Method Evaluation
• Tests are categorised into 3 groups:

• Waived

• Moderate complexity

• High complexity
• WAIVED TEST :

• Cleared by FDA to be very simple

• Most likely accurate

• Would pose negligible risk of harm to the patient if not
performed correctly

• e.g., dipstick tests and glucose monitors

• The CLIA (Clinical Laboratory Improvement Amendments) final
rule requires that waived tests simply follow the manufacturer's
instructions.
• MODERATE COMPLEXITY : most automated methods

• HIGH COMPLEXITY : manual methods and methods
requiring more interpretation.

• Both of these categories are validated according to whether or not
they are FDA approved (U.S. Food and Drug Administration).
GENERAL CLIA REGULATIONS OF METHOD
VALIDATION

NONWAIVED FDA-APPROVED TESTS


1. Demonstrate test performance comparable to that established
by the manufacturer.
a. Accuracy
b. Precision
c. Reportable range

2. Verify reference (normal) values appropriate for the patient
population.
NON WAIVED FDA-APPROVED TESTS MODIFIED OR
DEVELOPED BY LABORATORY
1. Determine
a. Accuracy
b. Precision
c. Analytic sensitivity
d. Analytic specificity
(including interfering substances)
e. Reportable range of test results
f. Reference/normal ranges
g. Other performance characteristic
h. Calibration and control procedures
ACCURACY : Closeness of agreement of a single measurement with the “true
value”.

Its quantitative measure comprises both random and systematic error.

Deviation from accuracy is the error of measurement (inaccuracy).

TRUENESS : Closeness of agreement of the mean value with the “true value”.

Its quantitative measure comprises systematic error.

Deviation from trueness is bias.


PRECISION : Closeness of agreement between independent results
of measurements obtained under stipulated conditions.

ANALYTICAL SPECIFICITY : Is the ability of a method to measure
the specific concentration of the target analyte even in the presence
of potentially interfering substances in the sample.

ANALYTICAL SENSITIVITY : Is the ability of an analytical
method to detect small quantities of an analyte.
• REPORTABLE RANGE : Is the measurement range that
extends from the lower limit of quantification to the higher
limit of quantification.

• LOWER LIMIT OF QUANTIFICATION : The lowest
concentration at which the analyser can measure an analyte
with a CV < 10%.

• HIGHER LIMIT OF QUANTIFICATION : The highest
concentration of the analyte above which the CV (coefficient of
variation) exceeds the allowable error.
LIMIT OF DETECTION: Is defined as the lowest value that
significantly exceeds the measurement of a blank sample. It is
the lowest amount of analyte accurately detected by a method.

REFERENCE RANGE : Is a set of values that includes the upper
and lower limits of a test, based on a group of healthy
individuals.
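As a rough numeric sketch (not from the slides): for an analyte with a roughly Gaussian distribution, the central 95% reference interval can be taken as the mean ± 1.96 SD of results from healthy subjects. The glucose values below are hypothetical.

```python
import statistics

# Hypothetical fasting glucose results (mg/dL) from healthy subjects.
healthy = [82, 88, 91, 79, 95, 86, 90, 84, 93, 87, 85, 89]

mean = statistics.mean(healthy)
sd = statistics.stdev(healthy)              # sample SD
# 95% reference interval under a Gaussian assumption: mean +/- 1.96 SD.
lower, upper = mean - 1.96 * sd, mean + 1.96 * sd
print(f"reference range: {lower:.1f} - {upper:.1f} mg/dL")
```

In practice, nonparametric percentile methods are often preferred when the distribution is skewed; the Gaussian formula here is only the simplest case.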

Proficiency : Use of external validation to implement a new
method and to improve the efficiency of the test.
• CALIBRATION : The calibration function is the relation between
the instrument signal and the concentration of the analyte. This
relationship is established by measurement of samples with
known amounts of analyte (calibrators).

• CONTROL : It is the internal quality control. A sample with a
known concentration of analyte is used.
ERRORS:

1. Systematic error : Error that always occurs in one direction.

2. Random error : Error that varies from sample to sample. Caused by instrument instability,
temperature variation, etc.

TOTAL ERROR : Systematic error plus random error.

SYSTEMATIC ERROR:

Proportional error - Error whose magnitude changes as a percent of the analyte
present; error dependent on the analyte concentration.

Constant error - Error in the same direction and of the same magnitude; the magnitude of
change is constant and not dependent on the amount of analyte.
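The distinction between constant and proportional error can be sketched numerically; the offset and factor below are hypothetical illustrations, not values from the slides.

```python
# Constant vs. proportional systematic error at different concentrations.

def with_constant_error(true_value, offset=5.0):
    """Constant error: the same absolute shift at every concentration."""
    return true_value + offset

def with_proportional_error(true_value, factor=1.10):
    """Proportional error: the shift grows as a percent of the analyte present."""
    return true_value * factor

for true in (50.0, 100.0, 200.0):
    ce = with_constant_error(true) - true       # stays 5.0 everywhere
    pe = with_proportional_error(true) - true   # 10% of the concentration
    print(f"true={true:6.1f}  constant error={ce:4.1f}  proportional error={pe:5.1f}")
```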
PRECISION : Closeness of agreement between independent results of
measurements obtained under stipulated conditions.

Precision is specified as:

• Repeatability : Closeness of agreement between results of successive
measurements carried out under the same conditions (within-run).

• Reproducibility : Closeness of agreement between results of measurements
performed under changed conditions (between days).

• A common precision scheme is 2 × 2 × 10 (two controls, run twice a day, for 10 days).

Precision quantitatively measures the dispersion of random errors.

Deviation from precision is imprecision.


[Figures: precision plot and precision estimate illustrating imprecision]
ACCURACY
Accuracy is estimated using

1. Recovery

2. Interference

3. Patient sample comparison


RECOVERY STUDIES
• It is used to show whether a method is able to accurately
measure an analyte.

• A small aliquot of concentrated analyte is added (spiked) into
a patient sample (matrix) and then measured by the method
being evaluated.

• The purpose is to determine how much of the analyte can be
detected (recovered) in the presence of all other compounds
in the matrix.
INTERFERENCE STUDY
• Interference studies are designed to determine whether specific compounds affect the
accuracy of laboratory tests.

• Common interferences encountered are:

Hemolysis : Broken red blood cells and their contents

Icterus : High bilirubin

Turbidity : Particulate matter or lipids

Interferents affect tests by absorbing or scattering light, reacting with reagents, or altering
the reaction rates used to measure a given analyte.
• In interference experiments a potential interferent is added to
patient sample.

• If an effect is observed, the concentration of the interferent is
lowered sequentially to determine the concentration at which
test results are not affected.

• Results with unacceptably high levels of interferent are
reported with cautionary comments or not reported.
Comparison Of Method Studies
• A method comparison experiment compares patient sample
results obtained by the test method with those obtained by a
reference method.

• It is used to estimate the systematic error in patient samples and
also indicates the type of systematic error (proportional or
constant).

• The test method is always compared with a gold-standard
reference method.
[Figure: proportional error (PE), random error (RE), and systematic/constant error (SE/CE)]
STRENGTH OF RELATION BETWEEN TESTS
• A plot of the test-method data (y-axis) versus the comparative
method (x-axis) helps to visualize the data generated in a COM test.
• If the two methods correlate perfectly, the data pairs plotted as
concentration values from the reference method (x) versus the
evaluation method (y) will produce a straight line (y = mx + b), with a
slope of 1.0, a y-intercept of 0, and a correlation coefficient (r) of 1.
• Data should be plotted daily and inspected for outliers so that
original samples can be reanalyzed as needed.
• Linearity can be confirmed visually.
Bland-Altman plot
• It is used to check whether two given methods are comparable.

• It also checks for concentration-dependent error.

• Note: If the differences are scattered randomly around the mean
bias, with no trend across the concentration range, the methods are
considered comparable; a systematic pattern suggests
concentration-dependent error.
[Figure: Bland-Altman plot]
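The statistics behind a Bland-Altman plot can be sketched as follows; the paired results are hypothetical, and the 95% limits of agreement are taken as the mean bias ± 1.96 SD of the differences.

```python
import statistics

# Paired results from two hypothetical methods on the same samples.
method_a = [4.1, 5.6, 7.2, 9.0, 10.5, 12.1]
method_b = [4.3, 5.4, 7.5, 8.8, 10.9, 12.0]

# Bland-Altman works on the pairwise differences.
diffs = [a - b for a, b in zip(method_a, method_b)]
bias = statistics.mean(diffs)               # mean bias between the methods
sd = statistics.stdev(diffs)
# 95% limits of agreement: bias +/- 1.96 SD of the differences.
loa_low, loa_high = bias - 1.96 * sd, bias + 1.96 * sd
print(f"mean bias = {bias:.3f}, limits of agreement = ({loa_low:.3f}, {loa_high:.3f})")
```

On the plot itself, each sample's difference is plotted against the pair's mean, with horizontal lines at the bias and at the two limits of agreement.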
PERFORMANCE STANDARDS FOR COMMON CLINICAL
CHEMISTRY ANALYTES AS DEFINED BY CLIA
Precision
• Precision may be defined as the closeness of agreement between
independent results of measurements obtained under stipulated
conditions.
• The degree of precision is usually expressed on the basis of statistical
measures of imprecision, such as the SD or CV, which thus are
inversely related to precision.
• Imprecision of measurements is solely related to the random error of
measurements. Precision is specified as follows:
• Repeatability: closeness of agreement between results of
successive measurements carried out under the same conditions
(i.e., corresponding to within-run precision).
• Reproducibility: closeness of agreement between results of
measurements performed under changed conditions of
measurements (e.g., time, operators, calibrators, and reagent lots).
• Two specifications of reproducibility are often used: total or between-run
precision in the laboratory, often termed intermediate precision, and
interlaboratory precision.
Analytical Measurement Range
• The analytical measurement range (measuring interval, reportable
range) is the analyte concentration range over which the
measurements are within the declared tolerances for imprecision and
bias of the method.
• In practice, the upper limit is often set by the linearity limit of the
instrument response and the lower limit corresponds to the lower
limit of quantitation (LoQ-see below).
• Usually, it is presumed that the specifications of the method apply
throughout the analytical measurement range.
• However, there may also be situations in which different
specifications are applied to various segments of the analytical
measurement range.
Method Evaluation
• A short, initial evaluation should be carried out before the
complete method evaluation.
• This preliminary evaluation should include the analysis of a
series of standards to verify the linear range and the replicate
analysis (at least eight measurements) of two controls to
obtain estimates of short-term imprecision.
• If any results fall short of the specifications published in the
method’s product information sheet (package insert), the
method’s manufacturer should be consulted.
Determine Imprecision and Inaccuracy
• The first determinations (estimates) to be made in a method
evaluation are the imprecision and inaccuracy.
• They should be compared with the maximum allowable error
based on medical criteria.
• If the imprecision or inaccuracy exceeds the maximum
allowable error, it is unacceptable and must be modified and
re-evaluated or rejected.
• Imprecision is the dispersion of repeated measurements
around a mean (true level).
• Random analytic error is the cause of imprecision in a test.
• Imprecision is estimated from studies in which multiple
aliquots of the same specimen (with a constant
concentration) are analyzed repetitively.
• Inaccuracy, or the difference between a measured value and
its actual value, is due to the presence of systematic error.
• Systematic error can be due to constant or proportional error
and is estimated from three types of study:
• (1) recovery,
• (2) interference, and
• (3) a COM study.
Measurement of Imprecision
• Method evaluation begins with a precision study.
• This estimates the random error associated with the test
method and detects any problems affecting its
reproducibility.
• It is recommended that this study be performed over a 10- to
20-day period, incorporating one or two analytic runs (runs
with patient samples or QC materials) per day.
• A common precision study is a 2 x 2 x 10 study, where two
controls are run twice a day for 10 days.
• Running multiple samples on the same day does a good job of
estimating precision within a single day but underestimates long-term
changes that occur over time.
• By running multiple samples on different days, a better estimation of
the over time random error is given.
• It is important that more than one concentration be tested in these
studies, with materials ideally spanning the clinically meaningful
range of concentrations.
• For glucose, this might include samples in the hyperglycemic range
(150 mg/dL) and the hypoglycemic range (50 mg/dL).
• After these data are collected, the mean, SD, and CV are calculated.
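The calculation just described can be sketched as follows, using hypothetical glucose control results; CV is the SD expressed as a percent of the mean.

```python
import statistics

# Hypothetical glucose control results (mg/dL) from a precision study,
# e.g. two runs per day over several days, flattened into one list.
control = [98, 101, 99, 102, 100, 97, 103, 99, 100, 101]

mean = statistics.mean(control)
sd = statistics.stdev(control)     # sample SD estimates the random error
cv = 100 * sd / mean               # coefficient of variation, in percent
print(f"mean={mean:.1f}  SD={sd:.2f}  CV={cv:.2f}%")
```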
• The random error or imprecision associated with the test
procedure is indicated by the SD and the CV.
• The within-run imprecision is indicated by the SD of the controls
analyzed within one run.
• The total imprecision may be obtained from the SD of control
data with one or two data points accumulated per day.
• The total imprecision is the most accurate assessment of
performance that would affect the values a clinician might see and
reflects differences in operators, pipettes, and variations in
environmental changes such as temperature.
• In practice, within-run imprecision is used more commonly than total
imprecision.
• An inferential statistical technique, ANOVA, is then used to
analyze the available precision data to provide estimates of the
within-run, between-run, and total imprecision.
• Acceptable Performance Criteria:
Imprecision Studies
• During a recent evaluation of vitamin B12 in the laboratory,
several concentrations of vitamin B12 were run twice daily (in
duplicate) for 10 days.
• In the corresponding precision plot, the amount of variability
between runs is represented by different colors over 10 days (x-axis).
• The CV was then calculated for within run, between run, and
between days.
• The total SD, estimated at 2.3, is then compared with medical
decision levels (MDLs) or medically required standards based
on the analyte.
• The acceptability of analytic error is based on how the test is
to be used to make clinical interpretations.

• The determination of whether long-term precision is adequate
is based on the total imprecision being less than one third of
the total allowable error (total imprecision, in this case, 1.6;
the selection of one-third of the total allowable error for
imprecision is based on Westgard).
• In the case that the value is greater than this limit, the test can
pass as long as the difference between one-third of the total
allowable error and the determined imprecision is not
statistically significant.
• In this case, 1.79 was not statistically different from 1.6
(1/3 × 4.8), and the test passed our imprecision studies.
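The one-third rule described above can be sketched numerically, using the figures quoted on the slide (total allowable error 4.8, observed total imprecision 1.79).

```python
# One-third rule (after Westgard): long-term imprecision should not
# exceed one third of the total allowable error.
total_allowable_error = 4.8
imprecision_limit = total_allowable_error / 3   # 1.6
observed_imprecision = 1.79                     # observed total SD from the study

strict_pass = observed_imprecision <= imprecision_limit
print(f"limit={imprecision_limit:.2f} observed={observed_imprecision} strict pass={strict_pass}")
# When the strict limit is exceeded, the slide notes the test may still
# pass if the excess is not statistically significant.
```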
Measurement of Inaccuracy
• Once method imprecision is estimated and deemed acceptable, the
determination of accuracy can begin.
• Accuracy is estimated using three different types of studies:
• (1) recovery,
• (2) interference, and
• (3) patient- sample comparison.
Recovery Studies
• Recovery studies will show whether a method is able to accurately
measure an analyte.
• In a recovery experiment, a small aliquot of concentrated analyte is
added (spiked) into a patient sample (matrix) and then measured by
the method being evaluated.
• The amount recovered is the difference between the spiked sample
and the patient sample (unmodified).
• The purpose of this type of study is to determine how much of the
analyte can be detected (recovered) in the presence of all the other
compounds in the matrix.
• The original patient samples (matrix) should not be diluted more
than 10% so that the matrix solution is minimally affected.
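The recovery calculation can be sketched as follows; the concentrations are hypothetical. Percent recovery is the measured increase over baseline divided by the amount of analyte added.

```python
# Hypothetical recovery experiment (concentrations in mg/dL).
baseline = 100.0        # patient sample result before spiking
amount_added = 50.0     # concentration added by the spike
spiked_result = 148.0   # measured result of the spiked sample

recovered = spiked_result - baseline
percent_recovery = 100 * recovered / amount_added
print(f"recovery = {percent_recovery:.1f}%")
```

As noted above, the spike volume should dilute the matrix by no more than about 10%, so the amount added must be computed from the spike's concentration and the dilution it causes.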
Interference Studies
• Interference studies will determine if specific compounds affect the
accurate determination of analyte concentrations.
• Common interferences include hemolysis and turbidity, which can
obscure the absorbance of the measured analyte.
• Interferents, either can react with the analytic reagent or may alter
the reaction between the analyte and the analytic reagents. An
interference experiment is performed by adding the potential
interferent (in the maximally elevated range) to the patient sample.
• If an effect is observed, its concentration is lowered sequentially in
order to determine the concentration at which test results are valid.
• Similarly, results can be ignored if interferent levels are too high,
since the results are inaccurate.
Comparison-of-Methods Studies
• A COM experiment examines patient samples by the method being
evaluated (test method) with a reference method.
• It is used primarily to estimate systematic error in actual patient samples,
and it may indicate the type of systematic error (proportional versus
constant).
• Ideally, the test method is compared with a standardised reference method
(gold standard), a method with acceptable accuracy in comparison with
its imprecision.
• Many times reference methods are laborious and time consuming, as is
the case with the ultracentrifugation methods of determining cholesterol.
• These routine tests have their own particular inaccuracies, so it is
important that the inaccuracies they have are determined and
documented.
• If the new test method is to replace the routine method, differences
between the two methods should be determined and documented.
• To compare a test method with a comparative method, it is
recommended by Westgard et al. and CLIA that 40 to 100
specimens be run by each method on the same day over 8 to 20
days (preferably within 4 hours), with specimens spanning the
clinical range and representing a diversity of pathologic
conditions.
• As an extra measure of QC, specimens should be analyzed in
duplicate. Otherwise, experimental results must be checked by
comparing test and comparative-method results immediately
after analysis.
• Samples with large differences should be repeated to rule out
technical errors as the source of variation.
• Daily analysis of two to five patient specimens should be
followed for at least 8 days if 40 specimens are compared and
for 20 days if 100 specimens are compared in replication
studies.
Statistical Analysis of Comparison-of-Methods
Studies
• The data used to plot the test method versus the comparative method can be further
statistically analyzed using linear regression (e.g., Deming regression).
• Linear regression generates statistical estimates of the slope (b), the y-intercept
(a), the SD of the points about the regression line (Sy/x), and the correlation
coefficient (r).
• An example of these calculations is a comparison of beta-hCG
concentrations measured on the IMx system and the Elecsys 2010.
• The reason for calculating statistics is to determine the types and amounts of error
that a method has, in order to decide if the test is still valid to make clinical
decisions.
• Several types of errors can be seen by looking at a plot of the test method versus
the comparative method. When random errors occur, points move randomly about
the mean. Increases in the Sy/x statistic reflect random error. Constant error is seen
visually as a shift in the y-intercept; a t-test analysis can be used to determine if
these differences are significant. Proportional error is reflected in alterations in the
line slope and can also be analyzed with a t-test.
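These regression statistics can be sketched as follows. This sketch uses ordinary least squares for simplicity (Deming regression, mentioned above, additionally allows for error in the x method); the paired results are hypothetical.

```python
import math

# Hypothetical paired results: comparative method (x) vs. test method (y).
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [2.1, 4.3, 5.9, 8.2, 10.1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

slope = sxy / sxx                           # proportional error shows up here
intercept = my - slope * mx                 # constant error shows up here
r = sxy / math.sqrt(sxx * syy)              # correlation coefficient
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
sy_x = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))   # random error
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.4f} Sy/x={sy_x:.3f}")
```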
• Interpretation of experimental data is performed by using the
results of the paired t-test and the correlation coefficient.
• The paired t-test is used to compare the magnitude of the
bias (the difference between the means of the test and that of
the comparative method) with that of the random error.
• The t-test indicates only whether a statistically significant
difference exists between the two SDs or means,
respectively.
• It does not provide information on the magnitude of the error
compared in the context of clinically allowable limits of
error.
• It is important to confirm that outliers are true outliers and not the
result of technical errors.
• A linear regression is performed to analyze COM studies .
• When two tests give perfectly identical results, the correlation
coefficient (r) is 1.
• The correlation coefficient used in COM studies should be 0.99 (or greater),
indicating the range of patient samples is adequate for the standard linear
regression analysis.
• If r is less than 0.99, then alternative analyses should be used.
• Linear regression analysis is more useful than the t-test for evaluating COM
studies, as constant systematic error can be determined from the y-intercept,
and proportional systematic error can be determined from the slope.
• Random error can also be determined from the standard error of the estimate
(Sy/x).
• If a nonlinear relationship occurs between the test and comparative methods,
linear regression analysis can be used only over the values in the linear
range.
• To make accurate conclusions about the relationship between two tests, it is
important to avoid technical errors.
• Tests are performed to answer clinical questions, so to assess how
this error might affect clinical judgments, it is expressed in terms of
allowable (analytic) error (Ea).
• This allowable error is determined for each test and is based
on the amount of error that will not negatively affect clinical
judgments.
• If the combined random and systematic error (total error) is less
than the Ea, then the performance of the test is considered acceptable.
• If the error is larger than the Ea, corrections must be made to
reduce the error or the method should be rejected.
• This process ensures that laboratory tests give accurate,
clinically relevant information to physicians to manage their
patients effectively.
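The acceptability check can be sketched numerically. A common formulation (an assumption here, not stated on the slides) takes total error at the medical decision level as |bias| + 1.96 SD, covering systematic error plus a 95% estimate of random error; the values below are hypothetical.

```python
# Hypothetical estimates from the evaluation studies.
bias = 2.0              # systematic error, e.g. from COM/recovery studies
sd = 1.5                # random error (SD) from the precision study
allowable_error = 6.0   # Ea at the medical decision level

# Total error combines systematic and random error.
total_error = abs(bias) + 1.96 * sd
acceptable = total_error <= allowable_error
print(f"total error = {total_error:.2f}, Ea = {allowable_error}, acceptable = {acceptable}")
```

If `acceptable` were False, the method would need to be modified to reduce error or rejected, as described above.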
Allowable Analytic Error
• Probably the most important aspect of method evaluation is to
determine if the random and systematic error (total error) is
less than the Ea.
• Several methodologies have been used to estimate the medically
allowable error, including physiologic variation, multiples of the
reference interval, and pathologist judgment.
• The Ea limits published by CLIA specify the maximum error
allowable by federally mandated proficiency testing .
• These performance standards are now being used to
determine the acceptability of clinical chemistry
analyzer performance.
• The Ea is specifically calculated based on the types of
studies described in the previous section .
• Estimates of random and systematic error are calculated
and then compared with the published allowable error at
critical concentrations of the analyte.
• If the test does not meet the allowable error criteria, it
must be modified to reduce error or rejected.
Reference
• Bishop, Clinical Chemistry, 7th edition
Instrumental parameters needed in overall evaluation process
• Pipetting precision

• Specimen-to-specimen carryover

• Detector imprecision

• Time to first reportable results

• Onboard reagent stability

• Overall throughput

• Mean time between instrument failures

• Mean time to repair
