Most modern instrumental techniques used in analytical chemistry produce an output or signal that is not absolute; the signal or peak is not a direct quantitative measure of concentration or target analyte quantity. Thus, to perform quantitative analysis, one must convert the raw output from an instrument (information) into a quantity (knowledge). This is done by standardizing or calibrating the raw response from an instrument (Refs. 1-4). Here, we briefly summarize the most common methods applied in analytical chemistry, recognizing that this is a very large field. We note that the common use of the term “standardization” is not to be confused with the application of standard methods as specified by regulatory or consensus standard organizations.
In the following discussion, it is assumed that the sample has been properly drawn from the parent population material and properly prepared. Clearly, the most precise analytical methods and the most painstaking calibration methods are useless if applied to a sample that does not represent reality. Nevertheless, the term “sampling,” which describes the process of obtaining the sample (from the population material), implies the existence of a sampling uncertainty (arising mainly from population material heterogeneity) (Refs. 5 and 6). Thus, the analytical result is an estimate of what would be obtained from the parent population material. The theory, concepts, and nomenclature regarding samples and sampling constitute a complex, statistically based, sub-specialty of analytical chemistry well beyond the scope presented here. We begin with some simplified definitions (Ref. 7):
Sampling uncertainty is that part of the total uncertainty in an analytical procedure or determination that results from using only a fraction of the population material. In this respect, sampling by any method is an extrapolation process. Because the sampling uncertainty is usually ignored for an individual analysis on an individual test portion, the sampling uncertainty is considered as being due entirely to the variability of the test portion. It is therefore assessed, when necessary, by replication of the sampling from the parent population material, and statistically isolating the uncertainty thus introduced by analysis of the variance. Typically, the problems associated with liquid population material are less complex but must not be ignored. Sample stratification, concentration and thermal gradients, poor mixing, and gradients associated with flow are all real effects that must be considered. Sampling uncertainty is often minimized by field and laboratory processing, with procedures that can include mixing, reduction, coning and quartering, riffling, milling, and grinding.
Another aspect that must be considered after sampling is sample preservation and handling. The integrity of the sample must be preserved during the inevitable delay between sampling and analysis. Sample preservation may include the addition of preservatives or buffer solutions, pH adjustment, use of an inert gas “blanket,” and cold storage or freezing.
The external standard method can be applied to nearly all instrumental techniques, within the general limits discussed here, and the specific limitations that may be applicable with individual techniques. This method is conceptually simple; the user constructs a calibration curve of instrumental response with prepared mixtures containing the analyte(s) over a range of concentrations, an example of which is shown in Figure 1a. Thus, the curve represents the raw instrumental response of the analyte as a function of analyte concentration or amount. Each point on this plot must be measured several times so that the repeatability can be assessed. Only random uncertainty should be observed in the replicates; trends of increasing or decreasing response (hysteresis) must be remedied by identifying the source and adjusting the method accordingly. The calibration solutions should be randomized (that is, measured in random order). Although called a calibration “curve,” ideally the signal versus concentration plot is linear, or substantially linear (that is, any areas of nonlinearity are either unimportant for the analysis, or are localized, minor, and properly treated by the measurement technique). In some cases, the response may be linearizable (for example, by calculating the logarithm of the raw response). If a curve shows nonlinearity in an area that is important for the analysis, one must measure more concentrations (data points) in the region of curvature.
In practice, the line that results from the calibration is fit with an appropriate model, and the desired value for the unknown concentration is calculated. The curve can be used graphically if approximation suffices. Mixtures prepared for external standard calibration can contain one or many analytes. Once a calibration curve is prepared, it can often be used for some time, provided such a procedure has been previously validated (that is, the stability of the standards and the instrument over the time of use has been assessed). Otherwise, it is best to measure the unknown and the standards within a short period of time. Moreover, if any major change is made to the instrumentation (changing a detector or detector parameters, changing a chromatographic column, etc.), the standards must be remeasured.
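The fit-and-invert step described above can be sketched as follows, using a simple linear least-squares model; the calibration concentrations and responses are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical external-standard calibration data: instrument response
# measured at five known analyte concentrations (arbitrary units).
conc = np.array([1.0, 2.0, 4.0, 6.0, 8.0])    # [A], e.g., mg/L
resp = np.array([2.1, 4.0, 8.2, 11.9, 16.1])  # raw instrument response

# Fit the linear model resp = m*conc + b by ordinary least squares.
m, b = np.polyfit(conc, resp, 1)

def conc_from_response(a_unknown):
    """Invert the fitted calibration line to estimate concentration."""
    return (a_unknown - b) / m

print(round(conc_from_response(10.0), 3))
```

In practice one would also examine the residuals of the fit and the replicate scatter at each point before trusting the inversion.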
FIGURE 1a. An example calibration curve prepared by use of the external standard method. The instrument response is represented by A, and the concentration resulting in that response is [A]. While curves for two analytes are shown, in principle, one can plot as many analytes as desired. While five points per analyte have been shown, one can measure as many as required. Note that a region of nonlinearity is shown in the latter part of the curve for one of the components. One would require a larger number of points to adequately represent and fit any nonlinear areas.
To successfully use the method, the standard mixtures must be in a concentration range that is comparable to that of the unknown analyte, and ideally should bracket the unknown. Multiple measurements of each standard mixture should be made to establish repeatability of points on the curve. Many instrumental methods have operation ranges (frequency, temperature, etc.) in which the uncertainty is minimized, so components and concentrations for standard mixtures must respect this. The standard mixtures should be in the same matrix as the unknown, and the matrix must not interfere with the unknown or other standard mixture components. Any pretreatment of the unknown must also be reflected in the standard mixtures. As with any calibration method, components in the standard mixtures must be available at a high (or at least known) purity, they must be stable during preparation, and must be soluble in the required matrix. Unless the physical phenomenon of a measurement is well understood, extrapolation beyond the curve is not recommended (and indeed is usually strongly discouraged); nevertheless, extrapolation is occasionally done in practice. In those cases, one must be cautious, report exactly how the extrapolation was done, and assess any increase in uncertainty that may result. Note that the curve might not extrapolate through the origin. This is usually the result of adsorption (of components on container walls), carryover hysteresis, absorption (of components in seals or septa), or component degradation or evaporation.
A major consideration with external standardization is that typically, the sample size (for example, the injection volume in chromatography) must be maintained constant for standard mixtures and the unknowns. If the sample size varies slightly, it is often possible to apply a correction to the raw signal. One should not attempt to generate a calibration curve by varying the sample size (that is, for example, injecting increasing volumes into a gas chromatograph). This caution does not preclude serial dilution methods (see below), in which multiple solutions are generated for separate measurement. Other issues that can hinder successful application of the external standards method include instrumental aspects that might not be readily apparent. In chromatographic methods, for example, one can overload the column or detector. In older instruments, settings of signal attenuation were typically made manually, while in newer instruments, this may occur through software, sometimes without operator interaction or knowledge.
Note, inter alia, that in Figure 1a (and indeed all the examples presented here), the uncertainty is indicated only for the variable on the y-axis. It must be recognized that there is uncertainty in the values plotted on the x-axis as well, but usually only the largest uncertainty, or the uncertainty that is most important for the application, is treated. Note, also, that it is critical to maintain the integrity of standards; decomposition, degradation, moisture uptake, etc., will adversely affect the validity of the calibration.
In many situations in chemical analysis, a full calibration curve is not prepared because of the complexity, time, or cost. In such situations, abbreviated external standard methods are often used. Under no circumstances can an abbreviated method be used if the raw signal response is nonlinear. Moreover, these methods are not generally appropriate for analyses in regulatory, forensic, or health care environments where the consequences can be far reaching.
This method uses a simple proportion approach to standardize an instrument response. It can be used only when the system has no constant determinate error or bias,* and when the reagents used give a zero-blank response (that is, the instrument response from the matrix and measurement system only, without the analyte). A standard should be prepared such that the concentration is close to that of the unknown. One then calculates the concentration of the unknown, [X], as:
[X] = (Ax/As) [S] (1)
where Ax is the instrument response of the unknown, As is the instrument response of the standard, and [S] is the concentration of the standard.
This method, illustrated schematically in Figure 1b, assumes that the blank reading will be zero. One uses a two-point calibration in which the origin is included as the first point. It is important to ensure, by experiment or experience, that such a method is adequate to the task.
If the analytical method has no determinate error or bias, but does produce a finite blank value, then one must also perform a blank measurement, which is subtracted from the instrument response of the standard and the unknown. Then the same procedure (Eq. 1) is used as for the single standard. If multiple samples are to be measured, it is important to measure the blank between each measurement.
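The single-standard calculation (Eq. 1), with an optional blank correction as just described, reduces to a one-line proportion; the responses and concentration below are hypothetical.

```python
# Single-standard calibration (Eq. 1), with an optional blank correction.
# All responses and concentrations are hypothetical.
def conc_single_standard(a_unknown, a_standard, conc_standard, a_blank=0.0):
    """[X] = (Ax - Ablank)/(As - Ablank) * [S]; a_blank defaults to zero."""
    return (a_unknown - a_blank) / (a_standard - a_blank) * conc_standard

# A 5.0 mg/L standard reads 10.3, the unknown reads 6.2, the blank reads 0.1.
print(round(conc_single_standard(6.2, 10.3, 5.0, a_blank=0.1), 3))  # 2.99
```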
FIGURE 1b. An example of a single-point calibration curve. The instrument response is represented by A, and the concentration resulting in that response is [A]. The origin (0,0) is assigned as part of the curve and is assumed to have no uncertainty.
FIGURE 1c. An example of two standards plus a blank calibration curve. The blank is subtracted from each of the standards. The instrument response is represented by A, and the concentration resulting in that response is [A].
When the analytical method has both a determinate error (or bias), and a finite blank value, at least three calibrations must be made: two standards and one blank. The standard concentrations are typically prepared widely spaced in concentration, and the higher concentration should be chosen to represent the limit of linearity of the instrument or method. If this is not practical, the higher concentration should simply be the highest expected concentration of the analyte (unknown). This method is illustrated schematically in Figure 1c. If multiple samples are to be measured, it is important to measure the blank between each measurement.
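A minimal sketch of the two-standards-plus-blank procedure: the blank is subtracted from each standard response, a line is fit through the two corrected points, and the line is inverted for the unknown. All numerical values are hypothetical.

```python
# Two standards plus a blank: subtract the blank from each standard
# response, fit a line through the two points, and invert it.
# All numbers are hypothetical.
def two_point_calibration(c1, a1, c2, a2, a_blank):
    """Return (slope, intercept) of blank-corrected response vs. concentration."""
    y1, y2 = a1 - a_blank, a2 - a_blank
    slope = (y2 - y1) / (c2 - c1)
    return slope, y1 - slope * c1

slope, intercept = two_point_calibration(1.0, 2.3, 10.0, 20.9, a_blank=0.2)
a_unknown = 9.5                                    # raw response of the unknown
c_unknown = (a_unknown - 0.2 - intercept) / slope  # blank-correct, then invert
print(round(c_unknown, 3))
```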
As mentioned above, the raw signal from an analytical instrument is typically not an absolute measure of concentration of the analyte(s), because the instrument may respond differently to each component. In some cases, such as with chromatographic methods, it is possible to apply response factors, determined from a standard mixture containing all constituents of the unknown sample, for standardization (Ref. 8). The standard mixture is gravimetrically prepared (with known mass percent for each component), and the instrument response is measured, for example as chromatographic areas. The total mass percent and the total area percent each sum to 100. One calculates the mass-percent-to-area ratio for each component and chooses one component as the reference, which is assigned a response factor of unity. To obtain the response factor of each of the other components, one divides that component’s mass-percent-to-area ratio by that of the reference. This produces a response factor for every component except, of course, the reference, defined as unity. When the unknown sample is measured, each raw area is multiplied by its response factor, and the resulting area percent provides the normalized mass percent of each component in the unknown.
This method corrects for minor variations in sample size (earlier defined as the test portion), although large differences in sample size must be avoided so that one is assured of consistent instrument performance. Although the method corrects for the different responses of samples, large differences must be avoided. This also means that the detector must respond linearly to the concentrations of each component, even if the concentrations are very different. This may require dilution or concentration of the sample in some situations. In chromatographic applications, all components of a mixture must be analyzed and standardized, since normalization must be performed on the entire sample.
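The internal normalization calculation described above can be sketched as follows; the component names, mass percentages, and areas are all hypothetical values for illustration.

```python
# Internal normalization: response factors from a gravimetric standard,
# then normalization of an unknown's raw areas. Component names, mass
# percentages, and areas are all hypothetical.
def response_factors(mass_pct, areas, reference):
    """Factor = (mass%/area) of a component divided by that of the reference."""
    ref_ratio = mass_pct[reference] / areas[reference]
    return {c: (mass_pct[c] / areas[c]) / ref_ratio for c in mass_pct}

def normalize(areas, factors):
    """Multiply each raw area by its factor and normalize to 100 mass %."""
    corrected = {c: areas[c] * factors[c] for c in areas}
    total = sum(corrected.values())
    return {c: 100.0 * v / total for c, v in corrected.items()}

std_mass = {"A": 50.0, "B": 30.0, "C": 20.0}   # known mass % in the standard
std_area = {"A": 48.0, "B": 33.0, "C": 19.0}   # measured areas for the standard
rf = response_factors(std_mass, std_area, reference="A")

unknown_area = {"A": 40.0, "B": 35.0, "C": 25.0}
print(normalize(unknown_area, rf))   # normalized mass % of each component
```

Note that, as stated above, every component of the unknown must appear in the standard mixture, since the normalization is performed over the entire sample.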
Some techniques, such as gas chromatography with flame ionization detection and thermal conductivity detection, have well defined physical phenomena associated with output signals. With these techniques, there are some limited, published response factor data that can be used in an approximate way to standardize the response from these devices.
While it is rare that an analytical method can be calibrated by use of a single solution, some instances of spectrophotometry and electroanalytical methods can qualify. To use this method, one sequentially and incrementally adds known masses of standard analytes to a solution, with an instrument response being measured after each addition. This procedure can only be used if the analytical method itself does not change the analyte concentration (that is, it is nondestructive) and does not lead to a loss of solution volume. Adding the standard as a solid crystalline analyte, which contributes negligible volume, is an example. One must also minimize changes in solution volume over the course of the standardization.
Samples presented for analysis often are contained in complex matrices with many impurities that may interact with the analyte, potentially enhancing or diminishing a signal from an instrumental technique. In such cases, the preparation of an external standard calibration curve may be impractical, because it can be very difficult to reproduce the matrix. In these cases, the standard addition method may be used. A standard solution containing the target analyte is prepared and added to the sample, thus not altering the unknown impurities and their effects. While the quantity of target analyte in the target sample is unknown, the added quantity is known, and its incremental additive effect on the instrument signal can be measured. Then, the quantity of the unknown analyte is determined by what is effectively an extrapolation. In practice, the volume of standard solution added is kept small, so that dilution of the unknown impurities changes the total signal by no more than about 1%. This method can only be used if there is a verified linear relationship between the signal and quantity of analyte. If a determinate error is present, then the slope of the line must be known. Moreover, the sample cannot contain any components that can respond as the analyte (that is, masquerade).
In the simplest case, one addition of analyte is made after first measuring the response of the analyte in the unknown sample. Thus, two measurements are required:
Ax0 = m[X0] (2)
Ax1 = m([X0] + [S]) (3)
where Ax0 is the instrument response of the analyte in the unknown sample, [X0] is the analyte concentration in the unknown sample, and Ax1 is the instrument response upon the addition of the standard at concentration [S] (the concentrations are additive in the equation because X and S are the same compound). The assumed slope is the proportionality constant, m. The two equations are solved simultaneously for [X0]. This technique is very rapid and economical, but there are serious drawbacks: there is no built-in check for mistakes on the part of the analyst, there is no means to average random uncertainties, and there is no way to detect interference (mentioned above as masquerade).
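Solving the two equations simultaneously gives the unknown concentration as the added concentration multiplied by the ratio of the response before the spike to the response increase; the numbers below are hypothetical.

```python
# Single standard addition: solving the two equations simultaneously
# gives [X0] = [S] * A_before / (A_after - A_before). Values hypothetical.
def single_standard_addition(a_before, a_after, conc_added):
    return conc_added * a_before / (a_after - a_before)

# Response 4.0 before the spike, 7.0 after adding 1.5 mg/L of the analyte:
print(single_standard_addition(4.0, 7.0, 1.5))  # 2.0 mg/L
```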
This standard addition method alleviates some of the problems inherent in single standard addition. Here, the unknown sample is first measured in the instrument. Then that sample is “spiked” with incrementally increasing concentrations of the analyte, generating a curve such as that shown in Figure 1d. The line is extrapolated to zero signal; the magnitude of the (negative) intercept on the abscissa (x-axis) gives the concentration of the analyte in the unknown, X0.
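The extrapolation can be sketched as a linear fit of signal against added concentration, with the x-intercept magnitude giving X0; the spike levels and signals are hypothetical.

```python
import numpy as np

# Multiple standard addition: fit signal vs. added concentration, then
# extrapolate to zero signal. The x-intercept is at -b/m, so its
# magnitude, b/m, is the unknown concentration X0. Data are hypothetical.
added = np.array([0.0, 1.0, 2.0, 3.0])    # spiked concentrations, mg/L
signal = np.array([4.0, 6.1, 7.9, 10.0])  # instrument responses

m, b = np.polyfit(added, signal, 1)
x0 = b / m   # magnitude of the x-intercept
print(round(x0, 3))
```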
An internal standard is a compound added to a sample at a known concentration, the purpose of which is to exhibit a similar signal when measured in an instrument but be distinguishable from the signal of the desired analyte. It provides the highest level of reliability in quantitation by chromatographic methods and is not affected by large differences in sample size (Ref. 8). Unlike the internal normalization method, it is not necessary to elute or measure all the components of the sample, one need focus only on the component(s) of interest. In atomic spectrometry, this method is not affected by changes in gas flow rates, sample aspiration rates, and flame suppression or enhancement. Another situation in which this method is valuable is when the sample matrix is either unknown or very complex, precluding the preparation of external standards.
A set of calibration solutions is prepared by mass, containing the target analyte, X, and a standard, S, that is not present in the unknown sample. The instrument response (for example, a chromatographic area) is measured for each calibration solution, and a plot is made to establish linearity, as in Figure 1a. The ordinate axis is the ratio of the response of the unknown analyte component, Ax, to the response of the chosen standard, As. The abscissa is the ratio of the mass of X to the mass of S for that standard mixture. Once the linearity is confirmed in the concentration range of interest, the unknown is spiked with a known mass of S, the instrument response is measured, and the area ratio Ax/As is calculated. Either the graph or a fit of the data in Figure 1e is then used to determine the corresponding mass ratio, from which the mass of X may be determined. Note that the calculations are simplified if the same mass per volume of the internal standard is added to both the unknown samples and the calibration standards.
FIGURE 1d. An example of calibration by multiple standard addition. Three additions (spikes) of the analyte X are shown, as is the extrapolation to the unknown concentration, X0.
FIGURE 1e. An example of the multiple internal standard method. The ordinate (y) axis is the ratio of the response of the unknown analyte component, Ax, to the response of the chosen standard, As. The abscissa (x) axis is the ratio of mass of X to the mass of S for that standard mixture.
In practice, once the linearity is established for a given mixture, it is no longer necessary to use multiple standards, although this is the most precise method. After verification of linearity, one standard solution can be used to fix the slope, provided it is close in concentration to that of the target analyte. In this case, the mass of the unknown can be found from:
X/S = (Ax/As)(1/R) (4)
where X is the mass of the unknown analyte in the sample, S is the mass of the added internal standard in the sample, and Ax and As are the instrument responses (areas) of the unknown and internal standard, respectively. R is determined from a standard solution prepared with known masses of both X and S:
R = (signal, unknown analyte/signal, internal standard)/(mass, unknown analyte/mass, internal standard) (5)
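The calculation of Eq. 4 can be sketched as follows, with R taken (consistent with Eq. 4) as the measured signal ratio divided by the known mass ratio of the standard solution; all masses and areas are hypothetical.

```python
# Internal standard calculation consistent with Eq. 4. R is the slope of
# the (Ax/As) vs. (X/S) calibration line, determined here from a single
# standard solution of known composition. Numbers are hypothetical.
def response_ratio_R(mass_x, mass_s, area_x, area_s):
    """R = (signal ratio)/(mass ratio) measured for the standard solution."""
    return (area_x / area_s) / (mass_x / mass_s)

def mass_unknown(area_x, area_s, mass_s_added, R):
    """X = S * (Ax/As) * (1/R), Eq. 4."""
    return mass_s_added * (area_x / area_s) / R

R = response_ratio_R(mass_x=10.0, mass_s=10.0, area_x=12.0, area_s=10.0)
print(mass_unknown(area_x=9.0, area_s=10.0, mass_s_added=5.0, R=R))  # 3.75
```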
Because R is the slope of the calibration curve discussed above, once linearity is established, one solution suffices. There are many conditions that must be fulfilled to use the internal standard method, and it is rare that all of them can actually be met; in practice, one tries to meet as many as possible. The compound chosen must not already be present in the unknown. The compound chosen must be separable from the analyte present in the unknown; in a chromatographic measurement, this typically means at least baseline resolution, although this is a minimally acceptable degree of separation. An exception occurs when an isotopically labeled standard is used in conjunction with mass discrimination or radioactive counting detection. On the other hand, the unknown analyte peak and the internal standard peak should be close to each other (temporally) on the chromatogram. The compound chosen must be miscible with the solvent at the temperature of reagent preparation and measurement. The compound chosen must not react chemically with the sample or solvent, or interfere in any way with the analysis; in the case of a chromatographic measurement, the same applies to interactions with the stationary phase. It is critical to maintain the integrity of standards; decomposition, degradation, moisture uptake, etc., will adversely affect the validity of the calibration. The compound chosen must be chemically similar (for example, in functionality and thermophysical properties) to the analyte; if such a compound is not available (for example, in a chromatographic measurement), an appropriate hydrocarbon should be chosen as a surrogate. The standard solution should be prepared at a concentration similar to that in the unknown matrix; ratio correction of large differences is no substitute for an appropriate concentration.
In a chromatographic measurement, the compound chosen must elute as closely as possible to the analyte and should not be the last peak to elute (the final peak often shows different geometry such as tailing). The compound chosen must be sufficiently nonvolatile to allow for storage as needed. When there is the potential for the unknown analyte to be lost by adsorption, absorption, or some other interaction with the matrix or container, a compound called a carrier is sometimes added in large excess. The carrier is similar, chemically and physically, to the unknown analyte, but easily separated from it. Its purpose is to saturate or season the matrix and prevent analyte loss.
Serial dilution is less a standardization method than a method of generating solutions to be used for standardizations. Nevertheless, its importance and utility, as well as the popularity of its application, warrant mention in this section. A serial dilution is the stepwise dilution of a substance, following a specified, constant progression, usually geometric (or logarithmic). One first prepares a known volume of stock solution of a known concentration, then withdraws some small fraction of it into another container or vial. This subsequent container is then filled to the same volume as the stock solution with the same solvent or buffer. The process is then repeated for as many standard solutions as are desired. A ten-fold serial dilution could be 1 M, 0.1 M, 0.01 M, 0.001 M, etc. A ten-fold dilution for each step is called a logarithmic dilution or log-dilution, a 3.16-fold (10^0.5-fold) dilution is called a half-logarithmic dilution or half-log dilution, and a 1.78-fold (10^0.25-fold) dilution is called a quarter-logarithmic dilution or quarter-log dilution. In practice, the ten-fold dilution is the most common. The serial dilution procedure is used not only in chemical analysis but also in serological preparations in which cellular materials such as bacteria are diluted. A critical aspect of serial dilution is that the initial stock solution must be prepared, and its concentration determined, with great care, since any mistake here will be propagated into all resulting solutions.
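The progression of concentrations produced by a serial dilution can be sketched as follows; the stock concentration and number of steps are arbitrary illustrative choices.

```python
# Serial dilution: concentrations produced by successive constant-factor
# dilutions of a stock solution. Values are illustrative.
def serial_dilution(stock_conc, factor, steps):
    """Concentration of the stock and of each of `steps` successive dilutions."""
    return [stock_conc / factor**i for i in range(steps + 1)]

print(serial_dilution(1.0, 10.0, 3))   # ten-fold (log) dilution of a 1 M stock
# A half-log dilution would use factor = 10**0.5 (about 3.16).
```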
In many analytical procedures and standard protocols, the use of reference materials may be called for. Every industrialized country maintains a national measurement laboratory (NML); in the United States, the National Institute of Standards and Technology (NIST) satisfies this requirement by statute. Several different kinds of reference materials are available from NIST (Refs. 10, 11). Other countries and organizations maintain similar reference material systems; the terminology used to describe different types of reference materials may vary from system to system. Here we use the NIST reference materials to describe the different classes of reference materials and their use in traceability. It should be noted that other organizations within the United States provide similar types of reference materials.
An RM is a material, sufficiently homogeneous and stable with respect to one or more specified properties, that has been established to be fit for its intended use in a measurement process. The term RM is generic and not fixed by statute. An RM can be used for calibration, identification, procedure assessment, or quality control. An RM generally cannot be used for both validation and calibration in the same instrument or procedure.
A CRM is a reference material characterized by a metrologically valid procedure for one or more specified properties. It is provided with a certificate that lists the numerical value of each specified property (e.g., enthalpy, density), its associated uncertainty (which may be expressed as an expanded uncertainty or a probability statement), and a statement of traceability.
An SRM is a CRM that meets additional certification criteria that might differ from material to material. These criteria are determined by NIST and may include the requirement of two separate, orthogonal (independent and non-overlapping) measurement methods for the property (and its associated uncertainty), which are listed on the accompanying certificate. An SRM is prepared and used for three main purposes: (1) to help develop accurate methods of analysis; (2) to calibrate measurement systems used to facilitate exchange of goods, institute quality control, determine performance characteristics, or measure a property at the state-of-the-art limit; and (3) to ensure the long-term adequacy and integrity of measurement quality assurance programs. The term "Standard Reference Material" is fixed by statute and is a registered trademark.
The uncertainty provided with a reference material can be used to set a lower limit for the uncertainty in a procedure developed with that material. The use of these materials in no way guarantees that the result obtained using the material will have the same uncertainty as that listed on the certificate. It is still imperative to develop an uncertainty budget for the method and perform an uncertainty analysis.
Analytical measurements and certifications often contain a statement of traceability. Traceability describes the “property of a result or measurement whereby it can be related to appropriate standards, generally international or national standards, through an unbroken chain of comparisons" (Ref. 9). Traceability typically includes the application of a reference material (RM) or a standard reference material (SRM) for instrument calibration before standardization for the analytes of interest. The true value of a measured quantity (τ) cannot typically be determined. The true value is defined as characterizing a quantity that is perfectly defined. It is an ideal value that could be arrived at only if all causes of measurement uncertainty were eliminated and the entire population were sampled.
As stated in Section 2, the result of a measurement is only an approximation or estimate of the true value of the measurand or quantity subject to measurement. In the determination of the combined standard uncertainty and ultimately the expanded uncertainty, it is critical to include the uncertainty of calibration in the process, as discussed above. The process of arriving at the uncertainty Uy of a quantity y that is based upon measured quantities x1, ..., xn is called the propagation of uncertainty. A full discussion of propagation of uncertainty is beyond the scope of this section; a simplified prescription, in the form of general and specific formulae, is provided here. In general, the propagated random uncertainty in y can be determined from:
Uy = [(∂y/∂x1)^2 Ux1^2 + (∂y/∂x2)^2 Ux2^2 + ... + (∂y/∂xn)^2 Uxn^2]^(1/2)
This approach can be used when the uncertainties are random (not systematic), are relatively small, and are independent or uncorrelated (that is, in the absence of covariance). Relatively large uncertainties (such as those approaching the magnitude of the measurand itself) cannot be treated with this approach, especially if the measurand is a nonlinear function of the measured quantity. Note that, by convention, the upper case Uy denotes the expanded uncertainty, which is the combined standard uncertainty multiplied by a coverage factor k in excess of unity (for the approximately 95% confidence level, the coverage factor is k = 2). A coverage factor k = 1 represents the 68% confidence level. In scientific and technical reports and publications, the goal is to report measurements and the standard uncertainty (k = 1) or the expanded uncertainty (k > 1).
It is possible to reduce this general formulation to more specific formulae in the cases of common mathematical operations. These are provided in the table.
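One of the specific formulae, the quadrature rule for a pure product or quotient, can be sketched as follows; the measured values and their uncertainties are hypothetical.

```python
import math

# Propagation of random, uncorrelated uncertainty for a pure product or
# quotient: the relative uncertainties add in quadrature. This is one of
# the specific formulae for common operations. Values are hypothetical.
def u_product_quotient(y, pairs):
    """pairs is a list of (x_i, U_xi); y is built from the x_i by * and /."""
    rel_sq = sum((u / x) ** 2 for x, u in pairs)
    return abs(y) * math.sqrt(rel_sq)

# y = x1/x2 with x1 = 10.0 +/- 0.2 and x2 = 4.0 +/- 0.1:
y = 10.0 / 4.0
print(round(u_product_quotient(y, [(10.0, 0.2), (4.0, 0.1)]), 4))  # 0.08
```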
It is often necessary or desirable to compare an analytical result against some established regulatory standard or limit, for example, to determine if the concentration of a toxic substance falls above or below a legal limit. It is imperative to consider the uncertainty when determining compliance against limits; indeed, most established limits are set with some consideration or allowance for uncertainty. A decision rule is often built into tests of compliance limits. A common decision rule is that a measured result indicates non-compliance if the measurand exceeds the limit by the expanded uncertainty. A similar approach is applied for measurands that fall below an established limit.
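The decision rule described above can be sketched as a simple comparison; the measured value, expanded uncertainty, and limit below are hypothetical.

```python
# A sketch of the common decision rule: a result indicates non-compliance
# only when the measured value exceeds the limit by more than the
# expanded uncertainty U. The values are hypothetical.
def exceeds_limit(measured, expanded_u, limit):
    """True if (measured - U) is still above the limit."""
    return (measured - expanded_u) > limit

print(exceeds_limit(0.085, 0.004, 0.080))  # True: even the lower bound exceeds
print(exceeds_limit(0.082, 0.004, 0.080))  # False: the limit lies within U
```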
A disconnect often occurs when scientific personnel attempt to convey their measured results in legal or regulatory venues. For example, the courts in the United States routinely must deal with scientific testimony based on measurements, and scientists are often asked to characterize their measurements in terms of error rates.
Statistically, the error rate is the frequency of type I and type II errors in null hypothesis significance testing. This has importance in forensic chemistry, e.g., when the blood alcohol content (BAC) of a sample is measured. Here, a null hypothesis might be the BAC of sample X is not below 0.08% (mass/mass). A measurement above that level, and therefore a failure to reject the null hypothesis, can result in a legal finding of intoxication. A type I error occurs when a rejected null hypothesis is correct (false positive); a type II error occurs when the accepted null hypothesis is false (false negative). Independent of the frequency of type I and type II errors (the statistical error rate), each measurement of BAC has an uncertainty. The uncertainty of each measurement is determined by the propagation of the contributions to uncertainty that is represented by the uncertainty budget, multiplied by the appropriate coverage factor. The error rate of a particular laboratory or technique is not so easily determined. In some large state forensic laboratories, error rates can be approximated by inserting known standard samples anonymously into the normal workflow, but even this approach has limitations.
It is important to understand that the concept of error rate is distinct from the frequency at which an analytical instrument “throws an error.” For example, in the headspace gas chromatographic analysis of BAC, the instrument might report a sampling error, and the operator might notice a damaged needle, which is then replaced. The frequency of this type of error is different from the error rate mentioned above.
| Measurand argument | Arithmetic uncertainty formula |
| y (where y is a counted random event over a time interval) | Uy = y^(1/2) |
| y = A × x (where A is a constant with no uncertainty) | Uy = A × Ux |
| y = x1 + x2 | Uy = (Ux1^2 + Ux2^2)^(1/2) |
| y = x1/x2 or y = x1 × x2 | Uy = y × [(Ux1/x1)^2 + (Ux2/x2)^2]^(1/2) |
| y = (x1 × x2)/x3 | Uy = y × [(Ux1/x1)^2 + (Ux2/x2)^2 + (Ux3/x3)^2]^(1/2) |
| y = log(x) | Uy = 0.434 × (Ux/x) |
| y = ln(x) | Uy = Ux/x |
| y = e^x | Uy = y × Ux |
| y = x^a | Uy = y × a × (Ux/x) |
| y = 10^x | Uy = 2.303 × y × Ux |