Dengue is an arthropod-borne flavivirus comprising four distinct serotypes (DEN-1, DEN-2, DEN-3 and DEN-4) that constitute an antigenic complex of the genus Flavivirus, family Flaviviridae. Infection by one serotype induces life-long immunity against reinfection by the same serotype, but only transient and partial protection against infection with the other serotypes1,2.

Dengue virus infections can result in a range of clinical manifestations from asymptomatic infection to dengue fever (DF) and the severe disease dengue haemorrhagic fever/dengue shock syndrome (DHF/DSS). Most dengue infections are asymptomatic or cause mild symptoms, which are characterized by undifferentiated fever with or without rash. Typical DF is characterized by high fever, severe headache, myalgia, arthralgia, retro-orbital pain and maculopapular rash. Some patients show petechiae, bruising or thrombocytopenia. The clinical presentation of acute dengue infection is non-specific but 5–10% of patients progress to severe DHF/DSS, which can result in death if it is not managed appropriately. Plasma extravasation is the main pathophysiological finding of DHF/DSS, which differentiates it from DF. DHF/DSS is characterized by high fever, bleeding, thrombocytopenia and haemoconcentration (an increase in the concentration of blood cells as a result of fluid loss). Approximately 3–4 days after the onset of fever, patients can present with petechiae, rash, epistaxis, and gingival and gastrointestinal bleeding. Pleural effusion and ascites are common. Some patients develop circulatory failure (DSS), presenting with a weak and fast pulse, narrowing of pulse pressure or hypotension, cold and moist skin and altered mental state. Although there are no specific antiviral treatments for dengue infection, patients usually recover when the need for fluid management is identified early and electrolytes are administered3. It has been proposed that the classification of dengue disease should be simplified as severe and non-severe dengue. This simplified classification would make patient management and surveillance easier4.

There is a need for specific, inexpensive dengue diagnostic tests that can be used for clinical management, surveillance and outbreak investigations and would permit early intervention to treat patients and prevent or control epidemics. Progress is being made in primary prevention, with several candidate dengue vaccines in late phases of development as well as improved vector control measures. Additionally, new techniques for the early detection of severe disease such as the use of biomarkers have the potential to decrease morbidity and mortality. Recent developments in rapid detection technologies offer the promise of improved diagnostics for case management and the early detection of dengue outbreaks.

The characteristics of an 'ideal' dengue diagnostic test depend on the purpose for which the test will be used. Box 1 shows some proposed product specifications for diagnostic tests that could be used for case management, surveillance and outbreak investigations, and vaccine trials5. The optimal window for diagnosing a dengue infection is roughly from the onset of fever to 10 days post-infection; however, as not all patients are diagnosed within this period, an ideal diagnostic test should be sensitive regardless of the stage of infection. Laboratory confirmation of dengue infection relies on isolation of the virus in cell culture, the identification of viral nucleic acid or antigens, or the detection of virus-specific antibodies6. The relative merits of the different diagnostic tests and their optimal time frame for use are summarized in Figs 1, 2.

Figure 1: Comparative merits of direct and indirect laboratory methods for the diagnosis of dengue infections.

Opportunity refers to the fact that antibody testing is usually the most practical diagnostic option available.

Figure 2: Major diagnostic markers for dengue infection.

The titre of the IgM and IgG response varies, depending on whether the infection is a primary or secondary infection.

Direct virus detection could potentially be used for early, definitive and serotype-specific identification of dengue infections during the acute phase of the disease. Live virus or virus components (RNA or antigens) can be detected in serum, plasma, whole blood and infected tissues from 0–7 days following the onset of symptoms, which corresponds roughly to the duration of fever. Direct virus detection procedures are not routinely performed by laboratories, and few commercial kits that have been independently validated are available to aid in this area of dengue diagnosis.

Serological tests are more commonly used to diagnose dengue infections because of their ease of use compared to techniques such as cell culture or RNA detection. Different patterns of antibody responses are observed when patients experience a first (primary) or subsequent (secondary) dengue infection. In primary infections, immunoglobulin M (IgM) is detected 5 or more days after the onset of illness in the majority of infected individuals and immunoglobulin G (IgG) is detected from 10–15 days. In secondary infections, IgM appears earlier or in the same time frame but is usually at lower titres than in primary infection. IgG is present from the previous infection and the titre increases rapidly. Haemagglutination inhibition (HAI) antibody titres in primary infections peak at 1:640 whereas titres of 1:1280 or greater are common in secondary infections7.

Several rapid diagnostic tests are commercially available and many in-house assays have been developed, but the performance characteristics of many of these tests have not been adequately evaluated. The need for a validated diagnostic test for dengue virus infection for clinical and epidemiological use was highlighted by the International Expert Meetings on dengue diagnostics held at the WHO in Geneva in October 2004 and April 2005. The purpose of this guide is to establish best practice guidelines for how to design and conduct evaluations of dengue diagnostic tests for the management of acute infections, surveillance and monitoring of interventions.

I. Current laboratory methods for dengue diagnosis

The tests that are currently used in the laboratory diagnosis of dengue infections, and their advantages and limitations, are shown in Table 1.

Table 1 Advantages and limitations of different dengue diagnostic tests

1. Virus detection

Dengue virus can be isolated by the inoculation of diagnostic samples into mosquitoes, cell culture (using mosquito cell lines, such as C6/36 and AP61 or mammalian cell lines, such as Vero and LLC-MK2 cells) or intra-cerebral inoculation of suckling mice. Whole blood, serum or plasma is collected from patients during the acute phase of the disease or from tissues in fatal cases (dengue virus has also been isolated from liver, spleen, lymph nodes and other tissues). There is evidence that the virus isolation rates from whole blood are considerably higher than the isolation rates from serum or plasma8. For virus serotype identification, immunofluorescent assays using serotype-specific monoclonal antibodies (mAbs) are used.

2. Viral RNA detection

Dengue viral RNA can be detected using a nucleic acid amplification test (NAAT) on tissues, whole blood or sera taken from patients in the acute phase of the disease. Various protocols have been developed that identify dengue viruses using primers directed to serotype-specific regions of the genome9,10,11. Nested PCR techniques improve the sensitivity of detection because the initial amplification product is used as the target for a second round of amplification. However, it is crucial that laboratories performing nested PCR take every precaution to prevent false-positive results that can occur as a result of contamination. In situ PCR can be carried out on tissue slides.

3. Antigen detection

3.1. NS1-based assays. A simplified method of diagnosis during the acute stage of dengue infection compared to viral isolation or nucleic acid detection is the detection of viral antigens in the bloodstream; however, antigen detection in the acute stage of secondary infections can be compromised by pre-existing virus–IgG immunocomplexes.

New developments in enzyme-linked immunosorbent assay (ELISA) and rapid immunochromatographic assays that target non-structural protein 1 (NS1) have shown that high concentrations of this antigen can be detected in patients with primary and secondary dengue infections up to 9 days after the onset of illness12. NS1 is synthesized by all flaviviruses and is secreted from infected mammalian cells. The presence of secreted NS1 (sNS1) in the bloodstream stimulates a strong humoral response. Many studies have investigated the utility of sNS1 detection as a diagnostic tool during the acute phase of a dengue infection. A serotype-specific mAb-based NS1 antigen-capture ELISA has recently been developed and shows good serotype specificity. This test can differentiate between primary and secondary dengue virus infections. There is a good correlation between NS1 serotype-specific IgG as determined by ELISA and plaque reduction neutralization test (PRNT) results, but the performance and utility of these NS1-based tests require additional evaluation.

3.2. Immunohistochemistry. Dengue antigens can be visualized in tissue sections using labelled mAbs that are visualized with markers such as fluorescent dyes, enzymes or colloidal gold. These tests can be undertaken on frozen tissue or paraffin-embedded slides from fatal cases.

4. Serological methods

The acquired immune response to dengue virus infection consists of the production of immunoglobulins (IgM, IgG and IgA) that are mainly specific for the virus envelope (E) protein. The intensity of the response varies depending on whether the individual has a primary or secondary dengue infection. During a primary dengue infection, the IgM response is typically higher titre and more specific than during secondary infections. The titre of the IgG response is higher during secondary infection than during primary infection (Fig 2). IgA- and IgE-based assays have also been used but the utility of these immunoglobulins as markers for the serodiagnosis of dengue infections requires further validation.

It is difficult, if not impossible, to use serology to identify dengue serotypes following a recent infection because the antibodies produced following a primary dengue infection often demonstrate some degree of cross-reactivity with other dengue virus serotypes. Antibodies formed following secondary dengue infections are strongly cross-reactive within the dengue group and also usually cross-react with other flaviviruses13.

4.1. IgM-based assays. The detection of dengue-specific IgM is a useful diagnostic and surveillance tool. IgM is initially detectable between 3 and 5 days after the onset of fever in ∼50% of hospitalized patients, and IgM-based assays have a sensitivity and specificity of ∼90% and 98%, respectively, when they are undertaken five days or more after the onset of fever. Dengue-specific IgM is expressed earlier than dengue-specific IgG. In one study in Puerto Rico, by day 5 of illness, most patients (80%) with dengue infections that were subsequently confirmed by HAI on paired serum samples or by virus isolation had detectable IgM in acute-phase serum. Nearly all patients (93%) developed detectable IgM 6 to 10 days after the onset of fever, and 99% of patients tested between 10 and 20 days after onset had detectable IgM14.

The sensitivity and specificity of IgM-based assays are strongly influenced by the quality of the antigen used and can vary greatly between commercially available products. ELISA-based IgM assays have become an invaluable tool for the surveillance of dengue. Many ELISAs use dengue E protein antigens from all four dengue virus serotypes. This ensures that the assay is capable of identifying any dengue infection regardless of the serotype. However, because IgM circulates for up to three months or longer, its presence might not be diagnostic of a current illness. To diagnose a current dengue infection, the demonstration of a seroconversion (a four-fold or greater change in antibody titres) in paired sera is required.
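The four-fold criterion for paired sera can be illustrated with a short sketch. It assumes that titres are recorded as reciprocal dilutions (for example, 1:160 is stored as 160) and that seroconversion is read as a rise between the acute and convalescent samples; the function name and the default threshold argument are hypothetical and chosen only for illustration.

```python
def is_seroconversion(acute_titre: float, convalescent_titre: float,
                      fold_threshold: float = 4.0) -> bool:
    """Return True if the convalescent titre is at least `fold_threshold`
    times the acute titre.

    Titres are reciprocal dilutions (1:160 is passed as 160). How to score
    an undetectable acute sample is left to the laboratory protocol, so
    non-positive values are rejected here (an illustrative assumption).
    """
    if acute_titre <= 0 or convalescent_titre <= 0:
        raise ValueError("titres must be positive reciprocal dilutions")
    return convalescent_titre / acute_titre >= fold_threshold


if __name__ == "__main__":
    # Paired sera: acute 1:40, convalescent 1:160 (a four-fold rise).
    print(is_seroconversion(40, 160))
    # Paired sera: acute 1:80, convalescent 1:160 (only a two-fold rise).
    print(is_seroconversion(80, 160))
```

A two-fold rise is deliberately not counted, which reflects the requirement for a four-fold or greater change stated above.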

In areas where dengue is not endemic, IgM-based assays can be used in clinical surveillance for viral illness or for random, population-based serological surveys, with the likelihood that any positive results detected indicate recent infections (within the past 2 to 3 months). The M antibody capture ELISA (MAC-ELISA) is based on detecting IgM in serum by capturing it with anti-human IgM antibody that is bound to the solid phase. MAC-ELISAs are frequently run as a non-quantitative, single-dilution test and positive results are commonly reported as a 'recent flavivirus infection'.

Rapid IgM-based dengue diagnostic tests have been developed as a quick and easy method for use at point of care or bedside, and exist in different formats including particle agglutination and lateral flow immunochromatographic strips, with or without plastic cassettes. Most of these tests use recombinant antigens from all four dengue virus serotypes and the results are available within 15 to 90 minutes. Four rapid IgM-based kits have been evaluated. Their sensitivities ranged from 21% to 99% and their specificities from 77% to 98%, compared with gold standard laboratory-based ELISAs. False-positive results were observed in patients with malaria or previous dengue infections15,16. The ELISA format shows greater sensitivity in detecting dengue-specific antibodies than the rapid tests, but the rapid tests are field friendly, with the results available in a short time frame.

4.2. IgG-based assays. Dengue-specific IgG-based assays can be used for the detection of past dengue infections and current infections if paired sera are collected within the correct time frame to allow the demonstration of seroconversion between acute and convalescent serum samples. Assays are usually carried out using multiple dilutions of each serum tested to determine an end-point titre.

IgG avidity assays can be used to determine whether an infection is a primary or a secondary infection, based on the principle that the avidity of IgG is low after primary antigenic challenge but matures slowly within the weeks and months after infection. Thus, these assays can be more useful than the HAI test for this purpose. The IgG-based ELISA exhibits the same broad cross-reactivity with other flaviviruses as the HAI test; therefore, it cannot be used to identify the infecting dengue virus serotype. However, it has a slightly higher sensitivity than the HAI test.

II. The need for evaluation of dengue diagnostic tests

The dramatic increase in the global burden of dengue has spurred increased public and private sector interest in developing improved diagnostics for dengue infections. Manufacturers' claims of test performance often differ substantially from the performance reported in peer-reviewed trials. Unfortunately, for diseases that are mainly endemic in developing countries, market-driven incentives rarely exist. Companies with interests in these areas tend to be small and some are under-resourced, with limited or no access to the reagents, strains and specimens necessary for product research and development. Diagnostic tests are not subject to stringent regulations in many developing countries. As a result, commercial assays are often sold and used without evidence of effectiveness.

The Pediatric Dengue Vaccine Initiative (PDVI) and the UNICEF/UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases (TDR) are collaborating to evaluate diagnostic tools to detect acute dengue virus infection. A global network of dengue diagnostic laboratories has been developed to produce specimen panels and evaluate commercially available dengue diagnostic tests. An initial assessment on commercial IgM-based tests has been undertaken by this laboratory network16.

III. General issues in study design

Dengue diagnostic tests comprise both laboratory-based and point-of-care tests. The laboratory-based tests comprise non-commercial assays such as NAATs, in-house and commercially available ELISAs, HAI tests, and neutralization assays. Currently, point-of-care tests for dengue diagnosis include NS1- and IgM-based tests. These tests can be evaluated in the laboratory using archived samples or prospectively collected samples, depending on the purpose for which the test is intended and the samples that are used. If fresh specimens are required then prospective specimens must be collected. Ideally, trials should be conducted prospectively in the population for whom the test is intended but field trials tend to be expensive and time consuming. If well-characterized specimens are available, it may be reasonable to design a laboratory-based trial using archived or convenient specimens to quickly assess the performance of a new test compared to an existing test, or to use a laboratory-based evaluation to triage for tests with acceptable performance to move forward into field trials. In general, it is not necessary to conduct a prospective field trial for laboratory-based assays.

IV. Considerations for laboratory-based evaluations

1. Objectives of the trial

The objectives of diagnostic trials vary, depending on who will be using the test, for example whether it will be used by a control programme or by a laboratory providing dengue diagnostic services. It also depends on the purpose for which the test is intended, such as clinical use or for surveillance, or whether it is for use in the laboratory or at point of care. It is therefore important to clearly state the objective of a diagnostic trial before considering the trial design. For example, for IgM and NS1 evaluations, the objectives can be to assess the performance and operational characteristics of commercially available NS1 antigen and IgM detection tests for the diagnosis of early acute dengue fever or assess the ease of use of the tests under evaluation.

2. Rationale for the trial

A literature review should be conducted to avoid duplicating what is already known and to learn from the findings of earlier trials. The rationale for conducting the trial should be considered. If a trial is conducted for regulatory approval of a test, the product safety issues are as important as the performance compared to a reference standard, reproducibility and lot-to-lot variation. If a trial is conducted to provide evidence for policy, then the needs of a control programme, such as the setting where the test will be used and the sample size of the trial, are important considerations.

3. The organization of a laboratory-based evaluation

As the occurrence of dengue and other disease agents or conditions that might give rise to cross-reactivity vary widely in different regions of the world, it is important for laboratories to conduct evaluations before adopting a new test. To provide evidence for regional or global guidelines, it is important to conduct multi-country trials using a network of laboratories that have proven expertise and proficiency for the type of technology being examined. To maximize the accuracy and quality of the evaluation, it should be managed by a reference centre in each region that will coordinate the activities among the evaluation laboratories.

3.1. Reference centres. Reference centres should coordinate proficiency testing in the participating laboratories to establish the baseline for the reference standard. They will then design and conduct proficiency testing for the evaluation laboratories before commencing the project. A sample terms of reference for a reference centre is shown in Box 2.

3.2. Evaluation laboratories. The evaluation laboratories should be selected based on the experience and expertise of the laboratory. Sample activities of the evaluation network laboratories are listed in Box 3.

4. Selecting the reference standard

The selection of the appropriate reference standard is crucial in obtaining the best estimate of performance for the test under evaluation. No single diagnostic test detects dengue infection with 100% sensitivity and specificity. The reference standard for comparison of test sensitivity and specificity to the test under evaluation can vary depending on the test being examined. For example, new IgM-based tests in either ELISA or rapid test format might be compared against the Armed Forces Research Institute of Medical Sciences (AFRIMS) or CDC MAC-ELISA tests. It is possible that the new test might be better than the reference standard. For new tests for which a reference standard does not exist, such as the NS1 test, it is necessary to evaluate its performance against a composite reference standard such as a combination of virus isolation, RNA detection, seroconversion in IgM or IgG paired sera or a four-fold rise in IgG. A reference standard for NAATs can be a panel with a known copy number of RNA targets from all dengue virus serotypes or genotypes from different regions to determine the lower limits of detection for each type. Additionally, it can comprise positive samples identified by an existing reverse transcriptase (RT)-PCR method9.
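One way to make a composite reference standard explicit is to encode the combination rule. The sketch below assumes an 'any positive component' rule, with a specimen classed as negative only when every component was performed and was negative; this rule, the class name and the field names are illustrative assumptions and are not prescribed by this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferenceResults:
    """Component results for one patient; None means 'not performed'."""
    virus_isolation: Optional[bool] = None
    rna_detection: Optional[bool] = None
    # IgM or IgG seroconversion in paired sera, or a four-fold rise in IgG.
    seroconversion: Optional[bool] = None


def composite_positive(results: ReferenceResults) -> Optional[bool]:
    """Classify a patient under an assumed 'any positive' composite rule.

    Returns True if any component is positive, False if all components were
    performed and negative, and None (indeterminate) otherwise.
    """
    components = [results.virus_isolation,
                  results.rna_detection,
                  results.seroconversion]
    if any(c is True for c in components):
        return True
    if all(c is False for c in components):
        return False
    return None


if __name__ == "__main__":
    print(composite_positive(ReferenceResults(False, True, None)))
    print(composite_positive(ReferenceResults(False, False, False)))
    print(composite_positive(ReferenceResults(False, None, False)))
```

Treating incomplete negative work-ups as indeterminate, rather than negative, avoids inflating the apparent specificity of the test under evaluation; a trial protocol may reasonably choose a different rule.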

5. Test(s) under evaluation

The design for the evaluation can be affected by some of the characteristics of the test including the volume of sample required, the number of steps involved, the equipment necessary for performing the test and the temperature and time required.

6. Evaluation procedures

The design and assembly of the reference panel is the most crucial element in a laboratory-based evaluation. Some examples of evaluation panels for different dengue tests follow.

6.1. Reference panel for evaluating NAATs. It is important to determine the lower detection limit of NAATs using a standardized quantitation method. The evaluation can be carried out using a panel of known RNA targets of varying copy numbers to determine the analytical sensitivity and specificity or potential cross-reactivity in the assay. This panel should comprise different dengue serotypes from different geographical locations to give a confidence interval of 95% or greater around the point estimates of sensitivity and specificity.
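The precision of a sensitivity or specificity estimate depends on the number of panel members, and a confidence interval makes this visible. The sketch below computes a 95% Wilson score interval for a proportion; the choice of the Wilson method and the function name are assumptions made for illustration, and a trial statistician may prefer another interval.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    `successes` is the number of panel members correctly classified,
    `n` is the panel size, and z = 1.96 corresponds to a 95% interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    # Hypothetical example: 135 of 150 positive panel members detected.
    low, high = wilson_interval(135, 150)
    print(f"sensitivity 0.90, 95% CI {low:.3f} to {high:.3f}")
```

Running the same calculation for several candidate panel sizes shows how quickly the interval narrows as specimens are added, which can inform how many specimens per serotype are worth assembling.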

6.2. Factors affecting NAAT performance. The sensitivity of NAATs can be affected by various factors including the copy number of the amplification target; primer selection; and the method of amplification, for example, nested PCR is more sensitive than a single round of PCR. The composition of the master mix, that is, the amount of MgCl2 and other reagents, can also affect PCR-based assays. The detection methods used are also important; detection by gel electrophoresis is less sensitive than by ELISA or chemiluminescence. The specificity of NAATs can be affected by the ability to distinguish dengue viruses from other flaviviruses; the uniqueness of the target sequence, especially for dengue genotyping; and false positives resulting from contamination.

6.3. Reference panel for evaluating antigen-detection tests. If there is no reference standard, a combination of culture, RNA detection and/or seroconversion can be used to identify an infected patient. Serum should be taken from an infected patient on days 1–5 after the onset of fever (acute phase of infection).

6.4. Factors affecting antigen-detection test performance. The sensitivity of an antigen-detection assay can be affected by the avidity of the monoclonal or polyclonal antibodies; the amount of antigen present in samples; and the detection method. The specificity of an antigen-detection assay can be affected by the uniqueness of the antigen used (that is, whether it shows cross-reactivity).

6.5. Reference panels for evaluating serological tests. The sensitivity of serological tests can be affected by the avidity of the antigen used; the amount of antigen present in the test strip or well; and the detection method. The specificity of serological tests can be affected by the potential cross-reactivity with related flaviviruses or other infectious agents causing acute febrile illness, and other conditions. The composition of the panels can vary for the different test types. An IgM-positive panel can comprise ∼150 DENV-specific IgM-positive samples, including samples from primary and secondary infections representing all four dengue serotypes. Samples should be selected based on optical density (OD) and weighted towards low and medium ODs to give the evaluation panel greater discriminatory power for test performance. An early acute panel can comprise ∼250 samples from patients who are confirmed as being dengue infection positive by PCR, culture or seroconversion. Samples are grouped by primary and secondary infection and by serotype. A proportion of the samples will be classified into early (0–5 days following the onset of fever) and late (6–10 days following the onset of fever) phases of infection and divided into the four serotypes. A challenge panel can comprise ∼150 samples negative for current dengue infection classified in five groups: DENV-specific IgM or PCR negative; related flavivirus; other virus; febrile illnesses; and systemic conditions.
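The early and late groupings described for the early acute panel can be applied consistently with a small helper. This sketch uses the day ranges given above (0–5 and 6–10 days after the onset of fever); the function name and the decision to return None for specimens outside both ranges are illustrative assumptions.

```python
from typing import Optional


def infection_phase(days_since_fever_onset: int) -> Optional[str]:
    """Assign a specimen to the 'early' or 'late' group of an acute panel.

    Early is 0-5 days and late is 6-10 days after the onset of fever.
    Specimens outside both ranges return None so that they can be
    reviewed rather than silently assigned to a group.
    """
    if 0 <= days_since_fever_onset <= 5:
        return "early"
    if 6 <= days_since_fever_onset <= 10:
        return "late"
    return None


if __name__ == "__main__":
    for day in (0, 5, 6, 10, 14):
        print(day, infection_phase(day))
```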

Specimens in the panel should be obtained from laboratories that are proficient in performing assays for dengue and other agents. These laboratories should subscribe to proficiency programmes and should be compliant with Good Laboratory Practice (GLP). Each specimen in the panel should be assigned a unique number and, if available, the following relevant information should accompany each specimen: the date of the onset of symptoms; the date on which the specimen was collected; the type of specimen; and the geographical location where the specimen was collected. Demographic and epidemiological information can also be included. Tracking records for specimen management are also important.

7. Conducting the evaluation

Panel specimens from all of the network sites should be validated, heat-inactivated as needed and aliquoted. Lyophilization of samples by the reference centres is strongly recommended. One aliquot should be retested following lyophilization to check the quality of the specimen. The panels are coded by the reference laboratory and then sent to the evaluating laboratories to test sensitivity and specificity. Specimens should be replicated in the same analytical run to measure the within-run variation; likewise, they should be tested in separate runs to assess between-run variation or precision. When the interpretation of the results is subjective, such as a visual reading for a dipstick test, the percentage agreement between two independent readers should be measured.
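Two of the measures mentioned above can be sketched briefly: percentage agreement between two independent readers, and the coefficient of variation (CV) of replicate readings as one possible summary of within-run or between-run variation. The use of the CV is an assumption made for illustration, as are the function names; the evaluation protocol should define the actual precision measure.

```python
import statistics
from typing import Sequence


def percent_agreement(reader_a: Sequence[str], reader_b: Sequence[str]) -> float:
    """Percentage of specimens on which two independent readers agree."""
    if len(reader_a) != len(reader_b) or len(reader_a) == 0:
        raise ValueError("readers must score the same non-empty set of specimens")
    matches = sum(a == b for a, b in zip(reader_a, reader_b))
    return 100.0 * matches / len(reader_a)


def coefficient_of_variation(replicates: Sequence[float]) -> float:
    """CV (%) of replicate measurements, e.g. OD values from a single run.

    Uses the sample standard deviation, so at least two replicates are needed.
    """
    mean = statistics.mean(replicates)
    if mean == 0:
        raise ValueError("mean of replicates is zero; CV is undefined")
    return 100.0 * statistics.stdev(replicates) / mean


if __name__ == "__main__":
    a = ["pos", "neg", "pos", "pos"]
    b = ["pos", "neg", "neg", "pos"]
    print(percent_agreement(a, b))
    # Hypothetical OD readings of one specimen replicated within a run.
    print(round(coefficient_of_variation([0.82, 0.85, 0.80]), 1))
```

Percentage agreement does not correct for agreement expected by chance, so it is usually reported alongside a chance-corrected statistic such as kappa.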

8. Data management

The reference centres should develop and disseminate an electronic form to the evaluating laboratories for collecting and submitting their data. All data collection should be standardized to facilitate data comparison and analysis and double data entry should be undertaken to ensure accuracy. A statistician should be in charge of the data analysis to provide the kappa coefficient values, sensitivity, specificity and confidence intervals.
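Double data entry is only useful if the two entries are compared systematically. The following sketch lists specimens whose recorded results differ between two independent entries, or which appear in only one of them; the dictionary layout (specimen identifier mapped to a recorded result) and the function name are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def find_entry_discrepancies(
    first_entry: Dict[str, str],
    second_entry: Dict[str, str],
) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Return (specimen_id, first_value, second_value) for every mismatch.

    A specimen missing from one entry is reported with None on that side.
    """
    discrepancies = []
    for specimen_id in sorted(set(first_entry) | set(second_entry)):
        a = first_entry.get(specimen_id)
        b = second_entry.get(specimen_id)
        if a != b:
            discrepancies.append((specimen_id, a, b))
    return discrepancies


if __name__ == "__main__":
    entry_1 = {"S001": "reactive", "S002": "non-reactive", "S003": "reactive"}
    entry_2 = {"S001": "reactive", "S002": "reactive"}
    for row in find_entry_discrepancies(entry_1, entry_2):
        print(row)
```

Each reported discrepancy would then be resolved against the original laboratory record before analysis.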

9. Data analysis

Sensitivity and specificity should be calculated. Sensitivity is the proportion of true-positive samples (as identified by the reference assay) that are correctly identified as reactive by the test under evaluation. Specificity is the proportion of true-negative samples (as identified by the reference assay) that are correctly identified as non-reactive by the test under evaluation. A kappa coefficient value for each site should be determined to measure the agreement of each site's results against the reference panel. A test of homogeneity should be performed to assess inter-site agreement for each test.
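These quantities follow directly from a 2×2 table of the test under evaluation against the reference assay. The sketch below computes sensitivity, specificity and Cohen's kappa from the four cell counts; the function name and the example counts are hypothetical, and confidence intervals and the test of homogeneity are left to the trial statistician.

```python
from typing import Dict


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity and Cohen's kappa from a 2x2 table.

    tp: reference positive, test reactive
    fn: reference positive, test non-reactive
    fp: reference negative, test reactive
    tn: reference negative, test non-reactive
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("panel needs both reference-positive and reference-negative samples")
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return {"sensitivity": sensitivity, "specificity": specificity, "kappa": kappa}


if __name__ == "__main__":
    # Hypothetical site result: 100 reference positives, 100 reference negatives.
    for name, value in diagnostic_accuracy(tp=90, fp=5, fn=10, tn=95).items():
        print(f"{name}: {value:.3f}")
```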

10. Conclusions and recommendations

Laboratory-based evaluations using archived serum panels can be used to determine the ability of a new diagnostic test to detect dengue infection compared to an existing method and to determine the specificity of this test with respect to other infectious agents that often co-circulate with dengue. Field trials are needed to determine the performance and utility of these tests in a local context. A complete analysis of the data should be provided to each commercial company that supplied the kit along with recommendations for any areas of improvement. In addition, the evaluation should be published in a peer-reviewed journal as well as a WHO bulletin to allow other public health and private laboratories that perform dengue testing to access this information for future testing. Recommendations should include acceptable sensitivity and specificity values for a given test.

V. Considerations for field or clinic-based evaluations

1. Objectives of the trial

The considerations for field or clinic-based trials should be the same as for a laboratory-based evaluation (Section IV), in that the objectives of the trial vary according to the purpose for which the test is intended, such as clinical use or for surveillance, and the setting, such as its use in hospitals or clinics.

2. Rationale for the trial

This should be similar to that for laboratory-based evaluations. A literature review should be conducted to avoid duplicating what is known and to learn from the findings of earlier trials. For example, for point-of-care tests, it is important to consider whether field staff who are normally not trained to perform diagnostic tests are trained appropriately and whether the testing is feasible in the given clinical setting.

3. The organization of a prospective clinic-based evaluation

A clinic-based evaluation can be conducted by an individual clinic or as a multi-site evaluation. The selection of the site should be based on criteria such as access to dengue patients, a good standard of care and whether the site is supported by a good-quality laboratory either on site or nearby. The laboratory associated with the clinic should have the expertise and capacity to perform reference standard tests for dengue and the new tests under evaluation. The laboratory should have some form of accreditation or external proficiency testing for dengue.

4. Selecting the study population and the reference standard

A clinic-based evaluation should be conducted in populations for whom the test is intended. The performance of an NS1 test can be evaluated in dengue-infected patients, as defined by a positive culture or PCR result. See Section IV for other samples.

5. Test(s) under evaluation

The design of the evaluation can be affected by some of the characteristics of the test such as the volume of sample required, transportation of the specimen from the clinic to the laboratory and capacity for refrigeration at the clinic to store the specimens. For rapid tests, the number of steps and the time required can affect whether the test can be performed in a clinical setting. The temperature of the clinic is an important consideration for the long-term stability of the test kits when stored at room temperature. For example, if a test requires serum and the clinic does not have a centrifuge to separate the blood, this is a limitation of the field site that affects the testing of the sample and the final results.

6. Evaluation procedures

A master protocol and all site-specific protocols should be prepared and submitted to the local Ethics Review Committee for approval before the evaluations can be initiated. In general, the tests should be performed according to the manufacturer's directions. However, if a control programme needs evidence on a procedure that deviates from the manufacturer's instructions, the modified procedure should be included in the evaluation. The manufacturer must be informed of any such deviation and must agree to it before the evaluation can go forward. For example, most rapid tests specify a time of 15–20 minutes after the application of the specimen to read the results. This might not always be possible in a busy clinic. It might be useful to include in the evaluation protocol a reading after 1 hour to determine whether the test results remain the same. If they do, this would increase the usefulness of the rapid tests.

7. Conducting the evaluation

7.1. Obtaining informed consent. See the discussion of informed consent in the generic guidelines Evaluation of diagnostic tests for infectious diseases: general principles in this supplement and the sample informed consent form in APPENDIX 1. The site personnel should be trained in Good Clinical Practice (GCP).

7.2. Specimen sampling and preparation. Serum, plasma or whole blood can be used for most of the dengue tests. Samples should be collected according to the manufacturer's instructions.

7.3. Transport and storage. Ideally, aliquoted serum or plasma specimens should be stored at −20°C, or at −70°C for long-term storage. Sera should not be frozen and thawed repeatedly. When specimens are transferred from the clinic to the laboratory they should be maintained at 4°C and shipped on cold packs to the storage site. To maintain the quality of stored sera, particulate, lipaemic and haemolysed specimens and specimens contaminated with bacteria or fungus should be excluded. Sera should be filtered, aliquoted and then lyophilized and stored in a non-self-defrosting freezer. Specimen identifiers should be labelled directly on the tube and specimen inventories should be maintained. It is important to store the specimen identifiers in a computer database or bound logbook, which is periodically updated to reflect specimen use or transfer. Maintaining this database and storing specimens will facilitate additional evaluations at a later date.

The shipment of test kits must comply with International Air Transport Association (IATA) regulations and with the regulations of each country, and must be approved by the national regulatory authority if necessary. During transport, the manufacturer's directions should be followed, especially regarding temperature. Each shipment should have an electronic monitor to record the maximum and minimum temperature reached. As tests can be highly sensitive to storage and shipping conditions, these temperature data should be collated with the test results, as they may explain differences between laboratories.
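The recorded minimum and maximum temperatures can be checked against the range stated by the manufacturer before they are collated with the test results. The following is a minimal sketch only: the `ShipmentLog` record, the `temperature_excursions` function and the example limits of 2–30°C are hypothetical and should be replaced by the values given in the kit's package insert.

```python
from dataclasses import dataclass


@dataclass
class ShipmentLog:
    """Hypothetical record of one shipment's electronic temperature monitor."""
    shipment_id: str
    min_temp_c: float  # lowest temperature recorded during transport
    max_temp_c: float  # highest temperature recorded during transport


def temperature_excursions(logs, allowed_min_c, allowed_max_c):
    """Return the IDs of shipments whose recorded range left the allowed range.

    allowed_min_c and allowed_max_c are assumptions supplied by the user;
    they must come from the manufacturer's instructions for the kit.
    """
    return [
        log.shipment_id
        for log in logs
        if log.min_temp_c < allowed_min_c or log.max_temp_c > allowed_max_c
    ]


# Illustrative values only, not real shipment data.
example_logs = [
    ShipmentLog("SHIP-001", min_temp_c=4.0, max_temp_c=25.0),
    ShipmentLog("SHIP-002", min_temp_c=3.5, max_temp_c=38.0),
]
flagged = temperature_excursions(example_logs, allowed_min_c=2.0, allowed_max_c=30.0)
```

Flagged shipments can then be noted alongside the results of the kits they contained, so that any difference between laboratories can be examined in light of shipping conditions.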

It is important to follow the instructions of the manufacturer carefully for storage, such as keeping the tests out of direct sunlight and maintaining a cold chain if needed. The site should keep records of the manufacturer's lot number, expiry date, duration of storage on site, temperature and humidity at storage, state and type of packaging and time to complete use from the time the package is opened.

7.4. Training and choice of technicians, test preparation and interpretation. All clinical and laboratory personnel involved in the evaluation must comply with national workplace biosafety guidelines, including those related to the safety of laboratory personnel and the disposal of infectious waste. All samples are potentially infectious material. Training protocols should be developed to train personnel in the use of technology and in good laboratory practice. An international guideline, Good Clinical Laboratory Practice (GCLP), has been developed for laboratories involved in clinical trials17. All personnel involved should be trained in GCLP and have expertise in routine diagnosis using different methodologies.

For rapid tests, the training and experience of technicians can affect the test performance because the reading of a rapid test result is not always unequivocal. Sometimes the bands are very faint, but these do indicate a positive test, and it is a common mistake to read them as negative or doubtful. Similarly, if a strip is dented by a manufacturing or handling error, a coloured line may appear, but such lines are mostly located at the wrong place on the strip or are very thin. In such circumstances, it is prudent to repeat the test. A company-prepared buffer is supplied with the strips, and it is extremely important to use only that buffer. If for some reason this buffer runs out, it is best to ask for a replacement.

7.5. Performing the tests. The general guidelines for the use of test kits should be adopted and implemented (Box 4). Unless otherwise noted in the master evaluation protocol, all tests should be performed according to the manufacturer's instructions. Any deviation from the recommended procedure should be recorded. The reference laboratory should establish clear Standard Operating Procedures (SOPs) for conducting the evaluation and must have all facilities to perform the evaluation.

For rapid tests, the interpretation of results is subjective. Hence, it is recommended that at least two persons read the test results independently. The results of rapid tests performed in the clinic can also be evaluated against rapid tests performed by trained laboratory technicians to assess the feasibility of using these tests in field settings. In this type of agreement study, blinding is necessary to ensure the independence of test results in the evaluation. For ELISA testing, laboratory facilities and well trained technicians are required.
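Agreement between two independent readers is often summarized with a chance-corrected statistic. The sketch below uses Cohen's kappa, which is one common choice and is an assumption here rather than a method prescribed by these guidelines; the function name `cohen_kappa` and the example readings are hypothetical.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers' results on the same specimens.

    reader_a and reader_b are equal-length sequences of categorical
    readings (for example "pos" / "neg"), in the same specimen order.
    """
    if len(reader_a) != len(reader_b) or len(reader_a) == 0:
        raise ValueError("both readers must have read the same non-empty set")
    n = len(reader_a)
    # Proportion of specimens on which the two readers agree.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance, from each reader's marginal frequencies.
    count_a = Counter(reader_a)
    count_b = Counter(reader_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both readers gave one identical category throughout
    return (observed - expected) / (1 - expected)


# Illustrative readings only, not study data.
first_reader = ["pos", "pos", "neg", "neg"]
second_reader = ["pos", "neg", "neg", "neg"]
kappa = cohen_kappa(first_reader, second_reader)
```

The same function can be applied to clinic readings versus laboratory readings of the same rapid test, provided the two sets of readings were made blind to each other.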

7.6. Quality assurance. The manufacturer's controls should always be included in every assay to ensure correct functioning of reagents according to the manufacturer's specifications. Principal investigators should work with manufacturers and the designated laboratories to ensure that staff are adequately trained and instrumentation conforms to GLP. Both test kits and samples will be tracked from the manufacturer to the reference centres and to/from the evaluation laboratories.

7.7. Data management. The results of the two readings of the diagnostic test under evaluation should be recorded in separate notebooks to ensure independent interpretation of results. Both the results from the test under evaluation and the reference standard should then be entered into a spreadsheet, together with the information on the subject's sex and age and a limited set of variables (such as treatment status and duration of symptoms). Double entry of data is recommended to minimize errors. The collected information as well as the frozen serum samples should be kept until the study has ended and its results have been published. A sample clinic data collection form is shown in APPENDIX 1.
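Double entry is only useful if the two entries are compared and every discrepancy is resolved against the source notebook or form. A minimal sketch of such a comparison follows, assuming each entry pass is held as a dictionary keyed by subject identifier; the function name, field names and values are hypothetical.

```python
def double_entry_discrepancies(first_entry, second_entry):
    """List every mismatch between two independent data-entry passes.

    Each argument maps a subject ID to a dict of field -> value.
    Returns tuples of (subject_id, field, value_in_first, value_in_second).
    A subject present in only one pass is reported with field "<record>".
    """
    problems = []
    for subject_id in sorted(set(first_entry) | set(second_entry)):
        rec1 = first_entry.get(subject_id)
        rec2 = second_entry.get(subject_id)
        if rec1 is None or rec2 is None:
            problems.append((subject_id, "<record>", rec1, rec2))
            continue
        for field in sorted(set(rec1) | set(rec2)):
            if rec1.get(field) != rec2.get(field):
                problems.append((subject_id, field, rec1.get(field), rec2.get(field)))
    return problems


# Illustrative records only, not study data.
entry_1 = {
    "S001": {"age": 24, "sex": "F", "test": "pos", "reference": "pos"},
    "S002": {"age": 31, "sex": "M", "test": "neg", "reference": "neg"},
}
entry_2 = {
    "S001": {"age": 24, "sex": "F", "test": "pos", "reference": "pos"},
    "S002": {"age": 13, "sex": "M", "test": "neg", "reference": "neg"},
}
mismatches = double_entry_discrepancies(entry_1, entry_2)
```

An empty list indicates that the two passes agree; any tuple returned should be checked against the original record before analysis.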

7.8. Data analysis. The sensitivity, specificity and 95% confidence intervals should be calculated for each test under evaluation compared to the results obtained using the reference standard. In Phase III studies with prospective recruitment of patients, positive and negative predictive values of the new test should also be given. They should not be given for case-control designs, because the frequency of disease in such studies is artificially determined and does not reflect the real prevalence needed for a meaningful interpretation of predictive values.
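These quantities can be computed from the 2×2 table of the test under evaluation against the reference standard. The sketch below is illustrative: the guidelines do not specify an interval method, so the Wilson score interval is used here as an assumption, and the function names and counts are hypothetical. Predictive values are returned only when the design is flagged as prospective.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def accuracy_summary(tp, fp, fn, tn, prospective=False):
    """Sensitivity and specificity with intervals from a 2x2 table.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives against the reference standard.
    Predictive values are added only for prospective designs, where the
    observed frequency of disease reflects the real prevalence.
    """
    summary = {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }
    if prospective:
        summary["ppv"] = (tp / (tp + fp), wilson_interval(tp, tp + fp))
        summary["npv"] = (tn / (tn + fn), wilson_interval(tn, tn + fn))
    return summary


# Illustrative counts only, not study data.
result = accuracy_summary(tp=90, fp=5, fn=10, tn=95)
sensitivity, (sens_low, sens_high) = result["sensitivity"]
```

With these example counts the sensitivity is 0.90 and the specificity 0.95; other interval methods, such as exact binomial intervals, would give slightly different limits.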

7.9. Conclusions and recommendations. The appropriate forum in which to disseminate the results of each evaluation will be determined by the investigators, and publication of the results in peer-reviewed journals is encouraged.