Nuts, bolts, and tiny little screws: how Clinical Evidence works
In a nutshell, in Clinical Evidence (CE) we summarise the current state of knowledge - and uncertainty - about interventions used to prevent and treat important clinical conditions. We do it by searching and appraising the literature to create rigorous systematic reviews of evidence on the benefits and harms of clinical interventions.
There. Sounds simple enough, doesn't it? Keep reading...
Complication number 1: summarising necessarily means excluding some detail, and CE users should keep in mind the limitations of what we present. It's not possible to make global statements that are both useful and that apply to every patient in every clinical context.
For example, when we say we found evidence that a drug is beneficial, it means there is evidence that the drug has been shown to deliver more benefits than harms when assessed in at least one group of people, using at least one outcome at a particular point in time.
It doesn't mean that the drug will be effective in all people, or that other outcomes will be improved, or even that the same outcome will be improved at a different time after the treatment. Not so simple any more.
Our categorisation of interventions
Each systematic review (SR) contains a page that lists key clinical questions and interventions and describes whether they have been found to be effective or not.
We developed the following categories of effectiveness from one of the Cochrane Collaboration's first and most popular products, A guide to effective care in pregnancy and childbirth.
Of course, fitting interventions into these categories is not always straightforward. For one thing, the categories represent a mix of several hierarchies: the size of benefit (or harm), the strength of evidence (RCT or observational data), and the degree of certainty around the finding (represented by the confidence interval [CI]).
Additionally, much of the evidence most relevant to clinical decisions relates to comparisons between different interventions rather than with placebo or no intervention. Where appropriate, we indicate the comparisons (and, yes, it can get complicated).
A third consideration is that interventions may have been tested, or found to be effective, in only one group of people - such as those at high risk of an outcome. Again, we indicate this where possible.
But perhaps trickiest of all is trying to maintain consistency across our reviews. There isn't a convenient web tool for that - it just takes a lot of work.
We continually revisit the criteria for categorising interventions. And we're pragmatic about it: interventions that can't be tested in an RCT for ethical or practical reasons are sometimes included in the categorisation table and identified with an asterisk and an explanatory comment. For more on our adventures in categorisation, see Efficacy categorisations, below.
How Clinical Evidence is put together
Our systematic reviews have to tick two crucial boxes, which, together, form the CE mantra: our reviews must be reliable, and they have to be relevant to clinical practice.
As you'd expect, we aim to cover the most common and/or important clinical conditions seen in primary and hospital care. In deciding what to cover, we:
- Review national data on consultation rates, morbidity, and mortality
- Take account of national priorities for health care, such as those outlined in the UK National Service Frameworks and the US Institute of Medicine reports
- Take advice from clinicians and patient groups.
Planning our reviews - selecting the questions
Our clinical questions are the spine of Clinical Evidence. They address the benefits and harms of preventive and therapeutic interventions - with emphasis on outcomes that matter to patients.
Our section advisors and contributors choose questions for their relevance to clinical practice, in collaboration with primary care clinicians and our own Editors.
But constructing CE doesn't (and shouldn't) stop with our own ideas: our readers suggest new clinical questions via the contact us page. Please feel free to get involved!
Searching and appraising the literature
Our search covers Medline, Embase, the Cochrane Library and, where appropriate, other electronic databases. For more, see The systematic search, below; and if you have the stomach for the technical details, see our search process.
Summarising the evidence, peer review, and editing
The search completed, we begin in earnest the process of writing our reviews, involving contributors and peer reviewers overseen by our valiant in-house Editors (see How we write Clinical Evidence reviews, below).
To help clinicians put the evidence in CE into practice, our reviews link to Drug safety alerts, the full text of relevant major guidelines, and updates via the BMJ Updates service (see Adding value to the core evidence, below).
Feedback, error corrections, and user responses
Feedback is crucial for assessing the validity of the information we publish, and for improving the way our reviews are presented and distributed.
An experienced editor processes reader responses and ideas, and deals with anything urgent straight away. Please feel free to contact us with your comments about CE.
In addition to general feedback, individual reviews have a 'Your Responses' facility similar to the BMJ's successful 'Rapid Responses' service. To submit a comment - whether it's about a question we don't yet cover, or a comment on how our evidence is related to practice, or anything else you think is important - click on the link to the right of the relevant systematic review.
Our review updating process - intelligent updating
Much has been published on issues around the methodology of systematic reviews, including the QUOROM statement. Yet there is no consensus and little research about how frequently systematic reviews should be updated.
We used to try to update every CE review annually, until it became clear that not all of them actually benefited from an annual update - in particular those where there is little high-quality, important new evidence published on a regular basis.
Our experience is supported by studies that found changes to the conclusions in less than 10% of updated reviews, and that only 23% of systematic reviews actually merit an update within 2 years (as determined by substantial changes to data on effectiveness or harms of treatments).
Too much information?
Too-frequent updating of reviews where new research is sparse may even be harmful: there may be a temptation to change a conclusion on the basis of a few small studies, only to change it back when more and larger trials complete the research picture.
Move to continuous updating
Reviewing the options, it became clear that the crucial question was not when to update, but why.
We wanted a more sophisticated approach to our handling of information, in line with other large producers of systematic reviews, such as the Cochrane Library.
To that end, we developed a new updating strategy for CE reviews, based on the content of the review, the number of new RCTs and systematic reviews being published on the topic each year, and the popularity of the review with our readers.
Tailored updating schedules
So each review now has a tailored updating schedule that relates to the review itself, rather than to the blunt instrument of a rigid schedule. These tailored plans are helping us to produce better-quality systematic reviews and make best use of our authors' and reviewers' time.
The updating schedule is supported by a continuous review of the evidence by Evidence Updates.
So, that was the overview. We hope you're sufficiently interested to read more - we're very proud of what we produce, and we believe that what we do is important. We hope the following information answers any questions you might have, and inspires confidence in what you read in Clinical Evidence.
The devils in the detail: how we write Clinical Evidence reviews
We design our reviews to be both relevant to clinical practice and academically robust. CE is not 'research led', meaning that we're not interested in evaluating and reporting all research in any given area. Rather, we are clinically led, in that we look to answer specific clinical questions that come up in everyday clinical practice.
... and how we don't write them
We generally don't:
- Report on drugs that are still in the experimental stage and not in common use
- Report on treatments that have been withdrawn
- Tell people what to do. Rather, we look to inform their decision making process.
We have a cunning (review) plan
Our review-planning process ensures that we address the clinically important questions and treatment options for each review. This process results in a review protocol (plan) that specifies what we do and don't include.
Early in this process we decide on the quality of evidence to include.
In the Benefits section of our reviews, we usually present evidence only from systematic reviews (SRs) of randomised controlled trials (RCTs) or from individual RCTs. Well-conducted RCTs provide the most robust evidence of all epidemiological study designs, particularly compared with observational studies, which may be more subject to confounding or bias.
But (there's always a but)...
RCTs are not always ethical or practicable. In such cases we may include less methodologically robust studies, such as observational studies; for example, in our Sudden Infant Death Syndrome (SIDS) review, where RCTs on some interventions may be considered unethical.
We allow studies less robust than RCTs in other parts of the reviews: for example, in our background information we may cite cohort- or population-based data in the prognosis section. We may also cite less-robust studies in a Comment section as supporting information.
In addition, for harms, we recognise that RCTs may be underpowered to detect some adverse effects, so we may include non-RCT data. However, in most CE reviews, we only report systematic reviews of RCTs or individual RCTs. It is on these high-quality studies that we base our categorisations.
The review plan specifies:
- The population to be included
- The interventions to be assessed
- The comparisons to be examined
- And the outcomes to be reported.
It also specifies the type of studies to be included (SRs, RCTs, etc), and the minimum quality criteria for studies: these may include:
- Study size
- Degree of blinding
- Level of follow-up
- And study duration, as well as other individually specified criteria.
The default criteria for an RCT are that it:
- Included at least 20 people
- Had at least 10 people per arm
- Was blinded
- And had at least an 80% follow-up of the people initially randomised.
However, these criteria vary between reviews depending on the subject area: for example, where little or no evidence is available, we may include RCTs of fewer than 20 people. In other reviews, we may require that studies attain a specified duration of follow-up, or allow RCTs to be unblinded.
Surgery or physical interventions such as exercise may be hard or impossible to blind. In each case, however, we specify the minimum quality criteria before doing the search or writing the review. We clearly specify the minimum quality criteria in the Methods section of each review.
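The default screen described above can be sketched as a simple filter. This is only an illustration, not CE's actual tooling; the record fields (`arm_sizes`, `blinded`, `followed_up`) are hypothetical names invented for the example.

```python
# A minimal sketch (hypothetical field names) of the default RCT quality
# screen: at least 20 participants, at least 10 per arm, blinded, and at
# least 80% follow-up of the people initially randomised.

def meets_default_criteria(trial: dict) -> bool:
    """Return True if a trial record passes the default screen."""
    arms = trial["arm_sizes"]            # participants randomised per arm
    total = sum(arms)
    followed = trial["followed_up"]      # participants followed to the end
    return (
        total >= 20
        and min(arms) >= 10
        and trial["blinded"]
        and followed / total >= 0.80
    )

# Example: a 60-person, two-arm, blinded trial with 50 followed up
trial = {"arm_sizes": [30, 30], "blinded": True, "followed_up": 50}
print(meets_default_criteria(trial))  # -> True (50/60 is ~83% follow-up)
```

As the surrounding text notes, a real review plan would relax or tighten these thresholds per subject area rather than apply them mechanically.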
The systematic search
We base our reviews on systematic searches of Medline, Embase, and the Cochrane Library.
First appraisal: cutting down on 'noise' to home in on relevant and high-quality SRs of RCTs and individual RCTs
Having completed the initial search, our Information Specialists (ISs) make a first pass assessment of the abstracts of the studies they've identified against the criteria in the review plan. This includes the minimum quality inclusion criteria and other specified criteria such as whether the study included the correct population or examined the intervention of interest.
The IS then excludes all studies not attaining the minimum quality criteria, and sends a list of the remaining studies to the contributor(s) of the review for assessment. This does not mean that these studies will be included in the review: after further examination of the full report, other reasons may lead to an individual study being excluded.
Second appraisal: the role of contributors
Our expert contributors (their expertise may be clinical or epidemiological, and they often work in a team of contributors with a range of such skills) are often well published, and may have authored previous systematic reviews or RCTs in the area covered by the review.
Having assessed the list of studies appraised by our IS, the contributors obtain full papers of the studies to be included. Using an inclusion/exclusion document, they give reasons for all exclusions.
They then go ahead and report the selected evidence.
Methods or madness
In the case of an update, we perform our systematic search from the search date of the last published version to the present date. We publish the search dates at the beginning of each review, and in the Methods section, which also reports the databases interrogated and the issue date of the Cochrane database searched.
Our Methods also report the type of studies included in that review and the minimum quality criteria for including studies.
Reporting the benefits and harms of treatments
The principles of reporting and synthesising evidence are the same whether we're updating a review or writing a new one from scratch, so the following explanation relates to the more common scenario of updating an existing review.
We try to use language that's as precise as possible, as well as a common terminology and way of reporting data across all of our reviews. To this end, all submissions undergo a rigorous internal editing process (see below).
So that readers can judge how generalisable the results of any individual trial or meta-analysis may be, we present evidence using a PICOT system. This covers, wherever these parameters are available:
- Population included
- Intervention assessed
- Comparison tested
- Outcome involved
- Timeframe measured.
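The five PICOT parameters above lend themselves to a simple structured record. The sketch below is purely illustrative (the class and field names are our own, not CE's), but it shows how a PICOT description pins down exactly what question a piece of evidence answers.

```python
# Illustrative sketch: a PICOT description as a structured record.
# Class and field names are invented for this example.
from dataclasses import dataclass

@dataclass
class Picot:
    population: str    # who was studied
    intervention: str  # what was assessed
    comparison: str    # what it was tested against
    outcome: str       # what was measured
    timeframe: str     # when it was measured

    def summary(self) -> str:
        return (f"In {self.population}, does {self.intervention} "
                f"versus {self.comparison} improve {self.outcome} "
                f"at {self.timeframe}?")

q = Picot("adults with type 2 diabetes", "metformin", "placebo",
          "glycated haemoglobin", "6 months")
print(q.summary())
```

Framing a result this way makes the limits of generalisability explicit: change any one field and you have, strictly speaking, a different question.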
In practice, most of our reviews report SRs of RCTs and RCTs (see Critical appraisal criteria). In reporting data from any study, we follow some basic principles:
- We don't do meta-analysis or recalculate results of trials ourselves but act as a secondary source and report what has been published in the public domain (different statistical packages may give differing results and can even reverse the published conclusions of a trial; we would, however, comment if we thought an analysis was inappropriate or incorrect)
- When reporting an individual result, we cite the specific source where this analysis is presented (in order to maintain 'drillability')
- We don't report on results presented in abstracts (these do not allow a proper scrutiny of methodology of the trial, are often sparsely reported, and many don't go on to full publication)
- We include data from abstracts when they have been incorporated into an analysis performed by a systematic review (in this case, we treat the study as reported via the review, but may note in our reporting of the review that the trial has not otherwise been published in full)
- We only report on parameters (e.g., population, interventions, comparisons, and outcomes) prespecified in the review plan.
Include me out?
We would potentially report any SR of sufficient quality that we found, and any subsequent RCTs published after the search date of the review. Occasionally, we might also report RCTs published before the search date of the SR we found that were not included in it. We call these 'additional' RCTs. This sometimes happens if an RCT does not fulfil the inclusion criteria for a reported review, which may be narrower than our own criteria. Alternatively, it may not be clear why the additional RCT was excluded from the reported review.
For updates, we add any new SRs or RCTs to the existing text in the CE review (which may already report SRs or RCTs). Before adding a new SR, we check whether the results of any already-reported RCTs are included in the new review. If so, we remove the original reporting of the RCT so that we don't report the same data twice (i.e., both as an individual result and as part of a meta-analysis). This is important, as reporting data twice overestimates the amount of evidence.
We don't necessarily report all the SRs we find, usually because some reviews are outdated, in that later reviews may include data not available to the earlier reviews. When we have to make choices about which reviews to report - such as when several reviews report similar findings - our guiding principle is to try to report the most recent, largest, and most methodologically robust review.
When two reviews report different conclusions even when examining the same data, though, we would report both reviews, as another guiding principle of CE is that we report the span of any evidence we find.
We may also report more than one review where they performed different types of analysis or used different data sources. Where we do this, we often describe the relationship between them: for example, any overlap of included RCTs or differing inclusion criteria.
Lies, damned lies...
When reporting the statistical results of studies, we report the test statistic reported in the study. Given a choice, we would probably choose to report the relative risk (RR). But studies report a variety of test statistics (P value, RR, odds ratio [OR], hazard ratio [HR], weighted mean difference [WMD], standardised mean difference [SMD], etc) depending on the data used and the analysis performed. In practice, the two most commonly reported test statistics in CE are the RR and P value.
Each test statistic has its own strengths and weaknesses. In light of this, for analysis suggesting statistical benefit with a particular treatment, we also try to report absolute data. This may be, say, the absolute numbers of people improved against the denominator in each group in the case of categorical data, or the numbers of RCTs and people included in an analysis of continuous data.
Of course, any reporting of absolute data is limited by the information supplied in the original study, and where absolute data is not reported, we say so, as we consider this a potentially important omission.
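To make the relationship between relative and absolute data concrete, here is a small worked sketch with made-up numbers (it is not CE's reporting code; the function names are our own):

```python
# Illustrative only: the link between a relative risk and the absolute
# data behind it, for a two-arm trial with events/denominator per arm.

def relative_risk(events_tx, n_tx, events_ctl, n_ctl):
    """RR = risk in treatment arm / risk in control arm."""
    return (events_tx / n_tx) / (events_ctl / n_ctl)

def absolute_risk_difference(events_tx, n_tx, events_ctl, n_ctl):
    """Risk in treatment arm minus risk in control arm."""
    return events_tx / n_tx - events_ctl / n_ctl

# Made-up example: 10/100 people improved with treatment v 5/100 with control
rr = relative_risk(10, 100, 5, 100)               # 2.0: risk doubled
ard = absolute_risk_difference(10, 100, 5, 100)   # 0.05: 5 more per 100
print(rr, ard)
```

The same RR of 2.0 could equally describe 2/100 v 1/100, where the absolute effect is far smaller, which is exactly why we try to report the absolute numbers alongside the test statistic.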
We also report the effects of interventions on prespecified outcomes. All our reporting is outcome based, and the outcomes we report in any review are listed in the Outcomes section at the beginning of the review.
Reporting the outcomes that matter
We always preferentially report on clinical outcomes: that is, ones that matter to people, such as mortality, morbidity, number of people improved, etc. We try to avoid laboratory or proxy outcomes.
For example, in our Fracture prevention review we report on fractures prevented, not on changes in bone density measured by scans, which may or may not eventually result in fractures.
Occasionally, we may agree to report laboratory outcomes, particularly when reporting on clinical outcomes is scarce and where the laboratory outcomes are commonly used in management or are considered strongly related to prognosis.
For example, we report effects on glycated haemoglobin in our diabetes reviews and thyroid function tests in our thyroid reviews.
However, we use clinical outcomes alone whenever we can. Hence, in reporting any trial data, we only report results that relate to our prespecified outcomes of interest, rather than all outcomes reported in a study.
Where's the harm? Reporting adverse effects
We report any adverse effects found by the trials we included; but RCTs may be underpowered to detect harms, and so we might include non-RCT data that provide details on relevant adverse effects. We also report relevant warnings from bodies such as the FDA and MDA.
We recognise, though, that adverse effects are often under-reported, and that readers should read other sources of evidence, including observational data, case reports, warnings, prescription guides, etc, to get a comprehensive view of harms associated with an intervention.
Reporting expert comments and clinical guidance
Our Benefits and Harms reporting is strictly governed by the parameters we set in the review plan. In the Comment and Clinical guide sections, however, our contributors have more leeway to report or comment on the evidence.
In the Comment, they may choose to:
- Cite lower-quality evidence or evidence outside our designated group of interest as supporting or background information
- Comment on the evidence overall
- Highlight important omissions in the current evidence.
In the Clinical guide we invite our contributors to put the evidence into some sort of clinical context. This may involve reporting any current clinical consensus, or reasons why a particular intervention is not currently used in routine practice.
Again, we don't tell people what to do: for example, we would never state that people should be put on one particular drug dose rather than another. We simply present the best quality available evidence of effectiveness to inform people's decision-making process.
Applying GRADE analysis
For selected outcomes for which we find evidence, we report a GRADE statement with an assessment of the quality of that evidence.
The GRADE statement and quality assessment are based only on the data presented in the Benefits and Harms sections, the reporting of which is in turn based on the parameters outlined in the review plan for which a systematic search has been performed.
We don't report a GRADE statement for material presented in Comments or Clinical guides.
Reporting the 'take home' messages
For each review, we present a quickly digestible summary of its Key points. Statements in the Key points always relate to material reported in the review, and usually to material in Benefits and Harms, although occasionally they may refer to issues raised in Comments or Clinical guides.
We repeat these Key points in the relevant option, so that readers can clearly see the evidence from which a particular statement has been distilled.
We categorise each intervention or comparison into one of five categories (see Our categorisation of interventions, above). When we add extra evidence at update, we assess whether this is strong enough to alter the previous categorisation, or whether (as is often the case) it strengthens it.
The process of categorisation is, of course, subjective to some extent: for example, the number and quality of studies on which we base a categorisation may vary widely across subject areas. In one review with a lot of evidence a large RCT may be necessary to achieve a Likely to be beneficial categorisation, whereas in another review with very little evidence, and little prospect of more, a much smaller RCT might suffice.
Different people may also bring different value judgements with regard to any presented evidence. Hence, each subject area is, to some extent, an individual case.
In practice, because of the subjective nature of the assessment, the categorisations of interventions are often one of the most discussed parts of the review.
The categorisation table entry will usually just list the intervention (e.g., intervention X) under the selected categorisation (e.g., Likely to be beneficial). Sometimes, though, the table entry gives further reasons for a categorisation. For example: "Intervention X (better than placebo; however, no evidence against other treatments)" or "Intervention X (may be more effective than intervention Y)."
Rarely, when there is little or no evidence, we categorise an intervention based on consensus. We do this in cases of broad consensus, or where doing otherwise goes against common sense. For example, in our organophosphorus poisoning review, we categorise "removing the contaminated clothes and washing the contaminated person" as Likely to be beneficial even though we find no direct evidence.
However, we categorise by consensus rarely, as what can seem obvious at one point in time may be refuted later by new evidence. For most interventions, if there is no good evidence, we say so and categorise as Unknown Effectiveness.
None but the brave: the editorial process
To ensure methodological quality, adherence to the review plan, and precise and comprehensive reporting of the evidence, all CE reviews undergo a rigorous internal edit. We do this without exception whenever any new material is added to any review.
Our in-house editors, trained in evidence-based medicine and with a range of skills (clinical, pharmaceutical, and scientific), oversee the editorial process through to publication.
Their job starts at the initial review-planning stage.
For a straightforward update of a condition, the systematic search goes ahead based on the existing review plan. When we're adding new options, though, the Editor works with the contributor(s) to ensure that the new review plan precisely captures the new intervention and comparisons to be added.
Like a rolling stone?
'Editor' is one of those vague and variable job titles that mean something completely different depending on whether you're working on Clinical Evidence or the Cochrane Library as opposed to the Times or Rolling Stone magazine. So here's an insight into what our Editors actually do.
- First and foremost, ensure that the new studies fulfil the inclusion criteria in the review plan. This may result in some papers being excluded or moved to the Comments section as background information.
- Rigorously check all new reporting of data for accuracy against the original papers.
- Establish whether the new papers report on any other designated outcomes of interest that should also be reported in the CE review.
- Check that deletions or amendments of existing reporting in the review are appropriate (e.g., when a contributor wishes to eclipse reporting of an older SR with a newer one).
- Confirm that the impact of the new material reported in Benefits, Harms, and Comments is reflected in the summary statements, key points, and categorisations.
- Style the whole review so that it conforms to Clinical Evidence style. CE uses a very precise form of reporting statistics and of wording to ensure consistency between reviews, and it can be difficult for authors to reproduce this when writing a CE review, particularly as other academic journals may not apply the same rigorous restrictions on language and reporting.
- Identify issues for clarification such as:
- Questions relating to categorisations
- Suggestions for the omission or inclusion of studies
- Requests for additional text
- Suggestions for the addition of further data (in practice, most often the addition of absolute data)
- Feedback from either internal or external peer review.
There is no "typical edit": the scope of any update can vary from adding one RCT through to a rewrite of every option, depending on the new evidence identified.
Although we prespecify what is to be included in any review, in practice there are always borderline decisions, which invariably involve value judgements. These decisions may ultimately affect categorisations: for example, which review to favour/base conclusions on when reviews have different results, and whether to include a particular RCT.
At this stage, the Editor will decide what further input is required. For example, for a new review we would ask for external peer review. With routine updates, though, the Editor will ask an editorial colleague (CE Editor) to independently look at the review (we call this a 'pal edit'), and may even ask more than one colleague to look at different aspects of the review.
Again, there is a subjective element to assigning an overall categorisation that may rely on value judgements. One of the most difficult scenarios is where a number of studies show no difference between an intervention and placebo. The question here is whether there is no evidence of effectiveness or evidence of ineffectiveness.
Here, the Editor and contributors together reach a decision that is supported by the evidence cited; but they might add caveats to the entry in the categorisation table or Key points to highlight important issues.
In addition, we keep an internal Editorial Action Sheet (EAS) for every review, which lists:
- Any actions we might wish to consider at the next update
- Feedback from either internal or external sources
- Suggestions for the addition of possible further interventions.
The Editor returns the edited review to the contributors for further revision, and it may go back and forth between them a number of times until both parties are completely happy.
After a thorough copy edit, where the review is re-checked for style, consistency, and technical issues (e.g., drug units, RINN drug names), and another check by the contributor, a senior editor not involved in the editing of the review is responsible for the final sign-off. This is usually the Editor or Deputy Editor of Clinical Evidence.
A key factor at this stage is further independent scrutiny of the categorisations. The senior editor examines the categorisations with regard to two key questions: are the individual categorisations supported by and consistent with the evidence we present; and are the individual categorisations within the review consistent with each other.
Conflicts of interest
As part of our policy of openness and transparency, all CE authors complete a conflict-of-interest form, which is presented at the end of each review. Internal CE Editors also complete a similar form.
In writing the CE review our contributors may have to report their own studies. This is not unusual in that we specifically look for leaders in any particular field when searching for authors of our reviews, and such people are usually well published. Where this occurs, the contributor(s) will declare that they are the author of included studies in the conflict-of-interest statement.
We expect our contributors to review their own work as they would other published studies: that is, without fear or favour, and we believe that our explicit processes ensure that this happens. In practice, we have not found bias towards the contributors' own work to be an issue. In fact, the converse often occurs, with authors understating the importance of their own work, an issue which is picked up and addressed during the internal editing process.
In synthesising and reporting evidence at Clinical Evidence, we think our processes ensure that we are as objective as we can be. However, for anyone involved in the assessment of evidence, it quickly becomes clear that there are always subjective and borderline decisions to be made, and value judgements may be brought to bear in some areas.
In light of this, at CE we have a policy of openness and transparency in order to open up our decision-making process to external scrutiny.
We realise that our rigorous quality-assurance processes, the precision of our wording and reporting, the extensive feedback and requests for alterations and amendments from our internal editorial review, as well as the comments identified from external peer review, throw a considerable amount of work onto the contributors of CE reviews. We don't for one moment underestimate the amount of time it takes to write or update a CE review, and the degree of expertise and clinical knowledge that our authors bring to bear in this process. We are unfailingly grateful to them for both the time they spend writing and amending their reviews, and for the extremely high quality of their work.
And finally: adding value to the core evidence
Drug safety alerts
If important information on drug safety is issued from regulatory authorities or any other reputable source, we aim to add a drug safety alert to all reviews mentioning the drug within 72 hours. The alert contains a link to the source of the drug safety alert for more information. The information prompting an alert is processed, together with any new evidence we may find, for the next update of the review.
Guidelines
To help clinicians put the evidence in CE into practice, our reviews link to the full text of major guidelines relevant to the review's clinical area.
All linked guidelines have been produced by national or international government sources, professional medical organisations, or medical speciality societies, and have met predetermined quality requirements.
New guidelines are added regularly, and old ones are replaced by their revised versions as these are published.
BMJ Updates
In addition to our updating cycle, we add details of clinically important studies to the relevant reviews as they are published, using the BMJ Updates service. BMJ Updates is produced by a collaboration between the BMJ Group and McMaster University's internationally acclaimed Health Information Research Unit, to provide clinicians with access to current best evidence from research.
All citations (from over 110 premier clinical journals) are rated by trained researchers for quality, then for clinical relevance, importance, and interest by at least three members of a worldwide panel of practising physicians. The final content is indexed by health professionals to allow news of studies to be added to all relevant CE reviews.
Declaration of competing interests
As part of our policy of maximum transparency, our authors declare any competing interests each time their review is published or updated. This declaration is published with the systematic review.
1. Enkin M, Keirse M, Renfrew M, et al. A guide to effective care in pregnancy and childbirth. Oxford: Oxford University Press, 1998.
2. Moher D, Cook DJ, Eastwood S, et al. Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Lancet 1999;354:1896-1900.
3. Shojania KG, Sampson M, Ansari MT, et al. How quickly do systematic reviews go out of date? A survival analysis. Ann Intern Med 2007;147:224-233.
4. French SD, McDonald S, McKenzie JE, et al. Investing in updating: how do conclusions change when Cochrane systematic reviews are updated? BMC Med Res Methodol 2005;5:33.
5. Moher D, Tsertsvadze A, Tricco A, et al. When and how to update systematic reviews. Cochrane Database of Systematic Reviews 2008, Issue 1. Art. No.: MR000023. DOI:10.1002/14651858.MR000023.pub3.