Introduction to text mining and machine learning in systematic reviews

By Tom Roper, Clinical Librarian, Royal Sussex County Hospital

A group of librarians from NICE, Public Health England, universities and NHS Library and Knowledge services were privileged to attend a workshop on Text Mining and Machine Learning in Systematic Reviews, led by [James Thomas](http://iris.ucl.ac.uk/iris/browse/profile?upi=JTHOA32), Professor of Social Research and Policy at the EPPI-Centre. James designed [EPPI-Reviewer](https://eppi.ioe.ac.uk/CMS/Default.aspx?alias=eppi.ioe.ac.uk/cms/er4), software to manage all types of literature review, including systematic reviews, meta-analyses, 'narrative' reviews and meta-ethnographies, and leads Cochrane's [Project Transform](https://community.cochrane.org/help/tools-and-software/project-transform).

James outlined the problem: we systematically lose research, and then spend a great deal of effort and money on trying to find it again. We need to use correct methods, and, moreover, need to be seen to be correct. There are quantitative issues as well: Cochrane reviewers screen more than 2 million citations a year. Can this considerable human effort be made more manageable by the judicious use of text mining and machine learning? While tools are being developed to help with this task, their development is uneven, as is their adoption.

James distinguished between three types of machine learning: rules-based (unfashionable in computer science circles, he warned), unsupervised, and supervised, and gave us opportunities to try out tools based on these approaches using our own devices.

Rules-based approaches are accurate but fragile – they either work or fail completely. Unsupervised approaches leave the machine to identify patterns in the data for itself, for example by clustering documents; [LDAvis](http://eppi.ioe.ac.uk/ldavis/index.html#topic=6&lambda=0.63&term=) is one such tool, based, you don't need me to tell you, on Latent Dirichlet Allocation.
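
To make the unsupervised idea concrete, here is a minimal sketch using scikit-learn's LDA implementation on a handful of invented abstracts; the abstracts, the choice of two topics and the top-word listing are all illustrative assumptions of mine, not anything shown at the workshop.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Invented abstracts standing in for a set of database records
abstracts = [
    "randomised trial of statin therapy for cardiovascular prevention",
    "cohort study of statin adherence and cardiovascular outcomes",
    "qualitative study of patient experience in primary care",
    "interviews exploring patient experience of chronic illness",
]

# Convert free text to a document-term matrix of word counts
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(abstracts)

# Fit LDA with two latent topics; no human labelling is involved
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)

# Show the top words the model associates with each discovered topic
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_words = [terms[j] for j in topic.argsort()[-4:][::-1]]
    print(f"Topic {i}: {', '.join(top_words)}")
```

Run on a real citation set, the same approach groups records into themes without anyone telling the machine what the themes are – which is exactly the appeal, and the risk, of unsupervised methods.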

Supervised approaches require a human or humans to give the machine training data; from this (a 280,000-row spreadsheet in one example James quoted) a statistical model can be constructed and then applied to new material to determine whether or not a study is a randomised controlled trial. Training data comes from people, and includes data generated for other purposes, data created for the project itself, and crowd-sourced data, as in the case of [Cochrane Crowd](http://crowd.cochrane.org/index.html), which mobilises Cochrane Citizen Scientists to decide whether or not the subject of a database record is an RCT.
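
A minimal sketch of what such a supervised classifier might look like, assuming scikit-learn and a toy set of hand-labelled titles; real systems such as Cochrane's RCT classifier train on vastly more data and richer features, so treat this as an illustration of the shape of the approach rather than the approach itself.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: titles hand-labelled as RCT (1) or not (0)
titles = [
    "A randomised controlled trial of exercise for depression",
    "Double-blind placebo-controlled trial of vitamin D supplementation",
    "A qualitative exploration of nurses' attitudes to handover",
    "Case report: an unusual presentation of sarcoidosis",
]
is_rct = [1, 1, 0, 0]

# TF-IDF features feeding a logistic-regression classifier
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(titles, is_rct)

# Estimate the probability that an unseen record describes an RCT
new_title = ["Randomised trial of early mobilisation after stroke"]
print(model.predict_proba(new_title)[0][1])
```

The probability output matters: rather than a hard yes/no, a screening workflow can send confident predictions straight through and route borderline records to human reviewers.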

In systematic reviews, these approaches may be used to identify studies by citation screening or classification, to map research activity, and to automate data extraction, including performing Risk of Bias assessment and extracting statistical data. Readers may be familiar with tools that take a known set of citations and use word frequency counts, or analysis of phrases and adjacent terms, to create word or phrase lists or visualisations. Similarly, term extraction and automatic clustering can be used to perform statistical and linguistic analysis on text, for human review and, if deemed useful, modification of an initial search strategy. [Voyant Tools](https://voyant-tools.org/) is one example, as are [Bibexcel](https://homepage.univie.ac.at/juan.gorraiz/bibexcel/), [Termine](http://www.nactem.ac.uk/software/termine/) and even the use of EndNote's subject bibliography feature to generate lists of keywords.
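
As a rough illustration of the frequency-counting idea behind such tools, here is a short sketch in plain Python; the citation titles and stopword list are invented for the example, and real term-extraction tools use far more sophisticated linguistic processing.

```python
import re
from collections import Counter

# Invented titles standing in for a known set of relevant citations
citations = [
    "Clinical librarian services and patient outcomes in acute care",
    "Impact of clinical librarian support on evidence based practice",
    "Evaluating librarian mediated literature searching in hospitals",
]

stopwords = {"and", "of", "on", "in", "the", "for"}

unigrams, bigrams = Counter(), Counter()
for title in citations:
    words = [w for w in re.findall(r"[a-z]+", title.lower())
             if w not in stopwords]
    unigrams.update(words)                 # single candidate terms
    bigrams.update(zip(words, words[1:]))  # adjacent-term phrases

print(unigrams.most_common(5))
print(bigrams.most_common(3))
```

Frequent terms and phrases from a known relevant set ("clinical librarian", say) are exactly the candidates a searcher might weigh up when revising a strategy.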

Citation networks can be used for supplementary searching – will this change, James asked, if or when all bibliographic data becomes open? Useful tools here, apart from traditional ones such as Web of Science, include [VOSviewer](http://www.vosviewer.com/). We also spent some time playing with [EPPI-Reviewer](https://eppi.ioe.ac.uk/eppireviewer-web/home), the EPPI-Centre's own tool for systematic reviewers, and with [Carrot2 Search](http://search.carrot2.org/stable/search).
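
For the citation-network idea, a minimal sketch using the networkx library and hypothetical paper identifiers, ranking candidate papers by how many included studies cite them – a crude stand-in for the co-citation analysis tools like VOSviewer perform at scale.

```python
import networkx as nx

# Hypothetical citation edges: each runs from a citing paper to a cited one
graph = nx.DiGraph()
graph.add_edges_from([
    ("included_A", "candidate_1"), ("included_A", "candidate_2"),
    ("included_B", "candidate_2"), ("included_B", "candidate_3"),
    ("included_C", "candidate_2"),
])

included = {"included_A", "included_B", "included_C"}

# Rank candidate papers by how many included studies cite them
scores = {node: sum(1 for citer in graph.predecessors(node)
                    if citer in included)
          for node in graph if node not in included}
print(sorted(scores.items(), key=lambda item: -item[1]))
# [('candidate_2', 3), ('candidate_1', 1), ('candidate_3', 1)]
```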

Looking to the future, James suggested there is a great deal of interest in a "surveillance" approach to finding evidence, which could automatically identify when a review or a piece of guidance needs updating. Cochrane are developing the [Cochrane Evidence Pipeline](https://community.cochrane.org/help/tools-and-software/evidence-pipeline), under which incoming citations can either be triaged by the relevant Cochrane Review Group, or assessed using machine-learning or crowd-sourced methods.

While the workshop focussed on systematic reviews, for a jobbing librarian like me in a clinical setting, searches to support systematic reviews will make up only a small part of the workload. Nevertheless, searches still need to be conducted soundly and rigorously. Can artificial intelligence and machine learning help? Certainly some of the tools James showed are useful when formulating search strategies. A group within London and Kent, Surrey and Sussex NHS Libraries is developing a search protocol for the region, and we may well find ourselves referencing some of these tools. It is always stimulating to hear a world leader in a field talk, and I'm sure all the workshop participants would join me in thanking both Professor Thomas for giving up his time, and Health Education England for organising the workshop.

The tools James described, and more, may be found on the [EPPI-Centre website](http://eppi.ioe.ac.uk/cms/Default.aspx?tabid=3677). See also the National Centre for Text Mining's page of [software tools](http://www.nactem.ac.uk/software.php).

For a systematic review on the subject see:

O’Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev. 2015 Jan 14;4:5. doi: 10.1186/2046-4053-4-5.

For a more recent overview, I would recommend Julie Glanville’s chapter on Text Mining for Information Specialists in Paul Levay and Jenny Craven’s new book on systematic searching:

Glanville J. Text mining for information specialists. In: Levay P, Craven J, editors. Systematic searching: practical ideas for improving results. London: Facet Publishing; 2018. p. 147-169.

Clinical librarians really make a difference!

Alison Brettle, Reader in Evidence Based Practice, University of Salford, a.brettle@salford.ac.uk
@Brettleali

One of the proudest moments for me professionally this month was the publication of a study which demonstrates the impact of clinical librarians in the North West. The paper has been a long time coming, so it was very exciting to see it finally in print (or should that be online!). The paper (Brettle, Maden and Payne, 2016) was the result of a number of years' work and collaboration between clinical librarians working in the North West and myself.

The project really began in 2009, when I returned from the EBLIP4 conference in North Carolina. As LIHNN had kindly sponsored my conference fees, I wrote an article in LIHNNK Up about the conference, expressing some frustration about the lack of evidence within our profession. My practical way forward was to suggest that librarians conduct systematic reviews so we would know what evidence there was – and where the gaps were. I also strongly believed that getting involved in systematic reviews was a good introduction to research. The clinical librarians group in the North West were interested in publishing more about the work they were doing and got in touch; to cut a long story short, this was the beginning of a partnership which resulted in the group undertaking a systematic review on evaluating clinical librarian services (Brettle et al, 2011). The systematic review updated the evidence on effectiveness as well as highlighting what was needed to provide rigorous evidence to demonstrate the impact of clinical librarian services. The next logical step was to put these findings into practice and conduct an evaluation across the North West. This was to be the largest clinical librarian study in the UK to date, and all clinical librarians across the region were invited to participate. Both studies benefited from small grants from HCLU, which provided the modest resources needed to get the projects off the ground.

Building on the recommendations from our systematic review, we aimed to understand the impact of CL services within National Health Service (NHS) organisations by:

  • Using a framework that ensured consistent and robust data collection across all participants
  • Testing the Making Alignment a Priority (MAP) Toolkit (https://maptoolkit.wordpress.com) in measuring the CL contribution to organisational objectives
  • Developing research skills amongst the group of librarians involved.

The paper describes the results and the tools used. Using both questionnaires and interviews, we found that the interventions or services provided by CLs are complex, and that each contributes to multiple outcomes of importance to their organisation. So, for example, each literature search, or participation in a journal club or current awareness bulletin, could impact on multiple areas and decisions, and will be unique to each encounter. We found that the questionnaires were useful in providing data about the outcomes to which the librarians contributed, whilst the interviews really brought this data to life, explaining how one piece of information could contribute in a wide range of ways that are important within the NHS context.

In brief, we found that clinical librarians contribute to a wide range of outcomes in the short and longer term, and really do make a difference within the NHS. These include direct contributions to choice of intervention (36%), diagnosis (26%), quality of life (25%), increased patient involvement in decision making (26%), and cost savings and risk management, including avoiding tests, referrals and readmissions and reducing length of stay (28%). As well as looking at contributions to patient care, we looked at other outcomes that are important within the NHS (this is where the MAP Toolkit came in), so the study is relevant across all types of NHS organisation, not just acute patient care. We were able to show that clinical librarians improve quality and help save money, as well as affecting patient care directly – all key outcomes in the current NHS climate.

The third objective of the study was to help improve research skills, and this isn't really covered in the paper. The approach we used built on that used in the systematic review project (Brettle et al, 2011) and has since been described as a "hive approach" (Buckley-Woods and Booth, 2013). Another way of describing it is "doing with" rather than "doing for". As an experienced researcher I directed and guided the research, but it was very much a partnership and mentoring relationship in which the clinical librarians really contributed to the research (and it wouldn't have taken place if they hadn't done so). For this project the clinical librarians were invited to take part in the research at a level that worked for them. For example, some participated in the survey part, whereas others took part in the interviews, interview analysis, and writing up the results. At the questionnaire design stage, a small group drafted the questionnaire, which was then discussed and modified as a group to suit everyone's needs, and piloted in each service. Standard documents on how to conduct the pilot and how to obtain ethical and governance approval were developed and provided to all. Meetings were used for agreeing procedures and training, and a wiki was used to share and update resources. At the interview stage, meetings were held to develop the interview schedule and provide training to those taking part in this stage. Librarians were "buddied" and conducted interviews in each other's organisations (to enhance rigour), with the advantage that the buddies could practise on each other and bounce ideas (and fears!) off one another.

The tools developed in the project have informed the development of a simpler generic tool for use across all health libraries, and have been incorporated into a revised impact toolkit for those who want to conduct more in-depth, rigorous impact studies. In terms of further research, we next need to find out whether this approach to building research capacity has made a difference in the longer term. If it has, we can use the approach more widely to develop the evidence base of health libraries and librarians for the future.

Acknowledgements
This project wouldn’t have been possible without the librarians involved.  Thanks to: Michelle Maden-Jenkins, Clare Payne, Helen Medley, Tracey Pratchett, Michael Reid, Debra Thornton, Rosalind McNally, Pippa Orr, Morag Platt, Denise Thomas, Anne Webb, Riz Zafar

References
Brettle, A., Maden, M. and Payne, C. (2016) 'The impact of clinical librarian services on patients and health care organisations', Health Information & Libraries Journal. Available from: http://dx.doi.org/10.1111/hir.12136

Brettle, A., Maden-Jenkins, M., Anderson, L., McNally, R., Pratchett, T., Tancock, J., Thornton, D. and Webb, A. (2011) 'Evaluating clinical librarian services: a systematic review', Health Information & Libraries Journal, 28(1), pp. 3-22.

Buckley-Woods, H. and Booth, A. (2013) 'What is the current state of practitioner research: the 2013 LIRG scan', Library and Information Research, 37(116). Available from: http://www.lirgjournal.org.uk/lir/ojs/index.php/lir/article/view/598