James outlined the problem: we systematically lose research,
and then spend a great deal of effort and money on trying to find it again. We
need to use correct methods, and, moreover, need to be seen to be correct.
There are quantitative issues as well: Cochrane reviewers screen more than 2
million citations a year. Can this
considerable human effort be made more manageable by the judicious use of text
mining and machine learning? While tools are being developed to help this task,
their development is uneven, as is their adoption.
James distinguished between three types of machine learning:
rules-based (unfashionable in computer science circles, he warned),
unsupervised, and supervised. He gave us opportunities to try out tools based
on these approaches using our own devices.
Rules-based approaches are accurate, but fragile – they
either work, or fail completely. Unsupervised approaches work by leaving a
machine to identify patterns in the data, for example by clustering documents.
One such tool is [LDAVis](http://eppi.ioe.ac.uk/ldavis/index.html#topic=6&lambda=0.63&term=),
based, you don’t need me to tell you, on Latent Dirichlet Allocation.
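
To make the contrast concrete, here is a minimal sketch of my own (not something shown at the workshop). The titles are invented, the rule is a hypothetical one, and the clustering is a deliberately crude word-overlap method, far simpler than Latent Dirichlet Allocation; it only illustrates the idea of grouping documents without telling the machine what the groups are.

```python
import re

# Invented example titles, for illustration only.
TITLES = [
    "Aspirin for headache: a randomised controlled trial",
    "Aspirin for headache: a randomized controlled trial",
    "Library outreach services in rural hospitals",
    "Outreach library services for rural clinicians",
]

# Hypothetical rule: a record is an RCT if the title contains this exact phrase.
RCT_RULE = re.compile(r"randomised controlled trial", re.IGNORECASE)


def rule_says_rct(title):
    """Rules-based: either the pattern matches or it does not."""
    return bool(RCT_RULE.search(title))


def words_in(title):
    """Lower-cased set of the words in a title."""
    return set(re.findall(r"[a-z]+", title.lower()))


def jaccard(a, b):
    """Share of words two titles have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_titles(titles, threshold=0.2):
    """Unsupervised (crude): put each title into the first cluster whose
    first member shares enough words with it, else start a new cluster."""
    clusters = []  # list of (seed_words, member_titles)
    for title in titles:
        words = words_in(title)
        for seed_words, members in clusters:
            if jaccard(words, seed_words) >= threshold:
                members.append(title)
                break
        else:
            clusters.append((words, [title]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    for title in TITLES:
        print(rule_says_rct(title), title)
    for group in cluster_titles(TITLES):
        print(group)
```

The rule catches the British spelling but silently misses the American "randomized", which is the fragility James described. The clustering, given no labels at all, puts the two aspirin titles in one group and the two library titles in another.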
Supervised approaches require a human or humans to give the
machine training data. Given enough of it (a 280,000-row spreadsheet in an
example James quoted), a statistical model can be constructed, which can then be
used with new material to determine whether or not a study is a randomised
controlled trial (RCT). Training data comes from people: it includes data
generated for other purposes, data created for the project itself, and crowd-sourced data, as in the case of
[Cochrane Crowd](http://crowd.cochrane.org/index.html),
which mobilises Cochrane Citizen Scientists to decide whether or not the
subject of a database record is an RCT.
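
As a rough illustration of the supervised idea, the sketch below trains a tiny naive Bayes classifier on six invented, hand-labelled sentences and then applies it to unseen text. This is my own toy example, not the model James described or the one Cochrane uses; a real classifier would be trained on hundreds of thousands of records, and all the names here are hypothetical.

```python
import math
import re
from collections import Counter

# Invented training data: (text, label) pairs of the kind a human would supply.
TRAINING = [
    ("Patients were randomly allocated to aspirin or placebo", "RCT"),
    ("A double blind randomised placebo controlled trial of statins", "RCT"),
    ("Participants were randomised to intervention or control", "RCT"),
    ("A retrospective cohort study of hospital admissions", "not RCT"),
    ("Qualitative interviews with nurses about workload", "not RCT"),
    ("Case report of a rare adverse reaction", "not RCT"),
]


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def train(labelled):
    """Count how often each word appears under each label."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in labelled:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest naive Bayes log-probability.
    Words never seen in training are ignored; add-one smoothing is used."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        counts = word_counts[label]
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


MODEL = train(TRAINING)

if __name__ == "__main__":
    print(predict(MODEL, "Children were randomised to placebo or antibiotic"))
    print(predict(MODEL, "A retrospective cohort study of nurses"))
```

The point is only that the machine is never given a rule: it infers from the labelled examples which words tend to go with which label, so the quality of the training data largely determines the quality of the result.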
In systematic reviews, these approaches may be used to identify studies by citation screening or classification, to map research activity, and to automate data extraction, including Risk of Bias assessment and the extraction of statistical data. Readers may be familiar with tools that take a known set of citations and use word frequency counts, or analysis of phrases and adjacent terms, to create word or phrase lists or visualisations. Similarly, term extraction and automatic clustering can be used to analyse text statistically and linguistically; a human then reviews the output and, if it proves useful, modifies the initial search strategy. [Voyant Tools](https://voyant-tools.org/) is one example, as are [Bibexcel](https://homepage.univie.ac.at/juan.gorraiz/bibexcel/), [Termine](http://www.nactem.ac.uk/software/termine/) and even the use of Endnote’s subject bibliography feature to generate lists of keywords.
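
For readers who want to see what word frequency and adjacent-term analysis amounts to underneath, here is a small sketch using only Python's standard library. The citation titles and the stopword list are invented for the example, and it makes no claim to reproduce how Voyant Tools, Bibexcel or Termine actually work.

```python
import re
from collections import Counter

# Invented citation titles standing in for a known set of relevant records.
CITATIONS = [
    "Text mining for systematic reviews",
    "Machine learning and text mining in systematic reviews",
    "Screening citations for systematic reviews",
]

# A tiny, hypothetical stopword list; real tools use much longer ones.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def tokens(text):
    """Lower-cased words with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def term_and_phrase_counts(citations):
    """Count single words and adjacent word pairs across all citations.
    Because stopwords are removed first, a pair may span a dropped word
    (e.g. 'mining for systematic' yields the pair 'mining systematic')."""
    words, pairs = Counter(), Counter()
    for citation in citations:
        toks = tokens(citation)
        words.update(toks)
        pairs.update(zip(toks, toks[1:]))
    return words, pairs


WORDS, PAIRS = term_and_phrase_counts(CITATIONS)

if __name__ == "__main__":
    print(WORDS.most_common(4))
    print(PAIRS.most_common(3))
```

The most frequent words and pairs are candidate search terms and phrases; a searcher would still need to judge which of them are worth adding to a strategy.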
Looking to the future, James suggested that there is a great deal of
interest in a “surveillance” approach to finding evidence, which can
automatically identify if a review or some guidance needs updating. Cochrane are
developing the [Cochrane Evidence Pipeline](https://community.cochrane.org/help/tools-and-software/evidence-pipeline),
which aims to triage citations: those found by machine or crowd-sourced methods can
either be triaged by the relevant Cochrane Review Group, or assessed using
While the workshop focussed on systematic reviews, for a
jobbing librarian like me in a clinical setting, searches to support systematic
reviews make up only a small part of the workload. Nevertheless, searches
still need to be conducted soundly and rigorously. Can artificial intelligence
and machine learning help? Certainly some of the tools James showed are useful
when formulating search strategies. A group within London and Kent, Surrey and
Sussex NHS Libraries is developing a search protocol for the region. We may
well find ourselves referencing some of these tools. It is always stimulating
to hear a world leader in a field talk, and I’m sure all the workshop
participants would join me in thanking both Professor Thomas for giving up his
time, and Health Education England for organising the workshop.
O’Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S.
Using text mining for study identification in systematic reviews: a systematic
review of current approaches. Syst Rev. 2015 Jan 14;4:5. doi:
For a more recent overview, I would recommend Julie
Glanville’s chapter on Text Mining for Information Specialists in Paul Levay
and Jenny Craven’s new book on systematic searching:
Glanville J. Text mining for information specialists. In:
Craven J, Levay P, editors. Systematic searching: practical ideas for improving results. London: Facet Publishing; 2018. p. 147-169.
On 20th November 2018 I attended the CILIP Employers Forum. One of the talks was by Terry Corby on “Avoiding the Toaster! Meeting the challenge of disruptive innovation”. The toaster in the title alludes to the idea that if we fail to deal with disruptive innovation, we will become “toast”.
Terry argued that automation is already here:

- “60% of occupations could have 30% or more of their activities automated with current technology”.
- 20% of a CEO’s activities could be automated now.
- The cost benefits are between three and ten times the investment. Only human factors prevent it happening.
- AI solutions tend to work best when they have a human element as well.

Many companies foresaw future disruption but failed to capitalise:

- Kodak invented digital photography.
- Xerox invented the Graphical User Interface and the computer mouse.

Among Terry’s suggestions for how to operate in this environment were:

- Seek out stakeholders who will insist on innovation.
- Find out what your customer really wants and values.
- Work on many innovations, expecting that most will fail, but some may greatly succeed.
- Create a culture that encourages innovation and learning.
- Completely master new skills if you can, or recognise when you can’t.
- Be an outsider in new areas, not just an insider in your own.

Established companies are often at a disadvantage because they don’t recognise the threat and fear cannibalising their business.

The challenge Terry laid down to librarians was this: we allowed search engines to roll over us, so will we do the same with artificial intelligence? He doesn’t know our field and so had no answers, but he did call on us to think these issues through for ourselves, so that we avoid someone “eating our breakfast”.
Now over to you: what do you think? Leave a comment below.