Databases and ontologies
The Taverna Interaction Service: enabling manual interaction in workflows
* and Tom Oinn
Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, 5008 Bergen, Norway
EMBL European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
Advance Access publication March 12, 2008
Associate Editor: John Quackenbush
tools and databases for life science research by the construction of
workflows. The Taverna Interaction Service extends the functionality
of Taverna by defining human interaction within a workflow and
acting as a mediation layer between the automated workflow engine
and one or more users.
Availability: Taverna, the Interaction Service plug-in and web
application are available as open source and can be downloaded
analytical applications, databases and other resources available
for life science research. Furthermore, these tools and databases
(hereafter called services) are often heterogeneous in terms of
access, data formats and definitions used, as well as user
interfaces (Stein, 2002). Manual cut-and-paste work, as well as
competence in bioinformatics and programming, is often needed
in order to combine results from several services in a
meaningful way. Web service technology provides a solution
to some of these issues, by specifying a standardised program-
matic interface for computational resources. This technology
has gained popularity within bioinformatics recently with an
increasing number of services now providing Web services
access (Neerincx and Leunissen, 2005). The application Taverna (Oinn
et al., 2007) offers an environment to access Web services
of Web services or programming. Most importantly, it allows
for the creation, execution and reusability of workflows, by
combining several services in a coordinated and well-defined
manner. While many other workflow editing environments
exist, Taverna is one of the most popular in life sciences with an
estimated user base of around 1500 installations in February
2006 (Hull et al., 2006). It has also been actively used in
genomics research (Stevens et al., 2004).
In addition to computer resources, there is often a need to
include user intervention in workflows. More often than not,
automatically generated prediction results require some form of
manual quality control. In the standard version of the Taverna
Workbench, a user cannot control the behaviour of a workflow
once it is running. In simple workflows with relatively short
execution time, this can be dealt with by manually inspecting
intermediate or final results and restarting the workflow with
modified parameters, as needed. In a typical genomics project,
however, there is a need for workflows that include large
volumes of data and computationally demanding services.
Total running time of such workflows can be as long as several
hours or even days. Further, there is often a need to include
people other than the primary Taverna user in the review
process. This could also include external collaboration partners
that may not have direct access to the same file server as the
A simplified example of a workflow that includes manual
interaction is illustrated in Figure 1. This conceptual workflow
predicts the boundaries of the protein-coding portion of genes
in a bacterial genome, and subsequently predicts the function of
the encoded protein. This illustrates the requirement to include
user interaction as an integrated part of a workflow, which
raises the issue of how to define human interaction to the
workflow designer and ultimately to the workflow engine
executing the workflow. To this end, we have developed the
Taverna Interaction Service, an extensible mediation layer in
between the automated workflow system and the user. As far as
the workflow design is concerned, there is no obvious reason to
separate this kind of human inspection from computational
analysis, hence the slogan ‘because users are services too’ was
chosen for the application.
IMPLEMENTATION AND FEATURES
application, providing a programmatic interface for commu-
nication with the Taverna workflow engine. Thus, it can be
deployed to a Java Servlet container independently of the
Taverna installation of its user. Besides its programmatic
interface towards Taverna, it presents a status screen to the user
when accessed through a web browser, showing its status and
available interaction patterns. A working demonstration work-
flow can also be downloaded from this web page.
In order for a Taverna Workbench installation to commu-
nicate with the Interaction Service, a plug-in must be installed.
*To whom correspondence should be addressed.
© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: email@example.com
For more information on how to install and use the Interaction
Service, please see the online manual at http://bioinfo.no/software/interaction-service.
Once installed, Interaction Service
instances can be added to a workflow from the Taverna services
panel just like any other resource, by providing Taverna with its
URL. This exposes all the available interaction patterns
installed on the service. Academic users are free to use the
Interaction Service of the Computational Biology Unit (CBU)
at the University of Bergen, available at http://api.bioinfo.no/
An interaction pattern defines the input and result data types
required for the interaction, as well as the method by which it
takes place. Two default interaction patterns are available. In
addition to these, advanced users can design new interaction
patterns for other purposes. New interaction patterns may be
uploaded and added to the Interaction Service at runtime. All
interaction patterns will, when invoked from a workflow, send
an email to the targeted user or users—the body of which is
specified by the chosen pattern. All such interaction messages
contain a number of hyperlinks that facilitate user interaction.
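As a rough illustration of what such a pattern bundles together, the following Java sketch models a pattern as a set of named input and result types plus a plain-text email body listing the hyperlinks, such as the accept and reject links of the simplest default pattern. The names `PatternDescriptor`, `addInput`, `addResult` and `buildEmailBody`, the MIME types and the example.org URLs are assumptions made for this sketch only; they are not taken from the Interaction Service source.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Entry point first, so the file can be run directly with `java PatternDemo.java`.
class PatternDemo {
    public static void main(String[] args) {
        PatternDescriptor pattern = new PatternDescriptor("accept-reject");
        pattern.addInput("text", "text/plain");
        pattern.addResult("decision", "text/plain");

        // Placeholder links; a real service would generate its own URLs.
        Map<String, String> links = new LinkedHashMap<>();
        links.put("Accept", "http://example.org/interaction/42/accept");
        links.put("Reject", "http://example.org/interaction/42/reject");

        System.out.print(pattern.buildEmailBody("Please review the submitted text.", links));
    }
}

// Hypothetical descriptor: named input and result ports plus an email template.
class PatternDescriptor {
    private final String name;
    private final Map<String, String> inputs = new LinkedHashMap<>();
    private final Map<String, String> results = new LinkedHashMap<>();

    PatternDescriptor(String name) {
        this.name = name;
    }

    void addInput(String port, String mimeType) {
        inputs.put(port, mimeType);
    }

    void addResult(String port, String mimeType) {
        results.put(port, mimeType);
    }

    Map<String, String> getInputs() {
        return Collections.unmodifiableMap(inputs);
    }

    Map<String, String> getResults() {
        return Collections.unmodifiableMap(results);
    }

    // Builds the email body: a header line, the message, then one line per hyperlink.
    String buildEmailBody(String message, Map<String, String> links) {
        StringBuilder body = new StringBuilder();
        body.append("Interaction request (").append(name).append(")\n\n");
        body.append(message).append("\n\n");
        for (Map.Entry<String, String> link : links.entrySet()) {
            body.append(link.getKey()).append(": ").append(link.getValue()).append('\n');
        }
        return body.toString();
    }
}
```

Keeping the port declarations separate from the message template mirrors the split described above between the data types a pattern requires and the method by which the interaction takes place.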
The simplest of the default patterns presents the reviewer with
the choice to accept or reject a piece of textual data. In this case,
the user is simply presented with one link for accepting the text
specified in the message and another for rejecting it. The
decision is sent back to the workflow engine via the Interaction
Service and appears as the output of the interaction step in the
The second default pattern provided handles genome
annotation data. The input for this process consists of one or
more genome ‘flat files’ in EMBL, Genbank or GFF format,
each with a title. A textual comment may also be included.
When invoked, this pattern will send an email to the targeted
reviewer containing the comment submitted and a hyperlink for
opening and reviewing the results. Following this link will
launch a modified version of the Artemis sequence and
annotation editor (Rutherford et al., 2000). This step utilises
Java Web Start technology. Upon opening, the genome and
annotation data is automatically downloaded from the
Interaction Service and presented to the reviewer, who then
reviews and edits the data in Artemis as usual. A notepad is also
provided for writing down comments about the data and
modifications made. Having reviewed the data, the user can
choose to either accept the results with or without changes, or
to reject them. The edited data is sent back to the Interaction
Service along with review notes and the decision made, and
appears as the output of the interaction processor in the
There are many situations where an interaction pattern may
be useful to allow manual interaction in an analysis workflow.
Thanks to the modular design of the Interaction Service,
such interaction patterns are relatively easy to define, create
and add at runtime, provided the developer has basic programming
skills and familiarity with Java. The developer of a new pattern must
first download the full Java source tree from the Taverna
website and then implement the ServerInteractionPattern inter-
face, preferably by inheriting the partial implementation
. The new ServerInteractionPattern
can then be compiled and added to a .jar file. This file may
be uploaded to an Interaction Service web server, which can
discover and add it to its repository at runtime. More
information about how to do this can be found at http://
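The general shape of this approach (an interface, a partial base implementation and a concrete pattern) might look like the sketch below. The text does not list the methods of ServerInteractionPattern, so the interface `SketchPattern`, the base class `AbstractSketchPattern` and every method name here are hypothetical stand-ins rather than the real API; the concrete class imitates the accept/reject default pattern.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Entry point first, so the file can be run directly with `java PatternSketch.java`.
class PatternSketch {
    public static void main(String[] args) {
        SketchPattern pattern = new AcceptRejectSketch();
        System.out.println(pattern.getName() + " inputs: " + pattern.getInputNames());
        System.out.println(pattern.handleReply("accept"));
    }
}

// Hypothetical stand-in for the pattern interface; the real method set is not given in the text.
interface SketchPattern {
    String getName();
    List<String> getInputNames();
    List<String> getOutputNames();
    // Converts the reviewer's reply into the value returned to the workflow.
    String handleReply(String reply);
}

// Hypothetical partial implementation: stores the metadata, leaves the reply logic to subclasses.
abstract class AbstractSketchPattern implements SketchPattern {
    private final String name;
    private final List<String> inputNames;
    private final List<String> outputNames;

    protected AbstractSketchPattern(String name, List<String> inputNames, List<String> outputNames) {
        this.name = name;
        this.inputNames = Collections.unmodifiableList(inputNames);
        this.outputNames = Collections.unmodifiableList(outputNames);
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public List<String> getInputNames() {
        return inputNames;
    }

    @Override
    public List<String> getOutputNames() {
        return outputNames;
    }
}

// Concrete pattern: one text input, one decision output.
class AcceptRejectSketch extends AbstractSketchPattern {
    AcceptRejectSketch() {
        super("accept-reject",
              Collections.singletonList("text"),
              Collections.singletonList("decision"));
    }

    @Override
    public String handleReply(String reply) {
        String normalised = reply == null ? "" : reply.trim().toLowerCase(Locale.ROOT);
        if (normalised.equals("accept")) {
            return "ACCEPTED";
        }
        if (normalised.equals("reject")) {
            return "REJECTED";
        }
        throw new IllegalArgumentException("Unknown reply: " + reply);
    }
}
```

Inheriting from a partial implementation means a new pattern only has to supply its port names and its reply handling, which is presumably why the text recommends that route over implementing the interface from scratch.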
DISCUSSION AND FUTURE PERSPECTIVES
particularly in the field of business process management
(BPM). Limiting the scope to Web service technology, the
Business Process Execution Language (BPEL) (http://docs.oasis-open.org/wsbpel/2.0).
Taverna instead uses a workflow
language called SCUFL (Simple Conceptual Unified Flow
Language), containing a number of features that separate its
functionality quite fundamentally from BPEL. However, what
the two languages have in common is that neither of them defines
an interface for human interaction in workflows. To include
user interaction, many developers of BPEL workflow software
have overcome this limitation by implementing special web
applications, presenting a BPEL compliant WSDL (Web
Service Description Language) interface to the workflow
engine, but handling user interaction by mechanisms hidden
from the workflow engine, i.e. external to BPEL. However, no
adopted standard for describing the interaction interface itself
exists, so a workflow designer often needs knowledge of the
inner workings of the web application responsible. The Taverna
Interaction Service uses a similar approach, but does not expose
Fig. 1. Example of a Taverna workflow utilising an Interaction Service
step. The input of this simplified workflow is a prokaryotic genomic
DNA sequence. In the first step, the sequence data is sent to a number
of services that directly or indirectly aid in predicting the gene structure
of the sequence [BLASTX, Genscan (Burge and Karlin, 1997) and
Glimmer (Delcher et al., 1999)]. The results from these upstream
services are merged to a preliminary gene structure prediction, which is
sent for manual review using the Interaction Service. Thus, a manually
reviewed and possibly modified gene prediction is obtained and stored
as an output of the workflow along with the comments of the reviewer.
The workflow goes on to extract predicted protein sequences for these
hypothetical genes, which are passed on to the downstream services
BLASTP, the protein domain analysis tools Pfam (Bateman et al.,
2002) and ScanProsite (Gattiker et al., 2002). The results of the
downstream services are stored in the workflow output as ‘Protein
suited for this purpose. Instead, it uses a custom format
exposing some of the interaction related metadata to the
workflow designer. The communication between the external
Interaction Service and the user is initiated by email. It should
be noted that a consortium of major BPM software developers
recently proposed an extension of BPEL called BPEL4People
and WS-HumanTask specifying how user interaction can be
included in workflows, but whether this will contribute to
increased standardisation of the interaction interface and
portability of BPEL workflows with user interaction steps, is
In connection to ongoing genomics projects at CBU, we are
planning to implement a number of new interaction patterns.
These will be added to the Interaction Service distribution. One
such pattern is selecting relevant sequence alignments from a list.
Another is curating a list of automatically generated gene names.
We thank the Taverna development team, Pål Puntervoll and
three anonymous reviewers for helpful comments on this
manuscript and Jan-Christian Bryne for fruitful discussions
and debates about WS technology. The development of the
Taverna Interaction Service has been supported by the UK
e-Science program through the
Grid project and by the
(FUGE) of the Research Council of Norway.
Conflict of Interest: none declared.
Bateman,A. et al. (2002) The Pfam protein families database. Nucleic Acids Res.,
Burge,C. and Karlin,S. (1997) Prediction of complete gene structures in human
genomic DNA. J. Mol. Biol., 268, 78–94.
Delcher,A. et al. (1999) Improved microbial gene identification with GLIMMER.
Nucleic Acids Res., 27, 4636–4641.
Gattiker,A. et al. (2002) ScanProsite: a reference implementation of a PROSITE
scanning tool. Appl. Bioinformatics, 1, 107–108.
Hull,D. et al. (2006) Taverna: a tool for building and running workflows of
services. Nucleic Acids Res., 34 (Web Server issue), 729–732.
Neerincx,P.B.T. and Leunissen,J.A.M. (2005) Evolution of web services in
bioinformatics. Brief Bioinform., 6, 178–188.
Oinn,T. et al. (2007) Taverna/myGrid: Aligning a workflow system with the life
sciences community. In Taylor,I.J., Deelman,E., Gannon,D.B. and
Shields,M. (eds.) Workflows for e-Science, Springer-Verlag.
Rutherford,K. et al. (2000) Artemis: sequence visualization and annotation.
Bioinformatics, 16, 944–945.
Stein,L. (2002) Creating a bioinformatics nation. Nature, 417, 119–120.
Stevens,R.D. et al. (2004) Exploring Williams-Beuren syndrome using myGrid.
Bioinformatics, 20 (Suppl. 1), I303–I310.