Position paper for CHI'99 Workshop
on End-User Programming and Blended-User Programming
Catherine Letondal, letondal@pasteur.fr
Pasteur Institute, Scientific Computing Center
A practical and empirical approach for biologists who almost program.
Abstract:
This paper addresses the programming needs of biologists who have to
adapt and combine existing programs and sometimes develop new ones for
their research. We explain why existing end-user programming tools
do not work well for them and we present a critical perspective on the
conception of programming by computer scientists. Finally we introduce
our approach, based on participatory design and technical exploration.
Keywords:
end-user programming, participatory design, prototype
languages, scripting languages, spreadsheets.
Page updated on: 1999 Feb 27 11:35.
Computers are now widely used in molecular biology, e.g. to run
simulations or sequence analysis and to test hypotheses against genetic
databanks.
Biology researchers not only need to use software that is
currently available to them, but also to develop new programs or adapt
existing ones. Some biologists already write their own programs and
even release public software. A larger number are end- or
blended-users who do not want to spend too much time and effort on
programming and computer issues. These users need a scaffolding
environment to process their data themselves or to try simple ideas
with greater ease.
My purpose in this position paper is not to propose a new end-user
language or programming environment for these users, but to explain
why, according to my experience, this is a difficult problem.
I have observed biologists programming in Scheme during a three-month
programming course where I helped them with practical exercises. By
the end of the course, they were able to design algorithms and define
data structures. After the course, however, it was very hard for
them to apply what they had learned to real problems, and to build
actual software. They discovered that writing a program is more than
implementing an algorithm.
For two years, I have also observed biologists using Web
interfaces that I developed to give access to biological sequence
analysis programs on Unix. A popular feature of this system is the
ability to chain programs with the equivalent of a Unix pipeline or
shell scripts. These observations suggest that users would probably
write shell scripts if they were more at hand.
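To make the kind of chaining meant here concrete, the following is a minimal sketch in Python, not the actual Web system. The helper chain and the three steps (read_fasta, keep_longer_than, gc_content) are hypothetical stand-ins for sequence analysis programs; only the idea of feeding the output of one step into the next, as a Unix pipeline does, comes from the text.

```python
from functools import reduce


def chain(*steps):
    """Compose steps left to right, like `a | b | c` in a shell."""
    def run(data):
        return reduce(lambda value, step: step(value), steps, data)
    return run


# Hypothetical stand-ins for sequence analysis programs.
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records, name, parts = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(parts)))
            name, parts = line[1:], []
        elif line:
            parts.append(line)
    if name is not None:
        records.append((name, "".join(parts)))
    return records


def keep_longer_than(min_length):
    """Build a filter step that keeps sequences of at least min_length."""
    def step(records):
        return [(n, s) for n, s in records if len(s) >= min_length]
    return step


def gc_content(records):
    """Map each sequence name to its fraction of G and C residues."""
    return {n: (s.count("G") + s.count("C")) / len(s) for n, s in records}


# The chained "pipeline": parse, filter, then measure.
pipeline = chain(read_fasta, keep_longer_than(4), gc_content)
result = pipeline(">seq1\nGGCC\nAT\n>seq2\nAT\n")
print(result)
```

Each step only needs to agree with its neighbours on the data passed between them, which is what makes this kind of composition approachable for users who do not want to write a whole program.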
What, then, would it take to make programming "at hand" for these users?
I propose that part of the problem is on the computer scientist
side: it is not what professionals
consider hard in programming, e.g. writing algorithms,
that biologists find difficult, but the many "details" around it,
such as input/output, databank file formats, visualization.
Visual languages (as opposed to visual environments)
can be separated into bi-modal systems (editing /
executing modes) such as LabVIEW [barroth95] and direct "programming in
the interface" (PITUI) tools, such as Self [rbsmith-95] or Forms/3
[burnett95].
Bi-modal models support the graphical manipulation of programming
concepts, such as loops, conditionals, classes, component
interfaces. The idea behind this kind of tool is that the difficulty
for non-professional programmers is a syntactic one, and that
replacing text coding by graphical coding will solve the
problem. Nardi [nardi93] showed that the visual versus textual
distinction does not really discriminate between end-user and
programmer languages. Nardi also gives examples of textual languages
for non-programmers, such as knitting machine instructions and baseball
game coding.
The PITUI approach, on the other hand, does not distinguish
between programming and using, and makes both activities available
simultaneously. However, this does not necessarily mean that
everything has to be graphical or visual in the language. For users
such as the biologists, it is important to have a PITUI environment,
although it need not be exclusively graphical.
The idea of the "Programming by demonstration (or by example)"
approach [cypher93] is to use a sequence of the user's actions to infer a
more general program. The more interactive the interface, the more
useful and widely applicable the approach. There are, however, some limits:
- there must be an interactive interface,
- this interface should cover all of the user's needs.
Programming by example is designed to reuse previous actions, but
this only works if there have been such actions, and it is sometimes
easier to describe an action than to demonstrate it to the
program. Therefore, even if there is a macro or history facility,
which would be useful for biologists, it should be complemented by a
general language.
One can define degrees in software flexibility [nierstrasz95a]:
functional parameterization, software composition where the structure
is a parameter, and general programming languages.
The first level is insufficient for biologists. It seems that the second
level is either a software engineering matter (and too difficult for
end-users) or not general enough (visual composition environments).
We are interested in the component composition level (or component
reuse level, which is better suited here), and also in the component
definition level since some components may have to be slightly adapted
to the user's needs. This does not mean that end-users should be able
to use components as a framework, nor that they should be able to
define new components from scratch, but that they could go to the
component definition level and incrementally modify it or clone it to
try a new feature.
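As an illustration of what "clone it to try a new feature" could look like, here is a minimal sketch in Python. The Component class, its word_size setting and the run method are hypothetical and only stand for some existing, working component; the point is that the user starts from an instance that already works and changes one detail, without defining anything from scratch.

```python
import copy


class Component:
    """A hypothetical component: a name, some settings, and a run step."""

    def __init__(self, name, **settings):
        self.name = name
        self.settings = dict(settings)

    def clone(self, name, **overrides):
        """Copy this component, changing only the given settings."""
        new = copy.deepcopy(self)
        new.name = name
        new.settings.update(overrides)
        return new

    def run(self, sequence):
        """Cut the sequence into overlapping words of the configured size."""
        word = self.settings["word_size"]
        return [sequence[i:i + word] for i in range(len(sequence) - word + 1)]


# An existing, working component (assumed to come from a shared library).
word_scan = Component("word_scan", word_size=3)

# The user clones it and adjusts one setting to try an idea.
variant = word_scan.clone("word_scan_4", word_size=4)

print(word_scan.run("ACGTA"))
print(variant.run("ACGTA"))
```

The original component is left untouched, so an unsuccessful experiment costs nothing.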
It appears therefore that we need a general programming language that
can be used directly from the user interface and that is domain
oriented. We should now almost be done, because we know what
programming is. Well, do we? What seems strange and confusing is that
the definition of what is or is not programming is not clear
at all. The table below contains several definitions of what
programming is: each of these statements can be easily challenged.
| Programming is... | But... |
|---|---|
| Programming is abstraction, generalization: writing a program involves modelling some objects in a general enough way. | Programming also requires specialization, e.g. going into the code to customize part of it. For example, writing a device driver is the opposite of generalization; taking a general algorithm (dynamic programming) and tuning its equations for a specific field (DNA sequence comparison) is also considered programming. |
| The essence of programming is in the design of algorithms. | Very few programs contain a new algorithm. It is indeed well established that the largest part of a program is not algorithmic in nature, but rather consists of "glue" code, input/output, user interface. Should we consider that using, e.g., a loop is programming? |
| Programming is automation: saving some instructions into a file to be re-used later is programming. | Does this mean that entering complex commands into an interactive shell is not programming? |
| Programming, by saving instructions to be executed at a given time, is also planning. | Does it mean that using the at or cron command on Unix is programming? |
| Programming is translating to a meta or symbolic level, e.g. dealing with the name of an object (this also refers to the use-mention dichotomy [myers92b-SUC]). | Is defining an alias programming? |
| Programming is writing code. | Is editing a configuration file programming? |
| Programming is using a compiler or an interpreter. | Is submitting a file to LaTeX or even Netscape programming? |
| Programming is building and implementing software or software components. | How often do programmers build actual software or even software components? |
The purpose of this list is not to define what programming is,
but instead to show that we do not really know what it is, even though
when I say to a colleague that "I am programming" he or she knows
perfectly well what I mean.
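The first row of the table mentions taking a general algorithm (dynamic programming) and tuning its equations for DNA sequence comparison. The sketch below, in Python, is one way to picture that kind of specialization. The scoring values, the gap penalty and the function names are illustrative assumptions, not taken from any particular biological program.

```python
def alignment_score(a, b, score, gap):
    """General part: global alignment score by dynamic programming."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            table[i][j] = max(
                table[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # (mis)match
                table[i - 1][j] + gap,                            # gap in b
                table[i][j - 1] + gap,                            # gap in a
            )
    return table[-1][-1]


PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def dna_score(x, y):
    """Specialized part: an assumed DNA scoring scheme.

    Substitutions within the same chemical class (transitions) are
    penalized less than substitutions across classes (transversions).
    """
    if x == y:
        return 2
    same_class = {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    return -1 if same_class else -2


def dna_alignment_score(a, b):
    """The general algorithm tuned for DNA, with an assumed gap penalty."""
    return alignment_score(a, b, dna_score, gap=-3)


print(dna_alignment_score("ACGT", "ATGT"))
```

Nothing in alignment_score is new here; all the domain knowledge sits in dna_score and in the choice of gap penalty, which is the kind of "detail" a biologist is best placed to adjust.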
These questions about what programming is are important and have to be
addressed. After all, in order to design an end-user programming
language, we ought to know what a programming language is. A
different perspective is to start from the users. User-centered design
has proved useful in many areas. I believe it is particularly
important in a field where professionals think they know what
the problem is.
From the user perspective, the first and most important question to
ask is: "What and how would biologists program?" To
address this question, I have chosen to let them participate very early in the
design process and to apply participatory design techniques [schuler93].
The following studies have been conducted so far:
This approach has already shown a number of immediate and potential benefits:
- Being designed with the biologists rather than for them, the
system is likely to better suit their needs and fit their
mental models.
- Users who are involved in the design of the system are more
likely to use it.
- The users are a source of potential technological innovation [mackay92a]: non-programmers should not be
constrained by the "culture" of programmers, and can find new solutions to
their problems.
My goal is not (yet) to propose a toolkit or a GUI, but rather to
evaluate potential approaches and identify requirements to build a
first prototype. This prototype will be used to test the trigger ideas
listed below during brainstorming and prototyping workshops with users.
- A scripting language is a good choice for gluing pieces of
software together and for actual reuse, as well as for prototyping
purposes. Textual coding is not an issue, because it is intended more
for intermediate users than for beginners. Moreover, textual
coding is definitely not the most difficult part of programming.
- Spreadsheets have shown the power of combining direct
manipulation with symbolic access (naming) for dealing with objects.
- The source code should be accessible
from the user interface. Web pages have this feature, which makes
it easy for anyone to become a publisher by copying and pasting from any
page.
Another example is the Self environment [rbsmith-95], where
the implementation of every object is accessible through its outliner,
directly from the user interface. This contradicts the encapsulation
principle, which is important in the software engineering field, but
not in a scaffolding and learning context.
- Meta-object protocols [kiczales91] or more generally open
implementation ideas go in a similar direction by giving access
to the underlying decision level. Reflective features in the
chosen scripting language should be helpful for manipulating symbolic
information.
- I like the idea from prototype languages of not having to declare
types or classes: optimization issues in compilers have made such
declarations common, but more and more scripting languages like Perl
or Tcl are only string based, which is a convenient feature. Besides,
programming is not necessarily a modeling process.
- A good repository or library of working and
realistic objects and functions with appropriate default behaviour
should be available. It is easier to use an already instantiated
framework than an abstract one. For example, Web interfaces
benefit from an extensive collection of installed biological
software.
These guidelines and existing techniques are being used as a starting
point in participatory design sessions, in order to let the biologist
programmers figure out their own needs and learn from
existing possible solutions.
The main idea is to lower the step a user needs to
climb to start programming, by accepting the
idea that programming is neither necessarily software engineering nor
modeling, and that, for a first try, pottering is better than
nothing.
References
[barroth95]
Ed Baroth and Chris Hartsough
"Visual Programming in the Real World"
Visual Object-Oriented Programming, Concepts and Environments, 1995, Prentice Hall
[burnett95]
Bay-Wei Chang, David Ungar, and Randall B. Smith
"Getting Close to Objects: Object-Focused Programming Environments"
Visual Object-Oriented Programming, Concepts and Environments, 1995, Prentice Hall
[cypher93]
Allen Cypher
"Watch What I Do. Programming by Demonstration"
1993, MIT Press
[kiczales91]
G. Kiczales and J. des Rivieres and D. G. Bobrow
"The Art of the Meta-Object Protocol"
1991, MIT Press
[mackay92a]
Mackay, W.E
"Beyond iterative design: User innovation in co-adaptive systems."
Rank Xerox EuroPARC, Cambridge, England, 1992
[myers92b-SUC]
Randall B. Smith, David Ungar, and Bay-Wei Chang
"The Use-Mention Perspective on Programming for the Interface"
Languages for Developing User Interfaces, Jones and Bartlett
[nardi93]
Bonnie A. Nardi
"A small matter of programming: perspectives on end user computing"
1993, MIT Press
[nierstrasz95a]
Oscar Nierstrasz and Laurent Dami
"Component-Oriented Software Technology"
Object-Oriented Software Composition, 1995, Prentice Hall, pp 3-28
[rbsmith-95]
Randall B. Smith and David Ungar
"Programming as an Experience: The inspiration for Self"
in Proc. ECOOP '95, 1995
[sagot97]
M.-F. Sagot and A. Viari and H. Soldano
"Multiple sequence comparison --- A peptide matching approach"
Theoretical Computer Science, 1997, pp 115--137
[schuler93]
Douglas Schuler and Aki Namioka
"Participatory Design: Principles and Practices"
1993, Hillsdale, NJ: LEA