Issues of Partial Credit in Mathematical
Assessment by Computer
By C E Beevers, D G Wild, G R McGuire, D J Fiddes and M
A Youngson
Department of Mathematics, Heriot-Watt University,
Edinburgh
email: c.e.beevers@hw.ac.uk, d.g.wild@hw.ac.uk,
g.r.mcguire@hw.ac.uk, d.j.fiddes@hw.ac.uk,
m.a.youngson@hw.ac.uk
Fax: 0131 451 3249
The CALM Project for Computer Aided Learning
in Mathematics has operated at Heriot-Watt University in Edinburgh
since 1985. From the beginning CALM has featured assessment in its
programs, Beevers et al (1991), and enabled both students
and teachers to view progress in formative assessment. The computer
can play a role in at least four types of assessment: diagnostic,
self-test, continuous and grading assessment. The TLTP project
Mathwise employs the computer in three of these roles. In 1994 CALM
reported on an educational experiment in which the computer was
used for the first time to grade, in part, the learning of a large
class of Service Mathematics students, Beevers et al (1995),
using the Mathwise assessment template. At that time the main
issues identified were those of partial credit and communication
between the student and the computer. These educational points were
addressed in the next phase of the CALM Project in which the
commercial testing program Interactive PastPapers was
developed. The main aim of this paper is to describe how
Interactive PastPapers has been able to incorporate some
approaches to partial credit which have helped to alleviate student
worries on these issues. Background information on other features
in Interactive PastPapers is also included to place them in
context.
Introduction
The CALM Project for Computer Aided Learning in Mathematics
started at Heriot-Watt University in 1985. In the first phase of
the project CAL materials were created in Calculus with each topic
structured around Theory, Worked Examples, Motivating Examples and
a Test. Students chose most readily to work through the Test
section questions welcoming the chance to assess their own
progress. In addition, the teachers could view class progress. The
weekly tests were designed from banks of questions with randomised
parameters in each question. These questions prompted students for
a mathematical answer and asked them to type in their response on
one line using a style similar to computer languages such as
Pascal. Students of engineering and science took only a short time
to adjust to this approach and many educational advantages
accrued. The routines developed in those early stages of CALM meant
that testing could be more meaningful and not rely on the more
usual multiple choice format favoured by so many computer projects.
Over the years 1989 - 1992 CALM developed techniques to trap
predictable wrong answers and this form of self-testing proved to
be a powerful learning aid for students. Nevertheless some problems
which we will discuss in greater detail in Sections 3 and 4
remained.
Despite these problems the assessment template which CALM
created was used as a testing mechanism in collaborative projects
such as the TLTP project, called Mathwise, and the SUMSMAN project.
For further details on these the interested reader is directed to
Harding & Quinney (1996), Beevers et al (1996), Beevers
& Scott (to appear) and Beevers et al (to appear).
Meanwhile the CALM project focussed on the production of a
commercial project, Interactive PastPapers, aided by the award of
a Higher Education Tercentenary prize from the Bank of Scotland.
This enabled the group to investigate solutions to such problems as
partial credit and some difficulties with student input
incorporating the ideas of Wild (1995) (see also Wild et al
(1997) and McGuire et al (1998) for further details). The
features of IPP (as Interactive PastPapers is affectionately
known) relating to these issues will be described in Section 3.
This follows a discussion of some of the educational issues raised
by using computer assessment. This paper concludes with a summary
of the way ahead in this area.
Educational discussion
Students of Mathematics can be assessed in a number of ways. The
computer can play a role in, at least, four types of
assessment:
- diagnostic tests where the emphasis is on helping students
discover their strengths and weaknesses
- self-tests in which the emphasis is on rapid feedback and can
be used to pick out predictable wrong answers
- continuous assessment in which students and teachers alike can
see how mathematical topics are being absorbed; and
- grading assessment in which the computer is used in the setting
and marking of examinations.
Over the years there has been much work in the area of
diagnostic testing with examples like Diagnosys (Appleby et
al (1997)). Diagnostic testing using student profiling has been
studied by Bridges & Hibberd (1994). This latter work is
forming the basis of some current work on web delivery of
diagnostic tests.
Self-testing is a feature of the work of Harding et al
(1995) in the projects Renaissance and Nuffield. CALM used this
approach too in its pre-university Mathematics courseware Units
where predictable wrong answers were trapped and fast feedback
helped students consolidate their learning. Self-tests are also a
feature of the many Mathwise modules developed during the TLTP
initiative.
Continuous assessment was the driving force in the first CALM
Project with weekly tests helping the students assess their own
progress through the material. Brunel, Portsmouth and Glasgow
Caledonian Universities have used a similar approach with
QuestionMark software and the CALMAT units (see McCabe (1995) and
Tabor (1993) for separate references).
Grading assessment by computer has been pioneered at Heriot-Watt
University, initially using the Mathwise assessment mechanism
but latterly with the Interactive PastPapers assessment engine
delivered over the web. Napier University, under the SUMSMAN Project,
is employing Mathwise to assess its students and this is
starting in other Scottish Universities (see the article by
Ashworth (1998)).
The commercial program Interactive PastPapers has a
number of features that help with the different types of assessment
described above. For example, diagnostic tests can be set up using
the multiple choice type of question. The marks recording service
offered by IPP is then able to supply information on the strengths
and weaknesses of individual students.
IPP has three modes of delivery:
- Help mode where students can reveal answers if they are stuck
on a particular part
- Practice mode in which the users are given visible feedback on
the correctness of answers as they proceed through a question
and
- Examination mode in which the computer marks the question but
no visible sign is shown on screen.
These three modes simulate self-test, continuous assessment and
grading assessment respectively.
Questions can be chosen from banks of typical examples randomly,
by topic or by particular question thus providing different forms
of testing at different times of the learning cycle. When a test
has been chosen the student can browse the questions as in a
conventionally written examination before moving on to answer a
question. In the following section those aspects of IPP which deal
with problems of partial credit and student input difficulties will
be considered.
Aspects of partial credit
When a student gets the same answer as the examiner to a
mathematical question, assigning marks to that answer is simple
both for a human and a computer. However, deciding how to assign
marks to other answers may be tackled differently by humans and
computers. This has led to some student anxiety about whether their
results in a computer exam accurately reflect their knowledge.
The issues of greatest concern were:
- the computer interprets the input of a mathematical answer
in a different manner to that which the student
intended;
- the student's answer is correct but in the wrong format;
- the student gives a numerical approximation to the correct
answer;
- the student gets one of a pair of answers correct;
- the student can answer part of the question but not all of
it.
The first of these has been tackled in Interactive
PastPapers by the use of an Input Tool. This device shows the
student how a one-line mathematical entry typed in by a student is
being interpreted by the computer as the typing proceeds. For
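example, the student may wish the answer to be the fraction with
numerator 1 and denominator 2x, and so types in 1/2x. The Input
Tool window shows this as (1/2)x, that is, one half multiplied by x.
This shows that the computer has interpreted the entry in a different
way to that which was intended, so enabling the correct input to be
given as 1/(2x), which appears in the Input Tool window
as a single fraction with 2x in the denominator.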
In addition, the Input Tool provides feedback on the syntax of
expressions such as missing brackets and the misuse of
operands.
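The kind of feedback the Input Tool gives can be illustrated with a minimal sketch. It is not the IPP implementation: the function names interpret and render are hypothetical, Python's own expression parser stands in for the real one, and the rule that a digit followed by a letter or bracket means multiplication is an assumption made for the illustration. Only the four arithmetic operators are handled.

```python
import ast
import re

# Symbols for the four arithmetic operators this sketch understands.
SYMBOLS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def render(node):
    """Write an expression tree back out with every operation bracketed."""
    if isinstance(node, ast.BinOp) and type(node.op) in SYMBOLS:
        left, right = render(node.left), render(node.right)
        return f"({left} {SYMBOLS[type(node.op)]} {right})"
    if isinstance(node, ast.Constant):
        return str(node.value)
    if isinstance(node, ast.Name):
        return node.id
    raise ValueError("unsupported expression")


def interpret(entry):
    """Return the computer's reading of a one-line entry, or a syntax message."""
    # Assumption: a digit directly followed by a letter or "(" is a product.
    expanded = re.sub(r"(\d)\s*([A-Za-z(])", r"\1*\2", entry)
    try:
        return render(ast.parse(expanded, mode="eval").body)
    except (SyntaxError, ValueError):
        return "syntax error: check brackets and operators"


print(interpret("1/2x"))    # the division is applied before the product
print(interpret("1/(2x)"))  # the bracket keeps 2x together in the denominator
print(interpret("1/(2x"))   # a missing bracket is reported
```

Showing the fully bracketed reading makes the order of operations explicit, which is the point of the dynamic display: the student sees the difference between the two entries before the answer is submitted.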
Turning to the second of these items, answers which are close to
the correct one but do not take the approved form can be awarded
partial credit, as in the following example. If the answer to a
problem is 1/2 and the student types 0.5 then the message
- Your answer is correct but in the wrong format, give your
answer as a fraction -- 50% partial credit
can be displayed.
Interactive PastPapers tries to capture
good mathematical practice by looking for answers in a compact
form. So, a maximum length can be specified and students awarded
partial credit for their answer if it is close to the correct
answer but not in the most compact form.
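A minimal sketch of such a format check follows. It is an illustration under stated assumptions rather than the IPP marking code: the function name mark_format, the 50% default and the max_length parameter are hypothetical, and the sketch only handles answers that are rational numbers.

```python
from fractions import Fraction


def mark_format(student, correct, max_length=None, partial=0.5):
    """Return (fraction of marks, message) for an answer expected as a fraction."""
    text = student.replace(" ", "")
    try:
        value = Fraction(text)  # accepts both "1/2" and "0.5"
    except (ValueError, ZeroDivisionError):
        return 0.0, "Your answer could not be read as a number"
    if value != correct:
        return 0.0, "Your answer is incorrect"
    # Right value, but typed as a decimal when a fraction was wanted.
    if "/" not in text and correct.denominator != 1:
        return partial, ("Your answer is correct but in the wrong format, "
                         "give your answer as a fraction")
    # Right value, but longer than the examiner's (assumed) length limit.
    if max_length is not None and len(text) > max_length:
        return partial, "Your answer is correct but not in its most compact form"
    return 1.0, "Correct"


print(mark_format("1/2", Fraction(1, 2)))
print(mark_format("0.5", Fraction(1, 2)))
print(mark_format("10/20", Fraction(1, 2), max_length=3))
```

The length limit is a crude proxy for compactness: it catches 10/20 when the limit is 3 characters, but not 2/4, so an examiner would have to choose it question by question.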
A similar approach works for the third of these topics, in which
partial credit is given for answers which contain a numerical
approximation to the correct answer. If desired, a warning message
can be issued, as in the following example. If the correct answer to
a question is the square root of 2, a student who types
in the numerical approximation 1.414 (which is correct to 3 decimal
places) might receive the message
- Your answer should contain a square root -- 75% partial
credit
From the last two examples it can be seen that the percentage
of partial credit given can vary at the examiner's discretion.
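The approximation check can be sketched in the same spirit. The sketch assumes that the exact answer is the square root of 2, which is consistent with the approximation 1.414 quoted above, and that the exact form is recognised by a simple text match. The function name mark_approximation, the entry form sqrt(2) and the tolerance rule are all hypothetical.

```python
import math


def mark_approximation(student, exact_text, exact_value, places=3, partial=0.75):
    """Return (fraction of marks, message) when an exact form is expected."""
    text = student.replace(" ", "")
    if text == exact_text:
        return 1.0, "Correct"
    try:
        value = float(text)
    except ValueError:
        return 0.0, "Your answer is incorrect"
    # Accept a decimal that agrees with the exact value to `places` places.
    if abs(value - exact_value) <= 0.5 * 10 ** -places:
        return partial, "Your answer should contain a square root"
    return 0.0, "Your answer is incorrect"


print(mark_approximation("sqrt(2)", "sqrt(2)", math.sqrt(2)))
print(mark_approximation("1.414", "sqrt(2)", math.sqrt(2)))
print(mark_approximation("1.5", "sqrt(2)", math.sqrt(2)))
```

Keeping the percentage as a parameter mirrors the examiner's discretion: the same check can award 75% in one question and a different amount in another.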
Concerning the fourth of these topics, many questions in
Interactive PastPapers require answers in the form of unordered
or ordered linked pairs (or triples). Examples of such questions
are
- what are the factors of x^2 - 5x + 6?
- find the coordinates of the y-intercept of the line y =
3x + 7
The first question has answers x - 2 and x - 3, which
can be given in any order, whereas the second question has the
answer (0,7), in which the order is important. If only one of these
answers is correct then appropriate partial credit can be awarded.
A similar approach can deal with questions which require answers
in scientific notation of the form a × 10^b, with a and b as an
ordered linked pair, and algebraic fractions x/y, with x and
y again as an ordered pair.
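The marking of linked pairs can be sketched as follows. The function name mark_pair is hypothetical, and the equal division of credit between the parts is an assumption, since the text says only that appropriate partial credit can be awarded. Answers are compared as plain values here, whereas a real marker would compare mathematical expressions.

```python
from collections import Counter


def mark_pair(student, correct, ordered):
    """Return the fraction of marks earned for a linked pair or triple."""
    if ordered:
        # Each entry must match the entry in the same position.
        hits = sum(s == c for s, c in zip(student, correct))
    else:
        # Position is ignored; each correct entry can be matched only once.
        hits = sum((Counter(student) & Counter(correct)).values())
    return hits / len(correct)


# Factors of a quadratic: order does not matter.
print(mark_pair(["x-3", "x-2"], ["x-2", "x-3"], ordered=False))
print(mark_pair(["x-2", "x+3"], ["x-2", "x-3"], ordered=False))
# Coordinates of a point: order matters.
print(mark_pair([7, 0], [0, 7], ordered=True))
print(mark_pair([0, 5], [0, 7], ordered=True))
```

The multiset intersection in the unordered case stops a student from earning full credit by typing the same correct factor twice.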
The last item is perhaps the most difficult to deal with. As an
example of this type consider the question
- find the derivative of

To do this question ( whose correct answer is

) the student needs to know the correct rule of
differentiation to apply and the derivative of two simpler
functions. If a student were to answer

in which the correct rule has been identified
and the derivative of one of the two functions has been correctly
obtained, most human examiners would consider this answer worthy of
some credit. CALM used a device in the self-assessment part of the
pre-university CALM units in which predictable wrong answers were
programmed into trapped errors so that such predictable answers
could carry appropriate learning messages. However, it could prove
difficult to list all the possible wrong answers which a student
might come up with!
Interactive PastPapers has questions which have been set
using key steps. These key steps have been broken down into further
substeps. It is up to the students to decide whether they wish to
answer these substeps. A less confident student can tackle a
question by asking for the substeps which would possibly enable
some marks to be picked up. In such cases the student will
take longer to answer the question, but in Help or Practice mode
this can aid the learning process. In Exam mode a student has to
decide whether the extra time is worthwhile compared with the
partial credit on offer in that question. This method of awarding
partial credit has been steadily evolving over the last few years
since the original experiment in 1994 reported in Beevers et al
(1995). Whatever the mode, this extra flexibility has proved popular
with students and has been introduced after much student feedback
on earlier versions of the CALM test.
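One way to picture the scoring of key steps and substeps is the sketch below. The mark allocation is an assumption, since the paper does not state it: each key step carries a number of marks, its substeps carry marks that sum to the same total, and a student who opts for the substeps earns the marks of those answered correctly. The names KeyStep and score_key_step are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KeyStep:
    marks: float  # marks for answering the key step directly
    # Marks for each substep; assumed here to sum to `marks`.
    substep_marks: List[float] = field(default_factory=list)


def score_key_step(step, direct_correct=False, substeps_correct=None):
    """Score one key step, answered either directly or through its substeps."""
    if substeps_correct is None:
        # The student answered the key step in one go: all or nothing.
        return float(step.marks) if direct_correct else 0.0
    # The student asked for the substeps: credit each correct one.
    return float(sum(m for m, ok in zip(step.substep_marks, substeps_correct) if ok))


step = KeyStep(marks=3, substep_marks=[1, 1, 1])
print(score_key_step(step, direct_correct=True))
print(score_key_step(step, substeps_correct=[True, False, True]))
```

Under this assumed scheme the only cost of taking the substeps is time, which matches the trade-off described for Exam mode.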
IPP also provides an opportunity to deliver tests over the Web.
In the Winter term 1997, two classes totalling in excess of 350
undergraduates were introduced to tests using IPP delivered over
the Web. The syllabus for the
first class, of over 200 students, covered topics in differentiation,
including hyperbolic equations, parametric equations, and Maclaurin and
Taylor series, and topics in integration, including standard integrals,
application to differential equations, area under curves, methods
of substitution and integration by parts. Two tests were planned,
with each one carrying a mark of 10%. It was decided to give the
students the use of the program in Practice mode so that they could
see if they had errors as they went along.
The second class of about 150 science and engineering
undergraduates took a course which covered topics in the theory and
application of differentiation and aspects of complex numbers. In
this second class only one computer test was set, carrying a credit
of 10% of the module mark. Questions were set with one, two or
three key steps. A student who could answer the question in the
required number of steps could do so and move to the next question.
However, each key step could be broken down into at most a further
three sub-steps. A student who chose to tackle the question by
answering every sub-step could score partial credit as the part
answers were supplied, but this had to be balanced against the
fixed overall time allowed in a timed summative test. Clearly the
design of the key steps and substeps was
critical in ensuring that students who could not tackle the entire
question could get as much credit as they deserved for the parts
they could do.
Towards an On-line Examination for
Mathematics
It is now possible in a mathematical test to resolve the main
difficulties of examining a student using the computer to set and
mark the test. The use of random parameters in each question
ensures that each student receives a test of similar standard, while
cheating is impractical since neighbouring screens will carry
different versions of the same question and hence will have
different answers.
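The use of random parameters can be sketched with the y-intercept question quoted earlier. The function name make_question, the parameter ranges and the idea of seeding the generator per student are assumptions made for the illustration, not a description of how IPP generates its questions.

```python
import random


def make_question(seed):
    """Build one version of a question from a seed, with its answer."""
    rng = random.Random(seed)  # a private generator, so versions are repeatable
    slope = rng.randint(2, 9)
    intercept = rng.randint(1, 9)
    text = ("Find the coordinates of the y-intercept of the line "
            f"y = {slope}x + {intercept}")
    answer = (0, intercept)
    return text, answer


# Hypothetical seeds, for example one per student or per workstation.
for seed in (101, 102, 103):
    print(make_question(seed))
```

Every version tests the same skill at the same standard, and the same seed always reproduces the same version, which is useful if a result has to be checked later. Two seeds can occasionally produce the same parameters, so wider ranges or more parameters would be needed to make neighbouring versions reliably different.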
The design of the Input Tool has minimised the misinterpretation
between student and computer. Moreover, the Input Tool carries an
excellent syntax checker so that students are guided to form
meaningful expressions. The Input Tool creates a dynamic display
which provides immediate visual feedback on how the computer is
understanding the student's input. As in many other testing
situations if the student has practised with the use of the Input
Tool then the questions asked on the day of the test hold little
fear.
It is in the area of partial credit that most advances have been
made. Through the design of questions and the introduction of key
steps the less confident students can choose to take the question
in smaller steps whereas the confident, careful student can move
through a question more rapidly. Answers in the wrong
format can carry some credit and some correct answers from a list
can also be rewarded in part. Finally, answers which are not
correct can to some extent be dealt with using key
steps and sub-steps. So what remains to stop the creation of an
on-line test remotely on demand? The assurance that the person
sitting the test is the person they say they are! Such security
issues remain the major stumbling block to the setting and marking
of a test remotely by computer. Some electronic progress on the
issue of security is possible. Students can be screened by name,
location and machine and provided there is some validation of an
individual's identity by a human checker then an examination can be
delivered. Answers can be recorded and updated at every input to
prevent loss of results by electronic breakdown and results can be
encrypted if necessary. It may be that the use of a video camera
together with image processing techniques can combine to remove
even this restriction in the future (see the article by Daugman
(1997)).
References
Beevers, C E, Cherry, B G, Foster, M G and McGuire, G R, (1991),
Software Tools for Computer Aided Learning in Mathematics,
Ashgate Publishing Company.
Beevers, C E, McGuire, G R, Stirling, G and Wild, D G, (1995),
Mathematical Ability Assessed by Computer, J Comp Educ, 123
- 132.
Wild, D G, Beevers, C E, Fiddes, D J, McGuire, G R and Youngson,
M A, (1997), Interactive PastPapers for A-Level and Higher
Mathematics, Lander Educational Software, Glasgow.
McGuire, G R, Wild, D G, Beevers, C E, Fiddes, D J and Youngson,
M A, (1998), Computer Aided Learning in Mathematics --- Highers
CALM Units and Interactive PastPapers, Scottish Mathematical
Council Journal, 53 - 57.
Harding R D and Quinney, D, (1996), Mathwise and the UK
Mathematics Courseware Consortium, Active Learning, 53 -
57.
Beevers, C E, Maciocia, A, Prince, A R and Scott, T D, (1996),
Pooling Mathematical Resources, Active Learning, 41 -
42.
Beevers, C E, and Scott, T D, (1997), SUMSMAN ---
Collaboration between Scottish Universities, Proc Int Conf Tech
and Collegiate Math, Chicago.
Beevers, C E, Bishop, P and Quinney, D, (to appear), Mathwise
Diagnostic Testing and Assessment, CITech.
Wild, D G, (1995), Computer Assessment in Mathematics, PhD
thesis, Heriot-Watt University.
Appleby, J, Samuels P and Treasure-Jones, T, (1997),
Diagnosys: A Knowledge Based Diagnostic Test of Basic Mathematical
Skills, J Comp Educ, 113 - 133.
Bridges, S and Hibberd, S, (1994), Construction and
Implementation of a Computer-Based Diagnostic Test, CTI Maths
and Stats Newsletter.
Harding, R D, Lay, S W, Moule, H and Quinney, D A, (1995), A
Mathematical Toolkit for Interactive Hypertext Courseware: Part of
the Mathematics Experience with the Renaissance Project, J Comp
Educ, 127.
McCabe, M, (1995), Designer Software for Mathematics
Assessment, CTI Maths and Stats Newsletter, 11 - 16.
Tabor, J, (1993), Review of CALMAT, CTI Maths and Stats
Newsletter, (1), 14 - 18.
Ashworth, M, (1998), Maths Resources on the Scottish
MANs, CTI Maths and Stats Newsletter, 32.
Daugman, J, (1997), Face and Gesture Recognition: Overview,
IEEE Transactions on Pattern Analysis and Machine Intelligence,
675-676.
Copyright (c) 1996-2001 Computer Aided Learning in Mathematics (CALM) Group,
Department of Mathematics, Heriot-Watt University.
