CALM: Computer Aided Learning in Mathematics

Issues of Partial Credit in Mathematical Assessment by Computer

By C E Beevers, D G Wild, G R McGuire, D J Fiddes and M A Youngson

Department of Mathematics, Heriot-Watt University, Edinburgh
email: c.e.beevers@hw.ac.uk, d.g.wild@hw.ac.uk, g.r.mcguire@hw.ac.uk, d.j.fiddes@hw.ac.uk, m.a.youngson@hw.ac.uk
Fax: 0131 451 3249

The CALM Project for Computer Aided Learning in Mathematics has operated at Heriot-Watt University in Edinburgh since 1985. From the beginning CALM has featured assessment in its programs, Beevers et al (1991), and has enabled both students and teachers to view progress in formative assessment. The computer can play a role in at least four types of assessment: diagnostic, self-test, continuous and grading assessment. The TLTP project Mathwise employs the computer in three of these roles. In 1994 CALM reported on an educational experiment in which the computer was used for the first time to grade, in part, the learning of a large class of Service Mathematics students, Beevers et al (1995), using the Mathwise assessment template. At that time the main issues identified were those of partial credit and communication between the student and the computer. These educational points were addressed in the next phase of the CALM Project, in which the commercial testing program Interactive PastPapers was developed. The main aim of this paper is to describe how Interactive PastPapers has been able to incorporate some approaches to partial credit which have helped to alleviate student worries on these issues. Background information on other features of Interactive PastPapers is also included to place them in context.

Introduction

The CALM Project for Computer Aided Learning in Mathematics started at Heriot-Watt University in 1985. In the first phase of the project, CAL materials were created in Calculus, with each topic structured around Theory, Worked Examples, Motivating Examples and a Test. Students chose most readily to work through the Test section questions, welcoming the chance to assess their own progress. In addition, the teachers could view class progress. The weekly tests were designed from banks of questions with randomised parameters in each question. These questions prompted students for a mathematical answer and asked them to type in their response on one line using a style similar to computer languages such as Pascal. Students of engineering and science took only a short time to adjust to this approach and many educational advantages accrued. The routines developed in those early stages of CALM meant that testing could be more meaningful and need not rely on the more usual multiple choice format favoured by so many computer projects. Over the years 1989-1992 CALM developed techniques to trap predictable wrong answers, and this form of self-testing proved to be a powerful learning aid for students. Nevertheless some problems remained, which we will discuss in greater detail in Sections 3 and 4.

Despite these problems the assessment template which CALM created was used as a testing mechanism in collaborative projects such as the TLTP project, called Mathwise, and the SUMSMAN project. For further details on these the interested reader is directed to Harding & Quinney (1996), Beevers et al (1996), Beevers & Scott (to appear) and Beevers et al (to appear).

Meanwhile the CALM project focussed on the production of a commercial program, Interactive PastPapers, aided by the award of a Higher Education Tercentenary prize from the Bank of Scotland. This enabled the group to investigate solutions to such problems as partial credit and some difficulties with student input, incorporating the ideas of Wild (1995) (see also Wild et al (1997) and McGuire et al (1998) for further details). The features of IPP (as Interactive PastPapers is affectionately known) relating to these issues will be described in Section 3, following a discussion of some of the educational issues raised by using computer assessment. The paper concludes with a summary of the way ahead in this area.

Educational discussion

Students of Mathematics can be assessed in a number of ways. The computer can play a role in, at least, four types of assessment:

  1. diagnostic tests, in which the emphasis is on helping students discover their strengths and weaknesses;
  2. self-tests, in which the emphasis is on rapid feedback and which can be used to pick out predictable wrong answers;
  3. continuous assessment, in which students and teachers alike can see how mathematical topics are being absorbed; and
  4. grading assessment, in which the computer is used in the setting and marking of examinations.

Over the years there has been much work in the area of diagnostic testing, with examples like Diagnosys (Appleby et al (1997)). Diagnostic testing using student profiling has been studied by Bridges & Hibberd (1994). This latter work forms the basis of some current work on web delivery of diagnostic tests.

Self-testing is a feature of the work of Harding et al (1995) in the projects Renaissance and Nuffield. CALM used this approach too in its pre-university Mathematics courseware Units, where predictable wrong answers were trapped and fast feedback helped students consolidate their learning. Self-tests are also a feature of the many Mathwise modules developed during the TLTP initiative.

Continuous assessment was the driving force in the first CALM Project with weekly tests helping the students assess their own progress through the material. Brunel, Portsmouth and Glasgow Caledonian Universities have used a similar approach with QuestionMark software and the CALMAT units (see McCabe (1995) and Tabor (1993) for separate references).

Grading assessment by computer has been pioneered at Heriot-Watt University, initially using the Mathwise assessment mechanism but latterly with the Interactive PastPapers assessment engine delivered over the web. Napier University, under the SUMSMAN Project, is employing Mathwise to assess its students and this is starting in other Scottish Universities (see the article by Ashworth (1998)).

The commercial program Interactive PastPapers has a number of features that help with the different types of assessment described above. For example, diagnostic tests can be set up using the multiple choice type of question. The marks recording service offered by IPP is then able to supply information on the strengths and weaknesses of individual students.

IPP has three modes of delivery:

  1. Help mode, in which students can reveal answers if they are stuck on a particular part;
  2. Practice mode, in which the users are given visible feedback on the correctness of answers as they proceed through a question; and
  3. Examination mode, in which the computer marks the question but no visible sign is shown on screen.

These three modes simulate self-test, continuous assessment and grading assessment respectively.
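
The three modes can be pictured as a small switch on what the student is shown. The following is only a sketch: the names `Mode`, `on_answer` and `can_reveal` are hypothetical, and it assumes that Help mode gives the same correctness feedback as Practice mode, which the description above does not state.

```python
from enum import Enum


class Mode(Enum):
    HELP = "help"
    PRACTICE = "practice"
    EXAMINATION = "examination"


def on_answer(mode, is_correct):
    """Return the text shown on screen after a part has been answered."""
    if mode is Mode.EXAMINATION:
        return ""  # the part is still marked, but nothing is displayed
    # Assumption: Help mode shows the same feedback as Practice mode.
    return "correct" if is_correct else "incorrect"


def can_reveal(mode):
    """Only Help mode lets a stuck student reveal the answer."""
    return mode is Mode.HELP
```

In this sketch the marking itself would be identical in all three modes; only the display changes.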

Questions can be chosen from banks of typical examples randomly, by topic or by particular question, thus providing different forms of testing at different times in the learning cycle. When a test has been chosen the student can browse the questions, as in a conventional written examination, before moving on to answer a question. In the following section those aspects of IPP which deal with problems of partial credit and student input difficulties will be considered.

Aspects of partial credit

When a student gets the same answer as the examiner to a mathematical question, assigning marks to their answer is simple both for a human and for a computer. However, deciding how to assign marks to other answers may be tackled differently by humans and computers. This has led to some student anxiety about whether their results in a computer exam accurately reflect their knowledge. The issues of greatest concern were:

  1. the computer input of a mathematical answer being interpreted by the computer in a different manner from that which the student intended;
  2. the student answer being correct but in the wrong format;
  3. the student giving a numerical approximation to the correct answer;
  4. the student getting one of a pair of answers correct;
  5. the student being able to answer part of the question but not all of it.

The first of these has been tackled in Interactive PastPapers by the use of an Input Tool. This device shows the student how a one-line mathematical entry is being interpreted by the computer as the typing proceeds. For example, the student may wish the answer to be $\frac{1}{2x}$ and so types in 1/2x, and the Input Tool window shows this as

$$\frac{1}{2}x$$

This shows that the computer has interpreted the input differently from what was intended, so the student can give the correct input as 1/(2x), which appears in the Input Tool window as

$$\frac{1}{2x}$$

In addition, the Input Tool provides feedback on the syntax of expressions such as missing brackets and the misuse of operands.
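
The idea of echoing back the computer's reading of a one-line entry can be sketched in a few lines. This is not the Input Tool itself: it is a minimal illustration that borrows Python's own expression parser, and the function names and the conventions that `2x` means `2*x` and `^` means a power are assumptions made for the example.

```python
import ast
import re

_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/", ast.Pow: "^"}


def show_interpretation(typed):
    """Return a fully bracketed reading of a one-line entry, or a syntax message."""
    # Assumed conventions: '^' is a power and a digit followed by a letter
    # or an opening bracket is an implied multiplication.
    source = typed.replace("^", "**")
    source = re.sub(r"(\d)\s*([A-Za-z(])", r"\1*\2", source)
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        return "syntax error: check brackets and operators"
    return _render(tree.body)


def _render(node):
    """Write every binary operation with explicit brackets."""
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        op = _OPS[type(node.op)]
        return f"({_render(node.left)} {op} {_render(node.right)})"
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return f"-{_render(node.operand)}"
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return str(node.value)
    if isinstance(node, ast.Name):
        return node.id
    raise ValueError("unsupported expression")


print(show_interpretation("1/2x"))    # ((1 / 2) * x)
print(show_interpretation("1/(2x)"))  # (1 / (2 * x))
```

The bracketed echo makes the difference between the two entries visible before the answer is submitted, and an unclosed bracket such as `1/(2x` produces the syntax message instead.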

Turning to the second of these items, answers which are close to the correct one but do not take the approved form can be awarded partial credit, as in the following example. If the answer to a problem is 1/2 and the student types 0.5 then the message

  • Your answer is correct but in the wrong format, give your answer as a fraction -- 50% partial credit
can be displayed. Interactive PastPapers tries to capture good mathematical practice by looking for answers in a compact form. So, a maximum length can be specified and students awarded partial credit for their answer if it is close to the correct answer but not in the most compact form.
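
A format check of this kind might be sketched as follows. The function name and return shape are hypothetical, the 50% figure is taken from the example message, and the test for "fraction form" (the presence of a `/`) is a deliberate simplification.

```python
from fractions import Fraction


def mark_fraction_answer(typed, correct, format_credit=0.5):
    """Return (credit, message) for an answer expected as a fraction a/b."""
    try:
        value = Fraction(typed.strip())  # accepts '1/2' as well as '0.5'
    except (ValueError, ZeroDivisionError):
        return 0.0, "Your answer could not be read"
    if value != correct:
        return 0.0, "Your answer is incorrect"
    # Simplification: any entry containing '/' counts as fraction form,
    # and whole-number answers need no fraction bar.
    if "/" in typed or value.denominator == 1:
        return 1.0, "Correct"
    return format_credit, (
        "Your answer is correct but in the wrong format, "
        "give your answer as a fraction"
    )


print(mark_fraction_answer("0.5", Fraction(1, 2))[0])  # 0.5
```

This sketch does not enforce the compact-form rule: an entry such as 2/4 would receive full credit here, whereas a maximum-length check of the kind described above could reduce it.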

A similar approach works for the third of these topics, in which partial credit is given for answers which contain a numerical approximation to the correct answer. If desired, a warning message can be issued, as in the following example. If the correct answer to a question is $\sqrt{2}$, a student who types in the numerical approximation 1.414 (which is correct to 3 decimal places) might receive the message

  • Your answer should contain a square root -- 75% partial credit
From the last two examples it can be seen that the percentage of partial credit given can vary at the examiner's discretion.
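
The approximation case can be sketched in the same spirit. The names here are hypothetical, the list of accepted exact spellings is supplied by the question setter, and the 75% figure simply follows the example message.

```python
import math


def mark_surd_answer(typed, exact_forms, exact_value,
                     approx_credit=0.75, places=3):
    """Return (credit, message) when the expected answer is an exact surd."""
    text = typed.replace(" ", "")
    if text in exact_forms:
        return 1.0, "Correct"
    try:
        value = float(text)
    except ValueError:
        return 0.0, "Your answer is incorrect"
    # A decimal earns partial credit if it agrees with the exact value
    # once both are rounded to the stated number of places.
    if round(value, places) == round(exact_value, places):
        return approx_credit, "Your answer should contain a square root"
    return 0.0, "Your answer is incorrect"


accepted = {"sqrt(2)", "2^(1/2)"}
print(mark_surd_answer("1.414", accepted, math.sqrt(2))[0])  # 0.75
```

Keeping the credit as a parameter reflects the point that the percentage is at the examiner's discretion.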

Concerning the fourth of these topics, many questions in Interactive PastPapers require answers in the form of unordered or ordered linked pairs (or triples). Examples of such questions are

  1. what are the factors of $x^2 - 5x + 6$?
  2. find the coordinates of the y-intercept of the line y = 3x + 7
The first question has answers x - 2 and x - 3, which can be given in any order, whereas the second question has the answer (0,7), in which the order is important. If one of these answers is correct then appropriate partial credit can be awarded.

A similar approach can deal with questions which require answers in scientific notation of the form $a \times 10^b$, with a and b as an ordered linked pair, and algebraic fractions x/y, with x and y again as an ordered pair.
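
Marking linked answers can be sketched as below. The function name is hypothetical, and awarding an equal share of credit per correct entry is an assumption; the text says only that "appropriate" partial credit is given.

```python
def mark_linked(student, correct, ordered):
    """Return the fraction of credit for a linked pair or triple of answers."""
    if len(student) != len(correct):
        return 0.0
    if ordered:
        # Position matters, as for the coordinates (0, 7).
        hits = sum(s == c for s, c in zip(student, correct))
    else:
        # Order is free, but each correct entry may be matched only once.
        remaining = list(correct)
        hits = 0
        for entry in student:
            if entry in remaining:
                remaining.remove(entry)
                hits += 1
    return hits / len(correct)


print(mark_linked(["x-3", "x-2"], ["x-2", "x-3"], ordered=False))  # 1.0
print(mark_linked([7, 0], [0, 7], ordered=True))                   # 0.0
```

The same routine covers scientific notation and algebraic fractions by treating the two parts as an ordered pair.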

The last item is perhaps the most difficult to deal with. As an example of this type consider the question

  • find the derivative of [formula missing]
To do this question (whose correct answer is [formula missing]) the student needs to know the correct rule of differentiation to apply and the derivatives of two simpler functions. If a student were to give the answer [formula missing], in which the correct rule has been identified and the derivative of one of the two functions has been correctly obtained, most human examiners would consider this answer worthy of some credit. CALM used a device in the self-assessment part of the pre-university CALM units in which predictable wrong answers were programmed as trapped errors, so that such answers could carry appropriate learning messages. However, it could prove difficult to list all the possible wrong answers which a student might come up with!
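
A trapped-error table can be sketched as a lookup from predictable wrong answers to learning messages. The question used here (differentiate x^3), the traps and the messages are all invented for illustration and are not taken from the CALM units; answers are compared as normalised strings, not by algebraic equivalence.

```python
def normalise(text):
    """Strip spaces and case so that '3 X^2' and '3x^2' compare equal."""
    return text.replace(" ", "").lower()


# Hypothetical question: differentiate x^3.
CORRECT = "3x^2"
TRAPS = {
    "x^4/4": "You have integrated rather than differentiated.",
    "3x^3": "Remember to reduce the power by one.",
    "x^2": "Remember to multiply by the original power.",
}


def respond(typed):
    """Return the feedback message for one typed answer."""
    answer = normalise(typed)
    if answer == normalise(CORRECT):
        return "Correct"
    # Anything not listed as a trap falls through to a plain 'Incorrect'.
    return TRAPS.get(answer, "Incorrect")


print(respond("X^4 / 4"))  # You have integrated rather than differentiated.
```

The fall-through case shows the limitation noted above: any wrong answer the setter did not foresee receives no tailored message.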

Interactive PastPapers has questions which have been set using key steps. These key steps have been broken down into further substeps, and it is up to the students to decide whether they wish to answer these substeps. A less confident student can tackle a question by asking for the substeps, which may enable some marks to be picked up. In such cases the student will take longer to answer the question, but in Help or Practice mode this can aid the learning process. In Examination mode a student has to decide whether the extra time is worthwhile compared with the partial credit on offer in that question. This method of awarding partial credit has been steadily evolving over the last few years since the original experiment in 1994 reported in Beevers et al (1995). Whatever the mode, this extra flexibility has proved popular with students and has been introduced after much student feedback on earlier versions of the CALM test.

IPP also provides an opportunity to deliver tests over the Web. In the Winter term of 1997, two classes with in excess of 350 undergraduates between them were introduced to tests using IPP delivered over the Web. The syllabus for the first class, of over 200 students, covered topics in differentiation, including hyperbolic equations, parametric equations, and Maclaurin and Taylor series, and topics in integration, including standard integrals, application to differential equations, area under curves, methods of substitution and integration by parts. Two tests were planned, each carrying a mark of 10%. It was decided to give the students the use of the program in Practice mode so that they could see if they had errors as they went along.

The second class, of about 150 science and engineering undergraduates, took a course covering topics in the theory and application of differentiation and aspects of complex numbers. In this second class only one computer test was set, carrying a credit of 10% of the module mark. Questions were set with one, two or three key steps. A student who could answer the question in the required number of steps could do so and move to the next question. However, each key step could be broken down into at most a further three sub-steps, and a student who chose to tackle the question by answering every sub-step could score partial credit as the part answers were supplied. This choice had to be balanced against the fixed overall time allowed in a timed summative test. Clearly the design of the key steps and sub-steps was critical in ensuring that students who could not tackle the entire question could get as much credit as they deserved for the parts they could do.
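
One way to picture marking by key steps and sub-steps is sketched here. The paper does not say how marks are divided, so the scheme is an assumption: a key step answered directly earns its full marks, and otherwise each correct sub-step earns its own share. The names `KeyStep` and `score_key_step` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KeyStep:
    marks: float
    # Marks attached to each sub-step (at most three, as described above).
    substep_marks: list = field(default_factory=list)


def score_key_step(step, direct_correct=False, substeps_correct=()):
    """Score one key step answered either directly or through its sub-steps."""
    if direct_correct:
        return step.marks
    earned = sum(
        marks for marks, ok in zip(step.substep_marks, substeps_correct) if ok
    )
    # Sub-step credit can never exceed the value of the key step itself.
    return min(earned, step.marks)


step = KeyStep(marks=3, substep_marks=[1, 1, 1])
print(score_key_step(step, substeps_correct=[True, False, True]))  # 2
```

Under this assumed scheme the cost of taking the sub-steps is time, not marks, which matches the trade-off faced in a timed test.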

Towards an On-line Examination for Mathematics

It is now possible in a mathematical test to resolve the main difficulties of examining a student using the computer to set and mark the test. The use of random parameters in each question ensures that each student receives a test of similar standard, while cheating is impractical since neighbouring screens will carry different versions of the same question and hence will have different answers.
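
A randomised question can be sketched as a template whose parameters are drawn from a seeded generator. The function name, the choice of a factorisation question and the use of a per-student seed are assumptions for illustration only.

```python
import random


def make_factor_question(seed):
    """Build one randomised 'factorise the quadratic' question."""
    rng = random.Random(seed)  # a per-student seed gives a reproducible variant
    r1, r2 = rng.sample(range(1, 10), 2)  # two distinct positive roots
    text = f"What are the factors of x^2 - {r1 + r2}x + {r1 * r2}?"
    answer = frozenset({f"x - {r1}", f"x - {r2}"})  # unordered pair of factors
    return text, answer


question, answer = make_factor_question(seed=1234)
print(question)
print(sorted(answer))
```

Every variant has the same structure and difficulty, yet students at neighbouring screens, seeded differently, will usually see different numbers and so need different answers.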

The design of the Input Tool has minimised the misinterpretation between student and computer. Moreover, the Input Tool carries an excellent syntax checker so that students are guided to form meaningful expressions. The Input Tool creates a dynamic display which provides immediate visual feedback on how the computer is understanding the student's input. As in many other testing situations if the student has practised with the use of the Input Tool then the questions asked on the day of the test hold little fear.

It is in the area of partial credit that most advances have been made. Through the design of questions and the introduction of key steps, the less confident students can choose to take the question in smaller steps whereas the confident, careful student can move through a question more rapidly. Answers in the wrong format can carry some credit and some correct answers from a list can also be rewarded in part. Finally, answers which are not fully correct can to some extent be dealt with using key steps and sub-steps. So what remains to stop the creation of an on-line test delivered remotely on demand? The assurance that the person sitting the test is the person they say they are! Such security issues remain the major stumbling block to the setting and marking of a test remotely by computer. Some electronic progress on the issue of security is possible. Students can be screened by name, location and machine and, provided there is some validation of an individual's identity by a human checker, an examination can be delivered. Answers can be recorded and updated at every input to prevent loss of results through electronic breakdown, and results can be encrypted if necessary. It may be that a video camera together with image processing techniques can remove even this restriction in the future (see the article by Daugman (1997)).
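
Recording answers at every input can be sketched with an append-only log that is flushed to disk on each entry. The file layout and function names are hypothetical, and encryption of the results is left out of the sketch.

```python
import json
import os


def record_answer(log_path, student_id, question, part, answer):
    """Append one input to the log and force it to disk straight away."""
    entry = {
        "student": student_id,
        "question": question,
        "part": part,
        "answer": answer,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
        log.flush()
        os.fsync(log.fileno())  # survive a crash directly after this input


def latest_answers(log_path):
    """Replay the log; a later entry for the same part replaces an earlier one."""
    latest = {}
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            entry = json.loads(line)
            key = (entry["student"], entry["question"], entry["part"])
            latest[key] = entry["answer"]
    return latest
```

Because every input is appended, not overwritten, the state of a test can be rebuilt after a breakdown by replaying the log.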

References

Beevers, C E, Cherry, B G, Foster, M G and McGuire, G R, (1991), Software Tools for Computer Aided Learning in Mathematics, Ashgate Publishing Company.

Beevers, C E, McGuire, G R, Stirling, G and Wild, D G, (1995), Mathematical Ability Assessed by Computer, J Comp Educ, 123-132.

Wild, D G, Beevers, C E, Fiddes, D J, McGuire, G R and Youngson, M A, (1997), Interactive PastPapers for A-Level and Higher Mathematics, Lander Educational Software, Glasgow.

McGuire, G R, Wild, D G, Beevers, C E, Fiddes, D J and Youngson, M A, (1998), Computer Aided Learning in Mathematics: Highers CALM Units and Interactive PastPapers, Scottish Mathematical Council Journal, 53-57.

Harding, R D and Quinney, D, (1996), Mathwise and the UK Mathematics Courseware Consortium, Active Learning, 53-57.

Beevers, C E, Maciocia, A, Prince, A R and Scott, T D, (1996), Pooling Mathematical Resources, Active Learning, 41-42.

Beevers, C E and Scott, T D, (1997), SUMSMAN: Collaboration between Scottish Universities, Proc Int Conf Tech and Collegiate Math, Chicago.

Beevers, C E, Bishop, P and Quinney, D, (to appear), Mathwise Diagnostic Testing and Assessment, CITech.

Wild, D G, (1995), Computer Assessment in Mathematics, PhD thesis, Heriot-Watt University.

Appleby, J, Samuels, P and Treasure-Jones, T, (1997), Diagnosys: A Knowledge Based Diagnostic Test of Basic Mathematical Skills, J Comp Educ, 113-133.

Bridges, S and Hibberd, S, (1994), Construction and Implementation of a Computer-Based Diagnostic Test, CTI Maths and Stats Newsletter.

Harding, R D, Lay, S W, Moule, H and Quinney, D A, (1995), A Mathematical Toolkit for Interactive Hypertext Courseware: Part of the Mathematics Experience with the Renaissance Project, J Comp Educ, 127.

McCabe, M, (1995), Designer Software for Mathematics Assessment, CTI Maths and Stats Newsletter, 11-16.

Tabor, J, (1993), Review of CALMAT, CTI Maths and Stats Newsletter, (1), 14-18.

Ashworth, M, (1998), Maths Resources on the Scottish MANs, CTI Maths and Stats Newsletter, 32.

Daugman, J, (1997), Face and Gesture Recognition: Overview, IEEE Transactions on Pattern Analysis and Machine Intelligence, 675-676.

Copyright (c) 1996-2001 Computer Aided Learning in Mathematics (CALM) Group, Department of Mathematics, Heriot-Watt University.
