
MCQFM: Towards a Web 2.0 Approach to Objective Testing

Steve Bennett
Last modified 12 Dec, 2007
Published 11 Dec, 2007
Steve Bennett describes the MCQFM (Multiple Choice Questions Five Methods) toolkit project, which has produced a simple notation for multiple choice questions (QTEXT) to facilitate conversion into QTI and QML formats. The project has also produced a web service with six conversion functions which can be integrated into any application. Steve begins by discussing the difficulties of translating questions between different objective test formats.

You would have thought that objective test questions would be one of the most instantly shareable educational commodities around. They are extremely granular - probably in most cases fewer than 100 words. They don't really need any metadata because everything you want to know about the question is in the question itself. Actually, that's not quite true: you might, for example, wish to know the author and the course (year and subject) for which it was written. But beyond that... what else? By their very nature, multiple choice questions tend to be written extremely clearly, because in test conditions you can't have students putting up their hands to query regular moments of ambiguity. And yet question sharing doesn't seem to go on very much. Or at least no-one has told me about it.

For instance, when learning Flash, I continually search for tutorials on the web, and I find many. No matter how specific or arcane the particular bit of ActionScript I want to learn about is, I almost always find something illuminating, educational and immediately applicable to what I am doing. However, if I try to find quizzes about Flash, I am lucky if I can find one. I am much more likely to find a tutorial teaching me how to write a Flash application which will do quizzes!

There is one good and justifiable reason for this. If a student has exposure to an objective test question, the ability of that question subsequently to test the student's knowledge is dramatically reduced. For instance, if I see the question:

Question: Can you handle mouse events in a frame script in Flash?

Distractor: Yes - just write on (press)

Distractor: Yes - just write on (Release)

Answer: Yes - if you reference the instance name of the button in the script, e.g. myButton.onRelease = function() {}

Distractor: No it is completely impossible

Now then, to answer this does require some sophistication in Flash. However, the student who has no such sophistication, but nonetheless takes this question, gets it wrong and is pointed to the right answer (or gets the right answer after four attempts), will subsequently be able to answer it correctly - because they will remember the form! Therefore there is justification in this setting for not sharing questions which are to be used in high-stakes summative testing. However, there is no such justification for hoarding formative test questions. So why is it so hard to find good questions? And how can MCQFM help?

First Obstacle: Incompatible Formats and Occult Meanings!

The three major formats for objective test questions are:

  • QTI Version 2.1
  • QTI Version 1.2
  • and Questionmark Perception's QML.

All three are XML constructs - and are therefore (you would have thought) amenable to translation via XSLT or other means. For instance, a single-answer multiple choice question ought to be translatable into another XML variant. And in fact it is. But think of the multiple-answer question. Suppose it is something like:

Which of the following types of text fields can have variables associated with them in Flash?

Static text fields

@Dynamic text fields

@Input text fields

(@ denotes the correct answer)

Now this seems clear enough. But in both QTI and QML it could be represented in diverse ways. We could say the right answer is a piece of Boolean logic: B AND C AND NOT A. But we could also represent it as +1 for B, +1 for C and 0 for A (and we could also score it negatively with a penalty). At this point we are not just doing an act of straightforward one-to-one translation: rather we must seek to establish the meaning of the question - and reproduce that meaning in the other format.
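
To make the contrast concrete, here is a minimal sketch of the two readings as imperative code (Python is used purely for illustration; this is not MCQFM code, and the option texts are simply those of the question above):

# Two ways of scoring the same multiple-response question.
# The learner's response is the set of options they ticked.

CORRECT = {"Dynamic text fields", "Input text fields"}

def score_boolean(response):
    # "B AND C AND NOT A": full marks only for exactly the right set of ticks.
    return 1 if set(response) == CORRECT else 0

def score_accumulative(response):
    # +1 for each correct option ticked, 0 for the distractor.
    return sum(1 for choice in response if choice in CORRECT)

print(score_boolean(["Dynamic text fields"]))       # 0 - partially right counts as wrong
print(score_accumulative(["Dynamic text fields"]))  # 1 - partial credit

The two functions award different marks to the same response, which is precisely the meaning a translator has to pin down before it can move the question between formats.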

In this way, the act of translation becomes remarkably similar to that of language translation. If I want to say "the straw that broke the camel's back" in Italian, I do not say "il pezzo di paglia che ha rotto la schiena del camello" (the literal translation). I say "the drop of water that made the vase overflow" ("il goccio che ha fatto traboccare il vaso d'acqua"). Similarly, here we need to make some sort of guess at the essential meaning of the question, and reproduce it in the other XML dialect.

But now let's make it even more complicated. Imagine the following question that might appear in a psychology test:

Which of these are valid responses to belonging to unappealing social groups?

a) Leave the group

b) Seek to change the group

c) Remain in the group but complain about it intensely

d) Do passive-aggressive things like wearing slovenly clothes whenever the group meets

e) Try to change your attitude to see the good in each member of the group

In this case (as in many in life) there might not be a single right answer. If you are keen on trying to change the group and your attitude to it, (b) and (e) are a logical choice. A more negative but still coherent choice would be (c) and (d) - moan about the group and indulge in passive-aggressive behaviours within it. The only thing we do know is that choosing from both (b OR e) AND (c OR d) represents completely incoherent behaviour (e.g. moaning about everyone in the group and trying to see the best in them at the same time). We might wish to score anything except that last condition as +2 and every instance of that incoherent condition as -2. But you can imagine how difficult it might be for XSLT to do a translation here. Are we doing an accumulation question, is it a Boolean logic question, or could it be a multiplicity of Boolean conditions with accumulative marking between them? As you can see, this is not ideal territory for the tree-descending, symbol-traversing logic of the XSLT processor.
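
Just to show why this is uncomfortable territory for a declarative transform, here is one possible reading of that marking rule written as a Python sketch (the interpretation of the rule is my own, not anything defined by QTI, QML or MCQFM):

# Options b/e are the "constructive" responses, c/d the "negative" ones;
# mixing the two groups is treated as the incoherent case.

CONSTRUCTIVE = {"b", "e"}
NEGATIVE = {"c", "d"}

def score(response):
    chosen = set(response)
    incoherent = bool(chosen & CONSTRUCTIVE) and bool(chosen & NEGATIVE)
    return -2 if incoherent else 2

print(score(["b", "e"]))   # 2: coherent
print(score(["c", "d"]))   # 2: coherent, if gloomy
print(score(["b", "c"]))   # -2: moaning and seeing the best at the same time

Half a dozen lines of imperative logic, but expressing the same rule as a mapping between two XML vocabularies is a great deal harder.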

Finally, there is the question of scoring. The reason for sharing multiple choice questions would ultimately be to put questions from different sources into the same assessment. However, if, for example, question bank Topic A contains questions which all score a maximum of 10 for a right answer, while all the questions in Topic B score only 1 for a right answer, we are going to have to do a lot of question tweaking to make them work together in the same test. We might wonder whether the scoring is properly a property of the question - or rather of the test? As something having meaning during its instantiation among other questions - yes. But as an abstract concept (e.g. this question inherently deserves 10 marks for a correct response) it is absurd.

The Second Obstacle: The Browsability and Auditability of Questions

For most of us teaching in Higher Education, a tedious but necessary part of our work is moderation, by which I mean getting our students' coursework, assignments and tests looked over by our peers, both internal and external to our institutions. One of the things a moderator might wish to check is the comprehensibility of our questions (avoiding double negatives in question stems, for instance). They might also wish to see the kinds of alternatives on offer. But crucially they ought to be able to see which is the right answer. And they want to do so in printed form. Can you do that in most question authoring environments? I don't think it is possible in Questionmark Perception 3 (I am not sure about version 4 as I have not used it). In fact my own Director of Studies made a note on some coursework I submitted in a previous moderation exercise, where I had just printed the Perception test as it appeared in the browser. "Show which is the right answer," he wrote on the paper!

And this is not only necessary for the moderator. The fellow academic seeking to reuse your question will also need to know what you regard as being the right answer. This is because the kinds of questions we ask in HE can often be very nuanced and subtle, where an argument can easily be advanced in favour of a competing proposition (I am thinking of Oscar Wilde's thought that "A truth in art is something whose contradiction is also true"). Not all questions in a properly academic setting will have completely unfuzzy boundaries between truth and falsity. Therefore I can't just look at your question and decide to include it in my test without seeing the right answer.

So, summing up, the barriers to the sharing of question content are:

(1) the legitimate desire to limit the exposure of questions used in high-stakes summative assessment.

But for all the other questions, the obstacles are:

(2) the difficulty of translating between formats; and

(3) the difficulty of browsing representations of these questions.

(At this point I am dealing with what I consider to be immutable facts relating to the very phenomenon of objective test items. However, there are also many local and historical issues, such as the different renderings of questions by VLE engines - Dick Bacon spoke very interestingly on this at the last CETIS Assessment SIG [1], comparing QTI 1.2 items as dealt with by Blackboard, WebCT and Moodle.)

The MCQFM Solution

Our solution is to radically simplify the range of possible questions you can share, and to distil them into a simple textual essence which renders them transparent to anyone who wants to use them. In order to be transparent they must be immediately meaningful to the reader. Here are some examples:

MCQ

What is the capital of China?|Beijing is "northern" (bei) "capital" (jing)

Shanghai

@Beijing

Kunming

MCQ

Which of these territories belong to the People's Republic of China?|Only Taiwan is outside the PRC

Taiwan

@Mainland China

@Hong Kong

@Macau

CLOZE

Put in the correct answer here|Ni hao

The Mandarin for hello is @ni@ @hao@

ORDER

Put these Chinese cities in order by number of inhabitants|Shanghai is not the capital; it is the commercial hub and has 20 million inhabitants

Shanghai

Beijing

Xi'an

ORDER

Where do they speak what?|It's Mandarin in the two main cities

Beijing@Mandarin

Hong Kong@Cantonese

Shanghai@Mandarin

Now, the ORDER question presupposes some kind of randomisation of the pairs or of the initial order of the items, but the true answer is immediately visible and comprehensible. The fill-in-the-blanks question has the true answers placed inside the blanks as they will appear on the screen. In all cases there is the assumption that the question is a Boolean entity: it is either right or wrong. How you wish to score that is your business - by default it is 1 and 0.
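
Because the notation is so plain, even parsing it is close to trivial. The following Python sketch shows one way an MCQ block might be read into a structure; it is an approximation based only on the examples above, not the parser MCQFM actually uses:

def parse_mcq(block):
    # First line: question type; second line: stem|feedback; the rest: options,
    # with "@" marking the correct answer(s).
    lines = [line.strip() for line in block.strip().splitlines() if line.strip()]
    qtype = lines[0]
    stem, _, feedback = lines[1].partition("|")
    options = [{"text": line.lstrip("@"), "correct": line.startswith("@")}
               for line in lines[2:]]
    return {"type": qtype, "stem": stem, "feedback": feedback, "options": options}

sample = """MCQ
What is the capital of China?|Beijing is "northern" (bei) "capital" (jing)
Shanghai
@Beijing
Kunming"""

print(parse_mcq(sample))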

Here are the same questions in the MCQFM interface:

Figure 1: MCQFM Question Editing

Figure 2: MCQFM Test Interface

Now the purists might baulk at this. How appalling, they might think, that you can't specify alternate spellings or capitalisations in the cloze question. What a shame, they might moan, that you can't have specific feedback depending on the choice you made rather than one piece of feedback for all possible answers. How depressing, they might lament, that the author cannot control the initial associations and orderings to be found in an order question. How grim, they might contend, that you can't have differential, nay negative, marking in multiple response questions! And in a way they are right. What we are doing here is very similar to lossy compression. We are removing (forever) data that we consider too minuscule for the hard-working academic to deal with. We have a more brutal palette of question types - we do not have the same levels of sophistication.

But on the plus side, what we have is instantly auditable and moderatable questions; human-readable questions; formats that we can share between colleagues via email and text editors rather than via cumbersome dedicated question editors. Moreover, these questions yield up their meaning at a single glance, whereas a question editor is likely to involve us in all manner of modal dialogues where all the vital matter in a question is secreted away (the wizard for a Questionmark single-answer multiple choice question involves us entering the stem on screen a, the alternatives on screen b, the scoring of the alternatives on screen c, and the feedback on screen d). The question is: are all the customisation abilities of the traditional specifications in any way purposeful for the vast majority of objective testing going on? Might it not be a vast over-egging, a gilding of the lily gone ape, a case of all trims and no chassis - or even like one of those restaurants in Gordon Ramsay's Kitchen Nightmares where the precious chef keeps attempting all manner of silly recherché dishes when all the punter wants is a good old Sunday roast? Might in fact the abundance of possibilities offered by QML, and by QTI in its early and late versions, be just simply over the top? De trop?

MCQFM Site

We think so, and we hope you will too. At the MCQFM site [2] we have an authoring tool for writing questions in the above format, which we can convert into QML and QTI 2.0, so that you can include those questions either in your Questionmark Perception installation, or attempt to upload them to ADSEL and some of the other QTI tools now available. You might indeed wish to knock out the majority of your QTI 2.0 questions at MCQFM and save the more sophisticated ones for more sophisticated tools like Acquarate. However, you can do much more than that. You can also convert your QML questions back into the textual format above (which for now we are calling QTEXT), or you can convert them out into QTI 2.0 as well.

Now we cannot guarantee MCQFM will always work, for one legitimate reason and some not so legitimate ones. The legitimate (if alarming) reason a conversion might not work is that QML quite bizarrely allows the use of angle brackets inside some of its sections. Adding a < or a > inside element content is about as illegal as XML can get, and the XSLT parsers inside MCQFM quite rightly throw up their hands in despair when confronted by such content. However, there are also less legitimate reasons why some of the conversions don't work. As we have seen, the multiple response question can be a very nuanced and complex thing, and it can sometimes be very difficult to resolve highly complex Boolean logic into a simple right/wrong duality. Finally, of course, there were the limits imposed by time and resources. For instance, we have implemented the conversion of the Questionmark FIB (fill in the blanks) question into QTEXT, but we did not have time to do the conversions of the, in some respects very similar, TM (text match) questions.
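
One conceivable workaround (a speculative Python sketch, not something MCQFM does) would be to escape stray angle brackets in QML content before the XSLT stage, treating a < as markup only when what follows it looks like a tag:

import re

def escape_stray_angles(qml_text):
    # Escape any "<" that is not followed by something tag-like.
    # A crude heuristic, for illustration only.
    return re.sub(r"<(?![A-Za-z/!?])", "&lt;", qml_text)

print(escape_stray_angles("<ANSWER>is 3 < 5 true?</ANSWER>"))
# <ANSWER>is 3 &lt; 5 true?</ANSWER>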

In addition to all this, we have created a SOAP web service which can be leveraged by other applications to use the functions of MCQFM. (The acronym does in fact mean Multiple Choice Questions Five Methods - though in the end we have produced six: qtext2html, qtext2qml, qtext2qti, qml2qtext, qml2qti and qml2html.) If you use the ordinary web interface (as opposed to the service), the QTI sent back will be in the form of a zip file (IMS-JORUM-kosher, or one specifically generated for use with R2Q2). If, however, you use the service, it will all be enclosed within an XML document, leaving the onus on the client to do the content packaging. On the MCQFM page there are examples of clients written for the MCQFM web service in both Java and VB.NET.
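
A third-party client is easy to imagine. The sketch below uses Python and the zeep SOAP library rather than Java or VB.NET; the WSDL address and the exact parameter shape are placeholders rather than the real service details, which are documented on the MCQFM site:

from zeep import Client   # third-party SOAP library

WSDL_URL = "http://example.org/mcqfm/service?wsdl"   # placeholder address, not the real one

qtext = """MCQ
What is the capital of China?|Beijing is "northern" (bei) "capital" (jing)
Shanghai
@Beijing
Kunming"""

client = Client(WSDL_URL)
# Assumed call shape: one of the six operations takes the QTEXT source and
# returns the converted XML, leaving content packaging to the caller.
qti_xml = client.service.qtext2qti(qtext)
print(qti_xml)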

MCQFM Future Developments

To a great extent, future developments are finance-dependent. But things we would really like to do include supporting more question types, both in the conversion from QML and in authoring in QTEXT itself. Moreover, we would like to enhance the QTEXT format so that it allows more sophistication. For instance, for the CLOZE (or fill-in-the-blanks) question we could consider altering the format so that it would allow alternate spellings:

CLOZE

The capital of China is:

@Beijing/beijing/Peking/peking@

And of course a variation on this logic might allow numeric questions to be asked (1.5/1.50) or even (1.5±0.2), or even random-factor questions (in QTI 2.0 terminology, template questions - where mathematical values are varied for different students).
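
The checking logic this would imply is simple enough. Here is a speculative Python sketch of how a response might be matched against such an extended blank once the alternates have been pulled out of the @...@ markers (the syntax reading, including "value±tolerance", is a proposal rather than anything MCQFM currently supports):

def matches(answer, spec):
    # spec is the text between the @ markers, e.g. "Beijing/beijing/Peking/peking"
    if "±" in spec:                      # numeric tolerance, e.g. "1.5±0.2"
        value, tolerance = (float(part) for part in spec.split("±"))
        try:
            return abs(float(answer) - value) <= tolerance
        except ValueError:
            return False
    return answer.strip() in spec.split("/")

print(matches("Peking", "Beijing/beijing/Peking/peking"))  # True
print(matches("1.62", "1.5±0.2"))                          # True
print(matches("1.8", "1.5±0.2"))                           # False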

Another possibility would be to allow dropdowns to appear, which would let the user select between alternatives:

CLOZE

The capital of china is:

@Beijing|Shanghai|Hong Kong@

(with the implication that the order of items gets randomised in presentation - the first one being the correct one in the question as authored).
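
A sketch of that presentation rule (again an assumption about how the proposed syntax would behave, not current MCQFM behaviour):

import random

def dropdown_options(spec):
    # spec is the authored text between the @ markers, e.g. "Beijing|Shanghai|Hong Kong"
    items = spec.split("|")
    correct = items[0]          # the first item as authored is the right answer
    shown = items[:]
    random.shuffle(shown)       # presentation order is randomised
    return shown, correct

shown, correct = dropdown_options("Beijing|Shanghai|Hong Kong")
print(shown, "-> correct:", correct)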

Beyond this, we might consider a simple addition to the XSLT to allow QML TM (text match) questions to be converted. Finally, more ambitiously, it would be good to include the ability to generate sequences of questions semi-automatically either from:

(a) dictionaries - allowing definitional questions to be created, particularly valuable in language learning for instance; but also maybe from

(b) ontologies - that is to say, descriptions of domains which set out the relationships between the various concepts in a domain, in such a way that questions can be asked to see if the student correctly understands the field (e.g. in Java programming you might ask whether an object is (a) an instantiation, (b) a copy or (c) an extension of a class - and in Flash programming you might ask whether a symbol is the (a) class, (b) copy or (c) extension of a movie clip instance).

In both cases a question like this ought to be capable of being generated from a higher-level ontology which represents the relationships between those concepts.
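
As a final illustration of the idea (the data and generation rule here are entirely invented and not part of MCQFM), a toy ontology of "is an instantiation of" relationships could be turned into QTEXT mechanically:

import random

# concept -> (correct relationship, related concept)
RELATIONS = {
    "object": ("an instantiation of", "a class"),
}
DISTRACTOR_RELATIONS = ["a copy of", "an extension of"]

def generate_mcq(concept):
    correct, parent = RELATIONS[concept]
    stem = f"In Java, an {concept} is which of the following?|Generated from the ontology entry for '{concept}'"
    options = [f"@{correct} {parent}"] + [f"{wrong} {parent}" for wrong in DISTRACTOR_RELATIONS]
    random.shuffle(options)
    return "\n".join(["MCQ", stem] + options)

print(generate_mcq("object"))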

Anyway, please feel free to experiment with MCQFM. If you find any bugs we will try to correct them, subject of course to resources and availability. And with special regard to the human resources on this project, I'd like to pay particular tribute to my colleagues Caroline Bettison and Theodorus Parmakis, who did most of the work on MCQFM: Caroline for the painstaking way she went about constructing the highly complex XSLT transformations which do most of the conversions between formats, and Theodorus for writing the web service and the various clients that consume it.

In the end, the project took much longer than we initially expected - but ultimately I feel it was one of the most worthwhile I have been involved in. As a visual interface, MCQFM is not a shining example of visual richness, but what it does offer is a kind of paradigm of a simpler, more minimalist, less baroque way to go about generating objective tests - and hopefully a more Web 2.0 way of doing so: a system which can be mashed up with other applications, which privileges the solid over the florid, and which deals with the kinds of questions that you and your colleagues can author, not just those that require an expert. So please experiment with MCQFM and tell us what you think.

Steve Bennett

References

[1] JISC CETIS Assessment Special Interest Group Meeting

[2] MCQFM project web site

 
