Any e-learning system holds data about all of its assessments and about the students' responses to (submissions of) those assessments.
Such historical data is of tremendous use in providing value-added services to end users.

An assessment can be defined as a set of questions.
If testid represents a test and qid represents a question, then

a test is defined by the set testid = {qid1, qid2, qid3 …}

For each question, a student attempting such a test can
1. answer it correctly
2. answer it wrongly
3. not attempt it

If a student is identified by stuid, then that student's performance in a test can be defined by the set

Performance(stuid) = {testid, stuid, {qids answered correctly}, {qids answered wrongly}, {qids unattempted}}

For example,

Given a test

test1 = {q1,q2,q3,q4,q5,q6}

a possible performance snapshot is

{test1, stu1, {q1}, {q3,q4}, {q2,q5,q6}}
{test1, stu2, {q1,q2}, {q5,q6}, {q3,q4}}
{test1, stu3, {q1,q5}, {q4,q6}, {q2,q3}}
{test1, stu4, {q1,q2,q4,q5,q6}, {q3}, {}}
{test1, stu5, {}, {}, {q1,q2,q3,q4,q5,q6}}
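A performance record like the ones above only makes sense if its three sets are disjoint and together cover every question in the test. Here is a minimal sketch of that check in Python. The class name Performance, its field names, the function is_consistent and the record stuX are my own illustrative choices, not part of any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Performance:
    """One student's result in one test (field names are illustrative)."""
    testid: str
    stuid: str
    correct: frozenset
    wrong: frozenset
    unattempted: frozenset


def is_consistent(perf, test_questions):
    """True if the three sets are disjoint and together cover the whole test."""
    parts = [perf.correct, perf.wrong, perf.unattempted]
    total = sum(len(p) for p in parts)
    union = frozenset().union(*parts)
    # Equal sizes mean no qid appears in two sets; equal contents mean full coverage.
    return total == len(union) and union == frozenset(test_questions)


test1 = {"q1", "q2", "q3", "q4", "q5", "q6"}

stu1 = Performance("test1", "stu1",
                   frozenset({"q1"}),
                   frozenset({"q3", "q4"}),
                   frozenset({"q2", "q5", "q6"}))

# Hypothetical bad record: q5 is listed twice and q6 is missing.
stu_x = Performance("test1", "stuX",
                    frozenset({"q1", "q5"}),
                    frozenset({"q4", "q5"}),
                    frozenset({"q2", "q3"}))

print(is_consistent(stu1, test1))   # True
print(is_consistent(stu_x, test1))  # False
```

Running such a check when submissions are stored keeps the per-question counts used later from being skewed by malformed records.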

What can be inferred from this data?

For each question, we can find the triplet ( n(c), n(w), n(u) ), where n(c) is the number of students who answered it correctly, n(w) the number who answered it wrongly, and n(u) the number who did not attempt it.

Quite simple, isn't it? From this triplet one can derive two useful measures.

For a given question,

Difficulty index (DI) = 1 - n(c) / (n(c) + n(w) + n(u))        (1)

% answered correctly (PA) = n(c) / (n(c) + n(w) + n(u)) * 100        (2)
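As a rough sketch, (1) and (2) can be computed from a snapshot like the one in the example with a few lines of Python. The tuple layout and the function name question_stats are my own assumptions. The stu3 and stu4 rows are written here so that every question falls in exactly one of the three sets, which is also an assumption on my part.

```python
from collections import Counter

# Each record: (testid, stuid, correct, wrong, unattempted)
snapshot = [
    ("test1", "stu1", {"q1"}, {"q3", "q4"}, {"q2", "q5", "q6"}),
    ("test1", "stu2", {"q1", "q2"}, {"q5", "q6"}, {"q3", "q4"}),
    ("test1", "stu3", {"q1", "q5"}, {"q4", "q6"}, {"q2", "q3"}),
    ("test1", "stu4", {"q1", "q2", "q4", "q5", "q6"}, {"q3"}, set()),
    ("test1", "stu5", set(), set(), {"q1", "q2", "q3", "q4", "q5", "q6"}),
]


def question_stats(records):
    """Return {qid: {"n_c", "n_w", "n_u", "DI", "PA"}} for every question seen."""
    n_c, n_w, n_u = Counter(), Counter(), Counter()
    for _testid, _stuid, correct, wrong, unattempted in records:
        n_c.update(correct)
        n_w.update(wrong)
        n_u.update(unattempted)

    stats = {}
    for qid in sorted(set(n_c) | set(n_w) | set(n_u)):
        total = n_c[qid] + n_w[qid] + n_u[qid]
        stats[qid] = {
            "n_c": n_c[qid],
            "n_w": n_w[qid],
            "n_u": n_u[qid],
            "DI": 1 - n_c[qid] / total,     # equation (1)
            "PA": 100 * n_c[qid] / total,   # equation (2)
        }
    return stats


stats = question_stats(snapshot)
for qid, s in stats.items():
    print(qid, (s["n_c"], s["n_w"], s["n_u"]),
          "DI=%.2f" % s["DI"], "PA=%.0f%%" % s["PA"])
```

With this data, q1 gets the triplet (4, 0, 1), which gives a DI of 0.20 and a PA of 80%.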

Even though (1) and (2) carry the same information, since PA = (1 - DI) * 100, it does make sense to keep both.
Let us see why.

If the pool of questions used to generate assessments is subjected to the analysis described above,
the Difficulty Index (which can further be classified as easy, medium or difficult) can form part of the question metadata.
This information helps
1. teachers set assessments of varying difficulty levels
2. developers create tools that generate assessments of varying difficulty levels.
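To make the idea of difficulty as metadata concrete, here is a small sketch that maps a DI value to a level and picks questions of a requested level from a pool. The cut-offs 0.3 and 0.7 and the names classify and pick are arbitrary assumptions of mine; a real system would have to choose and tune its own thresholds.

```python
def classify(di, easy_below=0.3, difficult_from=0.7):
    """Map a difficulty index in [0, 1] to a level (thresholds are assumed)."""
    if di < easy_below:
        return "easy"
    if di < difficult_from:
        return "medium"
    return "difficult"


def pick(pool, level, count):
    """Return up to `count` question ids from `pool` whose DI falls in `level`."""
    matches = sorted(qid for qid, di in pool.items() if classify(di) == level)
    return matches[:count]


# Hypothetical question pool: qid -> difficulty index stored as metadata.
pool = {"q1": 0.2, "q2": 0.6, "q3": 1.0, "q4": 0.8, "q5": 0.6, "q6": 0.8}

print(classify(pool["q1"]))          # easy
print(pick(pool, "medium", 5))       # ['q2', 'q5']
print(pick(pool, "difficult", 2))    # ['q3', 'q4']
```

A test generator could call pick once per level to assemble, say, two easy, two medium and one difficult question.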

From a student's point of view, PA is more useful, as it gives the student an instant benchmark.
In the above example, q3 has not been answered correctly by anyone. A student who answers q3 correctly knows right away that
0% of the earlier students were able to solve it. This instant benchmarking is of immense value.
However, it is worth noting that such benchmarking of questions is an ongoing process, and both DI and PA change over time as more students attempt them.

Parameters like age group, expertise level, country, etc. make any universal classification of questions highly debatable.
For instance, a 12th grader could find a 6th grade problem of level "Difficult" quite easy.
There isn't a simple solution to this problem, except that the question metadata could indicate the age group for which the
classification (as 'Difficult' or 'Easy') holds good.
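One way to record the group for which a classification holds is to compute DI separately for each group. The sketch below assumes every attempt is stored as a (group, qid, outcome) tuple. The group labels, the sample attempts and the function name di_by_group are all hypothetical.

```python
from collections import defaultdict


def di_by_group(attempts):
    """Return {(group, qid): DI}, computing equation (1) within each group."""
    counts = defaultdict(lambda: [0, 0])  # (group, qid) -> [n_correct, n_total]
    for group, qid, outcome in attempts:
        entry = counts[(group, qid)]
        if outcome == "correct":
            entry[0] += 1
        entry[1] += 1  # wrong and unattempted both count toward the total
    return {key: 1 - n_correct / n_total
            for key, (n_correct, n_total) in counts.items()}


# Made-up attempts at the same question by two different groups.
attempts = [
    ("grade6", "q1", "wrong"),
    ("grade6", "q1", "unattempted"),
    ("grade6", "q1", "wrong"),
    ("grade6", "q1", "correct"),
    ("grade12", "q1", "correct"),
    ("grade12", "q1", "correct"),
    ("grade12", "q1", "wrong"),
    ("grade12", "q1", "correct"),
]

di = di_by_group(attempts)
print(di[("grade6", "q1")])   # 0.75
print(di[("grade12", "q1")])  # 0.25
```

The same question then carries one DI per group in its metadata instead of a single universal value.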

I shall post my experiences after I implement this and roll it out.