What it is understandable to fear, of course, is that the TS software creates TEXAMS of unequal difficulty, such that some students will get easier TEXAMS, and others harder ones. To help clarify this issue, let us imagine eight classrooms, each containing 20 students, and let us imagine a different method of preparing an examination for each classroom. For two of the classrooms, each examination contains only two questions. In one of these classrooms, each question can have a difficulty ranging randomly from 0 (maximally easy) to 10 (maximally hard). We would be right to expect that this is a very unfair examination, as some students would sometimes be confronted with two difficult questions while other students might find themselves greeted with two easy questions.
A computer simulation of what can be expected in such a classroom is represented below by unfilled dots under the heading TWO QUESTIONS/TEXAM, and where we observe pretty much what we feared. To consider only the most extreme cases, one student writes a TEXAM whose average difficulty is 0.0 (Student 18 in the graph), while at the other extreme, another student writes a TEXAM whose average difficulty is 9.5 (Student 10). The students in this classroom have indeed been tested on TEXAMS differing greatly in the average difficulty of their questions, but steps can be taken to make the TEXAMS fairer.
The first thing that can be done is to restrict the permissible heterogeneity of the questions, as for example by allowing difficulty to range randomly only from 4 to 6, whose result is represented by the filled dots under TWO QUESTIONS/TEXAM, and where we now see that the average difficulty of the 20 TEXAMS ranges only from 4.0 (Students 10 and 15) to 6.0 (Students 1, 8, and 18).
The second thing that can be done to reduce unfairness is to increase the number of QUESTIONS/TEXAM, whose results we see in the successive panels: beginning with the TWO QUESTIONS/TEXAM panel on the left that we have already been looking at, and proceeding rightward through the panels showing TEN, FIFTY, and TWO HUNDRED FIFTY QUESTIONS/TEXAM. Comparing the panels tells us that even the higher-variability TEXAMS (unfilled dots), whose question difficulty is allowed to range randomly from 0 to 10, approach closer and closer to an average TEXAM difficulty of 5.0 as the number of questions per TEXAM increases. The lower-variability TEXAMS (filled dots), whose question difficulty is restricted to ranging randomly from 4 to 6, hug the average-difficulty-equals-five line more closely still, and by TWO HUNDRED FIFTY QUESTIONS/TEXAM may be said to guarantee TEXAM papers just about as equal in difficulty as anybody could ever want or need.
What is meant here by the claim that TEXAMS are guaranteed equal is not that their average difficulty (assuming it can be measured on a scale of 0 to 10) is mathematically equal even to several decimal places, but rather something more like that their average difficulty rounded to the nearest integer will equal 5. Any differences that exist can be considered inconsequential in the same way that we consider a myriad of differences in testing conditions inconsequential — some examinees have better lighting in the testing hall, others have worse; some sitting near radiators are comfortably warm, others sitting near drafty doorways feel chilled; some find themselves surrounded by distracting fellow-examinees, others are able to concentrate amid a sea of silent and motionless ones; some sit at the front where invigilators converge and converse among themselves, others sit far from invigilator bustle; some have worn shoes that pinch, underwear that binds, and sweaters that itch, and others happen to have worn undistracting clothing. A myriad of such inequalities are routinely present and are routinely discounted, and the differences in average difficulty that exist between TEXAMS fall into the same category — as being inappreciable and inconsequential.
A computer simulation of what happens to the average difficulty of the questions on a TEXAM as the number of QUESTIONS/TEXAM rises by factors of five from 2 to 10 to 50 to 250 (left panel to right panel), and as the difficulty of each question is allowed to range randomly either from 0 to 10 (unfilled dots), or else only from 4 to 6 (filled dots).
The simulation envisions 8 classrooms with 20 students per classroom, each student tested on his own unique TEXAM, giving a grand total of 160 TEXAMS, with each of the 160 dots below plotting the mean difficulty of one TEXAM's questions. Data from the two classrooms each of whose TEXAMS contained only two questions appear in the left-hand panel below; the classroom whose question difficulty was allowed to range randomly from 0 to 10 is represented by the 20 unfilled dots, and the classroom whose question difficulty was allowed to range randomly from only 4 to 6 by the 20 filled dots; and so on.
A rule of thumb (which begins to fail at especially low values of QUESTIONS/TEXAM) is that when the number of QUESTIONS/TEXAM is multiplied by five, measures of dispersion are cut in half.
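The simulation described above, together with the rule of thumb, can be sketched in Python (an illustration only; question difficulties are assumed to be integer-valued, which matches the extreme TEXAM averages of 0.0 and 9.5 reported earlier):

```python
import random
import statistics

def texam_means(n_students, n_questions, lo, hi):
    """Mean question difficulty of each student's randomly generated TEXAM."""
    return [statistics.mean(random.randint(lo, hi) for _ in range(n_questions))
            for _ in range(n_students)]

random.seed(1)
for n in (2, 10, 50, 250):
    wide = texam_means(20, n, 0, 10)    # unfilled dots: difficulty 0 to 10
    narrow = texam_means(20, n, 4, 6)   # filled dots: difficulty 4 to 6
    print(f"{n:3} QUESTIONS/TEXAM:  wide {min(wide):.2f} to {max(wide):.2f},"
          f"  narrow {min(narrow):.2f} to {max(narrow):.2f}")

# Rule of thumb: multiplying QUESTIONS/TEXAM by five shrinks the spread of
# TEXAM means by a factor of sqrt(5), i.e. dispersion is roughly cut in half.
for n in (10, 50, 250):
    sd = statistics.pstdev(texam_means(2000, n, 0, 10))
    print(f"{n:3} questions: SD of TEXAM means is about {sd:.3f}")
```

Under these ranges the spread of TEXAM means narrows visibly with each fivefold increase in questions, mirroring the progression across the panels.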
[Graph: four panels headed TWO, TEN, FIFTY, and TWO HUNDRED FIFTY QUESTIONS/TEXAM FOR TWENTY EXAMINEES. The 97% confidence intervals for the filled dots are, panel by panel: 3.37 to 6.63, 4.27 to 5.73, 4.67 to 5.33, and 4.85 to 5.15.]
The Transformational Syllabus examinations that are discussed and recommended in this article, then, are ones in which considerable attention is paid to selecting examination questions that differ from one student to another in appearance and in the correct answer, but that are equivalent in difficulty (one might imagine difficulty being allowed to vary randomly only from 4.5 to 5.5), and in which each examination contains a substantial number of questions. The average difficulty of the TEXAMS being recommended would tend to look like the filled dots under TWO HUNDRED FIFTY QUESTIONS/TEXAM that can be seen in the graph above.
It can be readily calculated that the discrete random variable "difficulty," which with equal probability assumes the values 4, 5, and 6, will create TEXAMS whose average difficulty (filled dots) falls 97% of the time within the two-standard-errors-of-estimate ranges shown underneath each panel.
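The moments behind that calculation can be checked directly (the sketch below verifies only the underlying mean and variance; the 97% intervals printed beneath the panels are quoted from the figure, not recomputed here):

```python
import math
from fractions import Fraction

# Difficulty assumes the values 4, 5, and 6 with equal probability.
values = [4, 5, 6]
mean = Fraction(sum(values), len(values))                 # exactly 5
var = sum((v - mean) ** 2 for v in values) / len(values)  # exactly 2/3
sd = math.sqrt(var)                                       # about 0.816

print(f"mean = {mean}, variance = {var}, sd = {sd:.4f}")

# The standard error of a TEXAM's mean difficulty shrinks as 1/sqrt(n),
# which is what pulls the filled dots toward the 5.0 line panel by panel.
for n in (2, 10, 50, 250):
    print(f"n = {n:3}: standard error of the TEXAM mean = {sd / math.sqrt(n):.4f}")
```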
Thus, it can be said that TEXAMS come with the equal-difficulty guarantee which is evident in the above graphs. The guarantee is backed by a body of theory and equations known to, and trusted by, all mathematicians and statisticians and researchers. And at the same time, no other method of creating alternative versions of an examination comes with any comparable guarantee. A teacher can try to create a new examination equal in difficulty to an existing examination by formulating new questions which he hopes are equivalent to the questions on the original examination, but where is the guarantee that his hopes are realized? Where are the graphs showing degree of equality or inequality? Where are the equations allowing the computation of confidence intervals? Simply trying to create a unique examination of equal difficulty comes with no such encouragements as graphs or confidence intervals or guarantees. Once any examination has been created, there is no sure way to create an alternative version which is unique and yet equal. The only sure way to create two examinations that are unique and yet guaranteed equal is to create two TEXAMS randomly, and if two unique but equal TEXAMS can be so created, then so can any number.
After today's exam creator has labored to create an alternative form of an examination, all that he finds before him are two versions of the examination, and with no guarantee of equality, and with experience often demonstrating serious inequality. In contrast, when someone writes a TS computer program, at the end of his labor he is able to generate a million versions of the same examination, which is to say a million TEXAMS, all of them guaranteed to be equal, and he or others can continue doing so into the indefinite future. It is the power to create an infinite number of examinations which are both unique and equal which enables the resolution of a host of educational problems.
A COURSE SYLLABUS, as used here, is a specification of the type of questions that are examinable, which is to say that will appear on tests and examinations within a course. It is also possible for a PARTIAL SYLLABUS to cover only a subset of the course, as perhaps only a single imminent test, which PARTIAL SYLLABUS would then identify the question types that would be examinable on that single imminent test. In the case of a final examination covering an entire course, the COURSE SYLLABUS would be identical to the PARTIAL SYLLABUS.
A SYLLABUS can be either a
TRANSFORMATIONAL SYLLABUS (TS), in which every question actually appearing on an exam incorporates differences produced by random transformations, such that every question on any student's test is likely to differ from the corresponding question on every other student's test, and so which makes it helpful to call the unique set of questions presented to a particular student not merely an EXAM, but a TRANSFORMATIONAL EXAM, or TEXAM for short,
or it can be a
STATIC SYLLABUS, in which questions are the same from one examinee's test to another's, and sometimes even from year to year.
A SYLLABUS can also be either an
EXPLICIT SYLLABUS, meaning that the syllabus is disclosed or specified clearly and in detail, leaving no need for the student to infer or guess its contents,
or it can be an
IMPLICIT SYLLABUS, meaning that the syllabus is hinted at or suggested but not plainly expressed, requiring the student to infer or guess its contents.
Of primary interest here is the EXPLICIT TRANSFORMATIONAL SYLLABUS, or XTS for short (pronounced EKS-tuss), which provides an unprecedented degree of syllabus disclosure by means of the TS software being made universally accessible, as for example on all school or college or university computers, and also on the Internet, and also by purchase on optical disc, such that on the one hand a teacher is readily able to feed in the parameters for an examination he intends to give his class, and able to print out for himself as many EVALUATION TEXAMS as there are students in the class; and on the other hand, every student has been able for months or years prior to the examination to also feed in the same parameters, and to similarly print out for himself as many PRACTICE TEXAMS as he might want to practice on for that particular imminent test, or as many as he might need to wallpaper his dormitory room if that is his inclination. Anyone who has access to the TS software, which is everyone, is able to set the individual parameters singly, or can click on something like "Final Examination On Grade 11 Area" to call up the entire set of parameters that had been adopted for that examination. Furthermore, as the student-requested PRACTICE TEXAMS are created by the same computer program, and employing the same parameter settings, as the teacher-requested EVALUATION TEXAMS, the two sets of TEXAMS are indistinguishable. Indistinguishable, that is, with respect to the examination questions, but necessarily distinguishable by perhaps the paper they are printed on, and distinguishable on every EVALUATION TEXAM by the printing of a serial number on every page which identifies when and where and for which student each EVALUATION TEXAM had been printed out.
Strictly speaking, XTS refers to a TRANSFORMATIONAL SYLLABUS which has been made EXPLICIT, but it is possible to test students on TEXAMS in a totally IMPLICIT environment, where, say, the students have never had access to PRACTICE TEXAMS, and may have only approximate information concerning what sort of exam they are about to take, or even no information. Under normal circumstances, such a use of the TS would be considered non-optimal, and yet under certain unusual circumstances it might nevertheless be useful or unavoidable. Accordingly, one can speak of an XTS TRANSFORMATIONAL SYLLABUS, or an XTS course of study, but the computer program which generates the TEXAMS is only a TS computer program, as it knows nothing of the explicitness or implicitness of the testing environment in which its TEXAMS will be employed.
But to continue with the detailing of just what XTS methodology allows, the TS software can also be allowed to freely disclose the parameters used to generate any TEXAM, as for example that in the Alice-to-Irma TEXAMS the width and height of the pedestal rectangles in Levels 05 to 10 were permitted to range randomly from 2 to 5, and that the height of each triangle was permitted to range randomly from 1 to 5 (and even that the permissible colors were selected randomly from red, green, blue, yellow, purple, cyan, and gray). From this parameter disclosure anyone would be able to compute, and TS software could be allowed also to freely disclose, that Level 05 correct answers could range from 4 to 50 square units, and that Level 06 to 10 correct answers could range from 8 to 75 square units, although most students might be expected to avoid burdening their memories with facts of such limited application, and of such little help within the one situation to which they do apply, when the alternative of simply solving the problems offers by far the easiest and surest way of scoring high on the test.
And the TS software can also be commanded to print out not just the correct answers to any TEXAM, but complete solutions as well.
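As an illustration only (the actual TS software's interface, parameters, and figure construction are not specified in this article; the function and names below are hypothetical), a parameter-driven generator that emits a unique question together with its correct answer and complete solution might be sketched as:

```python
import random

COLORS = ["red", "green", "blue", "yellow", "purple", "cyan", "gray"]

def generate_area_question(rng):
    """Hypothetical TEXAM item: area of a randomly sized rectangle, with
    width and height ranging randomly from 2 to 5, echoing the parameter
    disclosure discussed above."""
    w, h = rng.randint(2, 5), rng.randint(2, 5)
    color = rng.choice(COLORS)
    question = f"Find the area of a {color} rectangle {w} units wide and {h} units high."
    answer = w * h
    solution = f"Area = width x height = {w} x {h} = {answer} square units."
    return question, answer, solution

# Each TEXAM is generated from its own seeded random stream; recording the
# seed lets the identical paper be reprinted later for verification.
for student, seed in (("Alice", 101), ("Bert", 102)):
    rng = random.Random(seed)
    question, answer, solution = generate_area_question(rng)
    print(student, "|", question, "|", solution)
```

Because the same seeded stream always reproduces the same paper, the serial number printed on an EVALUATION TEXAM could in principle encode the seed from which it was generated.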
If imaginary student Jack were about to take the same 27-question exam that Alice-to-Irma took above, and if he printed nine PRACTICE TEXAMS for himself to practice on, they would be indistinguishable from the nine Alice-to-Irma EVALUATION TEXAMS displayed above, which would leave Jack in possession of a very high degree of syllabus definition.
But might that syllabus definition be higher than is good for Jack's mathematical development?
That is, could it be argued that the high explicitness of Jack's curriculum definition will enable him to so narrow his scope of study as to deprive his area-solving skills of breadth, and in consequence leave him flummoxed upon encountering unprecedented Area problems in the future? To such a challenge there can be made at least the following two replies.
The first reply is that the above 27-Level Alice-to-Irma examination was created to assist in explaining XTS methodology, but does not represent a plausible test, and certainly not one covering all K-12 Area problems. A test covering all K-12 Area problems would cover not just the 27 question types shown, but would cover hundreds of question types, which breadth of study dispels somewhat the impression of XTS students mastering only a narrow selection of skills.
More particularly, though, if a course consists of learning to solve, say, 500 problem types, and the final TEXAM consists of 25 questions, it follows that for every 20 question types studied, only one is examined. The TEXAM thus resembles a conventional EXAM in that the student does not know exactly which questions he will be asked to solve, but he can nevertheless be said to have high syllabus definition in that he knows the pool from which the TEXAM questions will be selected, knows the rules governing the manner of that selection, knows the rules governing the transformations that the questions will be subjected to, and of course can access any number of PRACTICE TEXAMS.
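The selection arithmetic just described can be sketched as follows (a hypothetical illustration; the actual selection rules are whatever the TS parameters specify):

```python
import random

# A course pool of 500 studied problem types, from which each TEXAM
# randomly examines 25.
pool = [f"type-{i:03}" for i in range(500)]
rng = random.Random(7)

texam = rng.sample(pool, 25)
print("chance any given type is examined:", 25 / 500)  # 1 in 20

# Empirical check: over many TEXAMs, a given type appears about 5% of the time.
hits = sum("type-000" in rng.sample(pool, 25) for _ in range(10_000))
print(f"type-000 appeared on {hits} of 10,000 simulated TEXAMs")
```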
The second reply emerges from imagining a gedanken experiment, which, as used here, means an experiment which does not need to be performed because its results are obvious. The proposed gedanken experiment is to compare the proficiency of two groups of same-age students in computing Areas — one group having studied Area under XTS methodology, and the other group having studied Area under the sort of conventional teaching methodology which predominates in the United States and Canada. And the gedanken experiment consists of imagining each group taking two tests — a conventional test and an XTS test.
Is it not obvious what the results would be? — The XTS group would beat the conventional group on both tests. On the one hand, the conventional students would have never been exposed to anything like the complexity of the Alice-to-Irma exam, and they would flunk it. And on the other hand, the XTS students would recognize the conventional test as consisting of the baby problems that they themselves had worked on in the first few days of their Area-computation studies, and they would ace it.
Furthermore, as the claim has been made that XTS seven-year-olds, nominally in Grade 2, would be able to do the Alice-to-Irma examination above, a more startling gedanken experiment can be imagined: comparing XTS seven-year-olds, nominally in Grade 2, to conventionally-schooled twelve-year-olds, really in Grade 7. Is it not also obvious here that a conventional Grade 7 class will find the TS Alice-to-Irma exam above largely incomprehensible, whereas the XTS Grade 2 class will find the conventional Grade 7 examination elementary?
But let us imagine a third gedanken experiment, one which goes out in search of the unfamiliar and exotic Area problems that it is feared the XTS student will be unable to handle. Let us take our two groups of students first to locations where students score high on international math assessments, like Shanghai, Singapore, Hong Kong, Finland, Japan, on the expectation that where achievement is high, questions will be difficult, and the more difficult, the less likely to be familiar. Or let us take our two groups of students to remote and exotic locales, without regard to test scores in those locales, on the chance that the math questions studied there will surprise because isolation has led to that remote math having evolved in a different direction — say to locales like Borneo or Madagascar or Mozambique or Tibet.
But what's the use? No matter how difficult the Area test, and no matter how exotic its origin, what our gedanken experiment keeps telling us is that students who are able to fluently solve the Level 01 to 27 problems on the Alice-to-Irma exam will always beat conventional students, who on average are able to solve only the Level 01 to 10 problems, and those only haltingly. There does not exist any Area exam anywhere on which conventional students can be expected to triumph. Of course it will always be possible to construct Area problems so unfamiliar that they will stump XTS students, but such problems will stump conventional students even more.
XTS students do find TS exams easy in the sense that they are able to work through them briskly and often mentally, all of which must be regarded with approval, as the net result is students who best their conventionally-trained peers in every conceivable competition. The hypothesis that XTS students will be floored by unexpected questions can only mean that they will be more floored than XTS-deprived students, and yet it seems difficult to imagine any data that would confirm such a hypothesis.
Having seen the solution, we now turn to an examination of the educational problems which that solution remedies.
The assigned mathematics or physics or chemistry or biology textbook in a typical university course may be some 800 to 1,200 pages long, and may contain anywhere from 2,000 to 10,000 problems. The student first flipping through that textbook may understandably doubt his ability to master such a large volume of such difficult material, but not to worry — if he sticks with the course, he will come to realize that he only needs to read something like one-quarter that number of pages and he only needs to solve something like one-twentieth that number of problems. Yes, what is reasonable to expect is that something like 19 out of 20 problems in his assigned textbook are beyond the scope of his course, or one might say are not examinable. Before he can write any exam, then, and even before he can begin studying for it, he needs to know what selection from the available pages he really does need to read, and what selection from the available problems he really does need to learn to solve. The alternative of tackling the entire textbook is far beyond what is expected of him, far beyond what any other student is doing, and far beyond his capacity.
If the desired syllabus information is vague or incomplete, which is to say if the syllabus is decidedly implicit, the student may find himself reading advice such as that offered by Joel Hass, Professor of Mathematics and Chairman of the Mathematics Department at the University of California, Davis, from which we are able to infer not only the faculty's lack of commitment to anything like an explicit syllabus, but their overt hostility to it. In response to requests for syllabus definition, Professor Hass points not to any single definitive source but to several different sources, confesses instructors' carelessness in supplying curriculum definition, and in effect acknowledges that his instructors are unconscious of any obligation to supply their students with an explicit syllabus.
More specifically, in blue font below can be found the explicit acknowledgement that professors recycle old exams, sometimes modifying some of the questions slightly — a practice which unfairly rewards students who happen to be aware of it and acquire copies of the old exams early (for whom the curriculum may be said to be explicit), and unfairly punishes those who don't (for whom the curriculum can be said to remain implicit).
What will be on the Exam
The main source of information about what will be covered on an exam is the instructor. Probably there is a syllabus stating exactly what will be covered. Possibly the exam topics were announced in class.
But the key period for culling information is the week prior to the exam. This is when the instructor is making up the exam. Before that, the instructor probably has no idea what will be on it. (There are exceptions to this. In particular, there are the instructors who have two exams, and use them alternately every other year.)
In fact, instructors hate to give exams. It's a lot of work making them up and pure agony grading them. [...]
Joel Hass Professor of Mathematics and Chair University of California, Davis 1999-05-26
How to study
1. Go over the assigned homework problems. [...] It is almost always true that if you have mastered the assigned homework problems you will do well on the exam.
2. Dig up old exams and do them. Lots of Practice Calculus Exams can be found at ExamsWithSolutions.com.
It is far easier for an instructor to reuse an old exam than to cook up a new one. Exam questions are really hard for a professor to make up. Questions must be not too easy but not impossible (although this latter restriction is sometimes inadvertently overlooked). All the numbers have to work out nicely. So the instructor will very likely leaf through the previous couple of years' exams and modify some problems slightly. Not more than two years because the older exams are hopelessly lost under piles of papers.
An opportunity for the enterprising student! [...]
Joel Hass Professor of Mathematics and Chair University of California, Davis 1999-05-26
How not to study for the exam
Don't spend a lot of time bothering your professor with questions about what will be on the exam, unless they have completely ignored this issue. They hate this kind of question, and it almost always fails to provide useful information. What will be on the exam is completely obvious — exactly what has been on last year, and the year before. Time spent wheedling is better used studying.
Joel Hass Professor of Mathematics and Chair University of California, Davis 1999-05-26
Consider, as a single example of the unhelpfulness of Professor Hass's advice, the following recommendation: "Dig up old exams and do them. Lots of Practice Calculus Exams can be found at ExamsWithSolutions.com. [...] What will be on the exam is completely obvious — exactly what has been on last year, and the year before."
However, a visit to ExamsWithSolutions.com is more likely to frustrate the student than to help him. In the first place, some of its links are dead. And despite the promise embedded in its domain name, some of its old exams do come without solutions. And if there are solutions, finding them may be tricky.
And if solutions are found, their legibility may leave something to be desired, and in view of the sloppiness with which some solutions are written, their correctness may be distrusted, and searching for a statement from the webmaster that before putting solutions on his website he goes to the trouble of verifying them, and so that he can vouch for their accuracy, turns up no such assurance.
And sapping the student's morale all the while is the recognition that these exams, some composed as long ago as 1994 and some composed as far away as Canada, and of course by lecturers other than his own, and following textbooks other than his own — they all come without promise, let alone guarantee, that they will resemble his own imminent exam. If these old exams are accompanied by any statement on the question of resemblance to future exams, it will be a warning to avoid assuming any such resemblance, even resemblance to future exams at the same university, as can be seen in the Michigan State warning opposite, which requires the student to click on I AGREE before he will be given access to some old Michigan State exams. Perhaps the I AGREE button was placed there on the recommendation of Michigan State University lawyers in response to student complaints that studying old exams ill prepared them for taking new ones.
In short, the UC Davis student following Professor Hass's advice to go to ExamsWithSolutions.com for practice materials may understandably experience frustration and demoralization and anger. The student wants to work, but not on exams that come without solutions, and not on exams that come with illegible or mistaken solutions, and not on exams he can be sure will differ significantly from his own imminent exam.
Evidence of students being frustrated by poor syllabus definition can be found in the news report below of complaints that Professor Platt's syllabus had been implicit — his failing to adequately inform students what he was going to ask them on their final examination, and his distributing contradictory information through his teaching fellows, and his exam striking the students as surprising and as corresponding poorly to the material that he had been presenting in class.
Harvard Students in Cheating Scandal Say Collaboration Was Accepted
By RICHARD PÉREZ-PEÑA
New York Times
Published: August 31, 2012
[...] Students said they were tripped up by a course whose tests were confusing, whose grading was inconsistent, and for which the professor and teaching assistants gave contradictory signals about what was expected. They face the possibility of a one-year suspension from Harvard or revocation of their diplomas if they have already graduated, and some said that they will sue the university if any serious punishment is meted out. [...]
Many students, who posted anonymously, described Dr. Platt as a great lecturer, but the guide included far more comments like "I felt that many of the exam questions were designed to trick you rather than test your understanding of the material," "the exams are absolutely absurd and don't match the material covered in the lecture at all," "went from being easy last year to just being plain old confusing," and "this was perhaps the worst class I have ever taken."
Students complained that teaching fellows varied widely in how tough they were in grading, how helpful they were, and which terms and references to sources they expected to see in answers. [...]
One student recalled going to a teaching fellow while working on the final exam and finding a crowd of others there, asking about a test question that hinged on an unfamiliar term. The student said the fellow defined the term for them.
An accused sophomore said that in working on exams, "everybody went to the T.F.'s and begged for help. Some of the T.F.'s really laid it out for you, as explicit as you need, so of course the answers were the same." [...]
Although usually it is the student who is frustrated at the lack of practice questions that are closely similar to the evaluation questions that will be on an imminent exam, and it is the teacher who has this information, sometimes the teacher is frustrated along with the student:
What's the big deal about SAT cheating?
By Jason Lim 2013-05-31 17:40
[...] During the summer of 1997, I taught a group of twenty plus Korean students the verbal section of the SAT. They were mostly Korean students who went to international schools in Korea and wanted to go to college in the U.S.
The text I used was Barron's, since that was widely available in Seoul's bookstores even then. It had sample questions and tests that mimicked the actual SAT tests. Accordingly, the organization of the mock tests, question types, and difficulty levels were made to be as similar as possible to the real thing.
But it didn't feel authentic. It just didn't feel real. And that's because it wasn't.
Let's face it. Nothing is real except the real thing. And solving real problems is the best way to prepare for the test. This is true for all tests, whether they are SAT, GMAT, GRE, LSAT, or any other standardized tests that are administered by ETS.
Ultimately, this means that I would have used real test questions to prepare my students if I had been able to gain access to them. And I know that I'm not alone in this. I am sure any teacher worth the name would want to get their hands on the best possible material for the students. This is not just a matter of standing out from the crowded SAT marketplace and getting a leg up on the competition. It's also a matter of wanting to do your best for your students.
In the late 1990s, I also taught GMATs. For this class, I actually used real GMAT questions that had been retired, because they were made available. I did this because this was the best preparatory material available for my students. And it made total sense to study from questions that had actually been on past tests. It just makes total sense to solve actual SAT questions to prepare for the SAT.
The reason this is not allowed is that SAT recycles its questions instead of retiring them after a sitting. Supposedly this is done to even out the levels of difficulty for different versions of the test. But that's a choice made by the test makers to make their jobs easier. And it goes against the natural urge for students to study and the teacher to teach from the best possible preparatory material available.
So, the question we should be asking is, "Why shouldn't SAT makers change their processes in order to eliminate this dilemma that leads to 'cheating'?" It's their process that's creating a distortion between supply and demand. Even worse, it's their process that's making cheaters where there need not be any. [...]
But the more recent SAT "cheating" scandal points more to a systemic flaw with SAT makers themselves, not some imaginary moral flaw in Korean society. Koreans shouldn't be so quick to buy into their own guilt. [...]
To the predicament of the IMPLICIT SYLLABUS, then, XTS offers its unrivaled solution. The student wants to know what's on the exam? — XTS will tell him and show him. The student wants highly-relevant problems to practice on? — XTS will provide him with as many PRACTICE TEXAMS as he cares to request, covering any part of the course he specifies, in the extreme case covering only the very first lecture, if that's all he wants to work on, and these PRACTICE TEXAMS will be indistinguishable from the EVALUATION TEXAMS that will be handed out to the class on examination day. And the instructor has reason to be pleased as well. He hates creating exams? — Fine, XTS will create the exams for him. The instructor hates marking exams? — Fine, XTS will mark them too. The instructor recycles his old exams? — No need to, XTS will supply fresh ones. The instructor hates being bothered by questions about the upcoming exam? — With XTS, the students will spare him that bother, as XTS will tell them everything there is to know. And XTS will benefit the instructor also in freeing him from constantly having to judge what, in good conscience, he can allow himself to disclose to any student concerning the upcoming TEXAM — his role under XTS is to tell the student everything he knows, absolutely everything that could be of any help. In other words, XTS transforms the instructor from the student's adversary to his ally.
Lecture halls are often crowded, which results in a serious loss of privacy.
On high-stakes examinations, teachers usually address this particular variety of cheating — copying or passing information — by increasing the distance between examinees, combined with heightened invigilation by proctors, as is shown below being done in China, and not by half measures.
When achievable separation seems inadequate, sensory shielding may be deployed:
ABOVE: In the West, the separation achievable in a gymnasium or other large room combined with heightened invigilation is considered both necessary and sufficient for high-stakes examinations such as final examinations or tests like the LSAT or MCAT or GRE.
RIGHT: Tiered lecture halls may also be employed for high-stakes examinations, but only when at least one empty seat is available to separate examinees. The photo shows students who have not yet begun their examination, and so are often looking straight ahead, or still talking to each other.
However, security in a tiered lecture hall is lower than in a gymnasium. For example, toward the bottom of the tiered-lecture-hall photo it can be seen that students are using the continuous table surface between them to store papers and writing tools, a precedent that facilitates exchanging written information once the exam begins.
Also, invigilators in a gymnasium find it easy to approach any particular student to inspect more closely what he might be up to, or to tell him quietly to keep his eyes on his own work, whereas in a tiered auditorium, it is impossible to approach most students to check on them or to speak to them without disturbing others. To avoid raising a ruckus in the middle of an exam, the invigilator may avoid confronting cheaters.
The advent of computer-console testing presents the ancient problem of during-exam cheating in a setting where the traditional precaution of separating examinees has been rendered impracticable:
American students in a computer fundamentals class taking a computer-based test. Photo and caption are from en.wikipedia.org/~
As copying from nearby examinees is the most obvious and available method of cheating, reliance on it can be expected to be frequent, and lackadaisical attempts to block it can be expected to be circumvented:
FAKING THE GRADE: EFFORTS TO STOP CHEATING OFTEN FALL SHORT
MORE EMPHASIS HAS BEEN PLACED ON TAKS [Texas Assessment of Knowledge and Skills], NOT ON CATCHING COPIERS
By Joshua Benton and Holly K. Hacker The Dallas Morning News, June 5, 2007
[...] Texas sets no standards for the physical arrangement of students on test day. It has no rules for how far apart students should sit from one another. It also has no rules on seating arrangements, so that in many schools, best friends are allowed to sit within easy view of each other's answer sheets. (Some school districts voluntarily set tighter rules.)
During tests with higher levels of security, such as the SAT [Scholastic Aptitude Test], there are usually firm standards on the distances between students, and students are generally not allowed to choose where they sit. Would-be cheaters are sentenced, at least, to eyestrain.
Texas also does not require schools to keep charts recording where students sit. If two students have identical and highly unlikely answer sheets — but there is no way to know whether they sat near each other — it can be difficult for an investigation to proceed. Texas doesn't even require schools to record what classroom a student was in when he or she took the TAKS.
"Knowing where they sit can take the evidence from a statistic and make it stronger," said Dr. Frary, a retired professor of educational measurement.
Some local students said their teachers let them sit pretty much wherever they wanted during TAKS testing. Others said their teachers were more aggressive about assigning seating. But even when teachers take preventive steps, students find ways around them.
Priscilla Ramirez, a student at DISD's Adamson High School, said that on one test day in her class, students were assigned numbers as they walked in. Each number corresponded to a desk, which was intended to randomize seating. But, she said, some students just traded their numbers to sit next to their friends.
Cheating "is sort of like an epidemic," she said. "It's not going to stop unless you really, really, really try." [...]
When it comes to security on state tests, Mississippi is one of the gold standards. It requires at least two adults in the testing room at all times — and in some cases, even more. Violators of state test-security rules can, in the most extreme cases, be fined or face jail time.
State auditors make at least one unannounced visit to every school district each year on testing days to inspect the security procedures in place. If students are seated too close together to meet state guidelines, for instance, schools are told to spread them out. Seating students at cafeteria tables so they face each other — which is allowed in Texas — is prohibited in Mississippi. [...]
"The basic ground rule is, whatever security measures go in, somebody is going to find out how to evade them," said David Berliner, a professor of education at Arizona State University and a critic of high-stakes testing. "Banks are still broken into." [...]
Perhaps it is only following the advent of high-tech communication technology that during-exam cheating has been able to take place on a grand scale even during high-stakes examinations:
At Top School, Cheating Voids 70 Pupils' Tests
By AL BAKER New York Times Published: July 9, 2012
Seventy students were involved in a pattern of smartphone-enabled cheating last month at Stuyvesant High School, New York City officials said Monday, describing an episode that has blemished one of the country's most prestigious public schools.
The cheating involved several state exams and was uncovered after a cellphone was confiscated from a 16-year-old junior during a citywide language exam on June 18, according to a city Department of Education investigation.
Cellphones are not permitted in city schools, and when officials looked into the student's phone, they found a trail of text messages, including photos of test pages, that suggested pupils had been sharing information about state Regents exams while they were taking them.
Sixty-nine students had received the messages and responded to them, the department said. [...]
Despite cell phones having given fresh life to during-exam cheating in high-stakes examinations, it is in the more numerous, lower-stakes term tests that the greatest volume of such cheating occurs, because these are typically administered in the same crowded lecture halls in which lectures are delivered, like the one below.
And if during-exam cheating is an omnipresent threat to exam integrity even when reasonable preventative measures are taken, it mushrooms when all preventative measures are removed, as seems to have been done in the United States Air Force Academy in 2012:
Air Force Academy: Up to 78 Cadets Busted for Cheating on Test
Jun 07, 2012 Associated Press by Dan Elliott
DENVER — Up to 78 Air Force Academy cadets cheated on an online calculus test by getting help during the exam from a website, the academy said Wednesday. [...]
Cadets took the test on their own, outside the classroom and without supervision. The test will no longer be given online, he said.
The academy's math department began looking into the possibility of cheating when a number of cadets who had done well on previous [weakly-invigilated term] tests failed the final [strongly-invigilated] exam. [...]
When the 2012 cheating scandal above is added to the historical record below, the conclusion that is suggested is that cheating is endemic in the United States Air Force Academy, and never stops being re-discovered because no feasible solution to it is ever found:
The first Honor scandal broke in 1965, when a resigning cadet reported knowing of more than 100 cadets who had been involved in a cheating ring. One hundred and nine cadets were ultimately expelled. Cheating scandals plagued the Academy again in 1967, 1972, 1984, 2004 and 2007.
And so during-exam cheating is a conventional-examination defect which the TEXAM is usually able to block entirely, and at the very least to render prohibitively laborious. That is, as every TEXAM differs from every other, even during crowded testing nothing is gained by catching glimpses of a neighbor's work, and the better-prepared student finds it colossally more difficult to help his under-prepared friend.
Even cell-phone communication poses a reduced threat to examination integrity under TS testing. That is, in conventional examinations all EXAMS are identical, and so a single examinee can inform seventy other examinees of correct answers. Under a TS regime, however, the better-prepared student would have to receive from his worse-prepared friend a photo of the question that was stumping the latter, take time from his own work to solve that question, and then return the answer to him alone, as no other student would have that same question on his own exam.
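The fairness of handing every student a different TEXAM rests on the averaging effect described earlier: as the number of questions per TEXAM grows, the spread of average difficulty across a classroom shrinks. A minimal simulation, using the same assumed parameters as the earlier illustration (a 20-student classroom, question difficulty uniform on 0 to 10 versus restricted to 4 to 6), makes the point without any claim to being the actual simulation used there:

```python
import random

def average_difficulty(n_questions, lo, hi, rng):
    # Average difficulty of one simulated TEXAM of n random questions.
    return sum(rng.uniform(lo, hi) for _ in range(n_questions)) / n_questions

def spread(n_questions, lo, hi, n_students=20, seed=0):
    # Gap between the hardest and easiest TEXAM in a simulated classroom.
    rng = random.Random(seed)
    avgs = [average_difficulty(n_questions, lo, hi, rng)
            for _ in range(n_students)]
    return max(avgs) - min(avgs)

# Spread shrinks as questions per TEXAM rise, and shrinks further
# when the difficulty range is restricted from 0-10 down to 4-6.
for n in (2, 10, 50, 250):
    print(n, round(spread(n, 0, 10), 2), round(spread(n, 4, 6), 2))
```

With 250 questions per TEXAM and the restricted 4-to-6 range, the difference between the luckiest and unluckiest student's paper becomes negligible, which is what licenses giving every examinee a unique exam.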
To wrap up — for the Chinese teachers whose outdoor solution is illustrated above, the TS renders unnecessary the very wide separation made possible by open-air testing, as the smaller separations available indoors would suffice, so that equitable testing would become available in all seasons and in any weather. All the many precautions to frustrate during-exam cheating that we have seen being taken above, sometimes with only small effect, would become unnecessary. The Transformational Syllabus TEXAM would by itself practically wipe out during-exam cheating.
But during-exam cheating is only one kind of cheating, and not the most important. Obtaining a copy of the exam beforehand is the variety that contributes most heavily toward an inequitable allocation of grades, because it is practiced by a larger number of students, and because it provides each cheater with a greater improvement in his score than peeking at his neighbor's work is likely to.
Exam previewing is sometimes initiated by a single student obtaining, and often sharing, a copy of the exam beforehand:
Alberta Education uncovers cheat ring for Math 30 test
Pete Curtis and Lisa Grant 660 NEWS ALL NEWS RADIO May 01, 2010
Alberta Education officials believe an adult student from Calgary emailed details of the Math 30 test to students in Edmonton.
The Pure Math 30 test is kept under wraps, like all diploma exams, until moments before the tests. Even the teachers don't know the questions.
However, exceptions are made for the 30 to 50 students who are out of the province for family emergencies or sports competitions.
The Calgary Herald reports the woman, overseas at the time, gave false information as to where she was going to take the test. After receiving it a couple of days before the exam, the document was apparently scanned and e-mailed to friends back in Alberta.
The whole thing might have gone unnoticed if it weren't for one student who asked his Edmonton teacher how to complete a specific question. While flipping through the exam the next day, the teacher noticed the exact same question.
Geography sometimes makes it impossible to administer examinations simultaneously:
SAT's Legacy of Cheating
Sep 28, 2011 5:40 PM EDT
[...] Last year a lecturer at a Korean "cram school" was arrested after taking the SAT in Seoul and then emailing the questions to two Korean students studying in Connecticut. Because of the time difference, the two students were able to research the answers. They scored 2,250 and 2,210 points each, out of a perfect 2,400 — much higher than their previous attempts — which triggered an audit. [...]
The illicit preview is sometimes made possible not by a lone individual, but by an entire group.
Schools told not to stagger exams amid allegations of cheating
Schools have been ordered to hold 11-plus exams on same day after staggered tests (to avoid Jewish students working on the Sabbath) saw the pass rate double amid allegations of cheating.
The Telegraph Published: 05 Nov 2010
Parents claimed two-day tests allowed pupils from the first exam day to leak questions to candidates taking papers the next day.
Police in Ilford, Essex were called in earlier this year after cheating allegations were made.
Now, after an investigation, education officials from Redbridge Council in East London have arranged for all pupils to take the test at the same time.
Last year more than a thousand pupils had a choice of taking the exam on either Saturday or Sunday.
The two-day option was introduced to help children with religious requirements. The area has a large Jewish population.
But parents posting messages on internet forums claimed earlier this year that some children had been encouraged to pretend they were ill or use religion as an excuse — so that they could take the test on Sunday.
Eleven-plus tutors would call Saturday candidates and get them to reveal exam questions. They could then pass on answers to their pupils before they entered the exam room, it was alleged.
Twenty-nine per cent of the 365 boys and girls who took the Sunday paper got through, while only 14 per cent of the 1,065 candidates who took the Saturday exam passed.
It commonly happens that a large class is chopped into sections, sometimes taught by the same instructor, sometimes by different instructors, a practice that isn't going to go away because there are good reasons for it. For example, as the different sections meet at different times, students are able to choose the meeting times that do not clash with their other courses. Or maybe a large auditorium able to accommodate all students is unavailable, but smaller classrooms are. Or maybe students are expected to participate more in smaller classes, or have greater access to their instructors. In any case, if examinations are given during regular class times, some students will be tested earlier than others, and if the examinations are identical, then a leakage of information from earlier to later examinees is bound to occur. Obvious though this may be, the instructor will often feel that the volume of cheating will be too small to justify the labor of creating different versions of the examination, and may be further deterred from doing so by concern that the alternative versions might differ in difficulty, or at least might be complained of as being of unequal difficulty even when they are not.
The following account is useful in detailing how such cheating might take place, even though a reading of the full article reveals that the particular student suspected was ultimately deemed to have been innocent:
There were three physics classes, taught in periods 4, 5 and 6. Charles Pryor was in sixth period. Happy Jack gave the same tests to all three periods. Therefore, during period 5 Charles Pryor could meet with a student from period 4 who had just taken the test, find out what the questions were and look up the answers before period 6. In addition, Happy Jack Fielder would sit in the back of the classroom while the tests were being taken and grade the tests handed in by the previous students. Charles Pryor could go to the back of the room to ask the teacher a question, lean over Happy Jack's shoulder and see what answers the other students had given and whether Happy Jack had marked them right or wrong. We became convinced that this was what Charles Pryor was doing.
Sam Sloan, A Story About High School Cheating www.ishipress.com/~
Under the TS method, however, TEXAMS taken earlier convey exactly the same information as the PRACTICE TEXAMS which everybody has already been running off and studying, and has no shortage of. Earlier exam takers would be welcome to pass on their exams to later exam takers, as doing so would confer no advantage. TS examination methodology completely resolves the problem of preview cheating.
In the four examples immediately above, preview cheating was made possible by the same exam being given on two or more occasions separated by hours or at most days. A recycled exam is simply the same exam administered at longer intervals, most usually the one year between successive deliveries of a course, but in one instance farther below, after an interval of five years.
The American Board of Radiology (ABR) scandal furnishes an outstanding example of exam recycling, and serves to illustrate the principle that the longer a recycled examination continues in service, the more organized and successful become the efforts to steal its contents and disclose them to future examinees, often along with correct answers:
Exclusive: Doctors cheated on exams
By Scott Zamost, Drew Griffin and Azadeh Ansari, CNN
updated 1:20 PM EST, Fri January 13, 2012
(CNN) — For years, doctors around the country taking an exam to become board certified in radiology have cheated by memorizing test questions, creating sophisticated banks of what are known as "recalls," a CNN investigation has found.
The recall exams are meticulously compiled by radiology residents, who write down the questions after taking the test, in radiology programs around the country, including some of the most prestigious programs in the U.S. [...]
Asked if this were considered cheating, Becker told CNN, "We would call it cheating, and our exam security policy would call it cheating, yes."
Radiology residents must sign a document agreeing not to share test material, but a CNN investigation shows the document is widely ignored. Dozens of radiology residents interviewed by CNN said that they promised before taking the written test to memorize certain questions and write them down immediately after the test along with fellow residents. [...]
The practice of sharing exam answers is so widespread and considered so serious in the medical community that the ABR has put out a strongly worded video warning residents that the use of recalls must stop.
"Questions and answers have been memorized, sometimes verbatim, and contributed to extensive archives of old ABR test material that become the prize possessions of many residency programs," Becker said in the video, which appears on the board's website.
He said "accumulating and studying from lists of questions on prior examinations constitutes unauthorized access, is inappropriate, unnecessary, intolerable and illegal." [...]
Webb, 31, said he failed the first radiology written exam, which focuses on physics, in the fall of 2008. He said the program director at the time, Dr. Liem Mansfield, told him to use the recalls in order to pass.
"He told me that if you want to pass the ABR physics exam, you absolutely have to use the recalls," Webb said. "And I told him, 'Sir I believe that is cheating. I don't believe in that. I can do it on my own.' He then went on to tell me, 'you have to use the recalls,' almost as if it was a direct order from a superior officer in the military." [...]
Cheating on the part of FBI agents further illustrates how pervasive cheating is, though this one report mentions so many cheating methods that it cannot be placed within a single category. The availability of "answer sheets" and "study guides", however, suggests that an oft-recycled exam is being reconstructed, perhaps using methodology similar to that used by the radiologists.
FBI employees reportedly cheated on security test
By James Vicini
Mon Sep 27, 2010 11:24am EDT
(Reuters) — FBI agents and several supervisors cheated on an exam about new rules for terrorism and criminal investigations and for collecting foreign intelligence, according to a U.S. Justice Department report released on Monday.
The report by inspector general Glenn Fine found that some FBI employees improperly consulted with others while taking the exam, and others used or distributed answer sheets or study guides that essentially provided the answers to the test.
A few FBI employees, including several supervisors and a legal adviser, exploited a programming flaw to reveal the answers on their computers, according to the investigation into four FBI offices around the country and several individuals.
The report found significant abuses and cheating involving at least 22 employees. [...]
University testing, of closer interest here, is similarly compromised by exam recycling:
When I attended Stanford, cheating in the exam rooms was low, but students cheated by covertly obtaining the teacher's earlier exams and answers.
George M. Sicular, Engr. '72 Palm Desert, California Letters to the Editor, Stanford Magazine November/December 2003
An examination recycled even after five years is still a security lapse that can give some students an immense advantage:
Past paper exam 'cheats' off the hook
Times Live, Johannesburg
Oct 31, 2010 11:35 PM By PREGA GOVENDER
Twenty students who were banned from writing an exam until November next year because of alleged cheating are off the hook — after the examiner admitted that he had set a question paper identical to one used five years ago.
The students, based at the Ekurhuleni West Public FET College's Kempton Park campus, were banned from writing the N6 fluid mechanics paper, which is scheduled to be written next month, after nine of them scored 99% in an earlier paper written in July.
They were recently informed in writing by the Department of Higher Education and Training's exam's irregularities committee that they had been found guilty of contravening the exam rules. [...]
But the students said they had done well because they had thoroughly gone through previous question papers.
Lucas de Lange, examiner of the fluid mechanics paper, this week apologised to departmental officials for setting the same paper he had set in 2005.
"I told them they could fire me. They warned me not to do it again." [...]
"I don't know how they got to it but they knew which paper was coming out. That's the part that bothers me. I do know that they work through previous years' question papers."
But De Lange admitted that the students had not deliberately cheated. He said he had recommended that the ban be lifted and that the students be allowed to rewrite the paper this year.
One of the students said: "Our only crime was working through past year question papers."
In view of what we have just been reading about recycled examinations, one of the statements made higher above by the UC Davis Mathematics Department Chairman may now be evaluated in a less indulgent light: "there are the instructors who have two exams, and use them alternately every other year." Well, we are now emboldened to ask, won't students who catch on to what these instructors are doing have a rather easy time of it? And as Professor Hass refers not to a single instructor, but to instructors, one may wonder just how many such biennializing instructors there may be. And Professor Hass's publicly broadcasting that such biennial recycling does take place within the Mathematics Department over which he presides — won't that place some UC Davis math students on high alert for these particular recycling patterns, so that they themselves don't end up at the bottom of the class for neglecting to capitalize on the previewing that recycled exams allow?
K-12 testing is, of course, also contaminated with recycled examinations, which in the circumstances described in the Parents irked article below come with three security breaches, each identified within the article like so: [Security Breach 1]. Whereas canny educators know that recycled examinations are likely to be intercepted and previewed even when the recycling is kept secret and no security breaches are evident, the case below shows us educators broadcasting the fact that examinations are recycled, and openly permitting and broadcasting the existence of security breaches, with no recognition that their negligence throws the doors wide open to recycled-exam cheating on a massive scale.
Also noteworthy in the Parents irked article below is mention of a second reason for recycling exams beyond reducing teacher labor: to permit "benchmark assessments", by which is meant historical or geographical comparisons, also marked below with something like [Benchmark Historical]. Benchmark assessments will be discussed below in the section titled BENCHMARK COMPARISONS.
And also noteworthy in the Parents irked article below is the photographic reminder of inadequate precaution to prevent during-exam cheating.
And finally noteworthy in the Parents irked article below is the documentation that recycled examinations arouse the frustration of those left out of the preview loop, those in other words who suffer the disadvantage that an IMPLICIT SYLLABUS inflicts, which is not knowing what's going to be on upcoming exams, and therefore not having optimal practice materials. Parents en masse complain that the IMPLICIT SYLLABUS deprives them of the information they need to teach their children whatever it is that they are being failed for not knowing.
Parents irked over no-peek test tradition: Schools want to reduce cheating
BLIND TESTING Students at Field Elementary School work on a quiz in class. Some parents are frustrated that graded tests are not routinely sent home with their children for them to review.
By Jonathon Braden COLUMBIA DAILY TRIBUNE February 3, 2009
Harry Williams just wants a chance to see the tests his kids have taken at school.
"How can I help my children if I can't see the test they had?" Williams asked at a recent Black Parents Association of Columbia Public Schools meeting.
Williams is among a number of parents, including Board of Education member Ines Segert, who are frustrated that graded tests are not routinely sent home with Columbia Public Schools students. While the school district leaves that decision up to teachers in most cases, many avoid sending tests home with students so they can reuse the same test the following year. In those cases, parents who want to see their students' work have to arrange a time to review the tests at school. [Security Breach 1]
"It's fear of cheating. Let's be honest," said Michael Muenks, coordinator of curriculum and assessment for the Missouri Department of Secondary and Elementary Education.
Local school district administrators acknowledge that test security is a primary driver for the policy, but they have a nuanced view of why that's important. For example, some teachers need to use the same test every year so the school district can gather reliable data for comparing student performance from year to year. [Benchmark Historical]
"We're in the process of trying to make sure our tests are consistent across the district," [Benchmark Geographical] said Chip Sharp, the district's math coordinator for grades 6-12. [...]
In some cases, the district does not allow teachers to send graded tests home with students. Examples are Connected Math, taught in grades 6-8, and Integrated Math 1 and 2, taught to junior high or early high school students, in which data are collected for benchmark assessments. [Benchmark Assessment]
Segert, a University of Missouri psychology professor, doesn't buy the argument that teachers and school officials can't come up with new tests every year and still accomplish the goal of year-to-year comparison. [Benchmark Historical] "There's no reason they can't pick an equally good question," she said. [...]
But, as Williams and others have asked, what are parents to do if they can't see the problems their children are missing on tests? [...]
Muenks, the state curriculum and assessment coordinator, said it's a common practice among local school districts to keep tests at school, but he knows that's not always popular with parents. "By [schools] trying to be thrifty and keep these things secure, they are inconveniencing parents," he said.
As an alternative to requiring parents to come to school to see tests, Muenks said, schools could have a test folder that students carry home at a parent's request and take back to school. [Security Breach 2] Schools could also post a digital PDF image of the test on the schools' Web site for a couple of days, Muenks said, and give parents access [Security Breach 3] with a username and password. [...]
She points out that district officials say they want parents to partner with the district for students' education. "How can they be partners," she asked, "if they're not allowed to see what [their kids] did wrong?"
Professor Richard Quinn, as described below, recycled an examination that had been created not by himself but by the publisher of the textbook he was using, with the same harmful effect of permitting student previewing. Mention in the second of the two articles below that his students took his midterm "over the course of four days" suggests that there may have been four different sections of the course, and that classic previewing of the sort described in the Happy Jack example above may have played a role in this incident as well.
Criticism Surrounds Test Used In Cheating Scandal
wftv.com Posted: 5:50 pm EST November 13, 2010
ORANGE COUNTY, Fla. — A student in Professor Richard Quinn's business class posted a new video on YouTube. The video is from the first week of class, when Professor Quinn told students he writes his own mid-term and final exams.
"There's an opportunity that I may very well write a question that I couldn't even answer. I try not to do that, but it happens from time to time," said Professor Quinn.
But it seems Professor Quinn never wrote the mid-term exam his students cheated on. It was written by the publisher of the textbook for his business class. One student found a copy on the internet, and passed it on to others.
"I hate to say this, but because of the professor's laziness, it basically gave the students an opportunity to cheat," said David Tate, a UCF business student.
But UCF spokesperson Grant Heston told WFTV "it's not uncommon for higher education professors to use these pre-made exams produced by the publisher."
WFTV talked to students who said they know exactly how to find them.
"You can buy the answer manuals to most publishing textbooks on like Half.com or Amazon.com and I've known friends who have bought the answer manuals," said UCF student Michael Atteo.
Professor Quinn has been with UCF ten years. Eyewitness News asked if he would be punished for using a test that's so easily accessible online.
"It's irrelevant. The focus shouldn't be on the professor, but on the students who used the test inappropriately," said Heston.
Whereas one might view students who legally purchased study materials as having done no wrong, and in fact as deserving commendation for following a study strategy that earned them high marks, the administration chose to view them as having cheated, and the press chose to view Professor Quinn not as a blunderer but as a "folk hero" for his aggressive portrayal of himself as a victim of student perfidy.
UCF business instructor becomes folk hero after taking hard line on cheating
By Richard Danielson Tampa Bay Times Staff Writer November 12, 2010
ORLANDO — It was a lecture Richard Quinn hoped he would never have to deliver.
Faced with evidence of cheating by up to a third of his class — 150 to 200 students — the University of Central Florida business instructor confronted them in a weekly lecture.
"To say I'm disappointed is beyond comprehension," Quinn said, his voice quavering with indignation. "Physically ill. Absolutely disgusted. Completely disillusioned." [...]
In a case even more extreme than that of the American Board of Radiology examinations above, the US Navy's recycling of its nuclear-qualification exam allows time for an answer key to be prepared and handed out to examinees along with the exam paper:
Nuclear Sub Cheating Scandal
by Christopher Brownfield
Sep 22, 2010 6:46 PM EDT www.thedailybeast.com/~
A large number of nuclear submarine drivers never legitimately passed their qualification exams.
In the U.S. Navy, thousands of competent sailors and officers drive nuclear submarines 24/7 with discipline, dedication, and skill. The problem is that a large percentage of these personnel never legitimately passed their nuclear qualification exams. And it's not entirely the fault of the crews—the system is seriously broken from the very top. [...]
My fellow officers were surprised by my failure, and wondered aloud why I hadn't used the "study guide." When my second exam arrived, so did the so-called study guide, which happened to be the answer key for the nuclear qualification exam I was taking. I was furious. Defiantly, I handed back the answer key to the proctor and proceeded to take the exam on my own. I failed again. [...]
The most competent junior officer on our ship ran to my rescue, confiding that none of the other officers had passed the exam legitimately; the exam was just an administrative check-off. "Swallow your pride," he told me, and just get it done. [...]
And the US Army scandal may be worse still, seeming to involve a large number of different exams and a vast number of examinees:
Army knew of cheating on tests for eight years
Hundreds of thousands of exam copies used, Globe probe finds
By Bryan Bender and Kevin Baron
Globe Correspondent December 16, 2007
Army program analyst Al Kahn and training technician Sherry Beardslee help those taking tests in the Army Correspondence Course Program from their office at Fort Eustis, Va. The program is under fire from many who say rampant cheating is taking place by soldiers who are getting test answers from Internet websites. (Dina Rudick/Globe Staff)
FORT EUSTIS, Va. — For eight years, the Army has known that its largest online testing program — which verifies that soldiers have learned certain military skills and helps them amass promotion points — has been the subject of widespread cheating. [...]
Conklin said that cheating was "almost universal" in her unit, and that she was told it was none of her business when she tried to report it.
"The data to catch the cheaters is right there," she said in an e-mailed response from Germany. "The Army has this data in their hands."
During the eight years of Army inaction, the cheating problem grew steadily. The Globe investigation found that the cheating epidemic has involved tens of thousands of soldiers. Computer records from one site, called ShamSchool, created by the soldier who was the subject of news reports in July, show more than 200,000 downloads of packages containing the answers to multiple exams in just the 11 months from September 2006 to this past August. They included:
42,839 downloads of a package of engineering tests, covering subjects including explosives and demolitions, detecting mines, building trenches, and other forms of combat engineering;
19,570 downloads of a package of what the Army calls "interschool" exams, covering attack helicopter formations, chemical detection and contamination, and infantry field hygiene;
18,891 downloads of air defense artillery examinations; and
13,282 downloads of the course package for the Quartermaster Corps.
In August, commanders at the Army's 101st Airborne Division in Kentucky ordered the soldier operating ShamSchool to take down the test materials. He did, but made the same information available on another site. Then, the day after he was allowed to leave the Army on a general discharge in October, he reposted the pirated exams on ShamSchool, which remains active.
But there are many other avenues for obtaining test results, including links on Yahoo and Google message boards, among other heavily trafficked websites. In August, a package of more than 830 Army examinations was offered for $24.99 on eBay.
In October a new service popped up: For a relatively modest fee, someone will take the exams for any soldier who agrees to pay. [...]
Despite the ubiquitous evidence of the impropriety of using recycled examinations, even established testing services find themselves caught in the same old trap:
Computer Admissions Test Found to Be Ripe for Abuse
By WILLIAM CELIS 3d
Published: December 16, 1994
The Educational Testing Service, which administers the nation's leading standardized admissions test for graduate schools, boasted when introducing a new computerized version a year ago that the test was nearly infallible. But a company that coaches students got suspicious last summer that the new version was easy to cheat on.
The company, Kaplan Educational Centers, started hearing the same questions over and over from students they were coaching who had taken the test at various times. The implication was clear: The testers were recycling some questions. If this was so, an enterprising student could memorize the questions and share them with friends taking the test later, or improve his results by taking the test again, or put the questions on the Internet, or sell them.
So earlier this fall Kaplan sent three employees undercover to take the test, called the Graduate Record Examination. Sure enough, the three were able to memorize so much of the exam that Kaplan could construct a replica. Kaplan went straight to the Educational Testing Service, and said, essentially, here is our version of your test. Kaplan said that 70 to 80 percent of their questions matched the real thing. [...]
However, under the Transformational Syllabus, examination recycling, and its attendant evils of exam interception and previewing, would become relics of primitive times in the history of education.
Homework Occasions The Most Cheating
Homework is essentially an examination bearing the worst characteristics noted above: the student does the work wherever and whenever and for however long he wants, and in the absence of proctoring is able to rely on any amount of help, which makes homework the most cheated-on of all scholastic exercises. Today's Internet expands the sources from which help can be obtained beyond copying from classmates, and beyond getting help from parents; the contemporary student can now add to his list of aides perfect strangers willing to work cheap. Note that although the article below most frequently mentions the foreign preparation of essays, "maths papers" can also be provided, which possibly means solutions for the student's math homework.
Australian students cheat by outsourcing homework on foreign websites for 2 U.S. dollars
14 November 2010
CANBERRA, Nov. 14 (Xinhua) — Australian high school and university students have been found outsourcing their homework to websites in India, Pakistan and Egypt which provide English essays and maths papers for as little as two U.S. dollars, Australia's media reported on Sunday. [...]
The Sunday Telegraph tracked down one worker offering his services, graduate Mohammed Ali Khan, 23, of Islamabad, Pakistan.
"It's my part-time job," Khan told The Sunday Telegraph on Sunday. "I get work from all over the world including Australia, the U.S. and the United Kingdom."
Schools are powerless to stop cheaters using the outsourcing services, because custom-made work cannot usually be detected by plagiarism software. And academics expressed concerns about the new customized cheating factories on the net.
According to University of Western Sydney associate dean Craig Ellis, in the past five years there has been an explosion in sites where you can download pre-written assignments. [...]
Homework Returns The Slowest Feedback
Efficient learning requires immediate feedback, but consider what happens when a student translates ten homework sentences into French at 7:00 p.m. For each sentence translated, he lacks confidence in his answer, and sometimes he knows that his answer is a wild guess and almost certain to be wrong. At other times, he is confident that his answer is correct, but it is not. What he is immersed in as he churns out his homework is bad French — misspelled and ungrammatical and gap-ridden, and in places totally botched. What is his attitude supposed to be toward this French in which he is immersed? Ideally, he should feel that all the French he encounters needs to be internalized because it is correct French. The homeworking student, in contrast, would be operating optimally by assuming an attitude of rejection, of refusing to internalize the French that he is immersed in because he knows it to be largely defective. And so when does he ultimately get the right translations? The next day at the soonest, but as his next French class may be several days off, even later than that. Immersing a student for days at a time in bad French of his own creation, and in memories of that bad French, is hardly good educational practice.
Homework Provides The Lowest Quality of Feedback
And then what is the nature of the feedback that the student does get? If the teacher collects the homework to grade and correct it, the feedback is further delayed, and sometimes the teacher may correct the homework only cursorily, and sometimes not at all. When researcher Lucius Cervantes asked high school dropouts "Was there anything in particular that you disliked about going to school?", one answer he got was:
One thing I didn't like at Rindge was the way they marked your homework assignments. You might work two or three hours on an assignment and when they turn it in they just check your name. And if you don't have it you don't get a check. They don't look to see what you have done.
Lucius F. Cervantes, The Dropout: Causes and Cures, University of Michigan Press, Ann Arbor, 1965, p. 84.
And if the teacher does not collect the homework to correct it, he might solicit the correct answers in class, but here again are encountered inefficiencies. Some of the student answers will be wrong, so the class is immersed in more bad French. If an answer given orally is adjudged to be correct, other students may make mistakes writing it down. And given that a sentence might easily have several correct translations, is every student who has an alternative answer going to get a chance to have his answer evaluated?
The XTS method rids education of homework cheating simply by giving no credit for any unproctored work. The student works through PRACTICE TEXAMS on whatever topics and levels of difficulty he chooses, and whenever and wherever and for however long he wants, either at a computer console or on hard copy, and when he is ready to get credit for his scholastic achievement, he takes an EVALUATION TEXAM in a proctored setting, again either before a computer console or on hard copy, either in class or in a special examination facility.
The student's feedback (correct translations in French, say, or correct solutions in Mathematics) is available as soon as the student asks for it, and that feedback is detailed and correct without burdening the teacher with the vast labor of providing it to individual students by hand.
TEACHERS CHEATING ON BEHALF OF STUDENTS
A vast number of teacher-cheating scandals are on public record, among which the following figures prominently:
America's Most Outrageous Teacher Cheating Scandals
by Lois Beckett
Sept. 19, 2011
[...] Teachers in Atlanta were so used to changing students' answers on standardized tests that they gathered for "erasure" parties and prepared answer keys on plastic transparencies to make the cheating easier. One teacher told investigators that she feared retaliation if she didn't participate, saying the district was "run like the mob." At least 178 teachers and principals have been implicated in the scandal, which was first brought to light by the Atlanta Journal-Constitution. [...]
Tubing, explained below, can be an effective way for school officials to cheat when a test booklet's right edge is taped shut, but the top and bottom edges of the booklet aren't:
Under Pressure, Teachers Tamper With Test Scores
By TRIP GABRIEL New York Times
Published: June 10, 2010
The staff of Normandy Crossing Elementary School outside Houston eagerly awaited the results of state achievement tests this spring. For the principal and assistant principal, high scores could buoy their careers at a time when success is increasingly measured by such tests. For fifth-grade math and science teachers, the rewards were more tangible: a bonus of $2,850.
But when the results came back, some seemed too good to be true. Indeed, after an investigation by the Galena Park Independent School District, the principal, assistant principal and three teachers resigned May 24 in a scandal over test tampering.
The district said the educators had distributed a detailed study guide after stealing a look at the state science test by "tubing" it — squeezing a test booklet, without breaking its paper seal, to form an open tube so that questions inside could be seen and used in the guide. The district invalidated students' scores.
The following report on cheating at the Jesse Jackson charter school presents observations whose interpretation is unclear, but which seem to originate from a peculiarly inept instance of teacher cheating:
FAKING THE GRADE: AT CHARTERS, CHEATING'S OFF THE CHARTS
LOOSELY REGULATED SCHOOLS AMONG STATE'S WORST OFFENDERS ON TAKS
[Texas Assessment of Knowledge and Skills]
By Joshua Benton and Holly K. Hacker The Dallas Morning News, June 4, 2007
Take last year's 11th-grade science test, for example. The News' analysis flagged 46 of Jackson's 51 juniors for cheating. Their answer sheets are all identical or remarkably similar to the others, as if all 46 students got their answers from the same source — albeit a bad one. But only two of those students actually passed the exam because the shared answers were mostly wrong. [...]
The News' analysis can't determine how, exactly, cheating took place. But experts say the data do suggest a number of possibilities.
It is possible that students are, en masse, copying answers from one of their less-bright peers. That would likely require a near collapse of test-security procedures. A school official — usually a teacher — is supposed to supervise every moment of test administration. It is difficult to imagine how 46 students could copy answers off a single source without an honest teacher noticing.
Another possibility: Teachers or other school officials are actively helping, perhaps by preparing answer keys ahead of time or by doctoring answer sheets after the fact. Both of those phenomena have been reported on the TAKS before, such as in the now-defunct Wilmer-Hutchins school district.
But one might also expect a cheating teacher to get more TAKS answers right than Jackson's students did. [...]
TEACHERS CHEATING ON THEIR OWN BEHALF
The must-read New York Times article linked below presents extensive evidence that even giant-of-exam-business Educational Testing Service relies on the fallacious and discredited practice of recycling examinations, whose flimsy security teachers have not been slow to utilize for their own gain:
Educational Testing Service New Jersey headquarters
Giant of Exam Business Keeps Quiet on Cheating
By DOUGLAS FRANTZ and JON NORDHEIMER September 28, 1997
[...] The man on the telephone said he was a Louisiana teacher and had a stolen copy of the standardized test that Mr. Weston's company, Educational Testing Service, administers to teachers who want to be school principals. [...] Three days later, Mr. Weston and two other senior managers of the testing service were in Louisiana confronting a situation that was even worse than they had thought. Copies of the test's 145 multiple-choice questions, along with correct answers, had circulated among teachers throughout southern Louisiana, probably for years. In a state mired at or near the bottom of almost every educational ranking, teachers had cheated their way into running public elementary, middle and high schools. [...]
The above example of teachers cheating on their own behalf is simply reliance on a recycled exam, now practiced by teachers just as we have already seen it practiced by students.
EXAM CREATORS CHEATING ON BEHALF OF THEIR OWN POCKETBOOKS
Having worked our way up from the student at the bottom, through the teacher in the middle, we now reach the exam creator at the top — only to discover that none of the three tiers can be trusted to be unsullied by cheating:
Exam boards: Michael Gove orders inquiry over cheating revelations
By Holly Watt, Claire Newell, Robert Winnett and Graeme Paton
10:00AM GMT 08 Dec 2011
Michael Gove, the Education Secretary, has called for a fundamental reform of the exams system after an investigation disclosed that exam boards gave teachers secret advice on how to improve their GCSE and A-level results. [...]
Mr Gove has ordered an official inquiry into the exam system after an undercover investigation by The Daily Telegraph exposed the questionable practice.
It found teachers are paying up to £230 a day to attend seminars with chief examiners during which they are advised on exam questions and the exact wording that pupils should use to obtain higher marks. [...]
One chief examiner has been secretly recorded by this newspaper telling teachers which questions their pupils could expect in the next round of exams.
"We're cheating," he says. "We're telling you the cycle [of the compulsory question]. [...]
A series of secretive exam seminars, which are thought to have rapidly grown in popularity in recent years, are suspected of being at the centre of concerns over the system. [...]
Undercover reporters from this newspaper went to 13 meetings organised by boards used by English schools and found that teachers were routinely given information about future questions, areas of the syllabus that would be assessed and specific words or facts students must use in answers to win marks.
The seminars were usually held in hotels and cost between £120 and £230. Each one is typically attended by at least 20 teachers, but sometimes as many as 100.
At a WJEC course in London for GCSE history last month, teachers were told by Paul Evans, one of the chief examiners of the course, that the compulsory question for section A of the exam "goes through a cycle".
"This coming summer, and there's a slide on this later on, it's going to be the middle bit: 'Life in Germany 1933-39' or for America, it will be 'Rise and Fall of the American Economy' … So if you know what the compulsory section is you know you've got to teach that." [A] teacher pointed out that they had been told to teach the entire syllabus [...]. [...]
When one of his colleagues said this information was not in the course specification, Mr Evans said: "No, because we're not allowed to tell you."
WJEC literature on the website also appears to advise teachers that they need not teach the full syllabus and points out which sections will be examined each year.
When one of Mr Evans's colleagues, Paul Barnes, was asked by a teacher if he had understood correctly that Mr Barnes was saying they would not be asked a question on Iraq or Iran next year, he replied: "Off the record, yes." [...]
In November, an undercover reporter attended the AQA GCSE English seminar in Brighton. Teachers were told by Liz Hey, the subject manager for the English qualifications, that students could study only three out of 15 poems, even though she said the governing body [Qualification and Curriculum Authority] state it should be 15.
The moral to be drawn is that whenever unequal knowledge concerning an upcoming examination exists, some leakage will inevitably occur. When such an illicit transfer of knowledge takes place on a large scale, as described above, it may come to public attention and cause scandal; of the innumerable lesser instances that have been taking place from time immemorial, and that — without testing reform — will continue to take place into the indefinite future, the public never hears. As all conventional testing involves unequal knowledge of upcoming examinations, all conventional testing must be regarded as corruptible. The only incorruptible testing method is one which equates knowledge of upcoming exams, and that is the XTS method — the teacher knows as much as the creator, the student knows as much as the teacher, and no one knows any more than can be discerned by an examination of plentifully available TEXAMS.
DON'T ALTERNATIVE VERSIONS OF TESTS ACCOMPLISH THE SAME THING AS TEXAMS?
Creating alternative forms of a test may appear to be a readily available alternative which accomplishes the same goals as TEXAMS, especially when there are considerably more than two alternative forms. Deserving attention for its outstandingly large number of alternative forms is Kentucky, which employs 6 alternative forms for some of its tests, and 12 alternative forms for others:
For reading, math, science, and social studies tests, students are assessed based on 24 multiple-choice questions and 6 open-response questions on six forms of each test. For arts and humanities and practical living and vocational skills, students are assessed based on 8 multiple-choice and 2 open-response questions on 12 forms of each test.
The three block quotes in this section are all from the document cited below, the one above from p. 8:
An Analysis of the Commonwealth Accountability Testing System, Project Staff Office of Education Accountability
Research Report No. 328 Legislative Research Commission, Frankfort, Kentucky lrc.ky.gov
Accepted July 6, 2005 by the Education Assessment and Accountability Review Subcommittee
It is apparent, however, that all these exams are recycled, a practice which our consideration of the evidence above led us to conclude is improper and unacceptable because exam contents can be readily stolen or reconstructed. With several alternative forms, there is simply a bit more to steal and to reconstruct; and even with its numerous forms, Kentucky's exams have so few multiple-choice questions per form as to pose little problem for thieves and reconstructors. Recollect that the Educational Testing Service's recycled school-administrator's test was stolen or reconstructed even though it contained 145 multiple-choice questions, which gives us small assurance that Kentucky exams containing 24 multiple-choice questions on each of 6 forms (144 different questions in all) will be hard to steal or reconstruct, or that exams containing 8 multiple-choice questions on each of 12 forms (96 different questions in all) will pose much challenge to steal or reconstruct.
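The arithmetic above can be pushed one step further with a toy simulation. Purely as a back-of-the-envelope sketch (the uniform random assignment of forms to examinees is an assumption made for illustration, not Kentucky's documented procedure), a coupon-collector calculation suggests how few examinees' pooled memories would suffice to encounter every one of the 144 questions:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical setup: 6 forms of 24 questions each (144 distinct questions),
# with each examinee handed one form uniformly at random.
NUM_FORMS = 6
QUESTIONS_PER_FORM = 24

def examinees_needed():
    """Examinees whose memorized forms a reconstructor must pool before
    every one of the 6 forms (hence all 144 questions) has been seen."""
    seen_forms = set()
    count = 0
    while len(seen_forms) < NUM_FORMS:
        seen_forms.add(random.randrange(NUM_FORMS))
        count += 1
    return count

trials = [examinees_needed() for _ in range(10_000)]
avg = sum(trials) / len(trials)
print(f"average examinees needed to see all "
      f"{NUM_FORMS * QUESTIONS_PER_FORM} questions: {avg:.1f}")
```

The coupon-collector expectation is 6 × (1 + 1/2 + … + 1/6) ≈ 14.7 examinees, so even a dozen-odd cooperating students would suffice, on average, to expose all six forms.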
Therefore, the list in the block quote below cannot be construed as one of security gaps which Kentucky has anticipated and plugged, but only as a confession of security gaps that authorities wish they could plug by means as feeble as their stern warning, but know that they can't:
Test Security. District assessment coordinators, administrators, and teachers shall ensure the security of the assessment materials before, during, and after test administration. It is appropriate for teachers to know and teach the concepts measured by the statewide assessment, but secure test materials shall not be reproduced in any way nor shall notes be taken regarding any secure test item. Tests shall be distributed in the order in which they are received in shrink-wrapped packages. No one may have test booklets without authorization. No one may show items in the test booklets to anyone not administering the test. No one may reveal the content of any secure test item or use that knowledge to prepare students for the assessment. Test administrators must destroy any notes, drafts, or scratch paper produced by students and must ensure that any testing materials reused from previous years are free of any marks. [pp. 149-150]
And from the above we also learn that where even minimal precaution in the fatally flawed system of exam recycling would require that fresh copies of exam questions be printed out for each new batch of examinees, Kentucky is so lax that it recycles the hard copies of exam questions used previously, which opens up the possibility that some marks left by earlier examinees either escape the notice of the erasure squad or are erased incompletely.
And turning from the question of cheating to the question of equal difficulty, it is axiomatic in test preparation that alternative examinations created in any way other than XTS randomization come with no guarantee of equal difficulty, and in the case of the Kentucky tests, such unequal difficulty is acknowledged:
KCCT [Kentucky Core Content Test] is administered using different test forms to provide greater coverage of the core content. As a result, students completing different forms receive different questions, which can result in differences in difficulty. Test forms are designed to minimize these differences. Some differences can persist, so scores are adjusted. For example, two students who receive the same raw score on two different test forms could ultimately receive somewhat different scores for use in calculating their school's academic index. These adjusted scores are referred to as scale scores. Scale scores adjust for small differences in difficulty between test forms. Two students of the same ability who get the same scale score could get different raw scores on different KCCT forms. [p. 12]
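For readers unfamiliar with scale scores, the kind of adjustment the report describes can be illustrated, in deliberately simplified form, by linear equating: raw scores from each form are mapped onto a common scale by matching each form's mean and standard deviation. The score values below are invented for illustration and are not taken from the Kentucky tests:

```python
# Illustrative sketch only -- not Kentucky's actual psychometric procedure.

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

def linear_equate(raw, form_scores, target_mean=500.0, target_sd=100.0):
    """Map a raw score on one form onto a common scale by matching
    that form's mean and standard deviation to the target scale."""
    return target_mean + target_sd * (raw - mean(form_scores)) / sd(form_scores)

# Form B was slightly harder, so its raw scores run lower overall.
form_a_scores = [14, 16, 18, 20, 22]
form_b_scores = [12, 14, 16, 18, 20]

# Two students with the same raw score (16) on different forms
# receive different scale scores, exactly as the report describes.
print(round(linear_equate(16, form_a_scores)))  # below the scale mean
print(round(linear_equate(16, form_b_scores)))  # at the scale mean
```

Hence the report's observation: the same raw score on two forms yields different scale scores, and two students with the same scale score can have different raw scores.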
However, the Kentucky attempt to correct for unequal exam difficulty by means of some sort of "adjustment" cannot be condoned. To give a single illustration of why no post hoc adjustment can make up for unequal exam difficulty, consider the following imaginary examination procedure, admittedly extreme so as to bring out its underlying principle clearly.
Form A of the exam, we imagine, asks Classroom A of first-graders the single question
4 + 8 = ?
and Form B of the exam, we also imagine, asks Classroom B of first-graders the more difficult single question
4½ + 8⅓ = ?
The question before us, then, is whether any adjustment will enable testers to transform the results of the Form B exam so that they give the same information about Classroom B students that Form A gives about Classroom A students. The answer is No; no such adjustment is conceivable. What is to be expected is that none of the Classroom B students will be able to answer their question, which is to say that the difficulty of their question compresses the variance of their test scores down to zero, so that no matter what adjustment is applied to these zeros, Classroom B students will all receive the same adjusted score, which presumably would not have been the case had they been tested on Form A. The example is extreme, but the smaller differences in exam difficulty that exist between different forms in real life produce similar vitiating effects, milder in degree yet still large enough to invalidate attempts to create equal difficulty by post hoc adjustment.
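The variance-compression argument can be checked with a few lines of simulation. In this sketch (the student abilities and question difficulties are invented numbers), the too-hard Form B question drives every raw score to zero, leaving nothing for any adjustment to work on:

```python
# Invented numbers for illustration: 20 students with abilities spread
# evenly from 0.0 to 1.0, each scored 1 (correct) or 0 (wrong) on a
# single question of fixed difficulty.
abilities = [i / 19 for i in range(20)]

def raw_score(ability, difficulty):
    """Deterministic toy scoring: correct iff ability meets difficulty."""
    return 1 if ability >= difficulty else 0

form_a = [raw_score(a, 0.3) for a in abilities]   # easy question: most pass
form_b = [raw_score(a, 1.1) for a in abilities]   # too hard: nobody passes

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(variance(form_a))   # positive: Form A separates the students
print(variance(form_b))   # 0.0: every Form B raw score is identical
```

Any adjustment is a function of the raw score, so twenty identical raw scores must map to twenty identical adjusted scores; the information about ability differences is simply gone.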
The issue of adjustments is complicated, and its further discussion would be out of place here; this brief comment is intended only to suggest that complexity, and to lend some credibility to the blanket statement that administering examinations of unequal difficulty followed by adjustments is an unacceptable alternative to administering TEXAMS of equal difficulty.
THE XTS DEATH BLOW TO CHEATING
Examination cheating permeates our schools, our testing corporations, and our society. Thankfully, of the seven cheating varieties discussed above, XTS pretty much eradicates six — During-Exam Cheating, Preview Cheating, Recycled-Exam Cheating, Homework Cheating, Teacher Cheating on Behalf of Teachers, and Exam Creator Cheating — and XTS markedly ratchets up the labor cost of Teacher Cheating on Behalf of Students.
It is impossible to improve, or even to maintain, an efficient and fair educational system without being able to make geographical and historical comparisons, which are sometimes referred to as benchmark comparisons.
Under the rubric of geographical comparisons can be grouped all more-or-less simultaneous comparisons between groups of students. Useful information is gleaned from comparing the academic performance of nations, of states or provinces within a nation, of cities and towns and rural areas within states or provinces, or of individual schools. Comparing the performance of one classroom to another can be considered a geographical comparison where all the groups lie within a few hundred feet of each other.
Under the rubric of historical comparisons fall the questions of whether students in a single area are doing better or worse than they were a month ago, or a year ago, or ten years ago. A graph showing annual scores over many consecutive years might reveal the effect of innovations in teaching methodology that had been introduced over that interval.
And yet what happens when such benchmark comparisons are attempted? Researchers attempting to study geographical differences may find data indicating that geographical differences don't exist:
STANDARDIZED TEST SCORES: VOODOO STATISTICS?
By EDWARD B. FISKE
Published: February 17, 1988
Garrison Keillor, the folk humorist, has provoked many a smile by his description of the mythical town of Lake Wobegon as a place where the women are strong, the men good looking and "all the children are above average."
But looking closely at the results of the estimated 50 million standardized achievement tests taken by American schoolchildren every year, it seems that such fantasies are no longer a laughing matter.
For several years virtually every state education department, and even the most urbanized local school districts, have released standardized test scores showing that their children are reading, writing and calculating above the national average. Since this by definition is impossible, test makers and educators have been accused of playing statistical or educational shell games. [...]
And researchers looking for historical differences may find that their data has been manipulated:
New York schools have failed the test
Kevin Donnelly Sydney Morning Herald November 12, 2010
[...] The suspicion that students achieved better results because the local tests had been made easier to pass was confirmed by a recent independent report, commissioned by the New York Board of Regents and carried out by Professor Daniel Koretz of Harvard University. [...]
As Marc Epstein concluded in an analysis published in New York's City Journal, "The feel-good story of rising student test scores over the last several years is largely an illusion produced by dumbed-down tests. [...]"
What is going wrong? Looking back at the "Parents irked over no-peek test tradition" article above, which alludes several times to the desirability of benchmark comparisons, we see the problem immediately — exam recycling, and of the worst kind, in which the fact of recycling is publicly announced, and in which several security breaches are disclosed, seemingly without awareness that the disclosures inform parents how they can preview the exams for the purpose of boosting their children's scores.
In the New York example above, as the test is progressively "dumbed down", which is to say simplified, it is not a case of the same exam being recycled, but rather of alternative exams being substituted, in violation of the principle that alternative versions of an exam can be assumed equal in difficulty only if they are created by means of TS randomization; alternative versions created in any other way come with no guarantee of equal difficulty and should not have been used.
It would be bad enough if test creators were trying with all their might to create examinations of equal difficulty and inadvertently failing, but the reality may be worse. The reality may be that test creators know that teachers and politicians want the public to trust that education is succeeding, and therefore deliberately create dumbed-down exams and openly boast of their easiness:
Exam boards investigation: boasting about easy tests 'undermines exam system'
By Claire Newell, Holly Watt, Robert Winnett and Graeme Paton The Telegraph
11:00AM GMT 09 Dec 2011
Revelation that exam boards are actively boasting about how easy their exams are risks undermining pupils' hard work and parents' confidence, Stephen Twigg has claimed. [...]
A GCSE test set by Edexcel, one of Britain's biggest exam boards, has so little content that its chief examiner cannot believe it was approved by the Government's official regulators, a Daily Telegraph investigation has found.
Steph Warren, a senior official at Edexcel, told an undercover reporter posing as a teacher who was considering using the firm's tests that "you don't have to teach a lot" and that there is a "lot less" for pupils to learn than with rival courses.
Miss Warren, who sets geography exams for tens of thousands of teenagers, said she did not know "how we [Edexcel] got it through" the official regulation system that is supposed to ensure high standards in GCSEs and A-Levels. [...]
The disclosures — the latest from a Telegraph investigation into exam standards — will add to concerns that exam boards are driving down standards by aggressively competing with one another to persuade schools to take their tests.
Another Edexcel official also boasted to an undercover reporter about the ease of the exam board's coursework. "So weak kids, you can get them through on anything really. Frankly," Jennifer Smith, Edexcel's A-Level English moderator said. [...]
Today's disclosures of the covertly recorded remarks made by the Edexcel examiners are the first indications that exam boards are actively boasting about the ease of their courses in an apparent attempt to try to secure valuable business.
When the reporters asked the officials leading the course why they should pick Edexcel, they were told that the assessments were less demanding.
During an Edexcel geography GCSE training course in Birmingham, the reporter asked Miss Warren why she should pick the exam board.
Miss Warren replied, "It's very, very traditional, Edexcel, and also, as these two will tell you [indicating to two teachers sitting nearby] you don't have to teach a lot, do you?"
"No, there's certainly a lot less content," one teacher then said.
Miss Warren added: "Yes, in fact there's so little we don't know how we got it through [the exam regulators]. And I'm deadly serious about that. When I looked at it I thought, 'how is this ever going to get through?'"
As the teachers around Miss Warren agreed with her assessment of the Edexcel exam, she concluded: "It's a lot less, it's a lot smaller, and that's why a lot of people came to us." [...]
Although any shift in the difficulty of examinations should be disallowed because it obscures the educator's view of how well students are doing compared to students at other times or in other places, it is only the shift toward greater difficulty that is likely to elicit protest:
Schools lose appeal in GCSE case
By Helen Warrell Public Policy Correspondent
February 13, 2013 11:49 am
An alliance of pupils, schools and local councils has lost its legal appeal against GCSE exam boards whom it accuses of unfairly pushing up grade boundaries in English modules last summer.
The group was seeking a judicial review of the decision to move grade boundaries by large margins a few weeks before results were issued in August. As a result, pupils who sat GCSE modules in June received lower grades than if they had taken them in January.
The group suggested this amounted to "illegitimate grade manipulation" and "a statistical fix" by AQA, the largest issuer of GCSEs, Edexcel and Ofqual, the regulator. Edexcel is owned by Pearson, the company that also owns the FT. Lawyers for the group have argued that approximately 10,000 pupils who sat exams in June last year missed out on a C grade in English as a result of the exam boards' decisions.
However, the High Court dismissed the challenge. [...]
But Christine Blower, general secretary of the National Union of Teachers, said a "great injustice" had been done. Ms Blower pointed out that the Welsh government had ordered a regrading of the pupils affected. "It is profoundly unfair that Michael Gove and the High Court have taken a different view," she said. "Parents, pupils and teachers will feel very let down."
The essential characteristics of valid benchmark comparison are those that come built into XTS testing — that the examinations be of equal difficulty, that they be uncontaminated by cheating or other unfair advantage or handicap, and that examinees have full access to relevant practice materials. Both examination recycling and the creation of alternative versions by any method other than TS randomization produce data that cannot be relied upon, because the data lack one or more of these characteristics.
WHAT IS TAUGHT DIFFERS FROM PLACE TO PLACE AND FROM TIME TO TIME
One side issue relating to benchmark comparisons is that they are made problematic by course content varying from place to place, which interferes with geographical comparisons, and by course content changing over time, which interferes with historical comparisons. For many core K-12 subjects, however, a standard syllabus can be adopted not only throughout an entire country, but over the entire world, simply because there exists broad agreement concerning what needs to be learned. It might be noted also that for many core K-12 subjects, what needs to be learned changes hardly at all over even half a century. For example, the area problems K-12 students solve today do not differ from the ones they solved fifty years ago, and possibly differ little even from the ones they solved a century ago. And even rapidly-evolving subjects such as genetics and computer science may need to have their K-12 syllabus tweaked only a little over any five-year interval.
The long and the short of it is that San Francisco, for example, does not need a mathematics or physics or chemistry K-12 curriculum different from New York's or Ottawa's or London's or Berlin's or Tokyo's or Beijing's. If there are differences in what is taught in these places, they are accidental and superficial differences, not differences that anyone will defend as valuable to preserve. In any case, as one of the characteristics of XTS methodology is that students learn a great deal more than in conventional schooling, setting a world XTS standard would incline to incorporate content that is now only rarely taught, and therefore to teach it everywhere.
HOSTILITY TO STANDARDIZED TESTING
Although benchmark comparisons are assumed above to be indispensable, teachers in fact sometimes greet their threatened introduction with heated opposition:
Teachers vote on whether to boycott standardized tests
BY JANET STEFFENHAGEN Vancouver Sun 11 Dec 2008
EDUCATION | Teachers are making a crucial decision this week about whether to back their union's plan for a boycott of standardized tests in reading, writing and math early next year.
The B.C. Teachers' Federation is seeking member support in a province-wide vote on a union proposal that would see teachers refuse to prepare for, administer or mark the Foundation Skills Assessment (FSA) unless the Education Minister agrees to test only a random sample of students. [...]
One puzzling incongruity that appears in the above quote needs to be disposed of at the outset — teachers objecting to the testing of all students while recommending the testing of only a random sample of students. A truly random sample, however, will have the same mean achievement as the population from which it is drawn, and so the conclusions and implications and ramifications will be the same whether a random sample or the entire population is tested; the teachers would therefore seem to have no reason to welcome random-sample testing while opposing universal testing. The only explanation of this incongruity that suggests itself is not one that is flattering to the BC Teachers' Federation — the explanation that teachers intend to select for testing a group of superior students, whom they hope to be able to pass off as a random sample of students.
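The sampling point is easy to check with a sketch in Python. The population of scores below is invented for illustration; nothing here models the FSA itself:

```python
import random
import statistics

def sample_vs_population(scores, sample_size, seed=1):
    """Return (sample mean, population mean) for a random sample of scores."""
    rng = random.Random(seed)
    sample = rng.sample(scores, sample_size)
    return statistics.mean(sample), statistics.mean(scores)

# An invented population of 10,000 student scores (mean about 60, sd about 15).
pop_rng = random.Random(0)
population = [pop_rng.gauss(60, 15) for _ in range(10_000)]

sample_mean, pop_mean = sample_vs_population(population, 500)
# The sample mean lands close to the population mean, so conclusions drawn
# from a genuinely random sample mirror conclusions drawn from everyone.
```

The qualifier "genuinely random" is the whole issue: a hand-picked group of superior students would not reproduce the population mean, which is why the sample's selection would need to be audited.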
But it is the teachers' opposition to standardized testing that is relevant to our discussion, and here it is necessary to point out, and to emphasize, that teachers cannot possibly be against standardized testing. What is standardized testing, after all, but testing that is fair, that gives no student an illicit advantage and burdens him with no gratuitous handicap? Thus, standardized testing requires that all students be given tests of equal difficulty, and surely teachers do not advocate that students be given tests of unequal difficulty. And standardized testing requires that all students be given the same amount of time to complete the test, and surely teachers do not advocate that students be given different amounts of time. And standardized testing requires that all students be provided with the same undistracting environment while taking the test, and surely teachers do not advocate some students taking the test in a commotion-free room while others take it in a bustling foyer. And standardized testing requires that all students suffer an equal absence of assistance in answering exam questions, and surely teachers do not favor a few students having private tutors sitting beside them during the exam.
From considerations such as the above, it must be concluded that teachers do not really oppose standardized testing; they oppose something else, and mistakenly call that something else "standardized testing". And what might that something else be? It might be multiple-choice testing that they don't like, and if that is the case, they should say so, and not confuse the issue by misnaming the cause of their antipathy. And yes, there is something to be said against multiple-choice testing, but there is also something to be said for it — which is that it permits exams to be marked with high reliability, at high speed, and at low cost. But this is an issue unrelated to standardized testing, as multiple-choice testing can be unstandardized (any exam becomes unstandardized when some examinees cheat more than others), and as standardized testing does not have to rely on multiple choice (XTS examinations avoid it).
But to get to the bottom of the mystery without further ado, what the teachers really seem to object to is not standardized testing, it is the misinterpretation of standardized-test results, it is the drawing of simple-minded conclusions which the test results do not support:
What the teachers hate most about the tests is that results are used by the Fraser Institute to rank schools. They insist the FSA says little or nothing about a school's performance [...].
And so, more particularly, what teachers seem to fear is that they will be blamed for poor performance caused not by their teaching but by the students' socioeconomic environment, over which the teachers have no control — and on this issue the teachers are entirely in the right. Politicians are likely to misconstrue standardized test results, and to scatter praise and censure, reward and punishment, where they are undeserved; but for our purposes this is a separate issue, and the answer to it is not to ban standardized testing. The standardized testing must go ahead, but at the same time everybody must be educated in the proper interpretation of its results — everybody including the politicians and the public and the teachers themselves, who are, after all, laymen in the field of data interpretation.
SPECIALLY-CONSTRUCTED BENCHMARK EXAMINATIONS SHOULD NOT BE NECESSARY
The "no-peek" article above can be seen to distinguish between two types of examinations — those that permit benchmark comparisons and those that don't:
In some cases, the district does not allow teachers to send graded tests home with students. Examples are Connected Math, taught in grades 6-8, and Integrated Math 1 and 2, taught to junior high or early high school students, in which data are collected for benchmark assessments.
However, under XTS methodology there is no need to go to the trouble and cost of devising special benchmark-friendly examinations, as all TEXAMS are amenable to comparisons over any geographical distance and over any span of time.
SIMULTANEOUS TESTING
Testing in the conventional school is simultaneous, which for many students proves unfair. A student may fall ill just before his exam. Or a family emergency may demand his attention, subtract from his study time, and leave him upset and unable to concentrate. Or he may have been handed an exam schedule which piles up difficult exams within a narrow time frame, leaving him insufficient time to prepare. Perhaps he has needed to earn money while in school, not just for his own support but for the support of his family, which has caused him to fall behind in his studies, though it would take him no more than a few days to catch up. Or perhaps the night before the exam there has been a loud party next door which went on into the wee hours, resulting in loss of sleep and a keying up of tension.
If such students were allowed to defer their exam by even a few days, they would be taking it under the favorable circumstances enjoyed by most of their classmates — but by and large they can't take the exam later: not the same exam, because the teachers would rightly fear the possibility of recycled-exam preview, and not an alternative exam, because of the labor of creating one, along with the probability that the alternative would be of unequal difficulty, or would be complained of as having been of unequal difficulty.
However, we have already seen that TS does make delayed testing feasible, simply because fresh TEXAMS can be effortlessly run off, because their uniqueness makes preview cheating impossible, and because they come with a guarantee of equal difficulty. XTS teaching methodology, then, renders simultaneous testing unnecessary, and in so doing takes a substantial step toward higher examination standardization. Yes, allowing students to be tested at different times can be considered a step toward heightened standardization: whereas conventional schooling views simultaneous testing as heightening standardization, we have seen that for many students it does the opposite — it burdens them with a handicap. By allowing a student to adjust his examination timetable, greater equality may be attained — the equality of good health, of adequate spacing between exams, of having enjoyed a reasonable night's sleep.
The following exemplifies the commonplace observation of student heterogeneity:
When my first-grade teacher discovered that I could understand fifth-grade math, Umma bought workbooks from a store in Koreatown so that I could practice my decimals. On weekends, she took me to Flushing to attend hagwan, a Korean academic cram camp. [...] As I grew older, Umma complained whenever I didn't finish my hagwan exercises or neglected piano practice.
Victor Zapana, Shaken: A mother's conviction. A son's doubts, The New Yorker, 26 November 2012, pp. 32-39, p. 33.
A first-grader doing grade 5 math is a four-year discrepancy, but when that is followed by a workbook on decimals, and also by weekends at an "academic cram camp", and also by hagwan homework, and also by piano lessons — the gap between the student and his peers could only have widened.
Achievement discrepancies such as the above are not merely occasional observations, they have been identified by researchers as one of the most pervasive problems bedevilling education:
By the time children complete the fourth grade, the range in readiness to learn (as suggested by the M.A. [Mental Age]) and in most areas of achievement is approximately the same as the number designating the grade level. In other words, pupils in the fourth grade differ by as much as four years in mental age and achievement; in the fifth, by five years; in the sixth, by six years. In reality, then, the grade level designation means little. A fourth-grade teacher who troubles to look back of the grade-level label realizes that, to be honest with himself and his pupils, he really must teach grades one through six or higher.
Goodlad, John I., & Anderson, Robert H. The non-graded elementary school (revised edition), Teachers College Press, New York, 1987, pp. 13-14. Italics are in the original.
Whether any teacher has it within his means to teach "grades one through six or higher" in his nominally fourth-grade class can be doubted. What he is more likely to do is to aim his teaching at the median, thereby wasting the potential of those who are substantially above the median, while at the same time leaving behind those who are substantially below.
But what can be the solution? The call has gone out for some sort of individually-paced learning, otherwise known as multi-tracking:
Classes must be divided according to ability. The gifted and energetic student must not be condemned to a scholastic prison of tedium because of the background deficiencies of the disadvantaged and less talented. Nor should the disadvantaged and disturbed be condemned to a scholastic prison of unspeakable tortures of constant discouragement, frustration, and final alienation from school and society.
Lucius F. Cervantes, The Dropout: Causes and Cures, University of Michigan Press, Ann Arbor, 1965, p. 208.
Cervantes has certainly identified the problem, and commendably recognized the need for individual pacing, but if his reference to "classes" has in mind classrooms composed of students of equal ability — that will not work. Goodlad and Anderson report that after fifth-graders in one school had their high-IQ students removed to a "gifted" class, and their low-IQ students removed to a "retarded" class, the remaining, one-would-imagine-homogeneous, fifth graders still remained heterogeneous enough to exhibit, for example, more than an eight-grade difference in "paragraph meaning and language" (p. 18). Goodlad and Anderson's review of the literature leads them to conclude that sorting students by performance on any particular measure does not produce groups homogeneous enough to be taught all subjects as if they were at the same level:
The troublesome and yet very significant point, however, is that pupils advanced or retarded in one learning area are not necessarily similarly advanced or retarded in other areas. Pupils at twelfth-grade standards for reading might well be at tenth-, ninth-, and seventh-grade levels for other subjects.
Goodlad, John I., & Anderson, Robert H. The non-graded elementary school (revised edition), Teachers College Press, New York, 1987, p. 25. Italics are in the original.
The answer would seem to be to keep students with their peers, but nevertheless enable each student to practice on materials keyed to his own level of achievement, and to write his own exams which correspond to that level of achievement, something that is utterly impossible in the conventional classroom, but made feasible by XTS. We have seen under the heading SIMULTANEOUS TESTING just above that XTS makes it possible for some students to choose for themselves whether they will take any TEXAM a few days earlier or later, and we recognize now that XTS also makes it possible for all students to choose for themselves whether they will take any TEXAM months or years earlier or later, or in other words, choose to take TEXAMS whenever it happens to suit their individual needs.
And so we see now that the XTS regimen makes it possible to implement full-scale individual pacing in a broad range of subjects and at all levels of K-12 and even in university. Of course individual pacing is likely to strike some readers as acknowledging the inevitability of, and legitimizing, vast individual differences. The reality, however, may be quite different from this impression. The reality may be that conventional schooling is already plagued by student heterogeneity, and widens gaps by neglecting laggards, in effect seating them at the back of the classroom where they will be easier to ignore, and thereby ratifying their withdrawal from participation, and so leaving them permanently behind. XTS, in contrast, is able to keep feeding every student practice materials and exams at whatever level the student is able to comfortably work, both the student way behind and the student way ahead, so that none are abandoned to passivity and stagnation.
The section immediately above may be said to have addressed the question of the age at which students take various tests, the answer being that the level of the test should be pegged not to age but to achievement — in other words, that students take exams whenever they are ready to. The present section concerns how often students take tests, and it proposes that, in conventional schooling, the answer is: not often enough.
EXAMINATIONS EVERY TWO MONTHS, OR EVERY DAY?
Consider two extremes. At the infrequent-testing extreme, a one-semester (say September to December) course may require a mid-term and a final exam — one test every two months. And at the other extreme ... what? At the other extreme might be daily testing! Why so often? First, it's free, as TS software happily spits out PRACTICE and EVALUATION TEXAMS without stint, grades them as well, and explains what the student does not understand. And as PRACTICE and EVALUATION TEXAMS are indistinguishable, a part of every practice session can readily be converted into evaluation. Although PRACTICE and EVALUATION TEXAMS are the same, the way the student responds to them may differ. During PRACTICE, he can pause to ponder, weigh alternative solution strategies, and enter into discussions with his teacher or his classmates. And in the course of such practice there will arrive the recognition that he fully understands the material he is working on, and has gotten enough practice on it, to now switch to EVALUATION mode — for which he will receive academic credit.
The new idea that is being introduced here is that efficient education requires not only more frequent evaluation, but daily evaluation, and more specifically, it requires the daily earning of Personal Bests. That is, in most subjects of study, the XTS student is expected to earn a daily Personal Best, which can be done in either of two ways: (1) take an EVALUATION TEXAM at the same level of difficulty as last time, but complete the test faster, or (2) complete an EVALUATION TEXAM at the next-higher level of difficulty, or several steps higher, no matter how slowly.
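The two rules can be sketched as a small function. The record format and field names below are my own assumptions for illustration; no published XTS specification defines such a data structure:

```python
from dataclasses import dataclass

@dataclass
class Attempt:
    level: int        # difficulty level of the EVALUATION TEXAM
    seconds: float    # time taken to complete it

def is_personal_best(best: Attempt, new: Attempt) -> bool:
    """A new attempt earns a Personal Best by either rule:
    (1) same level as the previous best, but completed faster, or
    (2) any higher level, regardless of completion time."""
    if new.level > best.level:
        return True
    return new.level == best.level and new.seconds < best.seconds

best = Attempt(level=12, seconds=300)
assert is_personal_best(best, Attempt(level=12, seconds=280))   # rule 1: faster
assert is_personal_best(best, Attempt(level=15, seconds=900))   # rule 2: higher level
assert not is_personal_best(best, Attempt(level=12, seconds=300))  # no improvement
```

Note that rule 2 places no speed requirement at all on the higher level, which is what lets a student leap several levels in one sitting without penalty.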
One benefit of the Personal Best method is that in the early stages of mastering a particular level of difficulty, the student finds it easiest to earn Personal Bests simply by beating his previous time — performing the same task as before, but more fluently, and thereby securing a more solid foundation for future work at a higher level. This contrasts with conventional schooling, in which barely passing a test, or even failing it outright, and thereby acquiring no fluency and building no foundation, is nevertheless followed by new material, which therefore proves confusing and intimidating. Could anything be more counterproductive, more likely to occasion insecurity and foreboding of failure, more likely to inculcate an aversion to the subject being taught? The best thing for a student who performs haltingly and blunderingly at a given level is to keep plugging away at that level until he can do it smoothly and faultlessly; this is what he will choose to do under the obligation to produce Personal Bests, and what will most securely promote his mastery of the subject.
And progress remaining almost imperceptibly gradual is not a requirement of the Personal Best method, but only an option. Sometimes, a student will beat his old Personal Best speed by a wide margin, and sometimes he will at one sitting advance the level he is working on not merely by the minimal step that is available, but by two or three steps, or by ten.
Every day, then, the student is expected to show demonstrable advancement in every one of his subjects of study. However small that daily advancement is, it accumulates relentlessly: even after only a single week it reveals that a highly unusual amount of learning is taking place, and after a few years it produces the effects described below in the section titled CONVENTIONAL LEARNING IS MEAGER.
HOW LONG BEFORE SHE CAN STUDY LAW?
Imagine a young lady who on 10 Nov 2012 is seized by enthusiasm to go to Harvard Law School and become a lawyer, and realizes she first needs to take the LSAT, but discovers that the deadline for registering for the upcoming LSAT was 09 Nov 2012 — yesterday! So she's going to have to wait until February to take the LSAT, which means she'll have missed Harvard's 01 Feb 2013 application deadline for beginning her law studies in September 2013. If all goes well, the soonest she will actually begin studying law at Harvard is September 2014, which is 22 months away. Will she be able to maintain her enthusiasm for almost two years, or will temptations and distractions come along and deflect her from her goal? Probably the latter.
The monster of infrequent testing has reared its ugly head again. The LSAT should be available not just four times a year, as is presently the case, but every working day of the year in examination centers, with PRACTICE versions indistinguishable from EVALUATION versions made available in unlimited supply as well. And the same for examinations in every subject. It is insufferable that a society whose survival and growth depend on a highly-educated work force, that claims to value learning, and yet that ranks low on international comparisons of student achievement, is at the same time a society that can tell an energized youngster to cool her heels for 22 months before she can even begin the study whose prospect excites her.
SHOULD UNIVERSITIES BE ALLOWED TO BECOME TEMPORARY GHOST TOWNS?
A detail concerning what has been said just above about hiatuses in personal development:
IN LATE AUGUST, as the first leaves changed from green to red and gold, university ghost towns were coming back to life. Residences were dusted out. Classrooms were readied. Textbooks were purchased — and new outfits, new computers, new posters to decorate dorm room walls. [...]
Mental Health. The Broken Generation: Why so many of our best and brightest students report feeling hopeless, depressed, even suicidal. Kate Lunau reports on the crisis on campus. Maclean's, 2012-Sep-10, pp. 54-58, p. 55.
Generally speaking, denying students the opportunity to continue their educational advancement three months out of every year slows intellectual development, and thereby lowers the ceiling on what will ultimately be achieved. And constructing and maintaining buildings which go largely unused three months a year is wasteful. This is not to say that students should be required to study through the summer, but only that they should be given the choice.
In conventional schooling, continuously-available education might be impracticable; XTS methodology makes it practicable. If PRACTICE and EVALUATION TEXAMS are always available, learning can progress not only at any speed, but in all seasons.
Although scholarly activities are sometimes competitive, it may be wondered why that competition has not been formalized into a sport.
Competition is natural and spontaneous. Once there were boats, people would race them. Given fists, people will box. Kids contrive all sorts of ad-hoc "sports" based on the shape of their garage roof or the length of the hallway in their apartment or the number of manhole covers on the street outside.
We call these ad-hoc contests games, and one of the ways games get turned into sports is standardization. A sport is a game that, no matter where it takes place, is always played by the same rules. This is not a prerequisite for competition. When someone says, "I'll race you to that tree," the person who reaches the tree first is the winner. The loser does not complain that the race was not run at regulation distance. Regulation and standardization insure that, when competitors meet, everyone will have trained for the same event, and the results can be compared across competitions. Standardization is what makes it possible to have world records.
Louis Menand, Glory days: What we watch when we watch the Olympics, New Yorker, 06 Aug 2012, pp. 64-72, p. 70.
We notice, do we not, that there exists no world record for solving area problems mentally, and that no gold medals are awarded for doing so. The chief reason may be that no area-solving challenge has ever been standardized. Yet the TS does allow such a standardization, one which could eventually be accepted by a committee of mathematicians as an international standard, in the same way that the 100-meter dash, say, has become accepted as an international standard. The TS makes it easy to present a problem-solving challenge at any time and at any place, while guaranteeing that the challenge is unique and yet of constant difficulty, in the same way that every 100-meter dash that is run is unique and yet of constant difficulty.
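One way to picture a challenge that is unique yet of constant difficulty is to draw problem parameters at random from a band fixed by the difficulty level. The particular level-to-band mapping below is invented for illustration and stands in for whatever calibration the TS actually uses:

```python
import random

def area_problem(level: int, rng: random.Random):
    """Generate a fresh rectangle-area problem whose operand sizes,
    and hence difficulty, are fixed by the level rather than by chance."""
    lo, hi = 2 * level, 3 * level + 2   # invented difficulty band for this level
    width = rng.randint(lo, hi)
    height = rng.randint(lo, hi)
    return f"What is the area of a {width} x {height} rectangle?", width * height

rng = random.Random(42)
question, answer = area_problem(level=5, rng=rng)
# Every call yields a different problem, but all level-5 problems draw their
# side lengths from the same band, so average difficulty stays constant --
# unique, like each running of a 100-meter dash, yet run on the same track.
```

Because the generator is seeded, the same challenge can also be reproduced exactly for auditing, while an unseeded generator yields a fresh, unpreviewable problem every time.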
[Cartoon: The Afternoon of the Big Game, by Gluyas Williams]
But why bother encouraging academics to be practiced as sports? Because it would bring to academics the same payoff that sport has brought to athletics. Today's Olympics, for example, hold up ideals of physical fitness, and motivate the world to strive toward them; tomorrow's Academic Olympics could hold up ideals of mental fitness, and would motivate the world to strive toward those. All that is needed to make this happen is the realization that the thinking of Pierre de Coubertin, described below, can be applied to Mathematics and Physics and Chemistry and many other academic activities just as it has been applied to running and jumping and swimming and many other athletic activities:
It was from Victorian Britain that the founder of the modern Olympics, Pierre de Coubertin, took his inspiration. Coubertin was a French aristocrat with a passionate interest in education. He wanted the French to be more manly, meaning more disciplined and self-reliant (he was reacting partly to the country's defeat, in 1870, in the Franco-Prussian War), and he believed that introducing sports into education could be the basis for this transformation in the national character. In other words, he wanted the French to be more like the British. [...]
Around 1890, he heard about an annual event in the town of Much Wenlock, in Shropshire, called the Wenlock Olympian Games. These had been established, in 1850, by a physician named William Penny Brookes, as a means of fortifying British manhood. [...]
When Coubertin returned to France, he published an article about the Wenlock Games. If the Olympic idea still survives, he wrote, "it is due not to a Hellene but to Dr. W. P. Brookes." Britain's culture of sports, he explained, is the reason for its empire, and he quoted from a speech that Brookes had delivered: "If the time should ever come when the youth of this country once again abandons the fortifying exercises of the gymnasium, the manly games, the outdoor sports that give health and life, in favor of effeminate and pacific amusements, know that that will mean the end of freedom, influence, strength, and prosperity for the whole empire."
Louis Menand, Glory days: What we watch when we watch the Olympics, New Yorker, 06 Aug 2012, pp. 64-72, p. 71-72.
And so one thing that is lacking in academic education is the application of the motivational devices employed in athletics, with the result that Algebra is considered by many to be more boring than it needs to be, and Hockey more exciting than it otherwise would be. But imagine Hockey being played with nobody keeping score — with Brockville High School never whipping Scottsdale, with Harvard never beating Yale, with the US never triumphing over Russia. Would Hockey acquire a larger or a smaller audience as a result? If Hockey scores were never kept, would interest in Hockey plunge to the level of today's interest in Algebra? And is it conceivable that, if the magical effect of keeping score in competitions were transferred to Algebra, some excitement and public interest would shift to Algebra as well?
Perhaps the day will arrive when intellectual excellence is encouraged using the same powerful device already employed to encourage physical excellence, and an Academic Olympic Games will be staged in parallel with the Physical Olympic Games, and at which time the standardization which will make such Academic Olympic Games possible will be delivered by means of the Transformational Syllabus.
The US in particular might be disappointed in its ranking among 74 countries participating in the 2009 Programme for International Student Assessment (PISA), from which a sampling of countries is shown below. A few countries other than the US might also find small cause for pride:
And it has frequently been complained of that students are not learning nearly as much as our society needs and expects (the international rankings cited below are apparently from a different study than the above, though the results are equally disappointing):
Bill Gates has said that the public-school system is "obsolete," and no longer produces enough technically qualified workers to allow America to compete internationally. As has been widely reported, in 2009 Americans scored seventeenth in science and twenty-fifth in math among students from thirty-four advanced industrial countries. A recent task force on education assembled by the Council on Foreign Relations, and headed by Condoleezza Rice and the former New York City schools chancellor Joel Klein, concluded that weak American performance amounted to a "grave national-security threat."
David Denby, Annals of Education, Public Defender: Diane Ravitch takes on a movement, The New Yorker, 19 November 2012, pp. 66-75, p. 68.
In the US, possibly very few seven-year-olds would be able to calculate the Level 1 areas shown in the Alice-to-Irma set of TEXAMS above. Perhaps if almost all U.S. seven-year-olds were able to compute these Level 1 areas, and able to show comparable advancement in subjects other than mathematics, they might come out on top in the international rankings. The claim made on behalf of the Transformational Syllabus, however, goes far beyond this. The claim is that with the help of the Transformational Syllabus, almost all American and Canadian seven-year-olds can learn to calculate all Level 01 through 27 area problems shown in the Alice-to-Irma TEXAM set. And they can learn to do this mentally, which is to say without the assistance of a calculator or even pencil and paper. And they can learn to do it at about the rate of 20 seconds per problem for problems of Level 27 difficulty.
See for yourself how at the age of seven, Marko begins to acquire fluency in identifying prime factors and calculating angles, and by age 10 passes a university exam on differential calculus for Arts students (UBC Math 140, 74%), and by age 11 has done well on integral calculus for Arts students (UBC Math 141, 85%), and by age 12 has polished off differential calculus for Science students (UBC Math 100, 85%) and has also aced integral calculus for Science students (UBC Math 101, 95%), and has done not a few other things as well, as is summarized toward the bottom of the TwelveByTwelve pilot study report.
In contrast, the average ten-year-old in the US is not polishing off his first university calculus course; he is more likely struggling with multiplication and division, as is documented on the DECLINE OF MATH EDUCATION IN AMERICA page. The educators who are responsible for this meagerness of achievement argue that their students are being taught something like creative problem solving, but it can be doubted that the educators will ever be able to propose any test on which their students outperform students taught by the TwelveByTwelve approach.
The difference between conventional schooling and TwelveByTwelve is immense. The methodology illustrated in the pilot project did not have the benefit of XTS technology, and therefore may be expected to produce even more spectacular results now that XTS technology is available to lend a hand. Any country wishing to place nearer the head of the PISA list above might consider whether XTS would not be among the measures which could provide the needed boost.
CREATING COMPLEX EXAMS MAY TAKE MORE TIME THAN TEACHERS CAN SPARE
Computerizing exam creation makes it possible to create exams relying on graphics so intricate, like those in the Alice-to-Irma exam above, that it is inconceivable any teacher would find the time to create them by hand. Even creating a single column of that exam, say Alice's column, would be unfeasible. Another benefit of TS, then, is that it enables the teacher to create both PRACTICE and EVALUATION TEXAMS incorporating unusually complex graphics, so that certain problem types, like the ones shown in Alice-to-Irma, can for the first time be embodied in unlimited numbers of PRACTICE and EVALUATION TEXAMS, and so for the first time become efficiently teachable. Another way of putting it is that prior to TS technology, exam quality could not exceed a low ceiling because of the inefficiency of hand production.
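A minimal sketch may help convey what computerized, parametric exam generation looks like. The question template, the number ranges, and the seeding scheme below are all invented for illustration; they are not the actual TS design.

```python
import random

def area_question(student_id, question_no):
    """Generate one hypothetical area problem, reproducibly, per student.

    Seeding a private generator from the student and question numbers means
    the same paper can be regenerated later (e.g. for re-marking), yet each
    student receives numerically distinct items.
    """
    rng = random.Random(student_id * 10_000 + question_no)  # deterministic seed
    w = rng.randint(3, 12)   # hypothetical difficulty range for the width
    h = rng.randint(3, 12)   # and for the height
    question = f"A rectangle is {w} cm wide and {h} cm tall. What is its area?"
    answer = w * h
    return question, answer

# Structurally identical items, numerically distinct papers:
q1, a1 = area_question(student_id=1, question_no=1)
q2, a2 = area_question(student_id=2, question_no=1)
print(q1)
print(q2)
```

Because every paper is derived from its own seed, seeing one student's questions tells an onlooker nothing useful about another student's paper, which is the property the PREVIEW CHEATING discussion above relies on.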
ANY CLOSED SYSTEM IS A BREEDING GROUND FOR MISTAKES
Today those who take and use many tests have less consumer protection than those who buy a toy, a toaster, or a plane ticket. Rarely is an important test or its use subject to formal, systematic, independent professional scrutiny or audit. Civil servants who contract to have a test built, or who purchase commercial tests in education, have only the testing companies' assurances that their product is technically sound and appropriate for its stated purpose. Further, those who have no choice but to take a particular test — often having to pay to take it — have inadequate protection against either a faulty instrument or the misuse of a well-constructed one. Although the American Psychological Association, the American Educational Research Association, and the National Council for Measurement in Education have formulated professional standards for test development and use in education and employment, they lack any effective [...]
National Commission on Testing and Public Policy (1991, p. 21), quoted by Kathleen Rhoades & George Madaus, Errors in Standardized Tests: A Systematic Problem, National Board on Educational Testing and Public Policy, Lynch School of Education, Boston College, May 2003, p. 8
AND SO EXAMINATION CREATORS DO MAKE MISTAKES
The [Government's examination] watchdog [the Qualifications and Curriculum Authority or QCA] also said in the report that many students were unable to answer questions because of typing and factual errors in exam scripts. All three of England's privately run boards made mistakes, although errors were down from 36 in 2006 to 26 last year.
By Graeme Paton, Education Editor, 4,000 pupils caught cheating in exams, The Telegraph (UK), 12-Feb-2008.
Appealing a Test Score
By ALAN FINDER April 20, 2008 www.nytimes.com/~
[...] From 1,000 to 2,000 students appeal their SAT scores each year, says Laurence Bunin, senior vice president for operations at the College Board. Most involve scoring sheet mishaps, but on average, 370 students a year dispute a single question. Over at ACT Inc., Sherri Miller, assistant vice president for development, says 40 to 180 students a year insist that a question is questionable. [...]
And the College Board has thrown out six questions in the last three years. Three had misprints — a wrong word underlined, missing math sign, smudge on a word — and three times students convinced the board that questions were flawed. One, in October 2005, asked test takers to identify a mistake in this sentence:
Every spring in rural Vermont, the (A) sound of sap (B) dripping into galvanized metal buckets (C) signal the beginning of the traditional season (D) for gathering maple syrup. (E) No error.
The board's answer, (C), suggests changing the verb ("signal") to agree with the subject ("sound"). But it was not the only answer, one student argued. You could also change the subject to "sounds," making (A) the correct answer. [...]
Below is an example of a single mistake made by the exam manufacturer, Edexcel, on a math exam, along with a description of the disruption that mistake created, and not only for the students writing that math exam, but also for geography students writing their exam in the same hall. Also of interest is the mention of the eight-hour lag between Hong Kong and London examinees writing exactly the same exam, which gives the London examinees time to obtain exam preview information from Hong Kong acquaintances, a topic discussed above under the heading PREVIEW CHEATING, a type of cheating rendered obsolete by TS.
Sixth formers say they were distressed by the error
Examiners knew about maths error
By BBC News Online's Gary Eason 21 January, 2002 news.bbc.co.uk/~
Exam board Edexcel knew that one of its maths AS-Level papers had a mistake in it but decided not to tell test centres.
BBC News Online has learnt that the error was noticed first at a school in Hong Kong, where candidates sat last Friday's Decision Mathematics (D1) paper eight hours ahead of those in the UK.
It is understood that the school alerted Edexcel, which received an e-mail sometime after 0800 in London.
The exam was due to be taken by about 2,500 students in 300 centres in England, Wales, Scotland and Northern Ireland at 1330 that afternoon.
The decision was taken not to try to tell exam centres, however.
Amit Bharath said his calculations kept going wrong
Edexcel has promised candidates that they will be given special consideration and will not be disadvantaged because of any disruption to the exam.
The Education Secretary, Estelle Morris, is said to be "furious" about the error. [...]
"As I'm sure you are aware, Hong Kong is eight hours ahead of the UK and so we found the problem eight hours before the majority of people were sitting the exam."
He was taking a geography paper, not maths, and feels his problem is being overlooked.
The mistakes: At the top, the diagram used in the question paper. Below it, the one used on the answer sheet — with two different figures.
"As the maths candidates were asking the invigilators what they should do and the invigilators were conferring, the geography candidates were indirectly affected by this as it caused us a great deal of disruption," he said.
This was made worse when they had to listen to mathematical information being given out to the maths candidates. [...]
Head teacher David Dunn demanded a re-sit
The headmaster of Yarm School, near Stockton-on-Tees, said he contacted Edexcel during the exam after his pupils also spotted "the grave error in the paper, which rendered the exam a farce".
Mr Dunn said: "They gave us new instructions to announce during the exam, which we did."
But he added: "They also turned out to be wrong.
"Imagine the effect on the candidates when we had to stop the exam twice to make announcements about the board's errors, which still made no sense," he said. [...]
Relevant to the phenomenon of a single defective question causing uncorrectable damage is the experience of Svetlana Khorkina (whom Wikipedia credits with being "one of the most successful female gymnasts of all time") at the Sydney Olympics, where she fell victim to a Standardized-Testing error somewhat different from the ones we have been considering on this web page — the error of having a vault set 5 cm too low, which caused her and several other competitors to fall. She said of her experience, "It was cruel to all the participants, to vault on a nonstandard height. It's quite possible to get killed. The five centimeters could decide the future of a sports person."
The relevance of Khorkina's experience to standardized testing in the academic realm is explained in the Rhoades and Madaus excerpt below:
Another type of error is the faulty test question, found by test-takers or proctors during a test administration, and later removed before scoring. For example, during the 2001 administration of the MCAS, two tenth-graders found one math multiple choice item where all of the answers provided were correct (Lindsay, 2001). One of the students reported that he worked more than five minutes on an item that should have taken one or two minutes at the most. Test officials often claim that removal corrects the problem since the removed item does not affect the scores. This does not correct the disruption experienced by knowledgeable test-takers who try to find a solution to a faulty question. Indeed, Michael Russell, a professor at Boston College, suggests an analogy that demonstrates the residual impact of faulty items, even if they are removed from consideration during scoring.
During the 2000 summer Olympics, Svetlana Khorkina, international favorite to win a gold medal in the women's gymnastics competition, failed to win the coveted prize. Khorkina ranked first after scoring a 9.812 on the floor exercises. She then moved to the vault where she did a most uncharacteristic thing — she landed on her hands and knees (Harasta, 2000). After a string of similar mishaps, event officials rechecked the vault's height. It was set at 120 instead of 125 centimeters — a difference of a little less than two inches. The vault was quickly set correctly and the affected gymnasts were allowed to repeat their vaults, but the damage was done — Khorkina was unable to regain her momentum or her confidence, and declined another attempt on the apparatus. She left the event in 10th place, far behind her initial standing, and ended the competition in 11th place after a fall on the uneven parallel bars (Measuring mix-up, 2000). Russell suggests that test takers, confronted with a question that doesn't make sense, may suffer a similar loss of confidence. And some students may spend an inordinate amount of time puzzling over the flawed, and eventually discarded, item and so have less time for the remaining questions (M. Russell, personal communication, July 11, 2001).
Kathleen Rhoades & George Madaus, Errors in Standardized Tests: A Systematic Problem, National Board on Educational Testing and Public Policy, Lynch School of Education, Boston College, May 2003, p. 17
An outstanding example of the severe and costly disruption that can result from the errors which inevitably plague a secretive examination methodology is the case of the National Computer Systems exam administered in 2000:
National Computer Systems
In May of 2000, the daughter of a Minnesota lawyer learned that she had failed the math portion of Minnesota's Basic Standards Tests (BSTs), a test published by National Computer Systems (NCS). Her father contacted the Department of Children, Families and Learning (CFL), asking to see the exam. For two months CFL staffers rejected his request and "told him to have his daughter study harder for next year's exam" (Welsh, 2000, p. 1). Only when the parent threatened a lawsuit did CFL permit him to examine the test (Grow, 2000).
The father, along with employees from CFL, found a series of scoring errors on Form B of the math test administered in February 2000. The errors were later traced to an NCS employee who had incorrectly programmed the answer key (Carlson, 2000). As a result, math scores for 45,739 Minnesota students in grades 8-12 were wrong. Of these, 7,935 students originally told they failed the test actually passed (Children, Families, & Learning, 2000a). Another error involving a question with a design flaw was found on the April administration of the BSTs. NCS invalidated this item, but not before 59 students were erroneously told they had failed (Children, Families, & Learning, 2000a).
Since passing the BSTs was a requirement for graduation, more than 50 Minnesota students were wrongly denied a diploma in 2000. Of this number, six or seven were not allowed to attend their high school graduation ceremonies (Draper, 2000). [...]
Before the case went to trial [...] NCS settled with the plaintiffs for $7 million dollars, paying all of the students who missed graduation $16 thousand each (Scoring settlement, 2002).
Kathleen Rhoades & George Madaus, Errors in Standardized Tests: A Systematic Problem, National Board on Educational Testing and Public Policy, Lynch School of Education, Boston College, May 2003, pp. 13-14
Below is a kind of mistake TS cannot prevent — mailing exams to the wrong address. The serious consequence of having to destroy 50,000 exam papers, however, would have been avoided had the misdirected exams been TS, because anyone seeing one batch of TEXAMS is never able to use what he has seen to help any group that is about to write some other batch of TEXAMS:
Thousands of A-Level exam papers pulped after security error means they were mistakenly sent to schools abroad
By LAURA CLARK
PUBLISHED: 19:16 GMT, 13 June 2012 | UPDATED: 00:55 GMT, 14 June 2012
Students sitting their A level examinations will now sit a revised paper after copies of the exam were mistakenly shipped to Egypt
The security of public exams was thrown into doubt last night after it emerged three A-level papers due to be taken next week were mistakenly sent to schools abroad.
Fifty thousand A-level maths papers will now need to be pulped amid concerns the content may have leaked out.
Copies of the paper were mixed in with batches of past papers sent out to schools in Egypt which requested them for pupils to use in revision sessions.
Two further A-level papers — in chemistry and biology — will be taken as normal next week even though they were also sent to the Egyptian schools in error.
Edexcel, the exam board at the centre of the blunder, insisted the security risk to the two science papers had been contained.
But it admitted it could not be sure the maths paper 'remained secure' and will instead set a replacement paper, which it had ready as a contingency.
Teachers will be told to destroy the affected paper — an A2 exam for pupils in the second year of A-levels — but Edexcel admitted there was a risk some pupils could still be given the original paper.
The error brings fresh embarrassment to Britain's exam system, which was plagued with blunders during last summer's GCSEs and A-levels.
Some 100,000 students were affected by mistakes found in 12 separate papers.
They included printing errors, wrong answers in multiple choice papers and questions that were impossible to answer.
The teacher makes errors because he has limited time to give to the preparation and verification of an exam that he will give to his single class of students a few days hence. TS computer programmers, by contrast, can allocate one hundred times the person-hours to verifying and debugging the segment of their program that generates a similar exam, and because that program will generate TEXAMS for millions of students over many years, the per-student cost of this thorough TS verification will nevertheless be vastly lower. For this reason, the TS coders of tomorrow can be expected to deliver virtually error-free TEXAMS where individual teachers today are delivering error-ridden exams, and where even the giant testing enterprises are delivering exams that are far from perfect.
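The per-student arithmetic behind this claim can be sketched with wholly hypothetical figures; the hours and student counts below are invented for illustration only.

```python
# Illustrative, hypothetical figures comparing a lone teacher to a TS team.
teacher_hours = 5            # hours a teacher can spare to prepare and check one exam
teacher_students = 30        # students in the single class writing that exam

ts_hours = teacher_hours * 100   # 100x the person-hours spent verifying the generator
ts_students = 1_000_000          # students served by the generator over its lifetime

teacher_cost = teacher_hours / teacher_students  # verification hours per student
ts_cost = ts_hours / ts_students

print(f"teacher: {teacher_cost:.4f} h/student")  # 0.1667 h/student
print(f"TS:      {ts_cost:.6f} h/student")       # 0.000500 h/student
```

Even with a hundredfold increase in verification effort, the per-student cost falls by a factor of several hundred once the generator serves a large enough population.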
AN EXPLICIT CURRICULUM BRINGS EXAM DEFECTS TO PUBLIC NOTICE
Author James Bamford describes below two questions from the Educational Testing Service (ETS) exam called the Professional Qualification Test (PQT) that the National Security Agency (NSA) gives job applicants. While reading Bamford's description, it is hard to avoid recollecting the ETS failure to renounce exam recycling in another of its tests, which led to the debacle already described above: "Copies of the test's 145 multiple-choice questions, along with correct answers, had circulated among teachers throughout southern Louisiana, probably for years. In a state mired at or near the bottom of almost every educational ranking, teachers had cheated their way into running public elementary, middle and high schools."
Might it be possible, a reader may wonder, that ETS is also recycling its PQT, and that copies of it, along with correct answers, have been circulating among NSA job applicants for years, with the result that an agency that has been plagued with mole infiltration is at the same time an agency that, in effect, requires its job applicants to cheat on the PQT as a precondition of employment?
And is not the excerpt below a demonstration that author Bamford has been able to study the PQT, and that he has been willing to make public disclosure of some of its contents? Questions of interest to Bamford readers would be whether he still has a copy of the PQT, on how many other occasions he has disclosed its contents to others, and to how many others he might have distributed copies.
The Puzzle Palace: A Report on America's Most Secret Agency
James Bamford Penguin Books, New York, 1983 (first published 1982), pp. 142-143.
The next step for the future and would-be spooks is the standardized Professional Qualification Test (PQT) administered under contract by the Educational Testing Service [ETS]. The exam is designed not just to test the candidates' academic knowledge but to spot the "cipher brains." One question, for instance, asked the applicant to imagine that he or she was an anthropologist on a high cliff overlooking a series of islands. From the perch the anthropologist could see messengers in canoes zigzagging between the islands. In addition he or she could see smoke signals sent from island to island. After reading about a half-page of information like "Canoe A goes to island 3 then to island 7 then to island 5 and so on while Canoe B goes to island 12 then island 1 ... In the meantime smoke signals are sent from island 6 to island 3..." the applicant must answer questions like "Which island is the chief of the group?" and "Which island controls communications?" and "Which island is the least important?"
Another question may deal with a company scattered throughout a large office building that communicates between offices by means of an unreliable intercom system. The applicant is again given information: "Because of faulty wiring, in order for a person in office A to communicate to someone in office E, he must go through office C, but those in office C can only communicate with persons in office E by first going through office J..." This account goes on for about half a page; then the applicant is asked "How would one get a call from office J to office B?" and "What if no one was in office A, how then would a person call from office Y to office H?" and about ten more such questions.
Those who leave the examination room without having suffered a severe breakdown probably assume NSA installs intercoms on South Sea islands. But this portion of the test is designed to ferret out those few with the rare ability to become masters of traffic analysis, to search through reams of messages and come up with patterns.
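At bottom, the intercom scenario is a small graph-routing problem. The sketch below shows the kind of reasoning being tested; the offices and their one-way links are invented here for illustration and are not taken from the actual PQT.

```python
from collections import deque

# A hypothetical "faulty intercom" wiring, in the spirit of the PQT scenario:
# an edge X -> Y means a person in office X can speak directly to office Y.
wiring = {
    "A": ["C"],
    "C": ["J", "E"],
    "J": ["B", "E"],
    "E": ["A"],
    "B": [],
}

def relay_path(src, dst):
    """Breadth-first search: the shortest chain of offices a call must pass through."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in wiring.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of relays reaches dst

print(relay_path("A", "B"))   # ['A', 'C', 'J', 'B']
```

Note that the wiring is directed: with this hypothetical layout, office B can receive calls but cannot originate one, exactly the sort of wrinkle the exam's "how would one get a call from X to Y" questions probe.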
However, the reason that the PQT exam is discussed right now under the heading EXAMINATION QUALITY is not that it is likely recycled and therefore cheated on; it is that, being an IMPLICIT SYLLABUS exam, any defects it contains will more readily remain hidden.
And reading Bamford's above description of the two scenarios does suggest that they may be ambiguous enough to allow multiple interpretations, each legitimate, but each pointing to different answers to the questions asked.
For example, in the unreliable-intercoms scenario, we read at one point "communicate to someone" and farther on in the same sentence "communicate with persons", and we notice the switch in prepositions from "to" to "with". One interpretation is to consider the preposition switch significant, with "communicate to" signifying the existence of one-way communication, and "communication with" signifying the existence of two-way communication; however, another interpretation is that the change of prepositions carries no meaning, and that whenever communication exists then it is always two-way. What answers to PQT questions an examinee comes up with may depend very much on which of these two interpretations he favors, but which interpretation the ETS favors may be impossible to tell. The examinee who has studied an illicit copy of the PQT-exam-with-correct-answers beforehand will know what answer is expected; the examinee who hasn't will often guess wrong.
Or in the zigzagging-canoes scenario, the examinee reads that "smoke signals are sent from island 6 to island 12", but immediately recognizes that a smoke signal is not easily targeted to one among several possible recipients — it is readable from all directions, and so the examinee begins to wonder how to interpret this incongruous statement, and he comes up with three leading possibilities: (1) the spying anthropologist has assumed that the target of the signal is island 12, but as the assumption is unsupported by evidence, and as it is implausible, the examinee would be correct to ignore it as a misconception; (2) the smoke signals used among these islands are so sophisticated that they begin with an identification of what island they aim at, which identification the anthropologist has learned to read; or (3) — which will give the same results as the second interpretation — the examinee is obligated to take as given all statements made in the problem, however implausible they may be in real life. Each of these interpretations is reasonable, but some of them may lead the examinee to answers that the ETS has arbitrarily deemed to be incorrect.
Or, also in the zigzagging-canoes scenario, in real life a canoe travelling between islands may be a messenger canoe or otherwise. Otherwise encompasses a canoe trip for purposes of trade, or a commute from a bedroom community to a place of employment, or to pay a visit to a love interest, and so on. But even assuming that all canoe trips are messenger trips, then in real life each leg of the trip could involve carrying any number of messages, including zero. For example, the anthropologist observes a canoe travelling the following circuit: Island A to B to C and finally back to A for a good night's rest after a hard day's paddling. But what about the message pattern on this ABCA circuit? It is possible that the messenger picked up and delivered exactly three messages: AtoC, BtoC, and BtoA. A very simple scenario, and wholly within the realm of the plausible — but notice how many constituent possibilities this three-message voyage recognizes — that a messenger may arrive at an island carrying zero or one or two messages for that island; and that a messenger may leave an island carrying zero or one or two messages from that island; and that a messenger may visit an island carrying zero or one messages not intended for that island.
And simply observing that ABCA circuit is also compatible with the possibility that the messenger completed his circuit only to discover nobody sending or receiving any messages that day.
In the real world, therefore, the astute anthropologist, observing nothing more than the canoe circuit ABCA, recognizes that his observation is compatible with not only the three-message pattern imagined above, but also with the zero-message pattern, and also with a vast number of other message patterns as well, and thereby recognizes that from the ABCA circuit alone he is able to conclude absolutely nothing about the relative importance or role or status of each island. And even if the anthropologist had an assistant travelling inside every canoe and recording who sent a message to whom, he still wouldn't know which messages conveyed a command, and which a request for a command, and which a complaint from a subordinate pointing out that an order had been impossible to carry out, and which a commendation from a superior for an order well carried out, and which contained only jokes or gossip — which knowledge of contents would seem to be highly pertinent to the determination of island status.
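This under-determination is easy to make concrete. Even under one simple and itself arbitrary assumption (each pickup-and-deliver opportunity on the circuit carries either zero or one message), a single observed ABCA circuit is already compatible with dozens of distinct message patterns; the enumeration below is illustrative only.

```python
from itertools import product

# One observed circuit: the canoe visits A, then B, then C, then returns to A.
stops = ["A", "B", "C", "A"]

# A message could have been picked up at any stop and delivered at any later stop.
slots = [(stops[i], stops[j])
         for i in range(len(stops))
         for j in range(i + 1, len(stops))
         if stops[i] != stops[j]]

# Allowing each slot to carry either zero or one message, count the distinct
# message patterns compatible with the single observation "the canoe went ABCA".
patterns = list(product([0, 1], repeat=len(slots)))
print(sorted(set(slots)))  # [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A')]
print(len(patterns))       # 32 patterns, including the all-zero (no messages) pattern
```

Relax the zero-or-one assumption, or observe several circuits, and the number of compatible patterns explodes further, which is the anthropologist's (and the examinee's) predicament.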
At best, then, the zigzagging-canoes scenario is a highly-abstract problem unrelated to the real world, requiring many implausible assumptions to be made before questions concerning it can be answered, implausible assumptions such as that each canoe leg involves the carrying of exactly one message between the islands on each side of the leg, and that the message is always in the nature of a command. Such assumptions would make the scenario abstract and implausible, but not defective. What would make zigzagging-canoes a defective problem set is its failing to clearly specify what implausible assumptions the examinee needs to make in order to arrive at the answer deemed correct by ETS.
The conclusion relevant here is that if the PQT were embedded in an EXPLICIT CURRICULUM, its defects would be noticed and objected to and corrected. As the PQT is embedded in an IMPLICIT CURRICULUM, any defective questions that it contains might be serving to help cheaters find employment at the NSA.
POOR MENTAL HEALTH IS IDENTIFIED AS THE LEADING CAUSE OF STUDENT DEMORALIZATION
Two signs of student disaffection: the high drop-out rate, plus four youths who consider school worse than prison:
[O]ne of the most acute embarrassments of the social order and the educational system is the "drop-outs," young people who leave school as soon as they can. Thirty-five per cent of the pupils in American high schools abandon them before graduation.
In these circumstances, raising the school-leaving age is like imposing an additional jail sentence on at least 35 per cent of the high school population. In 1962, indeed, The New York Times carried the following report from Gaffney, South Carolina: "Four youths appeared in General Sessions Court in connection with a series of break-ins. Judge Frank Epps, learning that they had quit school, gave them the choice of returning to school or going on the chain gang. Without hesitation, all four chose the chain gang."
Robert M. Hutchins, The Learning Society, Mentor, New York and Toronto, 1968, pp. 31-32, is quoting from L. F. Cervantes, The Dropout, Ann Arbor, University of Michigan Press, 1965, p. 196.
Students preferring prison reflects badly enough on conventional schooling, but not as badly as students preferring death. Fifteen students jumped to their deaths from Cornell University bridges between 1990 and 2010, and just a few months ago Cornell enclosed its seven bridges in suicide nets:
The Broken Generation: Why so many of our best and brightest students report feeling hopeless, depressed, even suicidal
Kate Lunau reports on the crisis on campus. Maclean's, 2012-Sep-10, pp. 54-58, p. 55.
[...] [C]onstruction workers at Cornell University began installing steel mesh nets under seven bridges around campus. They overlook the scenic gorges for which Ithaca, N.Y., is known; in early 2010, they were the sites of three Cornell student suicides of a total of six that year. Students cross the bridges daily on their way to class.
Cornell's bridge nets are the latest and most visible sign that the best and brightest are struggling. In an editorial in the Cornell Daily Sun following the 2010 suicides, president David J. Skorton acknowledged these deaths are just "the tip of the iceberg, indicative of a much larger spectrum of mental health challenges faced by many on our campus and on campuses everywhere." [...]
As is exemplified in the quote above, the favored explanation of student demoralization happens to be one that absolves the school of responsibility — it is the explanation that some students are mentally ill, and that the suicides among them simply happen to be the most mentally ill. And Maclean's magazine, rather than objecting to the Cornell President's denigration of the suicide victims, echoes it: Maclean's categorizes its article as one dealing with "Mental Health", and inserts into its title a reference to students being "broken", and suggests that the cause of the suicides is students "feeling hopeless, depressed, even suicidal." And it naturally follows from such language that if students are ill they should be cured, if broken they should be fixed, and if beset by negative emotions they should have them replaced by positive emotions. What does not follow from such language is that schools need to change anything. Everywhere one turns, one finds blame being heaped on the students and not on the schools:
Common Council member Ellen McCollister '78 (D-3rd Ward) who voted against the nets in December, said at the time that bridge barriers fail to address the mental health factors that are the root cause of the suicides.
Danielle Sochaczevski, Cornell Begins Construction of First Bridge Net, Cornell Daily Sun, August 21, 2012
Cornell would not be of interest here were its student-morale problem unique. In fact many universities, perhaps most, face similar student demoralization, as for example Queen's University described below, and with the main cause of the demoralization continuing to be identified as student mental health, though mention is occasionally made of possible causes external to the student, as for example in the reference to "academic pressure" in the title of the excerpt below:
Jack Windeler was 18 years old and in his first year of university when he died.
How academic pressure may have contributed to the spate of suicides at Queen's University
by Jan Wong September 1, 2011
[...] Jack Windeler's was the first of a string of deaths at Queen's. In the ensuing 14 months, five more students would die, three by suicide, two by what the cops call misadventure (likely alcohol related). [...]
Though we'll never know precisely why Jack decided to take his own life, we do know that incidents of mental illness are on the rise among kids in his age group. [...]
Daniel Woolf, the school's principal, won't discuss the particulars of any of the deaths. "But I can tell you the issue of mental health on campus is indeed getting worse," he says. "Our counselling services are just not able to keep up with the volume of demand."
The same is true at other universities. Robert Franck, director of McGill's mental health services, reports that his office has seen "a dramatic increase" in requests for counselling, with more than 18,000 visits in the 2010–11 school year. At Western, counselling appointments have spiked 20 per cent in the last year and a half. [...]
AN ALTERNATIVE VIEW
The alternative view proposed here, however, is that "mental health" explains nothing, because it makes reference only to the negative emotions and morbid thoughts which undoubtedly prevail prior to every suicide, but fails to trace back to the cause of those negative emotions and morbid thoughts. To say, as is said in the block quote immediately below, that students commit suicide because they are depressed is as enlightening as saying that chickens cross the road to get to the other side. What would be enlightening is to view the student's depression-plus-suicide as the unitary phenomenon that needs to be explained, and to locate its cause in the environment, as perhaps by observing that "the student had been confronted with evidence that he would be unable to avoid failing his year", just as it would be enlightening to view the chicken's crossing-urge-plus-crossing as the unitary phenomenon that needs to be explained, and to locate its cause in the environment, as perhaps by observing "that's where the farmer was scattering grain".
College Student Suicide
Suicide Prevention, Awareness, and Support
Suicide is the second leading cause of death for college students.
And the number one cause of suicide for college student suicides (and all suicides) is untreated depression.
The alternative view proposed here also rejects "academic pressure" as the cause of the unitary phenomenon demoralization-depression-suicide on the ground that there is nothing that an 18-year-old university freshman is asked to do that he couldn't have done, stress-free, when he was 12, as is demonstrated in the TwelveByTwelve Pilot Study.
Where, then, is the cause of student demoralization and suicide to be found?
Well, what the TwelveByTwelve Pilot Study page documents is achievement that is possible when obstacles are removed, but what Cornell and Queen's, and every other university, expects is achievement without obstacle removal, a situation of extreme handicap where the easy becomes difficult, and for some impossible. A leading cause of student demoralization, then, may be expecting students to do easy things that have been rendered difficult by the introduction of obstacles. Stripped of reference to obstacles and of nuance, the principle reduces to: Students are demoralized by being expected to do the impossible.
And what could be more devastating to a student than setting a career goal which is the center of his life, and then being stopped from achieving it? And as long as the stymied student views the obstructions in his path as natural and irremovable and inevitable, he remains incapable of blaming those who have the power to remove those obstructions but choose not to, and so he is left blaming himself. He thinks it's his fault that he's failing, he thinks he's not as smart as he once believed he was, that his friends and family who up to now have been counting on him to continue forwarding announcements of success are about to receive notification of failure. What blow to self-esteem can be more crushing than one day glorying in the status of smart, and the next day being demoted to the status of stupid? And what emotion can accompany this seemingly-deserved demotion, but shame? The failing student hides the bad news that he has really turned out to be stupid from his friends and family because he knows how much they expect from him, and how little he is going to be able to deliver.
That hidden depth within the mind of the dropout seems to exhale feelings of inadequacy, worthlessness, frustration, and failure. A dropout, by that very fact, is more clearly cast in the role of outcast and pariah. When one's deflated self-image is scarcely registering zero, suicide and desperate violence seem a logical enough response with which to solve one's own and one's troubled reference group's problems. [...] Because of his failure at youth's prime job — his school work, because of his dead-end laborer's job which even his own immediate reference group is coming to devaluate, the dropout in an affluent upwardly mobile society has come to feel like a second-class citizen.
Lucius F. Cervantes, The Dropout: Causes and Cures, University of Michigan Press, Ann Arbor, 1965, p. 193.
And what Cervantes does not say above, but which is proposed here, is that while student demoralization, along with its most extreme manifestation, suicide, has many causes, the one that may play the most frequent and influential role, the one that should be considered before all others in any particular case, is that the student is ashamed for having failed a task which he does not realize has been rigged against him.
And what are these obstacles that have been placed in the student's path, and which he does not realize are removable? Why everything that we have been considering above — all of it invites depression, all of it frustrates learning. From the section titled The destructive power of THE IMPLICIT SYLLABUS, these words of discouragement and alienation from Harvard students:
"I felt that many of the exam questions were designed to trick you rather than test your understanding of the material," "the exams are absolutely absurd and don't match the material covered in the lecture at all," "went from being easy last year to just being plain old confusing," and "this was perhaps the worst class I have ever taken."
And from the section titled The pervasiveness of CHEATING we learn that it is possible for someone to experience the crushing blow of failure, without realizing that the only way to pass is by cheating, without realizing that many of those who appear to be outthinking him are in reality only outcheating him. In the three examples below, the failures are fortunate in understanding that cheating is prevalent, and so don't blame themselves for failing, and may even feel proud rather than ashamed of their failures because they attribute those failures to their own refusal to cheat, and thus to their own integrity. The university freshman, however, may fail without ever understanding the real reason why he fails, and so will accept the judgement of the university that he has been found wanting, and so will feel shame. In the first two examples below, also, the examinees seem to have available several opportunities to try an examination, so that failing at one time means only that the exam will be retaken shortly after, whereas in an academic setting, failure more often implies a permanent abandonment of a career goal. The first example below is from the American Board of Radiology case discussed above:
Webb, 31, said he failed the first radiology written exam, which focuses on physics, in the fall of 2008. He said the program director at the time, Dr. Liem Mansfield, told him to use the recalls in order to pass.
"He told me that if you want to pass the ABR physics exam, you absolutely have to use the recalls," Webb said. "And I told him, 'Sir I believe that is cheating. I don't believe in that. I can do it on my own.' He then went on to tell me, 'you have to use the recalls,' almost as if it was a direct order from a superior officer in the military."
The second quote is from the US Navy Nuclear Submarine example above:
My fellow officers were surprised by my failure, and wondered aloud why I hadn't used the "study guide."
When my second exam arrived, so did the so-called study guide, which happened to be the answer key for the nuclear qualification exam I was taking. I was furious. Defiantly, I handed back the answer key to the proctor and proceeded to take the exam on my own. I failed again.
And here, from that must-read NYT article already cited above, is another example of someone failing, but again a happy example insofar as the examinee is able to infer that others were able to pass only because they cheated. Imagine what the blow to his self-esteem would have been had he failed the exam without, by sheer chance, having come across that evidence of cheating:
Educational Testing Service New Jersey headquarters
Giant of Exam Business Keeps Quiet on Cheating
By DOUGLAS FRANTZ and JON NORDHEIMER September 28, 1997
[...] On the morning of July 13, 1996, two teachers from Lafayette, in southern Louisiana, were on the way to take the E.T.S. [Educational Testing Service] exam to become school administrators when they stopped at a Burger King.
As the two men sipped coffee, one pulled out 20 typewritten pages that he described to his companion as a "study guide." It contained 145 questions. Correct answers had been typed and underlined in black.
The second teacher recalled that he leafed through the papers but thought little of them until he sat down at the test center later that morning. He immediately recognized questions on the first two or three pages, and he began to suspect that all 145 questions were identical to the ones in his friend's guide.
"I felt like everyone in the room had the test to study except me," the teacher said recently, speaking on the condition of anonymity. "There were 145 questions to complete in 120 minutes, but almost half the room left in just one hour."
The teacher became angry a few weeks later when he learned that he had narrowly missed a passing score, dashing his hope of becoming a principal. [...]
And perhaps the most demoralizing thing about conventional schooling is the scarcity of redemptive mechanisms — once behind, it is hard to catch up and easy to become alienated, and so on down the slippery road to irretrievable failure and profound shame. Thus, the most reliable antecedents of student suicide are absence from class, failure to turn in homework, non-appearance on exams. Because suicide has many causes, even the A+ student in good standing may commit suicide, but he would then constitute an extreme rarity. He has his academic success to buoy him up. Any setback that he may encounter will be endurable because it will concern only his secondary goals, the goal of good health, of course, being the leading exception.
All conventional-schooling defects can be found to lower student morale. Take for example the meagerness of achievement. The way to get students excited about their work is to teach them skills which amaze adults. Being able to recall eighty words that have been read out is an example of just such an achievement, jaw-dropping for the parent and morale-boosting for the student, and one which is within easy reach of a ten-year-old. In contrast, the conventionally-schooled ten-year-old fifth-grader can see that his bafflement at multiplication and division is also jaw-dropping, but from disappointment rather than admiration. He does not come home from school proud of being able to do the impressive, he comes home from school ashamed of still being unable to do what his parents tell him is easy. He does not wake up in the morning eager to acquire additional skills that will leave everyone agog, he wakes up dreading another day in which his lack of progress will continue to disappoint.
As has already been explained above, the Explicit Transformational Syllabus resolves the most serious of such morale-sapping problems. The student always knows that he is practicing on the most relevant materials, he is never taken by ambush on any exam, unexpected failure is an impossibility for him, as he knows he will perform on EVALUATION TEXAMS very much as he has already seen himself performing on the corresponding PRACTICE TEXAMS, and if he's failing his own PRACTICE TEXAMS, he can delay his EVALUATION TEXAM for the few days, or however long, that it will take him to come up to speed. The near-total eradication of most varieties of cheating gives recognition to achievement and not to guile. The ready availability of PRACTICE TEXAMS at every level of difficulty opens up to the lagging student the path of redemption. By removing obstacles, XTS allows students to proudly complete Grade 12 by the age of 12, and without needing to be enclosed in suicide nets.
SEPARATE THE ROLES OF TEACHER AND RESEARCHER
Talk of low motivation at university typically focuses on the low motivation of the student toward learning, and only rarely notices the low motivation of the instructor toward teaching:
A survey of over 3,000 faculty members taken in 1963 showed that in American colleges, as well as universities, small and large, in all fields, faculty members of all ranks, regardless of how little time they devoted to undergraduate teaching, wished to reduce that time still further. All groups wished to increase the time devoted to graduate instruction, and especially to research.
A report of the Carnegie Foundation for the Advancement of Teaching, published in 1964, referred to a "crisis in values" in higher education. It mentioned, as the cause of this crisis, a "limitless supply of research funds, consulting opportunities, easy promotions, dazzling offers." It said the "heavily-bid-for" young man was likely to have "no sense of institutional loyalty whatever." In his view, students were "just impediments in the headlong search for more and better grants, fatter fees, higher salaries, higher rank." This is one way to kill the knowledge industry.
Robert M. Hutchins, The Learning Society, Mentor, New York and Toronto, 1968, p. 50.
The chief reason the faculty doesn't care about teaching is that, as explained above, professional success depends on publication. But there is another reason why the faculty doesn't care about teaching, and that is that nobody measures its quality. The only measures that are taken are student ratings, which are not much better than popularity contests, and which the faculty rightly tends to view with disfavor. The faculty is aware that high student ratings go to the teacher who hands out high grades for small effort, to the teacher who tells good stories and keeps the class rolling in the aisles with good jokes, to the teacher who happens to be movie-star handsome or beautiful, to the teacher whose life style the students envy. But no one is measuring how much the teacher gets his students to learn. If there were such a measure, if a teacher's high performance on it received recognition and reward, and if it were appropriate to include it among the other accomplishments on his curriculum vitae, then faculty would for the first time be awakened to how badly they are teaching, and to their responsibility to teach better, and would begin to teach with enthusiasm, and would be rewarded also by seeing student morale soar.
But this missing measure of effective teaching is exactly what XTS is able to supply — specifically, the student can be evaluated going into a course and coming out, and the difference — the "gain score" — can be the measure of the amount learned. Teachers for whom a student's success was credited as the teacher's success, and for whom a student's failure was debited as the teacher's failure, would become teachers who taught well and whose students enjoyed high morale.
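The gain-score bookkeeping described above can be sketched in a few lines. The classrooms, scores, and the 0-to-100 TEXAM scale below are all invented for illustration; the point is only the arithmetic of crediting the teacher with the difference between entry and exit scores.

```python
# Hypothetical sketch: measuring teaching effectiveness by "gain score"
# (exit TEXAM score minus entry TEXAM score), averaged per teacher.
# All classrooms and numbers below are invented for illustration.

def gain_score(pre: float, post: float) -> float:
    """Amount learned by one student, on an assumed 0-100 TEXAM scale."""
    return post - pre

def teacher_effectiveness(records: list[tuple[float, float]]) -> float:
    """Mean gain score across one teacher's (entry, exit) score pairs."""
    return sum(gain_score(pre, post) for pre, post in records) / len(records)

# Two invented classrooms: similar exit scores, very different gains.
teacher_a = [(20, 70), (30, 75), (25, 80)]   # students entered weak, learned a lot
teacher_b = [(60, 70), (65, 75), (70, 80)]   # students entered strong, learned little

print(teacher_effectiveness(teacher_a))  # 50.0
print(teacher_effectiveness(teacher_b))  # 10.0
```

The two classrooms finish at much the same level, but the gain score separates them sharply, which is exactly the distinction that raw exit scores and student popularity ratings fail to make.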
SEPARATE THE ROLES OF TEACHER AND EVALUATOR
XTS uplifts morale also by separating the roles of teacher and evaluator. The role of teacher is to prepare the student for evaluations which are set and administered by an evaluator. This may be a new way of doing it in schools, but it is old hat in other areas of endeavor. Take a British Columbia student going through Toronto Conservatory piano, for example — well, obviously the standards are set not by the student's teacher, but by music scholars in Toronto, and — not so obviously — the exams are administered and adjudicated by piano teachers brought in from another province — usually Alberta.
Or consider gymnasts at the Olympics, who are judged not by their own coaches, nor by judges from only their own country, but by professionals from around the world. The need for this separation of roles is obvious in music and in gymnastics, and it should be just as obvious in all school subjects. Indeed, from what Lucius Cervantes says below, it is already obvious in British schools, whose practice has only to be imported to Canada and the US. The effect of separating teaching and evaluation will be to boost student morale by ridding evaluation of favoritism, and by transforming the teacher into an ally supporting the student's battle to extract marks from the evaluator.
The American system of education which makes the teacher both instructor and judge in effect pits the students against the teacher. In the English system, the teacher teaches the subject matter but someone else judges whether the student has passed the examination. When teacher and student join forces to beat the examinations of common enemies in the central office a quite different esprit de corps develops than when the judge is in the midst of the classroom — in fact at the head of it. There is no avoiding him — except by doing your best to ignore him. This the American student seems to do quite effectively.
Lucius F. Cervantes, The Dropout: Causes and Cures, University of Michigan Press, Ann Arbor, 1965, p. 114.
Many educational problems will be solved by merely recognizing that a university faculty member is today being asked to play three roles — teacher, evaluator, and researcher — and that, his time being finite, when he dedicates enough of it to succeed in one role, he has little time left to succeed in any other, and that in any case the roles of teacher and evaluator are incompatible. Dividing these three roles into separate career paths can be expected to number among its benefits the uplifting of student morale. The division relevant here, the one between teacher and evaluator, is the one that can be made with the smallest disruption, by simply adopting XTS.
Conventional schooling imposes costs of which XTS is largely free, as for example the costs of preventing cheating, of detecting it, of investigating allegations of it, of prosecuting it. Students who acquire credentials by cheating end up occupying positions in which they prove inept, which incurs both social and economic costs. Students who are prevented from fulfilling their academic potential deprive the economy of innovations and inventions and discoveries. Invalid geographical and historical data leads to bad decisions, which cost money. Student demoralization, dropping out, and suicide all come with a price tag attached.
Two of the costs that XTS lowers are considered below for different reasons, the first because it is often on students' minds — textbook costs — and the second because of the large savings that can be realized — exam-preparation costs.
We have already seen above that the textbooks that students need to buy are bulked up with material that is not examinable — something like 3/4 of the textbook may not need to be read, and something like 19/20ths of the problems in the textbook may not need to be solved. The student is buying mostly pages that he will never use. But, as is indicated in the following excerpt from the Executive Summary of Merriah Fairchild's RIPOFF 101, there is more wrong with textbooks than their heavy padding:
RIPOFF 101: How the Current Practices of the Textbook Industry Drive Up the Cost of College Textbooks
Merriah Fairchild CALPIRG January 2004 cram.whitematter.ca/~
Textbooks are Expensive and Getting Even More Expensive
Textbook Publishers Add Bells and Whistles that Drive Up the Price of Textbooks; Most Faculty Do Not Use These Materials
Students will spend an average of $898 per year on textbooks in 2003-04, based on surveys of University of California (UC) students in the fall of 2003. This represents almost 20 percent of the average tuition and fees for in-state students at public four-year colleges nationwide. In contrast, a 1997 UC survey found that students spent an average of $642 on textbooks in 1996-97.
Textbook Publishers Put New Editions on the Market Frequently — Often With Very Few Content Changes — Making the Less Expensive, Used Textbooks Obsolete and Unavailable
Half of all textbooks now come "bundled," or shrink-wrapped with additional instructional materials such as CD-ROMs and workbooks. Students rarely have the option of buying the textbook "a la carte" or without additional materials.
In the one instance that a textbook was available both bundled and unbundled (only the textbook), the bundled version was more than twice as expensive as the unbundled version of the same textbook.
Sixty-five (65) percent of faculty "rarely" or "never" use the bundled materials in their courses.
Seventy-six (76) percent of faculty report that the new editions they use are justified "never" to "half the time." Forty (40) percent of faculty report that the new editions are "rarely" to "never" justified.
A new textbook costs $102.44 on average, 58 percent more expensive than the price of an average used textbook, $64.80.
Fifty-nine (59) percent of students who searched for a used book for the fall 2003 quarter/semester were unable to find even one used book for their classes.
An observation highly relevant to the topic of textbook costs is introduced in the excerpt below — that sometimes the material being tested is so well established that it has not changed over the last fifty or so years — and will be alluded to farther below when the labor cost of implementing and maintaining XTS is discussed:
How Your Textbook Dollars Are Divvied Up:
The price of textbooks is ramping up far faster than inflation, and with good reason, say publishers
By DANIELLE KURTZLEBEN August 28, 2012
Students aren't happy about those costs. Around 75 percent of students agree that the cost of textbooks is "excessive," according to Weil. "There's no excuse for a calculus textbook to cost $250. That's just insanity," says Nicole Allen, affordable textbooks advocate at Student PIRGs, an association of student advocacy groups. Differential equations, after all, were the same in 1960 as they are now; charging ever-increasing costs for the same information, says Allen, is unjustifiable.
The solution to bulky and expensive textbooks is occasionally to dispense with them, which would be particularly easy under an XTS regimen, and which dispensing has proven to work well even with a computer-unassisted EXPLICIT SYLLABUS, as for example in Professor Swanson's section of UBC Math 101, which came with no assigned or recommended textbook, and in which no reference to any textbook was ever made. Instead, Professor Swanson handed out on the first day of class a list of 485 problems, with the solutions to some attached, and to the rest delivered in lectures over the duration of the course. With a few additional problems being introduced in class, the total number of problems rose to around 500, permitting the course to be typified as a 500-question course. Professor Swanson came about as close to having an EXPLICIT SYLLABUS as it is possible to have without the computerization relied upon by XTS. In this EXPLICIT-SYLLABUS course students were secure in knowing that what they would be examined on is variations of these 500 problems. Naturally, Professor Swanson had students lined up outside his office door seeking to transfer into his section out of the sections they had been assigned to, but needing to be turned away because his class was already overflowing. And Professor Swanson's section demonstrated the high student satisfaction, and high performance, which are the dependable results of an EXPLICIT SYLLABUS. And of course a computer-based EXPLICIT TRANSFORMATIONAL SYLLABUS is able to make a merely EXPLICIT SYLLABUS better still.
Professor Swanson's Problem 485 in his UBC Math 101
Here is that 485th problem with Professor Swanson's solution, relevant here in illustrating exactly what it is that a math teacher can provide and which might render a textbook unnecessary, but serving also to reinforce the point made in the Conventional learning is MEAGER section above. We understand that conventional learning is meager indeed when we realize that Problem 485 is the level of difficulty that American twelve-year-olds would be happy working at, were obstacles removed from their path.
Professor Swanson's Solution 485 in his UBC Math 101
On the other hand, the conventional textbook is not without its uses. Its photographs and historical background and anecdotes, though not examinable, might contribute toward kindling and sustaining interest. It can also serve as a reference in case the student wishes to ask a question that is not covered in the course. And it is conceivable that the occasional keen student may wish to study some topic in greater depth than is examinable. The trouble is that the instructor typically fails to clearly distinguish the small fraction of the textbook which is examinable from the large fraction which is not, to the confusion and demoralization of the student, and the trouble also is that textbook publishers keep issuing either needlessly-revised editions, or needless textbooks by new authors, which puts the instructor who has taught the course in previous years to the trouble of re-defining his syllabus, and selecting a fresh batch of practice problems. The optimal solution may be for professors to define a national or international Transformational Syllabus, and for publishers to offer textbooks which clearly distinguish the content that is XTS-examinable from that which is supplementary.
Explicit Transformational Syllabus (XTS)
Fundamentals of TwelveByTwelve
And even if it is agreed that XTS is valuable, it may nevertheless be wondered whether it is affordable. Might it not be a futuristic or utopian scheme that is too expensive for anyone to implement?
Starting from the Alice-to-Irma area TEXAMS above, let us hypothesize that in the 26,000 or so secondary schools in the United States a single area examination is given each year, which means that some 26,000 teachers spend let us say one hour preparing 26,000 area examinations, which takes 26,000 person-hours of exam-creating time.
On the other hand, writing a Transformational Syllabus computer program which covers area problems K-12 might take 10 programmers four weeks, or 160 hours each, which for ten programmers is 1,600 person-hours.
However, the following year, the 26,000 teachers will have to repeat their labor, whereas the Transformational Syllabus programmers will not have to repeat anything. K-12 area problems are not a rapidly-advancing field of mathematics. The area problems that K-12 students are solving today are the same as the ones they solved 50 years ago, and maybe even 100 years ago. If enhancements or updates need to be made occasionally, they will be in the nature of tweaks, which might occupy one programmer ten hours each year. Assuming tweaking is called for even during the first year, that gives 10 hours of tweaking per year over 100 years, which is 1,000 person-hours.
Over the course of a century, therefore, conventional exam preparation will take
(100 years) (26,000 person-hours / year) = 2,600,000 person-hours.
Over the same century, in contrast, the Transformational Syllabus alternative will take
(1,600 person-hours) + (100 years) (10 person-hours / year) = 2,600 person-hours.
The ratio 2,600,000 / 2,600 equals 1,000.
What this ratio signifies, then, is that the present method of preparing examinations is in the long run one thousand times costlier than the Transformational Syllabus method. What the Transformational Syllabus offers, to put it another way, is a thousandfold reduction in the labor cost of preparing examinations.
And this saving will be enjoyed not only in the field of area computation, which we focus on here only as an example, but in all fields in which Transformational Syllabus examinations can be given, which surely number in the thousands, coming from all the many sub-areas of Mathematics, Physics, Chemistry, Biology, Computer Science, Engineering, Medicine, and so on.
Adjustments can be made to the assumptions offered above, but they will not alter the cost ratio enough to call for different conclusions. And any adjustments should take into account that to prepare an Area exam might more often take a teacher three hours than one hour, and to create an exam with questions of the complexity of those shown in the Alice-to-Irma TEXAMS might take the teacher closer to one day than to one hour. Also, the number of secondary schools in the U.S. is presently greater than 26,000, and the Transformational Syllabus program can be used in elementary schools as well. Furthermore, if the number of secondary schools doubles or quintuples over the century, so will the Conventional Syllabus labor cost, whereas TS labor cost will remain the same — the TS computer program costs the same to build and maintain whether it services a single student on one occasion or every student on earth for a century. For reasons such as these, the test preparation cost falling by a ratio of 1,000 to 1 is likely to be an underestimate of the cost saving.
At the same time, XTS is not a project which requires heavy initial outlay and which pays off only after a century. Even in the first year, under the above assumptions, the ratio of conventional cost to XTS cost would be 26,000/1,610, which equals approximately 16/1. Even in the first year, then, XTS would cut exam-construction person-hours to about 1/16 of the conventional figure, a saving which grows annually until, after a century, the cumulative XTS cost has fallen to 1/1,000 of the conventional cost.
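The cost arithmetic of the preceding paragraphs can be checked with a short script. The constants below are simply the assumptions stated above (26,000 schools, one teacher-hour per exam per year, 1,600 person-hours to build the program, 10 person-hours of annual tweaking), so changing any assumption recomputes the ratios.

```python
# Sketch of the exam-preparation cost comparison, using this section's
# stated assumptions. All constants are assumptions, not measurements.

SCHOOLS = 26_000          # U.S. secondary schools (assumed)
HOURS_PER_EXAM = 1        # teacher-hours per conventional exam per year
TS_BUILD = 1_600          # person-hours to write the TS program once
TS_TWEAK_PER_YEAR = 10    # person-hours of annual maintenance

def conventional_cost(years: int) -> int:
    """Cumulative person-hours of conventional exam preparation."""
    return years * SCHOOLS * HOURS_PER_EXAM

def ts_cost(years: int) -> int:
    """Cumulative person-hours under the Transformational Syllabus."""
    return TS_BUILD + years * TS_TWEAK_PER_YEAR

print(conventional_cost(1) / ts_cost(1))      # first-year ratio, roughly 16
print(conventional_cost(100) / ts_cost(100))  # 1000.0 after a century
```

Because the conventional cost grows linearly with years while the TS cost is dominated by the one-time build, the ratio climbs from about 16/1 in the first year toward 1,000/1 over the century, matching the figures given above.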
And not to be forgotten is the big picture — that XTS methodology brings so many important advantages that it would be preferable even if it cost more than conventional schooling; that XTS exam-preparation cost promises to end up being in the ball park of one-thousandth of conventional exam-preparation cost argues all the more forcefully for its adoption.