Learning Support Centers In Higher Education, Serving Students, Faculty, Staff, Administration, and Surrounding Communities


Kerstiens, Gene. "Postsecondary Student Assessment and Placement: History, Status, Direction," in Mioduski, Sylvia and Gwyn Enright (editors), PROCEEDINGS OF THE 13th and 14th ANNUAL INSTITUTES FOR LEARNING ASSISTANCE PROFESSIONALS: 1992 AND 1993. Tucson, AZ: University Learning Center, University of Arizona, 1994. Pp. 56-62.

  

Postsecondary Student Assessment and Placement:

History, Status, Direction

Dr. Gene Kerstiens, Andragogy Associates

 

This paper is not an intentionally objective, dutifully documented appraisal of student assessment as it obtains in postsecondary institutions. Rather, it is an observation based upon a review of the professional literature together with experience derived from forty years of serving on and visiting a variety of campuses. At many of these institutions the author was tested, evaluated, and advised under the same conditions endured by students, who seldom find them edifying. Without apology, this exposure has produced a mindset that considers assessment chiefly from the student's point of view. For those interested in an explication of postsecondary student assessment and placement from the perspective of consumer advocacy, read on.

Briefly, assessment and placement as it is most commonly practiced on college campuses today consists principally of basic skills testing - reading, writing, and math skills as measured on a standardized, time-critical, objective, paper-and-pencil test battery. Scores on these tests are used to qualify students for "college level" courses or to direct underprepared students to some intervention calculated to prepare them for the rigors of the post-secondary curricular experience. To be sure, at some institutions other measures and means are included in the assessment package: writing samples, high school grades or class standing, learning skills surveys, experiential credit, and learning styles and self-esteem surveys. But a review of research (Gabriel, 1989), as well as the latest national survey (Boylan, Bliss, and Bonham, 1992), reveals that 96% of colleges use scores earned on basic skills tests as the principal, if not the sole, determinant for student placement.

How we arrived at this condition and predicament has interesting historical roots in Nineteenth Century intelligence testing and some of the conceptual and measurement errors related to that movement (Gould, 1981). But how we have persisted in this tradition in spite of the preponderance of research criticizing it and our own less than exemplary experience while carrying on the practice (Weber 1985; Kerstiens 1993) is not a pretty story. Perhaps understandably it is avoided as a treatable topic in certain graduate schools of education that regard such ironies as sensible and inevitable and logic as a systematic method of arriving at the wrong conclusion with confidence.

By the turn of the century, Binet, Terman, Spearman, and others had established the construct of general intelligence. They came to believe that ability to learn (IQ), especially in the academic arena, could be measured on paper-and-pencil intelligence tests and that IQ scores predicted success in college at least reasonably well. This opinion survives in some quarters today, sometimes transmuted into curious and convenient persuasions (Conner, 1989). But by 1921, Gates was successfully selling the notion that a more specific ability, the ability to read and comprehend texts, best accounted for proficiencies related to success in college. This rationale inspired a flood of reading comprehension tests that often crowded out other instruments of assessment. Finally, by the late '50s, writing and math skills also were construed as skills necessary to college learning. When tests measuring these skills were incorporated with the well-entrenched reading skills requirement, the practice of basic skills testing emerged and blossomed. During the '70s, basic skills batteries became regarded as the common and acceptable method for measuring college aptitude. With slight modifications, many academics sustain unshakable confidence in this system, relinquishing their hold on speeded, objective, basic skills tests only when we pry them from their cold, dead fingers.

Of course, most sobering is the fact that these practices have flourished through the process of reification, that is, our coming to regard a theory or construct as having real or concrete existence. Which is to say that basic skills objective testing is commonly equated in the public mind and in many academics' consciousness with assessment itself, so that testing and assessment are now blurred or even indistinguishable constructs. And because standardized basic skills tests are inexpensive, their administration conveniently conforms to institutional time frames, and faculty-authored textbooks teaching test-taking strategies have become embedded in developmental course curricula, we have found it expedient to remain serenely indifferent to critical research that should discourage their employment.

Accordingly, through the years, standardized, paper-and-pencil basic skills testing has collected popular support as a ship collects barnacles. As the tests enjoyed wider use, they were naturally cited more often in studies and reports. Deft references to these tests abound in the professional literature, especially during the last 30 years, this frequency and duration implying respectability. Consequently, the political correctness of this assessment practice has become well established through its popularity rather than any proven validity, accuracy of measurement, or track record of efficient placement. Even normally scrutinous and skeptical professionals have substituted their faith in group preference for their own independent judgment based upon observation, research, or gut feeling. It's amazing how colleagues' sentiments are influenced when they take notice of the adoption trends generated by their fellows.

As indicated, the movement toward this assessment system grew, nourished by glittering advertisement claims and practitioner endorsements, but in spite of a steady stream of disfavoring professional literature. During the past seven decades extensive and comprehensive research reviews have repeatedly pointed out not only the limitations of the method but also its debilitating effects on our student constituency (Gates 1921; Flanagan 1939; Preston & Botel 1951; Rankin 1962; Tillman 1977; Stetson 1982). These writers indict most standardized, basic skills instruments and, in almost unanimous agreement, make nine charges: they sacrifice accuracy of response for speed of response, encourage chance-success responses (guessmanship), discourage analytical reasoning, unnecessarily elevate anxiety, delay feedback of test results, rely on norm-referenced measurement, facilitate or demand mass testing format, provide a scarcity of alternative test forms, and promote inconvenient scheduling of test administration. Tillman (1977) succinctly identified the inconsistency between research findings and our assessment practices: "Ironically, the increasing popularity of certain tests seems to be inversely related to the negative comments of critics" (p. 253).

Meanwhile, back in the political and popular opinion arenas, vigorous unrest concerning assessment/placement practices is evident in abundance. During the past twelve years, no fewer than eight bureaus, commissions, and councils have been appointed by the President or his designee to study the problem, to make recommendations, and to serve as federal assessment regulatory agencies. The results have been unrewarding. The latest in this succession of failures was the National Council on Education Standards and Testing (Public Law 102-62, 1992), whose recommendations were evaluated in Congressional testimony as follows:

We believe that the proposed NESAC would not be capable of evaluating the new standards and examinations meaningfully. We see the need for an independent, non-partisan body with sufficient expertise and credibility to evaluate the technical qualities of alternative assessments, examine the evidence about their feasibility and costs, monitor the consequences of their use, and judge the comparability of results. (Institute for Education and Training, 1992, p. 1)

Echoing these concerns are articles in the Chronicle of Higher Education, like George Madaus's (1990) "Standardized Testing Needs a Consumer Protection Agency," and the following charges leveled by Rand researchers:

Our testing policies have failed to achieve many of their intended positive effects, while creating some clearly negative consequences. Initially created to facilitate tracking and sorting of students, these instruments were not intended to support or enhance instruction. Because of the way in which the tests are constructed, they place test takers in a passive, reactive role, rather than a role that engages their capacities to structure tasks, produce ideas, and solve problems. The tests thus exclude many kinds of knowledge and types of performance that we expect of students. They are inappropriate tools for many of the purposes that they are expected to serve. (Darling-Hammond and Lieberman, 1992, B-1)

Of course, we have been invited to believe that one reason why these tests fail to fairly evaluate matriculating students is that they are based on norms developed years ago when, presumably, the norming population possessed better skills. Reinforcing this belief are countless alarming reports in both the media and professional press about students' declining scores. Much as investors pore over stock indexes with frightful eagerness, we have been preoccupied with periodic reports of disappointing fluctuations in student scores, some of us becoming operatic about declining standards and evangelical about reestablishing them.

But such data contribute to a distorted view of the students we serve. Indeed, if we compare today's average student ACT and SAT scores with those of twenty years ago, the results are surprising.

ACT Mean Composite Score Comparison*     SAT Mean Composite Score Comparison*
(Maximum Score = 36)                     (Maximum Score = 800)

YEAR    SCORE                            YEAR    SCORE
1970    19.9                             1970    474
1990    20.6                             1990    450

* Source: WORLD ALMANAC, Scripps Howard Company, NY, 1991, 218-219.

Since ACT and SAT tests are patently heavy hitters in the postsecondary testing industry, these data should provide a credible comparison of the entrance scores of today's students with scores of students 20 years their senior. We can notice that on the ACT, there is a 0.7-point (about 3.5%) increase in scores. On the SAT, there is a 24-point (about 5%) decrease in scores. The increase and the decrease would appear roughly to cancel each other, leaving overall score levels essentially unchanged. Which is to say that students may be different from what they were 20 years ago, but, as they are measured on standardized entrance examinations that enjoy high usage, they are no worse - or, let us say, they are just as bad. Therefore, the declining scores scenario does not account for the tests' mismeasurement of today's student population.
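
The comparison can be checked with a few lines of arithmetic. The sketch below uses the four means from the table and assumes that percentage change is taken relative to the 1970 mean, which is one reasonable convention but is not stated in the source. The function names are illustrative only.

```python
# Mean composite scores as given in the table (World Almanac, 1991).
SCORES = {
    "ACT": {1970: 19.9, 1990: 20.6},
    "SAT": {1970: 474, 1990: 450},
}


def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100


def summarize(scores):
    """Map each test to (point change, percent change), rounded to 1 decimal."""
    summary = {}
    for test, by_year in scores.items():
        old, new = by_year[1970], by_year[1990]
        summary[test] = (round(new - old, 1), round(percent_change(old, new), 1))
    return summary


if __name__ == "__main__":
    for test, (points, pct) in summarize(SCORES).items():
        print(f"{test}: {points:+} points ({pct:+}%)")
```

Run as a script, this prints an ACT change of +0.7 points (+3.5%) and an SAT change of -24 points (-5.1%). On this convention the two movements are of similar size and opposite sign.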

Whatever the historical circumstances that have occasioned or sustained the sad condition of postsecondary assessment and placement, there is wide agreement that changes need to be made. Students have voiced their protests and have even initiated organizations designed either to facilitate reform or abolish the entire process. At the federal level, regulatory agencies with a firm grasp of the obvious have registered displeasure and frustration. Few academics are satisfied with their institution's assessment process. Face-to-face inquiries elicit guardedly discreet responses from professors, counselors, advisors, and administrators, most of whom confess that their institution's assessment/placement strategies are ineffective and probably unfair.

What can be done to improve the typical assessment process? There are three modest measures, based upon the best available research and experience, not really too threatening, and certainly cost-effective, that most institutions can apply on a given Monday morning to take significant steps toward a solution.

First of all, those institutions employing mandatory placement need to reconsider this policy, especially because research supports discretionary placement. According to the latest national survey (Boylan, Bliss, and Bonham, 1992), 57% of postsecondary institutions stated that their placement was mandatory as a result of assessment. However, on six success variables including persistence, success in critical classes, and cumulative GPA, students enrolling in colleges with mandatory placement policies were significantly less successful than students attending institutions allowing options. Additionally, Utterback's (1989) exhaustive review of research, together with his own well-designed study, lends credence to the position that insisting on student participation in interventions based on questionable assessment practices is not only unwarranted but untenable.

Next, most schools need to consider augmenting and enriching their assessment packages. While a majority of campuses will probably continue to employ paper-and-pencil objective basic skills testing, they might choose to include promising alternative means and measures for assessing and placing students. One example supporting such consolidation stands out in the research. In their national survey, Boylan, Bliss, and Bonham (1992) learned that 26% of institutions incorporated learning skills inventories as a component of their assessment system. On seven success variables including mean first-semester GPA, persistence and success in critical classes, and graduation rates, students enrolled in schools utilizing learning skills inventories as part of their assessment system were significantly more successful than students in schools that did not. Additionally, Bliss and Mueller (1987) learned that results on one learning skills inventory predicted first-semester GPAs at an unprecedented .79, a correlation high enough to engage our actuarial and statistical attention and encourage implementation.

Finally, consider adopting a computer-adaptive test to replace the paper-and-pencil basic skills instrument probably now in place on most campuses. Why? First of all, computer-adaptive basic skills testing addresses the nine most common researcher objections mentioned in the seventh paragraph of this article. It manages to avoid most if not all of the negative features deliberated by objective basic skills testing critics. Next, because its format and presentation are based on item response theory, the instrument presents a student with items of optimum challenge rather than displaying an entire spectrum of item difficulty that either encourages guessmanship or occasions boredom. Finally, since test items measure research-based proficiencies typically required of students engaging in the college experience (College Board, 1983), test results need not be revealed in terms of points or percentiles but can be reported in criterion terms as levels of proficiency and performance. Although only one computer-adaptive basic skills instrument is presently available (College Board), another is being prepared for marketing in the near future (American College Testing).
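
To make "items of optimum challenge" concrete, here is a minimal sketch of adaptive item selection. It is not the College Board instrument's algorithm, which the source does not describe. It assumes a Rasch (one-parameter logistic) model, a hypothetical item bank described only by difficulty values, and a simple halving-step update of the ability estimate in place of the maximum-likelihood estimation that operational tests generally use. All names are illustrative.

```python
import math


def p_correct(ability, difficulty):
    """Rasch (one-parameter logistic) probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def next_item(ability, remaining):
    """Pick the unused item whose difficulty is closest to the ability estimate.

    Under the Rasch model that item has a success probability nearest 0.5,
    which is where a single response is most informative.
    """
    return min(remaining, key=lambda d: abs(d - ability))


def run_adaptive_test(answer, bank, n_items=5, start=0.0, step=1.0):
    """Administer up to n_items adaptively; return (items asked, final estimate).

    `answer(difficulty)` returns True for a correct response. The estimate
    moves up after a correct answer and down after an error, and the step
    halves each time (a simple staircase, assumed here for illustration).
    """
    ability, remaining, asked = start, list(bank), []
    for _ in range(min(n_items, len(remaining))):
        item = next_item(ability, remaining)
        remaining.remove(item)
        asked.append(item)
        ability += step if answer(item) else -step
        step /= 2
    return asked, ability
```

A caller supplies the bank and a response function, for example `run_adaptive_test(lambda d: d <= 1.2, [x / 2 for x in range(-6, 7)])` for a simulated student who answers every item up to difficulty 1.2 correctly. The test taker sees only a handful of items clustered near the current estimate instead of the whole difficulty range, which is the property the paragraph above credits with discouraging both guessing and boredom.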

Of course, a problem that has been decades in the making is not likely to be remedied with dispatch. Nor are mandated regulations liable to inspire an epiphany of collective insight in an infrastructure colonized by petty bureaucrats more interested in turf concerns than what is right for students, the course of study they face, and the faculty delivering instruction. Oxymoronically speaking, it is difficult to provoke revisionist thinking among those seeking innovation without change.

 

References

Bliss, L., and Mueller, R. (1987). Assessing study behaviors of college students: Findings of a new instrument. Journal of Developmental Education, 11(2), 14-18.

Boylan, H., Bliss, L. and Bonham, B. (1992). National study of developmental education, National Center for Developmental Education, Appalachian State University.

College Board. (1983). Academic preparation for college: What students need to know and be able to do. New York: The College Board.

Conner, J. (1989). Renee's St. George vs Binet's dragon. Journal of Developmental Education, 13(2), 28-29.

Darling-Hammond, L. & Lieberman, A. (1992). The shortcomings of standardized tests. Chronicle of Higher Education, January 29, 38, B1-B3.

Flanagan, J. (1939). A study of the effect of comprehension of varying speeds of reading. In Research in the foundations of American education (pp. 47-50). Washington, DC: American Educational Research Association.

Gabriel, D. (1989). Assessing assessment. Review of Research in Developmental Education, 6(5), 1-6.

Gates, A. I. (1921). An experimental and statistical study of reading and reading tests. Journal of Educational Psychology, 12, September, October, November 1921, 303-314, 378-391, 445-464.

Gould, S. (1981). The mismeasure of man. New York: W.W. Norton.

Institute for Education and Training (1992). National educational standards and testing: A response to the recommendations of the national council on education standards and testing. Santa Monica, CA: The Rand Corporation, 90407-2138.

Kerstiens, G. (1993). A quarter-century of student assessment in CRLA publications. Journal of College Reading and Learning, 25(2), in press.

Madaus, G. (1990). Standardized testing needs a consumer-protection agency. Chronicle of Higher Education, September 5, A-52.

Preston, R. & Botel, M. (1951). Reading comprehension under timed and untimed conditions. School & Society, 74, 71.

Rankin, E. (1962). The relationship between reading rate and comprehension. In E. Bliesmer & R. Staiger (Eds.), Eleventh yearbook of the national reading conference (pp. 1-5). Boone, NC: The National Reading Conference.

Stetson, E. (1982). Reading tests don't cheat, do they? Journal of Reading, 25, 634-639.

Tillman, C. (1977). Readability and other factors in college reading tests. In D. Pearson & J. Hank (Eds.), Twenty-sixth yearbook of the national reading conference (pp. 253-258). Rochester, NY: The National Reading Conference.

Utterback, J. (1989). Closing the door: A critical review of forced placement. Journal of College Reading and Learning, 22(1), 14-22.

Weber, J. (1985). Assessment and placement: A review of the research. Community College Review, 13(3), 21-33.
