Study of test equating on the common item non-equivalent group design

Süleyman Demir; Neşe Güler

Authors

Süleyman Demir Sakarya University
Neşe Güler Sakarya University

Keywords:

Non-equivalent groups with anchor test, Tucker linear equating, Levine linear equating, frequency estimation, Braun-Holland linear equating, PISA 2009

Abstract

This research tests the statistical equivalence of different forms of a test that are administered at the same time. To this end, a common-item equating design for non-equivalent groups was used. The common-item non-equivalent groups design is used to address problems that may arise with the security and administration of tests in which different forms are administered. The data set consists of the responses given by students in Turkey's sample of the PISA 2009 administration. Data from the 761 15-year-old students who answered booklets 3 and 10 of the science literacy test were analyzed using the Tucker linear, Levine linear, frequency estimation equipercentile, and Braun-Holland linear equating methods. The weighted mean squared error (WMSE) indices obtained from the equating procedures were 0.046 for Tucker linear equating, 0.072 for Levine linear equating, 0.049 for frequency estimation equipercentile equating, and 0.034 for Braun-Holland linear equating. Based on the WMSE indices, the Braun-Holland linear equating method, which had the smallest error, was the most appropriate for equating booklets 3 and 10 of the PISA 2009 science subtest.
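For readers unfamiliar with the methods named in the abstract, the following is a minimal Python sketch of Tucker linear equating for the common-item non-equivalent groups design, written from the formulas given in standard test-equating texts. It is not the authors' analysis code. The function name `tucker_linear`, its argument names, and the default choice of synthetic-population weights (proportional to sample size) are assumptions made for illustration.

```python
from statistics import fmean, pvariance


def _cov(a, b):
    """Population covariance of two equal-length score lists."""
    ma, mb = fmean(a), fmean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def tucker_linear(x1, v1, y2, v2, w1=None):
    """Return a function mapping Form X scores onto the Form Y scale.

    x1, v1: total and anchor (common-item) scores of group 1, who took Form X.
    y2, v2: total and anchor (common-item) scores of group 2, who took Form Y.
    w1: synthetic-population weight for group 1; w2 = 1 - w1.
        Assumed default: proportional to sample size.
    """
    if w1 is None:
        w1 = len(x1) / (len(x1) + len(y2))
    w2 = 1.0 - w1

    # Regression slopes of total score on anchor score within each group.
    g1 = _cov(x1, v1) / pvariance(v1)
    g2 = _cov(y2, v2) / pvariance(v2)

    # Group differences on the anchor test.
    d_mean = fmean(v1) - fmean(v2)
    d_var = pvariance(v1) - pvariance(v2)

    # Synthetic-population means and variances for both forms.
    mu_x = fmean(x1) - w2 * g1 * d_mean
    mu_y = fmean(y2) + w1 * g2 * d_mean
    var_x = pvariance(x1) - w2 * g1**2 * d_var + w1 * w2 * g1**2 * d_mean**2
    var_y = pvariance(y2) + w1 * g2**2 * d_var + w1 * w2 * g2**2 * d_mean**2

    # Linear equating: match means and standard deviations.
    slope = (var_y / var_x) ** 0.5
    return lambda x: slope * (x - mu_x) + mu_y
```

Calling `tucker_linear(x1, v1, y2, v2)` returns a function that converts a Form X raw score to its Form Y equivalent. When the two groups perform identically on the anchor items, the adjustment terms vanish and the method reduces to ordinary mean-and-standard-deviation linear equating.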



Author Biographies

Süleyman Demir, Sakarya University

Research Assistant, Sakarya University, Faculty of Education, Department of Educational Measurement and Evaluation.

Neşe Güler, Sakarya University

Associate Prof. Dr., Sakarya University, Faculty of Education, Department of Educational Measurement and Evaluation.

