NAPLAN Review
Final Report
August 2020
Barry McGaw
William Louden
Claire Wyatt-Smith
Copyright
© State of New South Wales (Department of Education), State of Queensland
(Department of Education), State of Victoria (Department of Education and Training),
and Australian Capital Territory, 2020.
Subject to the exceptions listed below, the material available in this publication is owned
by the State of NSW, State of Queensland, State of Victoria and Australian Capital Territory
and is protected by Crown Copyright in each state/territory. It is licensed under the
Creative Commons Attribution 4.0 International Licence. The legal code for the licence
is available here.
Attribution
NAPLAN Review Final Report
Emeritus Professor Barry McGaw AO, Emeritus Professor William Louden AM
and Professor Claire Wyatt-Smith.
© State of New South Wales (Department of Education), State of Queensland
(Department of Education), State of Victoria (Department of Education and Training),
and Australian Capital Territory, 2020.
Exceptions
The Creative Commons licence does not apply to:
1. The logos of any of the copyright owners or their departments
2. The Coat of Arms of Australia or a State or Territory of Australia
3. Material owned by third parties that has been reproduced with permission.
Permission will need to be obtained from third parties to re-use their material.
If you have questions about the copyright in the content of this website, please contact
the NSW Department of Education on 1300 679 332 or email DoEinfo@det.nsw.edu.au.
ISBN
978-0-6480638-1-0
Acknowledgment of Country
We acknowledge the homelands of all Aboriginal people and pay our respect to Country.
The Hon. Sarah Mitchell
Minister for Education &
Early Childhood Learning
(NSW)
The Hon. James Merlino
Minister for Education
(Vic)
The Hon. Grace Grace
Minister for Education
(Qld)
Ms Yvette Berry
Minister for Education
& Early Childhood
Development (ACT)
Dear Ministers
In September 2019, you commissioned us to review the National Assessment Program:
Literacy and Numeracy (NAPLAN). It was an honour and a pleasure to undertake this
important work.
NAPLAN has been in place since 2008 and in this time, as you well know, it has received
mixed reactions from stakeholders. You asked us to take account of the changing local
and international education landscapes and to consider the extent to which NAPLAN
remains fit-for-purpose.
As you directed in our terms of reference, we have identified what a standardised
assessment regime in Australian schools should deliver, determined how well NAPLAN
achieves this, and identified short- and long-term improvements that can be made.
We consulted widely, despite interruptions from the COVID-19 pandemic, meeting with
175 stakeholders in 91 meetings and receiving 301 responses to an online survey and an
additional 31 written submissions. We formed a Practitioners’ Reference Group made
up of teachers, principals and a union representative for more extended discussion
on a number of important areas. We also collaborated with numerous international
colleagues to investigate practices in several high-performing countries. We are grateful
for the time each of these contributors so generously gave and we have acknowledged
them in our report.
We have concluded that standardised assessment is important in Australian education
and that it serves a variety of purposes. We have recommended the retention of some
important features of NAPLAN but have also recommended some important changes to
content, psychometric properties, and the timing of the assessments both within the year
and across the Years of schooling.
We commend our final report to you and trust that it will provide a useful platform
for a revitalised Australian National Standardised Assessment.
Yours sincerely
Emeritus Professor
Barry McGaw AO
Chair
Emeritus Professor
William Louden AM
Professor
Claire Wyatt-Smith
14 August 2020
Contents
Copyright
Attribution
Exceptions
ISBN
Acknowledgment of Country
Executive Summary
Introduction
Nature of standardised assessments
Standardised assessment in Australia
Concerns about publication of school results
No common assessment practices in high-performing countries
Linking standardised assessments to the curriculum
Moving NAPLAN Online with branching tests
Problems with NAPLAN writing test
Participation rates in NAPLAN
Census or sample assessment
Recommendations for change
Preface
Context
Task and timeframe of the review
Summary of key dates
Process
Acknowledgements
Chapter 1: Purposes of standardised assessment
Standardised assessment
Large-scale standardised assessment in Australian schools
Purposes of national standardised testing
Summary
Current purposes of the national standardised testing program
Stakeholder perspectives on the purposes of national standardised assessment
Chapter 2: National and international measures of achievement
NAPLAN achievement and improvement, 2008 to 2019
Reading
Writing
Spelling
Grammar and punctuation
Numeracy
Patterns of change across the NAPLAN test domains
NAPLAN and international surveys of student achievement
PIRLS and NAPLAN reading
TIMSS and NAPLAN numeracy
PISA reading literacy and NAPLAN reading
PISA mathematical literacy and NAPLAN numeracy
The national and international standardised testing programs compared
Stakeholder views on national and international standardised testing
Chapter 3: Other national educational assessment practices
Country assessment policies and practices
Singapore
Japan
Canada – Ontario
England
Scotland
New Zealand
Finland
Potential relevance for Australia
Chapter 4: Quality of NAPLAN digital tests
Content of tests
Paper tests
Online tests
Item selection
Links to the Australian Curriculum
Psychometric properties of the tests
Effect of branching within online tests
Scaling of results over year levels and time
Confidence in measurement
Establishing benchmarks
Inclusiveness of the tests
Students with disability
Aboriginal and Torres Strait Islander students
Cultural and language diversity
Similarity to other tests used in schools
Summary
Chapter 5: Quality of the NAPLAN writing test
NAPLAN writing test
Critique of the NAPLAN writing test
Formulaic teaching of writing and teaching writing as formulaic
Factors internal and external to the test
The writing test and alignment to the Australian Curriculum
Sex
Geographic location
High performance in writing at Year 9
Critique of the writing test
NAPLAN writing test mode
Summary
Chapter 6. Uses of NAPLAN
National uses
School systems/sectors
Schools
Individual teachers
Family and community
Summary
Chapter 7: Recommendations
National standardised assessment
Purposes of national standardised assessment
Features of an assessment system
Role of NAPLAN in meeting national purposes
Changes to the NAPLAN tests
Curriculum coverage
Frequency and timing of tests
Rebranding the program
Redeveloping the online branching tests
Redeveloping the writing test
Starting a new time series
Reporting
Monitoring trends
Reporting to schools, parents/carers and the community
Ongoing evaluation
Links to terms of reference and proposed timeline
Appendix 1: Summary of recommendations
Appendix 2: Review of NAPLAN terms of reference
Background
Terms of reference
Other relevant work that the review will need to consider
Review process
Review outputs
Appendix 3: List of stakeholders consulted
Stakeholder consultations
Appendix 4: International practice in standardised writing assessment
International testing of writing
Summary
References
List of tables
Table 1: Census and sample assessment and the purposes of national standardised assessment
Table 2: Differences in achievements of students in reading, 2008 to 2019
Table 3: Differences in achievements of students in writing, 2011 to 2019
Table 4: Differences in achievements of students in spelling, 2008 to 2019
Table 5: Differences in achievements of students in grammar and punctuation, 2008 to 2019
Table 6: Differences in achievements of students in numeracy, 2008 to 2019
Table 7: Differences in achievement, base year to 2019, by test domain, Australia
Table 8: Differences in achievement, base year to 2019, by test domain, Western Australia
Table 9: Differences in achievement, NAPLAN, PIRLS, TIMSS and PISA
Table 10: Position of comparison countries in relation to Australia in PISA 2018
Table 11: Nature of assessments in other countries
Table 12: Structure of NAPLAN paper tests, 2019
Table 13: Structure of NAPLAN Online tests, 2019
Table 14: 2019 scales for which distributions were adjusted to match those for 2017
Table 15: Confidence ranges (95%) for the 2019 NAPLAN Year 5 numeracy scores
Table 16: Percentages of Australian students below minimum standards benchmarks
Table 17: Percentages of non-participating students in NAPLAN 2017 tests
Table 18: Participation rate (%) in NAPLAN in 2017
Table 19: Percentage of male and female students below National Minimum Standard in writing
Table 20: Percentage of students by location below National Minimum Standard in writing
Table 21: Descriptions of performance bands on the writing scale
Table 22: Percentages of Year 9 students in top two bands in NAPLAN writing
Table 23: Census and sample assessment and the purposes of national standardised assessment
Table 24: Terms of Reference, recommendations and timeline
Table 25: Model for assessment of writing in multiple languages
Table 26: International large-scale assessment of writing
List of figures
Figure 1: NAPLAN Review summary of key dates
Figure 2: Branching structure of NAPLAN Online literacy and numeracy tests
Figure 3: Branching structure of NAPLAN Online grammar and punctuation test
Figure 4: Proportions of students taking each path – Year 3 numeracy, 2019
Figure 5: Distributions of student achievement by pathway – Year 3 numeracy, 2019
Figure 6: Extent of uncertainty in student NAPLAN results with print and branching digital tests, 2018
Figure 7: NAPLAN assessment scale
Figure 8: My School comparison with students with same starting score and similar background
Figure 9: My School: Selected school compared with students with and similar background
Figure 10: Relationship between schools’ socio-educational advantage and NAPLAN results
Figure 11: Trends in mean performances on the NAPLAN Writing test
Figure 12: Categories of validity evidence in large-scale assessment of writing
Executive Summary
Introduction
The NAPLAN Review has been
commissioned to determine what the
objectives of national standardised
assessment should be, to advise on how well
the National Assessment Program: Literacy
and Numeracy (NAPLAN) meets these
objectives and how NAPLAN compares
with national assessment programs in
other countries, and to identify short- and
longer-term improvements in the national
standardised assessment of literacy and
numeracy. NAPLAN was built on almost
two decades of similar testing at a state and
territory level and was introduced in 2008.
NAPLAN has evolved during the period
since its introduction but it is timely to
take stock and identify whether and how
it might now be changed.
This report begins with a discussion of
the history and purposes of standardised
testing in Australia (Chapter 1). It then considers trends
in achievement revealed by NAPLAN and
international studies in which Australia
participates (Chapter 2) and other national
assessment policies and practices
(Chapter 3). The strengths and weaknesses
of the current NAPLAN assessments are
examined in reading, language conventions
and numeracy (Chapter 4) and writing
(Chapter 5). Chapter 6 discusses the uses
made of NAPLAN by systems/sectors,
schools, teachers and parents. Chapter 7
proposes a series of short- and longer-term improvements to Australia’s national
standardised assessment program.
Nature of standardised
assessments
Standardised assessments provide
common test-taking conditions, questions,
time to respond, scoring procedures and
interpretations of the results. They may
test knowledge, skills, attributes or values.
They may use multiple choice, short answer
or extended written responses. They may
be administered to whole populations, to
samples from a population or to individuals.
They may be marked electronically or
by human assessors. Results may be
reported in terms of standards achieved
or in comparison with the achievements
of a wider population. Results may be
used for summative, formative, diagnostic
or predictive purposes. There is a long
history of standardised testing in Australia,
beginning with junior and senior secondary
school examinations conducted by
universities and state-wide examinations
at the end of primary school. There is
also a range of commercially available
standardised tests that schools and teachers
use to measure students’ achievements
and monitor their progress.
Standardised assessment
in Australia
The abolition of external examinations
before the end of secondary education
meant that parents had no standardised
assessments and reports on children’s
progress from a broader perspective than
that of their children’s own school. Schools
could still use standardised tests and their
published norms but that did not provide
parents with the kind of comparisons
across the age group that the external
examinations provided. In that vacuum, all
states and territories over a period from the
late 1980s to the mid-1990s introduced new
tests that were administered to all children
in several year levels in their movement
through school. These census tests were
limited to literacy and numeracy though
some states and territories also conducted
sample surveys of students’ achievement
and progress in other curriculum areas.
Once all states and territories were
assessing all students in literacy and
numeracy, the ministerial council sought
a national perspective from the different
jurisdictions’ results. In 2007, they resolved
to use common tests, selected the name
National Assessment Program: Literacy and
Numeracy (NAPLAN) and introduced the
new tests in 2008.
Since then, five purposes for national
standardised testing have been endorsed
by the ministerial council:
• Monitoring national, state and territory
programs and policies.
• System accountability and performance.
• School improvement.
• Individual student learning
achievement and growth.
• Information for parents on school
and student performance.
NAPLAN has been a useful barometer
with which to examine trends in students’
achievements over time. In the period
2008 to 2019, national NAPLAN results
have revealed improvement in reading
and numeracy in primary schools but not
in secondary schools, static performance
in writing in Years 3 and 5 and a decline
in Years 7 and 9. They have also revealed
different patterns among the states and
territories. Queensland and Western
Australia have improved more than the
others, but they started behind the ACT,
NSW and Victoria and have not surpassed
them. At the same time as testing all
students through NAPLAN, Australia has
participated in a number of international
sample surveys of students’ achievements
in literacy and numeracy all of which,
while showing improvement in some year
levels in some domains, have generally
produced declining results and shown a
larger proportion of Australian students
than in NAPLAN to be below the levels that
each assessment program sets as defining
minimum competence. The international
surveys have also reported a declining
proportion of high performing students.
Concerns about publication
of school results
There was some resistance among teacher
organisations to the initial introduction
of state and territory census assessments
of students in literacy and numeracy but
that had largely dissipated until, in 2010,
the My School website was introduced
and provided public reporting on schools’
NAPLAN results. While the website provided
only comparisons among schools with
students from similar levels of socio-educational advantage, many newspapers
retrieved data to create raw league tables
that ranked schools without consideration
of differences in context. It also meant
that schools were being compared and
publicly ranked on only the narrow criteria of
students’ literacy and numeracy. That has led
many in public debate and in submissions
to this and earlier reviews to campaign for
NAPLAN to assess only samples of students,
sufficient to monitor national, state and
territory and other trends in students’
achievement levels but not to enable inter-school comparisons to be made.
No common assessment
practices in high-performing
countries
A review of practices in high-performing
countries revealed no common patterns in
assessment policies and practices. In Finland
and New Zealand there are only sample
assessments to monitor the education
system, with teachers’ professional
judgements being the basis of reporting
to parents and students. In Scotland,
new census assessments were introduced
in 2017 to 2018, though schools have the
right to opt out. The student participation
rates were 95% in 2017 to 2018 and 93.4%
in 2018 to 2019. These participation rates
match those achieved with NAPLAN census
tests in Australia. Results are reported to
schools, with student reports similar to
those provided to Australian parents from
NAPLAN, but in Scotland they are intended
to be used only by schools and teachers
as one piece of evidence contributing to
reports to parents/carers, students and
local education authorities.
Ontario, the most populous Canadian
province, conducts census assessments
in literacy and numeracy with reports to
parents and public reporting of schools’
results. In England, there are census
assessments of literacy and numeracy in
primary and lower secondary education.
Reporting is focused, as it has been on My School in
Australia since 2019, on growth rather than current
achievement levels. In mid-secondary
school (Years 10 to 11, aged 16), England
has retained subject-based external
examinations through its General Certificate
of Education Ordinary-Level. Japan has
census assessments in Grade 6 and Grade
9 in Japanese, mathematics and science,
with English added in 2019. At the end
of lower secondary education (Grade 9)
there are extremely competitive entrance
examinations for senior high schools.
In Singapore, there are no census
assessments in general domains like
literacy and numeracy. There are subject-based examinations: the Primary School
Leaving Examination at the end of primary
school, which also serves as a selection test
for secondary school, and the Singapore-Cambridge General Certificate of Education
examinations in Year 11 prior to end-of-secondary examinations in Year 13.
Linking standardised
assessments to the
curriculum
When NAPLAN tests were first developed,
there was no common curriculum in
Australia, so they were based on national
Statements of Learning for English and
national Statements of Learning for
Mathematics. From 2017, the NAPLAN
tests have been based on the Australian
Curriculum, but on the literacy and
numeracy continua since both are expected
to be developed across the subject
curricula and not exclusively in English and
mathematics. That makes less difference in
numeracy/mathematics where there is more
overlap than in literacy/English but, in both
cases, it is not clear where ownership of both
literacy and numeracy lies in a secondary
school where teaching and learning are
subject-based.
Moving NAPLAN Online with
branching tests
Since 2018, schools have increasingly taken
the NAPLAN tests online. These online
tests in numeracy and literacy are adaptive.
Students are marked by the computer as
they answer questions and branched one
third and two thirds of the way through
the tests to more or less complicated
questions. This branching is determined
based on their achievement on the test
to the point of branching. That presents
students with questions better targeted
to their achievement levels. It has resulted
in better assessment over the full range of
achievement levels among students and less
uncertainty in the measurement throughout
the range. The level of uncertainty, as with
all educational measurement, depends
on how much data lies behind a result:
national means have the least uncertainty
and individual students’ results the most.
The means for large schools have less
uncertainty than those for small schools.
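The branching and uncertainty ideas described above can be illustrated with a minimal sketch. It is not the NAPLAN Online algorithm: the function names, the cut-offs of 40% and 70% correct, and the use of the simple standard error of a mean (standard deviation divided by the square root of the number of students) are all assumptions made for illustration only.

```python
from math import sqrt


def next_branch(correct, answered, low_cut=0.4, high_cut=0.7):
    """Choose the next block of questions from the proportion correct so far.

    The cut-offs are hypothetical; the report does not state the rules used.
    """
    if answered <= 0:
        raise ValueError("answered must be positive")
    proportion = correct / answered
    if proportion >= high_cut:
        return "harder"
    if proportion < low_cut:
        return "easier"
    return "standard"


def route(block_scores):
    """Return the branch taken after each of the first two blocks.

    block_scores is a list of (correct, answered) pairs, one per block.
    Each decision uses achievement on the test up to the branching point.
    """
    decisions = []
    total_correct = 0
    total_answered = 0
    for correct, answered in block_scores:
        total_correct += correct
        total_answered += answered
        decisions.append(next_branch(total_correct, total_answered))
    return decisions


def standard_error_of_mean(sd, n):
    """Standard error of a mean under simple random sampling (an assumption)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / sqrt(n)


if __name__ == "__main__":
    # A student who starts strongly and then slows down.
    print(route([(9, 10), (4, 10)]))
    # The same spread of scores gives a far less certain mean for a small
    # school (25 students) than for a large one (400 students).
    print(standard_error_of_mean(70, 25), standard_error_of_mean(70, 400))
```

With these assumed cut-offs, the student in the example is sent to a harder block after the first third and to a standard block after the second third. The second function shows why a mean based on more students carries less uncertainty than one based on few.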
Problems with NAPLAN
writing test
The NAPLAN writing test is the most
problematic. The restriction of the writing
genres to narrative and persuasive, with the
specific genre being announced in advance
in the early years of NAPLAN, has led to very
formulaic writing in students’ responses to
the prompt and, as a further unintended
consequence, to very formulaic teaching
of writing in some schools as they seek to
prepare students for the NAPLAN writing
test. The marking criteria also need to be
reviewed. The language conventions test
in spelling has a reliability similar to those
achieved in the reading and numeracy
tests but the grammar and punctuation
test has a markedly lower reliability. A more
serious consideration is whether it is better
to assess these language conventions with
decontextualised test questions or to assess
them through students’ use of them in their
writing. To achieve all of these changes,
it is recommended that the writing test
be withdrawn from census testing and
conducted as a sample survey during a
period of experimental redevelopment.
As the NAPLAN tests are moved online,
particularly the writing test with its extended
response, it will be essential that students
develop fluency in the use of keyboards
and word processors, at least from Year 5,
to enable them to concentrate on the
substance of their writing. For Year 3,
students’ handwritten responses would
be more appropriate.
Participation rates in
NAPLAN
There are explicit provisions for some
students not to participate in NAPLAN
testing. Students can be excluded if
their language background is other than
English and they have been in Australia
for less than a year or they have significant
disabilities. In their submissions to the
review, parents of children with learning
difficulties complained that some schools
encouraged their children to be withdrawn.
Their preference was to obtain the external
measure of achievement that NAPLAN provides,
which would enable them to see where
their children stood in their age-group.
Parents can apply to have their children
withdrawn on the basis of religious beliefs or
philosophical objections to testing. Beyond
those reasons, there is generally a greater,
though still small, percentage of the age
group that is simply absent on the day of
testing. It would be good for jurisdictions to
investigate students’ reasons for absence
and seek to reduce the current levels.
Census or sample
assessment
Many submissions to the review declared
that NAPLAN had been introduced for the
specific, narrow purpose of monitoring
the overall education system but all five of
the purposes of standardised assessment
nominated above can be found in
declarations of the ministerial council
about the purposes and uses of NAPLAN
results. These purposes are also reflected in
current practices of government education
departments and some, though not all,
schools. Table 1 shows what can be achieved
with census and sample testing.
Table 1: Census and sample assessment and the purposes of national standardised assessment
Census | Purpose of national standardised assessment | Sample
Monitoring progress towards national goals
• National, jurisdictional and system estimates of achievement
• Relative performance by gender, geographic location of schools,
socioeconomic background and Aboriginal and Torres Strait Islander
background
School system accountability and performance
• Accountability for system performance
• Accountability for school performance
School improvement
• School-level information on achievement and growth
by assessment domain
• School-level targets informed by system comparative data
Individual student learning achievement and growth
• Student level achievement estimates for comparative purposes
(cohort, test domain, gain, equity groups)
• Student level achievement estimates for diagnostic purposes
Information for parents on school and student performance
• Individual student achievement
• Relative school performance
This table makes clear which current practices could not be sustained if sample rather than
census assessment were used.
Recommendations for
change
The development of the full set of
recommendations is described in Chapter
7 and they are listed in Appendix 1. Only
their main features are included in this
Executive Summary.
It is recommended that NAPLAN remain as
a census test of students’ achievement but
that the tests be taken by students in Years 3,
5, 7 and 10 not Years 3, 5, 7 and 9. Students’
achievement levels and the absenteeism
rate at Year 9 reveal a relatively low level
of student engagement with NAPLAN
compared with Years 3, 5 and 7. In Year
10 students are more mature and, more
importantly, reaching the stage at which
important choices are to be made about
their studies in the upper secondary years.
Having a NAPLAN assessment in Year
10 would provide some data to inform
discussions between the students and
their teachers.
It is also recommended that the tests be
administered as early as feasible in the
school year, and not in May as at present,
to give NAPLAN a more clearly formative role
as a measure of students’ starting points
for the year.
It is recommended that the writing test be
redeveloped with richer prompts, removal of
restrictions on the genre in which students
write, assessment of language conventions
of spelling, grammar and punctuation in
the students’ writing not in stand-alone
tests, and inclusion of teacher judgements
as a component in the marking. To achieve
this substantial change, it is recommended
that the writing test be withdrawn from
the census assessment while experimental
redevelopment is undertaken and be replaced
by an assessment of a national sample of schools and
students. Once the new form has been
established, it is recommended that census
assessment of writing be resumed.
It is recommended that the scope of the
national standardised assessment be
broadened beyond its current limitation to
literacy and numeracy. As noted in Chapter
3, a number of countries include science in
their census assessments and, as noted in
Chapter 2, Australia’s performance in science
in the international surveys has declined in
recent years. In Australia, there is a strong
focus in current discussions of curriculum
on STEM (science, technology, engineering
and mathematics). In the Australian
Curriculum, this is best represented by
the subjects: science, digital technologies
and mathematics. There is also attention
being given to the general capabilities
in the Australian Curriculum. These are
not generic capabilities devoid of subject
content. Indeed, they take different forms
in different domains. Critical and creative
thinking in science, for example, is not the
same as critical and creative thinking in
history. It is recommended, therefore, that
a new test of critical and creative thinking
in STEM be introduced at Years 5, 7 and
10. It is recommended that the current
triennial sample survey of science literacy
in Years 6 and 10 be withdrawn and that
consideration be given to its replacement
in the triennial cycle by another that covers
both a subject and a general capability from
the Australian Curriculum, such as history
and intercultural understanding.
Once NAPLAN Online is fully implemented,
with the digital tests freed from the
restriction of mirroring print forms
and capitalising on the flexibility and
creativity available in the digital form, it is
recommended that a new time series be
commenced without reference back to the
2008 NAPLAN scales. For all but the writing
test, the full move to online delivery should
expedite the return of results to school
within days of the testing. Results of the
writing test would take longer because of
the need for manual marking.
It is recommended that Ministers emphasise
the significance of changes that they
introduced to My School in 2019 to remove
inter-school comparisons of the levels of
students’ achievements and to focus on
students’ growth in comparison with other
students at the same starting point.
At present the National Assessment
Program (NAP) is an umbrella title for both
NAPLAN’s census tests of students’ literacy
and numeracy and sample surveys in some
other domains. To distinguish them more
clearly and to recognise that the census tests
are proposed to move beyond literacy and
numeracy, it is recommended that programs
be rebranded, with new names adopted
for each program: the Australian National
Standardised Assessments (ANSA) instead of
NAPLAN and National Sample Assessment
Program (NSAP) instead of NAP.
Preface
Context
The National Assessment Program – Literacy
and Numeracy (NAPLAN) is an annual,
point-in-time assessment undertaken by
Australian students in Years 3, 5, 7 and 9. The
first NAPLAN tests were conducted in 2008.
The Australian Curriculum, Assessment
and Reporting Authority (ACARA) is
responsible for the development and
central management of the tests.
Test administration authorities in each
state and territory are responsible for
the administration of the tests in their
jurisdiction. All states and territories
administer the tests in compliance with
nationally agreed protocols and are
also responsible for marking the tests
in accordance with strict guidelines
and processes.
NAPLAN tests four areas (‘domains’) —
reading, writing, language conventions
(spelling, grammar and punctuation) and
numeracy. The tests are scheduled in the
second full week of May. In the past, the
tests have been conducted in pen and paper
format. Schools are currently transitioning
to online testing, with more than 50% of
schools across Australia participating in
NAPLAN Online in 2019 (ACARA, 2020k).
Full transition of schools to NAPLAN Online
is expected by 2022.
Results are provided to schools between
August and September. Every student
who participates in the tests receives an
Individual Student Report of their results.
Since 2010, NAPLAN performance of schools
in Australia has been reported on the My School
website. This includes performance compared
with similar students and
national results. In addition, the website
displays schools’ historical performance
and also provides some demographic and
financial information for each school.
Task and timeframe of
the review
On 12 September 2019, the NSW Minister
for Education and Early Childhood Learning
announced that the NSW, Victorian,
Queensland and ACT governments would
sponsor a review of NAPLAN. Emeritus
Professor Barry McGaw, Emeritus Professor
William Louden and Professor Claire Wyatt-Smith were appointed to the NAPLAN
Review’s panel as independent curriculum
and assessment experts.
The panel was asked to identify what a
standardised testing regime in Australian
schools should deliver, assess how well
NAPLAN achieves this, and identify short- and longer-term improvements.
The full terms of reference are set out
in Appendix 2.
Summary of key dates
Figure 1: NAPLAN Review summary of key dates
Process
The review was conducted in two stages.
As part of the stage one consultation
process, panel members met with
56 individuals over 32 meetings. Meetings
were held from Tuesday 22 October to Friday
25 October 2019 in Brisbane, Melbourne,
Sydney and Canberra. The panel also drew
from other recently completed work,
including the 2018 Queensland NAPLAN
Review and the 2019 NAPLAN Reporting
Review. An interim report setting out
some of the major concerns expressed
by stakeholders and some preliminary
discussion on strategies to deal with
these issues was publicly released on
6 December 2019.
In the second stage of the review, the
panel further analysed the issues discussed
in the interim report and broadened its
consultation to hear directly from a greater
proportion of stakeholders and experts.
This final report discusses the themes
raised in consultations and identifies the
challenges NAPLAN has faced and potential
opportunities moving forward. It also
offers a strategic blueprint for the future of
standardised assessment in Australia.
Stage two stakeholder consultations were
held from 24 March to 27 May 2020. Face-to-face meetings were scheduled in each state
but due to the COVID-19 pandemic, these
were shifted to a web conference platform.
The panel members met with 160 individuals
across 53 meetings.
Stakeholders consulted in stages one
and two of the review included teachers,
principals, parents and carers, students,
school systems/sectors, unions, accreditation
authorities, teacher subject associations,
teacher professional associations, principals’
associations, Aboriginal and Torres Strait
Islander representative groups, disability and
inclusion representative groups, ministers,
key assessment bodies, educational
organisations, academics, and national
and international educational experts.
A full list of stakeholders is available at
Appendix 3. Stakeholders consulted as part
of both stages one and two of the review
have only been listed once.
A call for public submissions for stage two of
the review was made in late February 2020.
Stakeholders were invited to complete an
online survey via targeted invitation as well
as via newspaper, web and social media
channels, to ensure all interested parties
had an opportunity to contribute. The
survey consisted of seven topical questions,
as well as some demographic questions.
There were 301 responses to the survey from
all jurisdictions except Tasmania and the
Northern Territory. In addition, the panel
received 31 written submissions made by
individuals or organisations via email.
A Practitioners’ Reference Group was
established with 17 teacher/principal
representatives from all systems/sectors
in each of the four participating states
and territories. The panel conducted six
meetings with the reference group (two
whole group meetings and one meeting
with members in each jurisdiction). The
group provided the panel with an in-depth
practitioner perspective on a range of
issues, including test administration, timing
and student engagement, data use and
classroom practice.
Contributions from the Practitioners’
Reference Group, responses to the online
survey, written submissions and web
conference meetings assisted the panel
significantly in their consideration of the
challenges and opportunities presented
by NAPLAN. Quotes and exploration of the
review’s consultation have been referenced
as part of the panel’s analysis of issues
throughout this final report.
Dr Sue Thomson from the Australian Council
for Educational Research was commissioned
to prepare a comparative report on the key
differences between what is assessed in
international assessments – Programme for
International Student Assessment (PISA),
Trends in International Mathematics and
Science Study (TIMSS) and Progress in
International Reading Literacy Study (PIRLS) –
and any apparent consequential differences
in outcomes. The panel have used this
information in their final report.
Acknowledgements
We wish to thank all those who completed
our online survey, prepared submissions
for our consideration, and those who met
with us face-to-face before the COVID-19
restrictions and in web-based meetings
after them.
We are particularly grateful to those who
provided statistical advice and, in some
cases, undertook statistical analyses for us:
Dr Raymond Adams, Dr Eveline Gebhardt,
Dr Goran Lazendić and Dr Lucy Lu.
We thank colleagues in the Queensland
Department of Education who prepared
background information on the assessment
systems in other countries and the following
colleagues who provided advice and
reviewed our draft materials – Charles Darr,
Associate Professor Christopher DeLuca,
Dr Jenny Donovan, Professor Nikolaj Elf,
Professor Karen R. Harris, Professor Louise
Hayward, Christine Jackson, Professor Kim
Koh, Shumpei Komura, Associate Professor
Ricky Lam, Associate Professor Yuen Yi Lo,
Dr Tim Oates, Professor Judy M. Parr, Dr
Poon Chew Leng, Professor Pasi Sahlberg,
Professor Gustaf B. Skar, Dr Sue Thomson
and Peter Titmanis. Any deficiencies in
our descriptions are ours but their help
diminished the risk.
We are enormously grateful to Claire Todd,
Susanna Osborne and Carolyn Burns in
the NSW Department of Education who
provided the highly professional Secretariat
that supported and, in many skilled and
subtle ways, guided us throughout and
did everything on time and to the highest
quality. We extend that gratitude to
the Secretariat members from Victoria,
Queensland and the ACT.
Chapter 1: Purposes of
standardised assessment
This chapter begins with a summary of standardised testing in Australia and traces the
purposes of the NAPLAN national standardised testing program through the deliberations
of successive national education ministerial councils and their Hobart, Adelaide, Melbourne
and Alice Springs (Mparntwe) declarations. It concludes with a discussion of these purposes
in the context of feedback from stakeholders consulted in this review.
Key points
• There is a long history of standardised testing in Australia, beginning with junior and
senior secondary school examinations conducted by universities.
• State- and territory-based standardised literacy and numeracy tests were developed
in the 1980s and 1990s. Subsequently, efforts were made to establish comparability of
these tests using benchmarks and statistical equating.
• In 2003, national sample-based testing of science literacy, civics and citizenship, and
information and communications technology commenced for Years 6 and 10, with
one area tested each year on a triennial cycle.
• National, whole-cohort assessment of literacy and numeracy in Years 3, 5, 7 and 9
began in 2008.
• Schools use a wide variety of other opt-in or compulsory standardised assessments,
including early years assessments in every state and territory.
• Through successive ministerial council decisions and declarations, NAPLAN testing
has developed five endorsed purposes: monitoring progress towards national goals,
school system accountability and performance, school improvement, individual
student learning achievement and growth, and information for parents/carers on
school and student performance.
• Notwithstanding these officially endorsed purposes, there remain substantial
concerns among stakeholders about NAPLAN’s capacity to meet all five purposes
equally well and without conflict among the purposes.
Standardised assessment
The essential characteristic of standardised
testing is consistency. Standardised tests
provide common test-taking conditions,
questions, time to respond, scoring
procedures and interpretations of the
results. Contemporary standardised tests are
developed with careful attention to fairness,
reliability and validity, and may vary test
conditions in order to eliminate bias against
individuals or groups of candidates.
Standardised tests take many forms. They
may test knowledge, skills, attributes or
values. They may use multiple choice, short
answer or extended written responses.
They may be administered to whole
populations, to samples from a population
or to individuals. They may be marked
electronically or by human assessors.
Results may be reported in terms of
standards achieved or in comparison with
the achievements of a wider population.
Such results may be used for summative,
formative, diagnostic or predictive purposes.
Australia’s national standardised literacy
and numeracy tests – NAPLAN – are just
one of the possibilities in this universe of
standardised tests. They are whole-cohort
tests, they focus on cognitive skills in literacy
and numeracy, are marked by a combination
of digital and expert assessors, report against
national standards and are used for a variety
of summative and formative purposes.
Large-scale standardised assessment
in Australian schools
The first standardised tests to be used in
Australian school education were state-based public examinations, which have been
held for more than 150 years. The University
of Melbourne established a matriculation
examination in 1855, and the University
of Sydney set and marked its first junior
and senior secondary public examinations
in 1867. Responsibility for examinations
such as these subsequently moved from
the universities to public examinations
boards. Public examinations at the junior
secondary level were discontinued but
most jurisdictions have continued to use
large-scale standardised examinations as
a component of their senior secondary
certification.
In the absence of public examinations
until the end of secondary schooling,
during the 1980s a number of jurisdictions
introduced large-scale standardised testing
in the primary or junior secondary years.
The Victorian Achievement Studies were
conducted in 1988 and 1990. In 1989, NSW
introduced the Basic Skills Tests of literacy
and numeracy in Years 3 and 6. In 1990,
Queensland conducted the Assessment of
Student Performance in Years 5, 7 and 9;
and Western Australia began the sample-based Monitoring Standards in Education
Program in Years 3, 7 and 10. Although
these tests assessed similar skills in literacy
and numeracy, there were no common
standards across the assessments. In 1997,
national benchmarks were established using
state-based assessments and a procedure
was developed to equate state and
territory assessments and seek comparable
reporting of results. National assessment
and reporting using equated state and
territory standardised tests in Years 3 and 5
literacy began in 1998, followed by numeracy
testing in 1999.
National sample assessments began
with science literacy in 2003, civics and
citizenship in 2004 and information and
communications technology literacy in
2005. Trials of the first national literacy and
numeracy assessments using a common
test were announced in 2005. Following
evaluation of these trials, whole-cohort national
literacy and numeracy assessment using common tests
in Years 3, 5, 7 and 9 began in 2008. Under
the banner of NAPLAN, these census tests
have continued since then.
Beyond NAPLAN, standardised tests are
used widely in Australian schools. Some
standardised tests are mandated by school
systems/sectors, and others are selected
and used on an opt-in basis by schools for
individuals or groups of students. Most
public school jurisdictions, for example,
use some form of common monitoring
instrument or standardised assessment
in the early years of schooling. Students
in all NSW public schools are assessed in
the Foundation year using the Best Start
Kindergarten Assessment. Victorian public
schools assess all Foundation students
using English Online and Mathematics
Online interviews. Queensland public
schools have access to the Early Start
assessment of literacy and numeracy skills
in Years Foundation to 2. ACT public schools
use the University of Western Australia’s
BASE (formerly PIPS) assessments at the
beginning and end of the Foundation
year. Other schools and jurisdictions
use a variety of assessments on an opt-in basis in the junior primary years,
including most commonly the Australian
Council for Educational Research (ACER)
Progressive Achievement Tests in reading
and mathematics, BASE and the South
Australian Spelling Test.
Purposes of national standardised
testing
NAPLAN tests are technically similar to other
widely used standardised tests such as the
PAT-Reading and PAT-Mathematics tests,
but their principal difference is that they are
whole-cohort tests taken at the same time
across the nation. The stated purposes of
Australia’s national standardised assessment
program have developed over time and
have included monitoring progress towards
national goals, school system accountability
and performance, school improvement,
individual student learning achievement
and growth, and information for parents/
carers on school and student performance.
This section of the report follows the
development of these multiple purposes
through the Hobart, Adelaide, Melbourne
and Alice Springs (Mparntwe) education
declarations and national ministerial
council decisions.
Monitoring national, state and territory
programs and policies
In the Hobart Declaration (Australian
Education Council, 1989) Ministers agreed
to production of an annual National Report
on Schooling that would ‘monitor schools’
achievements and their progress towards
meeting the agreed national goals.’ Annual
National Reports on Schooling would ‘report
on the school curriculum, participation
and retention rates, student achievements
and the application of financial resources
in schools’. The subsequent Adelaide
Declaration (Ministerial Council on
Education, Employment, Training and
Youth Affairs (MCEETYA), 1999) committed
Ministers to report on progress towards
national goals, including ‘explicit and
defensible standards … through which
the effectiveness, efficiency and equity of
schooling can be measured and evaluated’.
These links between student achievement
and national policy were sharpened in 2005
with the ministerial council’s agreement to
common national standardised tests as ‘a
means of improving comparability of results
among states and territories’ (MCEETYA,
2005, p. 2). The links were broadened in the
Melbourne Declaration (MCEETYA, 2008a)
in which governments committed to the
availability of:
Good quality data [that] enables
governments to analyse how well
schools are performing, identify schools
with particular needs, determine
where resources are most needed to
lift attainment, identify best practice
and innovation, conduct national and
international comparisons of approaches
and performance and develop a
substantive evidence base on what
works (p. 17).
The Alice Springs Declaration affirmed
that ‘assessment results that are publicly
available at the school, sector and
jurisdiction level’ provide information that
enables ‘policy makers and governments
to make informed decisions based on the
evidence’ (Education Council, 2019a, p. 11).
The declaration went on to list a range of
policy domains in which governments rely
on ‘good quality data’. Although neither the
Melbourne nor Alice Springs declarations
specify that the source of such ‘data’ need
include whole-cohort national standardised
testing, since 2008 annual National Reports
on Schooling in Australia have updated
progress towards the Melbourne Declaration
goals using NAPLAN results.
In the decade between the Melbourne and
Alice Springs Declarations, ministerial councils
have approved a range of developments and
uses of whole-cohort test results. In 2009 the
ministerial council announced publication of
‘relevant, nationally comparable information
on all schools’ using whole-cohort NAPLAN
data, on what came to be the My School
website (MCEETYA, 2009, p. 1). Subsequent
ministerial councils have supported
NAPLAN and My School’s role in monitoring
government programs and policies. In
2015 the ministerial council reaffirmed its
commitment to My School and nationally
consistent school level information ‘for the
use of parents/carers, school communities
and governments’ (Education Council, 2015,
p. 2). In 2019, when Ministers agreed to some
changes in the representation of NAPLAN
data on My School, they noted that ‘gain
measures will tell us if students are making
the progress they should – and tell us if
Australia’s education system is on track.’
(Education Council, 2019c, p. 2).
System accountability and performance
The accountability and performance goals in
the initial Hobart Declaration were modest,
announcing a commitment to a National
Report on Schooling that would ‘increase
public awareness of the performance of
our schools as well as make schools more
accountable to the Australian people.’
The Adelaide Declaration focused on
increasing public confidence rather than
accountability, committing governments
to ‘increasing public confidence in school
education through explicit and defensible
standards that guide improvement in
students’ levels of educational achievement
and through which the effectiveness,
efficiency and equity of schooling can be
measured and evaluated.’ The Melbourne
Declaration further committed Australian
governments to strengthen accountability
and transparency, providing parents/
carers and the community with ‘access
to information about the performance
of their school compared to schools with
similar characteristics’ without resorting
to ‘simplistic league tables or rankings’ (p.
17). Similarly, the Alice Springs Declaration
committed governments to ‘strengthening
accountability and transparency with strong,
meaningful measures’. This includes:
assessment results that are publicly
available at the school, sector and
jurisdiction level to ensure accountability
and provide sufficient information to
parents, carers, families, the broader
community, researchers, policy makers
and governments to make informed
decisions based on evidence (p. 18).
Between the Melbourne and Alice Springs
Declarations, ministerial councils have
balanced the goals of accountability and
transparency against the risk of misuse
of school-level achievement data. The
2009 ministerial council communique
that announced the decision to establish
My School characterised it as ‘a major step
forward for the shared national transparency
agenda’ but added the caution that
‘Ministers agreed that these reforms were
not about simplistic league tables which
rank schools according to raw test scores’
(MCEETYA, 2009a, p. 2).
Later that year, the ministerial council
endorsed a more detailed set of principles
for reporting on schooling, affirming that
reporting should be in the public interest,
use valid, reliable and contextualised data,
be sufficiently comprehensive to enable
proper interpretation and should balance
the community’s right to know with the
need to avoid misuse of the information
(MCEETYA, 2009b, p. 4). The principles
also affirmed a variety of purposes for
achievement data. Schools need data on the
performance of their students because they
have the primary accountability for student
outcomes. Parents/carers need information
to make informed judgements, make
choices and engage with their children’s
education and the school community. More
broadly, the community needs schools to
be accountable for the results they achieve
with the public funding they receive, and
governments are accountable for the
decisions they take. Finally, the principles
for reporting affirm the need of school
systems/sectors and governments for
sound information on school performance
to support ongoing improvement for
students and schools.
Notwithstanding these agreed principles,
reporting of national standardised test
results at the school level continued to
be a matter of concern for Ministers.
In a communique following a 2011
meeting, Ministers considered a range of
improvements to the My School website
but ‘reiterated strong opposition to the
publication of league tables arising from
My School data.’ (MCEETYA, 2011, p. 2). In
2018 Ministers ‘reiterated their commitment
to standard and fair assessment supported
by transparent reporting’ (Education
Council, 2018, p. 2) and commissioned
a review of the reporting on NAPLAN
results. In 2019 Ministers noted a number of
recommendations designed to reduce the
risk of misusing NAPLAN data (Ministerial
Council, 2019b). The 2020 version of My
School had fewer NAPLAN displays and a
clearer focus on school-level estimates of
gains in achievement.
School improvement
Neither the Hobart nor the Adelaide
declarations linked achievement data
and school improvement. The Melbourne
Declaration was the first to focus on this
theme, referring to the way in which good
quality data supports each school in ‘the
design of high quality learning programs’
and ‘informs schools’ approaches to
provision of programs, school policies,
pursuit and allocation of resources’ (p. 17).
Similarly, the Alice Springs Declaration
notes that good quality data:
allows teachers to evaluate the
effectiveness of their classroom
practice and supports educators to
effectively identify learners’ progress
and growth, and design individualised
and adaptive learning programs. It also
informs programs, policies, allocation
of resources, relationships with parents
and partnerships and connections with
community and business (p. 18).
In alluding to ‘good quality data’ neither
of these Declarations specifies NAPLAN or
any other whole-cohort standardised test
data. The links between population testing,
school-level data and school improvement,
however, were addressed more specifically
in the Hon. Julia Gillard MP’s second
reading speech to Parliament supporting
the Australian Curriculum, Assessment
and Reporting Authority Bill (Australia,
Parliament, 2008).
Accurate information on how students
and schools are performing tells teachers,
principals, parents and governments what
needs to be done. This means publishing
the performance of individual schools,
along with information that puts that
data in its proper context. That context
includes information about the range of
student backgrounds served by a school
and its performance when compared
against other ‘like schools’ serving similar
student populations.
Australian governments’ linkage of
NAPLAN scores to school improvement
has developed substantially in the last
decade. Since 2010, My School has made
it possible to compare NAPLAN growth
and achievement among schools serving
similar students. In recent years school
systems/sectors have also made substantial
investments in business intelligence systems
that (among other things) help schools to
explore links between standardised test
results and school improvement.
Individual student learning achievement
and growth
The state and territory assessments that
preceded NAPLAN provided schools with
information about individual student
achievement, but these individual results
were brought to a common scale with the
introduction of NAPLAN. When announcing
the first round of NAPLAN tests, Minister
Gillard characterised these new tests as
having ‘a strong diagnostic approach’
(MCEETYA, 2008b). In similar terms, the
National Report on Schooling in Australia
2008 noted that ‘The data from NAPLAN
test results gives schools and systems a
diagnostic capacity to identify individual
student needs’ (MCEETYA, 2008c, p. 2) and
that ‘NAPLAN can be used by teachers for
diagnostic purposes’ (p. 17). In subsequent
years, NAPLAN’s move from pen and paper
to online testing was expected to provide
‘more accurate assessment of each child’s
strengths and weaknesses’ and ‘even
greater effectiveness as a diagnostic tool in
classrooms’ (Education Council, 2014, p. 1).
More modest claims about the diagnostic
value of NAPLAN at an individual level
have been made elsewhere. ACARA’s
submission to the 2013 Senate inquiry
noted that ‘NAPLAN tests do not conform
to the meaning of ‘diagnostic’ assessment
in the way that this term is commonly
understood in a classroom context, as for
an individual student there are insufficient
items at each difficulty level to provide the
detailed information that a diagnostic test
is designed to do’ (ACARA, 2013b, p. 8). As
ACARA’s submission went on to say, NAPLAN
seeks to complement the assessment tools
classroom teachers use by showing how
students are performing against national
standards. Consistent with this view, the
current description of NAPLAN on the
National Assessment Program website
notes that ‘The results can assist teachers
by providing additional information to
support their professional judgement about
students’ levels of literacy and numeracy
attainment and progress’ and ‘do not
replace the extensive, ongoing assessments
made by teachers about each student’s
performance’ (ACARA, 2020).
Information for parents/carers on school
and student performance
NAPLAN’s national standardised cohort
testing has provided two streams of
information for parents/carers: individual
student reports on NAPLAN growth and
achievement and school-level summaries
of NAPLAN performance.
The individual reports to parents/carers were
anticipated at the launch of NAPLAN in
2008 when Minister Gillard said that the new
test program would allow ‘parents/carers
to understand the level of achievement
of students … [including] information
on students who have not achieved the
minimum literacy and numeracy standard
and need further support.’ For the last
decade ACARA has provided a standard
student report that schools and school
systems/sectors then pass on for parents
and carers of each student who has taken
a NAPLAN test. The reports show each
student’s performance in achievement
bands and compared with the range
of achievement for the middle 60% of
students taking the same test.
The second stream of data for parents/carers
is in the form of school-level summaries
of NAPLAN achievement and progress.
Launching My School in 2010, Minister
Gillard noted that it would allow parents/
carers and school communities ‘to compare
their school’s results with neighbouring
schools and up to 60 statistically similar
schools.’ A subsequent ministerial council
communique described these as ‘fair
comparisons of schools in Australia, letting
parents/carers, educators and members
of the general public see how a school
is performing, compared to schools with
similar students’ (Education Council, 2016,
p. 1). The statistical basis of the comparisons
has changed over time. Initially comparisons
were available either with statistically similar
schools or students with the same starting
point in achievement. In 2020, My School
introduced a composite comparison of
the amount of improvement achieved by
students with the same starting score and
a similar socio-educational background.
However, following a 2019 Education Council
agreement, the website clarified that ‘The
inclusion of data about how schools perform
in NAPLAN provides information on only
one aspect of school performance and
does not measure overall school quality’
(ACARA, 2020d).
My School provides NAPLAN achievement
and progress scores for every Australian
school, but school-level results are available
from other sources in many Australian
jurisdictions. The Queensland Curriculum
and Assessment Authority website provides
a downloadable table of results for public,
Catholic and independent schools. School-level results for Western Australian public
schools are also available on a searchable
website. More commonly, annual reports
containing NAPLAN data are available
on school websites. In South Australia,
for example, annual school reports on
government schools’ websites use a
standardised format including proportions
of students achieving NAPLAN proficiency
levels, the proportion of students in three
NAPLAN progress bands and the proportion
achieving in the upper two bands. In
the ACT, annual school board reports on
government school websites include a table
of NAPLAN mean scores and an annual
action plan detailing progress against
NAPLAN targets. Victorian public schools
provide an annual report to their school
community, including NAPLAN results
and comparisons with schools with similar
characteristics. In NSW, annual reports
appear on public school websites but do
not publish NAPLAN results in a common
format. NSW independent schools are
required to publish annual reports that
disclose comparative performance over time,
comparisons with state-wide performance
and comparisons with similar schools
where appropriate.
Summary
Current purposes of the national
standardised testing program
Standardised testing is widespread in
Australian schools and has a long history.
There are many forms of standardised
testing, of which NAPLAN is just one: its tests
are whole-cohort tests of cognitive skills
in literacy and numeracy, marked digitally
and by experts, reporting against national
standards and benchmarks. In the years
since the first national declaration in Hobart,
Australia has moved from separate tests in
each state and territory to common whole-cohort literacy and numeracy testing.
During this time, five purposes for national
standardised testing have been endorsed by
Australian governments.
Monitoring. Removing examinations at
the end of primary and junior secondary
schooling in most jurisdictions led to the
development of state-based literacy and
numeracy assessments. Some of these
tests were sample tests and others were
census tests. To improve the comparability
of these tests across jurisdictions, national
benchmarks were developed and backed
by statistical moderation. These tests were
replaced by common, census NAPLAN tests
in 2008. Since the first cycle of NAPLAN tests,
Australian governments have consistently
endorsed the use of NAPLAN’s whole-cohort
standardised test results for monitoring
progress towards national goals.
Accountability. Early national declarations
committed governments to increasing
public confidence in school education.
However, since the Melbourne Declaration,
Australian governments have reiterated
their commitment to transparent school-level reporting of standardised national
assessments. Ministers’ commitment
to accountability has been balanced by
their concerns about the possibility of
misuse of the data in school rankings
and comparisons.
School improvement. National, state
and territory improvement targets are
currently expressed in terms of NAPLAN
achievement. School systems/sectors have
made substantial investments in business
intelligence systems that enable schools and
system/sector officials to explore trends in
achievements at the individual school level.
They have also supported intervention
programs designed to help individual
schools lift their NAPLAN achievement
standards.
Individual achievement. Standardised
testing can provide teachers with
useful information about individual
achievement and assist them in planning
for students’ growth in achievement. Early
characterisations of NAPLAN as diagnostic
at the individual student level have been
moderated in recent years, with ACARA
characterising the results as setting the
achievements in a national and state
or territory context, and triangulating
with teachers’ judgements and other
standardised assessments of student
achievement and progress.
Information for parents/carers. Individual
student results have been available from
census NAPLAN testing for more than a
decade. Results are provided by ACARA in
a standard format that situates individual
achievement in the context of all Australian
students’ achievement and in terms of
bands of achievement. In addition, school-level information on NAPLAN growth
and achievement has been available on
My School since 2010. These forms of
information could only be provided in a
more limited form if NAPLAN were a sample-based testing program.
Stakeholder perspectives on the
purposes of national standardised
assessment
Stakeholders responded to the NAPLAN
Review in face-to-face and web-conferencing meetings, in a set of
Practitioner Reference Group meetings,
in formal written submissions and in
response to a short survey on the review
website. Much of the commentary
touched on the purposes of NAPLAN.
Although the documentary evidence
shows that successive state, territory and
national governments have endorsed
five purposes for NAPLAN, there is no
consensus among stakeholders that these
purposes are all equally legitimate – or that
they can be achieved with a single set of
standardised assessments.
There was widespread support from
school systems/sectors, unions, principals,
teachers and parents/carers for using
national standardised assessments to
monitor national and jurisdictional trends in
performance. As one of the teachers’ unions
said, notwithstanding their fierce criticism
of the current NAPLAN program, they have
‘an absolute commitment to a national
assessment program’.
Support for NAPLAN’s use in school-level accountability was much more
contested. School system/sector authorities
acknowledged ‘the accountability
regimes in schools and offices are reliant
on NAPLAN’. Although their staff ‘dislike
NAPLAN’, they also acknowledge that
‘they need it’. Another system official said
that despite schools’ ‘strong opposition’
to NAPLAN they ‘are willing to accept that
NAPLAN is part of accountability’. The
broad seam of discontent about NAPLAN
and accountability stems primarily from
concerns about public comparisons on My
School and in the news media. Stakeholders
representing a variety of perspectives on
school education called for the publication
of school-level results to stop. One of
the key issues raised was the perception
of unfairness of comparisons. As one
stakeholder put it:
Take the media out of it to stop unfair
comparisons of schools. My school does
well in NAPLAN and the school down
the road does not do as well. This is not
because of poor teaching or student
learning; they have a very different
clientele who have different base levels of
learning. It makes my school look good,
but I don’t think it is fair on all the other
schools (Respondent to the online survey).
The third agreed purpose for NAPLAN is
school improvement. School systems/sectors
valued the use of NAPLAN data for this
purpose. As one said, “NAPLAN has provided
us with the opportunity to track schools over
time”. Another said, “We want school data
– allows us to open up conversations with
schools.” Views of school stakeholders about
the value of NAPLAN in school improvement
were mixed. One of the school principal
representatives characterised NAPLAN as
“a lever for innovation in pedagogy and
practice.” Others described the way in
which schools use NAPLAN for identifying
weaknesses in test domains or school
cohorts, and for forward planning and
identifying professional learning goals. Some
thought that NAPLAN was valued more
by people working at a system/sector and
leadership level than by people in schools.
Another less positive view expressed by a
school principal was that there was just too
much measurement going on altogether,
“We’re on a hamster wheel to prove what
we’re doing works”.
Unions were not opposed, in principle, to
using data for school improvement but
contested whether assessments designed
‘to properly address the needs of teachers,
students and families’ could also be relied
on to target school interventions. There
was also some ambivalence about using
NAPLAN results to set school targets. As
one of the principals’ associations argued,
‘When a measure becomes a target, it
ceases to be a good measure.’ Others took
the view that NAPLAN was valuable for
school improvement as well as system/
sector monitoring. As one of the principals’
association representatives said:
We believe it can do all those things…
Schools want it at the school level –
student performance analyser – that
would allow us to add the data, teacher
judgement, and the VCE. And that would
show us real trends, for instance the year
our Year 9 writing improved was the
year our median score improved. And
then systems need to make sure policy
is working… how are we going to assess
whether it was worth it and whether
[support] was going to the right place?
Regarding NAPLAN’s value in supporting
individual improvement, the most common
concern was about the balance between
standardised assessment scores and
teachers’ judgements about students’
achievement. Although a few submissions
rejected the use of standardised assessment
entirely, more commonly the issue was
characterised in terms such as these, ‘over-reliance on standardised testing… diminishes
teacher professional judgement’ (Member
of the NAPLAN Review Practitioners’
Reference Group). NAPLAN tests were
seen as just one source of data, to be used
in triangulation with other information.
In addition to views about the primacy
of teacher judgment, some stakeholders
would prefer assessments that provide
more fine-grained diagnostic information
than is available from NAPLAN. There is
a clear call, one of the union submissions
argued, ‘from classroom practitioners and
principals for an assessment program that
is more closely linked to what teachers
do in their classrooms and what would
assist in diagnosing student strengths and
weaknesses’. Many of the teachers who
responded took this view, arguing that
NAPLAN needs to:
be designed in a way that the main
purpose is to identify and address
weaknesses in national, community and
individual key skills. Be a component of
the real teaching year and generate class
and individual lesson plans based on what
was found in the NAPLAN results. Allow
monitoring of individuals and classes over
time to see improvement in weaknesses.
(Respondent to the online survey)
The final purpose that governments have
set out for NAPLAN over time is to provide
information for parents/carers. Although
not all parent stakeholder groups were
enthusiastic about NAPLAN, and especially
the comparisons that could be made
among schools using My School, there was
broad recognition of the right of parents/
carers to be informed about students’
achievement and some acknowledgement
that standardised testing could provide
information that is independent of the local
context. As one submission from a parents/
carers’ group put it,
[S]tandardised assessment is important
because it provides an independent
benchmarked measure as part of a well-rounded assessment of each child’s
learning achievement and growth. It
is particularly important for children in
smaller schools and rural or remote areas,
where the opportunities for comparisons
may be limited. While the context and
detailed knowledge gained by teachers
in their day to day professional dealings
with individual students cannot be
replaced, and will always be valuable to
students, parents/carers and schools,
external independent assessment over
time is also useful information for parents/
carers and students. Its role is to provide
an independent context to balance other
information about student learning
achievement and growth. (Written
submission response: Parents/carers’
association)
In addition to commentary on the
value of particular purposes for national
standardised assessment, there was a
good deal of concern that the problem
with NAPLAN’s purposes was lack of clarity.
Some stakeholders suggested that the
purposes have ‘strayed’ from those originally
agreed; others expressed this more kindly
as the purposes having ‘evolved’. One of the
school system/sector officials suggested
that NAPLAN was being used for things
it was not designed for, arguing that, ‘It
was designed for system level data’ but
was being used ‘for individual student
data’. Other stakeholders drew attention
to conflicts between some of the current
purposes of NAPLAN. As a representative
from one of the subject associations said,
‘It is difficult to simultaneously achieve
census and system testing in conjunction
with diagnostic testing for teachers’. To
this end, union and principals’ association
representatives often argued that the
accountability and system performance
purposes of national standardised
testing could be met by sample testing,
and that teachers’ judgements should
be supported by richer, on-demand
diagnostic assessments.
Chapter 2: National and
international measures
of achievement
This chapter explores the evidence of achievement from more than a decade of NAPLAN
census testing and considers that evidence of achievement in the context of achievement
arising from three international sample testing programs – the Progress in International
Reading Literacy Study (PIRLS), Trends in International Mathematics and Science Study
(TIMSS) and Programme for International Student Assessment (PISA). The final section of
the chapter considers this evidence of continuity and change in achievement alongside
the feedback on national and jurisdictional performance received during consultation for
the NAPLAN Review.
Key points:
• National NAPLAN results have improved in the last decade in Years 3 and 5 but not in
Years 7 and 9. Writing achievement has been static in Years 3 and 5 and has declined
in Years 7 and 9.
• Some jurisdictions, notably Queensland and Western Australia, have improved more
than others on national and international measures.
• Australia is a middle-ranking country in international test comparisons, ranked below
high-performing Asian countries and often below Canada, England and Ireland.
• There are important differences among the national and international tests, including
test domains, item types, degree of focus on curriculum content, cognitive demand,
number of proficiency levels reported and whether they are sample tests.
• There have been improvements in NAPLAN reading and numeracy in Years 3 and 5,
PIRLS Year 4 reading and TIMSS Year 4 mathematics. Although achievement in Years
7 and 9 NAPLAN reading and numeracy has not changed in a decade, PISA reading
literacy and PISA mathematical literacy of 15-year-olds have declined.
• The proportion of high-performing students in PIRLS Year 4 reading and TIMSS Year 4
mathematics has increased, but there has been no change in the proportion of high
or low performers in TIMSS Year 8. In PISA, the proportion of low-performing students
has increased in reading and mathematics and the proportion of high-performing
students has decreased in mathematics.
• There is widespread support for a national standardised testing program, as well
as widespread concern about limitations of the current NAPLAN program.
NAPLAN achievement and
improvement, 2008 to 2019
Twelve National Reports have been
produced since 2008, providing NAPLAN
data in the five test domains: reading,
writing, spelling, grammar and punctuation
and numeracy. Results are also available
on an interactive website provided by the
Australian Curriculum, Assessment and
Reporting Authority (ACARA). The most
recent 2019 National Report provides mean
scale scores, the proportion of students in
each of ten achievement bands and the
proportion of students above the national
minimum standard (NMS). These results are
disaggregated by sex, Indigenous status,
language background other than English
(LBOTE) status, geolocation, parental
education and parental occupation.1 Results
in each of these categories are available
for Australia as a whole and for each state
and territory. Comparisons of results across
jurisdictions are reported for all five test
domains. In addition to reporting NMS
proportions and mean scores in each
domain, ACARA provides estimates of the
statistical significance of differences
in achievement. When differences are
statistically significant and have an
effect size between 0.2 and 0.5 they are
described as ‘moderate’; when they are
significant and have an effect size greater
than 0.5 they are described as ‘substantial’
(ACARA, 2019c, p. 300).
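As a rough illustration of the reporting rule described above, the following Python sketch labels a difference from its significance flag and effect size. The function name `classify_difference` is hypothetical. Two details are assumptions made for this example rather than details taken from ACARA's documentation: differences that are not significant or that fall below 0.2 are treated as 'no change', and an effect size of exactly 0.5 is treated as 'moderate'.

```python
def classify_difference(effect_size: float, significant: bool) -> str:
    """Label a change between two test years using the thresholds quoted above."""
    magnitude = abs(effect_size)
    # Assumption: non-significant differences, or effect sizes below 0.2,
    # are reported as "no change".
    if not significant or magnitude < 0.2:
        return "no change"
    direction = "increase" if effect_size > 0 else "decrease"
    # Assumption: exactly 0.5 counts as "moderate"; only values
    # greater than 0.5 count as "substantial".
    size = "substantial" if magnitude > 0.5 else "moderate"
    return f"{size} {direction}"


if __name__ == "__main__":
    # Illustrative inputs only; these are not NAPLAN results.
    for es, sig in [(0.35, True), (0.62, True), (-0.25, True), (0.35, False)]:
        print(es, sig, classify_difference(es, sig))
```

The five labels returned by this sketch match the categories used in the keys of the comparison tables in this chapter.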
Reading
NAPLAN reading tests are designed to
‘assess students’ ability to read and view
texts to identify, analyse and evaluate
information and ideas’ (ACARA, 2017, p. 9).
The reading tests use written texts including
some with graphics and images. ACARA
notes that much of the teaching of literacy
occurs in the English learning area and
that the tests are aligned with the literacy
aspects of the Australian Curriculum (p. 5).
Australia-wide, there have been moderate
improvements between 2008 and 2019
in national mean scale scores and in the
proportion of students equal to or above
the NMS in Years 3 and 5 reading, but no
significant national increases in Years 7
and 9 reading achievement since NAPLAN
began (see Table 2).2
Table 2: Differences in achievements of students in reading, 2008 to 2019
Columns: AUS, NSW, VIC, QLD, WA, SA, TAS, ACT, NT
Rows: Mean and ≥NMS* for each of Year 9, Year 7, Year 5 and Year 3
[Cell entries are not recoverable in this text version.]
* NMS: National minimum standard
Key: No change; Moderate increase; Substantial increase; Moderate decrease; Substantial decrease
1 The NAPLAN National Reports use the term ‘Indigenous’ to refer to Aboriginal and Torres Strait Islander peoples. In this report, the term
Indigenous is used when the specific reference is to data from the National Reports; elsewhere, the term Aboriginal and Torres Strait Islander
peoples is preferred.
2 Data for these and the following NAPLAN comparison tables are drawn from time series data available
at https://reports.acara.edu.au/Home/TimeSeries
National improvements in Years 3 and
5 reading were shared across a range of
student demographic groups. Both male
and female students showed moderate
increases in the proportion of Years 3 and
5 students at or above the reading NMS
and mean scores between 2008 and 2019.
Moderate increases were recorded for
Indigenous and LBOTE students in NMS and
mean scores in Years 3 and 5 and in mean
scores in Year 7. There were no significant
differences in Year 9 reading for any of these
demographic groups.
The differences among states and territories
in mean score achievement that were
evident in the first round of testing in 2008
have continued throughout the years of
NAPLAN testing. In both 2008 and in 2019,
the ACT, Victoria and NSW had the highest
mean scores and the Northern Territory
had the lowest mean scores in Year 3
reading. The rank order of other jurisdictions
changed slightly over time. Queensland’s
and Western Australia’s Year 3 reading mean
scores were lower than South Australia’s
and Tasmania’s in 2008 and higher in 2019.
In 2019, Victoria’s Year 3 reading mean score
was statistically similar to NSW and the
ACT, and superior to all other jurisdictions.
Similar patterns occurred in Years 5, 7 and
9 reading in both 2008 and 2019, where
the ACT, Victoria and NSW were typically
the three highest performing jurisdictions
and the Northern Territory was the lowest
performing jurisdiction. Among the other
states, moderate improvement in Western
Australia’s and Queensland’s mean scores
improved their rank order compared with
South Australia and Tasmania.
In Queensland, there were substantial
increases in Year 3 NMS and mean scores,
substantial increases in Year 5 NMS and
moderate increases in Years 5 and 7 NMS
and mean scores. Changes in Western
Australia included a substantial increase
in the percentage of Year 3 students at or
above the NMS and moderate increases
in NMS and mean scores in Years 3, 5, 7
and 9 reading.
Between 2008 and 2019, there were
moderate increases in NMS and mean scores
in Years 3 and 5 reading in Victoria and
South Australia, in Year 3 NMS and mean
scores in the Northern Territory and in Year
3 mean and Year 5 NMS and mean scores
in Tasmania. In NSW, there were moderate
increases in NMS and mean scores in Year 3,
but no changes in the NMS or mean scores
in other Years. The ACT showed moderate
increases in Years 3 and 5 mean scores, but
no change in the proportion of students
at or above the NMS. In Victoria, Tasmania
and the ACT there were decreases in the
proportion of students at or above the NMS
in Year 9 reading, but no change in the mean
scores from 2008 to 2019.
In sum, the evidence of a decade’s NAPLAN
results is that there have been moderate
improvements in reading achievement
nationally and in most jurisdictions in Years 3
and 5. Although there has been no national
improvement in Years 7 and 9 reading
scores, there have been moderate declines
in the proportion of students achieving at
or above the NMS in several jurisdictions
and improvements in that proportion and
in mean scores in other jurisdictions. The
performance of two states stands out:
Western Australia’s moderate improvements
in mean scores in Years 3, 5, 7 and 9, and
Queensland’s moderate improvement in
Years 5 and 7 and substantial improvement
in Year 3 reading achievement, though both
remain behind the ACT, NSW and Victoria.
Writing
The NAPLAN writing tests are aligned
to the Australian Curriculum in English
through a focus on seven sub-strand threads
of the curriculum – purpose, audience
and structures of different types of texts,
vocabulary, text cohesion, sentences and
clause level grammar, word level grammar,
punctuation and spelling (ACARA, 2017, p. 14).
The assessment of writing has been
subject to greater turbulence than the
other NAPLAN measures and is discussed
in detail in Chapter 5. One of the sources
of turbulence has been the possibility
that the writing prompts in any one year
could be for either narrative or persuasive
writing, leading initially to two separate
assessment scales, which made year-to-year
comparisons difficult. Since 2011, however,
the prompts have been for persuasive
writing in every year except 2016. NAPLAN
narrative writing scores from 2016 have since
been mapped onto the existing writing
scale, providing results and trend data
from 2011 to 2019 on a single scale. In the
comparison below, 2011 is the base year for
writing (see Table 3).
There has been little change in either
the national NMS proportions or mean
scores in writing over nine years, and what
little change there has been is negative.
Between 2011 and 2019 there were no
national improvements in either NMS or
mean scores, and there were moderate
declines in mean scores in Years 7 and 9.
Indigenous students achieved a moderate
increase in NMS proportions.
Among the states and territories there
were moderate increases in mean scores in
Year 3 in Western Australia and Tasmania
and moderate increases in the proportion
of students at or above NMS in Year 3 in
Western Australia and Queensland. In some
other jurisdictions there were declines in
performance between 2011 and 2019. Mean
scores in the ACT decreased in Years 5, 7 and
9. In Queensland, NMS and mean scores
declined in Years 7 and 9.
Table 3: Differences in achievements of students in writing, 2011 to 2019
Columns: AUS, NSW, VIC, QLD, WA, SA, TAS, ACT, NT
Rows: Mean and ≥NMS* for each of Year 9, Year 7, Year 5 and Year 3
[Cell entries are not recoverable in this text version.]
* NMS: National minimum standard
Key: No change; Moderate increase; Substantial increase; Moderate decrease; Substantial decrease
Spelling
Spelling is one of two parts of the NAPLAN
language conventions tests. The second
part is grammar and punctuation. The
spelling tests focus on the accurate spelling
of written words and draw on the Australian
Curriculum English sub-strand of spelling.
Students are required to either provide the
correct spelling of a designated word or
identify a misspelled word and then write
the correct spelling (ACARA, 2017, p. 13).
There is some NAPLAN evidence of
improvement in national spelling
achievement since 2008, with moderate
increases in mean scores in Year 3 and NMS
and mean scores in Year 5 (see Table 4).
As has been the case in other test domains,
there is no evidence of changes in national
achievement in the secondary school test
years, Years 7 and 9.
Among the demographic groups, there have
been moderate increases in mean spelling
scores of Year 3 males and Years 3 and 5
males and females, as well as moderate
increases in NMS and mean
scores of Indigenous students in Years 3
and 5. Among LBOTE students there have
been moderate increases in mean scores for
Year 3 and moderate NMS and mean score
increases for Year 5.
The modest national improvements in
spelling achievement appear to have been
driven by improvements in two states –
Queensland and Western Australia. From
2008 to 2019, there was a substantial
increase in Queensland’s Year 3 mean scores
and moderate increases were observed
in Years 5 and 7 mean scores. Moderate
increases in the proportion of students
at or above the NMS were recorded
in Years 3, 5, 7 and 9. In Western Australia,
moderate increases were achieved in both
NMS and mean scores in Years 3, 5, 7 and
9. In contrast, there were no other states
or territories that recorded improvements
in spelling scores by either measure; and
NSW and Tasmania recorded a moderate
decrease in the proportion of students at or
above the Year 3 NMS.
Table 4: Differences in achievements of students in spelling, 2008 to 2019
Columns: AUS, NSW, VIC, QLD, WA, SA, TAS, ACT, NT
Rows: Mean and ≥NMS* for each of Year 9, Year 7, Year 5 and Year 3
[Cell entries are not recoverable in this text version.]
* NMS: National minimum standard
Key: No change; Moderate increase; Substantial increase; Moderate decrease; Substantial decrease
Grammar and punctuation
The second part of the language
conventions test concerns grammar
and punctuation. The grammar items
focus on knowledge and accurate use of
grammar at a sentence, clause and word
level. Grammar items are developed from
the sub-strand threads of text cohesion,
sentences and clause level grammar and
word level grammar in the Australian
Curriculum in English. The punctuation
items are developed from the content
of the sub-strand thread of punctuation
(ACARA 2017, p. 12).
Evidence of national improvement in scores
on the NAPLAN grammar and punctuation
tests is scant (see Table 5). In Year 3 alone,
there were moderate increases in national
mean scores and in the proportion of
students at or above the NMS.
Mean and NMS scores increased for both
male and female Year 3 students between
2008 and 2019. Indigenous students’ and
LBOTE students’ achievement improved
on both measures in Year 3, and their mean
scores increased in Year 7.
Across the states and territories, there were
moderate mean score increases in Year 3
grammar and punctuation in NSW, South
Australia, the ACT and the Northern Territory
and substantial mean score increases
in Queensland and Western Australia.
Although there were no changes in Years
5, 7 or 9 scores on either measure in the
other states and territories, there were
moderate increases in Western Australia’s
NMS and mean scores in Years 7 and 9 and
Queensland’s mean score in Year 7.
Table 5: Differences in achievements of students in grammar and punctuation, 2008 to 2019
Columns: AUS, NSW, VIC, QLD, WA, SA, TAS, ACT, NT
Rows: Mean and ≥NMS* for each of Year 9, Year 7, Year 5 and Year 3
[Cell entries are not recoverable in this text version.]
* NMS: National minimum standard
Key: No change; Moderate increase; Substantial increase; Moderate decrease; Substantial decrease
Numeracy
The NAPLAN numeracy tests assess
students’ application of mathematical
knowledge and skills in everyday contexts.
Test items draw on both the content
and proficiency strands of the Australian
Curriculum in mathematics. The proficiency
strands are understanding, fluency, problem
solving and reasoning. The content strands
are number and algebra, measurement
and geometry and statistics and probability
(ACARA 2017, p. 17). The Years 7 and 9
numeracy tests include some items for
which calculators may be used and some
non-calculator items. Calculators are not
used for Years 3 or 5 tests.
Unlike the Year 3 NAPLAN literacy tests,
where there were moderate national mean
score improvements in achievement in the
last decade, there is no evidence of national
improvement in the Year 3 numeracy
achievement over that time (Table 6). There
were moderate increases in NMS and mean
scores in Year 5, however, and an increase
in the proportion of students at or above
the NMS in Year 9 numeracy. Among the
demographic groups, increases in Year 5
NMS proportions and mean scores and the
NMS proportion in Year 9 numeracy were
achieved by males and females, Indigenous
students and LBOTE students.
Across the jurisdictions, the pattern of
differences in achievement from 2008
to 2019 was inconsistent. Although there
were national increases in Year 5 numeracy,
mean scores and NMS proportions did not
change in NSW or the Northern Territory.
Queensland recorded substantial increases
in mean scores and NMS proportions,
while Victoria, Western Australia and South
Australia recorded moderate increases on
both Year 5 numeracy measures. In Year 9,
Queensland recorded moderate increases
on both measures and Western Australia
recorded a substantial increase in NMS
proportions and a moderate increase in
mean scores.
Table 6: Differences in achievements of students in numeracy, 2008 to 2019
Columns: AUS, NSW, VIC, QLD, WA, SA, TAS, ACT, NT
Rows: Mean and ≥NMS* for each of Year 9, Year 7, Year 5 and Year 3
[Cell entries are not recoverable in this text version.]
* NMS: National minimum standard
Key: No change; Moderate increase; Substantial increase; Moderate decrease; Substantial decrease
Patterns of change across the
NAPLAN test domains
Looking across the test domains, evidence
of national improvement in NAPLAN
achievement is overwhelmingly in the
primary school years (Table 7). There have
been moderate increases in Years 3 and
5 mean scores in reading, spelling and
grammar and punctuation; and NMS
increases in both reading, and grammar and
punctuation. There is no evidence of national
improvement in literacy in either Years 7
or 9, and evidence of moderate declines
in writing in both Years 7 and 9.
In the secondary years, the only evidence of
moderate improvement is in the proportion
of students meeting the NMS for Year
9 numeracy. There are no significant
improvements in mean scores in any of the
Year 7 or 9 test domains, and mean scores
showed moderate declines in both Years 7
and 9 writing.
Table 7: Differences in achievement, base year to 2019, by test domain, Australia
Columns: Reading, Writing, Spelling, Grammar & punctuation, Numeracy
Rows: Mean and ≥NMS* for each of Year 9, Year 7, Year 5 and Year 3
[Cell entries are not recoverable in this text version.]
* NMS: National minimum standard
Key: No change; Moderate increase; Substantial increase; Moderate decrease; Substantial decrease
Among the demographic groups tracked
in the NAPLAN National Reports, there is
evidence of moderate improvement by
Indigenous students on a range of measures.
Although Indigenous students’ NAPLAN
results continue to lag behind those of
non-Indigenous students, Indigenous
students’ mean scores or NMS proportions
increased in Year 3 reading, writing, spelling
and grammar and punctuation; Year 5
reading, spelling and numeracy; Year 7
reading and grammar and punctuation;
and Year 9 numeracy.
The sense that there has been limited
national improvement over the last decade
may be contrasted with the experience
of individual states and territories. Across
the five test domains and four test year
levels, Queensland produced moderate
or substantial increases in 11 mean scores
and 13 NMS proportions. Similarly, Western
Australia increased mean scores in Years 7
and 9 in all but the writing tests, which held
steady against a national trend of reduced
achievement in Years 7 and 9 writing. By
way of example, differences in achievement
in each test domain and test year from the
base year to 2019 for Western Australia are
reproduced in Table 8.
Table 8: Differences in achievement, base year to 2019, by test domain, Western Australia
Columns: Reading, Writing, Spelling, Grammar & punctuation, Numeracy
Rows: Mean and ≥NMS* for each of Year 9, Year 7, Year 5 and Year 3
[Cell entries are not recoverable in this text version.]
* NMS: National minimum standard
Key: No change; Moderate increase; Substantial increase; Moderate decrease; Substantial decrease
NAPLAN and international
surveys of student
achievement
Three international surveys taken by students
in Australian schools overlap the knowledge
and skills tested by NAPLAN. The Progress
in International Reading Literacy Study
(PIRLS) assesses Year 4 students on a
five-year cycle. Australia has participated in
the 2011 and 2016 cycles. The Trends in
International Mathematics and Science
Study (TIMSS) assesses students in Years 4
and 8 in mathematics and science. Tests are
on a four-year cycle. Since 1995, Australia has
participated in six cycles. The Programme
for International Student Assessment (PISA)
assesses 15-year-old students’ ability to use
their science, reading and mathematics
knowledge and skills to meet real-life
challenges. Assessments are on three-year
cycles and Australia has participated in PISA
reading, mathematics and science since 2000.
PIRLS and NAPLAN reading
The PIRLS reading assessment framework
identifies two purposes for reading that
account for most reading activities done
by young students inside and outside the
classroom: for literary experience and to
acquire and use information (Thomson et
al., 2017a). Four processes of comprehension
are assessed within each of these two major
reading purposes:
• focusing on and retrieving explicitly
stated information
• making straightforward inferences
• interpreting and integrating ideas
and information
• examining and evaluating content,
language and textual elements.
The PIRLS reading purposes and processes
have a substantial overlap with the Year 4
Australian Curriculum in English.3 These
purposes and processes are also broadly
consistent with the NAPLAN reading tests
which focus on assessing ‘students’ ability
to read and view texts to identify, analyse
and evaluate information and ideas’
(ACARA, 2017, p. 9).
Fifty countries participated in PIRLS 2016
(Thomson et al., 2017a). Australia was in the
middle achievement group, with a higher
achievement than 24 countries, similar
achievement to 13 countries and lower
achievement than 13 countries. Higher
performing countries included most of the
Asian countries, Ireland and England. Similar
performing countries included the United
States and Canada and lower performing
countries included France and New Zealand.
Australia recorded a significant
improvement in the average reading
score between the 2011 and 2016 PIRLS
assessments, consistent with the statistically
significant improvements in reading
achievement in NAPLAN Years 3 and 5
mean scores. Among the highlights of
this improvement were students’ relative
strength in the literary reading purpose
without a relative weakness in acquiring
and using information. Regarding the
processes, Australian Year 4 students had
a relative strength in the interpreting,
integrating and evaluating processes scale,
and a relative weakness in the retrieving
and straightforward inferencing scale.
From PIRLS 2011 to PIRLS 2016 there
were also some differences in patterns of
improvement across states and territories:
• The performance of students in Victoria
was significantly higher than that
of students in all other jurisdictions
except the ACT.
• Students in South Australia performed
significantly lower, on average, than
students in Victoria, the ACT and
Western Australia.
• Western Australia showed the greatest
improvement of 28 points from PIRLS
2011 to 2016, followed by Queensland
(26 points) and Victoria (21 points). There
was no significant change in average
scores between 2011 and 2016 in the
remaining jurisdictions.
These inter-jurisdictional differences are
broadly consistent with the evidence of
change in NAPLAN Years 3 and 5 reading
over the last decade. Victoria and the ACT
(along with NSW) have typically had the
highest scores in the country. Queensland
and Western Australia have had the
strongest patterns of improvement, and
their rank order improvement has been at
the expense of South Australia.
3 These descriptions of the three international surveys and their relationship to the NAPLAN tests draw heavily on an advice paper prepared
for the NAPLAN Review by Dr. Sue Thomson of the Australian Council for Educational Research, May 2020.
TIMSS and NAPLAN numeracy
The TIMSS assessment frameworks in
numeracy are organised around two
dimensions (Grønmo, Lindquist, Arora,
& Mullis, 2013). The content dimension
specifies the subject matter to be assessed
and the cognitive dimension specifies
the thinking processes to be assessed.
The content dimensions are different for
the Years 4 and 8 tests, reflecting content
commonly taught at each year level. The
Year 4 TIMSS content domains are number,
geometric shapes and data display; the Year
8 content domains are number, algebra,
geometry and data and chance. There is
a greater concentration of test content in
number in Year 4; and in Year 8 the content
focuses more on interpreting data and
the fundamentals of probability. There are
three TIMSS cognitive domains – knowing,
applying and reasoning. In Year 4, TIMSS has
more emphasis on the knowing domain and
less emphasis on the reasoning domain
than in Year 8.
According to an advice paper prepared for
the NAPLAN Review, the content domains
of TIMSS are very similar to the Australian
Curriculum in mathematics, which in turn
underpins the NAPLAN numeracy tests.
There are, however, some differences
of emphasis in the cognitive domain.
The Australian Curriculum emphasis on
knowing and applying is similar to TIMSS
but the Australian Curriculum does not
appear to cover some of the complexity
that is described in the TIMSS framework
under reasoning. It seems likely, too, that a
substantial number of TIMSS mathematics
items are beyond Australian Curriculum
expectations for achievement, especially
at the Year 4 level.
Fifty-seven countries participated in TIMSS
2015 (Thomson et al., 2017b). In Year 4
mathematics, Australian students’ mean
score was higher than 20 countries, lower
than 21 countries and not different from
seven countries. Countries with higher
scores included the TIMSS high-performing
Asian countries, Ireland, England and
the United States; countries with similar
scores included Canada and Germany;
and countries with lower scores included
Italy, Spain and New Zealand. In Year 8
mathematics, 12 countries had higher
performances than Australia and 21 had
lower performances. Higher performing
countries included TIMSS Asian countries,
Canada, Ireland and the United States;
lower performing countries included Italy
and New Zealand.
In the TIMSS content dimensions, in 2015
Australian Year 4 students performed better
in data display and geometric shapes and
measures but were weaker in number. In the
cognitive dimensions, they were better in
applying and reasoning but were weaker in
knowing. Year 8 students performed better
in data and chance and in number than in
algebra and geometry, and were weaker in
applying and stronger in reasoning.
Australia’s Year 4 score in TIMSS 2015
was a significant improvement on the
1995 national score, but this was due to a
single increase recorded in 2007 with no
change recorded in 2011 or 2015. For Year
8, Australia’s result dipped in 2007 and
this was followed by a recovery in 2011.
Australia’s 2015 Year 8 mathematics score
was not significantly different from the
corresponding score in 1995.
Although there was no national change in
Year 4 TIMSS mathematics between 2011
and 2015, NSW, Queensland, South Australia,
Western Australia and Tasmania all had
significantly higher average scores in 2015
than in 1995 and Western Australia showed
the greatest score improvement. Year 8
mathematics achievement improved in
Victoria from 1995 to 2015 and from 2003 to
2015. NSW students’ achievement declined
from 2003 to 2015. Western Australia
improved from 2003 to 2015 but has not yet
returned to its 1995 level of achievement in
TIMSS mathematics.
Change in TIMSS mean scores over the long
term is broadly consistent with the NAPLAN
evidence. TIMSS Year 4 achievement has
improved from 1995 to 2015 and so has
Years 3 and 5 NAPLAN numeracy from 2008
to 2019; TIMSS Year 8 has been static and so
has Years 7 and 9 NAPLAN numeracy over
those time intervals.
PISA reading literacy and NAPLAN
reading
PISA reading literacy measures the capacity
to understand, use and reflect on written
texts to achieve goals, develop knowledge
and potential, and participate in society
(OECD, 2020). Unlike NAPLAN, PIRLS and
TIMSS, PISA is not a curriculum-based
assessment. The conceptualisation of
reading literacy has been revised each time
that it has been the major assessment PISA
domain (2000, 2009, 2018). The current
2018 reading literacy framework ‘integrates
reading in a traditional sense together with
the new forms of reading that have emerged
over the past decades and that continue to
emerge due to the spread of digital devices
and digital texts’ (OECD 2019a, p. 22).
An advice paper prepared for this
review noted that there has been no
comprehensive comparison between the
knowledge and skills assessed by PISA
and the levels of achievement in the Year
9 Australian Curriculum, but that the
level of the Australian Curriculum may be
less advanced than the PISA framework
suggests. Balanced against this, however,
it is important to note that most Australian
students have taken PISA tests in Year 10.
Average performance of Australian students
in PISA 2018 reading literacy was higher than
students in 58 countries and lower than
those in 10 countries (Thomson et al., 2019).
Those that outperformed Australia included
the PISA high-performing Asian countries,
Canada and Ireland; countries with similar
achievement included New Zealand, the
United States and the United Kingdom.
Countries that Australia outperformed
included France, the Netherlands and Italy.
There were statistically significant declines
in PISA reading literacy from 2000 to 2018
as well as between 2003 and 2006, 2009
and 2015, and 2012 and 2018, but Australian
students’ achievement did not change from
2015 to 2018. These declines in PISA scores
are not reflected in the more curriculum-based NAPLAN Years 7 and 9 reading
assessments, which registered no significant
difference from 2008 to 2019.
In 2018, the average performance of
students from the ACT was higher than that
of students in any of the other jurisdictions.
The next group of states was Western
Australia, Victoria and Queensland. Students
in South Australia, NSW and Tasmania
performed at a similar but lower level and
the lowest performing jurisdictions were
Tasmania and the Northern Territory.
PISA mathematical literacy
and NAPLAN numeracy
PISA mathematical literacy measures
‘capacity to formulate, employ and interpret
mathematics in a variety of contexts. It
includes reasoning mathematically and
using mathematical concepts, procedures,
facts and tools to describe, explain and
predict phenomena. It assists individuals to
recognise the role that mathematics plays
in the world and to make the well-founded
judgements and decisions needed by
constructive, engaged and reflective citizens’
(OECD, 2019a, p. 75).
Australian students outperformed those in
47 countries, performed similarly to those in
eight countries and scored lower than
those in 23 countries (Thomson et al.,
2019). The high-performing Asian countries,
Canada, the United Kingdom and Ireland
were among those with higher mean scores;
France and New Zealand had similar scores
and the United States had lower scores.
Although Australia remained in the
middle group of countries in PISA 2018
mathematical literacy, Australia’s mean
performance in PISA mathematical literacy
declined from 2003 to 2018 and in some of
the intervals in between (2006 to 2012, 2009
to 2012 and 2012 to 2015), but did not change
from 2015 to 2018 (Thomson et al., 2019).
This record of long-term decline in PISA
mathematical literacy is not matched by the
more curriculum-based NAPLAN numeracy
tests, which showed no significant changes
between 2008 and 2019 in Year 9.
As they have done in the most recent cycles
of international assessment and most
NAPLAN assessments, students in the ACT
performed at a higher level than other
jurisdictions in PISA 2018 mathematical
literacy. Students in Western Australia and
Victoria performed at similar levels. Students
in Queensland, NSW and South Australia
performed at similar but lower levels, and
students in Tasmania and the Northern
Territory were outperformed by students in
all other jurisdictions. There were, however,
declines in all Australian jurisdictions from
2003 to 2018. The largest decline was in
South Australia, where the decline was
almost equivalent to two years of schooling.
The national and international
standardised testing programs
compared
There are some differences between
the international standardised tests and
NAPLAN. PIRLS, TIMSS and PISA are sample
tests rather than whole-population tests.
PIRLS and TIMSS are both curriculum-based tests and are relatively well-linked
to the Australian Curriculum, which
underpins NAPLAN tests. PISA focuses
on the somewhat different constructs of
reading literacy and mathematical literacy.
PISA tests are not curriculum-based and
are moving rapidly to embrace testing the
uses of literacy and numeracy in digital
contexts. A further difference between
the Australian and international tests is
that PIRLS, TIMSS and PISA all have the
capacity to report scores on sub-scales.
In PISA, for example, as well as national
estimates of overall achievement in reading
literacy, performance is reported in terms
of the three cognitive subscales (locating
information, understanding and evaluating,
and reflecting), and two text structure
subscales (single-source texts and
multiple-source texts). It also seems likely that
several of the international tests are pitched
a little higher in cognitive terms than the
corresponding Australian curriculum for
that year level.
There are also some differences in item
types. TIMSS, PIRLS and PISA include some
open constructed response items that
require trained markers rather than digital
marking. NAPLAN paper tests, other than
writing, are now digitally marked. While
machine marking strategies may tend to
narrow the breadth of skills to be assessed,
they do have the advantage of allowing for
almost immediate feedback to students
and teachers.
Despite the differences in test domains,
item types and curriculum focus, at the
highest level of generality there are common
conclusions to be drawn about Australia’s
performance and improvement from the
international tests. On all three of PIRLS,
TIMSS and PISA, Australia is a middle-ranking country. Average performance is
below the high-performing Asian countries
and often below comparable English-speaking jurisdictions such as Canada,
the United Kingdom and Ireland.
Although the time intervals are different
(depending on when Australia entered the
particular test series and the number of
years between PISA, TIMSS and PIRLS test
cycles), there are some broad similarities in
performance across the testing programs.
Statistically significant improvements in
NAPLAN at Years 3 and 5 levels are echoed in
improvements in TIMSS Year 4 mathematics
and PIRLS Year 4 reading. Flat performance
in NAPLAN Years 7 and 9 numeracy is
echoed in flat performance in TIMSS Year 8.
The one exception is in NAPLAN Years 7
and 9 literacy, where flat performance from
2008 to 2019 corresponds to statistically
significant declines in PISA reading literacy
from 2000 to 2018. These statistically
significant changes are illustrated in Table 9.
The proportion of students at various
proficiency levels provides another
perspective on comparative performance.
PIRLS and TIMSS have low, intermediate,
high and advanced benchmarks; PISA
reports on proficiency levels 1 to 6, classifying
students below level 2 as low performers.
The patterns of change in proficiency are
similar to those for average achievement
across the international assessments.
• In PIRLS, the proportion of Australian
students who performed at or above the
advanced benchmark increased from
2011 to 2016, and nationally there was
no change in the proportion who failed
to reach the low benchmark (Thomson
et al. 2017a, p. 10).
• In TIMSS Year 4 mathematics the
proportion of Australian students
achieving at or above the advanced
benchmark increased and the
proportion of students not achieving
the low benchmark decreased in most
jurisdictions from 1995 to 2015 (Thomson
et al, 2017b, p. 21).
• In TIMSS Year 8 mathematics there was
no change in the national proportion
of students achieving at or above the
advanced benchmark or not achieving
the low benchmark from 1995 to 2015
(Thomson et al, 2017b, p. 53).
• In PISA reading literacy, the proportion
of low-performing students increased,
and the proportion of high-performing
students did not change between
2000 and 2018 (Thomson et al 2019, p.
47). In PISA mathematical literacy, the
proportion of low-performing students
increased, and the proportion of high-performing students decreased between
2003 and 2018 (p. 127).
Table 9: Differences in achievement, NAPLAN, PIRLS, TIMSS and PISA
Assessment  Year/age     Interval   Reading/literacy  Mathematics/numeracy
PISA        15-year-old  2000-2018  [shaded]          N/A
PISA        15-year-old  2003-2018  N/A               [shaded]
NAPLAN      Year 9       2008-2019  [shaded]          [shaded]
TIMSS       Year 8       1995-2015  N/A               [shaded]
NAPLAN      Year 7       2008-2019  [shaded]          [shaded]
NAPLAN      Year 5       2008-2019  [shaded]          [shaded]
TIMSS       Year 4       1995-2015  N/A               [shaded]
PIRLS       Year 4       2011-2016  [shaded]          N/A
NAPLAN      Year 3       2008-2019  [shaded]          [shaded]
Key (cell shading): No change; Significant increase; Substantial decrease. N/A: not applicable.
It is unclear why most of the improvements
registered by these national and
international assessments have occurred
in the primary school years. It may be the
result of strengthened national, state and
territory efforts to ensure that all students
make the best start possible in schools
through curriculum and professional
development reforms focused on the early
years. Alternatively, or additionally, it may
reflect a clearer connection between the
curriculum and the test domains in primary
schools, where most classroom teachers
teach both mathematics and English.
This is in comparison with the diffusion of
responsibility in secondary schools, where
literacy and numeracy are important in
many school subjects but are more explicitly
the responsibility of English teachers and
mathematics teachers. Whichever of these is
correct, and whatever other causes there may
be, there are multiple sources of evidence
that Australia’s secondary schools have not
shared their primary school colleagues’
success in improving achievement in
reading/literacy or mathematics/numeracy.
Where there are differences between
national and international estimates of
achievement, such as the decline in PISA
15-year-old scores and the absence of
change in Year 9 NAPLAN reading and
numeracy, it would be useful to be able to
investigate whether this has been due to
differences in test domains or item design by
investigating the performance of common
students taking both kinds of tests.
Stakeholder views on
national and international
standardised testing
Among the stakeholders consulted in this
review, it is fair to say that there is not a lot
of affection for NAPLAN in its current form.
The stakeholder comments summarised
in Chapter 1 demonstrated a wide range of
views about what NAPLAN’s purposes have
been, should have been or could be. There
was, however, relatively little commentary
on what has been learned from a decade
of national standardised assessment.
Among the individuals who chose to
respond to the review’s online survey,
an overwhelming majority were critical
of NAPLAN, expressing concerns about
distortion of teaching programs, lack of
diagnostic value, or misuse of results in
ranking and comparing schools.
The views of organisational stakeholders
were more mixed. Some, like one of the
principals’ associations, took the view that
we “wouldn’t lose anything if NAPLAN wasn’t
around”, that NAPLAN “Doesn’t say more
than what teachers already know” (Parent/
carers’ association), or “Hasn’t given us much
at all other than an ongoing debate in the
country when we should have been talking
about equity” (Teachers’ union). Others
took the view that those wishing to have
NAPLAN replaced should be ‘careful what
they wish for’. As one of the school system/
sector stakeholders put it: “If we removed
NAPLAN, sectors would need something
else to replace the gap. The new testing
could be more burdensome on teachers.
It’s naïve to say ‘Let’s get rid of it’ without
considering an alternative.”
Some stakeholders argued there have been
positive outcomes from NAPLAN. As one
representative of a principals’ association
put it, ‘NAPLAN has improved focus on
literacy and numeracy on schools – a positive
outcome.’ Others argued that a decade of
NAPLAN testing had had no impact:
It has not contributed to an increase in
educational outcomes. It has heaped
public scorn on disadvantaged students
and communities, which are placed in the
modern day stocks through the invasive
My School website. It rewards a narrow
band of often lower-order intellectual
capacities; it has narrowed the taught
curriculum; it has corresponded to a
seemingly inexorable decline in Australia’s
performance in major international tests.
(Teachers’ union)
Whether or not they believed that NAPLAN
or the international comparisons had
had an impact, most written submissions
indicated broad support for some type of
national assessment, usually as a tool for
helping teachers and schools understand
how best to assist students but often also to
support trend analysis capable of informing
policy development. As a submission
from a statutory authority put it, ‘While
NAPLAN has a range of issues, as raised
in the review’s interim report, national
standardised testing can serve as important
indicator of the health of Australia’s school
education system’.
Even the harshest critics of NAPLAN
acknowledged the importance of having
a national standardised testing program:
The teaching profession continues to
recognise that it is essential to have
a National Assessment program for
Australia’s students. Such a program
is imperative if we are to support
communities most in need, to track how
educational standards are developing
and to assist individual students to
grow and progress to their optimal level.
(Teachers’ union)
Whether or not the current NAPLAN testing
program meets these high standards is the
topic of subsequent chapters in this review.
Chapter 3: Other national
educational assessment
practices
National educational assessment policies and practices vary, and the practices do not
relate systematically to the quality of students’ learning. This chapter describes practices
in countries selected because they are like Australia in important respects (Canada,
New Zealand, England, Scotland) or stand out because they are high-achieving (Finland,
Singapore, Japan). It concludes with a description of issues of relevance for consideration
in Australia.
Key points
• There are no assessment practices common across high-performing countries.
• Some have maintained external, subject-based examinations at the end of primary
school and in mid-secondary school.
• Some, like Australia, having abandoned external, subject-based examinations before
the end of secondary education, have introduced census assessments at particular
year levels of foundational skills in literacy and numeracy and, in some cases, of
science as well.
• In most cases, school results are provided only to the schools and the education
authorities but, in some cases, some of the school-level results are made
publicly available.
• In most cases, individual students’ results go to the students’ schools and their
families but, in Scotland, they go only to the schools.
• All those without census assessments, and some with them, use sample assessments
to monitor the performance of their education system.
Country assessment policies
and practices
Singapore
Singapore does not conduct assessments
of general literacy and numeracy skills.
It has retained subject-based, national
examinations taken by all students at
the end of primary education, in mid-secondary education, and in the final
year of secondary education.
The primary school curriculum includes
the following subjects: English Language,
Mother Tongue Language (MTL),
Mathematics, Science, Art, Music, Physical
Education, Social Studies, and Character and
Citizenship Education (Ministry of Education
Singapore, 2020c).
School-based assessments are conducted
in all levels of primary education but, from
2019, the extent is being reduced. To explain
the grounds for the change, the Ministry
of Education (2018) says, ‘To meet the
challenges of an increasingly complex world,
our students need to be lifelong learners. To
nurture lifelong learners, we need to help
our students discover more joy and develop
stronger intrinsic motivation in learning.’
Year-end, school-based examinations have
been removed from primary (P)1 (Year 1)
and P2 (Year 2) since 2019 and mid-year,
school-based examinations are being
removed from P3 (Year 3), P5 (Year 5),
secondary (S)1 (Year 7) and S3 (Year 9) over
the period 2019 to 2021. Annual reports to
parents/carers in the Holistic Development
Profile will no longer provide comparative
information on a student’s place in class or in
relation to a class mean but will still include
subject marks and grades, form teacher’s
comments, ratings of personal qualities and
reports on physical fitness, involvement
in community-based and co-curricular
activities, and school attendance.
At the end of primary education (P6, Year 6),
there is an annual national Primary School
Leaving Examination conducted by the
Singapore Examinations and Assessment
Board (SEAB). It involves oral examinations
and listening and comprehension
examinations in English Language
and Mother Tongue, as well as written
examinations, mostly between one and two
hours, in English Language and Mother
Tongue, mathematics and science (SEAB,
2020). The other primary school subjects are
not assessed in the national examination.
Students are admitted to secondary schools
based on merit in the leaving examination
and their choice (Ministry of Education
Singapore, 2020a).
In mid-secondary education, students sit the
Singapore-Cambridge General Certificate
of Education examinations depending
on their course of study. Students in the
Normal (Academic)-Level (GCE N(A)-Level)
and the Normal (Technical)-Level (GCE
N(T)-Level) sit the examinations in S4 (Year
10). Students in the Ordinary Level (GCE
O-Level) sit the examinations in S5 (Year 11),
or in S4 if they are taking an Express Course.
Students in the Technical Level who excel
in specific subjects may be allowed to take
the examinations at the GCE N(A)-Level
and students in the Academic Level who
excel in specific subjects may be allowed to
take the examinations at the GCE O-Level
(SEAB, 2020). From 2024, the GCE O- and
N-level streams will be replaced by subjects
grouped into three levels of study; from 2027,
the GCE O- and N-Level examinations will be
consolidated into a common examination
in a Singapore-Cambridge Secondary
Education Certificate (Times Online, 2020).
Students’ results in GCE O-Level can be
used for pre-university entry. They will be
admitted to a two- or three-year course
leading to the Singapore-Cambridge
General Certificate of Education Advanced
Level (GCE A-Level) examinations. Students
can choose to be examined at three levels
of study – Higher 1, Higher 2 and Higher 3
(Ministry of Education Singapore, 2020b).
Admissions to university courses are based
on examination performance and additional
interviews/tests if required (for example,
National University of Singapore, 2020).
Japan
National assessment of students in
Japan was conducted until 1964. It was
discontinued during a period of political
conflict with the Japan Teachers’ Union,
which was concerned that national
assessment was used by the government
to control educational content. In the 1990s,
there was considerable discussion of what
was taken to be a decline in the quality of
student learning attributable to the Ministry
of Education, Culture, Sports, Science and
Technology (MEXT) policy of yutori kyōiku
or ‘education that gives children room to
grow’, yutori meaning ‘relaxed’ or ‘pressure-free’. The debate was reinforced by a report
that university students could not perform
calculations with fractions (Okabe, Tose
& Nishimura, 1999 quoted in Kuramoto &
Koizumi, 2016, p.420). In 2002, MEXT issued
its ‘Recommendation for Learning’ which
called for ‘an improvement in scholastic
achievement’ and was seen to be a step back
from the relaxed approach (Kōichi, 2012).
The first Program for International
Student Assessment (PISA) survey of the
achievements of 15-year-olds in 2000,
however, suggested that Japanese
education was performing well prior to the
policy change. Only Finland was significantly
better than Japan in reading and none was
better in mathematics or science. In PISA
2003, ten countries were significantly better
in reading, three in mathematics and none
in science. This produced another discussion
about declining standards but then
improvement or maintenance was achieved
in PISA 2006 (nine ahead in reading, four
in mathematics and two in science) and
PISA 2009 (five ahead in reading, five in
mathematics and three in science). Some of
the relative decline was due to other high-performing countries joining PISA – Hong
Kong, Taiwan and Estonia from 2006, and
Shanghai and Singapore from 2009 (OECD,
2001, pp.53, 79, 88; OECD, 2004, pp.281, 92,
294; OECD, 2007, pp.296, 316, 56; OECD, 2010,
pp.54, 134, 151).
In the 1990s, the Curriculum Council was
tasked with monitoring the achievement
of goals and the content of the Courses
of Study. It proposed a comprehensive
nationwide survey of academic achievement
among a representative stratified sample of
students across school years and subjects
for each administration. In 2007, the Council
on Economic and Fiscal Policy considered
the enhancement of academic achievement
through competitive principles and
proposed the introduction of the National
Assessment of Academic Ability (NAAA).
The NAAA was introduced in 2007 for all
students but was administered to only a
sample of students in 2010 to 2012. It has
been administered to all students since
then. Students are tested in Grade 6 (end
of primary) and Grade 9 in Japanese,
mathematics and science, with English
added from 2019. Each subject test has a
section that assesses comprehension and
a section that assesses application skills.
Mean NAAA subject scores in each region
are announced annually. Average scores
are shared with schools and prefectures
so that they can identify weak schools or
areas of policy that need attention. MEXT,
however, requires schools and school
boards to publish their improvement
plans partly based on data drawn from
the national assessments.
Kuramoto & Koizumi (2016) claim there is
ambivalence to testing in Japan, largely
attributable to the high-stakes university
entrance examinations and the competitive
labelling of students based on the
university to which they gain admission.
This competitiveness is said to influence
preparation for the NAAA assessments in the
earlier school years with the consequence
that “almost every problem with Japanese
youth is attributed to testing” (p.418).
There are, however, earlier examinations
on completion of elementary (Years 1 to
6) and lower secondary school (Years 7
to 9) for entry to upper secondary school
(MEXT– Japan, 2020). ‘Admission into senior
high schools is extremely competitive, and
in addition to entrance examinations, the
student’s academic work, behavior and
attitude, and record of participation in the
community are also taken into account.
Senior high schools are ranked in each
locality, and Japanese students consider the
senior high school where they matriculate
to be a determining factor in later success’
(CIEB, 2020).
Canada – Ontario
Canada is a federation like Australia,
but school education is the exclusive
responsibility of the provinces and territories.
There is no national ministry of education.
Some ‘pan-Canadian’ issues are dealt
with collaboratively by the provinces and
territories through the Council of Ministers
of Education, Canada, but assessment
is a matter of provincial policy. This
description focuses on the most populous
province, Ontario. In PISA 2018, Ontario was
significantly ahead of Australia in reading,
mathematics and science (OECD, 2019b,
pp. 73, 76, 79).
At the primary school level, there are
‘curriculum-based, province-wide
assessments that measure the reading,
writing and maths skills they are expected
to have learned by the end of Grade 3 and
Grade 6. … All students who attend publicly
funded schools and who follow the Ontario
Curriculum are required to write them.’
There are four language sections and two
mathematics sections. They measure
whether students understand different
types of texts; express their thoughts
clearly for others to understand; and have
acquired the appropriate mathematics
skills to solve problems. The assessments
are conducted during a three- to six-day
period in late May and early June, thus
towards the end of the academic year.
Students have approximately one hour to
complete each section. The assessments are
conducted by the Ontario Education Quality
and Accountability Office (Ontario EQAO,
2020a), an arm’s-length organisation of the
Ontario Ministry of Education.
Results are available when students return to
school after the summer vacation. Students
receive an Individual Student Report directly
from their school to take home. Parents/
carers and students are assured that there is
no need to study for the assessments. ‘EQAO
assessments are based on the Ontario
Curriculum and do not require additional
preparation (for example, tutoring, extra
books)’ (Ontario EQAO, 2020a, p. 2).
Student cohorts are tracked from Grade 3
to Grade 6, with results identifying students
as ‘maintained standard’, ‘rose to standard’,
‘dropped from standard’ and ‘never met
standard’. EQAO emphasises that the
assessment results tell only part of the story
on students’ learning and progress:
EQAO assessment results should be
reviewed alongside students’ daily
classroom work and other student-achievement-related assessment
information to gauge student learning
and determine where more support may
be needed. For students who do not meet
the provincial standard, it is particularly
important for parents or guardians
and educators to discuss how to work
together to close learning gaps and
improve student achievement (Ontario
EQAO, 2020a p.3).
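The four cohort-tracking categories can be read as a simple rule over two yes/no outcomes: whether a student met the provincial standard in Grade 3 and again in Grade 6. The Python sketch below illustrates that reading only; the function name, the boolean inputs and the sample cohort are assumptions made for illustration and are not drawn from EQAO's own procedures.

```python
def track_category(met_grade3: bool, met_grade6: bool) -> str:
    """Map a student's Grade 3 and Grade 6 outcomes to a tracking label."""
    if met_grade3 and met_grade6:
        return "maintained standard"
    if not met_grade3 and met_grade6:
        return "rose to standard"
    if met_grade3 and not met_grade6:
        return "dropped from standard"
    return "never met standard"


# Hypothetical cohort: (met standard in Grade 3, met standard in Grade 6)
cohort = [(True, True), (False, True), (True, False), (False, False), (True, True)]

# Count how many students fall into each category
counts = {}
for g3, g6 in cohort:
    label = track_category(g3, g6)
    counts[label] = counts.get(label, 0) + 1
```

Counting students per label in this way gives the kind of cohort summary that would sit alongside, rather than replace, the classroom evidence EQAO refers to.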
At the secondary school level, there is a
Grade 9 Assessment of Mathematics which
has different versions for students in the
academic and the applied mathematics
courses. Grade 9 mathematics teachers
may use this test as part of their course
assessment. There is a Grade 10 Ontario
Secondary School Literacy Test (OSSLT)
on which successful completion is one
of the requirements to earn an Ontario
Secondary School Diploma; however,
if students are unsuccessful in passing
the OSSLT after two attempts they may
enrol in a literacy course as an equivalent
diploma requirement. There are no
provincial subject-based examinations at
the end of secondary education.
In September 2017, the Government of
Ontario announced a review of provincial
assessment and reporting practices.
A statement issued in June 2019 indicated
that most of the changes involved additional
support for students who were English
language learners. On the Ontario Secondary
School Literacy Test, the report category
‘Unsuccessful’ would be replaced by ‘Not
Yet Successful’ in student, school, board and
provincial reports (Ontario EQAO, 2020b).
In 2012, EQAO introduced EQAO Reporting,
an interactive web-based reporting
application that enables school principals
to access their school’s EQAO data and
to link achievement data to contextual
and attitudinal data. This application was
made available to elementary school
principals in 2012 and to secondary school
principals in 2013.
England
Traditionally, primary and lower secondary
education in England was highly
decentralised with control in the hands of
school and local education authorities. There
were external examinations at the end of 11
years of schooling (for 16-year-olds) for the
General Certificate of Education Ordinary
Level (GCE O-Level) in which students sat
for examinations in eight to nine subjects.
External examinations were also held at the
end of secondary education after 13 years of
schooling (for 18-year-olds) for the General
Certificate of Education Advanced Level
(GCE A-Level) in which students studied
and sat for examinations in three subjects
of considerable depth. The GCE curricula
and examinations were conducted by
independent examinations boards. Schools
determined from which board they took the
subjects, including the possibility of taking
different subjects from different boards.
Over several stages of amalgamation, the
boards have been reduced to three: Assessment
and Qualifications Alliance; Pearson; and
Oxford, Cambridge and RSA, which all
operate under guidelines from the Office
of Qualifications and Examinations Regulation
(Ofqual). The guidelines include specifications of
subject content and of the mix of examination
and school-based assessment involved,
with the school-based component
ranging from 0% (for example, in English,
mathematics, chemistry) to 100% in art
and design (Ofqual, 2020).
The Education Reform Act 1988 introduced a
first national curriculum in England. National
assessments were introduced in English,
mathematics and science at the end of Key
Stage 1 (Years 1 and 2), Key Stage 2 (Years 3 to
6) and Key Stage 3 (Years 7 to 9) when most
students are aged 7, 11 and 14 respectively.
The GCE O-Level at the end of Key Stage 4
(Years 10 to 11) was replaced from 1987 by the
General Certificate of Secondary Education
(GCSE) to provide a national qualification
for students wanting to leave school at 16
years without going on to GCE A-Level. At
Key Stages 1, 2 and 3, schools were statutorily
obliged to report on students’ performances
using standardised assessment tasks (SATs).
At Key Stage 1 they were cross-curricular
tasks delivered in the classroom while at
Key Stages 2 and 3 they were tests.
A new primary curriculum was introduced in
2014. ‘End-of-key stage national curriculum
tests were re-designed to take account
of the national curriculum programmes
of study, and to provide more accurate
and reliable information for teachers and
parents/carers, and for school accountability
purposes. ... The new progress measures,
introduced in 2016, ensure that schools are
recognised for the work they do with all of
their pupils, regardless of whether these
pupils are high, middle or low attainers’
(UK Department of Education, 2017, p. 3).
[Progress measures] provide a much
stronger incentive for schools to focus
on improving the attainment of the
lowest-attaining pupils, rather than
focusing efforts on getting pupils over the
threshold of the expected standard.
Such progress measures require a
baseline to establish pupils’ starting
points … to work out how well, on average,
a school’s year 6 pupils do at key stage 2
compared to other pupils nationally with
similar starting points. … [T]he intention is
for a new assessment to be introduced in
the reception year to act as this baseline.
Roll-out of the assessment on a statutory
basis will be in autumn 2020, with a large-scale pilot in the preceding year (p.11).
The new reception measure will be used
only to create ‘school-level average progress
measures when the pupils reach the end of
key stage 2, 7 years later’.
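One way to picture a school-level average progress measure of this kind is sketched below in Python. It assumes, purely for illustration, that pupils are grouped into baseline bands, that each pupil's progress is their key stage 2 score minus the national average for pupils in the same band, and that a school's measure is the mean of its pupils' progress. The field names, bands and scores are hypothetical, and this is not the Department's published methodology.

```python
from collections import defaultdict
from statistics import mean


def school_progress(pupils):
    """Return {school: average progress} for a list of pupil records.

    Each record is a dict with 'school', 'baseline_band' and 'ks2_score'
    (all hypothetical field names).
    """
    # National average key stage 2 score for each starting-point group
    by_band = defaultdict(list)
    for p in pupils:
        by_band[p["baseline_band"]].append(p["ks2_score"])
    band_mean = {band: mean(scores) for band, scores in by_band.items()}

    # Pupil progress = own score minus the average for similar starters
    by_school = defaultdict(list)
    for p in pupils:
        by_school[p["school"]].append(p["ks2_score"] - band_mean[p["baseline_band"]])

    # School measure = mean progress across its pupils
    return {school: mean(vals) for school, vals in by_school.items()}


pupils = [
    {"school": "A", "baseline_band": "low", "ks2_score": 98},
    {"school": "A", "baseline_band": "high", "ks2_score": 110},
    {"school": "B", "baseline_band": "low", "ks2_score": 94},
    {"school": "B", "baseline_band": "high", "ks2_score": 106},
]
progress = school_progress(pupils)
```

In this toy data both schools have the same mix of starting points, so the difference in their progress values reflects only how far their pupils sit above or below pupils with similar baselines, not the raw attainment level.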
With the introduction of statutory reception
baseline assessment, assessments at the
end of Key Stage 1 will become non-statutory
from 2022-2023 and the existing non-statutory English grammar, punctuation and
spelling test will remain non-statutory (p.16).
A new statutory, national multiplication
tables check, however, has been scheduled
for introduction in the 2019 to 2020
academic year, with the intention that ‘[d]ata
from the assessment will be published
at national and local level only, not at school
level, and data from the check will not be
used to trigger intervention or inspection’
(p.17).
At the end of Key Stage 2, there is a
‘statutory duty for schools to report teacher
assessment judgements in English reading,
English writing, mathematics and science’
but the teacher judgements in reading
and mathematics are not used ‘to calculate
headline accountability measures, as data
from national curriculum tests is used
instead’. Consequently, and in an effort to
reduce teacher workload, the requirement
for teachers to assess students ‘against
teacher assessment frameworks in reading
and mathematics’ will be removed (p. 14).
Scotland
Scotland has had regular sample surveys
of students’ achievements from 1983,
initially by the Assessment of Achievement
Programme, that mainly assessed English
language, mathematics and science (1983
to 2004), then by the Scottish Survey of
Achievement (SSA) conducted annually in
primary and secondary schools from 2005
until 2009. The SSA collected evidence on
students’ achievement and progression;
teachers’ judgements of pupils’ attainment
levels; students’ and teachers’ experience
of learning and teaching; and changes in
performance over time. It focused on a
different aspect of the school curriculum
each year. In 2011, the SSA was replaced
with the Scottish Survey of Literacy and
Numeracy (SSLN), which supported
assessment approaches under Scotland’s
new Curriculum for Excellence (CfE). It was
conducted until 2016, assessing primary (P)4,
P7 and secondary (S)2 learners’ progress in
literacy and numeracy in alternate years.
It also collected information on students’
and teachers’ attitudes towards aspects of
learning and teaching.
The SSA covered literacy and numeracy −
but in the context of other subjects − so
provided a measure of progress in those
subjects as well. It collected information on
students’ attitudes and also, for example,
on practical work in science. Because some
local authorities thought the assessments
were too time consuming, the survey
was restricted to literacy and numeracy.
When that survey showed a decline in
performance, the local authorities said
that it was because it did not give good
information. In response, the Scottish
Government withdrew the sample survey
and developed a census assessment of all
students in particular years of schooling.
In 2016, the Scottish Government published
‘The National Improvement Framework
for Scottish Education’. Based on evidence
provided in the development of the
framework document, the government
nominated six key drivers for improvement,
one of which was assessment of children’s
progress (Scottish Government, 2016, pp.
44-45). The framework was developed to
support a new curriculum and intended
to ‘provide a level of robust, consistent and
transparent data across Scotland to extend
the understanding of what works, and drive
improvements across all parts of the system’
(ACER, 2018, p. 6).
The Scottish Government then discontinued
the sample-based SSLN and introduced
a census collection of Achievement of
Curriculum for Excellence (CfE) Levels
as a replacement to inform and target
improvement at school, local authority
and national level. Teachers of students in
P1, P4, P7 and S3 indicate whether each
child in their class has achieved the CfE
level associated with that stage. Teachers’
professional judgement is at the heart of
the Scottish education system (Hayward,
2018; Hutchinson & Young, 2011). However,
the government also introduced the
Scottish National Standardised Assessments
(SNSA) to provide nationally consistent
information about progress in literacy
and numeracy as an additional source of
diagnostic information to inform teachers’
professional judgements.
The Australian Council for Educational
Research (ACER) was contracted to develop
assessments in numeracy, reading/literacy
and writing for use in P1, P4, P7 and S3.
Schools administer Scottish National
Standardised Assessments (SNSAs) once
each year at a time they choose but
they may opt out of the program. The
assessments are digital and delivered online
and ‘reports to schools and teachers are
provided as soon as a learner completes
an assessment. Additional reports are
available for local authorities’ (ACER, 2018,
p. 6). National reports are produced each
year noting overall national achievement
results and analyses of the achievement
levels by gender, ethnic background and
for various subgroups of students seen to
have special needs (for example, those with
additional support needs, registered for
free school meals, in out-of-home care, and
speaking English as an additional language).
The latest national report, the second to be
produced, is for the academic year 2018-19
(ACER, 2020b).
The introduction of the SNSA tests was
contentious, particularly with young
children. In a review of testing at P1, Reedy
(undated), however, concluded that, while
‘media reports and some members of the
Scottish Parliament reported that the P1
SNSA was causing children distress … the
majority of head teachers and teachers did
not see any distress or discomfort as children
undertook the P1 SNSA, in fact, they reported
that the children enjoyed it’ (p. 39).
The Scottish assessments are similar to
NAPLAN in their coverage of reading,
language conventions, writing and
numeracy and in their application to
all students in a census in four years of
schooling. The Scottish assessments cover
P1, P4, P7 and S3 whereas NAPLAN assesses
in Years 3, 5, 7 and 9. Like NAPLAN Online,
the Scottish assessments are computer-delivered as adaptive (branching) tests.
Individual student reports are similar to
the NAPLAN student reports. There is a
continuum drawn up the page with bands
on it and descriptions of what a student
at a particular band can do. The students’
location on the band is marked but there
are no markers for school, local authority
or national means. There are three other
significant differences. One is that Scottish
schools can decide whether to administer
the tests. The national reports on the
program do not indicate whether any
schools did opt out, but they report that
95% of students were tested in 2017 to 2018
(ACER, 2018, p.12) and 93.4% were tested
in 2018 to 2019 (ACER, 2020b, p.12). These
participation rates match those achieved
with the NAPLAN census tests in Australia.
Schools can also decide when to administer
the tests, although some of the 32 local
authorities insist that all students take the
tests at the same time. This variation in time
of testing means that national means would
not be useful in any case. The other, more
substantial, difference is that the results
are not published and are not provided
to parents/carers and students unless the
school chooses to do so. They go only to the
school and the teacher to become one piece
of information that teachers use with their
own local information for their assessments
of students and reports to parents/carers.
The information collected centrally from
schools is teachers’ judgements of their
students based on their own assessments
and the students’ SNSA results.
New Zealand
New Zealand has no nationally mandated
external assessment of all students until
the end of secondary schooling, though the
National Administration Guidelines state that
‘Each board of trustees, with the principal
and teaching staff, is required to: … on the
basis of good quality assessment information,
report to students and their parents/carers
on progress and achievement of individual
students in plain language, in writing, and
at least twice a year and across the National
Curriculum … including in mathematics and
literacy’ (NZ Ministry of Education, 2020c).
The Ministry provides detailed information on
assessment tools and resources (NZ Ministry
of Education, 2020a).
From 1997, the Ministry of Education
provided a School Entry Assessment
that schools could use to assess students’
concepts about print, numeracy and their
oral language. A replacement assessment
is being developed.
A sample-based National Education
Monitoring Project operated from 1995 to
2010. It was replaced in 2012 by the National
Monitoring Study of Student Achievement,
which assesses students in Years 4 and
8 in arts, health and physical education,
science, English, mathematics and statistics,
social sciences, technology, and languages,
with the subjects rotated over a five-year
cycle. The assessments involve a mix of
group and individually administered tasks,
with some administered on computers.
The program is a collaboration between
the Educational Assessment Research
Unit at the University of Otago, the New
Zealand Council for Educational Research
(NZCER) and the Ministry of Education
(University of Otago, 2020).
As an alternative to external testing, new
National Standards were introduced in 2010.
The justification was that there was ‘an
urgent need to raise student achievement
and for parents/carers to be better informed
about their children’s performance in literacy
(reading and writing) and numeracy in their
primary and intermediate schooling years’.
Schools were required to use the standards
to guide teaching and learning, to report
children’s progress and achievements
against the standards to parents/carers, and
to include baseline data and targets in their
2011 Charters. Annual reporting of results
was required from 2012. Some of these
assessments are available in Māori, together
with others exclusively in Māori (NZ Ministry
of Education, 2020a).
A Progress and Consistency Tool (PaCT) was
introduced in 2015 to support teachers in
making dependable judgements about their
students’ achievement. It provides decision
frameworks that capture teachers’ ‘best fit’
judgements of their students on aspects of
mathematics, reading and writing. It locates
the students’ overall level of achievement on
scales on which progress can be tracked from
school-entry to Year 10. The PaCT scales were
originally benchmarked against National
Standards but are now linked to curriculum
levels (Education Services, NZ, 2020).
Judgements against the standards were
taken to be comparable across schools and
they were used, among other things, to
create league tables. They were also claimed
to narrow the curriculum and, after a change
of government in 2017, were abolished in
favour of ‘plain English’ reporting to parents/
carers on students’ progress without
reference to National Standards (Collins, 2017).
Teachers and schools use a range of
externally developed assessments
(NZ Ministry of Education, 2020a).
These include NZCER’s Progressive
Achievement Tests in reading, listening
comprehension, punctuation and grammar
and mathematics (NZCER, 2020b) and
the Assessment Tools for Teaching and
Learning (e-asTTle), which assess reading,
mathematics and writing (NZ Ministry of
Education, 2020b). With e-asTTle, teachers
can design their own tests by assembling
groups of items from an item bank.
Multiple-choice questions are machine-marked online. Open-ended questions are
marked by the teacher against a marking
guide and the results entered online.
Schools can compare their results with the
curriculum levels. The Ministry has also
developed Assessment Resource Banks
in mathematics, English and science,
administered by NZCER (2020a). These
materials have a strong formative focus.
Finland
In Finland, there are regular sample-based
surveys with standardised tests of student
learning outcomes in pre-primary (one year)
and basic education grades 1 to 9 (age 7 to
16). The surveys are conducted by the Finnish
Education Evaluation Centre (FINEEC).
There were, for example, surveys of English
in grade 7 in 2018 and mathematics at the
end of basic education in 2020. A survey
of English in grade 9 is scheduled for
2021. The samples involve 5% to 10% of the
relevant age group, with oversampling of
schools providing education in Swedish, the
second national language, to obtain stable
estimates for that sub-population.
The assessments are based on objectives
defined in the national curriculum. The
evaluation tasks are trialled in schools
that are not in the sample and the final
evaluation instrument is put together
based on feedback from teachers and
analyses of the data from the trial. Schools
in the sample receive information on their
students’ performances in relation to
the national average. Information is also
collected from principals, teachers and
students ‘on working methods and teaching
arrangements, educational resources,
student evaluation and study attitudes of
the pupils’ (FINEEC, 2020). The responses
from the students feed into an established
indicator to study students’ views of
themselves as learners of the subject,
on the attractiveness of the subject and
on the usefulness of studying it. A national
report is prepared and summaries are
provided to meet ‘the needs of the Ministry
of Education and Culture, the Finnish
National Agency for Education, Departments
of Teacher Education, education providers,
schools, teachers and other bodies’
(FINEEC, 2020).
Apart from those in sample schools,
students do not take any external
assessment until the end of secondary
education and then only if they wish to go
on to university. The focus of assessment
in the schools is formative to facilitate
and guide students’ learning. At the end
of general upper secondary education
(grades 10 to 12), which is the path to
university education, there is a Matriculation
Examination in which students take four
examinations – mother tongue, and three
by choice from second national language,
a foreign language, mathematics and one
from humanities and natural sciences. At
the end of vocational upper secondary
education, ‘Qualification-specific learning
outcomes evaluations focus at vocational
skills and are based on vocational skills
demonstrations and supplementary
evaluation material, such as students’ self-evaluations, self-evaluations of VET providers
and workplaces, and evaluations of the
quality of the demonstrations’ (Matriculation
Examination Board, Finland, 2020).
FINEEC conducts thematic and system
evaluations that focus on ‘the state of
a certain form of education … a whole
education system or some part of it …
education policy and its implementation
or the renewal and development processes
of the education system. Evaluations may
target one educational level or cover several’.
Recent evaluations have included ‘learning
and competencies in basic education and
general upper secondary education’, ‘best
practices for the integration of immigrants
into the educational system’ and ‘student
transitions and smooth study paths’
(FINEEC, 2020).
Potential relevance for
Australia
Table 10 compares the mean performances
of Australian students with the mean
performances of students in the countries
described in the previous section on
the three main PISA scales – reading,
mathematics and science in 2018. Canada–
Ontario, Finland and Singapore were
significantly ahead of Australia in all
domains; Japan was significantly ahead in
mathematics and science but not different
in reading; England was significantly ahead
in mathematics but not different in reading
and science; New Zealand was significantly
ahead in science but not different in
reading and mathematics; and Scotland
was not different from Australia in reading
and mathematics but significantly behind
Australia in science (OECD, 2019b, pp. 57, 59,
61, 73, 76-77 & 79).
The countries for which assessment policies
and practices have been summarised in the
preceding section are higher performing
than Australia or equal to Australia except for
Scotland in science where its performance is
significantly lower than Australia’s.
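Table 10: Position of comparison countries in relation to Australia in PISA 2018
Location            Reading          Mathematics      Science
Canada – Ontario    Ahead            Ahead            Ahead
England             Not different    Ahead            Not different
Finland             Ahead            Ahead            Ahead
Japan               Not different    Ahead            Ahead
New Zealand         Not different    Not different    Ahead
Scotland            Not different    Not different    Behind
Singapore           Ahead            Ahead            Ahead
Key: Ahead = significantly ahead of Australia; Not different = not different from Australia; Behind = significantly behind Australia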
The interesting question, from the
perspective of this review of NAPLAN, is
whether there are common features of their
assessment policies that are significantly
different from Australia’s, which might
account for their higher performances.
Australia’s policies provide for census
assessments of students in NAPLAN in
Years 3, 5, 7 and 9 in reading, language
conventions, writing and numeracy, as well
as assessments of samples of students
on a three-yearly cycle in Years 6 and 10
in science, civics and citizenship, and
information and communication technology
literacy. NAPLAN results are reported
publicly at national and state and territory
level and for subpopulations of interest,
including for males and females, Indigenous
students and students with a language
background other than English. Reports on
students are provided to schools, parents/
carers and students and reports on schools
are provided to education systems/sectors
and schools and were, in earlier versions
of My School, made available publicly in
comparison with the results of other schools
with students from a similar level of socio-educational advantage. A comparison with
the other countries is provided in Table 11.
This table, together with the descriptions in
the summaries of country policies, makes it
clear that there are no assessment policies
or practices common to all the countries
surveyed. All of them have assessments
of all students at the end of secondary
education in whatever subjects they are
studying. Australia does too but that is
outside the range of NAPLAN testing.
The comparisons offered in the text of this
chapter and in Table 11 are for the years prior
to the end of secondary school.
There are high-performing countries in
Australia’s region, Singapore and Japan,
that retain subject-based examinations at
the end of primary education and in mid-secondary education. Ontario is a provincial
system like one of Australia’s large states.
It conducts census assessments quite like
NAPLAN and reports all results down to
school-level publicly. That is what Australia
did with NAPLAN results prior to the recent
changes to the My School website, which
no longer gives such visibility to school
comparisons. The website still provides the
school means, with confidence intervals,
on the results page and the comparison of
school means to means of students with a
similar background visible with a hover-over
function. The dominant reports on the My
School website for individual schools are
graphical representations of growth rates
for students between Years 3 and 5 and
between Years 7 and 9. England has census
assessments and now, like Australia, focuses
on growth not current status. It provides
school-level information to the schools,
as well as to central and local authorities,
but not publicly.
Scotland offers cohort assessments in
numeracy, reading and writing but school
participation is optional. Students’ results
are reported to schools, but they are not
provided to students or parents/carers. It is
expected that teachers will use the students’
results from the tests together with their
own information on each student for reports
to parents/carers and as part of school
reports to local authorities.
Finland and New Zealand have sample
assessments that are conducted in all
school subject domains on a cycle. Finland
adds a cycle of school reviews. New Zealand
provides schools and teachers access to a
range of assessment resources for local use.
Finland is a high-performing country that
has no census assessments of students.
It has a cycle of sample assessments that
covers all school subjects and thematic and
system evaluations.
Finland, New Zealand and Scotland are
rather similar in their current policies and
practices, but Finland has attracted more
international attention because it has
much higher results in PISA, and because
its current policies and practices are more
longstanding. Finland and New Zealand
have only sample assessments at a
national level. Scotland is attracting some
interest now because it does have census
assessment, despite an ‘opt-out’ provision for
schools, to obtain measures on all students
but does not provide the results beyond
the school to students and parents/carers.
Instead, teachers use the results from the
Scottish National Standardised Assessments
as one piece of evidence alongside the
teachers’ own assessments.
Finland gained a lot of international attention
when its students performed very well in the
first PISA in 2000. It was significantly ahead
of all others in reading, significantly behind
only Japan in mathematics and significantly
behind only Korea in science (OECD, 2001,
pp.53, 79, 88). Over the successive PISA cycles
since 2000, however, Finland’s performance
has declined significantly in all domains
(OECD, 2019b, pp. 57, 59, 61, 131).
Table 11: Nature of assessments in other countries
Country        Test population    Test domains                                       School years assessed (Australian equivalent)
Finland        Sample             All school subjects over time                      Years 1-9
Japan          Census             Japanese, Maths, Science, English                  Years 6, 9
Singapore      Census             English Language, Mother Tongue, Maths, Science    Years 6, 10
Ontario        Census             Reading, Writing, Maths                            Years 3, 6, 9, 10
New Zealand    Sample             All school subjects over a five-year cycle         Years 4, 8
England        Census             English, Maths, Science                            Years 2, 6
Scotland       On demand          Numeracy, Reading, Writing                         Years 1, 4, 7, 10
Remaining rows (cell entries not recoverable): System results public; Students’ results to parents/carers; School-level test results to school; School-level test results to public
Many educators from other countries,
including many Australians, have visited
Finland since the PISA 2000 results were
published in 2001, in the hope of learning
lessons for their own countries. Frequently,
they cherry-picked those policies and
practices that coincided with their personal
preferences for their own domestic policies.
Among them were the selectivity of teacher
education programs, high quality of
teachers, a high-level curriculum leaving
a great deal of discretion to schools and
teachers and the absence of external
assessments of students until the end of
upper secondary education for the students
in the general education stream. Visitors
presumed that they were seeing in Finland’s
current policies the seeds of its success
in 2000. They may have been seeing the
seeds of its decline to 2018.
Sahlberg, former Director-General of
the Centre for International Mobility in
the Finnish Ministry of Education and
Culture, warns against simple notions of
transfer between systems (2015, p.xxii) and
emphasises the need to understand the
complex series of reforms over time that
stood behind Finland’s high performance
in PISA in 2000 (p.5). Oates (2015) points out
that, while the Finnish national curriculum
is a general, high-level document,
government-approved textbooks gave detail
until 1992 and continued to be used after
that. Reflecting on developments in Finland
and in other countries, Sahlberg (2015, 2016)
has provided a list of features of reform in
other countries that he claims are different
from Finland’s and counterproductive: 1)
competition among schools for enrolment,
2) standardisation of teaching and learning,
3) increased emphasis on reading literacy,
mathematics and science, 4) borrowing
change models from the corporate world,
and 5) test-based accountability policies.
He invites those learning from Finland to
reject these assessment practices. They are
practices that characterise Japan, Singapore,
Ontario and England.
The more general conclusion of the survey
of countries covered in this chapter is
that there are no common assessment
practices in high-performing countries.
In the end, each will need to develop its
own policies and practices while examining
the practices of others.
Chapter 4: Quality of NAPLAN
digital tests
NAPLAN assesses all students in Years 3, 5, 7 and 9 in reading, writing, language conventions
(spelling, grammar and punctuation) and numeracy. From 2008 to 2017, all the tests were
delivered on paper. From 2018, a growing proportion of schools administered the tests
online in digital format, which were marked by the computer as the students took the tests.
The exception was writing, which was marked by human markers. This chapter considers
the properties of the digital tests, to some extent in comparison with their paper precursors.
The writing tests are considered in Chapter 5.
Key points
• From 2008 to 2017 NAPLAN tests in reading, language conventions (spelling,
grammar and punctuation) and numeracy have involved both multiple-choice
questions (for which answers can be machine scored) and short, constructed-response formats that humans mark.
• From 2018, a growing number of schools have used a new computer-delivered digital
version of the tests. In NAPLAN Online, all items are multiple-choice and students’
responses are recorded by the computer as right or wrong as students respond.
The current expectation is that all schools will use this format from 2022.
• From 2008 to 2016, the NAPLAN tests were based on the national Statements
of Learning for English and Statements of Learning for Mathematics. From 2017,
they have been based on the Australian Curriculum.
• NAPLAN Online delivers branching tests which, on the basis of the computer’s
scoring of students’ answers as they work through a test, branch the students
to items of different complexity after one third of the items have been answered
and again after two thirds have been answered. This ‘adaptive’ testing provides
better measures of high and low achievers than can be obtained when all
students answer the same questions.
• The use of some common items in the Years 3 and 5 tests, the Years 5 and 7 tests
and the Years 7 and 9 tests enables all students’ results, in a process called ‘vertical
scaling’, to be expressed on a common scale. Links back to the first NAPLAN enable
the results to be expressed on the scale that was originally established in 2008
in a process called ‘horizontal scaling’.
• There is uncertainty in all educational measurements due to uncertainty in the
measure itself and, in the case of NAPLAN, to uncertainty in the vertical and
horizontal scaling. The level of uncertainty depends on how much data the measures
are based on and on how far from the overall mean they are. The most precise means
are national, followed by state and territory means. The least precise are individual
students’ results. Means for large schools are more precise than means for small
schools. Means for schools and results for students closer to the national mean are
more precise than for those further from it.
• NAPLAN tests are currently census tests intended to be taken by the whole cohorts
of students in Years 3, 5, 7 and 9, with some specific adjustments to accommodate
students with disabilities. There are provisions for parents/carers to request students
be withdrawn from the testing and there are students absent for other reasons on the
days of testing. Non-participation rates vary across the states and territories and year
levels. Where the rates are high, that may bias estimates of state and territory means.
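The point in the key points about means for large schools being more precise than means for small schools can be illustrated with the sampling standard error of a mean, which shrinks with the square root of the number of students. This is a minimal sketch only: the function name and the standard deviation of 70 scale points are assumptions for illustration, not ACARA figures, and the sketch ignores the measurement and scaling uncertainty that also contribute to the error in NAPLAN results.

```python
import math


def standard_error_of_mean(sd, n_students):
    """Sampling standard error of a mean based on n_students independent results."""
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    return sd / math.sqrt(n_students)


# Hypothetical spread of student scores in scale points (an assumption).
ASSUMED_SD = 70.0

small_school = standard_error_of_mean(ASSUMED_SD, 25)
large_school = standard_error_of_mean(ASSUMED_SD, 400)

# The larger school's mean is estimated more precisely.
print(small_school, large_school)
```

Under these assumed values, a sixteen-fold increase in the number of students reduces the standard error by a factor of four.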
Content of tests
Paper tests
The NAPLAN tests in reading, language
conventions and numeracy were paper
tests for all students from 2008 to 2017 and,
in 2018 and 2019, for students in schools
not opting to use the new digital tests
provided as NAPLAN Online. The paper
tests consisted of multiple-choice items and
constructed response items that required a
numeric answer, a word or a short phrase.
Responses to the multiple-choice items
were recorded on a machine-readable form
and were machine marked. Responses
to the constructed response items were
marked by human markers trained to
apply nationally agreed marking protocols
and item-specific answer criteria (ACARA,
2020b). The structure and coverage of the
paper tests in 2019 are shown in Table 12
(ACARA, 2020e, pp. 27-28).
Information on the nature of the tests
is provided in Chapter 2. Copies of past
NAPLAN test papers and answers for all
years from 2008 to 2016 are provided online
for teachers, students, parents/carers and
anyone else interested in reviewing them.
Later tests are not provided because the
Australian Curriculum, Assessment and
Reporting Authority (ACARA) keeps the
specific questions secure ‘for a range of
purposes, including ACARA’s research and
development studies’ (ACARA, 2020b).
Keeping the items secure also enables items
to be reused to establish links between the
results in different years and to enable the
results for successive years to be located
on the same scale.
Table 12: Structure of paper NAPLAN tests, 2019
                                  Number of items    Time available
Year 3
  Reading                         37                 45 minutes
  Language conventions            50                 45 minutes
    Spelling                      25
    Grammar and punctuation       25
  Numeracy (no calculator)        36                 45 minutes
Year 5
  Reading                         39                 50 minutes
  Language conventions            50                 45 minutes
    Spelling                      25
    Grammar and punctuation       25
  Numeracy (no calculator)        42                 50 minutes
Year 7
  Reading                         50                 65 minutes
  Language conventions            50                 45 minutes
    Spelling                      25
    Grammar and punctuation       25
  Numeracy                        48                 55 minutes and 10 minutes
    No calculator                 40
    Calculator allowed            8
Year 9
  Reading                         50                 65 minutes
  Language conventions            50                 45 minutes
    Spelling                      25
    Grammar and punctuation       25
  Numeracy                        48                 55 minutes and 10 minutes
    No calculator                 40
    Calculator allowed            8
Some stakeholders regretted the loss of
access to the actual tests because they
also received the actual responses of their
students to the individual items and could
see precisely where errors were made. They
would then use that information to provide
additional instruction to individual students
or to adjust their teaching to the whole class.
As one stakeholder said:
Previously, teachers could see items,
conduct item analysis and have
conversations and learn from the
tests. Now there is a high-level item
description, link to the Australian
Curriculum and a link to an example item.
(Education expert)
There was an alternate view expressed
from the context of an overseas
assessment system.
Never allow information at the item
level to be reported. Instead, cluster the
items into curriculum concepts and
the report [to schools and teachers] can
then focus on curricula and not items.
(Education expert)
Online tests
From 2018, there has been a phased shift
from paper to online NAPLAN tests. In
2018, just over 15% of schools participated in
NAPLAN Online. In 2019, more than 50% did.
With NAPLAN cancelled in 2020, the target
date for all schools to undertake NAPLAN
Online is now 2022, not 2021 as earlier
envisaged (ACARA, 2020h).
With the paper NAPLAN tests, all students
take the same test. For high-performing
students, easy items provide essentially no
information on how well they can perform.
Similarly, for low-performing students,
difficult items on the common test tell little
about how well they can perform.
A key advantage of an online test is that it
can be adaptive and present students with
tasks targeted close to their performance
level and so provide much better information
on what each student knows and is able to
do. The branching structure for the NAPLAN
Online literacy and numeracy tests is shown
in Figure 2 (ACARA, 2020e, p. 30).
Figure 2: Branching structure of NAPLAN Online literacy and numeracy tests
All students start with common items for
their year level in a first testlet A. As they
respond to the items, the computer scores
each response ‘right’ or ‘wrong’. Students
determined by the computer to be doing
well at the end of testlet A are moved to
testlet D with more complex items while
those doing less well are moved to testlet
B with less complex items. Students who
struggle with testlet A are moved directly
to testlet C to give them an opportunity to
achieve success on the least complex items
before then moving to testlet B.
After completing testlets D or B, students
are moved to a third testlet based on their
performance in their first two: AD or AB.
The students performing at the highest
level are moved to testlet F, which has high-complexity items. Those performing at the
lowest level are moved to testlet C with easy,
low-complexity items. Those in between are
moved to testlet E. Those whose first pair
was AC are moved, as indicated, to B.
There were three versions of each testlet
with the versions being comparable in the
difficulties of the items, curriculum coverage
and skills assessed. The first version of each
testlet included items from the paper test
and new online items. The other versions
included items from NAPLAN 2018 to enable
the results to be placed on the same scale
as that used in 2018 and ultimately back to
2008, to which it had been linked.
The spelling test had a structure similar
to that in Figure 2, except that there were
only two testlets at Stage 3. The testlets
in Stages 1 and 2 involved spelling spoken
words delivered by the computer. The
testlets in Stage 3 involved proof reading
to detect spelling errors in text.
The grammar and punctuation test had
no branching. Instead it had three tests
of differing complexity to which students
were directed based on their final testlet
in the reading test. The structure is shown
in Figure 3. Students who had completed
reading testlet F were directed to a high-complexity grammar and punctuation test
F, those who had completed reading testlet
E were directed to a medium-complexity
grammar and punctuation test E, and those
who had completed reading testlet C were
directed to a low-complexity grammar and
punctuation test C. To link the results onto a
common grammar and punctuation scale,
there were common items in the grammar
and punctuation tests F and E and in the
grammar and punctuation tests E and C.
Figure 3: Branching structure of NAPLAN Online
grammar and punctuation test
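The routing through testlets A to F described earlier in this section for the literacy and numeracy tests can be sketched in code. This is an illustration under stated assumptions, not ACARA's routing rule: the function name, the number of items per testlet and the cut-off proportions are all hypothetical, since the report does not give the actual thresholds.

```python
def route_testlets(score_a, score_second, items_per_testlet=12,
                   high_cut=0.7, low_cut=0.3):
    """Return the sequence of three testlets taken by one student.

    score_a: number of items correct on the common testlet A.
    score_second: number of items correct on the second testlet.
    The cut-offs are hypothetical proportions correct, chosen for illustration.
    """
    path = ["A"]

    # Stage 2: branch on performance in testlet A.
    frac_a = score_a / items_per_testlet
    if frac_a >= high_cut:
        second = "D"   # more complex items
    elif frac_a < low_cut:
        second = "C"   # least complex items, for students who struggled
    else:
        second = "B"   # less complex items
    path.append(second)

    # Students whose first pair was AC move on to B.
    if second == "C":
        path.append("B")
        return path

    # Stage 3: branch on combined performance in the first two testlets.
    combined = (score_a + score_second) / (2 * items_per_testlet)
    if combined >= high_cut:
        path.append("F")   # high-complexity items
    elif combined < low_cut:
        path.append("C")   # easy, low-complexity items
    else:
        path.append("E")   # in between
    return path


print(route_testlets(10, 11))
print(route_testlets(6, 6))
print(route_testlets(2, 5))
```

Because students on different paths answer different items, raw proportions correct are not comparable between them, which is why results are reported on the underlying NAPLAN scale.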
Reactions to the online format, from
stakeholders contributing to this review,
were generally positive – ‘Supportive of the
branched testing, even at the item level,
though acknowledged this kind of test is
difficult to create’ (Subject association);
and ‘Appreciate the branching in online
testing’ (Member of the NAPLAN Review
Practitioners’ Reference Group).
NAPLAN, especially with the online
branching model, is designed to direct
high-achieving students to their limit by
increasing item complexity until students
fail. It is contended that this assessment-methodology difference is not well
understood and may indeed contribute
to the negative feelings reported around
NAPLAN testing. That said, for low-
mid achieving students, schools have
anecdotally reported more positive test
experiences due to the platform design
being less confronting than a large paper
test, and that the branching method
gives opportunity for all students to
demonstrate proficiencies in matching
with their abilities. (Written submission
response: school system/sector)
There were anecdotes about Year 9 students,
who had found the NAPLAN tests in Years
3, 5 and 7 relatively easy and had finished
them quickly, being surprised at how much
longer they were taking in Year 9 with the
online version. Though they had been told
about the new form, they apparently did not
appreciate that they were slow because they
were taking a more complex test.
There was concern about the impact of
NAPLAN becoming exclusively online on
students in schools with poor connectivity
to the internet and for students who do
not have access to computers at home to
develop fluency in using them.
There will be some students who
potentially will never be able to do the test
online due to connectivity issues (rural/
remote). If the test could be electronic
not online (e.g. USB) that may be fine.
(Parents’/carers’ association)
For these students, the obvious solution
to poor connectivity at school at the time
of testing would be to provide the tests
on a portable device that can be accessed
locally at the school. Limitation in access to
computers outside school and opportunities
to develop fluent use is a serious problem
but of a different order and not relevant
only for NAPLAN.
One respondent speculated about the
possibility of having a paper version of the
branching test. That would not be possible
because the branching depends on being
able to mark students’ responses as they
work through a test.
Prior to the adoption of NAPLAN as a
common national test, the Northern
Territory used differentiated tests as an
approximation of what branching can
provide. Two versions of the paper test were
provided with one having more difficult
items and the other easier ones but with
the two versions having sufficient common
items for results to be expressed on the
same scale. Teachers were asked to give
each student what they judged to be the
most appropriate test. A similar model was
advocated by the Northern Territory when
the form of the new common NAPLAN
tests was being negotiated in 2007 but
the Northern Territory could not persuade
the other jurisdictions to adopt it. With the
transition from paper to branching digital
tests, the Northern Territory is gaining
even more than the flexibility it had lost
from 2008.
Since students respond to different items,
depending on which path they follow
through NAPLAN Online, comparisons
cannot be made among students based
on the proportion of items they answer
correctly. The meaningful measure is a
student’s score on the underlying NAPLAN
scale on which items are arranged by
difficulty and students are arranged by
achievement level.
The interpretation/analysis of online test
items has become more difficult for
teachers due to the online test. The paper
test showed teachers the proportion of
items that were right and they could
compare to similar schools and gauge
class performance. In the online test
you cannot compare a proportion
of items [correct] due to branching.
(Education expert)
While NAPLAN is conducted as both
paper-based and online assessments,
every effort is made to make the tests
parallel. One consequence is that the
power of online assessment cannot be fully
exploited, particularly the facility to use
more interactive items. The one exception
in the transition period is in the spelling
tests where spelling of words presented
by dictation has been added in the online
form to proofreading for spelling errors that
has been the means of testing spelling in
the paper test.
The structure and coverage of the online
tests in 2019, shown in Table 13, are
essentially the same as the structure and
coverage of the paper tests in 2019 shown
in Table 12.
Item selection
All items are trialled before they are
included in the final tests. Careful analyses
are undertaken to detect any bias in items
that would disadvantage males or females,
students from a language background
other than English, Aboriginal and Torres
Strait Islander students and students from
different states and territories. Bias cannot
be judged based on whether items might
be easier for some groups than others
because that may reflect real differences in
achievement levels. Bias is detected through
examining relative difficulties of items for
the different groups.
Table 13: Structure of NAPLAN Online tests, 2019
Year level  Test                                             Number of items  Time available
Year 3      Reading                                          39               45 minutes
            Language conventions                             50               45 minutes
              Spelling (audio dictation and proofreading)    25
              Grammar and punctuation                        25
            Numeracy (no calculator)                         36               45 minutes
Year 5      Reading                                          39               50 minutes
            Language conventions                             50               45 minutes
              Spelling (audio dictation and proofreading)    25
              Grammar and punctuation                        25
            Numeracy (no calculator)                         42               50 minutes
Year 7      Reading                                          48               65 minutes
            Language conventions                             50               45 minutes
              Spelling (audio dictation and proofreading)    25
              Grammar and punctuation                        25
            Numeracy                                         48               65 minutes
              No calculator                                  40
              Calculator allowed                             8
Year 9      Reading                                          48               65 minutes
            Language conventions                             50               45 minutes
              Spelling (audio dictation and proofreading)    25
              Grammar and punctuation                        25
            Numeracy                                         48               65 minutes
              No calculator                                  40
              Calculator allowed                             8
For example, in exploring the possibility
of gender bias for a particular test, such
as Year 7 numeracy, the question is not
whether females find the items harder or
easier than males. The question is whether
the relative difficulty of individual items
compared with other items is the same for
males and females. If an item stands out in
providing a view of male/female differences
in performance that is inconsistent with
the view provided by the other items, then
the inconsistent item would be judged to
be gender biased and so excluded from
consideration for inclusion in the final test.
Items that are inconsistent with other
items are detected using Differential Item
Functioning (DIF) analyses. The procedure
is discussed and results from its application
are provided in the annual NAPLAN
technical reports (for example, ACARA,
2020e, pp. 94-100). The DIF analyses behind
the development of the NAPLAN tests work
effectively to deliver unbiased tests.
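The idea of comparing relative, rather than absolute, item difficulties can be illustrated with a minimal sketch. It is not the procedure ACARA uses, which is described in the technical reports; the response data, the group labels and the 0.5-logit flagging threshold are all assumptions made for illustration. Each item's difficulty is estimated separately for two groups as the log-odds of an incorrect answer and then centred within each group, so that an overall difference in achievement between the groups drops out. Items whose centred difficulty differs markedly between the groups are flagged.

```python
import math

def centred_difficulties(proportion_correct):
    """Convert per-item proportions correct to logit difficulties centred on zero."""
    logits = [math.log((1 - p) / p) for p in proportion_correct]
    mean = sum(logits) / len(logits)
    return [d - mean for d in logits]

def flag_dif(p_group_a, p_group_b, threshold=0.5):
    """Return indices of items whose relative difficulty differs between groups.

    threshold is a hypothetical cut-off in logits, not an ACARA value.
    """
    a = centred_difficulties(p_group_a)
    b = centred_difficulties(p_group_b)
    return [i for i, (da, db) in enumerate(zip(a, b)) if abs(da - db) > threshold]

# Invented proportions correct on five items for two groups.
# Group B scores lower on every item, but only the item at index 3
# is out of line with the pattern shown by the other items.
group_a = [0.80, 0.70, 0.60, 0.50, 0.40]
group_b = [0.75, 0.64, 0.53, 0.20, 0.33]

flagged = flag_dif(group_a, group_b)
```

In this invented example group B finds every item somewhat harder, which by itself is not evidence of bias. Only the item whose relative position changes between the groups is flagged.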
Links to the Australian Curriculum
As described in Chapter 1, all of the states
and territories conducted census testing
of students in literacy and numeracy prior
to the introduction in 2008 of NAPLAN as
a common national assessment. When
the 2008 NAPLAN tests were developed in
2007, they were based on the Statements
of Learning for English (Curriculum
Corporation, 2005) and the Statements
of Learning for Mathematics (Curriculum
Corporation, 2006).
“Since 2016, NAPLAN tests have been
aligned to the Australian Curriculum:
English and the Australian Curriculum:
Mathematics” (ACARA, 2020g). Detailed
mapping of the paper tests and the online
tests in 2019, including mapping by pathway
for the online tests, is provided in the
NAPLAN 2019 Technical Report (ACARA,
2020e, pp. 34-41). One stakeholder noted,
somewhat ironically, the importance of the
link between NAPLAN and the Australian
Curriculum: ‘What are we testing if it is
not linked to the Australian Curriculum?’
(Member of the NAPLAN Review
Practitioners’ Reference Group)
Other stakeholders, however, revealed the
link not to be widely understood. There
were comments in the submissions and
the consultations that suggested that
such alignment would be a good idea.
These included, ‘NAPLAN is missing links
to the Australian curriculum.’ (Teachers’
association) and ‘Linking NAPLAN to the
Australian Curriculum could be a positive
development. This may help increase
confidence in the test.’
(School system/sector)
There were suggestions, however, that
the link was not well-established for the
language conventions test.
The grammar in NAPLAN does not match
the Australian Curriculum and there are
many items in NAPLAN that would only
be recognised by children trained in a
particular form of language conventions.
(Educational organisation)
Some thought the alignment could be
to more than the Australian Curriculum:
English and the Australian Curriculum:
Mathematics. Comments included, ‘It
would be a positive move to recalibrate
the NAPLAN tests back to the Australian
Curriculum. We could assess literacy as
part of science.’ (School system/sector); and
‘NAPLAN should be totally aligned to the
Australian Curriculum – including general
capabilities.’ (Written submission response:
principals’ association)
There should also be stronger alignment
with the assessment framework and
the Australian Curriculum. A continuous
improvement model for standardised
testing in Australia would consider
the benefits of a broader assessment
including General Capabilities, realising
a faster turnaround of data and more
engaging targeted assessment to deliver
a richer, more holistic results dataset more
reflective of the Australian Curriculum and
better-informing student learning growth
strategies. (Written submission response:
school system/sector)
Others also picked up on the final part of the
preceding comment, expressing a view that
strong links to the Australian Curriculum
could help teachers identify weaknesses in
their own teaching – ‘It should be used to
assist teachers to reflect on their teaching
of the Australian Curriculum.’ (Written
submission response: principals’ association)
There was also a suggestion that
communication with parents/carers about
NAPLAN results would be more helpful if
NAPLAN’s links to the Australian Curriculum
and pedagogy were made clearer.
Schools could assist parents/carers
and students in their understanding
of NAPLAN by better disseminating
information about NAPLAN; its place in
the curriculum, the process and how the
results are used to inform pedagogy and
policy. (Written submission response:
Education expert)
Psychometric properties of the tests
Effect of branching within online tests
There are three main reasons for moving
to online testing. One is to achieve rapid
scoring and reporting of results to students
and schools. A second is to capitalise on
the flexibility of electronic delivery to create
more complex and interesting test items.
The third is to make the tests ‘adaptive’
with students being presented with items
close to their achievement level to maximise
information about their level.
The adaptive, branching structure in
NAPLAN Online provides a further benefit
beyond better measurement of students
at the extremes of the distribution who
are not well served by a common test
taken by all students. Having all students
taking items that are better matched to
their achievement level yields more precise
measurement of students throughout the
range of performances.
Figure 4: Proportions of students taking each
path – Year 3 numeracy, 2019
The proportions of students being directed
by their performances to each of the
pathways in the Year 3 numeracy test are
shown in Figure 4 (ACARA, 2020e, Appendix
A.1). More than half (57.8%) the students were
directed from testlet A to testlet D. From
there, half (28.9%) went to testlet F and half
(28.9%) to testlet E. The fact that none
needed to be directed to the easy items in
testlet C suggests that the directions from
A to D were appropriate.
Just under half (41.9%) were directed from
testlet A to testlet B (27.8%) or direct to
testlet C and then back to testlet B (13.8%).
Of those who went to testlet B, most then
moved on to testlet E (23.8%) and most of
the rest moved on to testlet C (4%). The fact
that so few went to testlet F suggests that
the directions from A to B were appropriate.
The proportions of students being directed
to the various paths are determined by the
branching rules adopted in the program.
A 50:50 split between D and B from A would
be desirable. If testlet A works appropriately,
the paths ADC and ABF should be unlikely to
be followed, as was the case.
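The branching rules themselves are not set out in this report, but the way a routing table produces the seven paths can be sketched as follows. The cut scores in `CUTS` are hypothetical and chosen only for illustration; the testlet letters and the permitted routes (A to D, B or C; D or B to F, E or C; C to B) follow the description above.

```python
def next_testlet(current, score, cuts):
    """Pick the next testlet from the score on the current one.

    cuts maps a testlet to a list of (minimum_score, next_testlet) pairs,
    ordered from the highest minimum to the lowest.
    """
    for minimum, target in cuts[current]:
        if score >= minimum:
            return target
    raise ValueError("no route defined for this score")

# Hypothetical cut scores (number correct on the testlet just completed).
CUTS = {
    "A": [(8, "D"), (4, "B"), (0, "C")],
    "D": [(8, "F"), (3, "E"), (0, "C")],
    "B": [(10, "F"), (4, "E"), (0, "C")],
    "C": [(0, "B")],
}

def path(scores, cuts=CUTS):
    """Return the three-testlet path for scores on the first and second testlets."""
    second = next_testlet("A", scores[0], cuts)
    third = next_testlet(second, scores[1], cuts)
    return "A" + second + third

def all_paths(cuts=CUTS):
    """Enumerate every path the routing table allows."""
    return sorted(
        "A" + second + third
        for _, second in cuts["A"]
        for _, third in cuts[second]
    )
```

Moving the cut score between B and D in the first row of the table is what would shift the split of students between those two testlets towards or away from 50:50.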
There are seven paths through the
branching test structure as shown in
Figure 2 and summarised in Figure 4.
There are three versions of each of the
testlets so students following the same one
of the seven paths are likely to be answering
different questions. There are, in fact,
126 different paths.
The effectiveness of the branching
in achieving better measurement in
the extremes of the range of student
performances can be seen in Figure 5,
where the results for the online assessment
in Year 3 numeracy in 2019 are shown
(ACARA, 2020e, Appendix A.2).
The first testlet gives an approximation of
the distribution of achievement levels that
would be produced if all students took
the same test as they do with the paper
test. That distribution is labelled AXX in
Figure 5. By the end of the second testlet,
those moved to D (ADX in the figure) have
been differentiated from those moved to
B (ABX in the figure). The more difficult
items in D reveal the higher achievements
of those students. The reason that the ABX
distribution does not extend as far down
as the AXX distribution is that 13.9% of the
students in AXX were directed to testlet C
and not to either B or D.
By the time all students have completed
their third testlet, the whole population has
become much better differentiated than
could be achieved with a single, common
test taken by all. The subgroup taking
the ADF path, for example, extends into a
region that would not be measured well
by a common test without differentiation.
The same can be said of the subgroup
taking the ACB path. The branching clearly
achieves the purpose of measuring over a
fuller range of student achievements.
Figure 5: Distributions of student achievement by pathway – Year 3 numeracy, 2019
Scaling of results over year levels and time
In the first NAPLAN in 2008, students’ results
across Years 3, 5, 7 and 9 were set on a scale
with an overall mean of 500 and a standard
deviation of 100. All results were located on
the same scale using some common items
in the tests for adjacent Years: 3/5, 5/7 and
7/9. In NAPLAN, this common-item scaling
is called vertical equating.
Results in subsequent years are also located
on the NAPLAN 2008 scale in a process
called horizontal equating. Because all
NAPLAN items have been publicly released
up until 2016, it has not been possible to
include items from earlier tests in a current
test to use them in common-item scaling for
the horizontal equating. Instead, common-person scaling has been used. For each year
level, there are secure NAPLAN tests for
each domain, developed in 2009, that are
administered to samples of students at each
year level in the current year who also take
the full current NAPLAN tests. Common-person scaling involves the use of the scores
of the samples of students in the secure
tests from 2009 and the current tests. The
process is described for 2017, the last year in
which all students did the NAPLAN tests on
paper, in the NAPLAN 2017 Technical Report
(ACARA, 2018e, pp. 39-49).
The common-person scaling used in
the horizontal equating also provides
information for vertical equating for the
current year because there are common
items in the secure 2009 tests for 3/5, 5/7
and 7/9 taken by the samples of students
for each year level. So, for 2017 for example,
there were two sets of information on where
the Years 3, 5, 7 and 9 results should be
located on the NAPLAN scales established
back in 2008. Both were used. ‘The results
of common person [horizontal] equating
were checked against the results of 2017
common-item vertical equating and both
sets of results were taken into consideration
in finalising the scaling of the reading,
spelling, grammar and punctuation and
numeracy tests’ (ACARA, 2018e, p. 39).
The NAPLAN 2017 Technical Report (ACARA,
2018e) provides detail on the consistency
of the two scaling results for each of the
16 scales, reading, spelling, grammar and
punctuation, and numeracy at each of the
year levels, 3, 5, 7 and 9 (pp. 40-58) and
the resolution of inconsistencies between
the horizontal and vertical scaling (pp. 58-60). In the process, the performance of all
the link items is reviewed to identify any
showing a marked difference in relative
difficulty, compared with the difficulties of
other items, in the link. These items are then
removed from the link but retained as part
of the current performance measure.
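The review of link items can be illustrated with a minimal sketch of a mean-shift link. This is an assumption-laden illustration, not ACARA's equating procedure: the item identifiers, the difficulty values and the 0.3 tolerance are invented, and the review is done in a single pass. The shift is the average difference in difficulty of the common items between two calibrations, recomputed after dropping any item whose difference is far from that average.

```python
def link_shift(old, new, tolerance=0.3):
    """Estimate the shift that places new-calibration difficulties on the old scale.

    old and new map item ids to difficulty estimates from two calibrations.
    tolerance is a hypothetical limit on how far one item's difference may
    sit from the average difference before it is dropped from the link.
    """
    common = sorted(set(old) & set(new))
    diffs = {item: old[item] - new[item] for item in common}
    mean_diff = sum(diffs.values()) / len(diffs)
    kept = [item for item in common if abs(diffs[item] - mean_diff) <= tolerance]
    shift = sum(diffs[item] for item in kept) / len(kept)
    return shift, kept

# Invented difficulties for four common items in two calibrations.
old = {"q1": -1.0, "q2": -0.2, "q3": 0.5, "q4": 1.1}
new = {"q1": -1.4, "q2": -0.6, "q3": 0.1, "q4": 1.5}

shift, kept = link_shift(old, new)
```

Here the fourth item moves in the opposite direction to the other three, so it is left out of the link. In line with the text, such an item would still count towards students' scores in the current test; it is only excluded from the calculation of the shift.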
Occasionally, the equating procedures reveal
what could be a deficiency in a current
test. In the 2017 Year 9 reading test, the
distribution of students’ results, compared
with previous years, was compressed at the
top end, with proportions of students in
Bands 9 and 10 considerably smaller than in
previous years. The issue then was whether
this was a real decline in the performance
of high achievers or a consequence of the
test poorly measuring the high achievers.
Based on further investigation, ACARA’s
international Measurement Advisory Group
agreed that the results were a consequence
of poor measurement not poor student
performance (ACARA, 2018e, p. 60). The
solution was to adjust the distribution of
results on the 2017 Year 9 reading test to
match the mean and standard deviation of
the results on the 2016 Year 9 reading test.
This, of course, obliterated any real change in
performance of Year 9 students in reading,
either up or down, that might have occurred
between 2016 and 2017.⁴
⁴ As a matter of full disclosure, Barry McGaw, one of the members of the NAPLAN Review Panel, is a member of the ACARA Measurement
Advisory Group.
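Why matching the mean and standard deviation removes any real change can be seen in a minimal sketch of a linear rescaling. The scores and the target values below are invented, and the sketch assumes a simple linear transformation; the report states only that the mean and standard deviation were matched.

```python
import statistics

def match_mean_sd(scores, target_mean, target_sd):
    """Linearly rescale scores so they have the target mean and standard deviation."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return [target_mean + (s - mean) / sd * target_sd for s in scores]

current = [520.0, 560.0, 580.0, 600.0, 640.0]  # invented current-year scores
adjusted = match_mean_sd(current, target_mean=585.0, target_sd=60.0)
```

Whatever the mean of `current` was before the adjustment, the mean of `adjusted` is the target by construction, so a comparison of the two years' means after adjustment can only show no change. The rank order of students within the year is preserved.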
In 2018 and 2019 and, prospectively, in
2021 when both paper and online forms
of NAPLAN are in use, the scaling process
becomes more complicated because
it now involves not only the common-person horizontal and common-item
vertical equating, but also the use of data
from different modes of assessment. The
procedures used are described in the
NAPLAN 2019 Technical Report for the most
recent case where about 50% of students
took the paper test and about 50% the
online test (ACARA, 2020e, pp. 103-158).
Vertical equating for both the paper and
online versions involved, as before, common-item scaling with items common in each
form between adjacent tests, 3/5, 5/7 and
7/9. Horizontal equating involved common-person scaling using the secure equating
test that had been administered to samples
of students from 2009. This time it was given
in paper form to samples of students from
the populations taking both the paper and
online versions two weeks later. As in earlier
years, the link items were then reviewed to
remove any for which there was a marked
difference in relative difficulty compared
with other link items. The final shifts to
locate the current student performances
in both the paper and online test modes
in Years 3, 5, 7 and 9 on the historical
NAPLAN scale from 2008 involved resolving
differences between the shifts suggested by
the vertical and horizontal equating.
To see if there were mode effects exerting
arbitrary influences on the results unrelated
to differences in student achievement,
the distributions of results in 2019 for
both the paper and online groups were
compared with the distributions of the
results for the schools involved in previous
years. These comparisons were examined
by ACARA’s National Assessment, Data,
Analysis and Reporting Reference Group,
a group that has representatives from all
education departments, test administration
authorities (where these are different from
the department), Catholic and independent
school systems/sectors and other
relevant stakeholders.
This group determined that, where there
were inconsistencies in the distributions,
the shape of the 2019 distribution, either
paper or online, should be adjusted to match
that in the distribution of the results in the
corresponding 2017 paper test for schools
involved in the relevant group, paper or
online. As with the earlier adjustment of the
distribution of the 2017 Year 9 reading results
to match the distribution of the 2016 Year 9
reading results, these adjustments of 2019
distributions to match the corresponding
2017 distributions similarly obliterated any
real change that might have occurred
from 2017 to 2019. The numbers of 2019
distributions adjusted in this way are shown
in Table 14 (ACARA, 2020e, pp. 151-152).
That the distributions of 10 of 16 online scales
and 8 of 16 paper scales needed adjustment
indicates that achieving satisfactory
horizontal and vertical equating is difficult
when using two modes of assessment.
The task will become easier when all
assessment is in a single, online mode
from 2022.
The equating will also be more secure when
full advantage can be taken of NAPLAN
items no longer being released and more
being available for use in links, both vertical
and horizontal. All links could then be based
on common-item scaling without recourse
to common-person scaling through a
sample of students doing an additional
set of NAPLAN tests.
Table 14: 2019 scales for which distributions were adjusted to match those for 2017
                             Year 3   Year 5   Year 7   Year 9
Online version
  Reading                    Yes      –        –        Yes
  Spelling                   –        Yes      –        –
  Grammar and punctuation    Yes      Yes      Yes      Yes
  Numeracy                   Yes      Yes      –        Yes
Paper version
  Reading                    Yes      –        –        Yes
  Spelling                   –        Yes      –        –
  Grammar and punctuation    Yes      Yes      Yes      Yes
  Numeracy                   –        –        –        Yes
Confidence in measurement
NAPLAN results are reported for individual
students and in aggregate for various
groupings – schools, groups of students
(Indigenous, language background other
than English), jurisdictions (government,
Catholic and independent), state and
territory and national.
There are two sources of uncertainty
in NAPLAN scores – uncertainty in
measurement and uncertainty in
equating. Uncertainty in measurement is a
consequence of NAPLAN collecting data on
student achievement with relatively short
tests administered on a single occasion.
A parallel test, with different items covering
the same curriculum domain, would be
unlikely to yield exactly the same result
for each student. Longer tests would yield
more precise results.
Uncertainty in equating is a consequence
of the common items not having exactly
the same relative difficulty levels in each of
the tests in which they are embedded, that
is, 3/5, 5/7 and 7/9. When common-person
equating is used with all students taking
the relevant NAPLAN test and a sample of
those students taking the secure test from
2009, there is also uncertainty due to the
sampling. If a different sample of students
were used, the common-person equating
would not yield exactly the same results.
How precise results are depends on how
much data they are based on. Considering
average results, the most precise estimates
are national means. Means for larger
schools would be more precise than means
for smaller schools. The least precise are
individual student results. The extent of
uncertainty is shown in Table 15 for cases
at the national mean of 495.9. The national
mean is essentially a precise measure
because it is based on so much data and so
has virtually no uncertainty.
Table 15: Confidence ranges (95%) for the 2019 NAPLAN Year 5 numeracy scores
               Means                                               Individual
               Nation    School            School                  student
                         (50 students)     (25 students)
Upper bound              513.3             519.6                   539.4
Mean           495.9     495.9             495.9                   495.9
Lower bound              478.5             472.2                   452.4
In a school with 50 Year 5 students and a
mean of 495.9, the level of uncertainty in
the data means that, with a parallel test
notionally administered on 100 different
occasions with the same students, the
means on 95 of the occasions would be
expected to range from 478.5 to 513.3. On the
other five occasions, the mean would be
expected to be outside this range. The range
from 478.5 to 513.3 is the 95% confidence
interval for the mean of a school with
50 students in Year 5.
In a smaller school with 25 students in Year 5,
the estimate of the mean is less precise
and so the 95% confidence interval is wider.
For such a school with a mean measured to
be at the national mean of 495.9, it can be
said with 95% confidence that the mean is
between 472.2 and 519.6. For a single student
measured to be at the national mean, it
can be said with 95% confidence that the
student’s result is somewhere between
452.4 and 539.4.
The level of uncertainty in individual student
results varies depending on how far the
student is above or below the average.
The level of uncertainty is greatest with
extreme scores at the tail ends of the
distribution and smallest for those in the
middle of the distribution. For a student at
the national mean, as illustrated in Table
15, the 95% confidence interval extends to
43.5 points above and below the measured
score. For a student with an extreme score,
the 95% confidence interval would extend
more than 100 points above and below the
student’s measured score, and that is more
than two NAPLAN bands.
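The arithmetic behind the intervals in Table 15 can be checked with a short sketch. The bounds are those reported in the table; the 1.96 multiplier is an assumption (the usual normal-theory value for a symmetric 95% interval), since the report does not state how the intervals were calculated.

```python
Z_95 = 1.96  # normal multiplier for a 95% interval (assumed)

# (lower bound, mean, upper bound) as reported in Table 15
TABLE_15 = {
    "school of 50 students": (478.5, 495.9, 513.3),
    "school of 25 students": (472.2, 495.9, 519.6),
    "individual student": (452.4, 495.9, 539.4),
}

def half_width(lower, upper):
    """Distance from the mean to either bound of a symmetric interval."""
    return (upper - lower) / 2

def implied_standard_error(lower, upper, z=Z_95):
    """Standard error implied by the interval under the assumed multiplier."""
    return half_width(lower, upper) / z

for label, (lower, mean, upper) in TABLE_15.items():
    print(f"{label}: +/-{half_width(lower, upper):.1f} points, "
          f"implied SE {implied_standard_error(lower, upper):.1f}")
```

The half-widths recovered this way (17.4, 23.7 and 43.5 points) match the ranges quoted in the text, and show the interval widening as the amount of data behind the estimate falls.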
Figure 6: Extent of uncertainty in student NAPLAN results with print and branching digital tests, 2018
The extent of uncertainty is greater for scores
further from the mean. It would be expected
to be greater for a single print test form that
all students take than for a branching, digital
test in which students are presented with
items close to their achievement level to
obtain a more precise measurement. This is
illustrated in Figure 6 for both the print and
branching digital forms of the 2018 Year 3
numeracy test in which students responding
to the print test are shown in red and those
responding to the branching digital test are
shown in blue (ACARA, 2020e, p. 184).
The locations of the dots in the figure are
determined by the students’ NAPLAN results
(achievement score) on the horizontal axis
and by the extent of uncertainty associated
with the score on the vertical axis. The
centre of the horizontal axis is located at
the national mean and there the extent
of uncertainty associated with individual
student scores is lowest, although slightly
lower for those taking the branching digital
test (in blue) than those taking the print
test (in red). For NAPLAN results away from
the mean, the extent of uncertainty rises
the further from the mean students’ results
lie, either above the mean (to the right) or
below the mean (to the left). That increased
uncertainty occurs with both the print and
branching digital tests but is less for the
branching digital test. That is revealed by the
blue dots being lower on the graph than the
red dots for all results away from the mean.
Just as Figure 5 showed how the branching,
digital form measured the full range of
achievements better than either a common
print or digital test could, Figure 6 shows
that the branching digital form measures
students’ achievements throughout the
range with less uncertainty.
It is also important to recognise that the level
of uncertainty in a simple growth measure
(such as the difference between a student’s
successive NAPLAN results or between a
school’s successive mean NAPLAN results) is
greater than the uncertainty associated with
either of the two NAPLAN results from which
the growth is calculated.
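The reason a growth measure is less certain than either of its components can be sketched as follows, assuming that the measurement errors on the two occasions are independent, in which case the standard errors combine in quadrature. The starting value uses the 43.5-point interval for a student at the national mean together with an assumed multiplier of 1.96; it is illustrative only, because the two results in a real growth calculation come from different year levels and need not have the same uncertainty.

```python
import math

def se_of_difference(se_first, se_second):
    """Standard error of a difference between two independent measurements."""
    return math.hypot(se_first, se_second)

se_single = 43.5 / 1.96  # implied by a +/-43.5 point 95% interval (assumed multiplier)
se_growth = se_of_difference(se_single, se_single)
half_width_growth = 1.96 * se_growth
```

With equal uncertainty on both occasions, the interval around the difference is wider than the interval around either score by a factor of the square root of two, or roughly 40%.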
The annual NAPLAN Technical Reports
provide information on the confidence
bounds for those interested in examining
the precision of the measurements.
However, reports to students and parents/
carers provide only the actual scores of
the student with no indication of the
uncertainty of the score.
Such a level of uncertainty is not unique to
NAPLAN. Patterns of uncertainty would be
essentially the same for other standardised
tests such as the Australian Council for
Educational Research’s (ACER) Progressive
Achievement Tests that many schools use.
The level of uncertainty would be greater
for most locally developed assessments
that teachers create and use because they
would likely be less reliable measures than
standardised ones. Reliability and validity
can be increased with teacher-made
assessments when multiple assessments are
used, as long as ‘halo effects’ do not cause
one result to influence another arbitrarily.
Teachers understand uncertainty in
measurement since they see the variations
in assessed results for individual students
over time. It is why so many teachers
said, in submissions to the review and in
consultations, that they ‘triangulate’ with
multiple measures to obtain an increasingly
stable assessment of individual students.
Uncertainty in measurement does not have
a differential effect on horizontal equating
since the secure tests are administered
to a large sample of 800 students, large
enough to obtain stable estimates of
item difficulty. Uncertainty in horizontal
equating is a consequence of the relative
difficulties of the items in the secure test
revealed in the performance of the current
sample of students not being the same as
those obtained in 2009. The uncertainty in
horizontal equating affects the confidence
with which longitudinal comparisons can
be made in investigating trends over time in
the results of successive cohorts of students.
The two types of uncertainty (uncertainty in
measurement and uncertainty in equating)
have different impacts on interpreting
test results, depending on the level of
aggregation of the data. For individual students’
results, uncertainty mostly comes from
measurement uncertainty. However, as
the aggregation moves up, for example to
the state or national averages, uncertainty
in measurement has virtually zero effect
and other sources of uncertainty, including
equating uncertainty, become the
dominating source. It will be very important
to monitor the impact on uncertainty in
equating when NAPLAN is fully online and
the equating is exclusively common-item
equating with no common-person equating,
to see if the anticipated benefits accrue.
Establishing benchmarks
There are single scales for NAPLAN literacy
and numeracy, each with the form shown in
Figure 7 (ACARA, 2019b, p. vi). The vertical
equating, discussed above, enables the
students’ results in Years 3, 5, 7 and 9 all to be
located on the same scale. The scale has ten
bands but, as can be seen in Figure 7, only
part of the range is used at each year level.
Figure 7: NAPLAN assessment scale
As described in Chapter 2, there is a National
Minimum Standard (NMS) set on each scale
for each year level. The NMS is defined in the
following terms.
The second lowest band on the
achievement scale reported for each year
level represents the national minimum
standard expected of students at that year
level. The national minimum standard
is the agreed minimum acceptable
standard of knowledge and skills without
which a student will have difficulty
making sufficient progress at school
(ACARA, 2019c, p. vi).
There is no indication of how these
benchmarks became ‘the agreed minimum
acceptable standard’ in each domain but it
is clearly a matter of professional judgement
and consensus.
As also described in Chapter 2, there
are benchmarks set on the scales for
the international surveys of student
achievement – Progress in International
Reading Literacy Study (PIRLS), Trends in
International Mathematics and Science
Study (TIMSS) and Programme for
International Student Assessment (PISA)
with which comparisons were made with
NAPLAN. The discussion in Chapter 2
focused on similarities and differences
in trends over time in the percentages
of students performing at or above the
benchmarks on the various scales. The
international scales have more than one
benchmark, as indicated in Chapter 2.
The actual percentages below the lowest
benchmark in the latest assessments in
reading/literacy and numeracy/mathematics
in each of the surveys are shown in Table 16,
where it is clear that fewer students fail to
reach the NAPLAN benchmarks than fail to
reach the other benchmarks. This could be
because the students are better prepared
for the NAPLAN tests because of their
connection with the Australian Curriculum,
or engage with them more because they
are domestic census tests rather than
international sample surveys. The other
obvious possibility is that the NAPLAN NMS
benchmarks are less demanding than the
international ones. That is even more likely
given that all students who did not sit the
NAPLAN tests because they were exempt
are counted as below NMS. The percentage
of those who sat and are below NMS is
therefore smaller than the percentages
shown in Table 16.
Table 16: Percentages of Australian students below minimum standards benchmarks
NAPLAN
PIRLS
Year 3
Year 5
Year 9
Year 4
Reading/literacy
4.1
5.3
8.2
13.0
Numeracy/
mathematics
4.5
4.6
4.0
The 1996 National School English Literacy
Survey provided earlier data on the
performance of Years 3 and 5 Australian
students in reading and writing (Masters
and Forster, 1997b). The then federal Minister
for Education commissioned Masters and
Forster to set minimum performance
standards as benchmarks for Years 3 and
5 students. They did this with samples of
students’ work from the national survey,
obtaining judgement from teachers, as well
as from literacy and numeracy specialists,
on whether samples of students’ work
were above or below the standard they
expected of students at those year levels.
This enabled Masters and Forster to locate
the benchmarks on the reading and writing
scales developed in the national survey. The
conclusion was that 27% of Year 3 students
and 29% of Year 5 students did not meet the
relevant benchmarks (Masters and Forster,
1997a, p. 15).
Masters and Forster’s work and the
international surveys all raise the question
of whether the NAPLAN National Minimum
Standards are set too low. There is work
underway to set additional higher
benchmarks on the NAPLAN scales from
2021 for ‘proficient’ and ‘highly proficient’
performance (ACARA, 2019b, pp. 10-11).
TIMSS
Year 4
PISA
Year 9
15-years
19.0
9.0
11.0
19.6
Inclusiveness of the tests
The NAPLAN testing program is designed
as a census assessment in which all
students are intended to participate, though
parents/carers can withdraw their child.
There are three reasons why a student
may not participate in some or all of the
NAPLAN tests. They are:
Exemption – Students with a language
background other than English, who
arrived from overseas less than a year
before the tests, and students with
significant disabilities.
Withdrawal – Students withdrawn by their
parent/carer based on religious beliefs or
philosophical objections to testing.
Absence – Students not present at school
because of an accident or mishap or by
choice (ACARA, 2018e, pp. vii-viii).
The reasons for non-participation at a
national level are shown in Table 17 for
NAPLAN 2017, the last year in which only the
print form was used (ACARA, 2018e, pp. 59,
123, 187, 251). The rate of exemptions and
withdrawals was generally consistent over
all the year levels tested but the rates of
absence on the days of testing were higher
in the secondary years and particularly
at Year 9.
Table 17: Percentages of non-participating students in NAPLAN 2017 tests
                 Reading   Writing   Language conventions   Numeracy
Year 3
  Exemptions     1.9       1.9       1.9                    1.9
  Withdrawals    2.8       2.9       2.8                    2.7
  Absences       2.3       2.4       2.2                    2.7
Year 5
  Exemptions     1.9       1.9       1.9                    1.8
  Withdrawals    2.3       2.3       2.3                    2.2
  Absences       2.3       2.4       2.2                    2.8
Year 7
  Exemptions     1.8       1.8       1.8                    1.7
  Withdrawals    2.1       2.1       2.0                    2.1
  Absences       3.5       3.4       3.2                    4.0
Year 9
  Exemptions     2.0       2.0       2.0                    2.0
  Withdrawals    2.7       2.6       2.6                    2.7
  Absences       6.0       5.8       5.6                    6.6
Table 18: Participation rate (%) in NAPLAN in 2017
          NSW   Vic   Qld   WA    SA    Tas   ACT   NT    Aust
Year 3    97    95    93    95    93    95    94    88    95
Year 5    97    95    93    96    94    95    94    88    95
Year 7    97    95    91    96    94    94    95    86    95
Year 9    95    91    87    94    90    90    90    79    91
Participation rates vary across year levels
and states and territories, as can be seen
in Table 18 for NAPLAN 2017. Participation
rates are lower for Year 9 than Years 3, 5
and 7 and they are lower in the Northern
Territory, South Australia and Queensland
than in the other states and territories.
Non-participation is not randomly
distributed across student groups, with
low performing students (students with
lower prior achievement scores) more
likely to be absent from the tests than their
counterparts (Centre for Education Statistics
and Evaluation, 2016). As the test results are
not missing at random, this can bias the
state and territory mean scores. It would be
good for jurisdictions to investigate students’
reasons for absence and seek to reduce
the current levels.
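The effect of non-random absence on a mean score can be illustrated with a small sketch. The scores below are invented for illustration and are not NAPLAN data; the function name `participant_mean_bias` is likewise hypothetical. The sketch compares the mean of a whole cohort with the mean of those who sat the test, first when the two lowest scorers are absent and then when the same number of absences is spread across the distribution.

```python
from statistics import mean


def participant_mean_bias(scores, absent_indices):
    """Return (full-cohort mean, participants' mean, bias).

    bias = participants' mean - full-cohort mean, so a positive value
    means the reported mean overstates the cohort's achievement.
    """
    absent = set(absent_indices)
    present = [s for i, s in enumerate(scores) if i not in absent]
    full = mean(scores)
    observed = mean(present)
    return full, observed, observed - full


# Hypothetical scale scores for a cohort of ten students (illustration only).
cohort = [380, 420, 450, 480, 500, 520, 540, 560, 600, 650]

# Case 1: the two lowest-scoring students are absent.
_, _, bias_low_absent = participant_mean_bias(cohort, [0, 1])

# Case 2: two absences, one below and one above the cohort mean.
_, _, bias_spread_absent = participant_mean_bias(cohort, [1, 8])

print(bias_low_absent, bias_spread_absent)
```

In the first case the observed mean is higher than the true cohort mean; in the second it is unchanged. Two jurisdictions with the same absence rate could therefore carry different amounts of bias, depending on who is absent.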
Students with disability
Adjustments to NAPLAN testing are
available for students with a disability.
These include assistive technology that
does not ‘compromise a student’s ability
to independently demonstrate the literacy
or numeracy skills that are being assessed
through the NAPLAN tests’ (for example,
text-to-speech), alternative questions (for
example, audio presentation, adjustments
in font size, variations in colour), support
persons (for example, a scribe for the
writing test or a reader for other tests) and
extra time and rest breaks (where these
are part of the student’s regular teaching
and learning experience) (ACARA, 2020f).
There is also provision for some students
to be withdrawn.
In cases where the severity or complexity
of a student’s disability does not allow
the student to participate in NAPLAN, or
where a student is from a non-English-speaking background and arrived in
Australia less than one year before the
tests, students can be exempted from one
or more NAPLAN tests (ACARA, 2020i).
One submission noted a marked
improvement over time in the inclusiveness
of the testing arrangements.
As a teacher I was previously against the
Basic Skills Test in NSW as I taught in a
poor and culturally diverse community.
I have been impressed with the way
assessments have continually been made
more inclusive. (Education expert)
Perhaps because they are involved with
adjustments for students taking end-of-secondary school examinations and other
assessments, secondary schools were
said to be better than primary schools
in accommodating students’ needs –
‘Secondary schools are generally much more
aware of the accommodations available for
students with disabilities who sit NAPLAN.’
(Disability group representative)
Parents/carers and representatives of
agencies that provide support for children
with disabilities, particularly learning
difficulties, stressed the importance of
students’ participation to obtain an external
assessment of the students’ progress.
Parents/carers of children with disabilities
don’t have many sources to compare their
child’s performance with other similar
students. We don’t want to lose the value
of NAPLAN for that group of people.
(Parent/carers’ association)
A lot of parents/carers come to external
organisations for an assessment after
Year 3 NAPLAN, because, if their child
doesn’t do well at the national level,
it “sets off alarm bells”. There are also
more referrals after Year 7 NAPLAN,
because primary schools don’t generally
pass on information about students
with disabilities to secondary schools.
That usually falls to the parents/carers.
(Disability group representative)
Some parents/carers complained that the
decision on withdrawal was effectively taken
from them by school principals.
Some students with disabilities have
been asked by schools to not participate
in NAPLAN. This is due to the perception
that these students will have a negative
effect on a school’s NAPLAN results.
(Disability group representative)
Many parents/carers are told it would be
best for their child not to do NAPLAN.
It is the parent’s right to have their child
do NAPLAN. Schools don’t dissuade in
writing, but they do tell children not to
participate. (Parent/carers’ association)
Kids don’t discriminate but, when schools
identify children with disability and
keep them out of NAPLAN, it is noticed
by the other students and it begins the
process of discrimination. (Parent/carers’
association)
On the other hand, it was reported that,
if school funding is based on evidence
of student need, participation can be
encouraged – ‘If funding is attached to
NAPLAN data, some schools encourage
students with disabilities to attend tests
so that they can attract higher funding.’
(Disability group representative)
Aboriginal and Torres Strait Islander
students
Most Aboriginal and Torres Strait Islander
students live in capital and regional cities.
A minority live in remote communities, but
there is evidence that students from both
settings may face some problems with
NAPLAN. Exclusion from the tests is one.
Comparing schools through NAPLAN is
concerning. This has resulted in a negative
effect of some Aboriginal and Torres Strait
Islander students being asked to stay
home on the day of the tests to improve
school results. (Aboriginal and Torres Strait
Islander representative body)
There are concerns about the validity of the
tests for many Aboriginal and Torres Strait
Islander students. For some, it is the lack of
recognition of students’ ability in languages
other than Standard Australian English (SAE)
– ‘English as a second language is discussed
in the negative, rather than the benefits of
being multilingual.’ (Aboriginal and Torres
Strait Islander representative body)
Children who grow up with English as
first language still have difficulty. The
assumption is that, because they don’t speak
their own language, they didn’t qualify to
be recognised as students who would need
extra assistance. (Aboriginal and Torres Strait
Islander representative body)
Students in remote communities do not
speak SAE at home, in the playground
or in the classroom. Regional and metro
children who speak Indigenous English
as another Language or Dialect (IEAL/D)
at home still need to learn English for
peer interaction at school. This is not the
case for remote Aboriginal and Torres
Strait Islander students. We already
know that the SAE nature of NAPLAN
excludes these students. Making
these stats available to the public,
often used by media, perpetuates the
myth of inferior Aboriginal and Torres
Strait Islander student ability. (Online
submission respondent)
For others, the lack of validity of the NAPLAN
tests lies in the exclusion of Aboriginal and
Torres Strait Islander knowledge – ‘There is
research by other Aboriginal and Torres Strait
Islander education researchers that shows
that NAPLAN falls short of reporting some
really important attributes in how Aboriginal
students see the world.’ (Aboriginal and
Torres Strait Islander representative
body), and
‘The Stronger Smarter Institute released
results that showed Aboriginal and Torres
Strait Islander students outperformed non-Aboriginal and Torres Strait Islander students
in environmental knowledge. This reflects
Aboriginal and Torres Strait Islander student
capabilities.’ (Aboriginal and Torres Strait
Islander representative body)
Nevertheless, there was a strong view that
assessment in basic literacy and numeracy
is important for Aboriginal and Torres Strait
Islander students – ‘Literacy and numeracy
are incredibly important from the outset.’
(Aboriginal and Torres Strait Islander
representative body), ‘NAPLAN has a “very
important message of giving a good start
to get a good finish”. It is important to make
sure all our kids get a good start.’ (Aboriginal
and Torres Strait Islander representative
body), and ‘Our whole community suffers
when we are not creating people who
will contribute to their communities in
a constructive way. We want NAPLAN to
benefit our kids so they can contribute to
society.’ (Aboriginal and Torres Strait Islander
representative body)
Cultural and language diversity
Migrant students also have the advantage
of speaking more than one language but, in
taking the NAPLAN tests while still acquiring
Standard Australian English, can have their
academic achievements underestimated.
There is a general consensus amongst
Teachers of English to Speakers of
Other Languages (TESOL) scholars that
standards-based assessment regimes
are not inclusive of students learning
English as an Additional Language or
Dialect (EAL/D) because they operate
from a monolingual paradigm that fails
to acknowledge how the language and
literacy practices of multilingual learners
differ. (Written submission response:
educational expert)
Our national curriculum and assessment
frameworks must truly recognise the
linguistic diversity of Australia’s student
population by providing distinct learning
and assessment pathways that are
tailored to our students’ English language
learning needs. (Written submission
response: educational expert)
Similarity to other tests used
in schools
NAPLAN tests capture students’
performance on a single occasion in Years
3, 5, 7 and 9 so they give a limited snapshot
of students’ development. Teachers’ regular
observations and assessments of students’
work add a richness to the view, but they
lack the comparative perspective that
comparable assessments across the system,
state and nation can provide.
Many schools use other external,
standardised assessments to obtain further,
objective information on their students’
achievements. The most commonly reported
as used in schools are the Australian Council for
Educational Research’s (ACER’s) Progressive
Achievement Tests (PATs) which are available
to assess Early Years (mathematics and
reading in the first two years of schooling),
reading, vocabulary skills, spelling,
punctuation and grammar, mathematics,
science, and STEM contexts (inquiry and
problem solving in the domains of science,
technology, engineering and mathematics).
All the tests are available online and in print
except for PAT-R Spelling which is available
only in print and PAT Early Years and PAT
STEM contexts which are available only
online. The PATs typically cover the range
from early primary to Year 10, are mapped
to the Australian Curriculum and provide an
external reference for interpreting levels of
performance through norms that reflect
the distribution of performances in a
relevant, reference population (ACER, 2020a).
Other assessments nominated as in use
in schools include:
• The PM Benchmark Reading Assessment
Resources (Nelson, 2020).
• ICAS Assessments (UNSW Global, 2020b).
• Reach Assessments (UNSW Global,
2020a).
• Academic Assessment Services (2020).
• York Assessment of Reading for
Comprehension (GL Assessment, 2020).
• PROBE2 Reading Comprehension
Assessment (Parkin & Parkin, 2011).
These tests produce results with the same
kind of measurement uncertainty that
NAPLAN does. They avoid uncertainty
due to equating over time (horizontal
equating) because they do not attempt to
monitor system-level changes over time.
One reason that some schools prefer these
other standardised tests to NAPLAN is
that the results are available much more
quickly – ‘We place greater trust in the
ACER tests where we can get data straight
away.’ (Member of the NAPLAN Review
Practitioners’ Reference Group)
Another reason for preferring other tests
to NAPLAN is that schools have control
over the use of results. NAPLAN results
are made public; the others need not be
– ‘The advantage of PAT testing is that
marking “can be kept in-house”.’ (Member
of the NAPLAN Review Practitioners’
Reference Group)
Some schools prefer not to use standardised
tests at all – ‘We don’t use a lot of
standardised tests or do many tests. We
use anecdotal notes, observation and
moderation against work samples to assess
students.’ (Member of the NAPLAN Review
Practitioners’ Reference Group)
Our main data source is from formative
assessment and conferencing with
students. We also use focus groups.
We look at where students are at against
achievement standards and what they
need. When we get NAPLAN results,
we often see big discrepancies between
assessments. (Member of the NAPLAN
Review Practitioners’ Reference Group)
Submissions and consultations regularly
referred to schools ‘triangulating’ data
from different sources to best understand
students’ achievement levels. There
is, however, no straightforward way to
combine the information and, apparently,
little attempt to integrate the NAPLAN
results in schools’ reports to parents/carers
and students.
Comments received as part of this review
indicated that NAPLAN results are provided
to parents/carers and students in different
ways. Many schools send the individual reports
home in sealed envelopes. Some add a
covering letter with general comment on
NAPLAN and, usually, an invitation to discuss
the report with the child’s teacher though
one secondary school department head
added, ‘To be honest, I think many teachers
themselves wouldn’t be completely sure
how to interpret the data “as is”’. Whatever
the communication strategy, little link
seems to be made between NAPLAN results
and schools’ assessments. That may not
be surprising with NAPLAN results being
returned to schools months after the
testing. (The ACT gets around this problem
by distributing interim results for all but
writing to schools as soon as available and
before the end of Term 2.)
Some comments referred to triangulation as
desirable but not straightforward – ‘We use
ACER tests for reading, maths and spelling.
We’d like to try to triangulate with NAPLAN
data but the results don’t necessarily work
together’ (Member of the NAPLAN Review
Practitioners’ Reference Group).
There is currently a lot of “piecemeal” data
analysis occurring. Matching NAPLAN
assessment data to other assessment
data would be a powerful student data
mechanism. Ideally, this data could later
be matched with teacher judgement.
(Principals’ associations)
The term ‘triangulation’ seemed to be used
quite loosely in much of the comment in the
consultations and submissions to describe
only a process in which several pieces of
information are borne in mind as an overall
judgement of a student’s performance is
formed. That raises all the regular questions
about validity and reliability of assessment
– ‘How replicable would one person’s use
of the data or other evidence be?’ ‘Would
a different teacher reach the same overall
judgement?’ This is not to suggest that
‘triangulation’ be abandoned as a term or
as a practice; it is to suggest that it needs
to be made as rigorous as possible and as
collaborative as possible.
There is a new Australian initiative that is
intended to make this task much easier
for schools. It is the Online Formative
Assessment Initiative being undertaken by
the Australian Curriculum, Assessment and
Reporting Authority (ACARA), Education
Services Australia (ESA) and the Australian
Institute for Teaching and School Leadership
(AITSL) which:
aims to provide Australian teachers with
innovative assessment solutions that
integrate resources, data collection and
analytical tools in one ‘ecosystem’ that is
easily accessible, interactive and scalable
to meet future needs. The initiative will
give teachers the tools, flexibility and
professional learning they need to plan
teaching that will work best for the
students in their classroom. It will also
give students more insight into their
learning and better understanding about
next steps to improve progress.
…
The ecosystem will help teachers who
want to use online formative assessment
identify where their students are in
their learning and then work with
students on their next learning steps
by identifying and using effective
teaching practices and quality resources.
The system will offer access to quality
assessments and digital resources that
are aligned to the National Literacy and
Numeracy Learning Progressions and
the Australian Curriculum.
The ecosystem will also help teachers
bring together information about student
learning from a range of tools or resources
that they might already be using, to
create a coherent view of progress that
can be shared with the students and
parents (ACARA, ESA, AITSL, 2020).
Students’ NAPLAN results could be
one piece of evidence placed into
this ‘ecosystem’ and there it could be
more readily connected with the other
information that the teacher holds.
Summary
The NAPLAN tests in literacy, language
conventions and numeracy, developed in
2007 and used for the first time in 2008,
were the product of a national collaboration
created by the ministerial council to
build upon separate state and territory
assessments that had been developed
and implemented over the previous two
decades. The new national tests were based
on new national Statements of Learning
for English and the national Statements
of Learning for Mathematics until 2016,
after which they have been based on the
Australian Curriculum.
The first major change came with the
introduction of a computer delivered,
digital form of the tests in 2018 that was
used by just over 15% of schools while the
rest used the established print form. In
2019, over 50% of schools used the digital
form. It was expected that all schools would
have switched to the digital form by 2021.
With the 2020 NAPLAN testing abandoned
because of the COVID-19 virus, full adoption
of the digital form is now anticipated in 2022.
Using the digital and print forms of the
NAPLAN tests in parallel meant that the
digital form was required to match the print
form with the consequence that the digital
form could not exploit the full capacity of
electronic delivery to use more innovative
test items. That constraint will be removed
when all schools use the digital form.
There is, however, one constraint in the print
form that has already been removed for
those schools using the digital form. With
the print form, all students answer the same
set of questions and spend some of their
time answering questions that are either too
difficult or too easy for them and so provide
little information on how well the students
are performing. Computer delivered tests
can be similarly inflexible but, with NAPLAN,
the digital forms are adaptive. One third of
the way through the literacy and numeracy
tests, with students’ responses having been
marked as they answered, the students are
branched to more or less complex items for
the next third of the test. Then, at two thirds
of the way through the test the students
are branched again to more or less complex
items. (See Figure 2, p. 60.)
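The branching logic described above can be sketched in a few lines. This is a minimal illustration only: the three difficulty labels, the starting level and the cut-off proportions (`low`, `high`) are assumptions made for the example and do not represent ACARA's actual test design or routing rules. The function names `branch` and `route` are hypothetical.

```python
# Hypothetical difficulty labels, ordered from least to most complex.
LEVELS = ["easier", "medium", "harder"]


def branch(level, correct, total, low=0.4, high=0.7):
    """Choose the difficulty level of the next third of the test.

    A high share of correct answers moves the student up one level,
    a low share moves the student down one level, and anything in
    between keeps the student at the same level. The cut-offs are
    illustrative assumptions.
    """
    share = correct / total
    if share >= high:
        return min(level + 1, len(LEVELS) - 1)
    if share < low:
        return max(level - 1, 0)
    return level


def route(stage_results, start_level=1):
    """Return the difficulty path through the three thirds of a test.

    stage_results holds (correct, total) for the first and second thirds;
    the student is branched after each of them.
    """
    path = [start_level]
    for correct, total in stage_results:
        path.append(branch(path[-1], correct, total))
    return [LEVELS[i] for i in path]


# A student who does well in the first third and poorly in the second.
print(route([(9, 10), (3, 10)]))
```

Under these assumed cut-offs, the student in the example moves to harder items after the first third and back to medium items after the second, so more of the items answered sit near the student's achievement level.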
Better matching the test items to the
students’ achievement levels in this fashion
provides better coverage of the range of
students’ achievements at the high and
low ends of the distribution than can be
achieved with a single print form that
all students take. (See Figure 5, p. 67.)
The branching digital form also provides
measures of student achievement with
less uncertainty than the common print
form, particularly of high and low achievers.
(See Figure 6, p. 72.)
National Minimum Standards are set on
the NAPLAN scales for each year level.
The percentages of students performing
below those levels are smaller than
the percentages of Australian students
performing below the minimum standards
set in all of the international surveys of
student achievement in which Australia
participates. Australian students may do
better on NAPLAN because it better fits
with the Australian Curriculum but it could
equally be that the National Minimum
Standards are set too low on NAPLAN.
The NAPLAN tests are intended to be taken
by all but a small number of students who
are exempt because they have significant
disabilities or have recently arrived in
Australia and have a language background
other than English or who are withdrawn
on the request of their parents/carers on
religious or philosophical grounds. There are
more students who do not participate by
not attending on the NAPLAN testing days.
Participation rates are better at the primary
level than at the secondary level and they
also vary across states. The absences are
not random. It is poorer performers who
are most likely to fail to participate, and
that risks bias in jurisdiction and system/
sector means.
NAPLAN provides one piece of information
about students, alongside many other pieces
of information that schools and teachers
have. None of these other pieces is likely to
be more reliable or valid than NAPLAN; but
together they can all contribute to a richer,
and more valid and reliable, picture of the
student’s achievement and progress. The
way in which the pieces of information are
combined is what teachers and principals
call ‘triangulation’; however, this could be
made more rigorous and systematic.
Chapter 5: Quality of the
NAPLAN writing test
The NAPLAN writing test has been part of Australia’s national census testing program
since 2008. This chapter presents an overview and critique of the test. Areas of
focus include: the writing prompts; the scoring rubric; the conditions under which students
undertake the test, offered in 2019 in two modes (computer-based and pen and paper
based); and the alignment of the test to the Australian Curriculum: English. At the time of
writing, the Australian Curriculum is under review, and work is underway on the general
capabilities and the development of National Literacy and Numeracy Learning Progressions.
The Terms of Reference for the current Australian Curriculum Review indicate the intent to:
revisit and improve where necessary, the learning continua for the general capabilities
with reference to current research, in particular: replace the learning continua for
literacy and numeracy with Version 3 of the National Literacy and Numeracy Learning
Progressions and use the progressions to inform refinements to the Australian
Curriculum in English and Mathematics, as well as review the literacy and numeracy
demands of content in the other learning areas (Australian Curriculum, Assessment
and Reporting Authority [ACARA], 2020l, p. 4).
An international overview of the assessment of writing in selected countries
was also undertaken to inform the review and is provided in Appendix 4 as
supplementary information. A distilled set of NAPLAN writing test issues to be
resolved and recommendations on dealing with them are presented in Chapter 7.
Where figures for students below the National Minimum Standard are stated in this
chapter, they exclude exempt students.
Key points:
• The testing of students’ writing proficiency has been a long-contested site in Australia
and internationally, suggesting the complexity of the domain being assessed.
• Stakeholder consultations presented a sustained thread of dissatisfaction about
the content of the writing tests. The main issues are: the prompts, the choice of
forms, the criteria and the related range of scores, and the conditions under which
students write.
• The writing test has had unintended effects on how writing is taught; and
students’ writing for NAPLAN was frequently described as formulaic.
• Writing data from the NAPLAN National Reports (Table 19, Table 20 and Table 21)
show a picture of young people reaching Year 9 without achieving writing proficiency.
Concentrations of performance at below National Minimum Standard (NMS) are
higher for students in regional and remote areas. The difference in performance
between males and females is significant and has been evident each year since 2008.
• The data indicate that writing has not improved since 2011.
• Teachers’ professional judgement was repeatedly referred to as a critical missing
element in the NAPLAN writing test beyond teachers’ involvement in scoring as
part of marker panels.
• The mode of the writing test for students in Year 3 should be pen and paper based;
for students beyond Year 3, the mode should be computer-based.
• There is a widely reported lack of confidence in Automated Writing Evaluation (AWE),
especially for authorial aspects of writing.
• Those students in Years 5, 7 and 9 who have prior opportunities to develop typing
fluency and word processing skills by regularly using a keyboard are better placed
to produce sustained writing online.
• The explicit teaching of keyboarding skills and monitoring typing proficiency
beyond Year 3 are essential in developing students’ writing skills and for equitable
participation in computer-based writing assessments.
• How a country tests writing reflects interrelated decisions about: the purposes of
testing; the stages of schooling to be included; the curriculum domains and related
forms of writing to be tested; how the writing is to be scored (including criteria,
judgement method, human and machine scoring); the role of the profession; quality
assurance processes including online moderation; intended uses of the reported
results, and to whom and how they are released. All these matters are central to a
decision about whether a test is fit-for-purpose.
NAPLAN writing test
International testing programs (for example,
Progress in International Reading Literacy
Study (PIRLS), Trends in International
Mathematics and Science Study (TIMSS)
and Programme for International Student
Assessment (PISA)) have gathered
information about reading literacy,
mathematics and science. However,
the domain of writing has not been
included to date. A recent report from
the United Nations Educational, Scientific
and Cultural Organisation (UNESCO)
(2019) acknowledged that inter-country
assessment of writing is in its infancy. The
report identified that assessing writing
skills in domain areas is not well-advanced,
generally lying beyond the scope of large-scale learning assessments. Writing is
characterised as a:
…foundational skill required for
communication, future learning and
full participation in economic, political
and social life as well as in many aspects
of daily life. In a digital age and in the
context of a knowledge economy,
personal and social communication is
increasingly conducted in written text,
including through mobile phones and
social media. Assessing writing skills or
the use of them to measure domains,
such as creativity, curiosity and the
appreciation of culture, also generally lies
beyond the scope of Large-Scale Learning
Assessments (UNESCO, 2019, p. 42).
Contestation surrounding the direct
assessment of writing is not new. Humphry
and Heldsinger (2019) suggest that the
current lack of agreement about best
practice in assessing writing has occurred
‘[p]erhaps because of the complex and
multi-faceted nature of writing’ (p. 3).
Added to this are competing and strongly
held views about assessment validity in
the case of writing, and specifically the
nature and scope of evidence requirements;
rating approaches (holistic scoring,
analytic scoring) and how markers should
apply criteria, standards and numeric
scales, separately and in combination, as
components of valid assessment of writing
performance (Messick, 1994; Wyatt-Smith &
Adie, 2020). Added to these are thorny issues
about the conditions under which students
are expected to write, specifically the time
needed for authentic writing processes,
the roles of human scorers and automated
writing evaluation, and the conditions under
which teacher judgement can be made
dependable (Harlen, 2005a, 2005b). These
issues are addressed later in this chapter.
Some researchers have reported that the
typical matrix design of criteria and analytic
rubrics poses a threat to validity (Sadler,
2009). They suggest that criteria statements
alone do not guarantee high inter-rater
reliability or overall accuracy of scoring
(Delandshere & Petrosky, 1998; Wilson, 2006
cited in Rezaei & Lovorn, 2010) and point
to how criteria can trigger ‘pronounced
rating tendencies of a form that would
usually be interpreted to indicate a halo
effect’ (Humphry & Heldsinger, 2014, p. 253),
discussed later in this chapter.
Three main types of writing are identified
for use in the NAPLAN writing test. These
are imaginative, persuasive and informative
texts. To date, the last category has not
been used in the writing test, though there
is provision for it to be included in future
tests (ACARA, 2017). Informative writing is
arguably the most common and important
genre in both professional and business
writing. Perelman (2018) drew on the oft-reported link between what is tested and
what is taught, asserting that ‘not testing
informative writing devalues it in the overall
curriculum’ (p. 7).
Each year, the writing prompt is the same for
all children in Years 3 and 5, with a different
prompt offered to young people in Years
7 and 9. The form or genre is common
across the testing year levels. Since 2008,
the narrative and persuasive forms have
been set (narrative: 2008, 2009, 2010, 2016,
2019; since 2011, the prompts have been
for persuasive writing every year except
2016 and 2019) (ACARA, 2017). Readers are
also advised to see the section on Writing
in Chapter 2.
The recognised features of the imaginative,
persuasive and informative texts are
described below. Each text type has a
recognised primary purpose, use in social
contexts, recognisable structural features
and associated linguistic characteristics.
Imaginative texts — texts for which the
primary purpose is to entertain through
their imaginative use of literary elements.
They are recognised for their form, style
and artistic or aesthetic value. These texts
include novels, traditional tales, poetry,
stories (also known as narratives), plays,
fiction for young adults and children
including picture books and multimodal
texts. … For a NAPLAN writing test
students may be asked to write a story
that is centred on an idea, tension or
conflict, and use a structure that has
an orientation, a complication, and a
resolution.
Persuasive writing — texts for which
the primary purpose is to put forward
a point of view and persuade a reader,
viewer or listener. They form a significant
part of modern communication in both
print and digital environments. They
include advertising, debates, arguments,
discussions, polemics and influential
essays and articles. … A NAPLAN writing
prompt of this text type is constructed to
allow students to convince the reader to
adopt a given point of view or urge the
reader toward a specific action.
Informative writing — texts for which the
primary purpose is to inform. Informative
writing includes explanations and
descriptions with the express purpose
of informing the reader. It is one of the
most commonly used writing forms and is
central to learning across the curriculum.
... A NAPLAN writing prompt of this
type either provides the students with
the necessary information or requires
students to have sufficient content
knowledge of the topic for them to be
able to demonstrate their writing skills
(ACARA, 2017, pp. 15-16).
Regarding purpose, the NAPLAN
Assessment Framework (ACARA, 2017)
indicates that the NAPLAN writing task
is designed to assess the accurate, fluent
and purposeful writing of either an
imaginative or persuasive text in Standard
Australian English. Officially, the ‘NAPLAN
writing test complements the NAPLAN
conventions of language test assessing
spelling, grammar and punctuation within
the context of writing’ (ACARA, 2017, p. 14).
The assessment framework also indicates
that the test aligns with the Australian
Curriculum: English ‘through a focus on
three central types of texts that are essential
for students to master if they are to be
successful learners, confident and creative
individuals, and active and informed citizens:
persuasive, imaginative and informative’
(ACARA, 2017, p. 15).
Critique of the NAPLAN
writing test
Writing is arguably the most complex
performance in the three domains currently
assessed in NAPLAN. It is also the domain
that has attracted consistently negative
comment throughout the stakeholder
consultations. In the words of one
respondent, “writing can’t be easily fixed”
(Education expert). There was a sustained
thread of dissatisfaction about the writing
test, with the concerns being common
across Years 3, 5, 7 and 9. NAPLAN writing
results have also attracted considerable
negative comment from national and
state media outlets.
While there is widespread recognition
among those participating in the review
about the value of assessing writing in
schooling, the NAPLAN writing test received
criticism across stakeholder groups. While
there were those who proposed that the
writing test be terminated, there were also
views that it should be retained, though
in a redesigned form. Those taking this
stance referred to how the test generated
information valuable to schools and that this
was not otherwise available, especially for
monitoring and comparative purposes.
Significant concerns were raised about
how the test is designed, implemented
and reported. While the review panel heard
some variation in the intensity of the views,
the common thread was that overall, the
NAPLAN writing test does not support
students to produce excellent writing; is not,
in its current form, highly valued by teachers
and school leaders; is not well-designed;
impacts negatively on how writing is taught
in the classroom; and leads to narrowing of
students’ literacy learning. It was frequently
mentioned that students produce formulaic
writing for NAPLAN. It was also common
throughout the consultations to hear that
the writing test is having a negative impact
on children’s and young people’s enjoyment
of writing, their creativity, and opportunities
to express imagination in writing. An
additional claim, often repeated, was that
the test has the effect of suppressing
the quality of the writing students
could demonstrate at the high-end of
performance in favour of attempts to deliver
writing to fit ‘the formula’ – “NAPLAN has an
effect on the ‘joy’ of writing” (Parents’/carers’
association); and “The richness of writing
has been lost. ‘Cookie cutter’ writing is being
produced” (Subject association).
Formulaic teaching of writing and
teaching writing as formulaic
The repeated observation was that
NAPLAN has had unintended effects on
writing pedagogies. Respondents in the
consultations, including those who had been
involved in NAPLAN scoring, characterised
writing produced for NAPLAN as formulaic.
The potential for rich writing pedagogy was
talked about as being reduced to students
composing paragraphs to a formula. More
than this, the lesson for students was that
there is a set formula for producing quality
writing. In the words of one respondent,
“Some students learn a piece of writing
and reproduce it” (School system/sector),
with another commenting, “The writing
test tends to be quite formulaic and the
responses seem to be quite formulaic as
well” (Member of the NAPLAN Review
Practitioners’ Reference Group). This
observation is consistent with reported
downward pressure in some sites on
teachers to use mock tests or rehearsals
for NAPLAN writing, which consumed
considerable teaching time. The space for
creativity or imagination, and opportunities
for what some teachers have referred to
as being playful with language, appear
to be shut down.
Stakeholders also highlighted factors
external to the test itself that impacted
Years 3, 5, 7 and 9 and non-NAPLAN Years
(2, 4, 6 and 8). They reported growth in
tutoring businesses with some parents/
carers paying tutors to prepare students
for the writing test and other domains of
the test. Also identified was the active role
of private testing companies that offered
schools opportunities to sit NAPLAN-like
tests or simulations, with student work
scored externally and reported back to the
school. This scheduling of local versions of
NAPLAN-like testing and reporting had
been added to some schools’ own programs
of assessment and, in some cases, they had
become ‘normalised’ over time.
There was a range of responses addressing
test preparation and time spent on
building student readiness for writing as
a solo performance. At one end of the
continuum is the position that preparing
for NAPLAN writing interrupted the school’s
curriculum program delivery. This occurred
as teachers and students stopped planned
learning to focus instead on imaginative
or persuasive writing in the weeks leading
up to the testing period, with practice
sessions where students sat mock writing
tests. These provided students with the
experience of composing under restricted
time conditions, with no scaffolding and no
access to material and human resources.
Where this displacement of curriculum
occurred, NAPLAN test preparation became
the proxy curriculum and teaching writing
was reduced to a dominant focus on the
structure of the writing, the formula for
producing the narrative or the persuasive
form, with some commentary about
maximising the score on writing through
‘gaming’ the criteria. In commenting on
classroom preparation of students for
NAPLAN writing, one respondent said
“You’re forensically taking out different
things, and you can formulaically
produce a response that maximises the
result” (Member of the NAPLAN Review
Practitioners’ Reference Group), with
another commenting that “the writing
assessment is formulaic and focuses on
structure over content” (Member of the
NAPLAN Review Practitioners’ Reference
Group). Especially for students in Years 3
and 5, the requirement to produce writing
independently was described as ‘alien’, that
is, missing the types of scaffolding including
time for planning, drafting and editing,
and feedback that teachers reported to be
routine in the classroom.
Stakeholders’ comments showed a common
interest in how the writing test in its
entirety (choice of forms, prompts, criteria
and scores) could maintain a relevance
to classroom practice and the learning of
individual students and sub-groups. There
were numerous calls for greater
scaffolding or supportive framing to be
built into the test design, as evident in the
illustrative segments below: ‘Primary
students are never expected to write under
these conditions unless they are preparing
for the test. Learning should never be about
preparing for a test.’ (Respondent to the
online survey).
A student should be permitted to submit
three drafts of a piece of writing. They
should be given three times to complete
each stage of the process with access
to reference materials like dictionaries
to improve their work. (Respondent to
the online survey)
NAPLAN could incorporate a longer-term written component, possibly
across 1 term or several weeks, in which
quality of argument based on thorough
background research is prioritised
over the ability to write a lot in a highly
stressful environment. (Respondent to
the online survey)
Talk with parents/carers about NAPLAN
writing results being used for goal setting
and informing next step teaching appears
to be limited, reflecting the teachers’ widely
reported perception that the writing test
arrives back in the school too late to be used
for diagnostic purposes. This observation is
consistent with earlier research reporting
that NAPLAN results were not regarded
by teachers as having a clear ‘feedforward’
function, unless system/sector and school
leadership enabled this to happen using a
range of strategies including target setting
and a related focus on teachers’ assessment
literacy. While some jurisdictions have
investigated NAPLAN writing results and
school reporting of student achievement in
English, discussion about how the two relate
at the level of cohorts and individual young
people did not feature as part of routine
parent/carer and teacher communication
about students’ progress in writing.
The limited profile given by teachers and
parents/carers could reflect a perceived
significant difference between the NAPLAN
writing assessment and how writing is
assessed in classrooms, as mentioned
above. This is also evident in the responses
below, especially for students from a range
of cultural and linguistic backgrounds –
‘The marking schema is seen by many
to be formulaic and the criteria may not
reflect how writing is generally assessed
in classrooms’ (Respondent to the online
survey); and ‘It has been suggested that the
marking criteria should include a greater
recognition of the genre characteristics
as part of quality writing.’
(School system/sector).
Respondents also commented on how the
NAPLAN writing test results and the reports
specifically did not play an important role
in teacher discussions with parents/carers
about student progress or achievement.
Some teachers indicated that they had
no recollection of a parent initiating a
discussion with them about their child’s
writing results, though, as noted elsewhere,
primary teachers reported being aware
of high schools asking parents/carers to
provide NAPLAN results for high school
enrolment and related screening purposes.
Factors internal and external to
the test
While it is required practice for schools to
send home the NAPLAN reports, very few
references were made to teachers and
school leaders arranging to meet with
parents/carers about the data in the reports.
Overall, teachers, school leaders and
union representatives talked about how,
in its current form, the NAPLAN writing
test was not regarded as a world class
test. Concerns included that it lacked
authenticity or relevance for students and
did not reflect adequately the key aspects of
writing pedagogy and the range of valued
characteristics of writing that should be
covered in a high-quality writing test. It was
characterised as having significant design
limitations. These included the choice of
prompts, the limited range of forms or
text types, the marking guide, including
the stated criteria and the accompanying
numeric scores, the limited attention to
writing purpose and audience, and the ‘alien’
conditions under which the students were
expected to produce their piece of writing.
According to teachers and school leaders,
in the main students did not see NAPLAN
writing as relevant to how they learn about
‘good writing’. There was consensus that
using students’ ‘on-demand writing’ in
restricted time conditions means that
students did not have the opportunity to
demonstrate their best writing, irrespective
of the mandated prompt for the year.
Further, as a limited snapshot of writing on
demand, the widely reported comment was
that the NAPLAN writing reports had little,
if any, diagnostic utility. As mentioned earlier,
reports arrived back in the school too late to
inform targeted interventions at the level of
the whole class, sub-groups in a cohort and
individual students. A frequently reported
view was that the writing assessment did
not tell teachers anything that they did not
already know. For those espousing this view,
the value of NAPLAN writing results was
that they served to confirm teachers’ own
assessments of student writing.
This is a point regularly made, suggesting
that there are few, if any, surprises for the
teacher in the writing reports showing
ordering of students in the class. What
NAPLAN can add is comparative information
about the level of performance in other
schools with comparable students. There
were also repeated references by teachers
and school leaders to the practice of
triangulating data where different types of
assessment evidence are considered, to see
patterns and possible areas for intervention.
Very few of these related to NAPLAN writing
data. Where they occurred, there were
some references to spelling for school target
setting – ‘We pin things down to specific
areas. For example, if spelling dips down in
comparison to previous years […]. We’ve done
interventions in areas of targeted growth’
(Respondent to the online survey).
The writing test and alignment to
the Australian Curriculum
NAPLAN tests, including the writing tests,
preceded the Australian Curriculum: English.
To date, there has not been a published
mapping of the English curriculum,
including the Achievement Standards, the
General Capabilities, the National Literacy
and Numeracy Learning Progressions,
the NAPLAN writing test and the Marking
Guide. In its current form the writing test
aligns with the English curriculum through
its inclusion of the text types, imaginative
and persuasive to date, and its focus on
seven sub-strand threads of the curriculum
– purpose, audience and structures of
different types of texts; vocabulary; text
cohesion; sentences and clause level
grammar; word level grammar; punctuation;
and spelling.
The writing skills in Foundation to
stage 2, stages 3 to 4 and 5 to 6 are clear
in the curriculum. The stated intent
of the curriculum in secondary school
(Years 7 to 10) is that learning in English
‘builds on concepts, skills and processes
developed in earlier years, and teachers will
revisit and strengthen these as needed’
(ACARA, 2019a, unpaginated).
ACARA (2018c) has drawn useful
comparisons between the Australian
Curriculum: English and the British
Columbia New Curriculum English
Language Arts (BCC:ELA), with both
curricula built on the understanding
of students becoming reasonably
independent writers by Year 10.
By Year 10, Australian students are
expected to be able to construct
sustained texts for a range of purposes
that address challenging and complex
issues. Their writing should reflect an
emerging sense of personal style, use of
appropriate structure and use of language
and literary devices and features which
have been selected specifically for the
intended audience.
The BCC:ELA Composition Year 10
course develops students’ skills in written
communication. The course requires
students to explore and create coherent,
purposeful compositions through
processes of drafting, reflecting and
revising to create texts that demonstrate
breadth, depth and evidence of writing
for a range of situations.
Both curricula are built on the implicit
understanding that students have, by
now, become reasonably independent
writers. Instruction is centred on writing
techniques that allow students to craft
and refine their writing for very particular
purposes. For mastery of the content
in either curriculum, students must be
proficient in the fundamentals of writing,
be able to plan, draft and edit, be skilled in
accessing and applying research material
and be able to select and use language
forms and features in precise and
accurate ways (ACARA, 2018c, p. 57).
The Australian Curriculum: English
and the Singapore Curriculum: English
Language Syllabus have in common that
they “are built on the expectation that by
the conclusion of compulsory schooling,
students are independent writers with
control over essential grammar, spelling and
punctuation.” (ACARA, 2018d, p.57). Further,
writing techniques are foregrounded in
both curricula:
Both curricula are centred on writing
techniques that allow students to craft
and refine their writing for very particular
purposes. By Year 10, for example,
Australian students should be able to
construct sustained texts that address
challenging and complex issues. Their
writing should reflect an emerging
personal style, use of appropriate
structure and the deliberate choice of
language and literary devices to suit
the purpose (ACARA, 2018d, p. 64).
The above segments make clear an
expectation drawn from international
practice that by Year 10, students will be
proficient in strategies and skills of writing
– what is referred to in BCC:ELA as ‘the
fundamentals of writing’ (ACARA, 2018c, p. 57)
– and be able to select and use language
forms and features with well-developed
control of composing processes. This
expectation needs to be considered against
a backdrop of the NAPLAN writing data.
Sex
The percentages of male and female
students from 2008 to 2019 whose writing
was judged to be below the National
Minimum Standard (NMS) are shown in
Table 19 (ACARA, 2020j). As previously
mentioned, the below NMS figures stated
in this chapter exclude exempt students.
In 2019, nationally the rates for Year 3 were
2.8% for male students and 1.2% for female
students. By Year 9, the rates were 21.3% for
male students and 10% for female students.
The increasing percentages of students
with writing assessed as below NMS across
NAPLAN testing year levels provide an
opening to consider whether students have
become ‘reasonably independent writers’ by
Year 10. The information presented in Table
19 has been compiled using data from the
NAPLAN National Reports (ACARA, 2020j),
with a focus on NAPLAN writing results for
males and females.
Table 19: Percentage of male and female students below National Minimum Standard in writing
                          Percentage below NMS
Year  Genre       Sex     Year 3    Year 5      Year 7      Year 9
                          (band 1)  (band 3     (band 4     (band 5
                                    and below)  and below)  and below)
2019  Narrative   Male    2.8       7.8         12.8        21.3
                  Female  1.2       3.3         5.1         10.0
2018  Persuasive  Male    5.4       11.6        15.9        24.2
                  Female  2.2       5.1         6.8         12.6
2017  Persuasive  Male    3.7       9.0         14.5        22.2
                  Female  1.3       3.7         5.9         10.4
2016  Narrative   Male    2.7       7.2         12.1        20.8
                  Female  0.9       2.7         4.8         9.4
2015  Persuasive  Male    3.7       8.2         15.3        23.8
                  Female  1.6       3.4         6.5         11.3
2014  Persuasive  Male    5.8       10.7        13.7        22.5
                  Female  2.5       4.5         5.5         10.2
2013  Persuasive  Male    4.4       9.2         13.1        22.1
                  Female  1.7       3.4         4.8         9.0
2012  Persuasive  Male    3.8       8.3         12.3        23.1
                  Female  1.5       3.3         4.5         10.1
2011  Persuasive  Male    3.9       8.0         10.5        19.1
                  Female  1.6       3.2         3.9         8.0
2010  Narrative   Male    3.3       7.2         8.6         16.1
                  Female  1.3       2.8         2.9         6.2
2009  Narrative   Male    3.4       7.4         9.0         15.6
                  Female  1.4       3.1         3.3         6.0
2008  Narrative   Male    4.0       8.3         10.0        16.3
                  Female  1.7       3.4         3.8         6.6
Note: The figures above refer to the lowest band and exclude exempt students.
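The pattern described for 2019 can be checked with a short calculation. The sketch below is illustrative only: it copies the 2019 figures from Table 19, and the names `BELOW_NMS_2019`, `sex_gap` and `rises_with_year_level` are hypothetical helpers introduced here, not part of any ACARA tool.

```python
# Percentages below NMS in writing for 2019 (narrative), copied from Table 19.
BELOW_NMS_2019 = {
    "Male": {"Year 3": 2.8, "Year 5": 7.8, "Year 7": 12.8, "Year 9": 21.3},
    "Female": {"Year 3": 1.2, "Year 5": 3.3, "Year 7": 5.1, "Year 9": 10.0},
}


def sex_gap(rates):
    """Male-minus-female gap, in percentage points, for each year level."""
    return {
        level: round(rates["Male"][level] - rates["Female"][level], 1)
        for level in rates["Male"]
    }


def rises_with_year_level(by_level):
    """True if the rate increases at every successive tested year level."""
    values = list(by_level.values())
    return all(earlier < later for earlier, later in zip(values, values[1:]))


if __name__ == "__main__":
    for level, gap in sex_gap(BELOW_NMS_2019).items():
        print(f"{level}: male rate exceeds female rate by {gap} points")
    for sex, by_level in BELOW_NMS_2019.items():
        print(f"{sex}: rises with year level = {rises_with_year_level(by_level)}")
```

On these figures, the gap between male and female students widens at each tested year level, and the below-NMS rate for both groups rises from Year 3 to Year 9.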
Table 20: Percentage of students by location below National Minimum Standard in writing
        Location        Year 3  Year 5  Year 7  Year 9
NSW     Major cities    0.9     3.5     6.6     12.4
        Inner regional  1.8     7.6     12.9    21.4
        Outer regional  2.5     10.9    16.9    29.6
        Remote          4.5     10.1    26.2    44.7
        Very remote     6.9     26.2    35.8    51.2
Vic     Major cities    1.1     1.9     4.6     10.7
        Inner regional  1.6     3.8     8.1     15.4
        Outer regional  1.6     3.7     9.7     15.5
        Remote          0.8     0.5     4.6     6.4
Qld     Major cities    2.0     6.0     9.4     18.1
        Inner regional  2.7     8.6     14.9    24.4
        Outer regional  3.4     10.4    16.9    27.1
        Remote          8.3     17.6    27.4    40.1
        Very remote     17.2    36.2    46.8    58.6
WA      Major cities    1.5     4.4     7.2     10.7
        Inner regional  2.4     7.9     11.1    15.7
        Outer regional  4.1     10.3    15.9    18.4
        Remote          7.0     15.5    20.5    26.8
        Very remote     21.1    36.0    48.7    57.0
SA      Major cities    2.0     6.5     7.4     13.6
        Inner regional  3.0     9.0     9.9     17.6
        Outer regional  3.8     12.1    13.8    25.6
        Remote          4.3     9.5     15.2    21.8
        Very remote     28.8    40.7    46.2    42.4
Tas     Inner regional  2.4     6.9     10.5    16.8
        Outer regional  3.0     9.8     14.4    22.0
        Remote          6.3     14.3    15.9    33.0
        Very remote     n.p.    n.p.    n.p.    n.p.
ACT     Major cities    1.9     4.2     8.2     13.5
        Inner regional  n.p.    -       -       -
NT      Outer regional  6.2     13.5    21.1    32.8
        Remote          21.0    32.1    42.3    46.4
        Very remote     67.9    79.3    88.2    90.9
Aust    Major cities    1.3     3.9     6.8     13.0
        Inner regional  2.1     6.8     11.6    19.7
        Outer regional  3.3     10.0    15.8    25.1
        Remote          9.0     17.2    25.0    32.8
        Very remote     33.8    49.1    61.2    68.2
Key: ‘-’ or missing row indicates that the geolocation code does not apply within this state/territory or for this
year level.
‘n.p.’ indicates data not published as there were no students tested or the number of students tested was
less than 30.
Geographic location
In addition to the marked differences between
male and female students and across the
years of schooling in the percentages of
students below NMS, there are also marked
geographic differences as shown in Table 20,
where schools are classified using the
Australian Bureau of Statistics’ Australian
Statistical Geography Standard Remoteness
Structure with Major Cities of Australia, Inner
Regional Australia, Outer Regional Australia,
Remote Australia and Very Remote Australia
(ACARA, 2020j).
At the national level, the data show a stark
difference between the percentage of
Year 9 students below NMS in major cities
(13%) and the percentages reported for
remote (32.8%) and very remote (68.2%)
locations. Being below NMS for Year 9 is
defined as Band 5 or below on the NAPLAN scale.
The descriptions of the ten NAPLAN writing
proficiency bands in ACARA (2020e) are
shown in Table 21. The NAPLAN assessment
scale is provided in Chapter 4 Figure 7.
The data show a concerning picture of
writing performance in remote and very
remote locations in particular. In very remote
locations in three states and one territory
(NSW, Queensland, Western Australia and
the Northern Territory), the NAPLAN writing
test results show more than 50% of students
in Year 9 assessed as below the NMS.
Table 21: Descriptions of performance bands on the writing scale
Proficiency band
Writing skills and knowledge
Band 10
Writes a cohesive, engaging text that explores universal issues and
influences the reader. Creates a complete, well-structured and well-sequenced text that effectively presents the writer’s point of view. Effectively
controls a variety of correct sentence structures. Uses punctuation correctly,
including complex punctuation. Spells all words correctly, including many
difficult and challenging words.
Band 9
Incorporates elaborated ideas that reflect a worldwide view of the topic.
Makes consistently precise word choices that engage or persuade the reader
and enhance the writer’s point of view. Punctuates sentence beginnings and
endings correctly and uses other complex punctuation correctly most of the
time. Shows control and variety in paragraph construction to pace and direct
the reader’s attention.
Band 8
Writes a cohesive text that begins to engage or persuade the reader. Makes
deliberate and appropriate word choices to create a rational or emotional
response. Attempts to reveal attitudes and values and to develop a
relationship with the reader. Constructs most complex sentences correctly.
Spells most words, including many difficult words, correctly.
Band 7
Develops ideas through language choices and effective textual features.
Joins and orders ideas using connecting words and maintains clear
meaning throughout the text. Correctly spells most common words and
some difficult words, including words with less common spelling patterns
and silent letters.
Band 6
Organises a text using paragraphs with related ideas. Uses some effective
text features and accurate words or groups of words when developing
ideas. Punctuates nearly all sentences correctly with capitals, full stops,
exclamation marks and question marks. Correctly uses more complex
punctuation markers some of the time.
Band 5
Structures a text with a beginning, complication and resolution, or with an
introduction, body and conclusion. Includes enough supporting detail for
the text to be easily understood by the reader, although the conclusion or
resolution may be weak or simple. Correctly structures most simple and
compound sentences and some complex sentences.
Band 4
Writes a text in which characters or setting are briefly described, or in which
ideas on topics are briefly elaborated. Correctly punctuates some sentences
with both capital letters and full stops. May demonstrate correct use of
capitals for names and some other punctuation. Correctly spells most
common words.
Band 3
Attempts to write a text containing a few related events or ideas on topics,
although these are usually not elaborated. Correctly orders the words
in most simple sentences. May experiment with using compound and
complex sentences but with little success. Orders and joins ideas using a few
connecting words but the links are not always clear or correct.
Band 2
Shows audience awareness by using common text elements, for example,
begins writing with Once upon a time; or I think … because … Uses some
capital letters and full stops correctly. Correctly spells most simple words
used in the writing.
Band 1
Writes a small amount of simple content that can be read. May name
characters or a setting; or write a few content words on a topic. May write
some simple sentences with correct word order but full stops and capital
letters are usually missing or incorrect. Correctly spells a few simple words
used in the writing.
Given the data in Table 19 and Table 20,
the implicit understanding that students
have become reasonably independent in
writing by Year 10 appears problematic. At
a deeper level, if writing is understood to
be a key means through which students
learn in all curriculum areas, then the data
point to how Year 9 students whose writing
is assessed as below NMS are likely to face
significant barriers to success in senior
schooling, given that they are not able to
write in ways described in bands 6 to 10.
High performance in writing
at Year 9
The percentages of Year 9 students who
were in the top two bands (bands 9 and
10), and so well above National Minimum
Standard, are shown in Table 22 (ACARA,
2020j). This table indicates there are
generally declining percentages of students
in the top 2 bands from 2011 to 2019. At the
national level in 2011 there were 21.5% at this
level (13.4% achieving at band 9 and 8.1%
achieving at band 10). In 2019, only 12.4%
of students achieved at this level (9.4% at
band 9 and 3% at band 10).
Table 22: Percentages of Year 9 students in top two bands in NAPLAN writing
       2011  2012  2013  2014  2015  2016  2017  2018  2019
Genre  Per   Per   Per   Per   Per   Nar   Per   Per   Nar
NSW    19.8  18.8  17.3  15.2  13.4  11.7  16.8  14.1  13.1
Vic    25.4  19.4  18.2  15.7  15.3  15.5  16.6  11.4  12.8
Qld    20.3  11.2  14.2  12.6  11.3  8.5   12.6  8.6   9.2
WA     20.6  18.3  16.8  17.3  14.5  13.3  16.3  13.9  15.6
SA     20.5  14.9  15.5  14.1  12.9  10.9  12.1  9.4   13.8
Tas    17.3  14.2  13.4  11.4  10.4  12.8  13.0  7.8   12.8
ACT    26.3  20.7  21.8  18.2  17.0  13.7  19.5  15.0  14.4
NT     14.7  9.8   9.2   9.4   6.1   7.7   10.5  8.3   6.4
Aus    21.5  16.8  16.5  14.8  13.4  12.3  15.4  11.7  12.4
Key: Per=Persuasive genre; Nar=Narrative genre
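The phrase ‘generally declining’ can be made concrete with a small calculation on the national row of Table 22. The following is a minimal sketch, not an official analysis: the figures are copied from the ‘Aus’ row, and the function names `change_in_points` and `years_with_increase` are hypothetical.

```python
# National (Aus) percentage of Year 9 students in bands 9 and 10, from Table 22.
NATIONAL_TOP_TWO = {
    2011: 21.5, 2012: 16.8, 2013: 16.5, 2014: 14.8, 2015: 13.4,
    2016: 12.3, 2017: 15.4, 2018: 11.7, 2019: 12.4,
}


def change_in_points(series, start, end):
    """Change between two years, in percentage points (negative = decline)."""
    return round(series[end] - series[start], 1)


def years_with_increase(series):
    """Years in which the percentage was higher than in the preceding year."""
    years = sorted(series)
    return [
        later
        for earlier, later in zip(years, years[1:])
        if series[later] > series[earlier]
    ]


if __name__ == "__main__":
    print("Change 2011 to 2019:", change_in_points(NATIONAL_TOP_TWO, 2011, 2019))
    print("Years with an increase:", years_with_increase(NATIONAL_TOP_TWO))
```

On the national figures, the overall change from 2011 to 2019 is a fall of 9.1 percentage points, with year-on-year increases only in 2017 and 2019, so the decline is general rather than uniform.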
The writing results presented in Table 19,
Table 20 and Table 22 provide a basis for
considering greater explicitness about the
writing knowledge and skills that students
are expected to develop in the upper middle
years. This revisits the implicit understanding
made in the current Australian Curriculum:
English that students become reasonably
independent writers by Year 10. The data
suggest that this could be considered
aspirational. The data also open the space
to consider a strengthened focus on the
writing domain in the current review of
the Australian Curriculum: English and
other curriculum areas, and how these are
intended to align with the National Literacy
and Numeracy Learning Progressions.
Critique of the writing test
There are 10 specified criteria in the rubric
or scoring traits used for assessing writing
in NAPLAN. Nine of these are common
across the NAPLAN testing year levels, with
some customisation occurring in criterion 4,
dependent on the selected form of writing
(persuasive or narrative writing). The criteria
are not weighted equally. The suite of criteria
and the related score ranges are:
1. Audience [0-6]
2. Text structure [0-4]
3. Ideas [0-5]
4. Character and setting [0-4] (for narrative writing)
   Persuasive devices [0-4] (for persuasive writing)
5. Vocabulary [0-5]
6. Cohesion [0-4]
7. Paragraphing [0-3]
8. Sentence structure [0-6]
9. Punctuation [0-5]
10. Spelling [0-6].
As shown, the 10 criteria are to be marked on
four different score ranges [0-3, 0-4, 0-5, 0-6],
and then totalled to compute a composite
score from an available total of 48.
Paragraphing has the most limited range (0-3);
text structure, persuasive devices, character
and setting, and cohesion are slightly higher,
with a range of up to 4 possible points;
ideas, vocabulary and punctuation, up to
5 points; and audience, sentence structure
and spelling each with a range up to
6 points. The consultations brought forward
considerable dissatisfaction, not only with
the criteria, but also with the accompanying
scores. They were talked about as having a
distorting effect on teaching writing.
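The arithmetic of the composite score can be illustrated as follows. This is a sketch under stated assumptions: the criteria and score ranges are those listed above, while the names `COMMON_CRITERIA`, `CRITERION_4`, `rubric_for`, `composite_score` and `weight_share` are hypothetical and are not drawn from the NAPLAN marking guides.

```python
# Maximum score for each criterion common to both forms of writing.
COMMON_CRITERIA = {
    "Audience": 6, "Text structure": 4, "Ideas": 5, "Vocabulary": 5,
    "Cohesion": 4, "Paragraphing": 3, "Sentence structure": 6,
    "Punctuation": 5, "Spelling": 6,
}
# Criterion 4 depends on the form of writing; both variants are scored 0-4.
CRITERION_4 = {"narrative": "Character and setting", "persuasive": "Persuasive devices"}


def rubric_for(genre):
    """Return the ten criteria and their maximum scores for a genre."""
    return {**COMMON_CRITERIA, CRITERION_4[genre]: 4}


def composite_score(scores, genre):
    """Sum the ten criterion scores after checking each is within its range."""
    rubric = rubric_for(genre)
    if set(scores) != set(rubric):
        raise ValueError("scores must cover exactly the ten criteria")
    for criterion, score in scores.items():
        if not 0 <= score <= rubric[criterion]:
            raise ValueError(f"{criterion} score {score} is out of range")
    return sum(scores.values())


def weight_share(genre):
    """Each criterion's share of the available total, as a fraction."""
    rubric = rubric_for(genre)
    total = sum(rubric.values())
    return {criterion: maximum / total for criterion, maximum in rubric.items()}


if __name__ == "__main__":
    print("Available total:", sum(rubric_for("narrative").values()))
    for criterion, share in weight_share("narrative").items():
        print(f"{criterion}: {share:.1%}")
```

Because the criteria are simply totalled, a criterion with a range of 0-6 contributes twice as much to the available total of 48 as paragraphing, which has a range of 0-3.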
More than two decades ago, Messick
(1994) posed the issue of whether rubrics
validly meet the purposes of their usage
and the prospects of their achieving valid
assessments of performance, especially
in the case of complex performances.
His question was:
‘By what evidence can we be assured that
the scoring criteria and rubrics used in
holistic, primary trait, or analytic scoring
of products or performances capture the
fully functioning complex skill?’ (p. 20).
This question has relevance to this review.
Only limited empirical research has been
conducted to date concerning the nature
and function of rubrics in education
generally, and the NAPLAN writing rubrics
(narrative and persuasive) in particular.
Research by Humphry and Heldsinger (2014)
is a notable example. They identified what
they referred to as ‘a potentially widespread
threat to the validity of rubric assessments
that arose due to design features’ (p. 253)
and presented evidence from empirical
research conducted in the context of
assessing narrative writing using rubrics.
They claimed that:
the evidence indicates that the typical
grid or matrix design of the rubric design
used in this context [narrative writing]
induces pronounced rating tendencies of
a form that would usually be interpreted
to indicate a halo effect. The term halo
effect refers to a strong tendency for
ratings on separate items or criteria
to reflect a general rater impression
of a performance (pp. 253-254).
In this empirical investigation of two
different rubrics,
It was established that the issue was not
the raters’ inability to treat each criterion
independently but that the rubric itself
forced judgements to be dependent,
resulting in an apparent halo effect
(p. 262).
While Humphry and Heldsinger (2014) did
not argue that this finding was generalisable
to rubrics in other contexts, they proposed
the need to resolve the validity effect
potentially caused by the design of scoring
rubrics. They also asserted the value of:
more productive research into a number
of questions, such as to ascertain which
and how many criteria should be used,
whether the operational independence
of criteria can be established, and the
optimal number of qualitative gradations
for each separate criterion. Resolving the
threat to validity might also open the way
to more productive research into whether
raters make more valid assessments using
rubrics than holistic judgments (p. 253).
Regarding the number of criteria, other
researchers (for example, Sadler, 1989)
have suggested that fewer criteria may be
desirable to achieve high rater-consistency
(with self and over time) and inter-rater
reliability (self with other raters). In NAPLAN
writing it is arguably unreasonable to expect
individual markers to hold 10 criteria in their
head concurrently during a scoring episode.
ACARA indicated its intent to examine the
reliability and validity of the scoring rubric
developed to incorporate these criteria of
the writing task as part of its next four-year
work plan, with revisions to be made as
warranted (ACARA, 2017). To date, however,
there have been no published revisions to
the criteria. Calls for revisions echoed
through the consultations. One system/
sector, among many respondents making the
case, said that it ‘also supports a review of
the marking rubrics as the current criteria
do not reflect the reality of teaching writing
in Australian classrooms’.
Conventions of language test
The ACARA Assessment Framework:
NAPLAN Online (2017-2018) states that
the ‘NAPLAN writing test complements
the NAPLAN conventions of language
test assessing spelling, grammar and
punctuation within the context of writing’
(p. 14). The additional language conventions
test comprises three components: the
grammar items and the punctuation items
of the grammar and punctuation test, and
the spelling test.
The grammar items in the grammar and
punctuation test focus on knowledge and
accurate use of grammar at a sentence,
clause and word level. Grammar items
are developed from the content of the
Australian Curriculum: English sub-strand
threads of text cohesion, sentences and
clause level grammar and word level
grammar. (ACARA, 2017, p. 12)
The punctuation items in the test focus
on the identification of accurate use of
punctuation conventions. Punctuation
items are developed from the content
of the Australian Curriculum: English
sub-strand thread of punctuation.
(ACARA, 2017, p. 12)
The NAPLAN spelling test focuses on
the accurate spelling of written words,
and consists of an audio component
and a proofreading component. Spelling
items are developed from the Australian
Curriculum: English sub-strand thread of
spelling. (ACARA, 2017, p. 13)
One innovation in the NAPLAN Online
language conventions test is
the use of an audio file ‘in which words
are presented in context sentences’
(ACARA, 2017, p.13) with accommodations
to be made for students with hearing
impairments. A further innovation is the
interlocking grammar and punctuation
testlet design as part of the adaptive
design in NAPLAN Online. Respondents
indicated that the inclusion of spelling and
punctuation in both the writing test and
the language conventions test appeared
to be unnecessary, especially if the writing
test were to be ‘improved’, as indicated
below. This observation applied irrespective
of the mode of the NAPLAN writing test.
A preference for assessing language
conventions as part of the writing test
was repeatedly mentioned.
Do we need to continue to assess
language conventions in the NAPLAN
test? If we improve the writing test, we
could incorporate language conventions
as part of writing (Subject association).
The available scores for spelling,
punctuation, paragraphing, and grammar
were reported to be at the expense of
higher-order writing features. The NAPLAN
narrative and persuasive marking manual
(ACARA, 2010, 2013a) indicates that scorers
are to count the occurrence of correctly
spelt words defined as simple, common,
difficult and challenging. A script containing
no conventional spelling scores zero, with
correct spelling of most simple words and
some common words yielding a mark of
two. To attain a mark of six, a student must
spell all words correctly, and include at least
10 difficult words and some challenging
words or at least 15 difficult words (ACARA,
2010, 2013a). Respondents’ widely held
view, also reported by Perelman (2018),
was that this approach to scoring spelling had
the effect of prioritising accurate spelling
of easier words rather than attempted
approximations of more difficult vocabulary
that accrue minimal score – ‘The writing
test marking criteria encourage a formulaic
writing style. The marking criteria
privilege grammar and spelling over ideas’
(Subject association).
It was also reported that NAPLAN writing
encourages a response using a five-paragraph form. According to Perelman
(2018), ‘Although the five-paragraph essay
is a useful form for emerging writers, it is
extremely restrictive and formulaic. Most
arguments do not have three and only three
supporting assertions. More mature writers
such as those in Year 7 and Year 9 should be
encouraged to break out of this form. The
only real advantage of requiring the five-paragraph essay form for large-scale testing
appears to be that it helps ensure rapid
marking’ (p. 8).
There was a large corpus of commentary on
the current criteria for the NAPLAN writing
test, with the clear thread being the need for
a review of the criteria, the related numeric
scoring, the choice of prompts and the
limited range of forms set in the test.
The writing component of NAPLAN is
problematic for a number of reasons
including the prevalence of rehearsed
or formulaic writing; the accuracy of
assessment rubrics and criteria and the
challenge of effectively assessing in online
environments. … While acknowledging
the concerns… [it] remains supportive of
the writing component continuing at the
present time. (School system/sector)
[It] welcomes the review of the content
suite covered by NAPLAN testing. An
evaluation of the writing task is especially
appreciated, though [it] questions the
value of removing the writing task.
(School system/sector)
Markers and scoring
NAPLAN marker training is undertaken
within states and territories, with oversight
by ACARA. Teachers and other suitably
qualified personnel in each jurisdiction
are invited to apply to be NAPLAN
markers. Sample scripts are provided
to all markers nationwide; state-based
marker quality teams use these for locally
implemented training and calibration.
There is little published information
regarding processes for achieving training
consistency nationwide and monitoring
of local processes to maintain marking
quality within jurisdictions over time.
Limited information is available, including
in technical reports, regarding how
marker feedback processes are managed
during scoring operations at jurisdictional
and national levels.
The scoring process involves a single marker
who scores each student’s script, applying
all 10 criteria. A possible effect of this is
that the correlation between scores on
different criteria could be artificially inflated,
leading to possible violations of the local
independence assumption built into the
scaling model (Humphrey & Heldsinger,
2014). A study to examine such effects
should be undertaken given the concerns
expressed by a number of stakeholders.
Further options to explore in marking
implementation include: i) multiple markers
marking a subgroup of students’ scripts
and ii) different markers assigning marks
to different criteria for a particular student
– ‘There are validity issues where teachers
mark writing tests too fast and to set criteria.
The criteria and allocation of marks should
be closely examined’ (Subject association).
Teachers’ professional judgement was
repeatedly referred to as a critical missing
element in NAPLAN writing test processes.
The recurring and strongly expressed view
was that the re-visioning of NAPLAN writing
should make explicit provision for teacher
judgement, including a component of
moderation to seek comparability across
teachers and schools.
These observations open a space to consider
the potential forms of accountability and
verification of systems and processes that
could be nationwide. Addressing these
matters could build confidence in reporting.
One respondent pointed to the need for
bolstering confidence in the test stating,
‘There is not a lot of confidence in
the writing test. Those running the
assessment have indicated that writing
is the area of NAPLAN that they are least
confident in for reliability’
(School system/sector).
NAPLAN writing test mode
Test mode (computer-based writing test
and a pen and paper test) featured as an
important issue throughout the stakeholder
consultation process. The widely reported
view was that the teaching of handwriting
should be prioritised over the teaching of
keyboarding in the early years, and the
pen and paper test mode was frequently
mentioned as ‘the only’ defensible choice for
students in Year 3. A widely held view was
that Year 3 students were too young to sit
the writing test online with one respondent
stating, ‘they should be properly focusing on
handwriting and learning how to write’.
This stance is consistent with the findings
from the Centre for Education Statistics
and Evaluation (CESE)’s 2016 study
that investigated,
‘whether primary students in NSW
schools perform differently according
to the mode of the writing test… and
the extent to which typing proficiency
accounts for any differences observed in
students’ performance in a computer-based writing test versus in a pen and
paper test’ (Lu, Turnbull, Wan, Rickard
& Hamilton, 2017, p. 4).
The study found that, for Year 3 students,
‘the median typing speed was 9 words per
minute’, which, based on the literature, is
reported to be ‘lower than the handwriting
speed for this age group, hence it is likely
that many Year 3 students would struggle
to produce online texts comparable to
handwritten texts in a timed condition’ (p. 5).
A related key finding is that ‘most trial
schools do not explicitly teach keyboarding
skills’ (p. 4). Consistent with other published
research, the recommendation is that
‘typing instruction is best commenced in
the upper primary years’ (Lu et al., 2017,
p. 5). Perhaps more importantly, the study
recognised i) the need to investigate how
new technologies can be used to enrich the
teaching of writing and students’ experience
of the writing process, and ii) the need for schools to
identify ‘an effective method for developing
students’ typing fluency and to monitor the
development of their typing proficiency over
time, for students beyond Year 3’ (p. 5).
During the review consultations, participants
identified how keyboarding had potential
to add to the cognitive load of students in
sitting the writing test online. There were
also numerous comments regarding the
technical readiness of schools to participate
in NAPLAN Online, an observation also
reported in the 2016 CESE study. Some
review participants mentioned the limited
number of computers available in some
schools, the limited technical support
available, and the challenges faced by some
schools relating to system infrastructure
and school budgets. These issues go beyond
this review but are recognised as core in
enabling all students to have equity of
opportunity for success in participating
in the writing test online, and NAPLAN
Online more generally. The panel is aware
that systems/sectors are putting in change
programs to help schools transition to
online testing.
It was clear that many schools welcomed
the move to NAPLAN writing online,
indicating ‘we were ready for this’. However,
there were recurring concerns about the
preparedness of some students in Years 5,
7 and 9 ‘to do their best writing’ in time-restricted conditions, especially for those
with limited typing fluency.
Several respondents identified that those
students in Years 5, 7 and 9 who had prior
opportunities to develop typing fluency were
better placed to produce sustained writing
online, while those students with little, or no,
experience, were characterised as being at a
disadvantage. The explicit teaching of typing
fluency in the curriculum could go some way
to address this.
The move to NAPLAN writing online was
consistently referred to as having significant
resourcing implications for preparing
students for sitting the test in schools.
Operational impediments including those
relating to school and system/sector
infrastructure were also reported.
Concerns regarding NAPLAN Online
Automated Writing Evaluation were
widespread. This was the case, even though
respondents recognised that this would lead
to earlier return of writing test results to the
schools and that, for many students, their
‘normal’ is working online.
While there was a full range of views about
the move of NAPLAN to online, in the case
of the writing test, by far the dominant
view was that student writing should not
be machine marked: ‘marked by a robot’.
While many recognised that ‘a machine’ can
score technical or grammatical criteria (for
example, sentence structure, punctuation
and spelling), there was explicit rejection
of the idea that genre-based or authorial
criteria (for example, audience, ideas and
cohesion) of writing could be fairly assessed
using Automated Writing Evaluation. More
comprehensive research could address
concerns about the utility and efficiency
of the machine scoring technical and
authorial criteria.
The NAPLAN Technical Report (ACARA,
2020e) examined jurisdictional Differential
Item Functioning for both paper and
online marking for the writing test, finding
that “the expected score curves of the ten
ratings criteria were plotted for the eight
jurisdictions...None of the criteria showed
notable differences across jurisdictions”
(p. 100).
There is a need to examine a range of
strategies for building confidence in the
profession and in the community regarding
the reliability of Automated Writing
Evaluation scoring and its dependability in
assessing all aspects of writing. Implications
of Automated Writing Evaluation for
teacher professionalism merit serious
consideration. Further research designed
to investigate and collect evidence on the
feasibility and validity of automated scoring
systems and processes will be essential in
the context of NAPLAN writing. These next
steps would build on already completed
work by ACARA, and ongoing international
research and development, to generate
necessary empirical evidence on the validity
of automated scoring systems in relation to
the validity of test content and response
processes including scoring (ACARA, 2018f;
Bridgeman & Ramineni, 2017; Shermis, 2014;
Eliot et al., 2013).
The data that the writing test has produced
is distinctive internationally, and its potential
for longitudinal investigations of children’s
and young people’s trajectories in writing
is arguably under-utilised. Longitudinal
research would be greatly facilitated by the
availability of a Unique Student Identifier
with which to link students’ writing in
future tests.
Summary
The calls for change presented in this section
of the review go well beyond adding another
form (for example, informative writing). They
give voice to a wide range of significant
issues with the NAPLAN writing test.
These include:
• the content of the writing tests, including
the prompts, choice of forms and the
criteria and related range of scores
• technical issues of scoring including
dependencies among criteria, especially
in relation to adjacent year levels. This
opens the opportunity for investigating
the difficulty and ease with which scorers
can separate the criteria for scoring
purposes, discriminating among them
and addressing the specified features
within each criterion
• the conditions under which students sit
the test, providing limited opportunity for
planning, drafting, revising and editing
• the validity and reliability of the marking
operations currently undertaken within
jurisdictions and the potential for a
strengthened form of monitoring that
could be nation-wide.
The data suggest that the writing test in its
current form has not sustained confidence
and trust within the profession and at
system level. Writing results are perceived
to be less reliable than those from the
reading and numeracy tests. Further, writing
does not feature strongly in improvement
targets and strategies in some states and
at school level. Overall, it is clear that there
is little, if any, support for a claim that the
test is perceived to be well-aligned with
classroom writing and further, that it has
positively impacted on the teaching of
writing, and that students have benefited
from participation – ‘The singular focus
on persuasive writing has not had a good
impact on teaching and student writing is
formulaic’. (Education expert)
The calls for significant change to the writing
test were clear and sustained throughout
the consultations with teachers, school
leaders, other experts in systems/sectors
and unions, and professional associations.
Calls for change should be distinguished
from a uniform call for removing
standardised assessment of writing as
a domain in NAPLAN. The preceding
discussion shows that, while several
areas of dissatisfaction and concern were
identified, there remains clear support
for standardised assessment of writing to
continue in NAPLAN, following significant
redevelopment. The need for a redeveloped
writing test was the dominant position.
Regarding next steps, however, there is
support for national testing of writing
through sampling, and also as a census
test. Chapter 7 takes up this matter in the
recommendations.
Finally, the review brought to light strong
support for rethinking the role of the
profession in national testing including
for system monitoring purposes, and the
role of teacher judgement in scoring and
interpreting NAPLAN writing results in
conjunction with other assessments to
inform practice. It is also clear that further
research is needed into Automated Writing
Evaluation in the context of NAPLAN writing.
This review has opened the opportunity to
consider the potential to bring together
human judgement, moderation and
machine scoring: Automated Writing
Evaluation scoring of language conventions
(for example, grammar, punctuation,
spelling5, paragraph, sentence structure)
with authorial aspects of writing (for
example, audience, text structure, ideas,
character/setting/persuasive devices,
vocabulary, cohesion) scored by teachers.
Chapter 7 takes up the issues in NAPLAN
writing to be addressed and presents
recommendations for action.
5 Grammar, punctuation and spelling are currently tested in the NAPLAN language conventions test.
Chapter 6. Uses of NAPLAN
Chapter 1 identified five purposes for the current national standardised assessment
program – monitoring progress towards national goals, school system accountability and
performance, school improvement, individual student learning achievement and growth,
and information for parents/carers on school and student performance. The purpose of this
chapter is to describe how NAPLAN data are used nationally, by school systems/sectors and
schools, by teachers and by parents/carers and the broader community.
Key points
• For more than a decade, NAPLAN’s standardised assessment data have underpinned
public reporting on national, state and territory trends in student achievement
and growth.
• Wide differences in school-level achievement and growth may be observed among
schools serving similar communities, and these differences are reported to schools
through systems/sectors’ data analytics tools and to the public through My School
and school system/sector websites.
• States and territories and school systems/sectors set targets for NAPLAN
achievement, often in terms of increasing the proportion of students in higher
achievement bands and decreasing the proportion in lower bands.
• School-level NAPLAN results are routinely used by schools in target-setting,
planning and monitoring achievement, but some stakeholders believe that school-level NAPLAN targets narrow teachers’ focus to students near the boundaries of
bands of achievement and unfairly categorise schools with lower than expected
student achievement.
• Teachers report that they use individual-level NAPLAN results in triangulation
with other standardised assessments and teachers’ judgments.
• Parents/carers value individual-level NAPLAN results as a source of
information external to students’ schools.
• Many professional stakeholders are opposed to the publication of school-level NAPLAN results because they can be used for school comparisons and
league tables, but there is a tension between this view and evidence of broader
community expectations about transparency of school-level achievement data.
National uses
The National Reports published each
year since 2009 have used whole-cohort
NAPLAN assessments to report on trends
in achievement, alongside a range of other
educational indicators including attendance,
participation and school funding. These
reports also provide detailed analysis of
achievement and cohort gain in each of
the NAPLAN cohort assessment domains
at national and jurisdictional levels, with
subsidiary analysis of achievement by sex,
Indigenous status, language background
other than English status, geolocation
and parental education.
The use of national standardised
assessments to monitor national, state
or territory trends or to monitor the
performance of equity groups was not
contested by stakeholders consulted for the
NAPLAN Review. School sector respondents
noted that the data were used to identify
areas of need, including equity issues. The
national interests most often mentioned by
online respondents included government
decision making and funding allocation
to areas of the greatest need, consistent
with the support among stakeholders
for the national monitoring purpose of
NAPLAN identified in Chapter 1. As we
have documented elsewhere, however,
concerns about the use of NAPLAN for
school comparisons and league tables
were widespread and many stakeholders
suggested that the national interest in
monitoring achievement could be met
equally well with sample assessments.
Because the kinds of simple comparisons of
average achievement that often appear in
league tables are misleading, governments
and school systems/sectors have looked
towards more nuanced comparisons
that take account of differences in the
backgrounds and achievement of students
served by each school. The first version
of the My School website avoided league
tables but did provide each school with
a comparison with other schools with
students from similar home backgrounds.
In framing the first version of My School,
data on individual families in schools were
not available so the Australian Curriculum,
Assessment and Reporting Authority
(ACARA) used data from Australian Bureau
of Statistics Collection Districts in which
students lived. Using collection district data,
ACARA created a new measure of advantage
based on the education and occupations
(but not income) of adults in the district.
It was labelled an Index of Community Socio-educational Advantage (ICSEA) and not as
an index of ‘socio-economic advantage’.
The technical weakness of this strategy is
that it makes the statistical assumption that
the social characteristics of a district
could be applied to all students regardless
of the school they attended. For the 2011
version of the My School website, data on
individual families were available so it was
no longer necessary to use collection district
data. ACARA developed a new measure of
advantage based on the actual parents/
carers’ education and occupation. Now that
it used family data and not community data,
it might have been labelled an Index of
Socio-educational Advantage (ISEA) to signal
the change but the established name and
the acronym ICSEA were retained.
Following the NAPLAN Reporting Review
(Louden, 2019), education ministers agreed
to remove the ‘similar schools’ display
from the 2019 version of My School and
to introduce new measures of student
progress. Instead of comparisons among
‘similar schools’, one school-level display
compares improvement with that of other
students across the country who had the
same NAPLAN score two years ago and who
have a similar background to the students
at the selected school (Figure 8).
Figure 8: My School comparison with students with same starting score and similar background
A second display shows average achievement over time compared with students with
a similar background (Figure 9).
Figure 9: My School: Selected school compared with students with a similar background
Notwithstanding the introduction of the
‘same starting score’ component and
the elimination of the ‘similar school’
display, some stakeholders continued
to express concern about whether
ICSEA does effectively identify schools
in comparable circumstances.
The use of the Index of Community
Socio-Educational Advantage (ICSEA)
calculations … to make comparisons with
schools is constructive but needs to be
understood in a context where schools
have multifaceted environmental factors
or unique philosophical approaches.
(Written submission response:
School system/sector)
While the data on My School purports
to be a means of comparing ‘like
schools’ with ‘like schools’, the Index of
Community Socio-Educational Advantage
(ICSEA) measure does not always enable
proper comparisons to be made. ICSEA
is inadequate in that it fails to consider
a range of additional variables about a
student’s background that may impact on
NAPLAN performance. The use of ICSEA
should be reviewed. (Written submission
response: School system/sector)
The alternative view is that school-level data
are already collected, that these data reveal
significant differences in results achieved by
schools serving similar student populations,
and that these differences are a matter of
legitimate public interest. Figure 10 provides
an example using NSW Government
school data from the 2019 ACARA My
School data set. It shows the distribution
of school actual average NAPLAN scores
compared with average scores achieved
by students with similar backgrounds. The
diagonal yellow line is the regression line
summarising the relationship. It slopes
upwards, left to right, showing that, on
average, schools with students from more
advantaged backgrounds achieve higher
average NAPLAN results. What is also clear
is that many schools are not well-described
by this average relationship. School C
performs much better than would be
expected given the level of advantage of its
students. So does School B. School A, on the
other hand, performs worse than would
be expected given the level of advantage
of its students.
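The reading of Figure 10 described above can be illustrated with a minimal sketch. The review does not state how the regression line was fitted, so ordinary least squares is assumed here. The school values are invented for illustration and are not drawn from the My School data set, and the function names `fit_line` and `residuals` are hypothetical. A positive residual corresponds to a school sitting above the line (performing better than expected for its level of advantage) and a negative residual to a school sitting below it.

```python
# Illustrative only: the advantage values and mean scores below are invented,
# not taken from the My School data set.
# Each entry is (socio-educational advantage, actual average NAPLAN score).
schools = {
    "School A": (1100.0, 480.0),
    "School B": (950.0, 520.0),
    "School C": (1000.0, 560.0),
    "School D": (900.0, 430.0),
    "School E": (1050.0, 530.0),
    "School F": (1150.0, 570.0),
}


def fit_line(points):
    """Ordinary least-squares fit of y = intercept + slope * x (an assumption)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(data):
    """Actual average score minus the score predicted by the fitted line."""
    slope, intercept = fit_line(list(data.values()))
    return {name: y - (intercept + slope * x) for name, (x, y) in data.items()}


if __name__ == "__main__":
    # List schools from furthest below the line to furthest above it.
    for name, gap in sorted(residuals(schools).items(), key=lambda kv: kv[1]):
        print(f"{name}: {gap:+.1f}")
```

In this invented data the fitted line slopes upwards, Schools B and C sit above it and School A sits below it, mirroring the pattern described for Figure 10.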
More broadly, many stakeholders
contributing to the review argued that
the public availability of comparative data
across schools created unreasonable stress
on teachers and, in turn in some cases, on
students. Some wanted to see the data no
longer published. Others, expecting that
available data could not be suppressed
after nine years of publication since the
first My School in 2010, proposed that data
be collected only for a sample of students
rather than for all.
Figure 10: Relationship between schools’ socio-educational advantage and NAPLAN results
The data displayed in Figure 10 make
clear that there are marked differences in
NAPLAN results among schools serving
students with similar backgrounds. School
A could not readily substantiate a claim
that it could not do better, ‘given the kind
of students it has’. Schools with advantaged
students below the regression line to the
right of the display may be comfortable with
performances above the national average
but they too are doing less well than other
schools with similarly advantaged students.
This may raise the question of whether these
schools are ‘coasting’ and, like School A, can
be challenged by comparison with other
schools at a similar level of socio-educational
advantage to improve their performance.
School systems/sectors
The state and territory education systems/
sectors use analyses like the one in Figure 10
to identify schools from which more could
be expected and some were doing so
with state and territory assessments
before common national assessments
were introduced with NAPLAN and before
My School provided all schools with
comparative data.
School systems/sectors make widespread
use of NAPLAN data – including statistically
similar school data – in setting system/
sector-wide achievement targets and
managing their school improvement
programs. The 2018 National School Reform
Agreement committed state and territory
governments to improvements in academic
achievement expressed in terms of
NAPLAN targets:
[To] lower the proportion of students
in the bottom levels and increase the
proportion of students in the top levels
of performance (bottom two and top
two bands) in the National Assessment
Program–Literacy and Numeracy
(NAPLAN) Literacy and Numeracy, of
Years 3, 5, 7 and 9 (COAG, 2018, p. 7).
This national commitment has cascaded
through to plans and targets published
by jurisdictions and systems/sectors. In
Victoria, for example, the 2018-19 Annual
Report of the Department of Education and
Training reported on the indicator ‘Students
meeting the expected standard in national
and international literacy and numeracy
assessments’ using NAPLAN literacy and
numeracy results in the top two and bottom
three bands (Victoria Department
of Education and Training, 2019, p. 19).
Similarly, performance measures for the
NSW Department of Education’s 2018-2022
strategic plan include ‘Increased proportion
of students in the top two NAPLAN bands for
reading and numeracy’. The 2019-20 targets
for Queensland include the proportion of
students at or above the NAPLAN National
Minimum Standard in reading, writing and
numeracy (Queensland Department of
Education, 2019, pp. 7-8). The ACT’s targets
are set in terms of NAPLAN reading and
numeracy gain scores from Years 3 to 5 and 7
to 9 (Australian Capital Territory Budget
Statements, 2019, pp. 7-8).
School systems/sectors pursue such state-level targets through their various school-level targets and improvement programs.
To take just two examples, in Victoria the
Differentiated School Performance Method
has been developed to target support for
schools. Data used in allocating schools
to performance groups include levels of
achievement in the top two and bottom two
NAPLAN bands in Years 5 and 9 as well as
NAPLAN benchmark performance growth
(Victoria Department of Education and
Training, 2019, p. 3). In NSW public schools,
a targeted program called Bump it Up
provides every school with tailored targets
for improving performance in reading,
numeracy, wellbeing and attendance. The
Department of Education has reported
that two-thirds of the Bump it Up schools
improved their share of students in the top
two NAPLAN bands between 2015 and 2019.
Jurisdictions have also developed digital
dashboards and data analytics software to
assist in tracking achievement and targeting
improvement. The NSW Government’s
Scout system, for example, is described as
having been developed to “provide school
and corporate staff with information about
what’s working well, and what can be
improved.” The NAPLAN component of
Scout provides online, graphics-intensive
information on school performance,
student performance and NAPLAN item-level performance. The school performance
component of Scout includes displays
showing NAPLAN scores over time; the
performance of equity groups; the number and
percentage of students in achievement
bands over time; the percentage of
students in each band compared with a
statistically similar school group and the
whole state percentage of students in the
top two bands in reading and numeracy;
and student growth in scores and across
bands. Scout is available to government and
non-government schools in NSW and the
ACT. Similar systems linking NAPLAN and
other data to school improvement planning
include Queensland’s OneSchool and
Victoria’s Panorama.
There was little consensus among
stakeholders about the appropriateness
of school systems/sectors’ use of NAPLAN
data. School system/sector representatives
often emphasised the value of NAPLAN
in managing a large-scale system. As
one put it – ‘We have 2,200 schools in our
system and we need to be able to initiate
conversations with schools about their
performance. NAPLAN helps us to frame
that conversation’ (School system/sector).
Members of the NAPLAN Review’s
Practitioners’ Reference Group characterised
NAPLAN as ‘the main form of accountability’
for public schools and described how
‘annual implementation plan goals and
school strategic plan goals are linked to
NAPLAN data’. Some found this helpful and
acknowledged that although “NAPLAN
data is used to set targets for students and
informs teacher’s work quite heavily… It is
also encouraging for these schools to see
improvement in NAPLAN results.”
Others noted how system/sector targets in
some jurisdictions led to categorisation of
schools based on their students’ NAPLAN
performance. Such categorisations could
depend on the performance of a small
number of students at the edge of particular
performance bands. ‘We lost three kids in
the top two bands in reading last year, so
now we’re a ‘transform’ school, which was
quite devastating’ (Member of the NAPLAN
Review Practitioners’ Reference Group).
School performance measures can
narrow teacher focus to individual
students that can be moved into the
top two bands. Schools can be asked to
identify a number of students to move
out of middle bands into top bands and
to then choose individual students to
provide with targeted support. This means
other students miss out. (Member of
the NAPLAN Review Practitioners’
Reference Group)
Union stakeholders described NAPLAN as
‘overused’ in accountability and argued that
systems/sectors’ use of NAPLAN in setting
school-level achievement targets had
undermined teaching.
By making NAPLAN results into the high
stakes indicator of student attainment
and school quality, governments and
education departments undermined
the fundamentals of good teaching and
learning in schools. (Written submission
response: Union)
Schools
The most commonly mentioned use of
NAPLAN data in schools was in triangulation
with other standardised assessment data
and teachers’ professional judgements.
NAPLAN results were variously used to ‘look
at trends on cohorts’ and to consider ‘the
skills that the school may need to focus on’.
One said, “We love the student and school
summary report (SSSR), and if we could have
that straight away, it would be more useful.”
Members of the Review’s Practitioners’
Reference Group, however, were clear that
they did not over-emphasise NAPLAN
results in their school planning – ‘NAPLAN
data can be used as a check-in point to
consider teacher judgement within a school
compared to other schools’; ‘We have heaps
of data already, and this is the least accurate
data we have. The accuracy just isn’t there.’;
‘Data is always useful, but we have other
ways of getting data. We use PAT testing.
All Catholic schools use the same learning
management system, and the data uploads
easily’; ‘We pin things down to specific
areas, for example, if spelling dips down in
comparison to previous years […] We’ve done
interventions in areas of targeted growth.’
We look at the strengths that come out
and areas of weakness. If specific students
are out of line, we don’t worry if it is a bad
day, but if it identifies an area they are not
good at we use that. We are sometimes
surprised how high our kids score on
reading. It is the only time we compare to
other schools and we don’t necessarily put
a huge emphasis on the comparison as
NAPLAN doesn’t test the things we value
like creativity, problem-solving or thought.
Stakeholders reported widespread use of
NAPLAN data analytics tools. Catholic sector
representatives mentioned the Catholic
Education Network’s (CENet) CED3 portal
as well as a range of locally developed tools
and reports. Others mentioned public school
systems/sectors’ data analytics tools and the
raw data files available from curriculum and
assessment authorities. Despite teachers’
and principals’ reservations about NAPLAN,
it is clear that the results of these tests
are now routinely used by schools in local
planning and monitoring of school-level
progress.
Individual teachers
About half of the respondents to the review’s
online survey expressed a clear preference
about who should have access to NAPLAN
results. Among these there was near
universal agreement that teachers should
have access to detailed individual reports on
students’ achievement. Some stakeholders
reported high levels of teachers’ use of
this information. One of the Practitioners’
Reference Group members, for example,
reported that all teachers in their school had
access to the NSW Government’s Scout data
analytics tool and ‘90% of staff use it and find
it quite user-friendly’. At the other extreme,
one of the teachers’ union stakeholders
reported that ‘teachers do not engage with
or talk about NAPLAN data’.
The key issue in terms of teachers’ use of
NAPLAN data concerns the balance between
test scores and teachers’ judgments. One
member of the Practitioners’ Reference
Group mentioned that “it’s nice at times to
get an indication of how well our students
do in an external test” but others argued
that teachers “know where our students
are at” and prefer to “use the data they
gather from students on a day-to-day basis”.
Where teachers saw value in NAPLAN
data it was typically in combination with
teachers’ judgements or other assessments
– ‘It doesn’t replace teacher judgement,
but it can provide a catalyst to have a
conversation with a teacher about a child’s
progress’ (Member of the NAPLAN Review
Practitioners’ Reference Group), and
‘Students are flagged through multiple data
sets, including NAPLAN which can help
triangulate where a student is in need of
support’ (Member of the NAPLAN Review
Practitioners’ Reference Group).
Family and community
Beyond the professional communities
of teachers, schools and school systems/
sectors, NAPLAN results are available to
parents/carers and the broader community
in two ways – through individual students’
NAPLAN results and through the publication
of school-level NAPLAN results on My School,
on school system/sector websites and on
individual school websites.
There was widespread appreciation of the
individual NAPLAN results provided to
families. Parent group stakeholders noted
that ‘parents/carers appreciate seeing an
average that they can compare their child
against’, that it is ‘incredibly important
to provide parents/carers with external
judgment’ about individual student
achievement and that NAPLAN ‘can
complement school reporting’. Principals’
association stakeholders reported that
‘parents/carers want to see how their child
performs in relation to their year group’
and that they are ‘interested in the data
to compare their school’s performance
and to compare their child’s growth and
overall achievement with the information
the school is providing them’. This was
confirmed by members of the Practitioners’
Reference Group, one of whom commented
that “parents/carers value NAPLAN data”
because “it is an external test they can use
to compare their kids nationally before
the HSC”.
In contrast with their use of individual
reports, principals reported that few parents/
carers mentioned or relied on school-level
NAPLAN achievement data. One principal
said that in seven years as principal in a
low-socioeconomic status (SES) school they
“did not have one parent ask about NAPLAN
data”. Another referred to their experience
in several high SES schools where parents/
carers were ‘quite vocal’ but were ‘not
worried about NAPLAN.’ This is consistent
with findings of the 2018 Queensland
NAPLAN Review Parent Perceptions Report,
which concluded that:
Parents were not generally familiar with
the content of the tests and tended to
be unaware of the full range of NAPLAN
reports. Indeed, many parents felt they
had not been given clear messages
about what NAPLAN is or what it is for
(Matters, 2018, pp. 33-34).
The same report noted that a strong
majority of parents were ‘particularly
critical of the role of the media in making
NAPLAN a high-stakes assessment through
publishing league tables and placing
too much emphasis on school results’
(Matters, 2018, p. 33).
One of the ways in which NAPLAN results
have an impact is on organisations that
provide services to schools. Several learning
difficulties organisations reported that
the release of NAPLAN individual results
coincided with an increase in referrals for
assessment and inquiries about support.
As one of the parent/carer association
stakeholders explained, NAPLAN may flag
difficulties that have not been made clear
to parents/carers by schools.
NAPLAN can also be external evidence
of any learning difficulties or if a student
is behind. Particularly in primary schools,
teachers are reluctant to give out D and E
grades. Teachers always used “strength-based
language” as well. Parents/carers
can be uninformed about their child’s
performance in relation to the child’s
cohort without NAPLAN.
(Parents/carers’ association)
Several parent/carer association
representatives commented on the role of
NAPLAN in school choice. One reported that
in their consultation for this review dozens
of people responded and ‘all said that
NAPLAN results were a consideration when
selecting a school’. Others cautioned that
there is no value in choice if there is ‘only
one local school’ or if parents/carers ‘can
no longer school shop’ because of school
zoning regulations. One of the independent
school sector stakeholders drew attention
to a previously reported survey that showed
only 18% of parents/carers had NAPLAN
in their top three reasons for choosing a
school. Similarly, a parent/carer stakeholder
group argued:
School performance is a secondary
consideration. Less than 5 per cent of
parents/carers who took part in APC’s
2018 Survey, said NAPLAN results were
important in choosing a school, for
example. Parents/carers make their own
judgements about a school based on
a range of factors. (Written submission
response: Parents/carers’ association)
The release of school-level NAPLAN results
was much more controversial. More than
two-thirds of those who responded to the
review’s online submission process offered
an unambiguous opinion on the public
release of data associated with NAPLAN. That
67% of respondents comprised 54% who were
against public release of any data, 6% who
favoured public release of global data but
not school-level data and 7% who favoured
release of school-level data, citing greater
transparency as the reason for doing
so. The tenor of these concerns about public
release of data is captured in this online
response to the review:
I believe that data should be provided to
schools and parents/carers but not the
media. We have enough to deal with in
schools without having to worry about
comparisons which the public make
between schools based on NAPLAN data.
Several sources report, however, that despite
this low level of use most parents/carers
agreed that school-level NAPLAN results
should be available on a public website
(Louden, 2019, p. 91). Similar conclusions
about parents/carers’ perceptions have been
reported in a recent review commissioned
by ACARA.
Parents had generally not given this issue
prior thought and, when questioned,
found it difficult to see a reason why the
information on the My School website
would not be freely available in the public
domain (ACARA, 2018b, p. 5).
Summary
For more than a decade, NAPLAN’s
standardised assessment data have
underpinned public reporting on national,
state and territory trends in student
achievement and growth. Wide differences
in school-level achievement and growth
may be observed among schools serving
similar communities, and these differences
are reported to schools through systems/
sectors’ data analytics tools and to the public
through My School and school system/
sector websites.
This review’s evidence of the uses to which
NAPLAN results and analyses are put
is broadly consistent with other recent
reviews. Stakeholders confirmed that states
and territories set targets for NAPLAN
achievement and that NAPLAN results are
routinely used by school systems/sectors
and schools in target-setting, planning
and monitoring achievement. Many
stakeholders, however, were concerned
that school-level NAPLAN targets may
narrow teachers’ focus to students near the
boundaries of bands of achievement and
unfairly categorise schools with lower than
expected student achievement. Similarly,
the 2018 Queensland NAPLAN Review
concluded that NAPLAN had been ‘both a
positive and negative driver of education’
(Cumming et al, p. 14). Participants ‘were
comfortable with educational accountability
for transparency of educational outcomes
and monitoring the health of an education
system’ but concerned that ‘emphasis on
NAPLAN as an accountability measure at
system and school levels continues to create
a negative competitive environment for
systems and schools, perpetuating negative
educational practices in some schools’ (p.14).
Teachers and principals responding to this
review indicated that school-level NAPLAN
results were used to track trends in cohorts
and identify skills that may need more
focus, but that results at the individual
level were of less use. Where teachers saw
value in NAPLAN data it was typically in
combination with teachers’ judgements or
other assessments. This is consistent with
the Queensland NAPLAN Review which
reported ‘extensive data collection in schools
for triangulation, including NAPLAN data’
but ‘limited engagement with NAPLAN test
data’ (Cumming et al, pp. 11-12).
Principals reported that parents/carers rarely
mention or use school-level NAPLAN results,
a conclusion confirmed by the Queensland
NAPLAN Review Parent Perceptions Report
(Matters, 2018). Parents/carers, principals and
teachers consulted in this review, however,
reported that families appreciated the
individual-level NAPLAN results because
they provide a judgment external to the
local school.
The most controversial use of NAPLAN
results is the publication of school-level
NAPLAN results. Many professional
stakeholders are opposed to the publication
of school-level NAPLAN results because
they can be used for school comparisons
and league tables. There is, however, a
tension between this view and evidence
of broader community expectations about
transparency of school-level achievement
data. Few parents/carers consulted in either
of the recent NAPLAN reporting reviews
(ACARA, 2018; Louden, 2019) had used
the school-level NAPLAN reports
available on the My School website but
nevertheless most agreed that school-level
NAPLAN results should be available
to the public.
Chapter 7: Recommendations
This final chapter sets out the rationale for specific changes proposed in the
recommendations offered. It relates the recommendations to the review terms of reference
and proposes a timeline for implementation.
National standardised
assessment
Standardised assessment provides one way
to see how well education is progressing.
The population may ask this of the whole
system/sector. Parents/carers may ask it of
their own children or their children’s school.
Using common test-taking conditions,
questions, time to respond and scoring
procedures, standardised assessments
can provide answers framed in a larger
perspective than local classroom or
school assessments can provide.
Purposes of national standardised
assessment
Chapter 1 identified the following
five important purposes for national
standardised assessment that have been
endorsed through a decade of decisions by
national ministerial councils in Australia.
School system accountability
and performance
Measurement of students’ achievements to
monitor progress towards national goals can
also provide public information on school
system accountability, including inter-jurisdictional
and inter-sectoral comparisons
and information on the performance of
students in equity groups.
School improvement
The Australian results from the international
PIRLS, TIMSS and PISA surveys can identify
areas of general weakness in Australia as a
whole or in particular states and territories
to which schools can respond but, except
for schools in the sample, it would not be
their own students’ data that they would be
examining. For detailed comparative data to
which all schools can respond, all students
need to be tested in a census.
Monitoring progress towards national goals
Measurement of students’ achievements
and progress over time can monitor the
progress of the education system toward
national goals at national, jurisdictional
and system levels. They can also provide
information on the relative performance of
students by gender, geographic location
of schools, socioeconomic background
and Indigenous background. This can be
achieved with domestic data collections
or through Australia’s participation in
international surveys such as the Progress in
International Reading Literacy Study (PIRLS),
Trends in International Mathematics and
Science Study (TIMSS) and Programme for
International Student Assessment (PISA).
Individual student learning achievement
and growth
A focus on individual student achievement
and growth requires individual students
to be tested. The data can be obtained
simultaneously by testing all students at
the same stage of school or by testing small
groups or individuals at the discretion of
schools or individual teachers. Simultaneous
testing allows comparison and interpretation
of individual students’ performances in
the light of the performances of the whole
population or relevant sub-populations.
When individuals or groups of students
are tested with published standardised
tests, there will usually be norms for the
relevant age group or year level to provide
comparative information. The comparisons
would not be precise if the norms were
determined at a different time in the
school year from the time that the school
uses the test.
Information for parents/carers on school
and student performance
Standardised tests can provide information
for parents/carers that is situated in a wider
frame of reference than their children’s own
school. It is this wider frame of reference
that enables parents/carers to have some
understanding of the position of their own
children in relation to the population of
which their children are part and, depending
on the scope of the data and the extent
of their access to it, some understanding
of the school’s position among all schools
or among schools with students similar to
their own. The frame of reference could
be provided by results for the relevant
population of students obtained at the same
time on the same standardised tests or from
norms established on the test at an earlier
time with similar students.
Features of an assessment system
Sample versus census testing
Census tests have the capacity to meet a
wider range of the purposes of a national
standardised assessment program than
do sample tests, as shown in the summary
in Table 23.
Table 23: Census and sample assessment and the purposes of national standardised assessment
Purpose of national standardised assessment
Census
Sample
Monitoring progress towards national goals
• National, jurisdictional and system estimates of achievement
• Relative performance by gender, geographic location of schools,
socioeconomic background and Aboriginal and Torres Strait Islander
background
School system accountability and performance
• Accountability for system performance
• Accountability for school performance
School improvement
• School-level information on achievement and growth by
assessment domain
• School-level targets informed by system comparative data
Individual student learning achievement and growth
• Student level achievement estimates for comparative purposes
(cohort, test domain, gain, equity groups)
• Student level achievement estimates for diagnostic purposes
Information for parents/carers on school and student performance
• Individual student achievement
• Relative school performance
Census tests can be an appropriate source
of information for monitoring national policy,
system accountability, school improvement
and reporting to parents/carers on school
and individual performance. Although they
have less diagnostic value at the individual
level than more intensive and extensive
standardised assessments designed for
diagnostic purposes, they can provide a
‘point-in-time’ indication of a student’s
position in relation to the whole population
of which the student is a member.
Students’ results can be reported directly
to parents/carers. They may also signal
the need for exploration with specialised
diagnostic assessment.
Sample tests, on the other hand, are
effective only in monitoring progress
towards national goals and accountability for
system performance. Sample tests have the
benefit of reducing the risk of some of the
unintended consequences of census testing.
They have a lesser tendency to narrow the
curriculum in schools because only some
schools are involved and because a wider
range of knowledge, understanding and
skills can be measured when only a sample
of students is involved. They do not enable
school-by-school statistical comparisons that
many, particularly teachers, find undesirable
but they also reduce transparency, limit
school-level accountability and invite the
imposition of other census test regimes
to support school systems/sectors’ school
improvement targets.
Stakeholders’ views on NAPLAN were sought
through two rounds of interviews, a written
submission process and the opportunity
to complete an on-line survey. To obtain
more in-depth practitioner perspectives,
meetings were also held with a Practitioners’
Reference Group, including principals
and teachers from government, Catholic
and independent schools across the four
participating jurisdictions, and a nominee
of the Australian Education Union. More
than 300 people chose to respond through
the online submission process and about
half of these people had a clear view on
whether national standardised assessment
should be based on a sample or use
whole-population census testing. Of these, about
half (25% of the total respondents) supported
continuation of census testing. Their reasons
included ‘the opportunity for all parents/
carers to gain information about the
progress of their child’, the ability of schools
to ‘determine strengths or weaknesses
of individuals, cohorts, and the whole
school over time’, and concern that sample
testing would mean that ‘trends could be
dismissed as sampling errors’.
A similar proportion (22% of the total
respondents) advocated for a move to
sample testing. These respondents argued
that sample testing would reduce pressures
to teach to the test and eliminate
school-by-school comparisons but maintain the
value of national assessments as ‘a health
check for the education system as a whole’.
As one respondent put it:
While some data would be lost by moving
to a sample system, it would certainly
take the pressure off schools. System
wide data might refocus efforts on equity
measures, rather than punitive targeting
of individual schools. (Respondent to the
online survey)
Among the half of respondents who did
not support either sample or census
testing, the largest proportion (19% of total
respondents) were opposed to NAPLAN in
principle, opposed to high stakes testing
more generally, or had concerns about
the reliability of the current tests. Of the
remainder, some respondents expressed
concern about sampling error (12%) and
others were ambivalent, expressing views
both for and against sample testing (9%).
Stakeholder interviews revealed the same
broad range of views. School system/
sector stakeholders most often supported
continuation of census testing. One of the
large non-government school systems/
sectors, for example, acknowledged that
sample testing would reduce unintended
consequences such as test anxiety and
teaching to the test, and would be
well-received by some schools, but noted that:
one of the disadvantages of sample
testing is that it removes the main benefit
that NAPLAN data currently provides to
schools, being that data is provided for
all students in Years 3, 5, 7 and 9 enabling
subsequent direct comparisons and the
identification of learning growth trends
over time. (School system/sector)
School system/sector representatives noted
the greater analytic power of census tests,
warned against the loss of systems/sectors’
evidence base, the loss of their schools’
capacity to examine progress over time
and the loss of universally comparable
individual student reports to parents/
carers. Moreover, as one system/sector
representative warned, in the absence of
whole-population assessment “the void
may be filled with something else” because
school systems/sectors’ “big picture school
improvement work wouldn’t be possible
with a sample test”.
Among teacher union stakeholders, there
was universal preference for sample
over census testing. Union stakeholders
acknowledged the legitimate role of national
standardised testing in monitoring the
performance of school systems and the
targeting of resources to equity groups,
however they argued this could be achieved
without what they characterised as the
negative consequences of census testing
– student stress, narrowing the curriculum,
teaching to the test and comparisons of
schools on the My School website. As one
of the written responses argued:
The legitimate needs of system
self-monitoring can be met by representative
sampling methods which can provide
accurate and useful information without
any of the negative outcomes of mass
standardised testing. This would give an
overall snapshot of student achievement
in each state and territory jurisdiction….
and enable education authorities to track
the progress of various student cohorts
such as Aboriginal and Torres Strait
Islander students…. (Written submission
response: Union)
The views of other stakeholders were more
mixed. Principals’ associations typically
preferred sample testing. Some subject
association stakeholders preferred sample
tests, others preferred census tests and one
stakeholder was ‘more sample-orientated
but could be convinced of census tests’.
Among the members of the NAPLAN
Review’s Practitioners’ Reference Group
there was some support for the school-level
data available from NAPLAN census
tests but more often the sentiment was
to support a move towards PISA-style
sample tests. Some parents’ association
representatives preferred census testing,
noting that ‘census testing works well’,
that ‘schools are doing things about their
kids learning because of the data and it
is being useful’ and that, without census
testing, parents/carers ‘would not get an
external assessment on their child’. Another
parent group stakeholder preferred sample
testing but cautioned that “not having
information on every school wouldn’t ‘wash’
with Ministers”.
Among educational experts responding to
the review, judgements about whether to
prefer sample to census testing typically
turned on the question of purposes of
assessment. They noted that census testing
‘allows for accountability and reporting to
parents/carers’ and ‘exposes disadvantage’,
but that census data are ‘less precise
for individuals’ than groups. Sample
testing using longer test instruments
would increase precision and ‘provide the
opportunity to test students about more of
the curriculum’. Several experts commented,
however, that the current National
Assessment Program sample testing in
scientific literacy, civics and citizenship and
information and communication technology
literacy does not have an impact in schools.
As one said, the reports “make headlines
but states and systems do not take action
on the basis of the data”.
Practices in other countries
The international comparisons provided
in Chapter 3 address the issue of sample
versus census testing in the various national
contexts. In Singapore, the Primary School
Leaving Examination (PSLE) provides census
testing at the end of primary school with
oral and listening comprehension
examinations in English Language and
Mother Tongue, and written examinations
of one to two hours in English, Mother
Tongue, mathematics and science. In
middle secondary school, the
Singapore-Cambridge General Certificate of Education
provides census testing in examinations,
depending on which course of study the
students are pursuing. In Japan, students
take competitive subject examinations at
the end of Year 9 for selection entry to senior
high schools. In Ontario, there is census
testing of students in reading, writing and
mathematics at the end of Grades 3 and 6.
In England, schools are obliged to report
teacher judgements in reading, writing,
mathematics and science. A formerly
optional English grammar and punctuation
test at the end of Key Stage 1 (Year 2) will
remain optional but a compulsory, online
‘multiplication tables check’ is scheduled
for introduction in 2019-20. At the end
of Key Stage 2 (Year 6) and Key Stage 3
(Year 9) there are census tests in English,
mathematics and science. At the end of
Key Stage 4 (Year 11), there are national
subject-based examinations for the General
Certificate of Secondary Education (GCSE).
New Zealand only has national surveys
of student learning but provides access
for schools to a range of standardised
assessments to assess their own students.
In Finland, prior to national examinations
at the end of secondary education,
assessments of students’ progress are
school-based but there are sample surveys
with standardised tests of students’
achievements in particular subjects selected
on a cycle. In Scotland, there are census
assessments in Years 1, 4, 7 and 10 in reading/
literacy, writing and numeracy. While
schools have the right to opt out, student
participation rates match those achieved by
the NAPLAN census assessment in Australia.
Students’ results are reported only to schools
where they are to be used in conjunction
with teachers’ assessments to create reports
on students that are provided to parents,
students and local education authorities.
Uncertainty in measurement
There is always a level of uncertainty or
imprecision in measurement. Some of it
is due to the test itself, some due to the
uniqueness of the random sample chosen
if only a sample of students is tested (that
is, a sampling effect), and some due to
links to previous tests if trends over time
are measured and reported. As shown in
Chapter 4, the level of uncertainty depends
on the amount of data behind the measure.
There is greatest precision with national
means and least with individual students’
results. There is greater precision with means
for large schools than with means for smaller
schools. There are ways in which the degree
of uncertainty can be reported numerically
or graphically to reduce the risk of
over-interpretation of a single individual score
or group mean.
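As a rough illustration of why precision grows with the amount of data behind a measure, the sketch below computes the standard error of a mean for groups of different sizes. It is a minimal sketch under stated assumptions, not the method used by this review or by ACARA: the spread of scores (ASSUMED_SD), the mean of 500 and the group sizes are hypothetical values chosen for illustration, and the calculation covers only variation among students, not test measurement error or the linking error that arises when trends are reported over time.

```python
import math

# Hypothetical spread of student scores on the reporting scale (an assumption
# for illustration only, not a figure taken from this report).
ASSUMED_SD = 70.0


def standard_error(sd: float, n: int) -> float:
    """Standard error of a mean of n independent scores with spread sd."""
    if n <= 0:
        raise ValueError("n must be a positive number of students")
    return sd / math.sqrt(n)


def interval_95(mean: float, sd: float, n: int) -> tuple[float, float]:
    """Approximate 95% interval around a group mean (normal approximation)."""
    half_width = 1.96 * standard_error(sd, n)
    return (mean - half_width, mean + half_width)


if __name__ == "__main__":
    # Hypothetical group sizes: one student, a small school, a large school.
    for label, n in [("single student", 1), ("small school", 25), ("large school", 400)]:
        low, high = interval_95(500.0, ASSUMED_SD, n)
        print(f"{label:>14}: n={n:>3}  interval {low:.1f} to {high:.1f}")
```

Under these assumptions the interval around a mean narrows as the group grows, which is one way the degree of uncertainty could be reported numerically alongside an individual score or group mean.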
Role of NAPLAN in meeting
national purposes
Chapter 6 provides information on the
current uses of NAPLAN by governments,
education systems/sectors, schools, teachers
and parents/carers. It is a somewhat mixed
picture reflecting the diversity of stakeholder
views about NAPLAN. As reported in the
preceding section, there is general support
for all the purposes of national standardised
assessment shown in Table 23 that can
be supported by sample testing but less
for those that require census testing. The
strongest rejection of census testing was
driven by concern about the public exposure
of school results and the facilitation of
inter-school comparisons that this enables.
The culprits included media that produced
league tables of schools, taking no account
of difference in school contexts, but the main
culprit was said to be My School.
The My School website originally provided
comparisons only among schools that
enrolled students with similar levels of
socio-educational advantage but the availability
of the results for all schools enabled users
to make other comparisons. Among
respondents, there seemed to be little
awareness yet of the significant changes
to My School in 2019 (Louden, 2019) which,
as pointed out in Chapter 6, removed the
comparison of the achievements in schools
with similar students and introduced a
comparison of the improvement achieved
between successive NAPLAN tests (Years 3 to
5 and Years 7 to 9) by students in a selected
school with the improvement by other
students across the country who had the
same NAPLAN score two years earlier and
who have a similar background.
The concerns about publicly available
school-level data are not only that they
permit inter-school comparisons but also
that the comparisons are limited to the
particular student achievements in literacy
and numeracy that NAPLAN measures.
There are also concerns that government
education systems set state-wide targets
in terms of NAPLAN achievement and
improvement and set school-level targets
based on them. Concern about the impact
on schools and teachers of such target
setting is behind the kinds of comments
this review has heard about “pressure” and
“punitive targets”. These are matters about
managing improvement rather than about
the measures that provide the criteria for
setting and judging the improvement. As
one jurisdiction’s business analytics tool
puts it, NAPLAN census data provide a
ready source of information about “what’s
working well, and what can be improved”.
Jurisdictions have made substantial
investments in such business analytics
tools for use by the central and regional
administrations and schools.
While the absence of school-level reporting
would eliminate the possibility of school
comparisons, it would not meet the
accountability commitment made by
education ministers in their 2019 Alice
Springs Declaration:
For schools, Australian Governments
provide assessment results that
are publicly available at the school,
sector and jurisdiction level to ensure
accountability and provide sufficient
information to parents, carers, families,
the broader community, researchers,
policy makers and governments to make
informed decisions based on evidence.
(Education Council, 2019a, p. 11).
Recommendation 1
1.1
Ministers re-endorse the importance of standardised testing
in Australian school education for:
a.
Monitoring progress towards national goals.
b. School system accountability and performance.
c.
School improvement.
d. Individual student learning achievement and growth, noting the limitations
on use in detailed diagnosis of learning deficiencies and difficulties due to
the degree of uncertainty in measures of individual students.
e.
Information for parents/carers on student and school performance.
1.2
Ministers re-affirm the role of national standardised assessment in fulfilling
these purposes.
1.3
Continue to conduct national standardised assessment as a census test of
student achievement.
1.4
Define the purposes and limitations of national standardised assessment by
decision of the Ministerial Council in the manner proposed in Table 23 and
communicate this on the Australian Curriculum, Assessment and Reporting
Authority (ACARA) website and in communications with schools and
parents/carers.
Changes to the
NAPLAN tests
Curriculum coverage
Connection with the Australian Curriculum
As discussed in Chapter 1, Australian
students used to take subject-based
assessments as external examinations at the
end of primary school and in mid-secondary
school. Now they take no such assessments
before the end of secondary school.
Over a period from the late 1980s, states
and territories introduced new external
assessments to be taken by all students at
various stages prior to the end of secondary
school before, in 2007, the ministerial council
resolved to replace the external assessments
from 2008 with the common national
assessments known as NAPLAN.
All these assessments, both NAPLAN and
the state and territory assessments that
preceded it, have been limited to literacy
and numeracy. There was no desire to
reintroduce extensive subject-based
assessments for all students, so the focus
was placed on the foundational domains of
literacy and numeracy on which so much
other learning depends. There are concerns,
however, that the focus on literacy and
numeracy has had unintended effects.
An important concern is that it has narrowed
the curriculum through too much attention
being given to literacy and numeracy at
the expense of other things that ought to
be central to students’ learning and their
general experience of school. This narrowing
effect is seen to have had a greater impact
on primary schools since other subjects
are protected in secondary school through
specialist teachers and timetables that
allocate space for all subjects.
An extensive focus on literacy and numeracy
in primary schools is not new. In a study for
the Australian Primary Principals Association
of time allocations in the primary school
curriculum, Angus, Olney & Ainley (2007)
reported that, ‘Up to the twentieth century,
the elementary school curriculum was truly
elementary: over three quarters of the time
was spent on literacy and numeracy’ (p.15).
Despite a broadening of the curriculum and
complaints that it had become overcrowded,
their survey of teachers’ actual time
allocations revealed that ‘One of the realities
of primary schools is that more than half the
instructional time is spent on English and
Mathematics’ (p.24).
As Chapter 4 reported, many stakeholders
expressed concern about a lack of alignment
between NAPLAN and the Australian
Curriculum. There has, however, been a
substantial effort to ensure that this is
not the case. ACARA acknowledges that
‘NAPLAN draws on all learning areas of the
Australian Curriculum to supply contexts
for testing’ but ‘they do not assess the
content of learning areas other than English
and Mathematics’ (ACARA, 2017, p. 6). The
reading test is designed to ‘assess students’
ability to read and view texts to identify,
analyse and evaluate information and ideas’
and is aligned to English.
As set out in the Australian Curriculum:
English, students read texts for different
purposes: personal interest and pleasure,
to participate in society, and to learn.
Since the emergence of visual and
digital communication media, the
traditional view of literacy has broadened
and evolved, and viewing is now a key
literacy skill. NAPLAN assesses students’
ability to read and view multimodal
texts for literary experience and to
acquire, use and evaluate information.
(ACARA, 2017, p. 9)
Similarly, the spelling tests draw on the
spelling sub-strand; grammar draws on
the sub-strand threads of text cohesion,
sentences, and clause and word level
grammar; and writing draws on seven sub-strand threads of the Australian Curriculum:
English (2017, p. 15). The curriculum
connection is even closer between
numeracy and the Australian Curriculum:
Mathematics, as the tests draw on both
the proficiency strands (understanding,
fluency, problem solving and reasoning)
and the content strands (number and
algebra, measurement and geometry, and
statistics and probability) of the Australian
Curriculum: Mathematics. The connections
should have been clear through the naming
of reading, language conventions and
writing NAPLAN tests, but for the avoidance
of doubt there is an opportunity to make
the link clearer by renaming numeracy as
mathematics. There is no case for renaming
the other tests as English since reading, writing
and language conventions are named and
tested separately.
Concern about teaching to the tests
Another aspect of the concern about
narrowing of the curriculum is that it will
become limited to those aspects of literacy
and numeracy that are actually tested in
NAPLAN and that teachers will allocate time
unproductively to teaching to the test and
to unnecessary student practice in taking
NAPLAN-like tests. This effect has been most
obvious in writing in which students seem to
learn formulaic ways of writing in particular
genres in response to the prompts in the
NAPLAN writing test but it is said to occur in
all NAPLAN test domains.
The branching structure of the NAPLAN
Online tests provides some protection
against teaching to the test because there
is no single test in each domain that all
students take. Even those students who
follow the same path through the branching
structure will not necessarily be responding
to the same questions. As noted in Chapter
4, in the 2019 NAPLAN Online reading
and numeracy tests, for which there were
seven pathways through the testlets,
there were in fact 126 paths through the
test items. If the items effectively
cover the Australian Curriculum: English,
the Australian Curriculum: Mathematics
and the literacy and numeracy continua
in the Australian Curriculum, the only
effective way to prepare students for the
tests would be through implementing
the Australian Curriculum.
A further influence on narrowing the
curriculum is reported to be the publication
of school results in NAPLAN tests,
particularly on My School, but also as a
requirement in school reports. Publication is
also said to narrow the conception of quality
of schooling and invite judgements of school
quality on the primary basis of literacy and
numeracy results.
Results of the National Assessment Program
surveys of samples of Years 6 and 10 students
in science literacy, civics and citizenship, and
information and communication technology
(ICT) literacy are published but do not seem
to have much influence on public discussion
of the performance of the education
systems/sectors. This is despite the reports
providing comparisons among states
and territories and analyses of the relative
achievement levels of sub-populations
of interest.
Inclusiveness of the tests
The NAPLAN tests cater well for students
whose special needs require adjustments
to the form and delivery of the tests. The
concern is not with the inclusiveness of the
tests but the extent of coverage of the full
student cohort in two respects. First, some
parents of students with learning difficulties
were disappointed that their children’s
schools had urged that their children be
excluded from the tests. These parents had
not exercised the option to withdraw their
children. Rather, the school had placed
them explicitly in the exempt category or,
by default, in the absent category. Secondly,
there are numbers of students who are
simply absent on the days of testing to an
extent that exceeds those missing through
exemptions or approved withdrawal. The
absence rates should be investigated to
learn why students do not arrive for the
assessments with action then taken to
reduce the rates.
Broadening the range of the tests
It is timely to consider broadening the range
of NAPLAN tests. Literacy and numeracy
have been the focus because they are
foundational and in order to restrict the
scope and impact of census testing. The
international assessments in PISA and TIMSS
include science in which Australian students’
relative and absolute achievement levels
have been declining. Japan and England
include science with national language and
mathematics in census testing. Singapore
includes science as one of the subjects
in its primary school leaving examination
at the end of P6 (Year 6). In Australia, all
jurisdictions are placing new emphasis, not
only on science, but on STEM more generally.
In terms of the Australian Curriculum that
would incorporate not only mathematics
under the ‘M’ but digital technologies
under the ‘T’ and ‘E’.
There is also increased interest in Australia
in learning and assessment of the General
Capabilities in the Australian Curriculum.
One of them at least, critical and creative
thinking, is fairly clearly domain specific,
particularly the critical thinking aspect.
Critical and creative thinking in history is not
the same as critical and creative thinking in
STEM, for example, so the assessment would
need to be situated in a domain. Adding
critical and creative thinking in STEM to
Australia’s census assessments would reflect
the priority being attached to both STEM
and the General Capabilities.
It would take time and a deal of
experimentation to develop the new
assessments and it would be best not to
attempt to apply them at Year 3, at least in
the first instance, until the nature and scope
of the assessments are well-defined and
valid and reliable tests have been developed.
The current triennial sample survey of science
literacy could then be replaced in the cycle of
sample surveys with a new assessment, such
as one in history and intercultural understanding.
Recommendation 2
2.1
Ministers note that national assessment policies and practices vary and that there
are no common features of assessment among high-achieving countries.
2.2
Rename the numeracy test as mathematics, to clarify that it assesses the content
and proficiency strands of the Australian Curriculum: Mathematics.
2.3
Add assessment of critical and creative thinking in science, technology,
engineering and mathematics (STEM) to the national standardised census
assessment program, except at Year 3, and introduce it only after a period of
experimental test development that demonstrates that valid and reliable tests
have been developed.
2.4
Withdraw the current triennial sample survey of science literacy in Years 6 and 10
and consider replacing it in the triennial cycle with another covering both a subject
and a general capability from the Australian Curriculum, such as history and
intercultural understanding.
2.5
Explicitly map the tests to the National Literacy and Numeracy Learning
Progressions to provide insight into student learning progress in the year levels in
which the tests are administered.
2.6
Jurisdictions investigate students’ reasons for absence from NAPLAN testing and
seek to reduce the current levels of absence, particularly at the secondary level.
Frequency and timing of tests
From 2008, when the states and territories
adopted NAPLAN as common, national tests,
NAPLAN has been administered in May to all
students in Years 3, 5, 7 and 9.
Timing of testing and return of results
within the year
The NAPLAN Review interim report raised
the possibility of shifting the timing of
the testing from May to late February
or early March, based on the following
considerations.
Shifting the tests to early in the year,
combined with speedy delivery of results,
would make NAPLAN a measure of
teachers’ and students’ starting points
for the year. It could liberate NAPLAN to
play a formative rather than a summative
assessment role and to inform decisions
about future curriculum and teaching
choices, not judgements about past ones.
It is, of course, possible that start-of-year
assessments would be seen as summative
assessments of the end of the previous
year. That argument is potentially
weakened by the impact of declines in
student performance over the summer
vacation and the tendency for class
groups to be formed with different mixes
of students in each new year of schooling.
Assessment of the starting points for
the year could give school systems
the opportunity to provide additional
resources to schools in most need of
additional support (McGaw, Louden &
Wyatt-Smith, 2019, p.6).
In submissions to the review and in
consultations, there was general support
for this proposal. The only considerations in
determining how early in the school year the
tests could be administered, would be how
long it would take for students to be settled
into their new classes and for schools to
settle the class rolls.
Administering NAPLAN early in the school
year would reduce the likelihood that any
school might spend time preparing students
for the tests beyond ensuring familiarity
with the format. Once NAPLAN Online is
fully implemented, results could be returned
to schools and students within days of
testing. That would reinforce their value as a
measure of the starting point of the year.
Years of testing
This review’s interim report also raised
the possibility of the tests being taken by
students in Years other than 3, 5, 7 and 9. The
primary considerations were whether to shift
from Years 3 and 7 and, if so, whether to adjust
the other years or to eliminate them. Testing every
two years would provide a better view of
students’ growth over the school years than
would testing on only two occasions. On
the question of when to start, the review’s
interim report canvassed the possibilities of
Years 2, 3 or 4 for the first tests.
Would Year 3 be too early if the tests were
at the beginning of the school year? On
the other hand, would waiting until the
beginning of Year 4 be too late, given the
importance for students’ academic self-concept of becoming secure readers early
in their school lives?
In submissions to the review and in
consultations, both of these options and
also the possibility of testing in Year 2 were
raised. Year 2 would have the benefit of
earlier detection of emerging learning
problems for students, but NAPLAN-style
tests would not be appropriate for students
at that age. Furthermore, with all systems/
sectors now using early screening by one
means or another, at least in literacy, early
detection should already be in hand.
There was also some support for testing
in Year 6 rather than Year 7. That would
make the assessment more summative for
primary schools, or for the primary stage
in Foundation to Year 12 schools, and that
would reinforce the judgemental role of
NAPLAN that worries those who think it
provides too narrow a set of criteria for
evaluating schools. On the other hand,
keeping the testing in Year 7 and moving
it to early in the school year, would give
secondary schools formative information
about their incoming students. Many
secondary schools have students take
standardised tests early in Year 7, or even
in Year 6 if they know which students
will be coming to them in the new year.
Respondents from schools with these
practices claimed that they did not believe
that reports on students from the various
primary schools provided comparable
assessments. If NAPLAN were to test
all students early in Year 7, secondary
schools may well be able to abandon other
assessments that they currently use at the
beginning of the year or late in the prior year.
The possibility of shifting from Years 7
and 9 to Years 8 and 10 was raised in the
review’s interim report for consideration
(McGaw, Louden & Wyatt-Smith, 2019, p.4).
An alternative would be to maintain testing
in Year 7, at the commencement of the
secondary school years, but to delay the
later testing to Year 10. Year 9 is generally
regarded as a difficult year for students and
schools and the NAPLAN test results certainly
reveal a low level of engagement of Year 9
students. With Year 10 a key year for students’
decisions about their future study options,
the commencement of that year would
be a good time to obtain for the students,
their teachers and their parents/carers an
assessment of their current progress.
Recommendation 3
3.1
Conduct NAPLAN tests as early in the school year as is administratively feasible.
3.2
Set a goal for the results from all NAPLAN Online tests marked online being
reported to schools, students and parents/carers within a week of the conclusion
of the testing window.
3.3
Continue to administer NAPLAN tests in Years 3, 5 and 7 and replace assessments
in Year 9 with assessments in Year 10. Assessments in Year 9 are not to be held
in 2021.
Rebranding the program
The current NAPLAN census tests are part
of a broader National Assessment Program
that includes sample surveys in science,
civics and citizenship and information and
communication technology literacy. This
review has proposed a rebalancing of the
sample and census assessments, with
reading and numeracy to continue as census
tests; critical and creative thinking in science,
technology, engineering and mathematics
(STEM) to be added as a census test from
Year 5; and writing to be rebuilt, first
developed as a sample test and later
implemented as a census test.
At present the National Assessment
Program (NAP) is an umbrella title for
both the sample surveys and the literacy
and numeracy census tests as NAPLAN.
To distinguish them more clearly and to
recognise that the census tests are proposed
to move beyond literacy and numeracy, it
is proposed that new names be adopted
for each program: the Australian National
Standardised Assessments (ANSA) instead of
NAPLAN and National Sample Assessment
Program (NSAP) instead of NAP.
The addition of a census test in critical and
creative thinking in STEM would lead to
an overlap with the current sample survey
in science, which should, therefore, be
discontinued. Instead, a sample survey in
some other domain, perhaps history in
combination with a general capability such
as intercultural understanding, could be
added to the three-year sample survey cycle.
Recommendation 4
4.1
Adopt a new name, Australian National Standardised Assessments (ANSA),
in recognition of the changes in the existing tests and the addition of tests
of critical and creative thinking in science, technology, engineering and
mathematics (STEM).
4.2
Discontinue the National Assessment Program (NAP) sample survey in science
literacy with the introduction of ANSA in critical and creative thinking in STEM.
4.3
Maintain the National Assessment Program (NAP) sample surveys in civics and
citizenship and in information and communication technology literacy on their
current three-yearly cycle. Rename the program the National Sample Assessment
Program (NSAP).
Redeveloping the online
branching tests
Creating new ‘digital-native’ tests
Completing the move to online tests in
the absence of simultaneous use of print
versions of the tests will liberate the online
tests from the restriction imposed by a
requirement to parallel the print versions.
The new online tests should be ‘born digital’
so there should be no requirement to match
the online versions used in the transition
period when parallel print and digital
forms are used.
The new digital tests will require creative test
development to capitalise on the flexibility
and capacity of digital delivery. The stems for
test items need not be static and responses
could be constructive. Simulations could be
used, for example, to engage the students
in complex analysis and reflection.
There are two benefits of a move to exclusive
use of digital tests for vertical equating
of the scales over Years 3, 5, 7 and 10 and
horizontal equating of the scales over years
of testing. First, the complications faced in
equating both print and digital versions in
the transition years will be avoided. Secondly,
provided a new time series is started and
links back to 2008 are abandoned, there will
be no need to use common-person equating
with the secure print tests from 2008 in the
horizontal equating. All the equating will
become common-item equating. The earlier
decision to cease publishing the NAPLAN
items will make more items available for
repeated use over years and that will enable
the vertical and horizontal links between
tests to be strengthened.
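The idea of common-item equating can be illustrated with a minimal sketch. It assumes a Rasch-type calibration in which item difficulties are expressed in logits and uses a simple mean-shift link; the report does not specify the linking method used for NAPLAN, and the item names, difficulty values and function names below are hypothetical.

```python
from statistics import mean


def linking_constant(reference_difficulties, new_difficulties):
    """Mean difference in difficulty over the items common to both calibrations."""
    common = sorted(set(reference_difficulties) & set(new_difficulties))
    if not common:
        raise ValueError("no common items to link on")
    return mean(reference_difficulties[i] - new_difficulties[i] for i in common)


def to_reference_scale(value_on_new_scale, constant):
    """Place a difficulty or ability from the new calibration on the reference scale."""
    return value_on_new_scale + constant


# Hypothetical item difficulties (logits) from two separate calibrations.
# item2 and item3 were reused, so they act as the common (link) items.
reference_year = {"item1": -0.50, "item2": 0.25, "item3": 1.00}
new_year = {"item2": 0.00, "item3": 0.75, "item4": 1.50}

shift = linking_constant(reference_year, new_year)
item4_on_reference_scale = to_reference_scale(new_year["item4"], shift)
print(shift, item4_on_reference_scale)
```

The more items that can be reused securely from one test to another, the more items contribute to the link, which is the sense in which retaining unpublished items could strengthen the vertical and horizontal links.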
There will be lessons to be learned about
equating in this new arrangement. As
described in Chapter 4, each testlet in the
branching tests for Literacy and Numeracy
has three forms that need to be parallel in
the sense of having the same breadth and
depth of coverage of the curriculum and
equivalent item difficulty levels. If this is not
achieved, whether students on the same
path, AD for example, are then directed
to F or E (see Figure 2, p. 60) could
become an arbitrary consequence of the
particular versions of testlets A and D
with which they were presented. Technically,
this should not matter because of the
claim that the psychometric model used
can establish the students’ achievement
levels on the NAPLAN scales independent
of the difficulties of the particular items
to which they responded. The difficulty is
that the process at present uses number
of items correct and not calibrated scores
on the testlets to determine which branch
to take, so it is essential that the three
versions of each testlet have very similar
item difficulties. To minimise these risks, it
will be important for there to be thorough
trialling of the test items before they are
used in NAPLAN Online tests. There will
need to be trialling to confirm that the
branching based on determined item
difficulties works effectively as well as prior
trialling to establish the item difficulties.
There should also be further development
of the computer delivery platform to see if
the branching could be based on calibrated
estimates of achievement derived by the
psychometric model and not number of
items answered correctly in the testlets.
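The difference between branching on number correct and branching on a calibrated estimate can be sketched as follows. This is an illustration only, assuming a Rasch model with known item difficulties; the two testlet forms, the cut-off values and the function names are hypothetical and are not taken from the NAPLAN Online platform.

```python
import math


def expected_score(theta, difficulties):
    """Expected number correct under a Rasch model for ability theta."""
    return sum(1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties)


def ability_estimate(raw_score, difficulties, lo=-6.0, hi=6.0):
    """Maximum-likelihood ability for a non-extreme raw score, found by bisection."""
    if not 0 < raw_score < len(difficulties):
        raise ValueError("estimate undefined for zero or perfect scores")
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < raw_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def route_by_raw_score(raw_score, cut=3):
    """Hypothetical rule: branch on number of items correct."""
    return "harder testlet" if raw_score >= cut else "easier testlet"


def route_by_ability(theta, cut=0.5):
    """Hypothetical rule: branch on the calibrated ability estimate."""
    return "harder testlet" if theta >= cut else "easier testlet"


# Two forms of the same testlet that are NOT parallel: form_b is 1 logit harder.
form_a = [-1.5, -1.0, -0.5, 0.0, 0.5]
form_b = [-0.5, 0.0, 0.5, 1.0, 1.5]

# Two students each answer 3 of 5 items correctly, one on each form.
theta_a = ability_estimate(3, form_a)
theta_b = ability_estimate(3, form_b)

print(route_by_raw_score(3), "|", route_by_raw_score(3))
print(route_by_ability(theta_a), "|", route_by_ability(theta_b))
```

In this sketch the raw-score rule sends both students down the same branch even though the student who took the harder form has the higher estimated achievement, while the rule based on the calibrated estimate separates them. When the forms are closely parallel, the two rules agree, which is why the equivalence of item difficulties across forms matters under the current approach.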
Setting new benchmarks
The discussion of benchmarks in Chapter 4
(see Table 16, p. 75) makes it clear that
fewer students fail to reach the NAPLAN
benchmarks than fail to reach the other
benchmarks. This could be because the
students are better prepared for the
NAPLAN tests because of their connection
with the Australian Curriculum or engage
with them more because they are domestic
census tests rather than international
sample surveys. The other obvious possibility
is that the NAPLAN National Minimum
Standards (NMS) benchmarks are less
demanding than the international ones.
That is even more likely given that all
students who did not sit the NAPLAN tests
because they were exempt are counted as
below NMS. The percentage of those who sat
and are below NMS is therefore smaller than
the percentages shown in Table 16, p. 75.
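The size of this effect can be shown with a short worked calculation. It assumes that the reported percentage below NMS is calculated over students who sat plus exempt students, with every exempt student counted as below NMS. The figures used are hypothetical and are not taken from Table 16.

```python
def below_nms_among_sat(reported_below_pct, exempt_pct):
    """Percentage below NMS among students who actually sat the test.

    Assumes both inputs are percentages of the same group (sat + exempt)
    and that every exempt student is counted as below NMS.
    """
    if not 0 <= exempt_pct <= reported_below_pct <= 100:
        raise ValueError("expected 0 <= exempt_pct <= reported_below_pct <= 100")
    if exempt_pct == 100:
        raise ValueError("no students sat the test")
    sat_below = reported_below_pct - exempt_pct   # below NMS and sat
    sat_total = 100.0 - exempt_pct                # all who sat
    return 100.0 * sat_below / sat_total


# Hypothetical figures: 6.0% reported below NMS, of which 2.0 points are exempt students.
adjusted = below_nms_among_sat(6.0, 2.0)
print(round(adjusted, 2))
```

Under these assumed figures, the proportion below NMS among students who sat the test is about two percentage points lower than the reported figure.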
Table 16, p. 75, revealed that a smaller
proportion of Australian students fall below
the minimum standard thresholds in the
NAPLAN tests than fall below the minimum
standard thresholds set in the international
surveys of student achievement in
which Australia participates, Progress in
International Reading Literacy Study (PIRLS),
Trends in International Mathematics and
Science Study (TIMSS) and Programme for
International Student Assessment (PISA).
Differences among the various surveys in
what is tested are discussed in Chapter 2
but, regardless of those differences and the
possibility that the lower proportion below the
minimum standard in NAPLAN is due to
NAPLAN being more closely aligned with
the Australian Curriculum, the levels at
which minimum competence in NAPLAN
is set should be reviewed. Work should also
proceed on the development of ‘proficient’
and ‘highly proficient’ NAPLAN benchmarks.
Recommendation 5
5.1
Redevelop the reading and mathematics tests as digital assessments, capitalising
on all the flexibility that the digital form offers for content and item form, with no
constraint to mirror the current print versions of the tests.
5.2
Develop the new test of critical and creative thinking in science, technology, engineering
and mathematics (STEM) so that it is ‘born digital’ since it will have no print form
that it might have been constrained to match.
5.3
Undertake further development of the branching model and system changes to
see if the branching could be based on estimates of achievement derived by the
psychometric model and not number of items answered correctly in the testlets.
5.4
Review the level of the National Minimum Standards on the NAPLAN scales to see
if they are set too low and progress work on developing additional ‘proficient’ and
‘highly proficient’ benchmarks.
Redeveloping the writing test
As indicated in Chapter 5, the writing test
attracted the most sustained negative
comment from stakeholders in this review.
Throughout the consultations there were
sustained calls for change to the current
approach to testing writing in NAPLAN.
These calls go well beyond adding another
form (for example, informative writing).
Among the most common issues raised
were: the need for richer prompts;
broadening the range of forms or genres;
examining the criteria and accompanying
scores against which writing is assessed;
changing the conditions in which students
are required to write to permit time
for planning and review in composing
processes; and the potential benefits
of including a component of teacher
judgement, beyond involving teachers in
state-based NAPLAN Marker Quality Teams.
It is interesting to note that while cohort
gain is a feature of the National Report
analyses of NAPLAN reading and numeracy
data, cohort gain for writing is not included.
Further, the panel was advised that the
writing results are used to a lesser extent
by systems/sectors and schools than results
for reading and numeracy. An explanation
for this lack of use of the writing test results
included the following observations:
• Writing results are perceived to be
less reliable than those for reading
and numeracy, as they are subject to
more external sources of variance than
other test data.
• These sources consist of genre effects,
prompt specific effects, marking criteria
and marking consistency.
• Collectively, these external sources of
variance can have a significant undue
influence on the trends of writing results.
For example, Figure 11 shows that, with
few exceptions, results of all jurisdictions
move in unison from one year to the next,
likely to reflect the influence of common
external sources of variance (for example,
prompt and equating effects), rather
than any real changes in the states’ and
territories’ underlying performance over
time. (Measurement expert)
These deficiencies could be dealt with if
the test was fully redeveloped to offer a
broadened range of prompts and forms
of writing, with altered test conditions,
and explicit provision for student choice
in how the test is designed.
Purpose of the NAPLAN Writing
assessment is not clear
Currently the relationships of the writing
assessment and the marking criteria
to the Australian Curriculum: English
and Achievement Standards, General
Capabilities, and the National Literacy
and Numeracy Learning Progressions
remain opaque.
Clarity about the relationships is essential
so that teachers do not see NAPLAN writing
in isolation from their classroom practice.
It would support teachers’ diagnosis of
students’ learning needs, as well as their
selection of curriculum adjustments to
support students with disability. With the
move to NAPLAN Online, exemplars of
student writing, with rich annotations, could
be provided to illustrate student writing
at different year levels and in NAPLAN
performance bands. This would all help build
teachers’ assessment literacy.
Tasks are often decontextualised without
sense of audience and purpose for writing
Teachers frequently mentioned that
emphasis on audience and purpose is
integral to how students are taught writing
in the classroom but largely absent from
the NAPLAN writing test, which was
characterised as ‘alien’. The Australian
Curriculum: English requires students to
manipulate language features appropriate
to audience and purpose but where ‘the
audience is generic, students do not have
a sense of the level of formality required’
(Educational organisation).
When students write in school, they have
time for planning, drafting and editing and
the prompts for their writing generally cue
them effectively to select vocabulary and
language features to build an effective
relationship with the reader. The NAPLAN
test does not enable most students to produce
their best writing.
Figure 11: Trends in mean performances on the NAPLAN Writing test
Writing required is too narrow
A new writing test could include a wider
range of prompts designed to take account
of developmental stages of writing. The
scope of the writing required could also be
broadened to include an extended response
or multiple short writing tasks or both,
subject to testing purpose and scheduling.
Short writing tasks could produce useful
information about students’ writing skills,
particularly in secondary education where
the informative genres are relevant. The
length of the extended response could be
reviewed for Years 7 and 10 and considered
in relation to time allocated to test
implementation. The extended response
and multiple short writing pieces could be
staged, if two test window opportunities
were available.
At present a single genre for responses is
used each year, so far either narrative or
persuasive. The test could be expanded to
include more than a single form or genre.
Beginning in Year 5, students could be
given a choice from a specified range of
forms, as best suits their choice of stimuli,
discussed next.
Prompts could be presented in a range of
ways, especially if the writing test is ‘born
digital’ and where the test is completed
online. A digital placemat would present
students with an overall concept or theme
and a number of visual and verbal stimuli
they would respond to. That should increase
the perceived authenticity and relevance
of the test to young people and contribute
to efforts to arrest the problem, widely
reported, of student disengagement from
NAPLAN writing.
While the proposal for a digital placemat
of stimuli in the context of NAPLAN is
new, the use of a range of stimuli is a well-recognised feature in examinations. The
writing task that was part of the Core Skills
Test in Queensland is a useful reference
point for the diverse range of stimuli (for
example, artworks, short literary pieces
including poetry, news items, reports, and
various graphic representations of ideas).
Further details can be found at Queensland
Curriculum and Assessment Authority (2019).
Informed by the widely reported position
that students in Year 3 do not have well-developed keyboarding skills and are
therefore unlikely to produce their best
writing online, the online writing test
could be introduced in Year 5. This would
recognise teachers’ insights regarding the
classroom priority of mastering handwriting.
It would allow opportunity for introducing
keyboarding and word processing to
strengthen opportunities for students’
success in undertaking the online writing
test in Year 5.
The marking rubrics need simplification
As explained in Chapter 5, marking of
NAPLAN writing is a complex process with
10 traits with reported potential problems of
interdependence among them and of a halo
effect in applying them.
The marking rubric should be redesigned
to include fewer, and conceptually more
distinct, traits that would minimise the risk
of trait interdependence. The rubric should
be consistent with the definition of the
writing construct and reflect a balance of
the following, consistent with the aspects
of writing expected for students at different
learning stages:
• higher order authorial skills of audience,
ideas and text structure
• the mechanical skills of spelling,
punctuation, paragraphing and
sentence structure.
The redesigned writing test should take
a strengthened focus on language use
in context. The language conventions
of spelling, grammar and punctuation
should be assessed in the writing that
students produce, and not separately in
decontextualised applications.
The technical issues of scoring including
dependencies among criteria, especially
in relation to adjacent year levels, should
be examined routinely as part of ongoing
test evaluation and validity studies.
These would open the opportunity for
investigating the difficulty and ease with
which scorers can separate the criteria for
scoring purposes, discriminating among
them and addressing the specified features
within each criterion. The most recent
National Assessment Program – Literacy
and Numeracy (NAPLAN) 2019 Technical
Report (June 2020) includes studies of this
type and adds to documentation necessary
to achieve transparency in test design and
implementation and to build confidence at
system/sector and school levels and in the
wider public.
Assessment and reporting could involve
teacher judgement
The redevelopment of the new writing test
will take some time. In 2021, students will sit
the writing test as already planned. A small
trial could be undertaken in 2021 with a
selection of students to ensure the proposed
redevelopments are fit-for-purpose.
Then, in 2022, the new writing test would
be introduced as a sample assessment.
This would allow an opportunity to check
that the proposed redevelopment delivers a
sound approach to the testing of writing.
In 2023, assuming the sample period
provides positive evidence, the writing test
would revert to census testing.
Noting also the strongly held view of review
participants regarding the importance
of teachers’ professional judgement, it is
recommended that the profession plays a
key role in the redevelopment of the writing
test, including through moderation systems
and processes. It is recommended that
these be developed initially as part of the
sampling methodology in the trial (short-term action), and as a feature of quality
assurance systems and processes, with the
redeveloped writing test to be re-instated as
part of census testing.
The redevelopment work would include a
National Calibration Sample, which has been
used in NAPLAN to calibrate scales, with a
well-structured, random sample of schools
to represent each state and territory, taking
account of factors such as school size and
geolocation with oversampling to ensure
sufficient numbers in some subgroups that
have small numbers in the population.
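The sampling design described above can be illustrated with a minimal sketch. It assumes a hypothetical sampling frame keyed by state and geolocation, a hypothetical base rate of one school in twenty, and a hypothetical floor of 15 schools per stratum. None of these figures, nor the function name `stratified_sample`, comes from the NAPLAN design; they are chosen only to show how oversampling protects small subgroups.

```python
import math
import random


def stratified_sample(schools_by_stratum, one_in, min_per_stratum, seed=0):
    """Draw a random sample of schools within each stratum.

    Each stratum is sampled at a base rate of one school in `one_in`,
    but small strata are oversampled so that at least `min_per_stratum`
    schools are drawn (or every school, if the stratum is smaller still).
    All parameter values used below are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    sample = {}
    for stratum, schools in schools_by_stratum.items():
        proportional = math.ceil(len(schools) / one_in)
        n = min(len(schools), max(proportional, min_per_stratum))
        sample[stratum] = rng.sample(schools, n)
    return sample


# Hypothetical frame: (state or territory, geolocation) -> school identifiers.
frame = {
    ("NSW", "metropolitan"): [f"NSW-M-{i}" for i in range(400)],
    ("NSW", "remote"): [f"NSW-R-{i}" for i in range(30)],
    ("ACT", "metropolitan"): [f"ACT-M-{i}" for i in range(12)],
}

chosen = stratified_sample(frame, one_in=20, min_per_stratum=15)
sizes = {stratum: len(schools) for stratum, schools in chosen.items()}
print(sizes)
```

Because small strata are sampled at a higher rate than large ones, any national estimate drawn from such a sample would need to weight each stratum by its share of the population.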
The scripts from the Calibration Sample
could be marked centrally as at present
but consideration should be given to the
use of digital technologies to support
national marker training and marker
moderation online to bring a stronger
national perspective to the marking, which is
currently done within states and territories.
The efficiency of the marking could be
improved with human judges marking
the scripts for authorial aspects of writing
(audience, text structure, ideas, persuasive
devices/character and setting, cohesion,
paragraphing). Automated scoring could
be used for scoring selected criteria
(spelling, vocabulary, sentence structure
and punctuation).
Recommendation 6
6.1
Undertake significant development work on a new writing test to be ‘born digital’.
6.2
Ensure the new test design demonstrates clear alignment to the Australian
Curriculum: English, the Achievement Standards, the General Capabilities,
and National Literacy and Numeracy Learning Progressions.
6.3
Ensure the new test offers a broadened range of forms including imaginative,
persuasive and informative genres, staged across the years of testing (for
example, Year 3: imaginative; Year 5: imaginative and persuasive; secondary:
persuasive and informative).
6.4
Ensure the prompt clarifies to students the audience for the writing.
6.5
Extend the assessment time for the writing test to be sufficient for students to
be able to draft and edit before producing final copy.
6.6
Develop a ‘digital placemat’ to present students with an overall concept or theme
and a number of visual and verbal stimuli they could respond to.
6.7
Withdraw the language conventions test as a separate test and assess grammar,
punctuation and spelling in the writing test.
6.8
Allow Year 3 students to hand write responses and Years 5, 7 and 10 students to
write using a computer.
6.9
Systematically train students in the use of a keyboard to achieve efficiency
before Year 5, with demonstration of fluency in typing to be ongoing
throughout schooling.
6.10 Simplify the marking rubric with fewer criteria that are conceptually independent.
6.11
Trial automated scoring of spelling, vocabulary, sentence structure and
punctuation in the writing test, while authorial aspects of writing are to be
scored by teachers.
6.12 Reinstate the writing test as a census test in 2023, following redevelopment
and evidence from the sample trial.
6.13 Explore digital approaches to support national marker training and marker
moderation online, including the use of exemplars and rich commentaries for a
stronger national perspective to the marking and teacher judgement contribution.
Starting a new time series
With the full implementation of NAPLAN
Online, a new time series should be
commenced, freed from the constraint of
achieving satisfactory horizontal calibration
of current scales back to the original 2008
NAPLAN scales. The new scale could be set
as the first one was, with a mean of 500
and a standard deviation of 100. Subsequent
years’ data would then reveal whether
and how much means change from this
starting point.
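The arithmetic of setting a new scale can be sketched as follows. This is a minimal illustration, assuming that student results are first available on some underlying metric (the values below are invented) and that the scale is fixed by a linear transformation fitted in the starting year. The function names and data are hypothetical and are not drawn from NAPLAN's psychometric procedures.

```python
from statistics import fmean, pstdev


def fit_scale(base_scores, target_mean=500.0, target_sd=100.0):
    """Return (slope, intercept) that map the base-year scores onto a
    scale with the target mean and standard deviation."""
    mu = fmean(base_scores)
    sigma = pstdev(base_scores)
    if sigma == 0:
        raise ValueError("base-year scores have no spread")
    slope = target_sd / sigma
    intercept = target_mean - slope * mu
    return slope, intercept


def to_scale(scores, slope, intercept):
    """Apply the fixed base-year transformation to any year's scores."""
    return [slope * s + intercept for s in scores]


# Invented results on an underlying metric for the starting year and a later year.
base_year = [-1.2, -0.4, 0.0, 0.3, 1.3]
later_year = [-1.0, -0.2, 0.2, 0.5, 1.5]

slope, intercept = fit_scale(base_year)
base_scaled = to_scale(base_year, slope, intercept)
later_scaled = to_scale(later_year, slope, intercept)

print(round(fmean(base_scaled), 1), round(pstdev(base_scaled), 1))
print(round(fmean(later_scaled), 1))
```

The transformation is fitted once, in the starting year, and then held fixed. A later year's mean that departs from 500 therefore reflects a change in performance rather than a change in the scale.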
Recommendation 7
Establish a new time series, beginning with the year in which NAPLAN Online
is fully implemented.
Reporting
Monitoring trends
The 1989 Hobart Declaration committed
Australian governments to monitoring
national trends in achievement. Since the
first 2008 NAPLAN National Report, annual
reports have provided national and state
and territory analyses of achievement
and student progress. They have also
considered the differences in achievement
among male and female students, and by
Indigenous status, language background
other than English, geolocation and parental
education and occupation. Although some
stakeholders would prefer sample rather
than census assessments for this purpose,
the importance of monitoring trends in
achievement over time, across jurisdictions
and across equity groups was widely
accepted by stakeholders.
In the last decade, NAPLAN results have
become a fundamental part of the evidence
base for school system accountability and
school improvement. Under the current
National School Reform Agreement (2018),
for example, jurisdictions have agreed
to focus on increasing the proportion
of students in the top two NAPLAN
achievement bands and decreasing the
proportion of students in the bottom
two achievement bands, including for
students in priority equity groups. These
commitments cascade down to state and
territory budget targets, school system/
sector targets and school-level improvement
targets. For this reason, annual public
reporting of achievement and progress
should continue.
Recommendation 8
Continue to publish annual reports on performance levels – national, state and
territory and jurisdiction, as well as for subgroups of interest, such as male and female,
Indigenous, students with a language background other than English – and trends
in performance levels over time.
Reporting to schools, parents/carers
and the community
There was broad recognition of the value
of information for parents/carers provided
through individual students’ NAPLAN
reports. Some parent/carer groups and
organisations supporting students with
learning difficulties also valued the
availability of information independent
of the local school context.
Individual students’ NAPLAN results
currently arrive in schools too late to be
useful for guiding individual learning, but
the data are often used in triangulation
with other standardised tests and teachers’
judgements. Fully online testing will reduce
the time lag to days, not months, and will
increase the value of the assessments
in supporting teaching and learning of
individual students. Nevertheless, levels
of precision of measurements mean that
national means are the most precise, school
means are more precise for larger schools
and measures are least precise for individual
students. Appropriately, the ministerial
council has recently clarified the importance
of teacher judgments and limits to the
diagnostic use of NAPLAN.
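The relationship between group size and precision can be made concrete with a short sketch. It assumes, for illustration only, that every student's score carries an independent measurement error of the same size (30 scale points here, an invented figure) and it ignores sampling variation and clustering. The group sizes and the function name are also hypothetical.

```python
import math


def error_of_mean(sem_individual, n_students):
    """Standard error of a group mean arising from measurement error alone,
    assuming independent errors of equal size for every student."""
    if n_students < 1:
        raise ValueError("a group needs at least one student")
    return sem_individual / math.sqrt(n_students)


SEM = 30.0  # hypothetical measurement error for one student, in scale points

# Hypothetical group sizes, from a single student up to a national cohort.
groups = {
    "individual student": 1,
    "small school cohort": 25,
    "large school cohort": 225,
    "national cohort": 250_000,
}

errors = {name: error_of_mean(SEM, n) for name, n in groups.items()}
for name, err in errors.items():
    print(f"{name}: about {err:.2f} scale points")
```

Under these assumptions the error shrinks with the square root of the group size, which is why a result that is too imprecise for detailed diagnosis of one student can still support comparisons of school or national means.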
Many stakeholders were concerned about
the impact of public reporting of school-level NAPLAN results. Comparisons among
schools in published league tables and on
the My School website were widely believed
to have made NAPLAN into high stakes
tests and led to negative consequences
such as teaching to the tests, narrowing
of the curriculum and increasing stress for
students and teachers. For these reasons,
successive ministerial councils have warned
against the construction of simplistic
league tables. Direct comparisons among
statistically similar schools have been
removed from the My School website and
replaced with comparisons of achievement
and progress of students with similar
starting points and similar backgrounds.
Ministers have also clarified that NAPLAN
‘does not measure overall school quality’.
There remains, however, a public interest
in making available information about
each school’s contribution to the national
effort to make Australia a high equity,
high performance nation.
Recommendation 9
9.1
Ministers emphasise that standardised test results should be used in conjunction
with school-based assessments in judging students’ progress and in reporting
to parents/carers.
9.2
Ministers emphasise that My School does not compare statistically similar schools
but instead provides information on patterns of achievement and growth of
similar students from the larger Australian population.
Ongoing evaluation
The review has recommended substantial
changes to the national standardised
assessment program. Responding to
stakeholders’ view that NAPLAN may have
narrowed the curriculum, the review has
proposed widening the assessment domains
beyond literacy and numeracy to include
critical and creative thinking in STEM.
Concerned that preparation for NAPLAN
may have dominated the curriculum in
some schools and led to teaching to the
test, the review has proposed decreasing
the number of assessments and moving
the assessments as close as practicable to
the beginning of the school year, making
the assessments less summative and more
formative. Building on feedback about Year
9 students’ attitudes to NAPLAN, the review
has proposed moving the Year 9 assessment
to Year 10, a year in which students are
making decisions about their further study
options. Based on widespread feedback
about the scope, quality and marking of the
current writing assessment, the review has
proposed a substantial process of rebuilding
take place before writing returns as a census
assessment. In preparation for a fully online
assessment, the review has proposed that
reading and numeracy should be completely
redeveloped, capitalising on the flexibility
that the digital form offers for content and
item development. Finally, in consequence
of all these changes to NAPLAN, the review
has proposed that a new time series be
established, beginning in the year in which
NAPLAN Online is fully implemented.
The scope of these proposed changes, if
implemented, and the degree to which
they succeed in improving the scope and
quality of assessment and ameliorating the
potential negative consequences of national
standardised assessment, are sufficient to
warrant a formal evaluation of the program.
Recommendation 10
Undertake a formal evaluation of any changes made to the national standardised
assessment program, with particular attention to the costs and benefits of these
changes for students, teachers, schools and school systems/sectors.
Links to terms of reference and proposed timeline
The ten recommendations above are shown in Table 24 together with their links to the terms
of reference for the review and with suggested timing for implementation for each part of
each of the recommendations.
Table 24: Terms of Reference, recommendations and timeline
Term of Reference
Recommendation
Timeline
Term of reference 1
Determine what the
objectives for standardised
testing in Australia should
be, given its evolution
over time – this could be
objectives that support:
• individual student learning
achievement and growth
• school improvement
• system accountability and
performance
• information for parents/
carers on school and
student performance
• national, state and territory
programs and policies;
Recommendation 1:
Standardised assessment and
the role of NAPLAN
1.1
Ministers re-endorse the importance of
standardised testing in Australian school
education for:
a. Monitoring progress towards
national goals.
b. School system accountability
and performance.
c. School improvement.
d. Individual student learning
achievement and growth, noting
the limitations on use in detailed
diagnosis of learning deficiencies
and difficulties due to the degree
of uncertainty in measures of
individual students.
e. Information for parents/carers on
student and school performance.
2020
Term of reference 2
Assess how well placed
NAPLAN is to meet these
objectives, including:
• the appropriateness,
accuracy and efficacy of
assessment in each domain
• the effectiveness in
tracking student and
system progress over time
(including the impact
of equating, and the
placement of the tests in
years 3, 5, 7 and 9)
• alignment with the
Australian Curriculum
(including any gaps)
• the impact of the
assessment on
schools, students and
the community.
Recommendation 1:
Role of NAPLAN
1.2
Ministers re-affirm the role of national
standardised assessment in fulfilling
these purposes.
2020
1.3
Continue to conduct national
standardised assessment as a census
test of student achievement.
2020
1.4
Define the purposes and limitations
of national standardised assessment
by decision of the Ministerial Council
in the manner proposed in Table
23 and communicate this on the
Australian Curriculum, Assessment and
Reporting Authority (ACARA) website
and in communications with schools
and parents/carers.
2020
Term of reference 3
Consider the key objectives,
uses and features of effective
national assessment programs
internationally, and how the
objectives and performance of
NAPLAN compare with this.
Recommendation 2:
Other national practices
2.1
Ministers note that national assessment
policies and practices vary and that there
are no common features of assessment
among high-achieving countries.
2020
Term of reference 4
Identify targeted
improvements that can be
made to standardised testing
in Australia in the short-term,
including the level of school
and student engagement,
so it better meets the
objectives above.
Recommendation 2:
Content coverage of tests
2.2
Rename the numeracy test as
mathematics, to clarify that it assesses
the content and proficiency strands of
the Australian Curriculum: Mathematics
2020
2.3
Add assessment of critical and creative
thinking in science, technology,
engineering and mathematics (STEM)
to the national standardised census
assessment program, except at Year 3,
and introduce it only after a period of
experimental test development that
demonstrates that valid and reliable
tests have been developed.
2021-22 development
2023 introduction
2.4
Withdraw the current triennial sample
survey of science literacy in Years 6
and 10 and consider replacing it in the
triennial cycle with another covering
both a subject and a general capability
from the Australian Curriculum, such as
history and intercultural understanding.
2023
2.5
Explicitly map the tests to the National
Literacy and Numeracy Learning
Progressions to provide insight into
student learning progress in the
year levels in which the tests are
administered.
2023
2.6
Jurisdictions investigate students’
reasons for absence from NAPLAN
testing and seek to reduce the current
levels of absence, particularly at the
secondary level.
From 2021
Term of reference 5
Identify longer-term
objectives, uses and features
of standardised testing in
Australia within the context of
a future national assessment
landscape; and consider, in
line with these objectives,
longer-term improvements
that can be made to ensure
that Australia has the most
efficient and effective system
for assessing key literacy and
numeracy outcomes at the
national level.
Recommendation 3:
Timing of tests
3.1
Conduct NAPLAN tests as early in the
school year as is administratively feasible.
2022
3.2
Set a goal for the results from all
NAPLAN Online tests marked online
being reported to schools, students
and parents/carers within a week of the
conclusion of the testing window.
2020
3.3
Continue to administer NAPLAN tests in
Years 3, 5 and 7 and replace assessments
in Year 9 with assessments in Year 10.
Assessments in Year 9 are not to be
held in 2021.
2022
Recommendation 4:
Rebranding of programs
4.1
Adopt a new name, Australian National
Standardised Assessments (ANSA),
in recognition of the changes in the
existing tests and the addition of tests
of critical and creative thinking in
science, technology, engineering and
mathematics (STEM).
2022
4.2
Discontinue the National Assessment
Program (NAP) sample survey in science
literacy with the introduction of ANSA in
critical and creative thinking in STEM.
After 2021
4.3
Maintain the National Assessment
Program (NAP) sample surveys in civics
and citizenship and in information and
communication technology literacy on
their current three-yearly cycle. Rename
the program the National Sample
Assessment Program (NSAP).
2022
Recommendation 5:
Test development
5.1
Redevelop the reading and mathematics
tests as digital assessments, capitalising
on all the flexibility that the digital form
offers for content and item form, with
no constraint to mirror the current print
versions of the tests.
2022
5.2
Develop the new critical and creative
thinking in science, technology,
engineering and mathematics (STEM)
so that it is ‘born digital’ since it will have
no print form that it might have been
constrained to match.
2023
5.3
Undertake further development of the
branching model and system changes to
see if the branching could be based on
estimates of achievement derived by the
psychometric model and not number of
items answered correctly in the testlets.
2022
5.4
Review the level of the National
Minimum Standards on the NAPLAN
scales to see if they are set too low and
progress work on developing additional
‘proficient’ and ‘highly proficient’
benchmarks.
2022
Recommendation 6:
Writing test changes
6.1
Undertake significant development work
on a new writing test to be ‘born digital’.
From 2021
6.2
Ensure the new test design
demonstrates clear alignment to the
Australian Curriculum: English, the
Achievement Standards, the General
Capabilities, and National Literacy and
Numeracy Learning Progressions.
2022
6.3
Ensure the new test offers a broadened
range of forms including imaginative,
persuasive and informative genres,
staged across the years of testing (for
example, Year 3: imaginative; Year 5:
imaginative and persuasive; secondary:
persuasive and informative).
2022
6.4
Ensure the prompt clarifies to students
the audience for the writing.
2022
6.5
Extend the assessment time for the
writing test to be sufficient for students
to be able to draft and edit before
producing final copy.
2022
6.6
Develop a ‘digital placemat’ to present
students with an overall concept or
theme and a number of visual and verbal
stimuli they could respond to.
2022
6.7
Withdraw the language conventions test
as a separate test and assess grammar,
punctuation and spelling in the
writing test.
2022
6.8
Allow Year 3 students to hand write
responses and Years 5, 7 and 10 students
to write using a computer.
2022
6.9
Systematically train students in the
use of a keyboard to achieve efficiency
before Year 5, with demonstration
of fluency in typing to be ongoing
throughout schooling.
From 2022
6.10 Simplify the marking rubric with
fewer criteria that are conceptually
independent.
2022
6.11
Trial automated scoring of spelling,
vocabulary, sentence structure and
punctuation in the writing test, while
authorial aspects of writing are to be
scored by teachers.
2022
6.12 Reinstate the writing test as a census
test in 2023, following redevelopment
and evidence from the sample trial.
2023
6.13 Explore digital approaches to support
national marker training and marker
moderation online, including the use
of exemplars and rich commentaries
for a stronger national perspective
to the marking and teacher
judgement contribution.
2022
Recommendation 7:
New time series
Establish a new time series, beginning
with the year in which NAPLAN Online is
fully implemented.
2022
Recommendation 8:
Annual reports
Continue to publish annual reports on
performance levels – national, state and
territory and jurisdiction, as well as for
subgroups of interest, such as male and
female, Indigenous, students with a language
background other than English – and trends
in performance levels over time.
2021
Recommendation 9:
Use with school assessments
9.1
Ministers emphasise that standardised
test results should be used in
conjunction with school-based
assessments in judging students’
progress and in reporting to parents.
2020
9.2
Ministers emphasise that My School
does not compare statistically similar
schools but instead provides information
on patterns of achievement and growth
of similar students from the larger
Australian population.
2020
Recommendation 10:
Evaluation
Undertake a formal evaluation of any
changes made to the national standardised
assessment program, with particular
attention to the costs and benefits of these
changes for students, teachers, schools
and school systems.
2026
Appendix 1: Summary of recommendations
Recommendation 1
1.1
Ministers re-endorse the importance of standardised testing in Australian school
education for:
a. Monitoring progress towards national goals
b. School system accountability and performance
c. School improvement
d. Individual student learning achievement and growth, noting the limitations on use
in detailed diagnosis of learning deficiencies and difficulties due to the degree of
uncertainty in measures of individual students
e. Information for parents on student and school performance.
1.2
Ministers re-affirm the role of national standardised assessment in fulfilling these
purposes.
1.3
Continue to conduct national standardised assessment as a census test of student
achievement.
1.4
Define the purposes and limitations of national standardised assessment by decision
of the Ministerial Council in the manner proposed in Table 23 and communicate this on
the Australian Curriculum, Assessment and Reporting Authority (ACARA) website and
in communications with schools and parents/carers.
Recommendation 2
2.1
Ministers note that national assessment policies and practices vary and that there
are no common features of assessment among high-achieving countries.
2.2
Rename the numeracy test as mathematics, to clarify that it assesses the content
and proficiency strands of the Australian Curriculum: Mathematics.
2.3
Add assessment of critical and creative thinking in science, technology, engineering
and mathematics (STEM) to the national standardised census assessment program,
except at Year 3, and introduce it only after a period of experimental test development
that demonstrates that valid and reliable tests have been developed.
2.4
Withdraw the current triennial sample survey of science literacy in Years 6 and
10 and consider replacing it in the triennial cycle with another covering both a
subject and a general capability from the Australian Curriculum, such as history
and intercultural understanding.
2.5
Explicitly map the tests to the National Literacy and Numeracy Learning Progressions
to provide insight into student learning progress in the year levels in which the tests
are administered.
2.6
Jurisdictions investigate students’ reasons for absence from NAPLAN testing and
seek to reduce the current levels of absence, particularly at the secondary level.
Recommendation 3
3.1
Conduct NAPLAN tests as early in the school year as is administratively feasible.
3.2
Set a goal for the results from all NAPLAN Online tests marked online being reported
to schools, students and parents/carers within a week of the conclusion of the
testing window.
3.3
Continue to administer NAPLAN tests in Years 3, 5 and 7 and replace assessments in
Year 9 with assessments in Year 10. Assessments in Year 9 are not to be held in 2021.
Recommendation 4
4.1
Adopt a new name, Australian National Standardised Assessments (ANSA), in
recognition of the changes in the existing tests and the addition of tests of critical and
creative thinking in science, technology, engineering and mathematics (STEM).
4.2
Discontinue the National Assessment Program (NAP) sample survey in science literacy
with the introduction of ANSA in critical and creative thinking in STEM.
4.3
Maintain the National Assessment Program (NAP) sample surveys in civics
and citizenship and in information and communication technology literacy
on their current three-yearly cycle. Rename the program the National Sample
Assessment Program (NSAP).
Recommendation 5
5.1
Redevelop the reading and mathematics tests as digital assessments, capitalising on all
the flexibility that the digital form offers for content and item form, with no constraint
to mirror the current print versions of the tests.
5.2
Develop the new critical and creative thinking in science, technology, engineering and
mathematics (STEM) so that it is ‘born digital’ since it will have no print form that it
might have been constrained to match.
5.3
Undertake further development of the branching model and system changes to
see if the branching could be based on estimates of achievement derived by the
psychometric model and not number of items answered correctly in the testlets.
5.4
Review the level of the National Minimum Standards on the NAPLAN scales to see
if they are set too low and progress work on developing additional ‘proficient’ and
‘highly proficient’ benchmarks.
Recommendation 6
6.1
Undertake significant development work on a new writing test to be ‘born digital’.
6.2
Ensure the new test design demonstrates clear alignment to the Australian Curriculum:
English, the Achievement Standards, the General Capabilities, and National Literacy
and Numeracy Learning Progressions.
6.3
Ensure the new test offers a broadened range of forms including imaginative,
persuasive and informative genres, staged across the years of testing (for example,
Year 3: imaginative; Year 5: imaginative and persuasive; secondary: persuasive
and informative).
6.4
Ensure the prompt clarifies to students the audience for the writing.
6.5
Extend the assessment time for the writing test to be sufficient for students to be
able to draft and edit before producing final copy.
6.6
Develop a ‘digital placemat’ to present students with an overall concept or theme
and a number of visual and verbal stimuli they could respond to.
6.7
Withdraw the language conventions test as a separate test and assess grammar,
punctuation and spelling in the writing test.
6.8
Allow Year 3 students to hand write responses and Years 5, 7 and 10 students to write
using a computer.
6.9
Systematically train students in the use of a keyboard to achieve efficiency before
Year 5, with demonstration of fluency in typing to be ongoing throughout schooling.
6.10 Simplify the marking rubric with fewer criteria that are conceptually independent.
6.11
Trial automated scoring of spelling, vocabulary, sentence structure and punctuation
in the writing test, while authorial aspects of writing are to be scored by teachers.
6.12 Reinstate the writing test as a census test in 2023, following redevelopment and
evidence from the sample trial.
6.13 Explore digital approaches to support national marker training and marker moderation
online, including the use of exemplars and rich commentaries for a stronger national
perspective to the marking and teacher judgement contribution.
Recommendation 7
Establish a new time series, beginning with the year in which NAPLAN Online is
fully implemented.
Recommendation 8
Continue to publish annual reports on performance levels – national, state and
territory and jurisdiction, as well as for subgroups of interest, such as male and female,
Indigenous, students with a language background other than English – and trends in
performance levels over time.
Recommendation 9
9.1
Ministers emphasise that standardised test results should be used in conjunction with
school-based assessments in judging students’ progress and in reporting to parents.
9.2
Ministers emphasise that My School does not compare statistically similar schools
but instead provides information on patterns of achievement and growth of similar
students from the larger Australian population.
Recommendation 10
Undertake a formal evaluation of any changes made to the national standardised
assessment program, with particular attention to the costs and benefits of these
changes for students, teachers, schools and school systems.
Appendix 2: Review of NAPLAN terms of reference
Background
NAPLAN has been in place since 2008, and is evolving with the introduction of
online testing.
Noting changes in the broader education landscape, both nationally and within states and
territories, it’s important to consider how NAPLAN can continue to support an effective and
contemporary national assessment environment.
The review will be delivered jointly by the state governments of the Australian Capital
Territory, Queensland, New South Wales and Victoria.
The review will be informed by and build on work already undertaken or underway, including
work that has considered the extent to which NAPLAN has met its original objectives
(see below).
Terms of reference
The review will:
1. determine what the objectives for standardised testing in Australia should be,
given its evolution over time - this could be objectives that support:
• individual student learning achievement and growth
• school improvement
• system accountability and performance
• information for parents on school and student performance
• national, state and territory programs and policies;
2. assess how well placed NAPLAN is to meet these objectives, including:
• the appropriateness, accuracy and efficacy of assessment in each domain
• the effectiveness in tracking student and system progress over time (including the
impact of equating, and the placement of the tests in years 3, 5, 7 and 9)
• alignment with the Australian Curriculum (including any gaps)
• the impact of the assessment on schools, students and the community;
3. consider the key objectives, uses and features of effective national assessment
programs internationally, and how the objectives and performance of NAPLAN
compare with this;
4. identify targeted improvements that can be made to standardised testing in Australia
in the short-term, including the level of school and student engagement, so it better
meets the objectives above;
5. identify longer-term objectives, uses and features of standardised testing in Australia
within the context of a future national assessment landscape; and
6. consider, in line with these objectives, longer-term improvements that can be made
to ensure that Australia has the most efficient and effective system for assessing key
literacy and numeracy outcomes at the national level.
Other relevant work that the review will need to consider
The review will build on previous and current work including:
• the 2018 Queensland NAPLAN Review
• the 2018/19 Review of NAPLAN Data Presentation
• reviews associated with NAPLAN Online
• other relevant reviews of NAPLAN.
It will not duplicate the outcomes or findings of this work.
The review will also take into account:
• concurrent work on assessment, including commitments in the National School
Reform Agreement
• other streams of work which might have implications for assessment goals, including
the review of the Melbourne Declaration and updates to the Closing the Gap targets.
Review process
The review will be led by a panel of up to three members, to be appointed by participating
governments. Members will have expertise in assessment, curriculum or other relevant
fields. An international expert will be considered as one of the members or as a key advisor.
The review will also be supported by an inter-jurisdictional reference group of practitioners.
Targeted stakeholder consultation (by invitation) will occur in each stage. This will be
targeted with the aim of supporting outputs for the reports.
The terms of reference for the review, including the proposed reviewers, will be agreed
by relevant ministers.
Review outputs
Interim report
Stage 1 will provide an interim report to Education Council in December 2019 with:
• a statement clarifying the objectives of standardised testing in Australia
• suggested immediate improvements to standardised testing in Australia to better
meet these objectives
• a summary of longer term issues for investigation that will inform stage 2 of the review.
Final report
Stage 2 will report to Education Council in June 2020 (*updated) on a strategic blueprint for
standardised testing in Australia, to be considered in concert with the introduction of new
assessment approaches (including improvements associated with NAPLAN Online and the
national formative assessment capacity).
*Stage 2 will now report to Education Council in September 2020.
Appendix 3: List of stakeholders consulted
Stakeholder consultations
The panel conducted 91 face-to-face and videoconference consultations with 175 individual
stakeholders during the review. These organisations and individuals are named below:
Location
Stakeholder institution
Stakeholder name
ACT
ACT Aboriginal and Torres Strait Islander Elected Body
Maurice Walker
ACT Council of Parents and Citizens Associations
Kirsty McGovern-Hooley
Veronica Elliot
ACT Education Directorate
Katy Haire
Meg Brighton
Deb Efthymiades
Robert Gotts
Kate McMahon
Mark Huxley
Simon Tiller
ACT Government
Ms Yvette Berry MLA
Joshua Ceramidas
Rebecca Hobbs
ACT Principals’ Association
Liz Bobos
Wendy Cave
Association of Independent Schools of the ACT
Andrew Wrigley
Joanne Garrison
Australian Education Union – ACT Branch
Glenn Fowler
Malisa Legyel
Sean van der Heide
QLD
Association of Heads of Independent Schools Queensland Branch
Ros Curtis
Australian Catholic University
Michelle Haynes
Catholic School Parents Queensland
Carmel Nash
Catholic Secondary Principals Association of Queensland
Ann Rebgetz
Independent Schools Queensland
Josephine Wise
Michael Gilliver
Joint Council of Queensland Teacher Associations
Danielle Gordon
Sherryl Saunders
QLD Aboriginal and Torres Strait Islander Education and Training Advisory Committee
Ned David
Anita Lee Hong
Dr Melinda Mann
QLD Association of Combined Sector Leaders
Brian O'Neill
QLD Association of Special Education Leaders
Roselynn Anderson
QLD Association of State School Principals
Leslie Single
QLD Catholic Education Commission
Dr Lee-Anne Perry
Yvonne Ries
QLD Catholic Primary Principals Association
Chris Leeson
QLD Council of Parents and Citizens’ Association
Kevan Goodworth
QLD Curriculum and Assessment Authority
Chris Rider
Brian Short
QLD Department of Education
Tony Cook
Jim Cousins
Racquel Gibbons
Stacie Hansel
Chris Kinsella
Amanda O’Hara
Mick O’Leary
Lesley Robinson
Robyn Rosengrave
Pia St Clair
QLD Government
The Hon. Grace Grace, MP
QLD Independent Education Union
Dr Adele Schmidt
QLD Isolated Children's Parents' Association
Tammie Irons
QLD Secondary Principals’ Association
Mark Breckenridge
QLD Teachers’ Union
Cresta Richardson
NSW
Association of Independent Schools NSW
Geoff Newcombe
Robyn Yates
Australian Association of Special Education – NSW Chapter
Sally Howell
Catholic Schools NSW
Dallas McInerney
Danielle Cronin
Council of Catholic School Parents NSW/ACT
Peter Grace
English Teachers Association NSW
Karen Yager
Mel Dixon
Family Advocacy
Karen Tippett
Independent Education Union
Mark Northam
Pat Devery
Isolated Children’s Parents Association NSW
Claire Butler
Lifestart
Sue Becker
NSW Department of Education
Mark Scott
Leslie Loble
Lucy Lu
Rob Johnston
NSW Disability Council
Rachael Sowden
NSW Education Standards Authority
Paul Martin
Sofia Kesidou
NSW Federation of Parents and Citizens
Tim Spencer
NSW Government
The Hon. Sarah Mitchell, MLC
Meghan Senior
David Cross
NSW Maths Association
Karen McDaid
Maria Quigley
Darius Samojlowicz
NSW Parents Council
Teresa Rucinski
NSW Primary Principals’ Association
Bob Willetts
Scott Sanford
NSW Teachers Federation
Amber Flohm
Denis Fitzgerald
Maurie Mulheron
Professional Teachers Council NSW
David Browne
Secondary Principals’ Association NSW
Craig Petersen
Christine Del Gallo
SPELD NSW
Georgina Perry
Rhonda Filmer
VIC
Australian Education Union (Victorian Branch)
Meredith Peace
Justin Mullaly
Catholic Education Melbourne
Bruce Philips
Simon Lindsay
Children and Young People with a Disability
Maeve Kennedy
Council of Professional Teaching Associations of Victoria
Dr Deb Hull
Department of Education and Training Victoria
Jenny Atta
Katherine Whetton
Scott Widmer
David Howes
Gabi Burman
Connie Spinoso
Robert Mizzi
Independent Education Union Victoria and Tasmania
Cathy Hickey
Independent Schools Victoria
Helen Schiele
Sarah Tielman
Mathematical Association of Victoria
Peter Saffin
Parents Victoria
Gail McHardy
Leanne McCurdy
SPELD Victoria
Yasotha V
University of Melbourne
Sandra Milligan
VIC Association for the Teaching of English
Kate Gillespie
VIC Association of State Secondary Principals
Sue Bell
VIC Curriculum and Assessment Authority
Sharyn Donald
Claude Sgroi
VIC Government
The Hon. James Merlino MP
Noah Elrich
Claudia Laidlaw
VIC Principals’ Association
Anne-Maree Kliman
VIC Student Representative Council
Joe (Year 11)
Anna (Year 12)
Rielly (Year 11)
Sam (Year 10)
Astrid (Year 4)
Nina (Executive Officer)
National
Australian Council for Educational Research
Geoff Masters
Catherine McClellan
Dr. Ray Adams
Julian Fraillon
Australian Curriculum, Assessment and Reporting Authority
Peter Titmanis
Australian Institute for Teaching and School Leadership
Professor John Hattie
Australian Literacy Educators Association
Dr Jennifer Rennie
Eveline Gebhardt
Dr Xian-Zhi Soon
Dr Jessica Mantei
Brightpath
Dr Sandy Heldsinger
Learning Difficulties Australia
Dr Lorraine Hammond
Learning Progressions and Online Formative Assessment Initiative
Dr Jenny Donovan
MultiLit
Dr Jennifer Buckingham
Dr Molly de Lemos
Dr Robyn Wheldall
Parents for ADHD Advocacy Australia
Rimmelle Freedman
Alex Yourakelis
Phil Lambert Consulting
Dr Phil Lambert
Practitioners’ Reference Group
Sue Bambling
Kylie Baxter
Paul Bennett
Dale Cain
Gareth Erskine
Denis Fitzgerald
Liam Holcombe
Seir Holley
Steven Kolber
Megan Krimmer
Heidi Livermore
Isaac Lo
Julie Ross
Tonia Smerdon
Greg Terrell
Lina Vigliotta
Sophia Williams
Primary English Teaching Association Australia
Dr Pauline Jones
Robyn Cox
Megan Edwards
University of Western Australia
Dr Stephen Humphry
David Andrich
International
UNSW Gonski Institute
Prof. Pasi Sahlberg
NZ Council for Educational Research
Charles Darr
NZ National Commission for United Nations Educational, Scientific and Cultural Organisation
Robyn Baker
University of Glasgow
Prof. Louise Hayward
Appendix 4: International practice in standardised
writing assessment
International testing of writing
A scan of international practices of standardised writing assessment was undertaken
to inform this review. The scan included those countries where census testing has been
undertaken at the national level, other large-scale standardised assessments of writing
implemented through states or provinces, and those countries or states (provinces) that
used sampling methodology and opt-in strategies (Table 26).
The scan identified seven countries (Australia, Denmark, Hong Kong, New Zealand, Norway,
Singapore, and the United States) that have implemented national large-scale assessments
of student writing. Additionally, Canada has implemented census testing in two provinces,
Ontario and Manitoba. Australia, Denmark⁶, Hong Kong, and Singapore are identified as
the only countries implementing census testing of writing at the national level. Scotland’s
implementation of standardised testing in 2017 to 2018 has been excluded as Scotland’s
writing assessment does not include open-ended writing. A further recent testing
initiative, the inter-country large-scale assessment of writing known as the Southeast Asia
Primary Learning Metrics (SEA-PLM), has been included due to the distinctive nature of
the assessment.
The scan took as its focus the following – the purpose of the writing assessment; the
methodology as either census or sample testing; the forms of writing selected and the
scope of the testing (extended writing, spelling, conventions, punctuation); the prompts
and modes of assessment; the test conditions for completing the assessment, including
time duration and completion online or handwritten; the function of criteria scoring;
scoring (human judgement and machine marking); moderation as part of quality assurance
processes; and reporting of results from the writing tests (see Figure 12). These feature areas are
relevant to a priority area that emerged during this review of NAPLAN, namely the role of the
profession in standardised testing of writing.
Australia
Australia’s National Assessment Program – Literacy and Numeracy (NAPLAN) is an annual
assessment for students in Years 3, 5, 7 and 9, that ‘tests the fundamental disciplines of
literacy and numeracy’ (ACARA, 2017, p. 1). NAPLAN tests skills in the following four areas (or
‘domains’) – reading, writing, language conventions (spelling, grammar and punctuation)
and numeracy. NAPLAN has been undertaken nationally since 2008. The NAPLAN writing
assessment aligns with the Australian English curriculum and includes the types of texts
that are essential for students to master if they are to be ‘successful learners, confident
and creative individuals, and active and informed citizens’ (MYCEETYA, 2008a, p. 7). The
underlying construct of NAPLAN writing assessment tasks is independent of year level;
6
The census testing of writing in Denmark is undertaken in public schools only with schools in other sectors able to opt-in.
NAPLAN prompts span two year levels, Years 3 and 5, and Years 7 and 9 respectively; the
design of NAPLAN marking rubrics applies to all prompts, irrespective of year level, with
the reporting of the results on a single scale for all students in Years 3 to 9. There are ten
criteria (ACARA, 2017) that students’ writing is assessed against, divided into genre-based or
authorial criteria and technical or grammatical criteria.
Figure 12: Categories of validity evidence in large-scale assessment of writing
The assessment of writing focuses on two genres (Persuasive and Narrative). One is selected
each year for assessment with two prompts chosen, one for students in Years 3 and 5 and
another chosen for students in Years 7 and 9. Marking is conducted externally by current
and former teachers, but the writing assessment is marked independently across the six states
and two territories, with centres monitored by the appointed members or the Marker
Quality Team. Teachers have a limited role in the system; they do not contribute to selecting
writing assessment prompts or participate in marking students’ scripts unless they apply
to mark NAPLAN external to their teaching roles. Results from student writing are provided
to all schools, students and parents/carers, although there is limited understanding of how
much the data are used for informing teaching and learning (Hardy, 2014).
While the intent of Australia’s NAPLAN is not to determine the pathway for secondary
schooling, the testing in Years 3, 5, 7 and 9 is part of a broader goal in a ‘national approach
to setting educational expectations’ and to provide ‘a national consistent measure to
determine whether or not students are meeting important educational outcomes’
(ACARA, 2020a, unpaginated).
In Australia, school results are currently published online on the platform My School
(ACARA, 2020c). This platform allows comparison of schools, with reported concerns
including the resultant narrowing of the curriculum (Comber, 2012), scrutiny of principals
and teachers (Hardy, 2014; Thompson, 2013), and judgement of school performance by the media
and the community (Dulfer et al., 2012; Gorur, 2016).
Canada: Ontario
The Canadian province of Ontario conducts yearly, large-scale, census testing in Years 3, 6, 9
and 10 in reading, writing and maths. Assessment of writing occurs in Year 3 (primary),
Year 6 (junior) and Year 10. Students in Years 3 and 6 complete the Education Quality and
Accountability Office (EQAO) elementary assessment that tests students with extended
response questions. These are marked against a holistic criterion that focuses on topic
development and conventions. EQAO results are reported at the provincial, school board and
school levels and are used by the Ministry of Education, district school boards and schools to
improve learning, teaching and student achievement (EQAO, 2007).
Year 10 students complete the Ontario Secondary School Literacy Test (OSSLT), which is a
provincial test of literacy (reading and writing) skills students have acquired by Grade 10.
It is based on the literacy skills expected in the Ontario Curriculum across all subject areas
up to the end of Grade 9. The writing assessment involves two extended writing tasks
and two short-writing tasks (six lines each). EQAO reports on student achievement at the
individual, school, board and provincial levels. Students who participate in the OSSLT receive
an Individual Student Report that indicates whether they have successfully completed the
OSSLT. Schools and boards will also receive a report that provides aggregated achievement
results, aggregated contextual data about students’ literacy preferences and practices and
provincial results (EQAO, 2020).
EQAO recruits as many teacher-markers (that is, members of the Ontario College of
Teachers) as possible and fills the complement with retired educators and qualified non-educators (defined as other-degree markers). All potential scorers must pass a qualifying
test to ensure they have sufficient proficiency in English or French. A blind scoring model is
used with two markers scoring the scripts.
If the two scores are in exact agreement, that score is assigned to the student. If the two
scores are adjacent, the higher score (for reading and short-writing tasks) or the average
of the two scores (for news reports and paragraphs expressing an opinion) is assigned to
the student. If the two scores are non-adjacent, the response is scored again by an expert
scorer to determine the correct score for the student (EQAO, 2020, p. 15).
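The double-marking rule quoted above can be read as a small decision procedure. The following is a minimal sketch only, not EQAO's implementation: the function name, the task-type labels, and the assumption that scores are whole-number codes (so that "adjacent" means differing by exactly one) are all illustrative choices introduced here.

```python
from typing import Optional

# Hypothetical task-type labels; the source names the task families only in prose.
HIGHER_OF_TWO = {"reading", "short_writing"}
AVERAGE_OF_TWO = {"news_report", "opinion_paragraph"}


def resolve_score(first: int, second: int, task_type: str,
                  expert: Optional[int] = None) -> float:
    """Combine two blind scores following the rule quoted from EQAO (2020, p. 15).

    Assumes integer score codes, with 'adjacent' meaning a difference of one.
    """
    gap = abs(first - second)
    if gap == 0:
        # Exact agreement: the shared score is assigned.
        return float(first)
    if gap == 1:
        # Adjacent scores: the rule depends on the type of task.
        if task_type in HIGHER_OF_TWO:
            return float(max(first, second))
        if task_type in AVERAGE_OF_TWO:
            return (first + second) / 2
        raise ValueError(f"unknown task type: {task_type!r}")
    # Non-adjacent scores: an expert scorer determines the result.
    if expert is None:
        raise ValueError("non-adjacent scores require an expert scorer's rating")
    return float(expert)


if __name__ == "__main__":
    print(resolve_score(3, 3, "reading"))
    print(resolve_score(3, 4, "short_writing"))
    print(resolve_score(3, 4, "news_report"))
    print(resolve_score(2, 5, "news_report", expert=4))
```

In this sketch the expert rating replaces both original scores when they are non-adjacent, which is one reading of 'scored again by an expert scorer to determine the correct score'.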
Canada: Manitoba
The Manitoba Education Department takes the position that the primary role of assessment
is to ‘enhance teaching and improve student learning and supports this through the
Provincial Assessment Initiative and the Provincial Assessment Program’ (Manitoba, 2020a,
unpaginated). The primary purpose of the Middle Years Assessment policy is to enhance
student learning and engagement through classroom-based assessment processes that
build student awareness and confidence in learning.
Manitoba students in Grade 8 undergo classroom-based assessments in writing. Teachers
base assessments of their students on their observations, conversations with students, and
their evaluations of students’ classroom-based work. They report on student performance
levels as of the last two weeks of January. Evaluation criteria, including the competencies
and scoring scales with descriptions and examples, are provided by the department and
are used by teachers when reporting achievement results for these assessments to parents/
carers and to the department (Manitoba Education Department, 2020b). A summary report
is published on the department’s website as aggregated results (Manitoba Education and
Training, 2020c).
While there is a national test in Canada called the Pan-Canadian Assessment Program, it is a
sample test and does not assess the domain of writing.
Denmark
Danish National Tests were implemented in the public compulsory schools as a means
of evaluating the performance of the public-school system (Folkeskole) in 2010. Census
assessment of Year 9 Folkeskole students is not compulsory for students in private schools
and these schools may opt in. Denmark’s extensive test program consists of ten mandatory
tests in six subjects in grades 2 through to grade 8, although the assessment of writing is
not included in these national tests.
At the conclusion of Years 9 and 10, Folkeskole students complete school-leaving
examinations, which are compulsory in Year 9 but voluntary in Year 10. Of the five
compulsory examinations, one is a written examination in Danish. The written task can take
the form of literary fiction, journalistic genres or an essay. Text materials form a part of the
assessment prompt to provide inspiration for student writing (Krogh, 2018). Examinations
are graded by the classroom teacher and an external censor on a 7-point ordinal scale: -3, 00,
02, 4, 7, 10, and 12, with marks 02 or above considered a pass (Beuchert & Nandrup, 2015).
In conjunction with the leaving examinations, a mandatory interdisciplinary project
assessment needs to be submitted and is assessed ‘in a written statement on the content,
working process and presentation of the final result’ (Ministry of Children and Education,
2020, unpaginated). At the student’s request, a mark may be awarded and be indicated in
the leaving certificate.
According to an OECD Review of the Evaluation and Assessment in Education (2011),
the introduction of the national tests offered monitoring information on the Folkeskole
at different stages in compulsory education and provided ‘the first real opportunity to
reliably monitor progress in educational outcomes over time against the national Common
Objectives. However, the lack of inclusion of the private sector limits their national
monitoring value’ (Shewbridge, Jang, Matthews, & Santiago, 2011, p. 9). In this review it was
also noted that there is a need for the national tests to include open-ended questions in
Years 2 to 8.
During the lower secondary leaving examination, ‘small booklets
containing assignments and various text materials are made available to students’ (Krogh,
2018, p. 10) as support material for the completion of the writing assessment. Visuals are
typically embedded with the prompt and provided as ‘inspiration for student writing, only
rarely to be addressed explicitly’ (Krogh, 2018, p. 10).
Marking is conducted by teachers and ‘a sample of examinations are marked by an
external censor [who] provides an equitable way to judge whether students have achieved
the national Common Objectives’ (Shewbridge et al., 2011, p. 51). The secondary leaving
examination is aligned to the National Common Objectives and four reports are customised
for different stakeholders. Students, parents/carers and teachers are provided with students’
test results, school leaders are provided with their school’s test results, the Municipality
receives the average score of the schools in the municipality, and at the national level, the
national average test result for all schools is published and made available to the public
(Houlberg et al., 2016).
Hong Kong
The Territory-wide System Assessment (TSA) is administered by the Hong Kong Examination
and Assessment Authority. The intent of the assessment is to provide schools with
information about the performance of students in primary 3 (P.3), primary 6 (P.6) and
secondary 3 (S.3) including strengths and weaknesses against specific Basic Competencies
and ‘to help schools understand students’ overall academic standards in the main key
learning areas and as a reference for the follow-up action of learning and teaching’ (Hong
Kong Examinations and Assessment Authority, 2019, p. 1). A related intent is to ‘help the
Government to review policies and to provide focused support to schools’ (Hong Kong
Examinations and Assessment Authority, 2020, unpaginated). The TSA began as a national
census test for students in Years 3, 6 and 9; however, starting from 2012, an alternate-year
arrangement has been adopted in the Year 6 assessment, with Year 3 students assessed
using a sampling method. The assessment is low-stakes and does not determine secondary
placements or pathways. The writing assessment design focuses on letter, narrative or
descriptive writing with the criteria for Years 3 and 6 focusing on Content and Language
with Year 9 including two extra criteria, Organisation and Features.
Support is offered to teachers through an online platform (web-based learning and teaching
support) which provides teaching activities and materials for addressing students’ ‘relevant
learning difficulty in Basic Competencies’ (Hong Kong Examinations and Assessment
Authority, 2019, p. 1). Schools are also encouraged to make use of the data to adjust teaching
plans and teaching strategies.
Moderation Committees are formed, which consist of serving teachers or school heads, a
professional staff member of a tertiary institute, and subject officers and managers from the
Education Bureau. Students receive randomly allocated writing assessment prompts with
teachers playing no role in prompt selection. Markers for the TSA are all qualified serving
teachers with a requirement of an attainment of the ‘Language Proficiency Assessment for
Teachers in English’ (Hong Kong Examinations and Assessment Authority, 2019) before being
employed as a marker. Extensive training precedes the scoring of the writing assessments.
The school reports
provide detailed data on the performance in the sub-papers for individual learning
dimensions (skills) of individual subjects as well as data at the territory-wide level for
reference to help schools identify the overall strengths and weaknesses of students in
learning (Hong Kong Examinations and Assessment Authority, 2019, p. 15).
The intent is for schools to use ‘the relevant data to adjust their school-based curriculum,
teaching strategies and activities’ (p. 15). The performance of individual students is
not included in all reports, ‘which are strictly confidential and provided for schools’
reference only’ (p. 15).
New Zealand
The New Zealand National Monitoring Study of Student Achievement (NMSSA) is a national
sample test that assesses eight learning areas specified in the New Zealand Curriculum. One
test of writing was conducted in 2012, which specifically targeted Years 4 and 8 students. The
NMSSA was aligned to the Literacy Learning Progressions and English Curriculum, and while the
NMSSA only tested writing in 2012, a National Report was published. The mode of delivery
was pencil and paper and the task focused on the narrative genre. The writing assessment
was based on the Electronic Assessment Tools for Teaching and Learning (e-asTTle)
framework with external markers trained to rate student writing.
The testing of writing as part of NMSSA has not been undertaken since 2012. Instead the
New Zealand Ministry of Education provides an optional assessment for students called
e-asTTle, an online assessment tool developed to assess students’ achievement and progress
in reading, mathematics, writing, and in pānui, pāngarau, and tuhituhi. The writing tool
has been developed to assess students in Years 1 to 10 and can be used formatively at any
time during the year as determined by the teacher. Further, teachers mark the writing
assessments with support from published manuals and the data is used as part of the
teachers’ wider collection of information about students’ learning and progress.
New Zealand provides e-asTTle free of cost to all schools. The flexibility of the online
assessment provides autonomy for teachers, enabling them to make decisions regarding
the timing of the assessment and the selection of appropriate writing prompts for students. It also
engages their professional judgement in scoring student scripts through the provision of
support material, including manuals and exemplars, to mark student writing. e-asTTle is aligned
to the New Zealand Curriculum Reading and Writing Standards for Years 1 to 8. The results
are reported through the platform and professional support is provided to teachers via
extensive and relevant resources for addressing and raising student achievement. These
resources are also available to parents/carers, school managers and trustees (e-asTTle, 2020).
Norway
National census tests in Norway were launched initially in 2005 in response to concerns
that students were not receiving adequate instruction in ‘key competencies’ (Official
Norwegian Report (green paper): NOU 2002). However, an evaluation of the writing test
found low rater reliability resulting in the discontinuation of the writing test in 2006 (Skar,
2017). Instead, the Norwegian Directorate for Education and Training commissioned the
National Writing Centre (NWS) to develop the Norwegian Sample-Based Writing Test
(NSBWT), an annual writing test on a nationally representative sample of students in primary
and lower secondary school. The writing assessment design created the option of three
genres (persuasion, description and imagination) and was marked against five rating scales.
A sample of teachers was also asked to complete a survey relating to student writing as part
of the report of the writing sample results.
The intent of the project was to ‘set up a national panel of raters (NPR), consisting of
teachers, with the purpose of 1) establishing a strong interpretive community and 2) having
in place a panel that would reliably rate the NSBWT’ (Skar & Jølle, 2017, p. 1). The tasks and
rating scales of the NSBWT were represented in the theoretical model called the wheel of
writing, a theoretical frame for specifying standards of writing proficiency for the teaching
profession (Berge, Evensen, & Thygesen, 2016). The NSBWT defined writing proficiency as
‘the proficiency to engage in an act of writing using necessary mediating tools’ (Skar, 2017,
p. 5). Fulfilling the intent of the project involved working with teachers as raters and having
teachers complete survey questions relating to the relevance of the writing assessment
task, student motivation to complete the task, and the sufficiency of time for task completion (Skar,
2017). Results from marking the sample of student scripts and data from the survey were
published in a National Technical Report.
While the Norwegian Centre for Writing Education and Research (The Writing Centre) is still
in operation to support teacher learning in writing education, the NSBWT was discontinued
in 2017 due to state budget cuts (Jeffery, Elf, Skar & Campbell Wilcox, 2019).
Singapore
Singapore’s Primary School Leaving Examination (PSLE) is a national census, high-stakes,
large-scale assessment that is used to ‘gauge students’ readiness and aptitude to proceed
to higher level of schooling – either to select or place students appropriately’ (SEAMEO
INNOTECH, 2015, p. 133). The PSLE writing assessment is divided into two types of writing
defined as ‘situational’ and ‘continuous’. The situational writing requires students to write a
short functional piece that could be either a letter, email or report. The continuous writing
requires students to write an extended response on a given topic and multi-modal options
may be used as prompts. Writing tests span multiple languages, including English and
other mother tongue languages, for example Bengali, Gujarati, Hindi, Panjabi and Urdu.
The writing assessment is designed by ‘a professional panel of specialists with assessment
and subject expertise’ (Singapore Examinations and Assessment Board, 2020, unpaginated)
and is aligned to the Primary School Curriculum.
The results from the PSLE contribute to decisions about a student’s secondary school
pathway. On completion of the tests, and dependent on the results, students will progress
along one of three options – an express course, a normal academic course, or a normal
technical course (SEAMEO INNOTECH, 2015). The PSLE aligns with what Verger, Parcerisa
and Fondevila (2019) define as one of three uses or purposes of national assessments, namely
the ‘Assessment for students’ certification, streaming and selection purposes [and are]
standardised examinations that are high stakes for students, but not necessarily for schools’
(p. 10). As Singapore’s PSLE is designed for streaming and selection purposes for secondary
pathways, the PSLE could be seen as high-stakes for students.
Key Marking Personnel (KMP), made up of Principals, Vice-Principals and Heads of
Department, engage in moderation practices of the marking scheme and application of the
scores to the scripts. Once consensus has been reached regarding the application of criteria
to writing scripts by the KMP, markers begin the process of scoring student scripts. KMP at
various levels sample check the marking to ensure the accuracy and consistent application
of the criteria to the student scripts.
The Ministry of Education and the Singapore Examinations and Assessment Board present
a joint press release on student results that indicates the total number of primary students
involved as well as percentages of students progressing into the Express course, the Normal
(Academic) course, and the Normal (Technical) course.
United States of America
In the United States, the National Assessment of Educational Progress or NAEP is the
‘largest nationally representative and continuing assessment’ (National Center for Education
Statistics (NCES), 2019, p. 1) of student achievement in civics, economics, geography,
mathematics, music and visual arts, reading, science, technology and engineering
literacy, U.S. history, and writing. The assessment is ‘a congressionally mandated project
administered by the NCES within the U.S. Department of Education and the Institute of
Education Sciences (IES)’ (NCES, 2019, p. 2). In the domain of writing, the assessment
involves a representative sample of students across the country, with the test including
Years 8 and 12 students nationally in 2011. Further testing of students in Years 4 and 8 in 2017
was conducted online, and the report targeted for release in 2020 is expected to provide insight
into the future design and administration of digitally based NAEP writing assessments
(NCES, 2017). The next NAEP writing assessment is scheduled for 2029 (NCES, 2020a). The
NAEP is aligned to the Common Core State Standards.
The writing assessment involves a choice of two genres, namely to explain or to persuade.
The assessment is online, and prompts are presented to students in a variety
of ways including text, audio, photographs, video or animation. Standard tools for editing,
formatting and viewing writing are included as part of the online platform as is the ability
to use spell check and a thesaurus. Marking is centralised and 25% of scripts are
double-marked and 5% are check marked (NCES, 2020b). Results from the NAEP are published in
the form of a National report.
In addition to the US NAEP testing, individual states may opt to implement state-based
tests of writing. As an example, two consortia that test writing across US states are the
Smarter Balanced Summative consortium and the Partnership for Assessment of
Readiness for College and Careers (PARCC) consortium. Both provide writing tests to
member states across multiple stages of schooling and multiple text types. The tests are
designed to function as summative assessments of student writing skills, linking to the US
Common Core State Standards, and can be implemented annually according to each state’s
testing window.
Another example of state-based implementation of large-scale assessment of writing in the
US is the New York State English Language Arts tests (ELA). This is an annual census test for
all students in Years 3 to 8. The writing test design asks students to complete both
short-response writing and an extended response. Extended response questions are designed to
assess writing from sources and ask students to express a position and support it with
textual evidence. The intent of ELA is ‘to ensure that schools prepare
students with the knowledge and skills they need to succeed in college and in their careers’
(New York State Education Department, 2020, p. 1). Results from the assessment are used to
assess whether students are meeting the New York State P-12 Learning Standards.
Inter-country large-scale standardised assessment
Lastly, in 2012, the Southeast Asian Ministers of Education Organization (SEAMEO) and UNICEF
initiated the Southeast Asia Primary Learning Metrics (SEA-PLM) in an effort to assess and
monitor students’ acquisition of knowledge and skills and to further improve the quality of
primary education in Southeast Asia. This distinctive inter-country, large-scale assessment
of writing aims to inform policy makers in the participating countries of the progress of
educational development in their respective countries. SEA-PLM assesses mathematical
literacy, reading literacy, writing literacy and global citizenship.
Of the eleven countries that are involved in the program, only six countries (Myanmar,
Vietnam, Lao PDR, Cambodia, Malaysia, Philippines) implement the writing literacy
assessment component of the suite of assessments. The sample-based assessment targets
Year 5 students and utilises the genres of narrative, description, persuasion, instructional
and transactional as part of the assessment. The program is designed as a sample-based
assessment, with the intent of generating information to assist other stakeholders,
such as teachers, parents/carers and students, in improving learning at the local level
(UNICEF & SEAMEO, 2019a).
The assessment reflects a global policy priority of increasing access to education for children
and the use of data to monitor progress towards national targets for improvement (UNESCO,
2019). As part of SEA-PLM, the cross-national sample writing assessment ‘marks the first
cross-national initiative to measure writing literacy understood as the ability to construct
meaning by generating a range of written texts to express oneself and communicate
with others to meet personal, societal, economic and civic needs’ (UNESCO, 2019, p. 42).
The SEA-PLM draws on the curricula of six countries (mentioned above) to develop the
assessment framework.
The extended writing assessment criteria had the challenge of measuring writing in a
multilingual assessment and achieving equivalence across languages. To work through
this challenge,
‘the SEA-PLM writing literacy assessment model treats some writing processes as
common across languages, while others may be treated as applicable only to one
language or to a group of languages. This approach will yield some comparisons
between writing performance in different languages, while recognising the particular
characteristics of individual languages’ (UNICEF & SEAMEO, 2019a, p. 41).
The Australian Council for Educational Research (ACER) was contracted to design and
implement the first round of SEA-PLM assessment. Information on how students performed and on
the success of the testing is yet to be released, with the results from the first SEA-PLM due in 2020.
The criteria are used in conjunction with numeric scores, but a degree of flexibility was needed
to achieve equivalence across languages. An example of the model for
writing assessment processes by language and text type is shown in Table 25 (UNICEF &
SEAMEO, 2019a, p. 41).
Table 25: Model for assessment of writing in multiple languages
Process | Application by language | Application by type of text
Generate ideas | Apply across languages | Vary by text type
Control structure | Apply across languages | Vary by text type
Manage coherence | Apply across languages | Apply across text types
Use vocabulary | Apply across languages | Apply across text types
Control syntax and grammar | May vary by language | Apply across text types
Other language-specific features | May vary by language | Apply across text types
Data standards have been implemented: to ensure the ‘comparability of data across each
of the participating countries, the responses from all test participants should be coded
following a single coding scheme’ (UNICEF & SEAMEO, 2019b, p. 21), and ‘coders’ are recruited
and trained to adhere to ‘agreed procedures’. While the intent is for the results to be used
by policy makers and to help teachers inform practice, the role of the teacher in the system
appears to be removed and replaced by external contractual arrangements. Subject to
conditions of implementation, expectations of data use, and the engagement of the
profession, the role of teachers has the potential to grow.
Summary
This section has described international practice in the assessment of writing in seven
countries. Of these, Australia, Hong Kong, Denmark and Singapore currently implement
national census testing of writing.
How a country tests writing reflects interrelated decisions about: the purposes of testing;
the stages of schooling to be included; the curriculum domains and related forms of writing
to be tested; how the writing is to be scored (including criteria, judgement method, human
and machine scoring); the role of the profession; quality assurance processes including
online moderation; the intended uses of the reported results; and to whom and how they are
released. All these matters are central to a decision about whether a test is fit-for-purpose.
Why countries implement or do not implement national census testing of writing and their
approach to implementation merit further exploration. Finally, the scan shows growing
interest internationally in the role of the profession in national testing, including for system
monitoring purposes, and the role of teacher judgement in scoring and interpreting the
results for use in the classroom.
Table 26: International large-scale assessment of writing

Country: Australia: National Assessment Program Literacy and Numeracy (NAPLAN)
National or state-based: National
Sample or Census: Census
Stage of schooling: 3, 5, 7, 9
Writing genre:
• Persuasion
• Narrative
Criteria:
• Audience
• Text Structure
• Ideas
• Persuasive devices/Character and setting
• Vocabulary
• Cohesion
• Paragraphing
• Sentence structure
• Punctuation
• Spelling
Technology: Paper and pencil and online
Marking: Human markers

Country: Canada: Ontario EQAO Elementary Assessments
National or state-based: State
Sample or Census: Census
Stage of schooling: 3, 6
Writing genre:
Year 3
• Personal opinion
Year 6
• Letter
Criteria:
• Topic development [0-4]
• Conventions [0-3]
Technology: Paper
Marking: Human markers

Country: Canada: Ontario OSSLT
National or state-based: State
Sample or Census: Census
Stage of schooling: 10
Writing genre:
• News report (one page)
• Series of paragraphs expressing an opinion (two pages)
• Two short-writing tasks (six lines each)
Criteria:
• Clarity of communication
• Development of ideas
• Organisation
• Language conventions and usage
Technology: Paper
Marking: Human markers

Country: Canada: Manitoba
National or state-based: State
Sample or Census: Census
Stage of schooling: 8
Writing genre:
• Expository
• Classroom writing (summative results based on achievement as of the last two weeks of January)
Criteria:
• Ideas (generates, selects and organises)
• Language (word choice and sentence patterns)
• Conventions (spelling, grammar, and/or punctuation)
• Resources (spell-checker, thesaurus, dictionaries) to edit and proofread
Technology: Online (using spell check, thesauruses, and dictionaries) to edit and proofread
Marking: Teachers mark own classroom writing and submit scores

Country: Denmark: School-leaving examinations (SLE)
National or state-based: National (for Folkeskole schools)
Sample or Census: Census
Stage of schooling: 9, 10
Writing genre:
Two types of written assessment:
1. Written examination under exam conditions
– Literary fiction
– Journalistic genres
– Essay
Use of text materials, typically embedded in visuals.
2. A mandatory project assignment gives students the opportunity to complete and present
an interdisciplinary project. The project assignment is assessed in a written statement
on the content, working process and presentation of the final result.
Criteria:
Marks are awarded according to a 7-point marking scale
12: Excellent performance, high command, few minor weaknesses
10: Very good performance, high command, with minor weaknesses
7: Good performance, good command, some weaknesses
4: Fair performance, some command, major weaknesses
2: Meeting only the minimum requirements for acceptance
0: Does not meet the minimum requirements for acceptance
-3: Unacceptable in all respects
Technology: Paper
Marking: Teacher and external ‘censor’

Country: Hong Kong: Territory-wide System Assessment (TSA)
National or state-based: National
Sample or Census: Census (with the exemption of Year 3)
Stage of schooling: 3, 6, 9
Writing genre:
• Letter
• Narrative
• Description
Criteria:
Year 3 and 6
• Content (level of detail, structure, ideas and clarity)
• Language (e.g. vocabulary, sentence patterns, cohesive devices, grammar, punctuation,
capitalisation and spelling)
Secondary (Year 9) has two additional criteria
• Organisation (paragraphs, coherent links and connectives)
• Features (structure e.g. letter format, description and speech in narration)
Technology: Paper and pencil
Marking: Human markers

Country: Myanmar, Vietnam, Lao PDR, Cambodia, Malaysia, Philippines: Southeast Asia Primary
Learning Metrics (SEA-PLM)
National or state-based: Inter-Country
Sample or Census: Sample
Stage of schooling: 5
Writing genre:
• Narrative
• Descriptive
• Persuasive
• Instructional
• Transactional
Criteria:
• Generate ideas
• Control structure
• Manage coherence
• Use vocabulary
• Control syntax and grammar
• Other language-specific features (spelling, character formation and punctuation)
Technology: Paper and pencil
Marking: Human markers

Country: New Zealand: e-asTTle
National or state-based: National
Sample or Census: Optional
Stage of schooling: 1-10
Writing genre:
• Describe
• Explain
• Recount
• Narrate
• Persuade
Criteria:
• Ideas
• Structure and language
• Organisation
• Vocabulary
• Sentence structure
• Punctuation
• Spelling
Technology: Paper and pencil. Scripts are scored offline and marks entered into e-asTTle system.
Marking: Human markers (classroom teachers)

Country: New Zealand: National Monitoring Study of Student Achievement (NMSSA)
National or state-based: National
Sample or Census: Sample
Stage of schooling: 4, 8
Writing genre:
• Narrative
Criteria:
Writing for a variety of purposes
Based on the e-asTTle framework (see above).
Process of writing
This comprised seven elements:
• Audience awareness
• Planning
• Crafting/writing
• Revising and editing
• Proofreading
• Feedback
• Publishing
Technology: Paper and pencil, one to one interviews and questionnaire
Marking: Human markers

Country: Norway: Norwegian Sample-Based Writing Test (NSBWT) 2010-2016
National or state-based: National
Sample or Census: Sample
Stage of schooling: 5, 8
Writing genre:
• Persuade
• Describe
• Imagine
Criteria:
• Writer-reader interaction
• Content
• Text structure
• Language use
• Coding competencies (e.g. grammar, spelling and punctuation)
Technology: Paper and pencil
Marking: Human markers

Country: Singapore: Primary School Leaving Examination (PSLE)
National or state-based: National
Sample or Census: Census
Stage of schooling: 6
Writing genre:
Situational Writing:
• letter
• email
• report
Continuous Writing:
• Three pictures will be provided on the topic offering different angles of interpretation.
• Candidates may also come up with their own interpretation of the topic.
Criteria:
• AO1 write to suit purpose, audience and context in a way that is clear and effective
• AO2 use appropriate register and tone in a variety of texts
• AO3 generate and select relevant ideas, organising and expressing them in a coherent
and cohesive manner
• AO4 use correct grammar, spelling and punctuation
• AO5 use a variety of vocabulary appropriately, with clarity and precision
Technology: Paper and pencil
Marking: Human markers

Country: United States: PARCC (Partnership for Assessment of Readiness for College and Careers)
National or state-based: State
Sample or Census: Optional
Stage of schooling: K-11
Writing genre:
• Research Simulation Task (RST)
• Literary Analysis Task (LAT)
• Narrative Writing Task (NWT)
Criteria:
• Development of ideas
• Organisation
• Clarity of language
• Knowledge of language and conventions
Technology: Online
Marking: External human markers

Country: United States: New York English Language Arts (ELA) test
National or state-based: State
Sample or Census: Census
Stage of schooling: 3-8
Writing genre:
• Text provided: Short answers
• Text provided: Extended Response
Criteria:
Short response (2 point) Holistic Rubric
• Make a claim,
• Take a position or
• Draw a conclusion (complete sentences)
Extended response (4 point) Holistic Rubric
• Content and Analysis
• Command of Evidence
• Coherence, Organisation and Style
• Control of conventions
Technology: Option for computer or pen/paper
Marking: Human markers

Country: United States: NAEP 2017 US national sample assessment
National or state-based: National
Sample or Census: Sample
Stage of schooling: 8, 12
Writing genre:
• Explanation
• Persuasion
• Convey experience (real or imagined)
Criteria:
• Development of idea
• Organisation of ideas
• Language facility and conventions
Technology: Online (Tasks):
• Text
• Audio
• Photographs
• Video
• Animation
Marking: Human markers

References
Academic Assessment Services. (2020). Tracking student progress. Accessed on 29 June
2020 on http://www.academicassessment.com.au.
Australian Bureau of Statistics. (2020). 2001 Census of Population and Housing – Geographic
Areas. Accessed on 27 June 2020 on https://www.abs.gov.au/websitedbs/d3110124.
nsf/497f562f857fcc30ca256eb00001b48e/53bbe9630b24d6f4ca256c3a000475b8!Open
Document#Collection%2520District%2520(CD).
Australian Capital Territory, Budget Statements. (2019). Education Directorate. Canberra:
Author. https://apps.treasury.act.gov.au/__data/assets/pdf_file/0007/1369789/F-EducationDirectorate.pdf.
Australian Council for Educational Research (ACER). (2018). Scottish National Standardised
Assessments: national report for academic year 2017 to 2018. Edinburgh: Scottish
Government. Accessed on 16 June 2020 on https://www.gov.scot/publications/scottishnational-standardised-assessments-national-report-academic-year-2017-2018/pages/2/.
ACER. (2020a). Progressive achievement: tests, teaching resources and professional
learning. Accessed on 29 June 2020 on https://www.acer.org/au/pat.
ACER. (2020b). Scottish National Standardised Assessments: national report for academic
year 2018 to 2019. Edinburgh: Scottish Government. Accessed on 16 June 2020 on https://
www.gov.scot/publications/scottish-national-standardised-assessments-national-reportacademic-year-2018-2019/
ACARA (2010). Narrative Marking Manual. Retrieved from: https://www.nap.edu.au/_
resources/2010_Marking_Guide.pdf.
ACARA (2013a). Persuasive Marking Manual. Retrieved from: https://www.nap.edu.au/docs/
default-source/resources/2013_persuasive_writing_marking_guide.pdf
ACARA (2013b). Submission to the Senate Inquiry, The Effectiveness of the National
Assessment Program – Literacy and Numeracy. https://www.aph.gov.au/Parliamentary_
Business/Committees/Senate/Education_and_Employment/Naplan13.
ACARA (2017). The Australian National Assessment Program Literacy and Numeracy
(NAPLAN) assessment framework: NAPLAN Online 2017-2018. https://www.nap.edu.au/docs/
default-source/default-document-library/naplan-assessment-framework.pdf?sfvrsn=2
ACARA (2018a). About us. Retrieved from: https://www.acara.edu.au/about-us
ACARA (2018b). Colmar Brunton Report. Accessed on 1 July 2020 on https://acaraweb.
blob.core.windows.net/acaraweb/docs/default-source/assessment-and-reportingpublications/2018-naplan-online-parent-research.pdf?sfvrsn=2.
ACARA (2018c). International Comparative Study: The Australian Curriculum and The
British Columbia New Curriculum. Retrieved from: https://www.australiancurriculum.edu.au/
media/3923/ac-bcc-international-comparative-study-final.pdf
ACARA (2018d). International Comparative Study: The Australian Curriculum and
The Singapore Curriculum. Retrieved from: https://www.australiancurriculum.edu.au/
media/3924/ac-sc-international-comparative-study-final.pdf
ACARA (2018e). NAPLAN 2017 Technical report. Sydney: ACARA. Accessed on 6 July 2020 on
https://www.nap.edu.au/results-and-reports/national-reports.
ACARA (2018f). NAPLAN Online Automated Scoring Research Program: Research Report.
Retrieved from: https://www.nap.edu.au/docs/default-source/default-document-library/
naplan-online-aes-research-report-final.pdf?sfvrsn=0
ACARA. (2019a). Australian Curriculum English. Level Description. Retrieved from:
https://www.australiancurriculum.edu.au/f-10-curriculum/english/
ACARA (2019b). NAPLAN national report for 2019. Sydney: Author. Accessed on 25 June
2020 on https://nap.edu.au/docs/default-source/resources/naplan-2019-national-report.
pdf?sfvrsn=2.
ACARA (2020a). About us. Retrieved from: https://www.acara.edu.au/about-us
ACARA (2020b). Assessment: NAPLAN. Accessed on 27 June 2020 on https://www.acara.edu.
au/assessment/naplan.
ACARA (2020c). My School. Retrieved from: https://myschool.edu.au/
ACARA (2020d). NAPLAN. https://www.nap.edu.au/naplan.
ACARA (2020e). NAPLAN 2019 Technical report. Sydney: ACARA. Accessed on 27 June 2020
on https://www.nap.edu.au/results-and-reports/national-reports.
ACARA (2020f). NAPLAN – adjustments for students with disability. Accessed on 27 June
2020 on https://www.nap.edu.au/naplan/school-support/adjustments-for-students-withdisability.
ACARA (2020g). NAPLAN – Australian Curriculum. Accessed on 29 June 2020 on
https://www.nap.edu.au/naplan/australian-curriculum.
ACARA (2020h). NAPLAN Online. Accessed on 29 June 2020 on https://www.nap.edu.au/
online-assessment.
ACARA (2020i). NAPLAN – participation. Accessed on 27 June 2020 on https://www.nap.edu.
au/information/faqs/naplan--participation.
ACARA (2020j). NAPLAN Reports. Retrieved from: https://reports.acara.edu.au/Home/Results
ACARA (2020k). Student reports. Accessed on 27 July 2020 on https://nap.edu.au/results-andreports/student-reports
ACARA (2020l). Terms of Reference – Review of The Australian Curriculum F-10. Retrieved
from: https://www.acara.edu.au/docs/default-source/curriculum/ac-review_terms-ofreference_website.pdf
Angus, M., Olney, H. & Ainley, J. (2007). In the balance: the future of Australia’s primary
schools. Kaleen, ACT: Australian Primary Principals Association. Accessed on 6 July 2020 on
https://appa.asn.au/wp-content/uploads/2020/05/In-the-balance.pdf
Australia, Department of Education, Skills and Employment. (2010). Media release: My School
website launched. https://ministers.dese.gov.au/gillard/my-school-website-launched.
Australian Education Council. (1989). The Hobart Declaration on Schooling. Author. http://
www.educationcouncil.edu.au/EC-Publications/EC-Publications-archive/EC-The-HobartDeclaration-on-Schooling-1989.aspx.
Australia, Parliament. (2008). Bills Digest 60, 2008-09 Australian Curriculum, Assessment and
Reporting Authority Bill. https://www.aph.gov.au/Parliamentary_Business/Bills_Legislation/
bd/bd0809/09bd060.
Berge, K., Evensen, L.S., & Thygesen, R. (2016). The Wheel of Writing: a model of the writing
domain for the teaching and assessing of writing as a key competency, The Curriculum
Journal, 27:2, 172-189, DOI: 10.1080/09585176.2015.1129980
Beuchert, L.V., & Nandrup, A.B. (2015). The Danish National Tests – A Practical Guide.
Economic Working Papers 2014-2025. Retrieved from: https://pdfs.semanticscholar.org/
fc01/9a5636c68d3accd776820fe2638c599cac58.pdf
Bridgeman, B., & Ramineni, C. (2017). Design and evaluation of automated writing evaluation
models: Relationships with writing in naturalistic settings. Assessing Writing (34): 62–71.
Center on International Education Benchmarking (CIEB). (2020). Japan: learning systems.
Accessed on 15 June 2020. https://ncee.org/what-we-do/center-on-international-educationbenchmarking/top-performing-countries/japan-overview/japan-instructional-systems/.
Collins, S. (2017). Government confirms primary schools to scrap National Standards. New
Zealand Herald. Accessed on 23 June 2020 on https://www.nzherald.co.nz/nz/news/article.
cfm?c_id=1&objectid=11958067.
Comber, B. (2012). Mandated literacy assessment and the reorganisation of teachers’ work:
Federal policy, local effects. Critical Studies in Education, 53(2), 119-136. http://doi.org/10.1080/17508487.2012.672331
Council of Australian Governments. (2018). National School Reform Agreement. https://docs.
education.gov.au/system/files/doc/other/national_school_reform_agreement_8.pdf
Curriculum Corporation. (2005). Statements of Learning for English. Carlton South:
Curriculum Corporation. Accessed on 29 June 2020 on http://www.curriculum.edu.au/verve/_
resources/SOL_English_Copyright_update2008_file.pdf
Curriculum Corporation. (2006). Statements of Learning for Mathematics. Carlton South:
Curriculum Corporation. Accessed on 29 June 2020 on http://www.curriculum.edu.au/verve/_
resources/SOL_Mathematics_2006.pdf
Delandshere, G., & Petrosky, A.R. (1998). Assessment of Complex Performance: Limitations of
Key Measurement Assumptions. Educational Researcher, 27(2), 14-24.
Dulfer, N., Polesel, J., & Rice, S. (2012). The experience of education: The impacts of high
stakes testing on school students and their families. An Educator’s Perspective. University of
Western Sydney: Whitlam Institute.
Education Council. (2014). Communiqué. 31 October 2014. http://www.educationcouncil.
edu.au/site/DefaultSite/filesystem/documents/Communiques%20and%20Media%20
Releases/2014%20Communiques/Education%20Council%2031%20October%20Communique.
pdf
Education Council. (2015). Communiqué. Fifth Education Council meeting, 29 May 2015,
Brisbane. http://www.educationcouncil.edu.au/site/DefaultSite/filesystem/documents/EC%20
Communiques%20and%20media%20releases/Education%20Council%2029%20May%20
2015%20-%20Communique.pdf.
Education Council. (2016). Media release: My School updated for 2016. http://www.
educationcouncil.edu.au/site/DefaultSite/filesystem/documents/Key%20Documents/
Education%20Council%20media%20release%20-%20My%20School%20updated%20for%20
2016.pdf.
Education Council. (2018). Communiqué. 13th April 2018. Adelaide. http://www.
educationcouncil.edu.au/site/DefaultSite/filesystem/documents/Communiques%20and%20
Media%20Releases/2018%20Media%20Releases/Education%20Council%20Communique%20
13%20April%202018.pdf
Education Council. (2019a). Alice Springs (Mparntwe) Declaration. Author. http://www.
educationcouncil.edu.au/Alice-Springs--Mparntwe--Education-Declaration.aspx.
Education Council. (2019b). Communiqué. 28 June 2019, Melbourne. http://www.
educationcouncil.edu.au/site/DefaultSite/filesystem/documents/Communiques%20and%20
Media%20Releases/2019%20media%20releases/Education%20Council%20Communique%20
28%20June%202019%20final.pdf.
Education Council. (2019c). Communiqué. 12 December 2019, Alice Springs. http://www.
educationcouncil.edu.au/site/DefaultSite/filesystem/documents/EC%20Communiques%20
and%20media%20releases/Education%20Council%20Communique%20-%2012%20
December%202019.pdf.
Education Quality and Accountability Office. (2007). Framework Ontario Secondary School
Literacy Test. December Edition. Ontario, Canada.
Education Quality and Accountability Office. (2020). EQAO’s Technical Report for the 2017–
2018 Assessments. Toronto, Canada.
Education Services, NZ Ministry of Education. (2020). PaCT information. Accessed on 23 June
2020 on https://services.education.govt.nz/schools/pact/information/.
Electronic Assessment Tools for Teaching and Learning (e-asTTle). (2020). E-asTTle basics.
Retrieved from: https://e-asttle.tki.org.nz/About-e-asTTle/Basics
Eliot, N., Ruggles Gere, A., Gibson, G., Toth, C., Whithaus, C. & Presswood, A. (2013). Uses
and Limitations of Automated Writing Evaluation Software. Council of Writing Program
Administrators ComPile Research Bibliographies, No. 23.
Finnish Education Evaluation Centre (FINEEC). (2020). Learning outcomes evaluations.
Accessed on 24 June 2020 on https://karvi.fi/en/.
GL Assessment. (2020). York Assessment of Reading for Comprehension. Accessed on 29
June 2020 on https://www.gl-assessment.co.uk/support/yarc-support/australia.
Gorur, R. (2016). The performative politics of NAPLAN and My School. In B. Lingard, G.
Thompson, & S. Sellar (Eds.), National testing in schools: An Australian assessment (1st ed.,
30-43). Oxon: Routledge.
Grønmo, L., Lindquist, M., Arora, A., & Mullis, I. (2013). TIMSS 2015 Mathematics Framework.
https://timssandpirls.bc.edu/timss2015/downloads/T15_FW_Chap1.pdf
Hardy, I. (2014). A logic of appropriation: enacting national testing (NAPLAN) in Australia.
Journal of Education Policy, 29:1, 1-18, doi: 10.1080/02680939.2013.782425
Harlen, W. (2005a). Teachers’ summative practices and assessment for learning – tensions
and synergies, The Curriculum Journal, 16 (2), 207-223, doi: 10.1080/09585170500136093
Harlen, W. (2005b). Trusting teachers’ judgement: research evidence of the reliability
and validity of teachers’ assessment used for summative purposes, Research Papers in
Education, 20 (3), 245-270, doi: 10.1080/02671520500193744
Hayward, E. (2018). Notes from a small country: teacher education, learning innovation and
accountability Scotland. In Wyatt-Smith, C. & Adie, L. (eds.) Innovation and accountability in
teacher education. Singapore: Springer, pp. 37-50. (doi:10.1007/978-981-13-2026-2_3)
Hong Kong Examinations and Assessment Authority. (2019). Hong Kong National Report of
Territory-wide System Assessment (TSA). Retrieved from: http://www.bca.hkeaa.edu.hk/web/
TSA/en/2019tsaReport/eng/TSA2019E.pdf
Hong Kong Examinations and Assessment Authority. (2020). Introduction. Retrieved from:
http://www.bca.hkeaa.edu.hk/web/TSA/en/Introduction.html
Houlberg, K., Andersen, V.N., Bjørnholt, B., Krassel, K.F & Pedersen, L.H. (2016). Country
Background Report- Denmark. OECD Review of Policies to Improve the Effectiveness of
Resource Use in Schools (Project no. 10932). Retrieved from www.kora.dk
Humphrey, S., & Heldsinger, S. (2019). Raters’ perceptions of assessment criteria relevance.
Assessing Writing, 41, 1-13.
Humphrey, S., & Heldsinger, S. (2014). Common Structural Design Features of Rubrics
May Represent a Threat to Validity. Educational Researcher, 43(5), 253-263. DOI:
10.3102/0013189X14542154
Hutchinson, C. & Young, M. (2011). Assessment for learning in the accountability era: empirical
evidence from Scotland, Studies in Educational Evaluation, 37, 62-70. https://doi.org/10.1016/j.stueduc.2011.03.007
Jeffery, J.V., Elf, N., Skar, G.B. & Campbell Wilcox, K. (2019). Writing development and
education standards in cross-national perspective. Writing and Pedagogy, 10 (3), 333–370.
Kōichi, N. (2012). Japan’s drifting education system: the debate of Japan’s academic decline.
Nippon.co. Accessed on 23 June 2020 on https://www.nippon.com/en/in-depth/a00601/thedebate-over-japan’s-academic-decline.html.
Krogh, E. (2018). Crossing the Divide Between Writing Cultures. In Kristyan Spelman Miller &
Marie Stevenson (eds.). Transitions in writing (pp.72-104). Leiden/Boston: Brill.
Kuramoto, N. & Koizumi, R. (2016). Current issues in large-scale educational assessment
in Japan: focus on national assessment of academic ability and university entrance
examinations. Assessment in Education: Principles, Policy & Practice, 25, 415-433, DOI:
10.1080/0969594X.2016.1225667.
Linn, R.L. (2000). Assessments and accountability. Educational Researcher, 29, 4-16.
Louden, W. (2019). NAPLAN reporting review: prepared for the COAG Education Council.
Melbourne: Education Services Australia. Accessed on 1 July 2020 on http://www.
educationcouncil.edu.au/site/DefaultSite/filesystem/documents/Reports%20and%20
publications/NAPLAN%20Reporting%20Review/Final%20Report.pdf.
Lu, M., Turnbull, M., Wan, W.Y., Rickard, K., & Hamilton, L., (2017). Are writing scores from
online writing tests for primary students comparable to those from paper tests? Centre for
Education Statistics and Evaluation: Sydney, New South Wales.
Manitoba Education and Training. (2020a). Assessment and Evaluation. Retrieved from:
https://www.edu.gov.mb.ca/k12/assess/myreporting.html
Manitoba Education and Training. (2020b). Middle years assessment: Grade 8 English
language arts: reading comprehension and expository writing: support document for
teachers. Winnipeg, Manitoba, Canada. ISBN: 978-0-7711-7516-9 (pdf)
Manitoba Education and Training. (2020c). Provincial Results. Retrieved from:
https://www.edu.gov.mb.ca/k12/assess/results/index.html
Masters, G.N. & Forster, M. (1997a). Literacy standards in Australia. Melbourne:
ACER. Accessed on 13 July 2020 on https://research.acer.edu.au/cgi/viewcontent.
cgi?article=1005&context=monitoring_learning.
Masters, G.N. & Forster, M. (1997b). Mapping literacy achievement: results of the 1996
National School English Literacy Survey. Melbourne: ACER.
Matriculation Examination Board, Finland. (2020). Matriculation Examination. Accessed
on 16 June 2020 on https://www.ylioppilastutkinto.fi/en/matriculation-examination/theexamination.
Matters, G. (2018). Queensland NAPLAN Review. Parent Perceptions Report. Accessed on 1
July 2020 on https://qed.qld.gov.au/programsinitiatives/education/Documents/naplan-2018parent-perceptions-report.pdf.
McGaw, B., Louden, W. & Wyatt-Smith, C. (2019). NAPLAN review interim report. Sydney:
NAPLAN Review.
Messick, S. (1994). The interplay of evidence and consequences in validation of performance
assessment. Educational Researcher, 23(2), 13–23.
Ministerial Council on Education, Employment, Training and Youth Affairs. (1999). The
Adelaide Declaration on National Goals for Schooling in the Twenty First Century. Canberra:
Author. http://www.educationcouncil.edu.au/EC-Publications/EC-Publications-archive/ECThe-Adelaide-Declaration.aspx.
Ministerial Council on Education, Employment, Training and Youth Affairs. (2005).
Information statement, 18th MCEETYA meeting, Canberra 12 May 2005 to 13 May 2005. http://
www.educationcouncil.edu.au/site/DefaultSite/filesystem/documents/Communiques%20
and%20Media%20Releases/Previous%20Council%20info%20statements/MCEETYA%20
meeting%20info%20statements/MC18_information_statement.pdf.
Ministerial Council on Education, Employment, Training and Youth Affairs. (2008a).
Melbourne Declaration on Educational Goals for Young Australians. Canberra: Author.
http://www.curriculum.edu.au/verve/_resources/National_Declaration_on_the_Educational_
Goals_for_Young_Australians.pdf.
Ministerial Council on Education, Employment, Training and Youth Affairs. (2008b). Media
Release. https://ministers.dese.gov.au/gillard/ministerial-council-education-employmenttraining-and-youth-affairs.
Ministerial Council on Education, Employment, Training and Youth Affairs. (2008c). National
Report on Schooling in Australia. Author. http://www.educationcouncil.edu.au/site/
DefaultSite/filesystem/documents/Reports%20and%20publications/Archive%20Publications/
National%20Report/ANR%202008.pdf.
Ministerial Council on Education, Employment, Training and Youth Affairs. (2009a).
Communiqué, 17th April 2009, Adelaide. http://www.educationcouncil.edu.au/site/
DefaultSite/filesystem/documents/Communiques%20and%20Media%20Releases/
Previous%20Council%20info%20statements/MCEETYA%20meeting%20info%20statements/
MC27%20Communique%20Inc%20ACER.pdf.
Ministerial Council on Education, Employment, Training and Youth Affairs. (2009b). Principles
and Protocols for Reporting on Schooling in Australia. http://www.educationcouncil.edu.
au/site/DefaultSite/filesystem/documents/Reports%20and%20publications/Publications/
Measuring%20and%20reporting%20student%20performance/Principles%20and%20
protocols%20for%20reporting%20on%20schooling%20in%20Australia.pdf.
Ministerial Council on Education, Employment, Training and Youth Affairs. (2011).
Communiqué. Twelfth MCEECDYA Meeting, 8 July 2011, Melbourne. http://www.
educationcouncil.edu.au/site/DefaultSite/filesystem/documents/Communiques%20and%20
Media%20Releases/Previous%20Council%20info%20statements/MCEECDYA%20meeting%20
info%20statements/C12_Communique.pdf.
Ministry of Children and Education. (2020). Examinations and Other Forms of Assessment.
Retrieved from: https://eng.uvm.dk/primary-and-lower-secondary-education/the-folkeskole/
examinations-and-other-forms-of-assessment
Ministry of Education, Culture, Sports, Science and Technology – Japan (MEXT). (2020).
Principles guide Japan’s educational system. Accessed on 15 June 2020 on https://www.mext.go.jp/
en/policy/education/overview/index.htm.
Ministry of Education Singapore. (2018). ‘Learn for life’ – preparing our students to excel
beyond exam results. Accessed 27 June 2020 on https://www.moe.gov.sg/news/pressreleases/-learn-for-life---preparing-our-students-to-excel-beyond-exam-results.
Ministry of Education Singapore. (2020a). From primary to secondary education. Accessed
on 27 June 2020 on https://www.moe.gov.sg/education/primary/from-primary-to-secondaryeducation.
Ministry of Education Singapore. (2020b). GCE ‘A’ Level curriculum. Accessed on 15 June 2020
on https://www.moe.gov.sg/education/pre-university/gce-a-level-curriculum.
Ministry of Education Singapore. (2020c). Primary school subjects and syllabuses. Accessed
on 15 June 2020 on https://beta.moe.gov.sg/primary/curriculum/syllabus/.
National Center for Education Statistics (NCES). (2017). NAEP 2017 Writing Assessments.
Retrieved from: https://nces.ed.gov/nationsreportcard/subject/writing/pdf/2017_writing_
technical_summary.pdf
National Center for Education Statistics (NCES). (2019). An Overview of NAEP. Retrieved from
https://nces.ed.gov/nationsreportcard/subject/about/pdf/naep_overview_brochure_2018.pdf
National Center for Education Statistics (NCES). (2020a). Assessment Schedule. Retrieved
from: https://nces.ed.gov/nationsreportcard/about/calendar.aspx
National Center for Education Statistics (NCES). (2020b). Writing Interrater Agreement.
Retrieved from: https://nces.ed.gov/nationsreportcard/tdw/scoring/scoring_within_wri.aspx
National University of Singapore. (2020). Singapore-Cambridge GCE ‘A’ Level: Admission
requirements. Accessed on 15 June 2020 on http://www.nus.edu.sg/oam/apply-to-nus/
singapore-cambridge-gce-a-level/admissions-requirements.
New South Wales, Department of Education. (2018). Strategic Plan 2018-2022. Sydney:
Author. https://education.nsw.gov.au/about-us/strategies-and-reports/strategic-plan#Our5.
Nelson. (2020). PM Benchmark. Accessed on 29 June 2020 on
https://cengage.com.au/primary/browse-series/pm/pm-benchmark.
New York State Education Department. (2020). Educator Guide to the 2020 Grades 3–8
English Language Arts Tests. Retrieved from: https://www.engageny.org
NZCER. (2020a). Assessment Resource Banks. Accessed on 26 June 2020 on
https://arbs.nzcer.org.nz.
NZCER. (2020b). Progressive Achievement Tests (PATs). Accessed on 23 June 2020 on
https://www.nzcer.org.nz/tests/pats.
NZ Ministry of Education. (2020a). Assessment online. Accessed on 15 June 2020 on
http://assessment.tki.org.nz.
NZ Ministry of Education. (2020b). e-asTTle. Accessed on 23 June 2020 on
https://e-asttle.tki.org.nz.
NZ Ministry of Education. (2020c). The national administration guidelines. Accessed on 26
June 2020 on https://www.education.govt.nz/our-work/legislation/nags/.
Oates, T. (2015). Finnish fairy stories. Cambridge: Cambridge Assessment.
Official Norwegian Report (green paper) (2002). First class from the first class – Proposal
for a framework for a national quality assessment system of Norwegian basic education.
Retrieved from https://www.regjeringen.no/no/dokumenter/nou-2002-10/id145378/
sec5?q=grunnleggende#KAP3-4-1-P3
OECD. (2001). Knowledge and skills for life: first results from the OECD Programme for
International Student Assessment (PISA) 2000. Paris: OECD.
OECD. (2004). Learning for tomorrow’s world: first results from PISA 2003. Paris: OECD.
OECD. (2007). PISA 2006: science competencies for tomorrow’s world, Volume 1: analysis.
Paris: OECD.
OECD. (2010). PISA 2009 results: What students know and can do – student performance in
reading, mathematics and science, Volume I. Paris: OECD.
OECD. (2019a). PISA 2018 assessment and analytical framework. Paris: OECD Publishing.
https://doi.org/10.1787/b25efab8-en.
OECD. (2019b). PISA 2018 results: what students know and can do, Volume I, Paris: Author.
OECD. (2020). Reading performance (PISA) (indicator). https://doi.org/10.1787/79913c69-en.
Accessed on 25 June 2020 on https://data.oecd.org/pisa/reading-performance-pisa.htm.
Office of Qualifications and Examinations (OFQUAL). (2020). Summary of changes to GCSEs
from 2015. Accessed on 21 July 2020 on https://www.gov.uk/government/organisations/ofqual.
Okabe, T., Tose, N., & Nishimura, K. (1999). Bunsuu ga dekinai daigakusei [University students
who cannot perform calculations using fractions]. Tokyo: Toyo Keizai.
Ontario Education Quality and Accountability Office. (2020a). Everything you need to know
about EQAO elementary assessments. Accessed on 15 June 2020 on https://www.eqao.com/en/
assessments/communication-docs/guide-elementary-assessments-english.pdf.
Ontario Education Quality and Accountability Office. (2020b). Upcoming changes to EQAO’s
assessments and reports. Accessed on 15 June 2020 on https://www.eqao.com/en/about_eqao/
modernization/Pages/memo-changes-eqao-assessments-reports-2019.aspx.
Parkin, C. & Parkin, C. (2011). PROBE2 Reading Comprehension Assessment. Upper Hutt,
NZ: Triune Initiatives. Accessed on 29 June 2020 on https://comprehenz.com/resources-allresources/resources-assessment/probe-2-reading-comprehension-assessment/.
Perelman, L. (2018). Towards A New NAPLAN: Testing to the Teaching. Surry Hills, Sydney.
ISBN 978-0-3482555-2-9.
Queensland Curriculum and Assessment Authority. (2019). Retrospective: 2018 Queensland
Core Skills Test Writing Task. Accessed on 19 July 2020 on https://www.qcaa.qld.edu.au/downloads/
senior/qcs_retro_18_3.pdf.
Queensland, Department of Education. (2019). Service Delivery Statements. Brisbane: Author.
https://budget.qld.gov.au/files/2019-20%20DoE%20SDS.pdf.
Reedy, D. (undated). Independent review of the Scottish National Standardised Assessments
at Primary 1. Accessed on 16 June 2020 on https://www.gov.scot/binaries/content/documents/
govscot/publications/progress-report/2019/06/scottish-national-standardised-assessmentsreview-2019/documents/independent-review-of-the-scottish-national-standardisedassessments-at-primary-1/independent-review-of-the-scottish-national-standardisedassessments-at-primary-1/govscot%3Adocument/Independent%2BReview%2Bof%2Bthe
%2BScottish%2BNational%2BStandardised%2BAssessments%2Bat%2BPrimary%2B1.pdf
Rezaei, A.R. & Lovorn, M. (2010). Reliability and validity of rubrics for assessment through
writing. Assessing Writing, 15 (1), 18-39.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems.
Instructional Science, 18(2): 119–44.
Sadler, D.R. (2009). Indeterminacy in the use of preset criteria for assessment and grading,
Assessment & Evaluation in Higher Education, 34:2, 159-179, DOI: 10.1080/02602930801956059
Sahlberg, P. (2015). Finnish lessons 2.0: what can the world learn from educational change
in Finland? New York: Teachers College Press.
Sahlberg, P. (2016). The global education reform movement and its impact on schooling. In
(Eds.) Mundy, K., Green, A., Lingard, B. & Verger, A. The handbook of global education policy.
doi:10.1002/9781118468005.ch7. Chichester: John Wiley & Sons, pp. 145-161.
SEAMEO INNOTECH. (2015). Assessment Systems in Southeast Asia: Models, Successes and
Challenges. Retrieved from: http://www.seameo-innotech.org
Singapore Examinations and Assessment Board (2020). Your trusted authority in
examinations and assessment. Accessed on 15 June 2020 on https://www.seab.gov.sg/
home/#.
Scottish Government. (2016). National improvement framework for Scottish education: 2016
evidence report. Edinburgh: Author. Accessed on 16 June 2020 on https://www.gov.scot/
publications/national-improvement-framework-scottish-education-2016-evidence-report/.
Shermis, M. (2014). The challenges of emulating human behavior in writing assessment.
Assessing Writing, 22, 91-99.
Shewbridge, C., Jang, E., Matthews, P., & Santiago, P. (2011). OECD Reviews of Evaluation and
Assessment in Education Denmark Main Conclusions. Retrieved from: www.oecd.org/edu/
evaluationpolicy
Singapore Examinations and Assessment Board (SEAB). (2020). Behind the Scene: Key Exam
Processes. Retrieved from: https://www.seab.gov.sg/home/examinations/psle/behind-thescene
Skar, G. (2017). The Norwegian National Sample-Based Writing Test 2016: Technical Report.
Retrieved from: http://www.skrivesenteret.no/uploads/files/Skriveproven2017/NSBWT2017.pdf
Skar, G. B. & Jølle, L. J. (2017). Teachers as raters: An investigation of a long-term writing
assessment program. L1-Educational Studies in Language and Literature, 17, 1-30.
https://doi.org/10.17239/L1ESLL-2017.17.01.06
Thompson, G. (2013). NAPLAN, My School and accountability: Teacher perceptions of the
effects of testing. The International Education Journal: Comparative Perspectives, 12(2), 62-84. Retrieved from: www.iejcomparative.org
Thomson, S., Hillman, K., Schmid, M., Rodrigues, S., & Fullarton, J. (2017a). Highlights from
PIRLS 2016 Australia’s perspective. Melbourne: Australian Council for Educational Research.
https://research.acer.edu.au/cgi/viewcontent.cgi?article=1001&context=pirls
Thomson, S., Wernert, N., O’Grady, E., & Rodrigues, S. (2017b). TIMSS 2015: Reporting
Australia’s results. Melbourne: Australian Council for Educational Research. https://research.
acer.edu.au/cgi/viewcontent.cgi?article=1002&context=timss_2015
Thomson, S., De Bortoli, L., Underwood, C., & Schmid, M. (2019). PISA 2018: Reporting
Australia’s Results. Volume I Student Performance. Melbourne: Australian
Council for Educational Research. https://research.acer.edu.au/cgi/viewcontent.
cgi?article=1035&context=ozpisa
Today Online. (2020). Secondary school streaming to be abolished [in Singapore]. Accessed
on 15 June 2020 from https://www.todayonline.com/singapore/secondary-school-streamingbe-abolished-2024-replaced-subject-based-banding.
Track One Studio. (2020). Learning analytics suite. Accessed on 29 June 2020 on
https://www.trackonestudio.com.
UK Department for Education. (2017). Primary assessment in England: equalities impact
assessment. London: Author. Accessed on 22 June 2020 on https://assets.publishing.
service.gov.uk/government/uploads/system/uploads/attachment_data/file/644717/Primary_
assessment_in_England_-_EIA.pdf.
UNICEF & SEAMEO. (2019a). SEA-PLM 2019 Assessment Framework (1st ed.). Bangkok,
Thailand: United Nations Children’s Fund (UNICEF) & Southeast Asian Ministers of Education
Organization (SEAMEO) – SEA-PLM Secretariat.
UNICEF & SEAMEO. (2019b). SEA-PLM 2019 Technical Standards. Bangkok, Thailand: United
Nations Children’s Fund (UNICEF) & Southeast Asian Ministers of Education Organization
(SEAMEO) – SEA-PLM Secretariat.
United Nations Educational, Scientific and Cultural Organization (UNESCO). (2019). The
promise of large-scale learning assessments: Acknowledging limits to unlock opportunities.
Paris, France. ISBN 978-92-3-100333-2
University of Otago. (2020). National Monitoring Study of Student Achievement. Accessed on
23 June 2020 on https://nmssa.otago.ac.nz/index.htm.
University of Western Australia. (2020). BASE Australia. Accessed on 29 June 2020 on
http://www.education.uwa.edu.au/base.
UNSW Global. (2020a). Educational assessments: how Reach and ICAS are different.
Accessed on 29 June 2020 on https://www.unswglobal.unsw.edu.au/educationalassessments/campaigns/reach-and-icas/.
UNSW Global. (2020b). Educational assessments: ICAS. Accessed on 29 June 2020 on
https://www.unswglobal.unsw.edu.au/educational-assessments/products/icas-assessments/
Verger, A., Parcerisa, L., & Fontdevila, C. (2019). The growth and spread of large-scale
assessments and test-based accountabilities: a political sociology of global education
reforms, Educational Review, 71(1), 5-30, doi: 10.1080/00131911.2019.1522045
Victoria, Department of Education and Training. (2017). Differentiated School Performance
Method 2019. Melbourne: Author. https://www.education.vic.gov.au/Documents/school/
teachers/management/improvement/2019_dspm_measuresguide.pdf.
Victoria, Department of Education and Training. (2019). 2018-19 Report on Operations.
Melbourne: Author. https://www.education.vic.gov.au/Documents/about/department/201819-report-of-operations.pdf.
Victoria, Department of Education and Training. (2019). Fact sheet: Differentiated support
for school improvement. Melbourne: Author. https://www.education.vic.gov.au/Documents/
about/educationstate/differsupportedstatefactsheet.pdf.
Wikipedia. (2020). Ecological fallacy. Accessed on 27 June 2020 on https://en.wikipedia.org/
wiki/Ecological_fallacy.
Wyatt-Smith, C. & Adie, L. (2020 forthcoming). The development of students’ evaluative
expertise: Enabling conditions for integrating criteria into pedagogic practice. Journal of
Curriculum Studies. doi.org/10.1080/00220272.2019.1624831