Improving Learning with ACER
Transform Learning. Enhance Practice.
The Australian Council for Educational Research (ACER) is an independent, not-for-profit research organisation that has been improving learning for more than 90 years.
Improving Learning with ACER shares the latest research, insights into best practice and discussions with leading educators.
Field Notes Episode 3: What makes an effective assessment? (w/Dr Jarrod Hingston)
In an increasingly data-filled world, how do you know what to focus on to inform teaching?
In this insightful episode, Education Consultant Marc Kralj delves deep into a practical and research-informed discussion on assessment with Dr Jarrod Hingston, Director of School and Early Childhood Education Services. Together they unpack types of evidence, what makes an assessment effective, the future of assessment, and why it is important to ensure educators are reliably measuring learning growth.
Field Notes is a podcast that shares honest conversations with leading educational practitioners about how they use evidence to overcome challenges and improve learning outcomes for their students.
Alex Gates: In the spirit of reconciliation ACER acknowledges the Traditional Custodians of country throughout Australia and their connections to land, sea and community. We pay our respect to their elders past and present and extend that respect to all Aboriginal and Torres Strait Islander peoples today. ACER acknowledges the Aboriginal and Torres Strait Islander people who continue to contribute to our work to improve learning, education and research.
Welcome to Field Notes, a podcast from the Australian Council for Educational Research. In today’s episode, our education consultant Marc Kralj speaks with Dr Jarrod Hingston about measurement, assessment and growth. Now, if you’re a teacher, those words alone might be enough to trigger a sense of panic; in the classroom, there is an ever-increasing pressure to gather more student data through rigorous assessment. But how can we be sure that those assessments are effective? With over two decades of experience in education policy, Dr Hingston is well-suited to help answer that question and clear up some of the misconceptions that surround measurement and growth in the classroom. If you’re an educator trying to make sense of assessment theory, or if you’re simply interested in what makes for an effective measurement of growth, this episode is for you. Let’s jump into that conversation with Marc and Jarrod now.
Marc Kralj: Joining me today is Dr Jarrod Hingston, Director of School and Early Childhood Education Services at ACER, the Australian Council for Educational Research. Today's topic is measurement. We talk about measurement in a number of ways, but today's discussion is going to be around education: school assessments, teachers, leaders, and how we read the information, the data and the evidence that we collect, especially at this time of year. We're moving into September, October, November, when a number of schools are looking at information, collecting data and thinking about report writing. Before we start, I'd like to ask Jarrod to give a bit of information about his extensive background, his role at ACER, his previous roles over a number of years, and his PhD in assessment. Thank you, Jarrod.
Jarrod Hingston: Thank you very much for having me on the podcast today. So, my background: the first thing I always tell people is that I was never a teacher, so I have the greatest respect for the opinions and expertise of teachers. But I have spent over 20 years working in education policy for different governments. I've had the privilege to work for the Commonwealth, for the Victorian State Department, and overseas for the Government of Abu Dhabi for a period. I've also worked in programs and projects in the not-for-profit sector and for some prominent education companies in different parts of the world. So I've been around the scene for quite a while. I've also spent 17 years working with large-scale assessments, and helping schools and teachers use the data from those assessments, particularly to improve learning, is a real passion of mine, and that's what has led me to ACER.
Marc: Thanks, Jarrod. And look, I think that's one of the reasons I wanted Jarrod to join us today: to get a sense of some of the misconceptions and even myths we have when we talk about measurement and assessment. But to begin with, Jarrod, when we think about education, and about what measurement means in assessment, can you give us an outline of exactly what it is?
Jarrod: I suppose over the years one of the questions I've been asked a lot by teachers is, what's the difference between measurement and assessment, because I do tend to talk a lot about measuring learning. Very simply, there are a number of ways we can assess a skill, knowledge or concept, but measurement really thinks about where those skills, knowledge or concepts sit within a learning continuum.
So, and I hope this isn't getting too technical, but when we look at educational measurement we need to think about what we call the latent variable. The latent variable is the construct: it is the big thing that we want to measure. If we think about mathematics, for example, we cannot just look at one thing and say this is a student's achievement or ability in mathematics. It's not a single piece or a single element that we can chunk down into one particular instance. A latent variable is something we cannot observe on its own, but what we can do is look at a range of other variables that reflect the construct. So, for example, we can look at different skill and concept strands in mathematics, and the different concepts that make up those strands, and when we aggregate them all together they become a construct that reflects mathematics.
What we're doing with measurement is trying to move away from the idea of giving an assessment where we assess, for example, different skills and concepts and report back how a student performed on each of them. What we're trying to do is understand and observe where a student currently is, where they've reached in their learning against that continuum, against that latent variable. What does their performance on a set of test items or questions mean, as far as what we can infer about their achievement in mathematics as a construct? This is the big difference between measurement and assessment on its own. With assessment we can make comments about how a student has performed, how they've mastered a skill, their understanding of a concept, or what their knowledge is in a particular area. With measurement, we can also relate that back to where the student has reached along that continuum, against that latent variable. So it's actually rather technical, and we do need sophisticated, often psychometric, processes sitting behind it. But it's a very important distinction: measurement isn't just assessment, it is an understanding of what we're assessing as a continuum and being able to identify where students have reached along that continuum.
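To make that distinction concrete, here is a minimal sketch of how a simple item response (Rasch) model can locate a student on a continuum from their responses to calibrated items. The item difficulties, responses and function names below are invented for illustration only and are not drawn from ACER's tools.

```python
# A minimal, illustrative sketch (not ACER's implementation) of how a Rasch model
# locates a student on a learning continuum: given calibrated item difficulties,
# find the ability estimate (theta) that best explains the right/wrong pattern.
import numpy as np

def rasch_prob(theta, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - difficulty)))

def estimate_ability(responses, difficulties, grid=np.linspace(-4, 4, 801)):
    """Grid-search maximum-likelihood estimate of ability on the logit scale."""
    responses = np.asarray(responses, dtype=float)
    difficulties = np.asarray(difficulties, dtype=float)
    # Log-likelihood of the observed response pattern at each candidate ability.
    p = rasch_prob(grid[:, None], difficulties[None, :])
    log_lik = (responses * np.log(p) + (1 - responses) * np.log(1 - p)).sum(axis=1)
    return grid[np.argmax(log_lik)]

# Hypothetical example: five items of increasing difficulty; the student answers
# the easier three correctly and the harder two incorrectly.
difficulties = [-1.5, -0.5, 0.0, 0.8, 1.6]
responses = [1, 1, 1, 0, 0]
print(estimate_ability(responses, difficulties))  # roughly 0.6 on the logit scale
```

The point of the sketch is simply that the reported number is a location on a described continuum, inferred from the whole response pattern, rather than a count of correct answers.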
Marc: Learning is also important in this space, and I think that's one of the reasons we're having this discussion today. When teachers and leaders in schools look at their information, they need to dig a little deeper into understanding these concepts. Over the last few years we've talked about assessment evolving, things changing, and things staying the same, and I've put this to you earlier, Jarrod. In your opinion, how have you seen assessment evolve? We could talk about 5, 10 or 15 years, but I'm thinking more recently, over the last 5 years.
Jarrod: Yeah, it's an interesting question, Marc, because one of the privileges of working for ACER is that we've been doing what we do for over 90 years. Not many organisations can say they've had expertise in assessment and educational measurement for such a period. And being with an organisation that has been around for so long, we have this incredible set of resources that sit in our library in our Camberwell office, where you can actually go and see how some of the assessment tools and learning measurement tools have evolved over the decades. The interesting piece is that in so many ways assessment has evolved, and in other ways it hasn't.
So, to talk about how it has evolved, I think the really clear and evident piece to everybody is that technology is having a big impact on assessment, positive and negative. The positive is that it's helping us to be more efficient. The biggest piece technology is supporting us with is being able to provide faster, and in many cases immediate, feedback on assessments. And we know that the faster you can get feedback – good feedback, actionable, meaningful feedback – into teachers’ hands, and most importantly back to students, the faster we can actually improve learning. We can take action on that feedback.
We're also seeing that technology is enabling new data points, so we're able to capture a lot more process information as well as product information. There are new methods of collecting data. The fact that we have students interacting with technology does enable us to implement new methods. Even simple things like moving away from old-style multiple choice towards interactive responses, where we can see how students use ‘hotspots’ and ‘drag and drops’, give us new pieces of information. And there are also improvements in how we're targeting assessments, particularly through adaptive assessment, where we can focus on collecting rich diagnostic data about the individual and not just the group. I think those are the really clear places where we are benefiting from technology in assessment.
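As a rough illustration of the targeting idea behind adaptive assessment, here is a minimal sketch: after each response the ability estimate is updated and the next item is chosen to match it. The item bank, the crude update rule and all names are hypothetical; this is not ACER's algorithm.

```python
# Illustrative sketch of adaptive item selection (not ACER's algorithm): keep
# updating an estimate of the student's ability and pick the unused item whose
# difficulty sits closest to that estimate, so each question is well targeted.
import random
from math import exp

def rasch_prob(theta, difficulty):
    """Probability of a correct response under a simple Rasch model."""
    return 1.0 / (1.0 + exp(-(theta - difficulty)))

def run_adaptive_test(item_bank, answer_item, n_items=5):
    """item_bank: {item_id: difficulty}; answer_item: callable returning 1 or 0."""
    theta = 0.0
    responses = {}
    remaining = dict(item_bank)
    for _ in range(n_items):
        # Choose the remaining item best matched to the current ability estimate.
        item_id = min(remaining, key=lambda i: abs(remaining[i] - theta))
        remaining.pop(item_id)
        responses[item_id] = answer_item(item_id)
        # Crude update: nudge the estimate up after a correct answer, down otherwise.
        theta += 0.5 if responses[item_id] else -0.5
    return theta, responses

# Hypothetical usage: simulate a student whose true ability is 1.0 on the logit scale.
bank = {f"item_{k}": d for k, d in enumerate([-2, -1, -0.5, 0, 0.5, 1, 1.5, 2])}
simulated_student = lambda item_id: int(random.random() < rasch_prob(1.0, bank[item_id]))
print(run_adaptive_test(bank, simulated_student))
```

A real adaptive test would use a proper ability estimator and stopping rules, but the sketch shows why adaptive delivery yields richer diagnostic information about the individual than a fixed form.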
Also, I think, more and more, we're becoming very exposed to the other side of technology, where the bells and whistles are sometimes being embraced more than substance. So there is that piece. And I think for a lot of teachers it is quite confusing as to whether some of the types of assessments they're being presented with are the bells and whistles: while they might be aiding engagement in some spaces, and while they might be making life a little bit easier... is it actually giving us that measurement, that real robust view of the student's achievement, and what we can actually do to support the student? Or is it just giving us more information that is perhaps not going to assist in the improvement of learning? So there is that side of technology.
And more and more, we're talking about AI and seeing AI come into the world of assessment. It's entering, and it's going to be more and more part of what we do and how we provide faster, more effective and more meaningful feedback. But as AI has quickly come into the picture, there is also an understanding that hallucination, where an artificial intelligence provides an incorrect response based on a corpus of incorrect or misleading information, can have a negative impact on assessment and on the measurement of learning.
I think the biggest piece is that we want technology to improve how teachers are able to support students and identify next steps in learning, not to replace professional judgment or the ability to interpret assessment data and information. I think this is the space where we're really seeing assessment evolve. But when we come back to it, the purpose of assessment hasn't changed. The basic elements haven't changed. We're still using a lot of the basic ways of assessing that we were using 30 or 40 years ago: using items, test questions, to elicit an understanding of how a student has mastered a particular concept we're looking at. What we assess might change – we might be looking more at general capabilities, different types of skills, cognitive proficiencies, or even areas like communication and collaboration – but we're still using a lot of traditional methods that we know are reliable and provide good, robust information to teachers to capture those.
Marc: Good to hear how some things don't change, and I think that's important as well. You know, I always think about that saying, ‘don't throw the baby out with the bath water’: it's about holding on to the things that still make sense as educators, when we're looking at evidence and data. And that's a great segue, Jarrod, into the next question, in terms of interpreting and analysing, being able to infer, and making informed decisions about next steps. For schools, why is it important to use data and evidence for next steps in teaching and learning? And from your experiences, and the things you hear from a number of school and system clients, international and national, what are some of the challenges, and some of the successes, that have come about in your time at ACER, and probably even before that as well?
Jarrod: Yeah, well, I think one of the basic elements is that as humans, when decisions are being made around us, about what we're doing and what our endeavours are, we hope that those decisions are based on evidence and not just a hunch.
I think most people, as well, want to be challenged in life. We want to achieve and succeed as best we can. It might not be in everything, but generally, where there are areas we want to excel in, in the world or in our lives, we do want to be challenged. And we want the people who are helping us succeed in that area to be providing us advice that is based on evidence and not just a hunch. Education isn't any different. It is such a big part of all our lives, and as humans we generally think that evidence is such an important part of the decision-making process. In education, one of the hardest pieces, both as a policymaker and as somebody who works at an organisation such as ACER, one of the biggest things we tackle is making sure that teachers have access to tools that provide reliable and valid evidence.
So often – and I remember this going back to my early career in education policy – I would have teachers come and tell me ‘this tool is not reliable, this tool does not tell us what we need to know’. When we're thinking about reliability, you want evidence to be based on tools that you know are going to be consistent. When you're using them with students, you want to make sure that you're getting consistent information that you know you can trust.
But the other piece is, you want validity. You want information that is actually going to relate something meaningful back to what you're trying to teach students, to what they need to learn. And so often, certainly in my time, particularly in the world of assessment, there's assessment for assessment's sake. I spent many years working with systems and schools in the Middle East, and there is a real culture of assessment and testing over there. I understand what is driving that: a thirst for evidence and evidence-based decision making, which is wonderful. But quite often – and that's probably one of the areas of policy I did work in – we were trying to reduce the amount of assessment and bring it back to reliability and validity in evidence gathering.
And one of the things we would often see is that teachers are time poor, and that is the same issue around the world. There's only so much you can achieve in a week with so many students, and so often we were finding that more assessment was making teachers even more time poor and wasn't actually giving them enough time to teach. The other piece was that often teachers were resource poor. And when we say resource poor: while there might have been lots of tests they could use, and a lot of focus on producing quizzes and tests, there wasn't really a focus on developing great assessments that would be reliable and valid and give teachers evidence they could trust and base their teaching on. So there are certainly some challenges around being evidence and data driven. And again, when you don't have validity in the assessments, it really makes interpretation a challenge, because the natural thing is to relate the data back to what's actually happening in the classroom, to what you're observing day to day, and when they don't match, that's an issue. So often in the past we've seen the teacher blamed for the interpretation, when the assessment is conflicting with what they're seeing in the classroom. We need to be able to come back and say, ‘Well, actually, if that's the case, it's not necessarily the teacher who's wrong, because they're observing certain things in the classroom; they are collecting data day to day.’ When there's a mismatch, we need to understand why, not just say ‘trust the test’.
Another issue around data and evidence is the data deluge. When you have over-assessment, there will be a deluge of data. There's also assisting teachers to understand triangulation, and not to try to take one and one and get it to equal three because that's what you think the answer should be, or that's what you're being told the answer should be. It's understanding when not to use a data point as a piece of evidence towards a certain question you're asking. It's not easy.
But yeah, when we talk about data and evidence, I really focus on the term evidence, because that's what we really want. We want to be able to use good evidence of teaching and learning. When you can apply that to the individual and to groups of students, it is really powerful, and it does actually lead to improvement in learning. And not just improvement in learning, but higher engagement for students and groups of students, because this is where we can use the measurement of learning to identify steps for students that will challenge them appropriately, help them succeed, build student engagement, improve student agency, and drive students to be more ambitious, achieve more in learning, and find the passions in learning that we all want them to find.
Marc: We certainly want improved learning outcomes, and I think any time we talk about curriculum and assessment, we talk about trusting the teacher and their judgment, and about understanding what triangulation is: what information and evidence we are collecting, the observations you've already mentioned, the discussions we're having, feedback from kids, discussions with parents. These are all really important, but the one I'd probably like to emphasise from your response there, Jarrod, is trusting teacher judgment.
Jarrod: Yeah. And I think this is one of the biggest spaces. In my time working in education policy, so often the temptation for government policymakers is to look at big data. We want to see and get measures of success in each school and, using metrics, understand which schools are performing better than others, and which teachers are performing better than others. Whilst there's a place for that, it shouldn't be the focus. In education policy, I think policymakers sometimes – and I've been through that learning process myself, where the temptation is to look at big data – find when they actually get out into schools and talk to educators that the biggest policy area we need to focus on is how we make assessment, data and evidence relevant to the classroom, and how we better support teachers to make better evidence-based decisions. And this doesn't come from looking at the mean scale score of a school compared to all the other schools in your system. What it comes down to is making the evidence relate much more closely to what's actually happening in the classroom, and making sure that we don't lose track of what reliability and validity actually mean in the classroom. We know that through educational measurement there are ways we can quantify and qualify what the assessments mean, and we can make them relatable back to what's happening in the classroom. We're never going to do it better than the teacher does. But we will give them a body of evidence that actually means something against a learning continuum, which will either reinforce what they know or show something a little bit different to their understanding, and that will assist them in making evidence-based decisions about students’ next steps in learning. And that's what we really want.
Marc: You know, it really emphasises that point about supporting teachers: their professional learning, their development, their understanding in this space. Because that's continued as well, and I still believe there's a bit of learning, and even some unlearning, to be done along the way.
So, 9 to 12 months of teaching and learning has occurred. We've collected evidence. We've collected data. And one of the things I'd like to put to you, Jarrod, is this term ‘growth’. It’s used quite a bit in education, and we know there's research being done in this space. In your experience, what is student learning growth, and what are some of the myths out there that teachers, schools and all educational groups might be encountering?
Jarrod: Yeah, I mean, growth is such an important – I don't want to call it a buzzword – but it is a really important word that is understood differently. We do all want to see students progress, and so we want to be able to understand how they've been progressing and how we can support them to progress more quickly. There are plenty of misconceptions out there about what growth is and how we should be treating it. I think the biggest myth, and we see it so often around the world, is where growth is understood as a change in the ranking or performance of a student against the performances of other students. Now, this is actually not learning growth. You could call it growth in a ranking, or a change in ranking, or an improvement in cohort performance. But does it actually mean anything back to what we're teaching kids and how we want them to learn it? It doesn't.
What it essentially does is set up a system of winners and losers in a ranking sense. Now, the biggest problem with that is you could have all students progressing, and yet still have this ‘winners and losers’ system that we're calling growth. Or hypothetically – and I'd hate to ever see this happen – you could have a situation where no students are progressing, and we still get the idea that some students are doing better simply because they're performing better than other students.
None of this is really helping the question of how we improve learning. So, learning growth – and I'm very passionate about this – there are different ways we can evaluate learning and progress, but for me you need tools that are entrenched in learning measurement to properly measure learning growth. I'm not talking about a baseline assessment and then an end-of-chapter test, where we're understanding how students may have learned a particular set of concepts. What we're actually looking at is how students are travelling against that continuum of mathematics, or in that area of reading comprehension, or in science, or in that bigger learning area. How are they actually travelling and progressing over time? This is where we really do need learning measurement that is entrenched in a proper psychometric model, not just how they're performing against other students.
We also shouldn't be focused just on the numbers. We need to be able to relate them back to what learning actually looks like in the classroom. So what we do – and one of the things I absolutely love about ACER’s capability – is build tools that are entrenched in learning measurement and supported by psychometric methods that enable us to calibrate different levels of assessment back to that latent variable, back to that construct of mathematics, or reading, or science, or other constructs. We can use assessment tools at different stages of learning and still get measures that relate back to a common scale, a scale that is described. So we can understand that at a particular level these are the types of skills, concepts and understandings that students are demonstrating, and then, 9 to 12 months later, have a subsequent measure against that scale. We get a quantifiable measure, so we can see a learning gain against that scale, but by looking at the described progression against that scale we can also see which skills and concepts students have been improving in, and what the next targets are that we're setting for them. This is the real way of being able to quantify and qualify learning growth against a measure. It is a complex area. It's not an area that teachers can easily tackle without the support of really good tools, really good assessments. But it's important to understand that what some people call growth, and what some assessments claim to measure as growth, is not really learning growth.
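As an illustration of quantifying and qualifying growth against a common described scale, here is a minimal sketch with invented scale scores and band descriptions. It is not the Progressive Achievement scale, only a toy example of the reporting idea.

```python
# Illustrative only: invented scale scores and band descriptions, sketching how a
# gain on a common described scale can be both quantified and qualified.
BANDS = [  # (lower bound of band on the scale, short description of typical skills)
    (100, "recalls basic number facts"),
    (110, "applies place value to multi-digit calculations"),
    (120, "solves routine problems involving fractions"),
    (130, "reasons with proportions in unfamiliar contexts"),
]

def describe(score):
    """Return the description of the highest band the score has reached."""
    reached = [desc for lower, desc in BANDS if score >= lower]
    return reached[-1] if reached else "below the described range"

def report_growth(score_start, score_end):
    gain = score_end - score_start
    return (f"Gain of {gain} scale points: "
            f"from '{describe(score_start)}' to '{describe(score_end)}'.")

# Hypothetical student measured twice, 9 to 12 months apart, on the same scale.
print(report_growth(112, 124))
# Gain of 12 scale points: from 'applies place value...' to 'solves routine problems...'
```

The quantity (the gain in scale points) and the quality (the described skills reached) come from the same calibrated scale, which is what distinguishes this from a change in ranking.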
Marc: Interesting, when we talk about the tools, which are really critical. As we've gone through this discussion today, Jarrod, I think we almost need to be selective about the tools we pick as well. We need to think carefully about the kinds of things we're putting in front of our teachers, and how we're using them to make these informed decisions. When we talk about measurement from one point to another after 9 to 12 months of teaching and learning, I often think about what's happening in between those 9 to 12 months as well: along with the tools, the assessments and the evidence we collect, what teaching is actually happening. Earlier in our conversation we talked about what we can infer from evidence, and therefore, what are the things we're putting in place to effect those changes, to drive improvement, and to look for strengths, weaknesses and gaps?
Jarrod: That's right, absolutely. And I think this is the biggest thing, too. It concerns me when certain assessments are presented as providing growth measures when they aren't, because for the teacher, all the work they've done throughout the year and all of the support for the student can so often be lost when the tool being used is not actually doing what it professes to do. And from the student's perspective as well: when they're not getting a true measure of learning, a teacher isn't able to sit down with them and say, ‘Okay, this is where you were at 12 months ago, and this is where you're at now, and all of the work that you've been doing has led to this,’ and actually celebrate success. When that can't be related back to the learning, when it's just a statistical measure, we lose so much. And for the students, that sense of achievement, and that motivation to continue being ambitious and wanting to be challenged, can so often be lost. It's such an important area that I think so often gets missed.
Marc: Just to finish off today, a couple of things. I know that ACER produces many tools, supporting materials and resources, but there is one thing I do want you to touch on, and that is not the concept, but the construct, of predicted growth. I know it's something that's been exciting for schools that have seen it. I'd like you to quickly finish off by giving us an outline of where it came from, why it's there, and how it can possibly be used.
Jarrod: Predicted growth is an interesting one, and I suppose it comes back a little bit to the earlier question about how assessment has evolved and the use of technology. One of the uses of technology we didn't necessarily talk about was the actual collection of data. The predicted growth measure is something we've been able to produce because, through the online nature of the assessment, we've been able to collect students’ results over a period of years and then conduct a study to understand the learning growth measures and trajectories of students who have taken our Progressive Achievement assessments over the years. It's linked to probably the biggest question I get asked when working with schools, systems and policymakers: this concept of expected growth. When we talk about learning growth, so often what teachers want to understand is how much growth we should actually expect to see a student make within 12 months.
And I think the term ‘expected’ is a challenge, because it's really almost an opinion business. When we start talking about growth expectations in a human sense, we don't necessarily expect a student to physically grow a certain amount each year. What we might do is go back to health records and measurements taken of populations over years, and understand how children at a similar age have grown at different rates, and what the likelihood is of a student in a particular age group growing a certain amount. That's based on evidence, on data collected over years. It's not based on a paediatrician sitting there and saying, ‘Oh, I reckon your child might grow this much,’ and us taking that as gospel.
So why, then, in education are we saying we should take an opinion and set that as the benchmark we want to see a student demonstrate over 12 months? Predicted growth is really our effort to go back to the body of evidence and understand, at different stages and different parts of the Progressive Achievement scale, what trajectories students have taken previously, and what sorts of learning growth gains we have witnessed students making over different periods of time. And the biggest thing we all learn is that there are typical trajectories, but also atypical trajectories, and that gives us a range of measures within which we can predict how a student may demonstrate learning growth in the coming 12 months. That's, I think, a unique and interesting feature that we've been able to build into the Progressive Achievement reporting, and something we're looking to build on. It's certainly attracted a lot of interest from a lot of educators, and we think that's a good thing, because what it's actually doing is improving understanding of what learning growth looks like, and that it's not necessarily the same for every single student.
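To illustrate the general idea of a predicted growth range built from prior trajectories, here is a minimal sketch assuming a historical dataset of starting scale scores and observed 12-month gains. The percentile approach, the simulated data and all names are hypothetical, not ACER's actual model.

```python
# Illustrative sketch of a predicted growth range (not ACER's model): given
# historical pairs of (starting scale score, observed 12-month gain), look at
# students who started near the same point and report the typical range of gains.
import numpy as np

def predicted_growth_range(start_score, history, window=5.0,
                           percentiles=(25, 50, 75)):
    """history: array of shape (n, 2) with columns [start_score, observed_gain]."""
    history = np.asarray(history, dtype=float)
    nearby = history[np.abs(history[:, 0] - start_score) <= window]
    if len(nearby) == 0:
        return None  # no comparable prior students within the window
    return dict(zip(percentiles, np.percentile(nearby[:, 1], percentiles)))

# Hypothetical historical data: 1,000 students, with gains tending to shrink
# slightly at higher starting scores.
rng = np.random.default_rng(0)
starts = rng.uniform(100, 140, size=1000)
gains = 18 - 0.1 * (starts - 100) + rng.normal(0, 4, size=1000)
history = np.column_stack([starts, gains])

print(predicted_growth_range(118, history))
# roughly {25: ~13, 50: ~16, 75: ~19}: a range of plausible gains, not a single expectation
```

The output is a band of plausible gains grounded in what comparable students have actually done before, rather than a single "expected" number set by opinion.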
Marc: It brings me to a point before we finish off today: if people in schools are unsure about something, whether as a teacher, an educator, a leader, or even in a system as a director or educational director who supports schools, I often think, ‘go back to the source.’
Jarrod: Yeah, absolutely. On a related note, sometimes we forget that assessment and educational measurement is a social science, not an exact science. At the end of the day, even in the classroom, we're trying to estimate whether a student has understood something and what their particular ability level is in a learning area. So there's absolutely nothing wrong with asking questions about anything in a report or anything within an assessment, because all that's going to do is inform you as the user, the person who has to make those interpretations and estimations. It's going to better inform you and your interpretations of what the evidence is saying. The more questions you can ask, the better you can develop your understanding. I do it myself: I often look at concepts I think I know, but I'll still go to colleagues and other people who work with them every single day and re-evaluate my understanding, because at the end of the day we are working in a social science, and sometimes it is good to double-check that our own understandings are on point. Sometimes things evolve; sometimes we learn new pieces that build our ability to interpret the evidence in front of us and make better evidence-based decisions. It is an interesting area, and it's certainly an area that I love working in. In general, educators are passionate people, and they're passionate about wonderful things: supporting children and young people to develop themselves and their future lives, and building better societies. It is a wonderful area that we work in, and I think the more we can support that, the better.
Marc: Jarrod, thank you very much. Today has been with Dr Jarrod Hingston, and as you come towards the end of the year and you're thinking about this particular topic, maybe share this podcast with someone, because it'll give them a better insight into, and understanding of, what we mean by measurement.
Alex: Thank you again to Dr Jarrod Hingston. If you have any questions about ACER’s Progressive Achievement Test, or any other assessments offered by ACER, please contact the schools support team at school.support@acer.org. Thanks for listening; we’ll be back with another episode of Field Notes very soon.