Injecting Some Moneyball into Student Testing

I’ve always been one to find love in both the art and science of a given subject. As a lifelong baseball fan – and a pretty poor baseball player through high school – I quickly embraced Ted Williams’ The Science of Hitting, believing that the charts and graphs explaining strike zones and such would somehow transform me from a doubles hitter into a home run machine. Sadly, they never did.

I’m also an unabashed fan of the New York Mets, and have been since the early 1980s. For more than three decades, I have endured the highs and lows (mostly lows) of rooting for the Metropolitans and believing this might just be the year.

Sadly, the 2018 season wasn’t that year for the Mets. But it was such a year for Mets ace Jacob deGrom. Last week, the All-Star received the Cy Young Award, given to the best pitcher in the National League. It was a well-deserved honor, recognizing one of the best seasons a starting pitcher has ever had, including an earned run average of only 1.70, a WHIP of 0.912, and 269 strikeouts in 217 innings pitched. DeGrom secured the first-place position on all but one of the ballots cast this year, offering a rare highlight in another tough Mets season.

Leading up to the award, there were some analysts who wondered if deGrom would win the Cy Young, despite those impressive numbers. The major ding against him was that he was pitching for the Mets, and as a result posted only a 10-9 record, getting almost no run support all season from his team. DeGrom’s top competition in the NL had 18 wins. The Cy Young winner in the American League posted 21 victories. So when a 10-9 record won the Cy Young, some critics pounced, accusing sabermetrics and “moneyball” of taking over the awards. The thinking was that one of the chief attributes of a top starting pitcher is how many wins he has. If you aren’t winning, how can you possibly be the best?

All the discussions about how sabermetrics has ruined baseball – or at least baseball awards – soon had me thinking about education and education testing. For well over a decade, we have insisted that student achievement, and the quality of our schools, is based on a single metric. Student performance on the state test is king. It was the single determinant during the NCLB era, and it remains the same during the PARCC/Smarter Balanced reign.

Sure, some have led quixotic fights against “high-stakes testing” in general, but we all know that testing isn’t going anywhere. Whether PARCC is ultimately replaced by a new state test (as my state of New Jersey is looking to do) or the consortium is one day replaced by the latest and greatest, testing is here to stay. The calls for accountability are so great, and the dollars spent on K-12 education so high, that not placing some sort of testing metric on schools, and kids, is a fairy tale. The only question we should be asking is whether we are administering and analyzing the right tests.

I’ve long been a believer in education data and the importance of quantifiable research, particularly when it comes to demonstrating excellence or improvement. But I still remember the moment when I realized that data was fallible. While I was serving on a local school board in Virginia, overseeing one of the top school districts in the nation, we were told that our nationally ranked high school had failed to make AYP. At first I couldn’t understand how this was possible. Then I realized we were the victims of a small N size. A handful of students in special education and ELL dinged us in the AYP evaluation. The same handful of students in both groups. It didn’t make our high school lesser than it was. It didn’t reduce our desire to address the learning needs of those specific students. But the state test declared we weren’t making adequate progress. The data had failed us.

The same can be said about the use of value-added measures (VAM scores) in evaluating teachers and schools. VAM may indeed remain the best method for evaluating teachers based on student academic performance. But it is a badly flawed method, at best. A method that doesn’t take into account the limitations on the subjects that are assessed on state tests, small class sizes (particularly in rural communities or in subjects like STEM), and the transience of the teaching profession, even in a given school year. Despite these flaws, we still use VAM scores because we just don’t have any better alternatives.

Which gets me back to Jake deGrom and moneyball. Maybe it is time that we look at school and student success through a sabermetric lens. Sure, some years success can be measured based on performance on the PARCC, just like many years the best pitcher in baseball has the most victories that season. But maybe, just maybe, there are other outcomes metrics we can and should be using to determine achievement and progress.

This means more than just injecting the MAP test or other interim assessments into the process. It means finding other quantifiable metrics that can be used to determine student progress. It means identifying the shortcomings of a school – or even a student – and then measuring teaching and learning based on that. It means understanding that outcomes can be measured in multiple ways, through multiple tools, including but not limited to an online adaptive test. And it means applying all we know about cognitive learning to establish evaluative tools that live and breathe and adapt based on everything we know about teaching, learning, and the human brain.

DeGrom won the Cy Young because teams feared him every time his turn in the rotation came up. We knew he had a history-making season because of traditional metrics like strikeouts and innings pitched, but also because of moneyball metrics like “wins above replacement,” or WAR, and “walks and hits per inning pitched,” or WHIP. Had he not won that 10th game the last week of the season, thus giving him a winning record, deGrom would have had no less stellar a season. In fact, a losing record would only have underscored his personal success and impact, regardless of what those around him were able to do.
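
For readers who want to see how those rate stats relate to raw totals, here is a minimal sketch in Python. It uses only the four figures quoted earlier in this post (217 innings, 269 strikeouts, a 1.70 ERA, and a 0.912 WHIP). The function names are my own, and the "implied" totals are back-calculated from rounded rates, so treat them as approximations rather than official box-score numbers.

```python
# Figures quoted in this post for deGrom's 2018 season.
INNINGS_PITCHED = 217
STRIKEOUTS = 269
ERA = 1.70
WHIP = 0.912


def per_nine(total: float, innings: float) -> float:
    """Scale a season total to a per-nine-innings rate."""
    return total * 9 / innings


def implied_baserunners(whip: float, innings: float) -> float:
    """WHIP is (walks + hits) / innings, so multiplying back gives walks + hits."""
    return whip * innings


def implied_earned_runs(era: float, innings: float) -> float:
    """ERA is earned runs per nine innings, so undo the nine-inning scaling."""
    return era * innings / 9


if __name__ == "__main__":
    k9 = per_nine(STRIKEOUTS, INNINGS_PITCHED)
    runners = implied_baserunners(WHIP, INNINGS_PITCHED)
    earned = implied_earned_runs(ERA, INNINGS_PITCHED)
    print(f"Strikeouts per nine innings: {k9:.2f}")
    print(f"Implied walks plus hits allowed: about {round(runners)}")
    print(f"Implied earned runs allowed: about {round(earned)}")
```

None of these numbers depends on how many runs the Mets scored behind him, which is the whole point of looking past the win column.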

Maybe it finally is time a little moneyball thinking worked its way into student assessment. Hopefully, this discussion will come before the Mets reach their next World Series.

 

 

From Proficiency to Mastery

Earlier this year, EdSec Betsy DeVos caught a great deal of flak for not acknowledging the difference between proficiency and progress when it comes to student learning. But with her remarks earlier this month, she may have changed the discussion by shifting the debate to one on mastery. 

Over on BAM! Radio Network, we examine this development on the latest edition of #TrumpED. Give it a listen!

“Compete Against Yourself”

Over the weekend, the edu-wife and I had the good fortune of seeing Kristin Chenoweth perform with the Philadelphia Symphony. If you don’t know who Chenoweth is, you might as well stop reading now … or start listening to the original Broadway cast recording of Wicked. Your choice.

At any rate, Chenoweth paused from her incredible performance to talk about her experiences, both as an artist and as a pageant performer. She spoke of how competing for both the Miss Oklahoma and Miss Pennsylvania crowns helped her develop her life motto.

What the four-foot-11-inch vocalist and actress realized was that she couldn’t compete, at least on the pageant circuit, against the six-foot statues she was standing next to. So she decided there was only one solution. She needed to focus on competing against herself.

Chenoweth offered that life lesson to a number of young women in the audience in Philadelphia that night, women who aspired to be like Chenoweth and wanted to pursue their passions in singing and performance. But it is a lesson that can and should apply to all students. And it is a lesson that isn’t all that foreign in our education debates.

For all the criticism of HOW it was measured, at the heart of adequate yearly progress (or AYP) was schools competing against themselves. Could they do better this year than they did in the previous? Could they build on previous years’ gains and continue to show improvement?

In the coming months, we will again hear a great deal about state tests and opting out and the proper role of state benchmarks in the learning process. Maybe we can take Kristin Chenoweth’s life motto and apply it to student assessment. Maybe, just maybe, we can use annual state assessments to help young learners see the progress they have made over the course of the last year. Maybe we can use tests as the benchmarks they are supposed to be, helping students see that all that time and hard work has paid off, and that there is quantitative proof they know more this year than they did the previous one.

Yes, the adults in the room often put too much weight on the “competitive” aspects of education. Let there be no mistake. Competition is OK. It’s not the end-all, be-all of life. But it is good to set a goal and achieve it. It is good to show growth and accomplishment. And it certainly is good to compete against yourself. It’s true for artists and performers, and it is certainly true for most children.

Growth is a good thing. Progress is a good thing. And competition, in the right frame, is a good thing. We should all be competing against ourselves, whether as children or as adults.

Thank you, Kristin Chenoweth, for reminding me of this. And it doesn’t matter if such competition makes one Popular or not.

Anti-CCSS “Tin Foil Hats”

There is little question that yesterday’s announcement that the National Education Association has issues with the Common Core State Standards and is calling for a “course correction” will be dissected and debated with enough electronic ink to drown a thousand digital ships.

How do the NEA and AFT pullbacks affect the notion that CCSS advocates are part of a big tent?  What does this mean for union-friendly states that are already having concerns about CCSS and their related assessments?  Are we again at that stage where we are asking if this is the beginning of the end for the Common Core?

The talk of delays or slowdowns in Common Core implementation is not likely to go away.  But through all of the concern and consternation, no one seems to be offering a viable alternative.  Are we to return to the Old West days of the 1990s, when it was virtually every SEA or LEA for itself?  Are we suggesting that we shouldn’t have standards and accountability at all?

Yes, the CCSS movement should be focused on constant improvement.  We should be looking at ways to improve implementation, improve learning materials, improve related PD, and, yes, improve the testing that goes with it.  But at some point, we just need to accept that CCSS is a positive step forward for our public schools and focus on how to make sure all of our students are meeting expectations and learning to those standards.

But if we are going to continue to believe in the urban legends and grand conspiracy theories and things that go bump in the Common Core night, then maybe we need to consider what a committee chairman in the Missouri State House finally did.  According to the Associated Press (and courtesy of Politico’s Morning Education), in response to all of the “sky is falling” chatter about CCSS, Mike Lair, a Republican and retired teacher, offered an $8 appropriation for “tin foil hats.”

Or more specifically, according to the AP, “two rolls of high density aluminum to create headgear designed to deflect drone and/or black helicopter mind reading and control technology.”

I’m all in.  I’ll even splurge on the first two rolls for all of the CCSS deniers and haters here in Eduflack’s home state.

Testing Problem … or Cheating Problem?

For the past decade, opponents of the accountability movement have crowed about the problems with testing and establishing student achievement-based metrics to determine the success, or lack thereof, of our public schools.

When we learn of testing scandals such as those down in Atlanta, the finger is immediately pointed at the test itself.  Forget those educators who may have organized the erasure parties.  When we learn of cheating scandals such as those in NY, where high-performing students were paid to take the SAT for classmates, we again point at the test.  Oh, those poor students who are being overly stressed by being asked to take an SAT or ACT test to get into college.

The anti-testing forces have made their points clear.  Testing is bad.  Cheating proves it (as, it seems, does poor performance).  We can’t use tests to determine the effectiveness of a school, a teacher, or even a student.  We need to view each child holistically.  We need to let our students think and explore and do what they want to do and chase after rainbows and unicorns.

So how, exactly, does the latest from the Chicago Tribune fit into that anti-testing narrative?  For those who have missed it, John Keilman has a great piece on the impact of technology on cheating in the classroom.

His lead?

Heloise Pechan’s heart rose when she read the essay one of her students, a seemingly uninterested high school sophomore, had turned in for a class assignment on “To Kill a Mockingbird.” The paper was clear, logical and well written — a sign, she thought, that she had gotten through to the boy.

Her elation passed quickly. What came next was suspicion.

Pechan, then substitute teaching at a McHenry County high school, went to Google, typed the paper’s first sentence (“Kind and understanding, strict but fair, Atticus Finch embodies everything that a father should be”) and there it was: The entire essay had been lifted from an online paper mill.

This piece actually provides a thoughtful reflection on the pros and cons of classroom technology, from the cheating that can come of it to the protections and checks it provides to ensure such cheating doesn’t happen.

But it raises a very interesting question.  Do we have a testing problem, or do we really have a cheating problem?  After all, an essay on “To Kill a Mockingbird” is the perfect holistic evaluation, letting a student explore a topic in the way he or she wants, using critical thinking, reasoning, argument, and all of the other skills the anti-accountability movement has been preaching.  Yet we hear story after story about how paper mills, Wikipedia, and a host of other online sources have corrupted the five-paragraph essay.

At the same time, when we look at those states that have moved to online adaptive technology for their student assessments, we don’t hear a peep about alleged cheating or data fudging.

Whether we like it or not, educational accountability is not heading for the exit.  Instead of attacking testing, we should be working to ensure that the assessments that are administered are of the highest quality, effectively measure the knowledge and skills of the students, and are used to tailor and improve instruction in the classroom.

Migrating from AYP

Virtually every state in the union is working to get out from under No Child Left Behind and its measure of Adequate Yearly Progress (AYP).  Thanks to the U.S. Department of Education’s efforts to offer “NCLB waivers,” most states have submitted applications to do just that: veer away from the AYP standard established a decade ago and chart a new path that still demonstrates forward progress.

Over at Education Week, Andrew Ujifusa has a piece outlining the plans many states are crafting for their post-NCLB existences.  From letter grades to stars, many states are looking for new ways to demonstrate progress to both policymakers and parents, in a way that puts their districts and schools in the best light possible.

Take, for instance, the plan offered up by Ohio.  According to Education Week, Ohio’s plan is as follows:

  • A-F letter-grading system, based on 4 points. A school with 3.67 points or more earns an A, and a school getting 0.67 points or below earns an F.
  • A school cannot earn an A on the “achievement and graduation gap” portion of its score if one of four groups (all students, white non-Hispanic students, disadvantaged students, and students with disabilities) earns a C, D, or F.
  • Based on 2011 data, under the new A-F system, 24.8 percent of 3,103 traditional public schools (charters not included) would have earned A’s, 33.2 percent would have earned B’s, and 23.9 percent would have earned C’s.
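
To make those bullets concrete, here is a rough sketch in Python that uses only the figures quoted above. The function names are hypothetical, and the cutoffs for B, C, and D are deliberately left out because the summary does not give them; this is an illustration of the arithmetic, not Ohio’s actual scoring formula.

```python
from typing import Dict, Iterable, Optional

# Figures quoted in the Education Week summary above.
A_CUTOFF = 3.67          # 3.67 points or more earns an A
F_CUTOFF = 0.67          # 0.67 points or below earns an F
TOTAL_SCHOOLS = 3103     # traditional public schools, charters not included
SHARE_BY_GRADE = {"A": 24.8, "B": 33.2, "C": 23.9}  # percent, 2011 data


def stated_grade(points: float) -> Optional[str]:
    """Return 'A' or 'F' when a stated cutoff applies, otherwise None.

    The B, C, and D cutoffs are not given in the summary, so anything
    between the two stated thresholds is left undetermined here.
    """
    if points >= A_CUTOFF:
        return "A"
    if points <= F_CUTOFF:
        return "F"
    return None


def gap_portion_can_be_a(group_grades: Iterable[str]) -> bool:
    """The gap portion cannot be an A if any group earns a C, D, or F."""
    return not any(grade in {"C", "D", "F"} for grade in group_grades)


def approximate_counts(total: int, shares: Dict[str, float]) -> Dict[str, int]:
    """Convert percentage shares into rounded school counts."""
    return {grade: round(total * pct / 100) for grade, pct in shares.items()}


def remaining_share(shares: Dict[str, float]) -> float:
    """Percent of schools not covered by the listed grades (here, D and F)."""
    return round(100 - sum(shares.values()), 1)


if __name__ == "__main__":
    print(approximate_counts(TOTAL_SCHOOLS, SHARE_BY_GRADE))
    print(f"Share left for D and F: {remaining_share(SHARE_BY_GRADE)} percent")
    print(gap_portion_can_be_a(["A", "B", "A", "C"]))
```

The leftover share is the part the summary never states outright: whatever is not an A, B, or C has to be a D or an F.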

Will it work?  Most states will likely win their NCLB waiver requests, thus giving these states and others the ability to enact their versions of AYP 2.0.  But how many years will it take before we know if this latest version of accountability works or not?
  

“Teachers Matter”

Last evening, President Barack Obama delivered his State of the Union Address to Congress and the nation.  The speech focused on the four pillars the President and his team see as necessary for turning around the United States and strengthening our community and our economy.  No surprise for those following the pre-game shows, education stood as one of those four pillars.

Five paragraphs committed to education.  One pointing out our states and districts are cutting education budgets when we should be strengthening them.  One on the importance of teachers.  One on high school dropouts.  Two on higher education and how we fund a college education.  (We have a sixth if you include the President’s call to do something to help hard-working students who are not yet citizens.)

So let’s go ahead and dissect what the President offered up last evening.

“Teachers matter.”

Absolutely.  No question about it.  We cannot and should not reform our K-12 educational systems without educators.  Teachers (and I would add, principals) are the single greatest factor in education improvement.  They need to be at the table as we work toward the improved educational offerings the President and so many others dream of.

“So instead of bashing them, or defending the status quo, let’s offer schools a deal.  Give them the resources to keep good teachers on the job, and reward the best ones.”

Sign me up.  As the son of two educators, the last thing I want to do is bash a teacher (I’ll get in trouble with my mom if I do).  As I’ve said many times on this blog, teaching, particularly in this day and age, is one of the most difficult professions out there.  Most people aren’t cut out to do it, or at least do it well.  We need to make sure our precious tax dollars are being directed at recruiting, retaining, and supporting great teachers.  We should reward classroom excellence with merit pay and other acknowledgements.  But the President is also right in noting we cannot defend the status quo.  We can no longer debate whether reform is necessary.  Reform is necessary.  The discussion must now shift to how we change how we teach, not whether we change.

“In return, grant schools flexibility: To teach with creativity and passion; to stop teaching to the test; and to replace teachers who just aren’t helping kids learn.”

Yes, yes, yes.  Great educators know how to help virtually all kids learn.  They know to tailor their instruction based on data and other research points.  We should be encouraging that and empowering teachers to do so each and every day.  But we can’t lose sight of that last clause (and many may have missed it last night over the cheap applause line of not teaching to the test).  We must “replace teachers who just aren’t helping kids learn.”  In our quest for a great educator in every classroom, we must also realize not everyone is cut out to teach.  We need serious educator evaluation systems that ensure everyone is evaluated, everyone is evaluated every year, and those evaluations are based primarily on student learning.  And, like it or not, student performance tests still remain the greatest measure we have for student learning.  So if we can’t get struggling educators the professional development and support necessary to excel in the classroom, we need to be prepared to transition them out of the school.

And lastly, President Obama’s “bold” call to action to ensure every student is college and career ready.

“I call on every State to require that all students stay in high school until they graduate or turn eighteen.”
And here we have the President’s big educational swing and a miss.  This is a process goal, not an outcomes goal.  Based on AYP figures and recent work on school improvement and turnaround, we know that far too many kids, particularly those from historically disadvantaged populations, are attending failing schools.  This is particularly true of secondary school students.

Why force a student to stay in a school that has long been branded a “drop-out factory?”  Why keep a kid in school until he is 18 when he is only reading at the grade level of an eight-year-old?  Why stick around for a high school diploma when it also requires massive remediation to attend a postsecondary institution?

No, the call should not be to require students to stick around a bad situation, giving us nothing more than a process win.  Instead, we should be focused on improving the outcomes of high school.  How do we demonstrate the relevance of a high school curriculum?  How do we engage kids?  How do we provide choices for a meaningful high school education?  How do we show the college and career paths that come from earning that diploma?  How do we make kids see they want to stick around, and don’t have to be mandated to do so?

At this point in time, we all realize that a high school diploma is the bare minimum to participate in our economy and our society.  For most, some form of postsecondary education is also necessary.  Until we improve the quality and direction of our high schools, and help kids see that dropping out is never a viable option, that mandatory diploma will be nothing more than a certificate of attendance.  We need to make a diploma something all kids covet … not a mandatory experience like going to the dentist.