I have been an AI skeptic for as long as AI has been a buzzword in research and analytics. Why? And what if I told you I’ve changed my mind, and only recently?
I used to say, "AI is nothing more than a non-parametric estimator of probabilities."
I would confidently make this fancy-sounding, buzzword-laden proclamation on stage, even in front of hundreds of technologists. After finishing a PhD in economics focused on the statistical fundamentals of individual choice and causation, I was skeptical of anything that claimed to be obviously better at predicting future outcomes than prior science told us was feasible. I already knew that all statistical models can predict things (some better than others, and some with fewer assumptions than others). AI/ML models doing it with lots of computational power was nothing new.
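To make that quip concrete, here is a minimal sketch of a non-parametric probability estimate: it counts which word follows which in a tiny corpus and turns the counts into conditional probabilities, with no assumed functional form. The corpus and the `next_word_probabilities` helper are invented for illustration; real language models are vastly larger and built differently, so this captures only the spirit of the claim.

```python
from collections import Counter, defaultdict

def next_word_probabilities(corpus):
    """Estimate P(next word | current word) directly from raw counts."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair each word with the word that follows it.
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    # Normalize each row of counts into probabilities.
    return {
        current: {w: n / sum(followers.values()) for w, n in followers.items()}
        for current, followers in counts.items()
    }

# Hypothetical toy corpus, made up for this example.
corpus = [
    "the student passed the test",
    "the student failed the test",
    "the teacher graded the test",
]
probs = next_word_probabilities(corpus)
print(probs["student"])
```

After "student", the estimator splits its probability evenly between "passed" and "failed", simply because that is what the counts say.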
I used to think, "Prediction is a parlor trick."
If prediction is a parlor trick, then AI is nothing more than sleight of hand. When it comes to education, I want us to ask better questions and consider what questions we aren’t asking; AI can’t do that. I want to know how to help teachers and students plan and react and learn without having to do busywork; AI can’t do that. I want parents and families to have good causal information about how to best help their kids grow and become successful adults; AI can’t do that.
Right?
I am starting to believe that AI can do that.
Maybe.
If I now believe that AI can do that, what changed?
Enter generative AI algorithms
What changed was that OpenAI, after years of working in an area of AI research that many thought was a dead end, released a series of products that blew me away. Importantly, products like DALL-E and ChatGPT productized the concept of generative AI: algorithms that can create completely new content based on a simple set of guidelines. These algorithms can not only match patterns, but also generate new ones, sometimes in unexpected ways. ChatGPT has opened my eyes to the possibilities of new interfaces between computers and human experts (like teachers), where the algorithm is not just replicating statistics but can instead act like a superpowered assistant.
But how superpowered? For now, I will focus on a class of models called Large Language Models (LLMs). ChatGPT has shown us just how natural it can feel to converse with such a model. But what could that look like in a classroom context?
As an example, I asked ChatGPT the following question:
To which ChatGPT 4 gladly answered:
What can we notice about this?
- First, (assuming Jed were a real person) by telling ChatGPT that Jed had failed his math test, I would have given PII to OpenAI, a private company.
- Second, any teacher would likely tell you that ChatGPT’s advice is very vague, and it even gives suggestions that may be unproven or misaligned with district policy. (I think Khan Academy is fantastic, but this may not be an approved program or the best course of action for the student.)
- Third, the response looks very professional and well-reasoned, even with a very small amount of context given in the prompt.
Sure, I could have added more context to the prompt, and I could have followed up with various questions to expand on ChatGPT’s response, but at the end of the day, education professionals’ time is precious and limited.
For now, let’s think about two key points that arise from this example that we need to address before AI could be used in the education space:
- Student data are private and have complex permissions associated with them. Only some people are allowed to see data like student assessments, program participation (like subsidized lunch or special education program enrollment), or any other PII, and these are not always the same people. It is typically illegal to pool these data across local and state lines without very specific purposes that have been explicitly authorized by parents and guardians.
- Teaching a child is a complex endeavor that takes place over 20+ years and involves hundreds of adults contributing along the way. Education leaders design school systems to be an interlocking system of curriculum, supports, interventions, and community. Without an understanding of that complex context, these kinds of AI systems are doomed to result in bad outcomes. For instance, in this case, perhaps Jed has already been enrolled in a math support pullout program, and the right answer is for the teacher to coordinate with their math support colleague.
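The first point, that different people may see different slices of a student's data, can be sketched as a field-level permission check. The roles, field names, sample record, and `visible_fields` helper below are all hypothetical and chosen only for illustration; real permission rules come from law and district policy and are far more detailed.

```python
# Hypothetical mapping from staff role to the student fields that role may view.
ROLE_PERMISSIONS = {
    "classroom_teacher": {"name", "assessment_scores"},
    "nutrition_staff": {"name", "subsidized_lunch"},
    "special_ed_coordinator": {"name", "assessment_scores", "special_education"},
}

def visible_fields(record, role):
    """Return only the fields of a student record that the given role may see."""
    allowed = ROLE_PERMISSIONS.get(role, set())  # unknown roles see nothing
    return {field: value for field, value in record.items() if field in allowed}

# Made-up record for the fictional student Jed.
jed = {
    "name": "Jed",
    "assessment_scores": {"math": 52},
    "subsidized_lunch": True,
    "special_education": False,
}

print(visible_fields(jed, "classroom_teacher"))
print(visible_fields(jed, "vendor_chatbot"))  # an unlisted outside party
```

Defaulting unknown roles to an empty set is a deliberate choice in this sketch: an outside service that has not been explicitly granted access gets nothing back.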
It is clear to me that there is a glimmer of something here that may transform education. My next question is, how would that happen? Is it just as simple as connecting these LLMs to the education system? If we want AI in education, we must replicate these AI models, right?
Dipping into the weeds of LLMs
Think of training an LLM like the way we teach students to read. We repeat various evidence-based strategies to support fundamental reading skills—like having them sound out words and responding when they are right (or correcting when they are wrong). At a certain juncture, a child knows how to read words, because their neurons now have adjusted to be able to process the inputs of letters, phonemes, words, sounds, and meaning.
LLMs can be thought of, loosely, the way we think of our own brains. You feed data into them, and the parameters of the model shift slowly based on high-volume repetition. There are various methods for training LLMs (some more productive than others), but the process is modeled after real human learning.
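As a rough illustration of parameters shifting slowly under repetition, the sketch below fits a one-parameter model by gradient descent. This is a stand-in chosen for illustration: the data, learning rate, and number of passes are made up, and actual LLM training involves billions of parameters and far more elaborate methods.

```python
# Toy data following y = 3 * x; the model must discover the 3 on its own.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

weight = 0.0          # the model's single parameter, starting with no "knowledge"
learning_rate = 0.01  # small steps: each example nudges the weight only slightly

history = []
for epoch in range(200):           # high-volume repetition over the same examples
    for x, y in examples:
        prediction = weight * x
        error = prediction - y
        gradient = 2 * error * x   # derivative of squared error with respect to weight
        weight -= learning_rate * gradient
    history.append(weight)         # record the weight after each full pass

print(round(history[0], 3), round(history[-1], 3))
```

No single example teaches the model the answer; the weight drifts toward 3 only because the same small correction is applied over and over.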
Think of the structure of the algorithm like how a brain develops physically. A larger, more complex model compared to a smaller, simpler model is not that different from an adult brain compared to a child's brain. The adult brain and the child’s brain could study and learn for the same amount of time on the same topics, and it is likely that the adult brain would operate at a more complex level. If you have a fifth grader at home, you also know that since they study things we have forgotten, their process for approaching concepts is different from that of adults, and thus they often adapt to change more readily.
This is how it comes to be that a less complex model trained on specific data can be better at some tasks than a more complex one. It’s not that one brain is better than another—it’s that the processing of information and connections made has been trained for a different amount of time, and different pathways have been established (or pruned).
Here is where things get different from humans.
In February of 2023, Meta, the parent company of Facebook, released its version of GPT, called LLaMA, to the open-source world. This was like releasing a fully formed adult brain with no prior knowledge to the public for anyone to copy. It was useful for any researcher who wanted to start learning, for free, how to train the most advanced algorithm out there on new data; however, just like an empty adult brain is useless, this model was also useless, except for research.
A week later, the model parameters from Meta were leaked to the public. This would be like suddenly inserting all the knowledge that Meta had into that blank adult brain. By early May, anyone with interest and a laptop could run one of the most powerful AI algorithms—pre-trained.
What makes this so unusual and powerful is that what leaked wasn’t the dataset that Meta researchers initially trained the model on; it was the trained parameters, the "knowledge" itself. Anyone could now add more knowledge to the brain. Meta spent millions of dollars building this tool that anyone can now use and develop on top of, at no cost.
This is all to say that, at least to me, AI models are going to be commodities that don’t cost very much to develop or implement in the very near future.
If generative AI models are going to be low-cost and easy to run, can they build the future of education?
Not so fast.
It turns out that while the general models themselves will have almost no special value, the data they are trained on in specific contexts matters a lot. Just as a fresh college graduate is filled with general knowledge and some general experience but cannot have an immediate impact in their first job without training and mentoring in the context of the organization where they work, generalized AI models need that training and mentoring in context. Each organization will need to train these models on their private data, so that the models can learn about the things that matter to that organization. More specifically, functional experts within each organization will need to train these models on how to behave when prompted by various queries.
This is already becoming obvious to the commercial world. The major technology platform companies like Microsoft, Salesforce, and Google are already trying to integrate LLMs into their own products for use with private business data. Smaller startups are allowing for specific use case analysis of uploaded content like contracts and legal documents. These companies are leveraging the fact that they own or have access to a very large swath of technically private data (or are asking for folks to release those data to them for analysis) to train LLMs on. I believe that public education data will be much harder to work with for two main reasons: (1) privacy concerns and (2) data infrastructure barriers.
In education, the data that one would need to train an AI model on contain sensitive private data (PII) that absolutely cannot be released to an external company without extensive protection. Because of that, these models—if we want to use them—must be trained inside of education organizations themselves. We can start with the models and training that already exist in the general open-source field, but we need to host and apply those models to private, secure data systems that protect the privacy of the students and families in the system. If we do not, then the future of AI will be controlled by for-profit commercial ventures that will monetize public education—likely in ways we can’t forecast today.
If public control of these models built on private data is your goal, then public agencies first need to connect all their data into a private, secure, interoperable data stream. Importantly, those data must be housed so that LLMs can interact with them in a common way in every district and state; otherwise, every district will need its own expert machine learning scientist. If we do have this common data interface, districts could potentially share the AI algorithm training burden by training the model one district at a time, without ever sharing data across districts.
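One way to picture training one district at a time is to pass only the model's parameters from district to district, so that each district's records stay where they are. The sketch below does this with a single-weight model and invented numbers; the district names, data, and `train_locally` helper are hypothetical, and this is an illustration of the idea rather than a description of how any real system is built.

```python
def train_locally(weight, local_data, learning_rate=0.01, epochs=50):
    """Update the shared weight using one district's data, which never leaves this function."""
    for _ in range(epochs):
        for x, y in local_data:
            error = weight * x - y
            weight -= learning_rate * 2 * error * x  # gradient step on squared error
    return weight  # only the parameter is handed on, not the records

# Hypothetical districts, each holding its own private (x, y) records.
district_data = {
    "district_a": [(1.0, 2.1), (2.0, 3.9)],
    "district_b": [(1.5, 3.0), (3.0, 6.2)],
    "district_c": [(2.5, 5.0), (1.0, 1.9)],
}

shared_weight = 0.0
for name, records in district_data.items():
    # Each district continues training from where the previous one stopped.
    shared_weight = train_locally(shared_weight, records)
    print(name, round(shared_weight, 3))
```

The common data interface matters here because `train_locally` assumes every district's records have the same shape; without that, each district would need its own version of the training code.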
So what happens next?
I think we have a great opportunity to lower the burden on teachers and amplify their expertise using these new tools. I am quite concerned that we are going to allow the commercial space to drive the R&D agenda for these tools in ways that are not necessarily aligned with the public good. In my view, we have a clear path forward for a public-interest-driven AI strategy:
- Continue the interoperability revolution to create common data surfaces across the ecosystem.
- Continue the data modernization strategies that give school systems access to the most powerful data technologies in the world at very low cost.
- Drive the adoption of common, non-commercial data science infrastructures for all education agencies. We believe that EDU is one such infrastructure that the field could rally around.
- Invest in a community-driven, non-commercial, AI data operating system for the education system that can sit on top of any cloud platform and can interface with any LLM or AI model.
- Invest in understanding the legal, ethical, and societal ramifications of the above.
- Equip thousands of educators to use these new tools so they can focus on their craft instead of busywork.
This isn’t a small investment, but the value can be multiple orders of magnitude greater than the cost. If anyone is interested in pushing this forward, please reach out.