Since ChatGPT was released in November 2022, the excitement surrounding AI has been undeniable. For almost two years, every meeting I have participated in, conference I have attended, or thought piece I have read has included an obligatory bow to AI.
It has been exhausting.
But I get it. There is enormous potential to revolutionize K-12 public education by enhancing learning and operational efficiency. The prize is too big not to strive for.
I have, however, started to notice that the shine has come off the new toy. The promises are not coming true nearly as fast as they were made. The sense of newness and magic is being replaced by a flood of the same dull text responses, which are becoming increasingly easy to recognize as LLM output.
Why is this happening? I think it is because we have answered the simple questions like, “What can AI do?” (My answer: Not very much yet, and what it can do is not done very well), and, “Should we be worried about AI?” (My answer: Yes, but only because it is often wrong and quite biased; no, we shouldn’t be worried about it taking teachers’ jobs).
But we haven’t yet started answering the most important question in my mind, which is, “How do we actually enable AI to transform public education?”
Transformational change in education won’t come about by only using generic generative AI tools like ChatGPT, but rather by providing educators and administrators with tools that leverage the data specific to their own context. Just like an office worker’s daily work may be enhanced by AI “layered over” their email, calendar, team chat, policy documents, and more—so too might a teacher’s daily work be transformed by AI “layered over” their students’ attendance data, gradebook, assessment scores, and more. Until then, I fear generative AI will just be a fancy copyeditor that is frequently wrong.
Here is the rub: If we want this better world of contextualized, personalized, generative AI tools available in K-12 education, then the algorithms that drive them need access to as much real student data as possible. The more access the algorithm has, the more helpful it could be (in theory).
That sounds scary: Do we have to share our students’ data with all the algorithms? Let’s stay with this for a moment, though, if only because the ultimate payoff of better education for all is too great. How would we even do such a thing in a way that doesn’t immediately open a Pandora’s box of student privacy risk?
We need an answer to this that is tailored specifically to K-12 education because, unlike in other industries, data in public education are private information held by a public institution. Schools must act as guardians of this information. Entrusting it to external entities that provide AI tools is akin to using a voice assistant without knowing whether it is listening, or uploading photos to social media without knowing their ultimate use. Indeed, we are currently on a path to repeating the history of early social media adoption, where users poured their personal and private information into new, exciting (and privately owned) technologies without a deep understanding of how those private data would be used. We now have a better grasp of the risks, concerns, and rights related to personal data privacy on social media platforms, and yet public school systems currently have few options beyond sharing their students’ private data with privately owned companies.
Right now, there are two paths school systems are taking to share these data—and both require tradeoffs related to control and privacy of the data.
The first path, known as a “walled garden” approach, is for school systems to buy into a large software ecosystem that runs all their systems, such as their Student Information System (SIS) and their Learning Management System (LMS). In this world, the software company already hosts all the school’s data and typically has the rights to use that data for R&D on its products; in turn, the company can sell AI services built on the school’s data back to the school. We may find out that this is the only viable path to large-scale AI in public education. But it clearly comes at the cost of student data privacy and public control of the algorithms used to make decisions about students.
Most school systems in the country are already on this path. Large data companies (such as national SIS and LMS conglomerates) increasingly encourage school systems to take it, since from their point of view it is the quickest way for AI to have a large positive impact on students (not to mention on company profit).
The second path is for school systems to give only small subsets of their student data to different EdTech companies that provide different algorithms and applications. The upside of this option is that no single company controls all the data related to a student, which protects the full dataset from any one provider. It also avoids relinquishing control of the entire dataset to a single private entity, since the elements are divvied up among many providers. The downside is that AI tools will always be less powerful when they can only access a subset of the data. Educators must also continue to navigate multiple tools, user interfaces, usernames, and passwords, further exacerbating the tool proliferation that already plagues them. And paradoxically, this path may increase overall privacy risk: it multiplies the number of parties holding some portion of the data and prevents the school system from retaining full control of it. This is the path that smaller, narrower EdTech companies are encouraging school systems to pursue.
Both paths have advantages and come with tradeoffs; neither is inherently bad. The problem is that both rely on shipping data out to third parties. A district that wanted to keep its data under its own control could build these algorithms and tools in house, but asking any district, of any size, to operate like a software development organization is a bridge too far. A district could in theory buy a product and install it in its own data ecosystem, but to my knowledge no such products exist beyond simple proofs of concept.
So, what is a third path? Investing in and building publicly owned, locally controlled and operated, interoperable infrastructure would enable a future where we could bring compatible AI algorithms and other tools to the school systems themselves, rather than shipping the data out to the companies that provide them.
This is not a theoretical idea. Cloud data and analytics software is trending back toward a new version of on-premises (“on prem”) software, in which vendors install their software and algorithms inside a company or organization’s own cloud and manage updates from afar. The main obstacle to this strategy is that these applications need a common data structure to access; otherwise they are far too expensive to build.
We are making progress on building this common data structure in the field through efforts led by standards bodies and initiatives like CEDS, the Ed-Fi Alliance, and 1EdTech. However, we aren’t yet seeing compatible open-source AI algorithms designed to work with this interoperable data infrastructure inside school systems.
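To make the third path concrete, here is a minimal sketch of what such a “compatible open-source algorithm” could look like: a chronic-absence early-warning check that runs inside a district’s own infrastructure and reads from a locally hosted, standards-based data API. This is an illustration, not a real product; the base URL, endpoint path, field names, and the 10 percent threshold are all hypothetical assumptions, modeled loosely on Ed-Fi-style conventions rather than quoted from any standard.

```python
# Hypothetical sketch: an early-warning algorithm deployed *inside* a
# district's infrastructure, reading from a locally hosted, standards-based
# data API. All URLs, endpoint paths, field names, and thresholds below are
# illustrative assumptions, not a real product or a verbatim standard.
from collections import Counter

import requests

DISTRICT_API = "https://data.example-district.k12.local/api"  # hypothetical
ABSENCE_RATE_THRESHOLD = 0.10  # flag students absent >10% of days (assumption)


def fetch_attendance_events(session: requests.Session) -> list[dict]:
    """Pull attendance events from the district-hosted data API."""
    resp = session.get(f"{DISTRICT_API}/ed-fi/studentSchoolAttendanceEvents")
    resp.raise_for_status()
    return resp.json()


def chronic_absence_flags(events: list[dict], school_days: int) -> dict[str, bool]:
    """Flag students whose absence rate exceeds the threshold.

    Because the data structure is standardized, this same function could run
    unchanged in any district hosting that structure: the algorithm travels
    to the data rather than the data traveling to the vendor.
    """
    absences = Counter(
        event["studentReference"]["studentUniqueId"]
        for event in events
        if "absence" in event.get("attendanceEventCategoryDescriptor", "").lower()
    )
    return {
        student_id: (count / school_days) > ABSENCE_RATE_THRESHOLD
        for student_id, count in absences.items()
    }


if __name__ == "__main__":
    with requests.Session() as session:
        events = fetch_attendance_events(session)
        flags = chronic_absence_flags(events, school_days=180)
        for student_id, at_risk in sorted(flags.items()):
            if at_risk:
                print(f"Student {student_id}: chronic-absence flag raised")
```

The design point sits in the second function’s docstring: once the schema underneath is standardized and publicly governed, the same open-source algorithm can be dropped into any district that hosts it, and the student data never leave the district’s own systems.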
But we now have the technology and momentum to make this vision a reality. Large-scale interoperable data infrastructure, with compatible applications built atop it, is already succeeding in K-12 public education in states like Arizona, Colorado, Georgia, Indiana, Michigan, New Mexico, North Carolina, South Carolina, Texas, Wisconsin, and more. These examples are laying the groundwork for this third path to come to fruition. I am confident we can build a world where student data are not shipped out all over the place.
Our call to action is twofold: Join us in building unified, publicly owned educational data structures, and help build AI tools that operate easily on top of those structures while remaining in the public’s control. Collaboration among educators, technologists, and policymakers can produce interoperable systems that combat data fragmentation and reduce tool proliferation while protecting student privacy. This will not only safeguard sensitive student information but also empower schools to harness the full potential of AI to improve learning outcomes and operational efficiency without compromising ethical standards. Only then can we ensure that the power of personalized, generative AI in education remains firmly within the grasp of public institutions, guided by principles of privacy, unity, and public control.