Mona’s Story
Mona is a four-year-old enrolled in a Head Start program. According to her teachers, she is on track to meet developmentally appropriate goals in three of the five domains of the Early Learning Outcomes Framework. But once Mona moves on to kindergarten in her local K-12 school district, there is no handoff of this information to her kindergarten teacher. Her parents are not able to attend the kindergarten orientation at Mona’s school (where they planned to bring notes from her Head Start teacher to share with her new teacher), because they are both working during that time. Mona is able to thrive in kindergarten despite this lack of information sharing, but her parents have to advocate strongly for her throughout the year to get her additional support where they know she needs it.
As Mona progresses through elementary, middle, and high school, she takes grade-level coursework and earns a mix of As and Bs. When her mother takes a new job in a city one hour away, Mona’s family moves, and Mona transfers schools in the middle of her sophomore year. The classes at her new school feel quite a bit harder than they were at her old school, and her grades fall to a C average. Had better information been shared from her old school to her new school, Mona might have been placed in more appropriate courses where she could have gained the knowledge and skills needed to excel in more advanced coursework.
When Mona graduates from high school, she plans to enroll in a community college in her area and work towards an associate degree in paralegal studies. She needs to provide an official high school transcript to the college, so she calls her high school’s front office to request a verified copy. She has to take the bus from her new apartment near campus to the high school, a 45-minute ride each way, to pick up the transcript and deliver it to the college’s enrollment office. A staff member from the enrollment office calls her the next week to tell her to request course completion records from her first high school, covering her freshman year and the first half of her sophomore year, because her transcript includes only generic names for those courses, and the college needs more information to verify her coursework and place her into the right courses at the college. It is only because of Mona’s persistence and commitment that she is able to enroll in her degree program before the deadline.
When she finishes her degree, she applies for a job as a paralegal in her county’s District Attorney’s office. She brings a copy of her official transcript from the college to the interview, but the interviewer still asks her quite a few questions about what she studied in her classes, what sorts of projects she completed as part of her coursework, and what kinds of field experiences she had in school. Her transcript doesn’t seem to have enough information to convey to the interviewer what she knows and can do. Mona is offered the job, but during her first week, her new boss mentions that he hoped her proactive, entrepreneurial attitude would overcome any gaps in her college preparation.
Why context matters
At each point in Mona’s educational journey, vast amounts of information were generated and stored, from her developmental milestones during Head Start, to her attendance and grades earned throughout K-12, to her community college coursework. But at each transition, it was ultimately up to Mona (or her family) to transfer the right information from one context to the next. It was due to the advocacy and persistence of Mona and her family at every single stage of her education that Mona succeeded. Such advocacy is in fact an implicit expectation built into the educational system, which is one of many features contributing to inequitable outcomes for students.
We know how important it is to build better systems and processes for sharing data across these different educational agencies, and to preserve those data over time. But something fundamental is missing from the data themselves: what we call context.
When it comes to data, context has dimensions of both space and time. Operational context is what we call information surrounding a piece of data from a single point in time for a given student. For example, when Mona was enrolled in Head Start, operational context might include information about what skills Mona was taught in that program, what skills she mastered while enrolled, what additional supports were available in that program, which of those supports were provided to Mona, and so on.
Longitudinal context is how the meaning of data evolves over time in a given context. For example, how do we know that the Head Start program a student was enrolled in 20 years ago is the same as the Head Start program a student is enrolled in today? If we are using data from the Head Start of 20 years ago to evaluate its impact on post-secondary outcomes, how do we know if those conclusions will hold true for students enrolled in Head Start today?
In education, context across both space and time is missing. In a local education agency like a district, we connect data from different places (across schools) and different times (across school years), and we produce student-level granular metrics from these different data elements, such as graduation rates and attendance rates. But we don’t have descriptions of the spaces where and times when those data were generated, collected, or gathered.
We believe in the importance of evaluating data in context by getting more and richer information, which allows you to understand, interpret, evaluate, and make meaning of the data in more equitable ways. Context gives more value to data: Student data without the context in which they were created lose meaning. Context also brings more equity to data: The less context there is around data, the more susceptible those data are to problematic assumptions leading to biased and inequitable decisions.
What might this look like in practice? A simple starting point would be high-level text descriptors surrounding each piece of data. For example, for a data point that indicates the number of credits required for a student to graduate, descriptors could further articulate whether there was an exit exam required, if there was a required sequence of English courses, if there was a computer literacy or PE requirement, and so on. All of these descriptors provide much more nuance and context to the simple number of credits required, in ways that render that data point more informative and useful not just in its original context, but in other contexts it may get transferred to.
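To make the idea of descriptors concrete, here is a minimal sketch in Python of a data point that carries its own text descriptors. The class name `DataPoint`, the field names, and the specific values (24 credits, the descriptor wording) are our own assumptions, invented for illustration; they are not drawn from any existing data standard.

```python
from dataclasses import dataclass, field


@dataclass
class DataPoint:
    """A single value plus free-text descriptors of the setting that produced it."""
    name: str
    value: object
    descriptors: dict[str, str] = field(default_factory=dict)

    def describe(self) -> str:
        """Render the value together with its context as readable text."""
        lines = [f"{self.name}: {self.value}"]
        for key, text in self.descriptors.items():
            lines.append(f"  - {key}: {text}")
        return "\n".join(lines)


# Hypothetical values, invented purely for illustration.
credits_required = DataPoint(
    name="credits_required_to_graduate",
    value=24,
    descriptors={
        "exit_exam": "No exit exam required",
        "english_sequence": "Four-course English sequence required",
        "other_requirements": "One credit each of computer literacy and PE",
    },
)

print(credits_required.describe())
```

The point of the design is that the descriptors travel with the number: any system that receives `credits_required` also receives the notes needed to interpret it.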
The role of governance
If all it took to ease the transmission of data across contexts was to connect the systems together, this would be quite easy. The tricky part is that we do need all of these contexts connected, but without some sort of “big brother” agency that is collecting, monitoring, or exploiting those connected data, since that often scares away (or at a minimum, disincentivizes) organizations and agencies from sharing their data. In education, many stakeholders feel a deep discomfort with an outside group or agency having access to their operational data. For example, a local education agency may have concerns about the state education agency drawing inaccurate conclusions about the district's expenditures without having access to an appropriate amount of context about what motivated them. Or a family may have deep distrust of a researcher having access to their child's data due to a long history of mistreatment or abuse of marginalized communities in the name of research.
Trust, then, is a core issue when it comes to connecting data across contexts. The solution to weak or eroded trust is strong governance—meaning transparent rules and regulations that dictate appropriate use of data.
In our view, the most productive governance path enables local control via decentralized governance. To achieve decentralized governance, data repositories need to be centralized within the local domain, and not at any higher or aggregate level. Then, interoperability is the key to unlocking those centralized data repositories, as permitted by the decentralized governance structure, to allow common elements to be exchanged across local domains without exposing the totality of the data.
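As one way to picture this, the sketch below models a locally centralized repository whose owner decides which data elements each partner may receive. It is an illustration under our own assumptions, not a description of any existing system: the class `LocalRepository`, the `sharing_policy` structure, and the sample record are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LocalRepository:
    """A data store owned and governed by one local domain (e.g. a district)."""
    owner: str
    records: list[dict] = field(default_factory=list)
    # Governance decision made locally: which fields each partner may receive.
    sharing_policy: dict[str, set[str]] = field(default_factory=dict)

    def export_for(self, partner: str) -> list[dict]:
        """Return only the fields the local policy permits for this partner."""
        allowed = self.sharing_policy.get(partner, set())
        if not allowed:
            # Partners with no local agreement receive nothing.
            return []
        return [
            {key: value for key, value in record.items() if key in allowed}
            for record in self.records
        ]


# Hypothetical district, record, and policy, invented for illustration.
district = LocalRepository(
    owner="Example District",
    records=[{"student_id": "S1", "course": "Math 12", "grade": "A",
              "home_address": "withheld"}],
    sharing_policy={"community_college": {"student_id", "course", "grade"}},
)

shared = district.export_for("community_college")
unknown = district.export_for("unlisted_partner")
```

Here the full record never leaves the district. The community college receives only the three elements the district agreed to share, and a partner with no agreement receives nothing.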
We are now at a place where the technology is far enough along to make decentralized governance, locally centralized data repositories, and interoperability a reality. And while we don’t know exactly what this solution is just yet, we have ideas.
Interoperability as transport
In many cases, when people think about interoperability (meaning technology that allows data to be formatted and exchanged between different systems), they think of it as a mechanism for storage or for reporting. Interoperable systems are often set up as a repository of data that are translatable and usable for other systems. Or interoperability is defined as a way to display and report data that come from multiple different systems. This is depicted in the illustration below, where Ed-Fi serves as the interoperable layer:
We posit that fundamentally, interoperability serves as transport. It’s not really about getting data into the right place for use in other places. It's not really about getting data into the right place to be able to report it all in one place. Instead, it’s about getting data into the right place, at the right time, with the right amount of surrounding information to make it meaningful and useful—in other words, context. We'll show you what we mean in later illustrations.
Data neighborhoods
Interoperability is, at its core, about enabling the broader use of data. However, as soon as you broaden the use of data, you have to immediately think about how you responsibly restrict it in transparent ways. In other words, we want to break down the barriers between data flows, while also dealing with the consequences of doing so. Just like installing stop lights at intersections, driving on the right-hand side of the road (in the U.S.), and yielding the right-of-way to oncoming traffic before turning left, we must define the "rules of the road" (that is, the social norms) for the broader data community.
The way we do that is through the concept of what we call data neighborhoods. In the realm of education, a data neighborhood could be K-12 education, early childhood education, institutes of higher education (including community colleges, universities, and career and technical education), adult education, the workforce, the research community, and the policy/legislative sphere. Each of these neighborhoods generates, collects, and uses its own set of data, and in turn, relies on data from the other neighborhoods. In practice, these neighborhoods are typically connected by a piecemeal system of unstandardized roads, bridges, and connectors:
As a student moves along their educational path, they move through some combination of these neighborhoods, traveling from neighborhood to neighborhood. Typically, the student or family serves as the only mechanism of transport along the way—like when a student uploads their transcript while applying to college or a parent registers a student for a new school following a cross-district move. In those (perhaps rare) cases where data do get transferred from neighborhood to neighborhood, they almost always lack context—either operational context, longitudinal context, or both.
For instance, if a college receives a student’s high school transcript (note, this transport happens via the student themselves), the college knows the student took a course called “Math 12” and received an A. What the college does not know (at least not via the transcript data) is whether that was the most rigorous math course offered at the particular school, meaning the student maxed out the capacity of the math program, or whether the student opted out of (or was not eligible for) either an AB or BC calculus course. In this case, the transcript data are lacking operational context, which leads to biased assumptions and inequitable judgments that falsely pass as "neutral" data.
In another example, if the criteria that a state uses to determine whether students are eligible to graduate from high school change in a given year (say, 2018), then what it means to have graduated in 2017 differs from what it means to have graduated in 2019. That longitudinal context may be known to experts within the neighborhood (such as a high school principal or a district administrator), but there would be no systematic or streamlined way of transferring that context out of the neighborhood, for example to a legislator who may be looking at trends in graduation rates over time to inform a funding model or a policy change.
When interoperability is transport, context is built into the data themselves. It’s not a separate component or add-on. This is even more crucial to consider for longitudinal context, as definitions continue to evolve over time. Longitudinal context requires the preservation of context at a point in time, and the connection of context across time points.
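One simple way to preserve context at a point in time and connect it across time points is to keep a dated history of what a data element meant. The Python sketch below assumes a hypothetical history for a state's graduation criteria; the 2018 change echoes the example above, but the 2010 start year and the wording of each definition are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DefinitionVersion:
    """What a data element meant from a given year onward."""
    effective_year: int
    description: str


# Hypothetical history for a state's "graduated" indicator.
GRADUATION_DEFINITIONS = [
    DefinitionVersion(2010, "22 credits; exit exam required"),
    DefinitionVersion(2018, "24 credits; exit exam no longer required"),
]


def definition_in_effect(versions, year) -> Optional[DefinitionVersion]:
    """Return the most recent definition whose effective year is <= year."""
    applicable = [v for v in versions if v.effective_year <= year]
    if not applicable:
        return None
    return max(applicable, key=lambda v: v.effective_year)


def comparable(versions, year_a, year_b) -> bool:
    """True only when both years fall under the same known definition."""
    a = definition_in_effect(versions, year_a)
    b = definition_in_effect(versions, year_b)
    return a is not None and a == b
```

With this history, `comparable(GRADUATION_DEFINITIONS, 2017, 2019)` is false, which flags to someone outside the neighborhood that a trend line crossing 2018 mixes two different meanings of "graduated."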
The key to making this work
Before we can transport data between neighborhoods, we need to in fact “develop” each neighborhood. In other words, before we move data out of a district for use in other neighborhoods, we need to make it available to the district itself to support its own uses. For example, imagine the case of a student living with a grandparent who is chronically ill. The student is often absent from school because they are the sole caretaker for their guardian. What the school might see is a student racking up absence after absence, and perhaps they stop showing up at school altogether. And perhaps the Health and Human Services department has information about this student’s living situation. If the district had systematic access to that kind of information, in secure ways that protect the student’s information, this could fundamentally alter the tools at the school’s disposal for intervening and supporting this student.
This type of transport becomes much more complex once you need cross-institutional, bidirectional information transfer within the same time period and within the same context, and even more complex again when you need transfer across time periods and across institutions. In all of these cases, you must have the right context to make any sound decision with data from other institutions, other neighborhoods, other time periods. The greater the distance between data neighborhoods, the greater this challenge is. For instance, a state agency will know more about the longitudinal context in their districts than a federal agency will know.
This is where decentralized governance to support local control becomes crucial. Decentralized governance across neighborhoods starts with a strong governance structure within each neighborhood, to establish the norms for what data go out of and what data come into that neighborhood, as the image on the left shows.
Equity is a key motivator for incorporating and preserving context around data. In the absence of context, our brains make unconscious assumptions about the data. Those assumptions are always ripe for implicit bias, even more so given there are structural features of the U.S. educational system that perpetuate inequities. Context is one tool for disrupting this feedback loop.
To achieve the ultimate goal of interoperability as transport for the good of all neighborhoods, as the image below illustrates, the first step must be maximizing local use, flexibility, and relevance for each individual neighborhood. Only then will each neighborhood have the right incentives to invest time and resources into building the interoperable infrastructure needed to realize the vision of transport across neighborhoods. How we actually implement this vision of interconnected data neighborhoods in practice will be the topic of a future blog post.