How College Board's Environmental Context Dashboard highlights algorithm transparency vs. explainability issue

The Environmental Context Dashboard is a tool to provide socio-economic metrics on student populations for university admissions officers. Here's a look at the data as well as potential bias and transparency issues.

College Board, the organization behind the SAT and related tests and products, is piloting an Environmental Context Dashboard that aims to surface students who have overcome socioeconomic challenges. The effort raises interesting questions about algorithm transparency versus explainability.

According to College Board, this dashboard is designed to shine a light on "students who have demonstrated resourcefulness to overcome challenges and achieve more with less." Students, however, won't see ECD contents, but College Board noted that it is "looking into how we might make it available to them."

The Wall Street Journal has dubbed College Board's index an "adversity score" and, not surprisingly, press coverage was mixed. In a statement, College Board chafes at the term adversity score and prefers Environmental Context Dashboard. For our purposes, we'll go with the Environmental Context Dashboard, or ECD. College Board officials declined an interview with ZDNet while the effort is being piloted with 50 schools, but the organization has posted an FAQ.

College Board plans to make the data broadly available to academia for free next year. Given the impact of algorithms on our lives as well as the potential for bias, it's worth a deeper dive on the College Board pilot. College Board plans to expand the ECD pilot to more than 150 institutions in the fall to shape the tool. College Board outlined some of the data-driven models in a paper.

Here's a graphical representation of what College Board is trying to capture:

[Figure: data framework for the ECD. Credit: College Board]

This ECD score is based on 31 equally weighted data points about neighborhoods, crime rates, and high schools designed to gauge the environment. Admissions officers will get the score via a dashboard and can use it to add context to an SAT score.

College Board data is largely based on the American Community Survey from the U.S. Census Bureau. Neighborhood data is based on census tracts and high school data is based on census tracts represented by the school.

Here are the factors:

Neighborhood measures

  • Median family income
  • Percentage of all households in poverty (poverty rate)
  • Percentage of families with children in poverty
  • Percentage of households with food stamps
  • Percentage of families that are single-parent families with children and in poverty
  • Percentage of families that are single-parent families with children
  • Percentage of housing units that are rental
  • Percentage of housing units that are vacant
  • Rent as a percentage of income
  • Percentage of adults with less than a 4-year college degree
  • Percentage of adults with less than a high school diploma
  • Percentage of adults with agriculture jobs
  • Percentage of adults with nonprofessional jobs
  • Percentage unemployed
  • College-going behavior
  • Probability of being a victim of a crime

High school measures

  • Median family income
  • Percentage of all households in poverty (poverty rate)
  • Percentage of families with children in poverty
  • Percentage of households with food stamps
  • Percentage of families that are single-parent families with children and in poverty
  • Percentage of families that are single-parent families with children
  • Percentage of housing units that are rental
  • Percentage of housing units that are vacant
  • Rent as a percentage of income
  • Percentage of adults with less than a 4-year college degree
  • Percentage of adults with less than a high school diploma
  • Percentage of adults with agriculture jobs
  • Percentage of adults with nonprofessional jobs
  • Percentage unemployed
  • College-going behavior

When the data is compiled, College Board delivers a score between 1 and 100. A score of 1 would be the least disadvantaged and 100 would be the most. An admissions officer would get a dashboard like this:

[Figure: sample ECD dashboard view]
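College Board has not published the formula behind the score, so what follows is only a rough sketch of one way 31 equally weighted measures could be combined into a 1-to-100 value. It assumes each measure has already been converted to a percentile from 0 to 100, with higher meaning more disadvantage. The function name `composite_score` and the sample values are hypothetical.

```python
from statistics import mean


def composite_score(percentiles):
    """Average equally weighted percentiles and clamp to the 1-100 range.

    Each input is assumed to be a 0-100 percentile where a higher value
    means more disadvantage. This is an illustration only, not College
    Board's undisclosed method.
    """
    if not percentiles:
        raise ValueError("at least one measure is required")
    if any(p < 0 or p > 100 for p in percentiles):
        raise ValueError("percentiles must be between 0 and 100")
    # Equal weighting: every measure contributes 1/n of the result.
    average = mean(percentiles)
    # Round, then clamp so the result stays within 1..100.
    return max(1, min(100, round(average)))


# Hypothetical applicant: 16 neighborhood and 15 high school measures.
neighborhood = [70] * 16
high_school = [40] * 15
score = composite_score(neighborhood + high_school)
```

Because there is one more neighborhood measure than there are high school measures, the neighborhood side carries slightly more of the total in this sketch.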

According to a research paper on the topic, admissions officers from eight selective universities primed dashboard users to "treat students from high adversity backgrounds more favorably relative to the official read, and that this phenomenon occurred in both the control and treatment groups." In other words, those diamonds in the academic rough got a shot.

What's the problem?

Among the parents I've talked to, College Board's ECD is already causing concern. After all, parents strive to move to good school districts and neighborhoods. And now there could be a penalty for what used to be a worthwhile goal.

Yet, the ECD makes sense on many levels. Context based on real data can't necessarily be a bad thing. And who doesn't want to lift up a student who has done well despite numerous hurdles?

Personally, the ECD may have helped me back in my SAT testing days. It's doubtful that my kids will get a boost, but they wouldn't be scored as privileged and would probably land somewhere in the middle.

College Board did respond with a few statements and clarifications. It noted the following:

  • The Environmental Context Dashboard doesn't alter a student's SAT score.
  • It does show how a student's SAT score compares to those of other students in their school.
  • It doesn't take into account any personal characteristics of a student beyond the test score.
  • It does provide admissions officers with better context about an applicant's neighborhood and high school.

In a statement, David Coleman, CEO of College Board, said:

Through its history, the College Board has been focused on finding unseen talent. The Environmental Context Dashboard shines a light on students who have demonstrated remarkable resourcefulness to overcome challenges and achieve more with less. It enables colleges to witness the strength of students in a huge swath of America who would otherwise be overlooked.

There is talent and potential waiting to be discovered in every community – the children of poor rural families, kids navigating the challenges of life in the inner city, and military dependents who face the daily difficulties of low income and frequent deployments as part of their family's service to our country. No single test score should ever be examined without paying attention to this critical context.

Transparency vs. explainability

Kathy Baxter, architect of ethical AI practices at Salesforce, finds the discussion of the College Board score interesting and highlights what's becoming a big issue as companies and organizations use algorithms. The issue: Transparency versus explainability.

Transparency is relatively straightforward and College Board appears to have met its requirements. "Transparency and explainability are interconnected, but different. Transparency communicates what factors are considered, but not how they are being used," explained Baxter. "Explainability is more difficult to do. An example would be something like 'a male between this age in this zip code is going to be charged x amount in insurance.'"

Explainability is much more important in regulated industries because it enables redress. A company can explain how it arrived at a decision, and the customer can counter that the information is wrong or the decision is unfair.
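To make Baxter's distinction concrete, here is a minimal sketch built on a made-up, equally weighted score. Publishing the list of factor names is the transparency part; breaking one specific score down into each factor's contribution is the explainability part. The factor names, the values, and the `explain` helper are all hypothetical and are not drawn from College Board's model.

```python
# Transparency: publish which factors are considered.
FACTORS = ["poverty_rate", "percent_unemployed", "percent_rental_housing"]


def explain(percentiles):
    """Return the score and each factor's share of it.

    `percentiles` maps factor name -> 0-100 percentile. With equal
    weights, a factor contributes its percentile divided by the number
    of factors.
    """
    missing = [name for name in FACTORS if name not in percentiles]
    if missing:
        raise KeyError(f"missing factors: {missing}")
    contributions = {name: percentiles[name] / len(FACTORS) for name in FACTORS}
    return sum(contributions.values()), contributions


# Explainability: show how one specific score was reached, so the
# person affected can spot and contest a wrong input.
score, parts = explain(
    {"poverty_rate": 90, "percent_unemployed": 60, "percent_rental_housing": 30}
)
for name, value in parts.items():
    print(f"{name}: {value:.1f} of {score:.1f}")
```

A breakdown like this is what makes redress possible: if one input is wrong, the person affected can see exactly how much of the score it accounts for.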


Now you can just imagine College Board's pickle here. The ECD score is transparent about what factors are included. But there's no explanation about how the score will be used. Every university may be different so explainability could fall to individual institutions. In addition, students won't know their ECD even though admissions officers will.

And should College Board be truly transparent about the ECD and disclose scores to students, there are other questions to ponder. Would a student with a score of 90 be heartened or discouraged? Would a student with a score of 10 feel like a failure if they had an SAT score of 1,000 yet were deemed privileged? What's the balance between psychology, algorithms, and transparency?

Simply put, College Board has a Google algorithm issue. Google tells you what factors it considers in search rankings, but never discloses how the sausage is made. Why? Companies will game Google. College Board has the same risk.

Kartik Hosanagar, a professor at Wharton, explains the conundrum. We recently caught up with Hosanagar to talk about AI bias.

It appears that admissions officers are getting more information about applicants than in the past. But it's problematic if the student doesn't see this information and only an admissions officer does. It'll make the admissions process seem even more impenetrable to students and parents. It is only fair that such adversity scores should be accessible to students and some basic information on the factors that go into the calculation be made available to students.

That said, the more information that is made available to students and parents, the greater the risk of it being gamed in some way. I understand those risks. But if you are going ahead with a system that is going to guide important decisions such as college admissions, you have to make the math models more transparent.

Sameer Maskey, CEO of Fusemachines, a company that aims to eliminate bias in models, argued that College Board needs to be more transparent. "College Board has not disclosed the algorithm behind the system that spits out a score of 1 to 100 and it is a black box as of now. Additionally, scores are not provided to the students," said Maskey.

The larger issue for the ECD is that there can be bias in it and, like all algorithms, the score can't possibly catch all the nuances involved. "Adversity score tries to measure the right things and there are some measures that do make sense, but it doesn't cover many cases that are not taken account of," said Maskey.

Some of those situations are obvious:

  • A student could be in a wealthy zip code with two drug-addicted parents, but look privileged.
  • Learning disabilities.
  • A stable single-parent household versus an unstable two-parent one could have two vastly different ECD scores.
  • A higher-income family that moves to a poor neighborhood junior year of high school.

The bias discussion will center on extreme ECD scores and those that fall outside the middle. What's the weight of an ECD score of 60 vs. 40? Each attribute in the ECD score is weighted equally, but does that make sense? And finally, what are the bias implications for an ECD score when combined with other metrics such as SAT, GPA, and other factors? What are the best practices for using the ECD score?
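One way to probe the equal-weighting question is to recompute a score under different weights and see how far it moves. The sketch below is purely illustrative: the four measures, their percentiles, the alternative weights, and the `weighted_score` helper are invented for the example and say nothing about how College Board actually weights its data.

```python
def weighted_score(percentiles, weights):
    """Weighted average of 0-100 percentiles; weights need not sum to 1."""
    if len(percentiles) != len(weights):
        raise ValueError("one weight per measure is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(p * w for p, w in zip(percentiles, weights)) / total


# Hypothetical percentiles: three income-related measures, one crime measure.
measures = [80, 80, 80, 20]

# Equal weight per measure.
equal = weighted_score(measures, [1, 1, 1, 1])
# Alternative: treat the three income measures as one theme that counts
# the same as the crime measure.
by_theme = weighted_score(measures, [1, 1, 1, 3])
shift = equal - by_theme
```

In this toy case the two schemes differ by 15 points, which shows why the choice of weights is worth explaining and not just the list of factors.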

Add it up and College Board's ECD raises some interesting data science and transparency questions. With any luck, both will improve, given that your kid's admission to a university may be helped or hurt by it.

ZDNET'S MONDAY MORNING OPENER

The Monday Morning Opener is our opening salvo for the week in tech. Since we run a global site, this editorial publishes on Monday at 8:00am AEST in Sydney, Australia, which is 6:00pm Eastern Time on Sunday in the US. It is written by a member of ZDNet's global editorial board, which is comprised of our lead editors across Asia, Australia, Europe, and North America.
