Social Good Meetup
A few weeks ago, I took advantage of one of the perks of living a train ride away from Chicago: I went to a Data Science Meetup. Many other cities have meetups for data science, but there are probably only a handful with many more members. While data science can be used to increase profits or to help me dominate March Madness competitions, most of my favorite applications of data science would be classified as being for social good. This particular meetup featured four such applications. Unfortunately, I was not personally involved in any of these projects, so I can't try to bore you with fun details.
Since it was almost a month ago now, I will just mention my favorite of the four. For those interested though, here is a list of the different presentations with links to short synopses of the projects:
- Predicting Students That Will Struggle Academically by Third Grade
- Preventing Juvenile Interactions with the Criminal Justice System
- Identifying Frequent Users of Multiple Public Systems for More Effective Assistance
- Improving Government Response to Citizen Requests Online
If you know what a huge education addict I am, it might not surprise you that the first one caught my attention most. With all sorts of educational resources becoming so readily available online through sites like Khan Academy, Coursera, and dataisbaye ;), we have plenty of data with the potential to improve the learning process. Even though the massive scale of a site like Khan Academy really excites me, there is still plenty to learn from data in a more traditional educational structure. This project and presentation used several years of data on students from kindergarten to third grade at Tulsa Public Schools. This age range was targeted because research suggests that third grade is a rather pivotal point in a person's education. There was also an emphasis on reading skills in particular.
The aim of the data analysis was not just to identify students who look to be struggling, but also to recommend a course of action to correct the problem. I found the presenters' distinction between actionable and inactionable features really interesting. In general, the term feature is used to indicate some piece of data that you use to make a prediction (the thing being predicted is often referred to as the target). So my credit score might be one of many features used to determine the risk of lending me money, for example. The distinction between actionable and inactionable features is that something can be done about the actionable ones. An inactionable feature might be a student's age, because you can't do anything to change a student's age. But an actionable feature might be the number of pages of reading you assign him every week.
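To make the distinction concrete, here is a minimal Python sketch. The feature names, the values, and the choice of which features count as actionable are all made up for illustration; none of them come from the Tulsa project.

```python
# Hypothetical student record; names and values are illustrative only.
student = {"age": 8, "attendance_rate": 0.9, "pages_per_week": 0}

# Assumption: only the reading assignment is something a teacher can change.
ACTIONABLE = {"pages_per_week"}


def split_features(record, actionable):
    """Split a record into (inactionable, actionable) feature dicts."""
    fixed = {k: v for k, v in record.items() if k not in actionable}
    levers = {k: v for k, v in record.items() if k in actionable}
    return fixed, levers


fixed, levers = split_features(student, ACTIONABLE)
print("inactionable:", fixed)
print("actionable:", levers)
```

The target (say, whether the student is on track) is kept out of the record on purpose, since it is the thing being predicted rather than an input.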
Normally, I would think of these actionable features as possible target values. For instance, given student XYZ's age, grades, and attendance rate, I suggest giving him 5 pages of reading every night, or a one-on-one session focused on reading every week, or no more recess :o. And maybe this is how they actually went about the machine learning involved in making these suggestions, but it reminded me of Keith Bauson's and my failed machine learning project.
The idea of our project was to reverse a classifier of images of digits so that instead of being given an image of a digit and producing a classification, it would take a possible classification and produce an image that it believed to fit that classification. Long story short, we successfully reversed something, but the something we reversed wasn't complex enough for the job, so it didn't turn out too great. Oh well, Dr. White didn't follow through on his threat to fail us. Anyway, the something that we did manage to reverse might actually work fairly well for a problem like this one. It might be tricky, but I at least like to pretend it would be possible. The classifier would essentially just say whether or not a student is on track. Then, if the student is not on track, you tell the classifier to operate in reverse mode on that student, holding all of the inactionable features constant but allowing it to modify any actionable features. It will then imagine that same student succeeding and what it took to get him there. Maybe he never got to go to recess again, but now he knows how to read. Sounds like a bad deal to me, but I'll let his teacher make the call.
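Here is a rough sketch of what that reverse mode could look like. To be clear, this is not what the presenters did and not what Keith and I built: it assumes a tiny logistic model with hand-picked weights, hypothetical feature names, and plain gradient ascent that only touches the actionable features.

```python
import math

# Hypothetical features and hand-picked weights, for illustration only.
WEIGHTS = {"age": 0.1, "attendance_rate": 2.0, "pages_per_week": 0.08}
BIAS = -3.0
ACTIONABLE = {"pages_per_week"}  # assumption: the only feature we may change


def predict_on_track(student):
    """Logistic model: probability that the student is on track."""
    z = BIAS + sum(WEIGHTS[k] * student[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def reverse(student, target=0.8, step=10.0, max_iters=10000):
    """Nudge actionable features until the model predicts success.

    Inactionable features are copied over and never modified.
    """
    imagined = dict(student)
    for _ in range(max_iters):
        p = predict_on_track(imagined)
        if p >= target:
            break
        for k in ACTIONABLE:
            # For a logistic model, dp/dx_k = p * (1 - p) * w_k.
            imagined[k] += step * p * (1 - p) * WEIGHTS[k]
    return imagined


student = {"age": 8, "attendance_rate": 0.9, "pages_per_week": 0.0}
suggestion = reverse(student)
print(round(predict_on_track(student), 2), "->", round(predict_on_track(suggestion), 2))
print("suggested pages per week:", round(suggestion["pages_per_week"], 1))
```

With a real classifier the weights would be learned from data, and you would probably want limits on how far an actionable feature can move so the suggestion stays realistic.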
So that was basically my cool takeaway from my first Data Science Meetup in Chicago. I was hoping to have a more interesting post ready by now, but my progress on it has been too slow. I'm basically working on the first step toward making Keith's and my old project actually work. By the way, the video above shows someone who has accomplished what our main goal was. (We did have an added twist that would have made ours not completely unoriginal.) Hopefully, someday I will show off what we intended to do.