Dan O’Neil and Lola Chen on Big Data at PechaKucha Night with the Chicago Architecture Foundation

Tonight I co-presented at the Chicago: City of Big Data Pecha Kucha with my colleague and friend Lola Chen.

Here’s the presentation, along with complete text below.

1.
I’m Dan O’Neil, and I run the Smart Chicago Collaborative, a civic organization devoted to improving lives in Chicago through technology. I’m here with Lola Chen, a community advocate here in Chicago. We are going to talk about the role of humans in big data in an urban environment.

2.
I think it has a great role to play in helping understand how to run a city. The understanding of facts is critical to a just society. And what makes sense for other segments of our culture and economy can make sense for government.

3.
And much of my career has been devoted to data. I’ve made data-driven web products for the last decade. Smart Chicago Collaborative is a national leader in the creation of civic apps. We were the impetus behind bringing Open 311 to Chicago. I guess the point is, I know of what I speak.

4.
In my work at Smart Chicago, I’ve come to deeply appreciate the value of humans. They make all data. Data is a subset of humanity, not the other way around. I’ve seen first-hand how the fetish of data can make everything go wrong.

5.
So I am dubious of any discipline that seeks to help people that doesn’t seem to really include people in meaningful ways. Remember how stoked Burgess Meredith was in the Twilight Zone when all the people were gone and he was left with his books?

6.
Pretty much every time I see something in the world of “big data” or “predictive analytics”, there is never any mention of humans. As if the machines are autochthonous, indigenous, come from nowhere and know everything. Empty of humans.

7.
But of course humans have made everything. And they are the most versatile and capable objects on earth. Burgess Meredith got pretty bummed when he immediately broke his glasses and couldn’t read any of his glorious books. His myriad word repositories were of no use. If only there was one other human left to read to him.

8.
I’ve come to know Lola through the OpenGovChicago meetup and she’s helped me greatly in my work at Smart Chicago. She is an amazing Chicago resident. She values data and technology, and is one of the best humans I know.

9.
Lola Chen is the master of the email. As I was preparing for this event, and I was pondering the value of humans in big data, she wrote me one of her missives. In it, she wrote, “Any alert person can ride the streets of Chicago and see the pattern of pothole problems. The ride might take 4 hours or so. The making notes might take 1 hour.” This is what I mean. This is the value of humans. So I yield the remainder of my Pecha Kucha to the great Lola Chen.

10.
Hi there, my name is Lola Chen, a self-confessed extreme data hog. I moved to Lincoln Park in 1969, right around when the federal government declared the area the first urban renewal blight zone. I first bought properties in Bucktown in 1984. I moved to East Garfield Park in 1998. All throughout, I collect data.

11.
Everywhere I go I ask the question “when might data be flawed”. I interviewed all sorts of residents and visitors from all over the world.  All have seen Chicago Potholes. Data can be incomplete, biased, omitted, inaccurate, misclassified, or falsified. I have seen all of these.

12.
Here’s a practical example of the lack of data sharing. I had parked far from the curb due to a deep pothole. They gave me a ticket for being more than 13 inches from the curb. One piece of data that should relate to another. I GOT OUT OF THE TICKET.

13.
Data can be falsified faster than you think. I monitored grass cutting in vacant lots owned by the City. The workers knew they had GPS installed on the tractors, and they went up and down the lot, showing through data that the lot was cut. But they lifted the blade so that no grass was cut. The lot was marked as done. It was not.

14.
Here we have a KINZIE INDUSTRIAL CORRIDOR series of long potholes going down the street that are DEEP. Chicago is currently promoting a return to manufacturing with hopes of creating new jobs. THAT IS WHY THESE POTHOLES ARE RELEVANT. There are patterns, visible, if you look.

15.
This lovely Lincoln Park ALLEY Pothole has a mural as a backdrop. Stanley’s is a neighborhood fruit market institution opened by Greek immigrants in the 1960s who have succeeded in expanding almost every decade. DON’T REALLY THINK AN ALGORITHM COULD PREDICT POTHOLE/MURAL

16.
This Humboldt Park CATCH BASIN Pothole seems to be accessorized with roadwork paraphernalia. Today the paraphernalia runs for almost 1 block; it has been there so long. The paraphernalia becomes permanent. COULD AN ALGORITHM SPEED UP ROAD REPAIR?

17.
This sewer pothole was misclassified as fixed. It was not, however, fixed. This is shoddy workmanship that leads to multiple visits to the same issue, leading to more work for contractors and more dollars out of our pockets. The data saw “fixed”, but it was nothing of the sort.

18.
Here you can see the impetus of my note to Dan. Any alert person can see the issue with the seam in the asphalt. I’ve seen it all over, and reported it to a number of commissioners. The Inspector General is now looking into this.

19.
The City collects and stores enormous amounts of data, but the data is flawed, and there’s not enough. We need drones, satellites, patrol cars, garbage trucks, all collecting data and making 311 requests. There are no mechanisms to address these flaws. We need more people— smart City workers who know the data— cleaning this up. Let’s do it.

20.
So thank you to all the people who helped me put this together. If anybody in the world knows how to fix potholes, please send us some ideas!
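Lola’s ticket story in slide 12 is a case of two records that should have been related to each other. Here is a minimal sketch of that kind of cross-check. The record shapes, the block-level address key, and the sample rows are all hypothetical illustrations, not the City’s actual schemas or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PotholeReport:
    block: str   # hypothetical block-level key, e.g. "1800 N EXAMPLE ST"
    status: str  # "open" or "closed"


@dataclass(frozen=True)
class ParkingTicket:
    ticket_id: str
    block: str
    violation: str


def tickets_near_open_potholes(tickets, reports):
    """Return the tickets issued on a block that has an open pothole report."""
    open_blocks = {r.block for r in reports if r.status == "open"}
    return [t for t in tickets if t.block in open_blocks]


# Made-up sample rows, for illustration only.
reports = [
    PotholeReport("1800 N EXAMPLE ST", "open"),
    PotholeReport("2400 W SAMPLE AVE", "closed"),
]
tickets = [
    ParkingTicket("T1", "1800 N EXAMPLE ST", "parked too far from curb"),
    ParkingTicket("T2", "2400 W SAMPLE AVE", "parked too far from curb"),
]

flagged = tickets_near_open_potholes(tickets, reports)
print([t.ticket_id for t in flagged])
```

A match here would only flag a ticket for a human to look at; as the slides argue, someone still has to go and see whether the pothole is really there.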

Complete Data Set of Medicare Payments to Doctors and Suppliers in Chicago

On April 9, 2014, the U.S. Centers for Medicare and Medicaid Services released data on the amount and type of billing that individual doctors and institutions submitted to the Medicare program in 2012. Medicare pays for health care services to most persons aged 65 years or more and to persons who have a disability.

We have extracted the 8,104 records for “physicians and other suppliers” in the database that have an address in Chicago. (A separate database contains inpatient and outpatient charges of institutions such as hospitals.) This file may be viewed and downloaded by clicking here. Some highlights of the data are as follows.

Open and Online: Accessing and Using Health Data at the Public Health Informatics Conference

Today I’m participating in the session called, “Open and Online: Accessing and Using Health Data” at the Public Health Informatics Conference in Atlanta. Here’s the description:

This session will present “8 Principles of Open Health Data” to guide management of, access to, and governance of de-identified non-aggregate health data. Presenters will discuss the use of an online interactive Disability and Health Data System that uses Behavioral Risk Factor Surveillance System disability data and will present a framework for capturing newborn admission data from hospitals.

If you care about these issues, please consider joining the Health Data Liberation meetup group, which is meeting tonight at the Opportunity Hub (“Atlanta Intro to the 8 Principles of Open Health Data”), right next door to the PHIC Conference.

Join us in this fight.

Livestream of “Putting Health Data to Work in Our States and Communities”

This Friday, the Health Data Consortium will be hosting a two-day event that will talk about how to Put Health Data to Work in our Communities. As we move through the day, we’ll post the videos below. Our broadcast will begin at 8:30am CST. If you don’t see the newest stream, please refresh your browser.

Afternoon Panel 3 (at 3:05)

Previous videos below:

Two Great Illinois-Focused Health Data Events

U.S. CTO Todd Park at Healthbox Investor Day here in Chicago

There are two great statewide health data events coming up on November 8 and 9 in Chicago.

Putting Health Data to Work in Our States and Communities

First up, on Friday, November 8, is A Health Data Consortium Event: Putting Health Data to Work in Our States and Communities.

The event is organized in partnership with the Health Data Consortium, the State of Illinois, the California HealthCare Foundation, and Robert Wood Johnson Foundation. It’s a gathering of thought leaders from the private, nonprofit, and government sectors confronting the most pressing health data issues in the U.S. health care system at the state and local level.

The day-long event will be the first of a series of Health Data Consortium forums focusing on state and local health data successes, challenges, and opportunities.  Invited speakers for the event include:

More speakers and panelists to be announced. Register here for the event on November 8.

Illinois Health Datapalooza

The next day, Smart Chicago will be helping host the Illinois Health Datapalooza on November 9th at 1871. The datapalooza brings together policy makers, health care practitioners, web developers, designers, and data scientists to find ways to make health data a deeper part of the technology scene in Illinois.

The event is organized by the Health Data Consortium, the US Department of Health and Human Services, the Illinois Department of Public Health (IDPH), the Illinois Department of Commerce and Economic Opportunity (DCEO) and the Smart Chicago Collaborative.

Christopher Whitaker, consultant and writer for Smart Chicago, has done lots to prepare for this event and will help guide the activities.

The morning sessions will be skill-sharing roundtables with representatives from Socrata, ESRI, and Healthdata.gov on hand to talk about available tools and resources for working with open health data. Midday activities will include brainstorming sessions on current challenges that healthcare policy makers and practitioners have in the field and an exploration into how civic innovation could help address them. The afternoon will feature unconference sessions, where attendees can propose sessions on anything, from the new healthcare.gov to using Twitter to identify flu trends.

Register for the November 9th event here.

A Good Idea, on the Side of a Bus: Get A Flu Shot

Lots of work to be done

We’ve talked a lot about the value civic partnerships have in creating healthier cities and how Chicago has been producing an impressive number of health-related civic apps. However, given the scope of the health care issue at both the local and national levels, there is much more work to be done.

For the past few years, the Health Data Consortium (a coalition of governments, academics, and health care providers formed to liberate health data) has hosted Datapaloozas to find innovative ways to use health data. To date, these events have always been held in Washington DC. This event will be the first of a series of regional gatherings that will bring the focus of health data to the state and local level.

There is an immense opportunity to harness health data into civic startups, to find ways to improve service delivery, and to use predictive analytics to help prevent disease. What’s needed is collaboration between civic technologists and health care practitioners.

We hope you’ll join us.

City of Chicago Launches the First Comprehensive, Public Data Dictionary

Today the City of Chicago launched the City of Chicago Data Dictionary, a single, comprehensive database catalog for the City of Chicago and City of Chicago sister agencies. The data dictionary contains detailed information on every data set held by City agencies and departments, how and if it may be accessed, and in which formats it may be accessed.

The City of Chicago Data Dictionary marks an important advance in open government data because it provides vast insight into how local government works. In concert with the City’s data portal, which is one of the largest raw data stores for a municipality anywhere, residents can now download available data, as well as examine the structure of all the data the City uses to make things work around here.

Tom Schenk Jr, Director of Analytics and Performance for the City of Chicago, announced the launch at the Code for America Summit in San Francisco. The City also published the underlying code for their data dictionary (titled “metalicious”). This code gives governments, businesses, and nonprofits (any organization that maintains multiple databases) a great resource for publishing their own data dictionaries.

U.S. Ignite Application Summit and the Future of Gigabit Chicago

Last year I attended the US Ignite launch event at the White House (see full video here), where a number of Obama administration officials made a series of announcements about programs around broadband policy. It was a wide-ranging and mind-boggling series of speakers, and I wrote up some thoughts about what it all meant for Chicago.

Executive Office Building, Washington DC

This is an age of conception— we are limited only by our imaginations

Since then, I’ve continued to take interest in US Ignite and their efforts to foster the creation of next-generation Internet applications that provide transformative public benefit. The investments made here in Chicago, including the Gigabit Squared project that includes $2 million of investment from the State of Illinois as well as the Broadband Challenge from the City of Chicago, show that Chicago is very much a part of the Gigabit future.

What has struck me most, as I follow this work, is how far we have to go in terms of conceiving what this next-generation network looks like for regular people.

That’s why we’re a sponsor of the US Ignite Application Summit being held in Chicago June 24th – 26th.

What could you build if you weren’t restricted by the limits of network speed and latency? What if your network could support gigabit download and upload speeds? What if the power of cloud data centers wasn’t located on the east coast, but placed in your own backyard? What would you build?  What businesses could you launch if there were no limits?

That’s what we want to find out at this three-day event, running from June 24th to June 26th at the Allegro Hotel and UIC. We’ll be posting regularly from the Summit, so follow along on our Twitter and Facebook accounts.

Road to Government 2.0: Projects and Publications Around Analytics & Technology

Road to Government 2.0: Technological Problems and Solutions for Transparency, Efficiency and Participation

Last year I (along with Brett Goldstein, Chief Data Officer and Chief Information Officer for the City of Chicago) participated in FOCAS 2012: Towards Open and Innovative Governance, a conference run by the Communications and Society Program at the Aspen Institute. They recently published this report based on that conference: Road to Government 2.0: Technological Problems and Solutions for Transparency, Efficiency and Participation. Here’s a PDF of the publication and a relevant snip from the portion I worked on:

Solution 4: Alert System

Even with a heavy media push, many government services may slip by citizens, especially the underprivileged. The problem is, “I do not know what public information and services are available to me when I need and want them,” suggested Caitria O’Neill.

Speaking for another FOCAS working group, O’Neill proposed an opt-in government alert system that would signal citizens as they encounter opportunities for government services, such as moving to a different address. Such a system could also prevent duplicating government services and save on wasted advertising spending. The geolocation-sensitive system would be built out in three phases.

Phase 1: Build a framework for the system. The (very) specific checklist for the framework includes: “A large distributed NoSQL architecture that is cloud-based, that is able to answer spatially relevant queries via a RESTful API,” and that is powered by a combination of tools, such as Hadoop, MongoDB and PostGIS. Governmental and nongovernmental data sources can populate the system, and the team recommends public-private partnerships to maximize the available sources.

Phase 2: Develop and gather information about users, what they might use the information for, and what they need. “Units of government gather information about the consumption of services all the time in the normal course of business. They track things like who is obtaining business licenses and for what purpose, who has a driver’s license, who receives a particular benefit, and so on. There’s nothing new about this and no new systems need be made,” suggested Smart Chicago Collaborative Executive Director Daniel O’Neil.

Phase 3: Marketing. The group preferred that the government provide the service but that it be an open platform for others to offer apps to citizens and consumers. They argued that a private-sector solution “will likely not be provided for free or with the same level of integrity.” Thus they suggested the platform begin with government and foundation funding, as a fee-for-service solution is unlikely to be developed unless the government acts first.

Precautions must be made as to who governs the data and how active the government will be in targeting and advertising to citizens.
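The “spatially relevant queries” in Phase 1 can be illustrated with a small sketch. This is not the architecture the report describes (there is no Hadoop, MongoDB, PostGIS, or REST layer here); it only shows the core question such a system would answer: which services are near this person? The service names, coordinates, and the `services_near` function are assumptions made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def services_near(lat, lon, services, radius_km):
    """Return names of services within radius_km, nearest first."""
    hits = []
    for s in services:
        d = haversine_km(lat, lon, s["lat"], s["lon"])
        if d <= radius_km:
            hits.append((d, s["name"]))
    return [name for _, name in sorted(hits)]


# Hypothetical services with made-up coordinates, for illustration only.
services = [
    {"name": "Service A", "lat": 41.880, "lon": -87.630},
    {"name": "Service B", "lat": 41.900, "lon": -87.650},
    {"name": "Service C", "lat": 42.050, "lon": -87.700},
]

nearby = services_near(41.881, -87.630, services, radius_km=5.0)
print(nearby)
```

In a real system this filter would presumably be pushed down into a spatial index rather than looping over every record, but the question being asked is the same.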

This overall concept fits into much of the work Brett has been leading here in Chicago, including the Chicago SmartData Platform, which won $1 million from the Bloomberg Philanthropies Mayors Challenge.

Elliot Ramos of WBEZ did a pretty good job of pulling the City’s various projects together in this post: City tech wonks add toys to Emanuel’s utility belt (disclosure: WBEZ is a grantee of the Chicago Community Trust under the Civic Innovation in Chicago project). Here’s a hefty snip:

While there, Goldstein touted several projects his department has initiated. Many were in testing stages, amounting to Chicago’s own version of Google Labs.

Within the walls of the Daley Center, Goldstein’s department creates tools, utilizing the mountains of data to inform city managers about the inner workings of the city — sometimes in real time.

The project names are whimsical, but their use could very well alter the way city departments respond with services, perhaps pre-emptively.

Among the tools: Project Unicorn, which was recently renamed Chirp, on a submission to the Knight News Challenge. The city seeks to use that program “to act on city service issues identified via social media — eliminating the need to visit City Hall, call 311, or download special applications,” according to the project submission.

The tool, currently being tested by Goldstein’s department, would allow the city to monitor location-based Tweets and then respond to requests such as street-light outages or graffiti removal.

The city’s also testing Project Falcon, renamed on another submission to the Knight News Challenge as Scout.

About the grant submission, Goldstein said Scout would “aggregate data sources based on location … Applications built using this interface will enable residents to interact with data in a way that’s structured around their day-to-day lives.”

This is above and beyond the SmartData Platform, a separate program developed with funds from the Bloomberg Mayors’ Challenge, according to a spokesperson from Goldstein’s department. The platform’s purpose is to allow City Hall to analyze millions of lines of data in real-time and, according to the city, make “smarter, earlier decisions to address a wide range of urban challenges.” The city won a $1-million prize from Bloomberg Philanthropies to spend on the project.

Another effort aims to better visualize data using unconventional techniques. This one, dubbed Project Batman, will utilize an immersive, multi-display system called “The Cave.”

The Cave, housed at the University of Illinois at Chicago, has already been used by researchers to visualize environments or biological models.

The display is reminiscent of the computer used by Tom Cruise’s character Chief John Anderton in the 2002 movie Minority Report. That movie is often cited for its near prescience in predicting the touch-and-swipe interfaces common to iPhones and iPads.

Smart Chicago is deeply interested in helping our founding partner, the City of Chicago, move forward on these topics as we execute on our own projects like FoodborneChicago.