Solving Public Problems - Topic 4 - Module 4 - Using Open Data to Define Your Problem

Uploaded by

Imani Grimes

Hi and welcome back.

In earlier modules we progressed from learning to define our problem to learning
how to do so with the use of data and evidence. In fact, we
have two modules: one on data analytical thinking, explaining why to use data to
define your problem, and another where we focus on strategies for how to do so,
learning to use data in practice to accelerate your own projects. Okay, now, we're
going to deepen our discussion of data to introduce you to one of the most powerful
and important governance innovations of the last decade. That's the policy of open
government data. As we're going to see, open data policies are a key enabler for
one of the most important sources of information and evidence that you will be able
to avail yourself of when defining your problems. Okay, let's dive in and get
started. By the end of this module, you should be able to do a few things. First,
you're going to be able to define open data and open data policy. Second, you're
going to understand how open data policies can be used to make data available to
the public, including to you. And third, we want to really understand by the end of
this how open data might be available to you to use when defining your problem.
Let's start with a story as we usually do. Edmond Halley, the astronomer who lent
his name to Halley's Comet, published an article on annuities in 1693. His
population table was based on data that was collected for the years 1687 to 1691
from the city of Breslau -- today it's called Wrocław (Vrotzlov). It was data
collected by the Protestant pastor of the town, Caspar Neumann. This work is now
seen as a major event in the history of demography and the first major work of
actuarial science. What was so wonderful about this project is that it illustrates
the value of sharing and publishing data openly. By giving Halley the raw
information, Caspar Neumann enabled more value and insight to be created than had
he simply kept the data for himself. Such collaboration is what really makes open
data truly transformative. The organization or individual that collects and
maintains the data is not always the one with the exclusive ability to use the data
well. By opening up and sharing data, institutions can enable the collaboration of
people with diverse skills and talents and insights to work together to generate
more value from data. By making data open, you're going to enable others to bring
fresh perspectives, fresh insights, and additional resources to your data, and
that's when it can really become valuable to you and to others for public problem
solving. Okay, what is open data specifically? Governments and other organizations
have always collected data. Government gathers information from companies when
it regulates, for example; it tracks statistics about the economy
and society in its role as policy-making body; and it collects data from citizens
in its role as a provider of public goods and services. Universities, companies,
and others also regularly collect data. But what distinguishes open data from other
types of data is that it is publicly available, it can be freely accessed and used
and -- and this is important -- it's capable of being processed by machine. Okay,
that is to say, to be considered open, data has to be both technically and legally
accessible. To make it technically accessible, data must be available in a form
that a computer can use and access so that the data can be analyzed. To be legally
accessible, data must be licensed in such a way that anyone can use it and reuse
the information without fee and without restriction or condition. When data is both
legally and technically open, then anyone -- whether they are the collector of the
data or not -- when they have the right tools, can then create sophisticated and
useful tools, conduct analysis across data sets to enable empirical problem
solving, and use data to advance both social good and potentially economic growth,
as we'll see. Okay, let's take an example from Mexico, a project called Mejora tu
escuela. Created by the Mexican Institute for Competitiveness, or IMCO, Mejora tu
escuela is an online platform that makes government data about Mexico's schools
publicly available. The website provides parents with comparative data so that they
can compare their own school's performance with that of other schools. This
empowers parents and students to demand better quality education for their
children. Mejora publishes expenditure data as well, which gives activists and
administrators, policy makers and journalists the means to dig deeper to spot fraud
and corruption and to advocate for change, ultimately. And this is exactly what
happened in 2014 when a report by IMCO revealed that over 1400 teachers on public
school payrolls were supposedly, according to the data, more than 100 years old
with most having exactly the same birthday and most suspiciously of all earning
more than the president of Mexico. No, this was not a case of the school board
having discovered Ponce de Leon's mythical fountain of youth. Rather the story of
Mejora tu escuela illustrates how when a government -- or any institution, for that
matter -- makes information free of charge and readily downloadable in digital
form, such open data can then solve problems. In this case federal authorities had
required states to provide information about the conditions of schools,
payrolls, and other expenditures. But it was only when civil society activists at IMCO,
outside of government, were able to create this platform to make that information
accessible to citizens and to journalists and to themselves that the information
ultimately got scrutinized. That was when they ultimately exposed this rampant
malfeasance that was previously hidden. Although the government initially
prevaricated and hesitated, claiming clerical error, the ensuing media frenzy over
the website and what it revealed helped to prompt reform and then a shift of
responsibility over education from states to the federal government. Ultimately,
the activists and the federal bureaucracy worked in parallel. They collaborated,
addressing this local level corruption and acting to improve Mexico's schools. Open
data matters for a variety of reasons, the same reasons, though, that using data
matters in the first place. We can use open data to spot mistakes and outliers and
rare events. We can use it to help us target scarce resources more effectively. We
can use it to tell stories in the way that we've previously discussed. Let's
consider a few more examples of open data being put to work to further understand
also the benefit of data analysis for defining our problems. First, open data
sometimes achieves greater government accountability. In the United States at the
federal level, open data facilitated the creation of a website called
usaspending.gov, a set of online tools for exploring the federal budget. At the
local level, opening government data about public works in Zanesville, Ohio revealed
a 50-year pattern of discriminatory water service provision. While access to clean
water from the city of Zanesville's water line spread throughout the rest of Muskingum
County, residents of the predominantly African American area of Zanesville
were only able to use contaminated rainwater, or they had to drive to the
nearest water tower or store and truck water back to their homes in bottles.
Opening the data laid bare the truth of what was going on and led to a successful
civil rights lawsuit against Zanesville in 2008, when the disparity between
African American and white residents with regard to water provision was revealed.
Second, open data can improve the delivery of services. At the state and local
level, increasing access to open data has allowed entrepreneurs and developers to
build tools such as smart transit apps, citizen-facing information services, and
business or government-facing data visualization and analysis platforms. For
example, both transit authorities and commercial providers (think the MTA in your
city and Google Maps, of course) use open transportation data to tell commuters
when to expect their bus or their train coming along their route. Retroficiency
analyzes energy consumption data to allow utilities, energy service providers, or
building owners to identify buildings with high energy savings potential. Third,
open data also enables the creation of tools to improve consumer choice and citizen
decision making in the marketplace. Let's take another example here. Data that's
collected by government from universities has been transformed by the Department of
Education into a calculator known as the College Scorecard. This is designed to
help parents and students make more informed financial decisions about their choice
of college education. Sometimes the benefits of open data are going to ripple out
beyond government accountability and government services. For instance, open data
can also be used to catalyze greater business competition and entrepreneurship as
well as job creation. Think of the wealth and the jobs that are created by
government's release of both weather data and geolocation data for the economy.
Those have enabled the creation of weather apps as well as GPS devices. The Open
Data Institute in the UK notes that the global market for open data could be as
high as 5 trillion dollars. Thousands of companies worldwide now already use open
government data as a core business asset. One example of this is the company
BrightScope, which worked with previously locked up or closed Department of Labor
Form 5500 retirement plan data to offer better decision making tools to investors.
When the data became available as open data, BrightScope was able to rapidly build
tools to help people make decisions about which retirement plan had the lowest
fees. A decade ago, open data was but an idea, a call
to action by pro-democracy activists wanting government to be more transparent.
Today, however, it encompasses a broader movement that is focused on solving public
problems. Open data policies have helped to drive that change. On his first full day in
office in 2009, fulfilling an earlier campaign promise, President
Obama signed the Memorandum on Transparency and Open Government. That memorandum
declared, and here I quote, "information maintained by the federal government is a
national asset." It called for the use of, quote, "new technologies to put
information about agency operations and decisions online and to make that data
readily available to the public." In addition, the government's open data policy
made clear that because the collection of data by government is already paid for by
the taxpayer, it therefore makes sense to give that data back to the public to use
for free. When the federal government's open data repository, called data.gov --
you can check out that website -- launched in May 2009, it started initially by
just making 47 data sets searchable. But by turning the principles of the memorandum
into practice, creating a tangible and central place for agencies to list
government data and, more importantly, a place for the public to find that
data, data.gov was really instrumental in unleashing the movement of open data.
Later that year the Office of Management and Budget directed federal agencies to
release not only data about the workings of government but also what was termed
high value information. The choice to broaden the 40-year-old definition of
government transparency from only data about government to data that agencies
collect responded to what both the technologies of big
data and the technologies of collaboration can actually make possible. The
directive emphasized the broad public benefits and the need to disclose new kinds of
government information as open data in machine readable format, such as the
locations of reported crimes or weather information or information that we've
discussed before, like GPS data, that could foster new businesses. In 2013, the
federal government recommitted to and expanded its work on open data policy by
issuing an executive order on making open and machine readable the new default
for government information. It was designed to advance and accelerate open data
implementation by federal agencies, getting them to open up and put online ever
more information. Entrepreneurship and innovation rather than government
accountability alone are emphasized in that order. It makes clear that making
information resources easy to find, accessible, and usable can fuel
entrepreneurship, innovation, and scientific discovery that improves Americans'
lives and contributes significantly to job creation. Further laws have followed,
broadening the scope of data covered under open data statutes and policies. The
Digital Accountability and Transparency Act, also known as the DATA Act of 2014,
calls for publishing all federal government spending data, now in particular, as
open data in standardized formats. There is also another statute known as the Open,
Public, Electronic, and Necessary Government Data Act, also known as the OPEN
Government Data Act for short, which was signed into law in early 2019. The OPEN
statute calls for inventorying and publishing all government information, not just
spending data, as open data. Today, there are a quarter million federal data sets
online on data.gov. That's a long way from 47. And just about every state and
hundreds of cities now release some data as open data and have some form of open
data portal or website like data.gov but for a state or for a city. Despite this,
though, the need for continued open data policy making is as strong and urgent as
ever. An Open Data Barometer survey of 1,725 data sets covering 115 countries found
that nearly 90 percent of priority data sets -- those that people most want --
still remain closed and unavailable. Only seven percent of the data governments
collect, they say, is fully open, only one of every two data sets is machine
readable, and only one in four data sets has an open license that makes it free to
use and to reuse. The bipartisan interest in evidence-based approaches to governing
has fueled demand, however, for more access to administrative information of all
kinds, including the data that agencies collect about companies, about workplaces,
and the environment. Using open data is a great way to get data that you can use to
define and understand your problem. However, before you push ahead to identify how
to use open data to better define your problem, remember to make sure that you have
started in the first place by defining your problem, as we discussed in earlier
modules. Without knowing the problem and especially root causes, it's going to be
hard to know what type of data you actually need. So be sure you go over the
exercises for defining your problem before selecting your data sets. Okay, but now
let's take a minute then to finish up by considering whether the data that you
actually need is or might be made available as open data. First, let's consider the
availability of the data that you want as open data. Does somebody already collect
it and publish it? Is there a government or another institution, a university, or a
company making that data available online? While you can start with your own
community's open data catalog, that of your city or your state, often these are not
comprehensive sources. There are also other relevant agencies that you might want
to engage to identify available data sets. There are numerous aggregators of open
data that you can consider. For example, you might want to try the census bureau in
your country. You might want to look at -- if you're in the U.S. -- the Urban
Institute, which is a fabulous primary source of data about communities in the
United States. Depending on your field of interest, different federal agencies such
as the EPA for environmental data, self-evidently, or the U.S. Department of Labor
for labor data, or the FBI for crime data, might offer access to machine readable,
high quality, and comprehensive data that you need to tackle challenges.
OpenCorporates is the largest open database of companies in the world. Once you find
the data that you want, you have to look at whether that data is fully open and/or
accessible to you in a machine readable form, enabling you to use it readily for
analysis. If not, can you identify external or internal partners with the relevant
expertise who can actually help you prepare the data for use? One strategy is to
organize what's sometimes called a hackathon or a datathon, also might be known as
a data dive. Data dives are these high energy, marathon style events where teams of
volunteer data scientists, developers, statisticians, and designers help mission-
driven organizations and individuals, whether they're government agencies or NGOs
or activists, to organize, manipulate, clean, or visualize their data. If the data
is not collected, what is it actually going to take to collect that data? We've
previously discussed methods such as interviews and surveys that you might want to
do to collect original data yourself, and in our next module on open innovation,
we're going to look in detail at how to use crowdsourcing to collect data using
distributed participation. Next, you want to consider your level of readiness to
make use of the data. Do you actually have access to the necessary expertise, not
simply to collect the data but now to analyze it and make sense of it? Again,
reaching out to partners, especially in universities, can be one way to obtain the
necessary expertise. Another way may be the use of a competition. New York City,
for example, gets people to use its data and analyze it by hosting competitions or
challenges to attract data savvy individuals to analyze and use their data to
create new tools. The city's BigApps competition invites private companies and
individuals to solve public problems using open data. Their challenge, the BigApps
challenge, is overseen by the city's Economic Development Corporation. It engages
agency leadership throughout the planning process to open up more data that the
public can then use to create new tools. For example, a past BigApps winning team
used targeted, geo-located data to create an app called Mind My Business. It was a
tool designed to assist brick and mortar food service establishments in New York by
sending alerts that help owners predict changes in customer traffic, operate more
efficiently, and avoid fines. Private sector platforms like Kaggle offer a
community of data scientists online ready to solve problems, usually in exchange
for a prize or some kind of monetary incentive. Okay, that concludes our much too
short session on the power of open data. Used well, open data can generate new
insights and enable us to define problems using empirical evidence. But collecting
that data and deriving insight from it and ultimately designing solutions to public
problems is, as we've discussed, going to require collaboration. That's why in our
next module we turn to exploring ways of using new technology to organize such
collaboration efficiently and effectively to allow us to do more using both
quantitative and qualitative methods. Until then, see you next time.
