NewsWire: 9/17/21

  • In a first, the annual American Community Survey may soon use synthetic data. The Census Bureau says this move will protect privacy, but critics argue such changes will undermine data accuracy. (The Economist)
    • NH: As demographers, it’s our job to keep track of what's going on at the Census Bureau. And recently, the bureau has changed its data reporting methods due to privacy concerns.
    • So what’s going on?
    • By law, the Census has always been required to keep data anonymous. But with big-data tools, it has become increasingly easy to identify individuals. The Economist offers a dumbed-down example. If a town has 100 people and the census reports that 1% of residents are Hispanic, it wouldn't be difficult to identify the sole Latino.
    • In the 2020 decennial census, the bureau used “differential privacy” to protect identities. Officials “injected noise” into the data concerning congressional districts, towns, and census blocks. While the census claims it only changed the numbers minimally, some researchers believe it will make the demographic data misleading. Alabama even sued, claiming the state couldn’t properly redistrict. (Federal judges rejected Alabama’s claim.)
    • Now, further privacy provisions will be added to the American Community Survey (ACS). This survey is conducted monthly and goes out to roughly 3.5M addresses every year. To protect respondents' identities, the bureau will produce synthetic results. Statisticians will feed the "true" data to models, which will simulate the same relationships between variables in an artificial population.
    • IMO, the critics are making a mountain out of a molehill. To begin with, most of them don't seem to realize that Census numbers (except for the decennial census) are never actual counts in which every individual is enumerated. They are statistical inferences based on sampling. So, when it comes to ACS data, for example, there is always some "noise" in the numbers.
    • More importantly, Census works within a strict legislative mandate--which is not to present data from which specific information about individuals can be inferred. If critics don't like this mandate, they should lobby Congress to change it. (Good luck!) Meanwhile, let's allow this agency to follow the law.
    • Yes, it is absolutely true that--thanks to growing transparency in the digital age--lots more detailed information about individuals can be inferred than ever before. As an illustration, look at this 2018 NYT feature story. The reporters were given access to a database that collects cellphone locations tracked by apps. The NYT was quickly able to identify the owners of most randomly chosen cellphones. And they succeeded in tracking the daily whereabouts of New York Mayor de Blasio's staff. But that's what you get for buying a cellphone and allowing it to track you. Most people don't especially care if they're tracked or deem it improbable that anyone would care. If you do, you can opt out.
    • But that's the difference. You can't opt out of a Census survey. Though few Americans are ever fined for noncompliance, you are in theory required by law to divulge information about yourself. It may be impractical to go completely "off the grid" and make yourself entirely invisible to commercial digital services. And the public may soon demand much tighter regulation of how those services handle personal information. But I do think the distinction between legally required and legally not required does make sense.
    • Most of the states complaining about the new Census policy are (like Alabama) in the Redzone. I can't imagine they would be comfortable with Census making the opposite choice and announcing, OK, we really don't care whether rural households in depopulated census blocks can be easily unmasked.
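For readers who want to see what "injecting noise" looks like in practice, here is a minimal sketch applied to The Economist's 100-person town. It uses the textbook Laplace mechanism, which is an assumption made for illustration and not the bureau's production algorithm. The function names, the seed, and the privacy parameter epsilon = 0.5 are all hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng):
    # Adding or removing one person changes a count by at most 1,
    # so the noise scale is 1 / epsilon (smaller epsilon = more privacy, more noise).
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
true_hispanic = 1        # the sole Latino resident in a town of 100
epsilon = 0.5            # hypothetical privacy setting, not the bureau's

# Simulate many possible published values for the same true count.
published = [noisy_count(true_hispanic, epsilon, rng) for _ in range(10_000)]
mean_published = sum(published) / len(published)

print("one possible published value:", round(published[0], 2))
print("average over many releases:  ", round(mean_published, 2))
```

Under these assumptions, the average over many simulated releases stays close to the true count of 1, while any single published figure for so small a group can miss by several people. That is roughly the trade-off both sides are arguing over: the noise is unbiased in aggregate but can matter a lot for tiny populations.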