ICPSR Data Brunch Podcast Episode 3: Census and Migration


Original Air Date: March 12, 2021




DORY KNIGHT-INGRAM: Welcome to Data Brunch with ICPSR! If you love data, this is gonna be food for thought. I’m Dory.




DORY KNIGHT-INGRAM: We're recording these episodes live from our remote offices so please excuse any cameos from canine colleagues, kids in class, and other unexpected moments.


ANNALEE SHELTON: Dory, can I just say…


[Phone alarm goes off]


DORY KNIGHT-INGRAM: Oh no! [laughing]


ANNALEE SHELTON: There it is! [laughing]


DORY KNIGHT-INGRAM: Of course! I didn’t even know this was in here, oh my god, sorry.


ANNALEE SHELTON: [laughing] There is our unexpected moment!


DORY KNIGHT-INGRAM: [laughing] That’s my daughter’s school bell that brings us through the day. So sorry about that.


ANNALEE SHELTON: [laughing] Oh my god I love it so much. Well I was just going to say, Dory, I just cannot believe in that last episode with Amber Amen-Ra about the Dunham dance data, holy moly that part where the drums come in and it just, it gave me chills. 


DORY KNIGHT-INGRAM: The same thing for me. I just remember listening to it and just wanted to give a really huge shout out to our producer Scott Campbell for his great editing on that and all of our other episodes too.


ANNALEE SHELTON: Woot! Scott Campbell! Good producer over here! All right, good stuff, that was excellent… no, you are not cutting that Scott Campbell, you are not cutting that. 


So today we will talk to ICPSR’s Trent Alexander about some little-known and surprising stories from the Census as well as data in the news and more.


[Musical interlude]


ANNALEE SHELTON: So data in current events. You may have heard a recent NPR report that some Black Americans hesitate to get the COVID-19 vaccine in part because of their mistrust of the medical system due to the abuses of the Tuskegee study. And this is the infamous study officially titled “Tuskegee Study of Untreated Syphilis in the Negro Male.” And the U.S. Public Health Service revealed publicly that study participants were misled without informed consent and that some men were allowed to suffer with syphilis for years despite the fact that penicillin was available.


So in an article in the Quarterly Journal of Economics called “Tuskegee and the Health of Black Men”, authors Marcella Alsan and Marianne Wanamaker showed that the revelation about the Tuskegee study and the medical mistrust that it engendered in Black Americans contributed to racial disparities in health and healthcare utilization. In their research they used lots of data from ICPSR, including the General Social Survey, 1972-2014, to look at “measures of trust in doctors”; mortality data available by race, age group, gender, and cause from the Mortality Detail Files from 1968-1991; “annual data on the total number of non-federal, active medical doctors by county” from the Bureau of Health Professions Area Resource File, 1940-1990; and more.


So if you're interested in reading the article or doing similar research you can find out more about this in the ICPSR Bibliography of Data-related Literature, which we will link to in the show notes.


And speaking of current events, happy Women's History Month Dory! 


DORY KNIGHT-INGRAM: Happy Women's History Month to you also Anna, and to all of our listeners. There’s one ICPSR dataset we’d like to highlight, called the Women's Movements & Women's Policy Offices in Western Postindustrial Democracies, from 1970-2001. And this was produced by the Research Network on Gender Politics & the State (also known as RNGS) as a part of a cross-national longitudinal study.


We also want to give a special shoutout to ICPSR’s director, Margaret Levenstein, who is heading into her fourth year as ICPSR’s first female director. And now, back to Anna for new and updated data.


ANNALEE SHELTON: Thanks Dory! First up, nursing homes have been in the national conversation recently. If you are interested in looking more into this topic, a newly available study at ICPSR is the Nursing Home Consumer Preferences in the United States, 2017 and 2019. This is a survey of a national sample of individuals with recent nursing home experience and it included in part an assessment of quality of the nursing home.


And another newly released study is called, Improving the Accuracy and Fairness of Pretrial Release Decisions: A Multi-Site Study of Risk Assessments Implemented in Four Counties, in Indiana. And it looked at, of course, improving the accuracy and fairness of pretrial release decisions. And one of the objectives of the study was to see if the use of pretrial risk assessments would improve the fairness of pretrial release decisions for racial minorities, relative to practice as usual.


And finally we have an update to the Population Assessment of Tobacco and Health study, also known as the PATH Study. And for background, the PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act. And this study sampled over 150,000 mailing addresses across the United States to create a national sample of tobacco users and non-users. And there are updates to the restricted use and special collection restricted use files here.


And all of these are available in our show notes.


All right next up it is Dory and Trent. Take it away Dory. 


[Musical interlude]


DORY KNIGHT-INGRAM: Hi everyone and welcome back. Really excited to talk to you today about a really cool project at ICPSR. Did you know that ICPSR has a connection to the Census Bureau? Today we’ll be talking about ICPSR’s role in a massive data infrastructure project to link 1940 to 2020 decennial censuses. These data allow researchers to follow families across generations and that's not all. Linkages to tax, health and information promote study of the effects of policies, environmental issues and other contextual factors on later life and intergenerational outcomes. 


Imagine the stories these data could tell.


And that brings us to our guest for today that we're thrilled to have with us: Trent Alexander, who is ICPSR’s Associate Director and Research Professor. Hi, thanks for joining us Trent!


TRENT ALEXANDER: Thanks for having me.


DORY KNIGHT-INGRAM: So we know that the Census has many parts, can you tell us more about the part of the Census that you are talking with us about today?


TRENT ALEXANDER: Sure. The project I'm doing is linking the censuses from 1960 through 1990. This builds on work that I had done at the Census Bureau before coming here, and that others have done at the Census Bureau and outside the Census Bureau. Before I came here we had linked 1940 data forward to the 2000 census. And then people at University of Minnesota and actually at a couple of other places have linked all the prior censuses, so from 1850 through 1940 those are already linked and available.


1940 is linked forward to 2000. But it's everything in between there that no one has done, because of some of the unique challenges to getting that work done, and that's what my collaborator Katie Genadek and I are doing in this project.


I also want to acknowledge the really excellent support and collaboration that I've gotten here at ICPSR for this work. From Maggie Levenstein, the director of ICPSR, from the proposal development team especially Lisa Kelly has been fantastic. And David Bleckley is working very hard on this project. He's really one of the best data analysts that I've worked with, he's just really creative and hardworking on some of the thornier data manipulation problems that we've encountered early in this work. Just really grateful for the support I've had here.


DORY KNIGHT-INGRAM: So what are some of the great stories that you... Data Brunch is all about, you know, what makes these data projects great stories so can you tell us what makes your work on this project a great story? 


TRENT ALEXANDER: It all started for me in a cave, this project in 2006. We… I was at University of Minnesota at the time and we were collaborating with the Census Bureau to recover the 1960 census which at that time, there was… the microdata was flawed. And microdata is like the person-level records that analysts need to use, and the tapes that it was stored on were degraded. So we… it could not be recovered and it was not permitted to be used because of its flaws.


So what we did at that time, we got a grant from the National Institutes of Health to rescan the original manuscripts, which were stored on microfilm, and create a dataset from them. And they were stored in the cave in a little room called The Ice Cube because it had to be kept temperature controlled. Because microfilm, even though they're a great storage medium they don't last forever and that the cool temperature helps make them last longer. 


So this was a cave managed by the National Archives in Lenexa, Kansas. It is an unfinished cave in many parts so it's, it's pretty spectacular to think that that's where we store some of our most valuable records as a country. There’s actually another one in Boyers, Pennsylvania, so they make good use of caves, that’s a good way to store things if you can keep it dry.


That was when, so Katie my collaborator and I both started that project in 2006 when I was a postdoc and she was a graduate student, and completed it. And we restored the 1960 census and those data, we made new public use data that was now available to everyone, and new internal files that can be used within the Federal Statistical Research Data Centers. That project was completed in 2011, I believe.


DORY KNIGHT-INGRAM: Is it possible that we can share some of those pictures with our listeners?


TRENT ALEXANDER: Of course, no, taken with my own camera. You can have the pictures. 


DORY KNIGHT-INGRAM: Thank you, it's just so fascinating. So why did the data end up in caves?


TRENT ALEXANDER: Caves really are a good place to store things. It's acres and acres and acres of storage space. And it’s not just microfilm, there's plenty of paper records and other files, books… These were limestone caves, meaning that they’re not, in this case they’re not natural caves, they were actually created by people who were mining limestone which means the walls are square and the height is standard. And it's really cheap storage. It’s temperature controlled, you know naturally like any cave is, they put in all kinds of ventilation shafts and it's actually a really economical way to store records as long as you can keep the humidity where you want it.


So it's not particularly, you know, I think people (and definitely I) imagined the bunkers and, and things like that. It's not so much the security aspect of it being underground that makes it appealing to the National Archives, it’s that it’s super cheap. And for papers and microfilm, it’s good enough. It’s not an office for the most part, there are some people who work there, but largely it's just for storing things. So it's a good place to do that.


DORY KNIGHT-INGRAM: Thank you, now I really can't wait to see these pictures so I can get the sci-fi images that I have out of my head. What are some surprises that you have found... well, you know what let me back up, because we're going to talk a lot about “linking.” So for people who might not understand what that is, what does “linked” mean?


TRENT ALEXANDER: What the linkage is doing, is, let's say I filled out the 2020 census, which I did, it was great, and I also filled out the 2010 census, it would be linking the Trent Alexander record from 2020 to the Trent Alexander record in 2010, so you can see how my life has changed over those 10 years… where I lived, and where I lived has changed, and what I do, and that has changed too, and doing this for the whole population. So linking all the people who we can, over time from one census to another, over every census from 2020 back to 1850. So it's not just, “how did Trent Alexander’s life change over every 10 year period.” It's how my kids’ lives changed, how my parents, their parents. So we can do not just person-level change but generational change can be studied with these data too.


DORY KNIGHT-INGRAM: Thank you. I’ll follow up with a question about some of your work that I’ve seen that focuses on the “Great Migration.” So tell us some of the ways that linking can be used on that topic.


TRENT ALEXANDER: Sure, yeah. I started that work when, well actually my dissertation was on the Great Migration so it’s been a topic I’ve been studying for a long time. But when I was at the Census Bureau, and we had linked the 1940 data forward to the 2000 data, uhm, we made that data available to 10 research teams, it was still very much in a beta format. And my team studied the Great Migration so we used… and it was… you know, it’s extremely useful data that a lot of people wanted, but it’s not obvious that it would be that useful because it covers a 60-year span with nothing in between. Remember that’s what we’re doing now. So we observed people in 1940 and again in 2000. 


The Great Migration is really, is a great topic to study with those data because we could really look at intergenerational change among the migrants and their children. Which, I mean I can tell you, I’ve been studying the Great Migration since the 1990s, that’s the holy grail, and it’s really, really hard to do. 


The best way historians have found to do it before this was to talk to people, and you can only talk to so many people. You know, as we’re having a conversation right now, this is taking time, you’re going to have to listen to it and think about it later… and this is, you can do that on such a larger scale with data. Certainly not the richness that you can get by talking to people, but the scale is, is really valuable because it gives us a way to place all of the studies based on interviews and newspapers and all the classical sources that exist.


So what we did is, as you may know the Great Migration was the movement north and west of African Americans beginning during World War 1 and really continuing through 1970. So it's, it's mid 20th century migration out of the south.


[Calm music begins and plays under voices.]


We focused on the early period. We looked at migrants who had left the south and were living in the north or the west in 1940. So these were adults, they had moved, you know, as early as World War 1 but even through the twenties and thirties, and in the north and west when, most of them have children by this point so we’re observing working adults with children. 


We then followed those children forward to the 2000 census. So we could see the long-term outcomes of the children of those who made the Great Migration. And just as importantly we had a comparison group of children of African Americans who did not make the Great Migration, that is those who stayed in the south. 


So we had these two cohorts of who are now retiring, retirement age adults in 2000. Some whose parents had made the move to the north and west, and some whose parents had not made those moves. And we’re able to compare their experiences, and really see that the movement to the north did have long-term impacts, intergenerational impacts, on those children where they had higher rates of education and homeownership, lower rates of poverty. 


And this was even... and again because this is mass statistical data we’re able to do sort of statistical controls that people would typically do in studies like this, where we can say not only were the children of those migrants who moved north and west doing better by these conventional economic measures, but we could even control for the advantages that their parents had brought.


Migrants often are more advantaged than those who stay in the point of origin, just in terms of their own occupational, educational backgrounds. And that was true of those who made the Great Migration as well.


Even controlling for those advantages that the parents had, the parents we observed in 1940, the children... their children were doing better in economic terms than the children of those who had remained in the south.


DORY KNIGHT-INGRAM: That is fascinating. And you're really talking about my family, you know?




DORY KNIGHT-INGRAM: Yeah literally. I mean, my father followed his elder siblings from Mississippi up to Detroit to work for Chrysler. And so half of my family, I would say, well there's a portion of my family that's in Michigan, and then there's a large portion of my family that's still in Mississippi. And, and you can see the disparities clear as day between the ones who stayed and the ones who left.


Interestingly enough, when my father retired in his senior years he moved back home. So he, he inherited his family's, his parents’ land, which is actually... comes from the emancipation, you know. And so he's living down there on that land. And then that will pass to my generation. Yep.


TRENT ALEXANDER: We need to talk Dory. That’s, that’s fascinating. 




TRENT ALEXANDER: There’s uhm, I mean so that's, so even in the return… your family is, is part of a large-scale return migration.




TRENT ALEXANDER: That, yeah you know, I mean the Great Migration itself I think was petering out by the 70s. And that’s right when the return started to take over and it’s still going.


DORY KNIGHT-INGRAM: Uh huh, uh huh.


TRENT ALEXANDER: Actually I just read, Charles Blow, New York Times columnist has a new book about ongoing Black return migration.




TRENT ALEXANDER: Yeah, so, fascinating. I want to hear more Dory. 


DORY KNIGHT-INGRAM: So where would these data be if they were not available right now at ICPSR?


TRENT ALEXANDER: If we were not recovering these data and linking them, they simply would not be available to researchers. They would exist as images on microfilm in either a cave or in an office in Indiana. So they would not be digital resources for people to do research with.


DORY KNIGHT-INGRAM: How can our listeners find out more about this, or contact you?


TRENT ALEXANDER: Well they can contact me here at ICPSR, and I'd love to talk about this to anybody, person or group. Like honestly it’s one of my favorite topics. 


If they want to just get data and don't want to talk to me that's completely fine. A lot of it is already available in the Federal Statistical Research Data Centers, so that's, uh, it's an organization managed by the Census Bureau with 30 offices around the country including one here in Ann Arbor. So reach out to them and they have employees who can guide you through the application process, tell you what they have and guide you through the process of getting access.


DORY KNIGHT-INGRAM: Thank you so much, this was fascinating and I'm sure listeners are going to appreciate all the history and work that’s gone into making these census data available. 




[Exit music playing]


ANNALEE SHELTON: Thank you so much Trent, that was fantastic. The caves… I kind of am sitting here in disbelief, and also I can't wait to see all of these pictures.


So next in upcoming events… By the way, if you are listening to this episode at a later date, you can always visit icpsr.umich.edu to see our current job listings and upcoming events. But as of airtime today we are hiring! Our Summer Program has two teaching assistant positions open, for the introductory statistics courses and for statistics and quantitative methods courses. Applications for both are due on March 18th, 2021.


And the Summer Program has two short workshops with some application deadlines to keep in mind. So first is the Panel Study of Income Dynamics, also known as PSID. They have a one-week workshop at this year's Summer Program. Applications for that are due April 16th, 2021. 


And a reminder that applications are due on March 22nd, 2021 for the free Institute for Research on Innovation and Science, frequently known as IRIS, workshop which is called “Joining the Data Revolution: Big Data in Education and Social Science Research.” And instructions for applying for these workshops are in the show notes.


And just another note ICPSR’s Summer Program scholarship applications are due on March 29th.


And finally on April 1st we will be hosting a webinar on openICPSR, which is ICPSR’s self-publishing repository. And this webinar is free and open to the public, please do share this widely and you can find the registration information at icpsr.umich.edu.


DORY KNIGHT-INGRAM: Thank you Anna. And that brings us to the end of today's episode. Thanks for being with us.


ANNALEE SHELTON: For links to data and everything else that we've talked about today, visit our show notes at icpsr.umich.edu. 


DORY KNIGHT-INGRAM: Coming up we'll talk to some of the folks behind transgender-related data at ICPSR for the upcoming International Transgender Day of Visibility. If you aren’t already, subscribe now on Apple Podcasts or wherever you get your podcasts.


ANNALEE SHELTON: And thank you as always to the ICPSR membership! This podcast would not be possible without the ICPSR members. 


DORY KNIGHT-INGRAM: You can get in touch with us by visiting our website, icpsr.umich.edu, or emailing us at icpsr-podcast@umich.edu. 




DORY KNIGHT-INGRAM: And I’m Dory, and thanks for joining us on ICPSR’s Data Brunch.