A few years back, certain medical data made big waves in right-wing circles without ever quite spilling over into mainstream media. The data in question consisted of percentages of newborns tested for sickle cell anemia in mainland France. In France, only newborns that have at least one parent originating from a region in which sickle cell anemia is common are tested for the disease. As sickle cell anemia is mostly prevalent in Africa, these percentages were taken as a stand-in for the percentage of French newborns of African heritage.
The screening data suggested that in 2000, 19 percent of babies born in mainland France (excluding overseas departments) were of African origin, a number that rose steadily to 38.9 percent in 2015. This is certainly surprising. If these numbers are correct, France’s ethnic makeup seems poised to jump from entirely European to basically Brazil within two generations.
To me, these data are worth investigating for several reasons.
During the last decades, the media fed us a steady diet of articles about the French family-friendly policies that were the reason for the birth rate collapse failing to materialize in France. It would certainly be interesting if that was just nonsense and the real reason was a more fecund (or just bigger) class of immigrants.
Ethnic replacement is a centerpiece of right-wing agitation. Of course, the media tells us that it is just a conspiracy theory. Just as with the French birth rate, I am very much interested in the extent of lies told to me by mainstream media outlets. Call it a desire for informational emancipation.
Ethnicity correlates with lots of variables of interest. Quantifying such a rapid change would allow predictions of crime rates, economic growth, human capital, unemployment, etc. Rapid change of any sort is often accompanied by many dangers. If you don’t know about the change, you can’t look out for the dangers.
There are several arguments against equating sickle cell screening with African origin. Among the countries that provided significant numbers of immigrants to France, sickle cell anemia is prevalent in Italy, Greece and Turkey, aside from the Maghreb, Sub-Saharan Africa and the Caribbean. However, the number of recent European immigrants from sickle cell regions is too small to account for more than a few percentage points.
It has also been argued that some hospitals do not distinguish by origin, but instead test all newborns. That is entirely possible; however, it leads to a dilemma. If the absolute number of newborns at risk for sickle cell anemia is overestimated, the growth rate has to be underestimated!
Or to put it differently: If the 19% in 2000 were actually just 10% because 9% were due to unnecessary testing, then to get to 39% in 2015 the percentage of actual kids at risk had to triple from 10% to 30% instead of double from 19% to 39%.
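The arithmetic behind this dilemma can be sketched in a few lines. This is only an illustration: the function name `growth_factor` is made up, and it assumes the share of unnecessary tests stays constant between the two years, which is what the example above implies.

```python
def growth_factor(start_pct, end_pct, spurious_pct=0.0):
    """Growth of the at-risk share after subtracting a spurious share.

    Assumption: the spurious share (babies tested without being at risk)
    is the same in both years.
    """
    true_start = start_pct - spurious_pct
    true_end = end_pct - spurious_pct
    return true_end / true_start


# Taking the screening percentages at face value: roughly a doubling.
naive = growth_factor(19.0, 39.0)

# If 9 percentage points were unnecessary tests: 10% -> 30%, a tripling.
corrected = growth_factor(19.0, 39.0, spurious_pct=9.0)

print(f"naive: {naive:.2f}x, corrected: {corrected:.2f}x")
```

Any constant overestimate pushes the implied growth factor up, never down.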
Alternatively, the number of hospitals just testing everybody may have risen steadily, in which case the entire data set is worthless. Or the original study could just be a hoax by a devious far-right physician. Who knows?
So the first point on our agenda is trying to independently verify the plausibility of the data.
To this end, I downloaded the data for given names in France provided by the French bureau for statistics, INSEE. I also created a list of 2211 popular Muslim names, specifically Arab and Turkish names. Not all of the sickle cell tested babies will be of Arab or Turkish origin. And not all kids of Arab and Turkish origin will be given Arab and Turkish names. And additionally, my list probably doesn’t cover more than a small chunk of all actual Arab and Turkish names. But it still allows us to track the increase of a certain subset of all kids that would be subject to sickle cell testing.
A first quick and dirty run of the numbers: In 2000, out of 800039 kids my list covers 46718 or 5.8%. In 2015, my list covers 80387 out of 777746 kids, or 10.3%. This amounts to an estimated 1.77-fold increase of Arab/Turkish newborns over the time span in which the sickle cell percentage roughly doubled, which is reasonably close.
However, out of my 2211 names only 103 and 127 actually occur in the INSEE list of given names for the years 2000 and 2015, respectively. Only 77 names occur in both lists. Some of these names are clearly not just popular among Muslims; girls’ names especially are often ambiguous. So let’s try to tighten up the method.
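For anyone who wants to check the quick and dirty numbers, here is the calculation spelled out. The counts are the ones quoted above; the variable names are just for illustration.

```python
# Totals and list matches as quoted in the text.
births_2000, matched_2000 = 800039, 46718
births_2015, matched_2015 = 777746, 80387

# Share of newborns whose given name is on the list, per year.
share_2000 = matched_2000 / births_2000
share_2015 = matched_2015 / births_2015

# Fold increase of that share between the two years.
fold_increase = share_2015 / share_2000

print(f"2000: {share_2000:.1%}  2015: {share_2015:.1%}  fold: {fold_increase:.2f}")
```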
Now, we only look at names present in both years. We exclude all ambiguous names. Each remaining name provides a separate estimate of how much the percentage of Muslim newborns has changed between 2000 and 2015. This time the overall percentage accounted for by these 56 names almost exactly doubles from 1.95% to 3.89%. The median increase, which should be more robust against outliers (like short term trends in popularity), is also exactly 2.0.
To my mind this provides strong confirmation that the sickle cell data is correctly interpreted as showing that the percentage of a predominantly African-derived immigrant population among the newborns in France has doubled between 2000 and 2015. Confirmation of the growth rate makes it rather unlikely that the absolute percentage numbers are off by any significant degree.
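A minimal sketch of this tightened procedure, under stated assumptions: the function `name_growth`, the placeholder names and all counts below are hypothetical toy values, not the INSEE data or my actual script. It keeps names present in both years, drops the ones flagged as ambiguous, and reports both the pooled fold change and the median of the per-name fold changes.

```python
from statistics import median


def name_growth(counts_a, counts_b, total_a, total_b, ambiguous=frozenset()):
    """Pooled and median fold change in name share between two years.

    counts_a, counts_b: dicts mapping name -> number of births (all > 0).
    total_a, total_b: total births in each year.
    ambiguous: names to exclude from the comparison.
    """
    common = (counts_a.keys() & counts_b.keys()) - set(ambiguous)
    if not common:
        raise ValueError("no unambiguous names present in both years")

    # One separate estimate of the change per name.
    ratios = {
        name: (counts_b[name] / total_b) / (counts_a[name] / total_a)
        for name in common
    }

    # Pooled estimate: combined share of all retained names.
    pooled_a = sum(counts_a[n] for n in common) / total_a
    pooled_b = sum(counts_b[n] for n in common) / total_b
    return pooled_b / pooled_a, median(ratios.values()), ratios


# Hypothetical toy counts; "name_c" is a name that suddenly became fashionable.
toy_2000 = {"name_a": 10, "name_b": 20, "name_c": 5, "name_d": 8, "name_x": 30}
toy_2015 = {"name_a": 20, "name_b": 30, "name_c": 50, "name_e": 7, "name_x": 30}

pooled, med, per_name = name_growth(
    toy_2000, toy_2015, total_a=1000, total_b=1000, ambiguous={"name_x"}
)
print(f"pooled: {pooled:.2f}x, median: {med:.2f}x")
```

In the toy data the fashionable name drags the pooled estimate well above the median, which is why the median is the more trustworthy of the two when a few names follow short term trends.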
I did these analyses quite some time ago. At one point I became aware that my given name analysis had been scooped by a French far-right website. (Which was one motivation to finally get the blog going.) In their analysis they try to capture all Muslim names and give a definite estimate of the absolute numbers. They handle ambiguous names by just counting them as half a Muslim. According to their analysis the number of Muslim newborns more than doubled between 2000 and 2015.
This got me thinking about how to do this analysis right. Counting ambiguous names as half is a really ugly hack, likely to overcount names as long as Muslims are a minority. Instead one might use the regional and temporal variation to infer for each name separately how it contributes to the number of Muslims.
Once you have done that, you can subtract a precise estimate of the number of Muslims from the sickle cell data to get an estimate of the increase of Sub-Saharan Africans for each region. That in turn allows you to do the same inference for Sub-Saharan African names, which are probably much more ambiguous than the Islamic ones.
If that works, you have ended up with a method to create precise estimates for both groups directly from given names, even in the likely case that the sickle cell data stops being published. Unfortunately this takes quite a lot of time. And of course there is no guarantee that it would work. Maybe a project for the future.