Alteryx, data visualisation, Maps, Tableau

Alaska Fried Chicken: the UK’s curious approach to naming chicken shops.

I went a little bit viral a couple of weeks ago when I tweeted about chicken shops in the UK which are named after American states which aren’t Kentucky. If I’d thought about it, I’d have written this blog up first, created a Tableau Public viz, and had all kinds of other shit ready to plug once I started getting some serious #numbers… but I didn’t. So, to make up for that, this blog will go through that thread in more detail and answer a few questions I received along the way.

It all started when I walked past Tennessee Fried Chicken in Camberwell, pretty close to where I live. It’s clearly a knock-off KFC, and I wanted to know how many other chicken shops had the same name format: [American state] Fried Chicken.

The first thing to do is to get a list of all the restaurants in the UK. I spent a while wondering how to get this data, but then I remembered that my colleague Luke Stoughton once built a Tableau Public dashboard about food hygiene ratings in the UK. All UK chicken shops – hopefully! – are inspected by the Food Standards Agency. So, Luke kindly showed me his Alteryx workflow for scraping the data from the FSA API, and I adjusted it to look for chicken shops.

My first line of inquiry is pretty stringent: how many chicken shops in the UK are called “X Fried Chicken” where X is an American state which isn’t Kentucky?

Turns out it’s 34. “Tennessee Fried Chicken” – including variants such as Tenessee and Tennesse – is the most popular with 13 chicken shops. The next highest is Kansas with six, which I’m assuming is so the owners can refer to their shops as KFC, although maybe the owner/s just really like tornadoes, wheat, and/or the Wizard of Oz. Then there’s four Californias, a couple of Floridas, and one each of Arizona, Georgia, Michigan, Mississippi, Montana, Ohio, Texas, and Virginia.

1 state fried chicken map

[tangent: I’m aware that a lot of these states aren’t exactly famed for their fried chicken, but as a Brit, all I have to go on for most of them are my stereotypes from American media. But hey, maybe it’s still accurate, and Ohio Fried Chicken tastes of opiates and post-industrial decline, Arizona Fried Chicken comes pre-pulped for the senior clientele who can’t chew so well these days, and Florida Fried Chicken is actually just alligator. Michigan Fried Chicken is, I dunno, fried in car oil rather than vegetable oil, and Alaska Fried Chicken is their sneaky way of dealing with the bald eagle problem up there? I’m running out of crude state stereotypes now, I’m afraid. Out of all these states, I’ve only actually been to California.]

There’s also a “DC Fried Chicken”, which is close but not quite close enough for me, and a “South Harrow Tennessee Fried Chicken”, which I’m not counting either.
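The strict "[American state] Fried Chicken" rule can be sketched in a few lines of Python. This is only an illustration, not the Alteryx workflow used for the post: the function name `strict_state` and the `VARIANTS` table are hypothetical, and the only misspellings it knows about are the two mentioned above.

```python
import re

# The 49 states other than Kentucky, lower-cased.
STATES = {s.strip() for s in """
alabama, alaska, arizona, arkansas, california, colorado, connecticut,
delaware, florida, georgia, hawaii, idaho, illinois, indiana, iowa, kansas,
louisiana, maine, maryland, massachusetts, michigan, minnesota, mississippi,
missouri, montana, nebraska, nevada, new hampshire, new jersey, new mexico,
new york, north carolina, north dakota, ohio, oklahoma, oregon, pennsylvania,
rhode island, south carolina, south dakota, tennessee, texas, utah, vermont,
virginia, washington, west virginia, wisconsin, wyoming
""".split(",")}

# Misspellings mentioned in the post, mapped to the intended state (assumed list).
VARIANTS = {"tenessee": "tennessee", "tennesse": "tennessee"}

SUFFIX = " fried chicken"


def strict_state(name):
    """Return the state if the name is exactly '<state> Fried Chicken', else None."""
    cleaned = re.sub(r"\s+", " ", name.strip().lower())
    if not cleaned.endswith(SUFFIX):
        return None
    prefix = cleaned[: -len(SUFFIX)]
    prefix = VARIANTS.get(prefix, prefix)
    return prefix if prefix in STATES else None
```

Under this rule "Tenessee Fried Chicken" counts as a Tennessee, while "DC Fried Chicken" and "South Harrow Tennessee Fried Chicken" return `None`, because whatever comes before "Fried Chicken" has to be a state name and nothing else.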

Here is where these American State Fried Chicken shops are in the UK:

2 map uk

Interestingly, this isn’t a case of a map simply showing population distributions. The shops cluster around the London and Manchester regions, with almost none in any other major urban centre.

Let’s have a look at the clusters separately. Here’s the chicken shops around the Manchester area:

2.1 map greater nw

None of them are in the proper centre of Manchester itself, but they’re in the towns around. One town in particular stands out: Oldham. Let’s have a look at the centre of Oldham:

2.2 oldham only

Oldham, you’re fantastic. There are six separate “X Fried Chicken” shops in Oldham, and four of them – Georgia, Michigan, Montana, and Virginia – are the only ones by that name in the whole country.

For comparison, here’s the London area:

2.3 greater london area only

This is where all the Tennessees are, as well as the one Texas and Mississippi.

It looks like there’s a lot more variety in the north of England compared to the south, and sure enough, a split emerges:

3 latitude scatterplot

[chicken icon from https://www.flaticon.com/packs/animals-33]

Chicken shops in the south of England (and that one Tennessee place in Wales) tend to name their shops after states in the geographical south of the USA, while chicken shops in the north of England name their shops after any states they like.

This is where my initial Twitter thread ended, and I woke up the next day to a lot of comments like “Y IS THEIR NO MARYLAND THEIR IS MARYLAND CHICKEN IN LEICESTER”. Well, yeah, but it’s not Maryland Fried Chicken, is it?

So I re-ran the data to look at chicken shops with an American state in the name. This is the point at which it’s hard to tell if there’s any data drop out; the FSA data categorises places to inspect as restaurants, takeaways, etc., but not as specifically as chicken shops. All I’ve got to go on is the name, so I’ve taken all shops with an American state and the word “chicken” in the name. This would exclude (sadly fictional) places like “South Dakota Spicy Wings” and “The Organic Vermont Quail Emporium”, but it’d also include a lot of false positives; for example, you’d think that taking all takeaway places with “wings” in the name would be safe, but when I manually checked a few on Google Street View (because I’m dedicated to my research), about half of them are Chinese and refer to the owner’s surname, not the delicacy available.
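The looser rule (an American state plus the word "chicken", anywhere in the name) could look roughly like this in Python. Again, this is a sketch under assumptions rather than the workflow behind the post: `loose_state` is a hypothetical name, and it matches whole words only so that "Kansas" is not found inside "Arkansas".

```python
import re

# The 49 states other than Kentucky, lower-cased.
STATES = [s.strip() for s in """
alabama, alaska, arizona, arkansas, california, colorado, connecticut,
delaware, florida, georgia, hawaii, idaho, illinois, indiana, iowa, kansas,
louisiana, maine, maryland, massachusetts, michigan, minnesota, mississippi,
missouri, montana, nebraska, nevada, new hampshire, new jersey, new mexico,
new york, north carolina, north dakota, ohio, oklahoma, oregon, pennsylvania,
rhode island, south carolina, south dakota, tennessee, texas, utah, vermont,
virginia, washington, west virginia, wisconsin, wyoming
""".split(",")]

# Whole-word match on any state name; the leftmost match in the name wins,
# so "West Virginia Chicken" is reported as west virginia, not virginia.
STATE_RE = re.compile(r"\b(" + "|".join(re.escape(s) for s in STATES) + r")\b")
CHICKEN_RE = re.compile(r"\bchicken\b")


def loose_state(name):
    """Return the first state mentioned if the name also contains 'chicken'."""
    lowered = name.lower()
    if not CHICKEN_RE.search(lowered):
        return None
    match = STATE_RE.search(lowered)
    return match.group(1) if match else None
```

This picks up "Maryland Chicken" and "California Chicken & Pizza", still misses "South Dakota Spicy Wings", and has no way of catching a misspelled state.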

This brings in a few more states – Maryland, New Jersey, and Nevada:

4 state chicken map

Let’s have another look at the UK’s south vs north split. We’ve got a bit of midlands representation now, with the Maryland Chickens in Leicester and Nottingham, the Nevada Chickens in Nottingham and Derby, and a California Chicken & Pizza near Dudley. The latitude naming split between the south/midlands and the north isn’t quite as obvious anymore:

5 latitude with no fried restriction

…but, there is still a noticeable difference. This graph shows each chicken shop with an American state and the word “chicken” in the name, ordered by latitude going south to north:

6 north vs midlands and south

In the south and the midlands, there’s the occasional chicken shop that’s going individual – there’s the Texas Fried Chicken in Edmonton, the two Mississippi places in London which don’t seem to be related (Mississippi Chicken & Pizza in Dagenham, Mississippi Fried Chicken in Islington), the Kansas Chicken & Ribs place in Hornsey is almost definitely a different chain from the six Kansas Fried Chicken shops in and around Manchester, and the California Fried Chicken in Luton is probably independent of the California Fried Chickens on the south coast – but most of them are Tennessee or Maryland chains in the same area. In all, the south and midlands have 17 chicken shops named after 8 American states (excluding Kentucky), or a State-to-Chicken-Shop ratio of 0.47.

In the north, however, there’s a proliferation of independent chicken shops – 15 shops named after 9 different states (excluding Kentucky), or a State-to-Chicken-Shop ratio of 0.6. There’s the chain of six Kansas Fried Chicken places and two Florida Fried Chicken places in Manchester and Oldham, but the rest are completely separate. Good job, The North.
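For anyone checking the arithmetic, the State-to-Chicken-Shop ratio is just the number of distinct states divided by the number of shops. A tiny Python sketch using the counts quoted above (the function name is made up for illustration):

```python
def state_to_shop_ratio(n_states, n_shops):
    """Distinct states divided by shops: 1.0 would mean every shop has its own state."""
    return n_states / n_shops


south_midlands = state_to_shop_ratio(8, 17)   # 8 states across 17 shops
north = state_to_shop_ratio(9, 15)            # 9 states across 15 shops

print(f"{south_midlands:.2f}")  # 0.47
print(f"{north:.2f}")           # 0.60
```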

The broader question is: why does the UK do this? There’s obviously the copycat nature of it; chicken shops want to seem plausible, and sounding like a KFC (and looking like one too, since they’re almost always designed in red/white/blue colours) links it in people’s minds. I think there’s more to it, though. Having a really American-sounding word in the name is probably a bit like how Japanese companies scatter English words everywhere to sound international and dynamic (even if they make no sense), or how Americans often perceive British names and accents as fancier and more authoritative (even if to British ears it’s somebody from Birmingham called Jenkins). We’re doing the same, but… for fried chicken.

Finally, since this data is all from the Food Standards Agency’s hygiene ratings, it’d be a shame not to look at the actual hygiene ratings:

7 hygiene

It looks like independently-named chicken shops named after American states in the north are more hygienic. The chains in the south and midlands – Tennessee, Maryland, California, and especially New Jersey – don’t have great hygiene ratings, and the independent shops do pretty badly too. In contrast, the chicken shops in the north score highly for cleanliness. In fact, a quick linear regression of hygiene onto latitude gives me an R² of 0.74 and a p-value of < 0.0001. Speculations as to why this is on a postcard, please.
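For readers wondering what a regression like this involves, here is a minimal least-squares sketch in plain Python. The latitudes and ratings below are made-up toy values, not the FSA data, so the numbers it produces say nothing about real chicken shops; `fit_line` is a hypothetical helper, and it stops at the slope, intercept, and R². A p-value needs a t-distribution, which something like `scipy.stats.linregress` will report for you.

```python
def fit_line(xs, ys):
    """Ordinary least squares of y on x; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single predictor, R squared equals the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Toy, invented data: latitude in degrees north, hygiene rating from 0 to 5.
toy_latitudes = [50.8, 51.5, 52.6, 53.5, 53.6]
toy_ratings = [2, 3, 3, 4, 5]

slope, intercept, r_squared = fit_line(toy_latitudes, toy_ratings)
print(f"slope={slope:.2f}, R squared={r_squared:.2f}")
```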

Preëmpting your questions/comments:

“I live in […] and my local shop […] isn’t mentioned!”
Maybe you’re talking about a Dallas Chicken place. That’s not a state. Nor is Dixy Chicken, it just sounds a bit American. If it’s definitely a state, then does it have chicken in the name? If not, I won’t have picked it up. I also haven’t picked up shops which have, say, “Vermont Fried Chicken” written on the shop sign if it’s registered in the database as “VFC”. Same with if the state is misspelled, either by the shop or by the data collectors. If it’s all still fine, perhaps the shop is so new that it hasn’t had an inspection… or perhaps the shop is operating illegally and isn’t registered for a hygiene inspection.

“Did you know about Mr. Chicken, the guy who designs the signs?”
I didn’t, but I do now! He’s brilliant.

“How did you do all this?”
I use Alteryx for data scraping/preparation and Tableau for data visualisation.

“I have an idea for something / I want to talk to you about something, can I get in touch?”
Please do! My Twitter handle is @GwilymLockwood, or you can email me on gwilym.lockwood@theinformationlab.co.uk

“Your analysis is amazing, probably the best thing I’ve ever seen with my eyes. Where can I explore more of your stuff?”
Thanks, that’s so kind! There’s a lot of my infographic work on my Tableau Public site here.

R

Quantifying three years of a long distance relationship

I read two really useful guides to processing text data recently: an analysis of Trump’s tweets to work out whether it’s him or an intern sending them, and a sentiment analysis of Pride and Prejudice. Three years of a long distance relationship means that I have a nice big corpus of WhatsApp messages between my girlfriend and me, so I did the romantic thing and quantified some of our interactions in R. This required quite a bit of text munging in Excel first, which turned out to be far quicker and easier than using regex in this case.

First of all, let’s look at when we text each other throughout the day. We’re in different time zones, but only by an hour, and since texts are inherently dependent – one text is overwhelmingly likely to lead to another pretty soon after – I haven’t adjusted the times.

text no by hour of day.png

Our texting activity represents our general activity pretty well; nothing much going on until about 7am, then a slow start to the day, a bit of a post-lunch dip, and then an evening peak when we’re more likely to be out and about doing things.
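Counting texts by hour of day is a simple tally once the messages are in a tidy shape. The analysis in this post was done in R; the Python below is only a sketch that assumes each message has already been reduced to a (timestamp, sender) pair, and the sample messages are invented.

```python
from collections import Counter
from datetime import datetime


def messages_per_hour(messages):
    """Count messages for each (sender, hour of day) pair."""
    counts = Counter()
    for sent_at, sender in messages:
        counts[(sender, sent_at.hour)] += 1
    return counts


# Invented sample messages: (timestamp, sender).
sample = [
    (datetime(2015, 3, 1, 7, 5), "me"),
    (datetime(2015, 3, 1, 7, 40), "her"),
    (datetime(2015, 3, 1, 19, 2), "her"),
    (datetime(2015, 3, 2, 19, 30), "her"),
]

per_hour = messages_per_hour(sample)
print(per_hour[("her", 19)])  # 2
```

Swapping `sent_at.hour` for `sent_at.date()` would give the per-day counts instead.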

We can also have a look at how many messages we send each other, and how that’s changed over time:

text no by date.png

We’ve sent each other a fairly similar number of texts per day throughout the long distance period, but it looks pretty bad on me that I have consistently sent fewer texts than her…

…or does it? When I plot the length of each text sent, I consistently write longer messages:

text length by date.png

So, there’s two distinct texting styles here; I write longer messages less frequently, she writes shorter messages more frequently. The other thing I like about the text length graph is that you can see the times when we’ve been together and not texted each other that much; three weeks in November 2014 when I was running experiments in London, three weeks around Christmas 2015, and a load of long weekends throughout. It’s not that we don’t text each other at all then, it’s more that those texts tend to be stuff like “have we got milk?”, or simply “pub?”.

Plotting log likelihood ratios of how much each of us uses each word in comparison to the other also captures our texting styles:

top 20 words each (no names).png

For example, we both use the word /ha/ to express laughter, but I spell it “ha” and she spells it “hah”. Likewise, “til” and “till” as abbreviations for “until”, and I seem to use “somebody” while she uses “someone”.
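One simple way to score how much more one person uses a word than the other is a smoothed log ratio of relative frequencies. This Python sketch is a stand-in under that assumption and not necessarily the exact statistic behind the chart; `log_ratios` is a hypothetical name and the word lists are invented.

```python
import math
from collections import Counter


def log_ratios(words_a, words_b):
    """log2 of A's smoothed relative frequency over B's, for every word either uses."""
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    vocab = set(counts_a) | set(counts_b)
    # Add-one smoothing so a word one person never uses doesn't divide by zero.
    total_a = sum(counts_a.values()) + len(vocab)
    total_b = sum(counts_b.values()) + len(vocab)
    return {
        word: math.log2(
            ((counts_a[word] + 1) / total_a) / ((counts_b[word] + 1) / total_b)
        )
        for word in vocab
    }


# Invented mini-corpora: positive scores lean towards A, negative towards B.
ratios = log_ratios(["ha", "ha", "til", "pub"], ["hah", "hah", "till", "pub"])
```

Here "ha" scores positive, "hah" scores negative, and "pub", which both sides use equally, comes out at zero.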

If we filter out equivalent words and proper names (like the pubs, supermarkets, and stations we go to most often), another difference in dialogue style appears:

top 10 words each (no proper names).png

I am apparently a lot more conversational; I write out interjections (hmm, oooh, hey, ohhh) and reactions (fuck’s comes from for fuck’s sake, hoera comes from the Dutch phrase hiep hiep hoera, and boourns comes from, erm, The Simpsons). Apart from hhmmm, she doesn’t write interjections or contextual replies at all. Apart from the interjections and replies, my main thing is adjectives; she tends towards nouns and verbs.

The next step is sentiment analysis. If I plot log likelihood bars for each sentiment, I seem to be an atrociously negative person:

sentiment error bars.png

…but this, I think, is more a problem with the way sentiment analysis works in the syuzhet and tidytext packages using NRC sentiment data. Each word in the NRC lexicon has a given value, 0 or 1, for each of a range of sentiments, and this style of sentiment analysis simply adds those values up across all the words in a given set.
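To make that adding-up concrete, here is a rough Python sketch of the same idea. It is not the syuzhet or tidytext code, and the mini-lexicon is invented for illustration: the anger and disgust flags on damn follow what this post reports, and every other value is made up rather than taken from the real NRC data.

```python
from collections import Counter

# Toy lexicon in the NRC style: each word gets a 0 or 1 per sentiment.
# Apart from damn's anger/disgust flags, these values are invented.
TOY_LEXICON = {
    "damn": {"anger": 1, "disgust": 1, "joy": 0},
    "horrible": {"anger": 1, "disgust": 1, "joy": 0},
    "ceremony": {"anger": 0, "disgust": 0, "joy": 1},
}


def tally_sentiments(text, lexicon=TOY_LEXICON):
    """Sum the 0/1 flags of every known word, ignoring context entirely."""
    totals = Counter()
    for raw in text.lower().split():
        word = raw.strip(".,!?\"'")
        for sentiment, flag in lexicon.get(word, {}).items():
            totals[sentiment] += flag
    return totals


totals = tally_sentiments("Damn, that's horrible")
print(totals["anger"], totals["disgust"], totals["joy"])  # 2 2 0
```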

Because of that, it doesn’t really capture the actual sentiment behind the way we’re using these words. Let’s look at the main words driving the differences in each sentiment:

sentiment log likelihood words.png

For me, a lot of my disgust and anger is coming from the word damn. If I was texting damn! every time I stubbed my toe or something, perhaps that would be accurate; but in this case, a lot of the time I write damn, it’s in sympathy, as in exchanges like:

“My computer crashed this afternoon and I lost all the work I’d done today”
“Damn, that’s horrible”

Meanwhile, the word coop is actually me talking about the coöp / co-op, where I get my groceries. I’m not talking about being trapped, either physically or mentally.

The same goes for my girlfriend being more positive. With words like engagement and ceremony, she’s not joyous or anticipatory about her own upcoming nuptials or anything; rather, several of her colleagues have got engaged and married recently, and most of her uses of the words engagement and ceremony are her complaining about how that’s the only topic of conversation at the office. As for assessment, council, and teacher, she works in education. These are generally neutral descriptions of what’s happened that day.

So, I was hoping to be able to plot some sentiment analyses to show our relationship over time, but either it doesn’t work for text messages, or we’re really fucking obtuse. I think it might be the former.

Instead, I’ll settle for showing how much we both swear over time:

expletives per month.png

Each dot represents the number of occurrences per month of a particular expletive. I’m clearly the more profane here, although I do waver a bit while she’s fairly consistent.

More important is that we talk about beer a similar amount:

beer per month.png

Since couples who drink together stay together (or in the words of this study, “concordant drinking couples reported decreased negative marital quality over time”), I think this bodes pretty well for us.
