***
UPDATE: this blog was written for macro inputs, but I’ve now written about a more general, and better, way of sorting them out. If you’re trying to sort out your underscores in your field names, give this blog a go!
***
I was building an Alteryx app for a client this week, and spent an hour or two tripping up over a really straightforward issue. My workflow worked just fine for a small subset of the data that I was testing it on… but when I fed in the rest of the data, I got this error message:
This isn’t helpful. My data is perfectly clean, thank you very much. I’m not having that. The workflow was working fine for a subset of the data, so there’s no reason it should have tripped up just because more data was added. Or so I thought… but it turns out that Alteryx’s Crosstab tool has a problem with special characters.
Let’s start from the beginning. I’m building an app with a drop down menu which lets you filter the data to a single value. That looks a little bit like this:
You can manually type in the possibilities in the drop down tool, but if there’s a lot of them (which there generally are), it’s a bit of an arse ache, and it’s not dynamic either in case the data changes in future… so the best option is to populate the drop down menu with the field names of a connected tool:
Irritatingly, there isn’t an option in the drop down tool configuration to take distinct values from the rows of a particular field of a connected tool. This means that you have to take the field where the interesting stuff is and crosstab it, so that all the values become a column heading.
This is pretty straightforward. First, I used a summarize tool and grouped the data by the field which has the values which you want to be in the drop down tool. Then, because you can’t crosstab a single field, I simply grouped by the same field again. That gave me this output:
…and I just crosstabbed it so that I’d get A * B, A + B, and A – B as the field names, and also A * B, A + B, and A – B as the first row of data.
But no:
The warning message is more informative than the error message here. What’s going on with the multiple fields named “A___B”?
It turns out that the crosstab tool automatically changes special characters, like *, +, and -, to underscores in field names. In my subset of data at work, I wasn’t working with any values with special characters in them; but when I brought the rest of the data in, there were values that were textually different, like A * B and A + B, which became the same thing when replacing the special characters with underscores. I’m not sure why it does this; my guess is that it’s something to do with making field names compatible with programmes like SQL and R, which are more restrictive in the characters they allow in field names.
I wasted quite a bit of time trying to work out what was going on here, but luckily, there’s a simple solution. Instead of grouping by the field in the summarize tool twice, just group by that field once. Then, add a Record ID tool in, so that you get something like this:
Now, you can crosstab successfully. Put the Record ID field as the new column headers, and the thing you’re actually interested in as the values:
The next step is to use a dynamic rename tool to take the column names from the first row of data. Unlike the crosstab tool, the dynamic rename tool doesn’t change special characters when assigning new column names:
…and there you go. Now you have an app where the populated drop down menu works with special characters!
Pingback: Between order and (statistical) model: how the crosstab tool in Alteryx orders things alphabetically but inconsistently | Vizzee Rascal
Pingback: Between order and (statistical) model: how the crosstab tool in Alteryx orders things alphabetically but inconsistently – Cloud Data Architect
Hello mate great blog postt
LikeLike