url encoding safe relational operators
By traviscj
I was catching up on Jesse Wilson’s blog this morning, and in particular his post URL Encoding Is Material. In that article, he says:
My advice
If you’re defining your own URLs, you’ll save a lot of trouble by avoiding characters like <, >, {, }, +, ^, &, |, and ;.
I came across this problem in one of my projects at work:
I wanted a page to display some recent records with 1) a name field provided by the user and 2) a value field that matches $l \leq \text{value} \leq b$, where $l$ and $b$ are given by the user.
I also wanted that page to have a URL representing that condition, so that it could be easily shared with teammates and so forth.
When I started working on it, I didn’t have Jesse’s wisdom, and so I naively came up with a URL structure something like
https://service/matches.html#0.5<=theName<=0.6
I reveled in my brilliance and sent a link to one of my co-workers, who promptly told me the page just gave an error. So I clicked the link myself, and saw something like this:
![matches unsafe encoded]({{ site.baseurl }}/assets/matches_unsafe_encoded.png)
Something had turned the URL into https://service/matches.html#0.5%3C=theName%3C=0.6!
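The same transformation can be reproduced with a minimal Python sketch. Here `urllib.parse.quote` is only a stand-in for whatever actually rewrote the link (the post does not say what did), and `safe="="` is an assumption chosen so the output mirrors the URL above:

```python
from urllib.parse import quote, unquote

fragment = "0.5<=theName<=0.6"

# "<" is not in the safe set, so it is escaped as %3C.
# "=" is passed as safe here only to mirror the URL shown in the post.
encoded = quote(fragment, safe="=")
print(encoded)           # 0.5%3C=theName%3C=0.6

# Decoding recovers the original text, but only if the page bothers to decode.
print(unquote(encoded))  # 0.5<=theName<=0.6
```

A page that parses the raw fragment without decoding it first will see `%3C=` where it expected `<=`, which is one plausible way to end up with the error above.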
At this point I came across some similar advice somewhere on the Internet and re-thought the use of <=.
I considered separating with slashes or something, but the real project had two of these conditions in the URL, so that seemed a bit unwieldy:
https://service/matches.html#0.5/nameA/0.6/0.7/nameB/0.8
This has the same feel as a program with a long list of positional arguments – it requires the user to remember the convention of “lower bound, then name, then upper bound”.
It’s also not very flexible – it would be much harder to support a condition with < instead of <= or to completely omit one side of the bound.
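To make the positional convention concrete, here is a rough Python sketch of what parsing that slash-separated form might look like. The function name `parse_positional` is hypothetical, and the "lower, name, upper" grouping is taken from the example URL above:

```python
def parse_positional(fragment: str) -> list[tuple[float, str, float]]:
    """Parse 'lower/name/upper/lower/name/upper/...' into (lower, name, upper) triples."""
    parts = fragment.split("/")
    if len(parts) % 3 != 0:
        # There is no room in this format for a missing bound or a different operator.
        raise ValueError(f"expected groups of three fields, got {len(parts)} fields")
    return [
        (float(parts[i]), parts[i + 1], float(parts[i + 2]))
        for i in range(0, len(parts), 3)
    ]


conditions = parse_positional("0.5/nameA/0.6/0.7/nameB/0.8")
print(conditions)  # [(0.5, 'nameA', 0.6), (0.7, 'nameB', 0.8)]
```

The meaning of each field comes purely from its position, so every condition has to carry exactly three fields and the operator is always implied.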
So I wanted to keep the conditions separated by /, but have something else to denote the separation between the bound and the name.
I initially considered using magic strings like leq/lt for that separation, but this also seemed a bit unsatisfying – what if the name for some record contained leq or lt?
After mulling it over a bit, my Fortran training kicked in.
Fortran defines relational operators like .LT., .LE., and so forth. Borrowing that dot-delimited style (with leq in place of Fortran’s LE) resulted in URLs like:
https://service/matches.html#0.5.leq.nameA.leq.0.6/0.7.leq.nameB.leq.0.8
which, while a bit wordy, worked pretty well. And I was finally able to share links with impunity:
![matches safe]({{ site.baseurl }}/assets/matches_safe.png)
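For illustration, a fragment in this format could be parsed along the following lines. This is a minimal sketch rather than the project's actual code: the names `Condition`, `parse_condition`, and `parse_fragment` are hypothetical, the `lt` operator is assumed alongside the `leq` shown above, and only the two-sided form is handled.

```python
import re
from typing import NamedTuple
from urllib.parse import quote

# Assumed operator vocabulary; the example URLs only show "leq".
OPERATORS = {"lt": "<", "leq": "<="}

# The capturing group makes re.split keep the operator names in the result.
_SEPARATOR = re.compile(r"\.(lt|leq)\.")


class Condition(NamedTuple):
    lower: float
    lower_op: str
    name: str
    upper_op: str
    upper: float


def parse_condition(text: str) -> Condition:
    """Parse a single 'lower.op.name.op.upper' condition."""
    parts = _SEPARATOR.split(text)
    if len(parts) != 5:
        raise ValueError(f"expected lower.op.name.op.upper, got {text!r}")
    lower, lower_op, name, upper_op, upper = parts
    return Condition(
        float(lower), OPERATORS[lower_op], name, OPERATORS[upper_op], float(upper)
    )


def parse_fragment(fragment: str) -> list[Condition]:
    """Parse a '/'-separated list of conditions, with or without the leading '#'."""
    return [parse_condition(part) for part in fragment.lstrip("#").split("/")]


fragment = "0.5.leq.nameA.leq.0.6/0.7.leq.nameB.leq.0.8"
for condition in parse_fragment(fragment):
    print(condition)

# Only letters, digits, "." and "/" appear, so percent-encoding leaves it unchanged.
print(quote(fragment, safe="/") == fragment)  # True
```

Because the separator includes the surrounding dots, a name such as `leqName` parses without trouble. A name that itself contains `.leq.` or `.lt.` would still be ambiguous, so this sketch assumes names never do.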