Choosing a Geocoder in 2021
Geocoding is something most geospatial users deal with on a regular basis. It is often one of the first steps in analyzing a dataset. Yet we rarely think about why we choose one geocoder over another, beyond “I know what I know, and I like what I like.” Reflecting on this, I realized that I have made some very poor decisions in picking geocoders over the years, and my resolution for 2021 is not to repeat that mistake.
My History With Geocoders
While writing this article, I tried to write down every geocoder I’ve ever used. My Gmail account dates from 2005, so there are about 15 years of documentation in there for me to look at, but I was using geocoders even before Gmail. The less said about the Fortran-based ones I used in the 90s the better, but my first real exposure to geocoding was with TIGER/Line in ArcView 3.x and ArcGIS 8.x. Boy, our standards for accuracy were much lower back then, but it did teach me a lot about how a geocoder works.
Once Google, Yahoo! and others released APIs in the mid-2000s, geocoding became much easier. We suddenly had many more choices, including highly accurate rooftop-level geocoding, which really was a game-changer for many of us. Over the past decade, I have probably used the Mapbox and Google geocoders more than any others, but as I mentioned before, I have not always been using the right tool for the right reason.
What I look for in a Geocoder
Traditionally I look for a good API, one that has an SDK for my platform/language and a pricing model I understand. I probably use Google’s geocoder the most because I am generally building applications on Google’s mapping APIs. In that case, the Google geocoder makes a ton of sense. But I have also used Mapbox’s geocoder quite a bit when using their APIs. If you made me choose which one I liked the most, I don’t think I could answer. I always pick the geocoder that belongs to the platform I’m on. But this leaves out the big issue I’ve been thinking about.
What should matter and when
The two big questions I didn’t think about were data portability and accuracy. I used to focus on using the geocoder of whatever platform I was on, but as I’ve started integrating data across disparate platforms, being tied to displaying geocoded data on one specific mapping API has become an impediment. Having a geocoder that is platform-agnostic is critical to my workflow these days, as my data is all over the place. The freedom to pick and choose now trumps my desire for a single platform.
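To make the portability point concrete, here is a minimal sketch of keeping geocoded results in a provider-neutral form. It assumes nothing about any particular geocoding service: the `GeocodeResult` class, its field names, and the `example-provider` label are all hypothetical. The only fixed convention is GeoJSON itself, which stores point coordinates in longitude, latitude order.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GeocodeResult:
    """Provider-neutral record. All field names here are hypothetical."""
    query: str
    lat: float
    lon: float
    provider: str   # which service produced the match
    precision: str  # e.g. "rooftop", "street", "neighborhood"


def to_geojson_feature(result: GeocodeResult) -> dict:
    """Convert a result to a GeoJSON Feature (coordinates are [lon, lat])."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": [result.lon, result.lat],
        },
        "properties": {
            "query": result.query,
            "provider": result.provider,
            "precision": result.precision,
        },
    }


if __name__ == "__main__":
    # Illustrative values only; "example-provider" is a placeholder name.
    sample = GeocodeResult("Paris, France", 48.8566, 2.3522,
                           "example-provider", "city")
    print(json.dumps(to_geojson_feature(sample), indent=2))
```

A record like this can be loaded into most mapping stacks. Whether a given provider’s terms allow storing and reusing results this way is something to check per provider.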
Secondly, accuracy has become a big deal to me. At previous companies, I used rooftop geocoding for data that only needed neighborhood-level accuracy. Essentially I was paying for accuracy I didn’t need, or even want to imply. Paying for an expensive geocoder when you only need neighborhood-level accuracy is a huge waste of money and, in my case, a waste of time.
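One way to frame that decision is to state the precision a project needs up front and then pick the cheapest option that meets it. The sketch below is only an illustration of that reasoning: the provider names, precision levels, and prices are made up, and real pricing models are rarely this simple.

```python
from dataclasses import dataclass

# Ordered from coarsest to finest. The level names are illustrative.
PRECISION_LEVELS = ["city", "neighborhood", "street", "rooftop"]


@dataclass(frozen=True)
class Provider:
    name: str
    best_precision: str   # finest level the provider can deliver
    cost_per_1000: float  # made-up price per 1,000 requests


def cheapest_adequate(providers, required):
    """Return the cheapest provider whose precision meets the requirement."""
    needed = PRECISION_LEVELS.index(required)
    adequate = [
        p for p in providers
        if PRECISION_LEVELS.index(p.best_precision) >= needed
    ]
    if not adequate:
        raise ValueError(f"no provider offers {required!r} precision")
    return min(adequate, key=lambda p: p.cost_per_1000)


# Placeholder providers with invented numbers, for illustration only.
PROVIDERS = [
    Provider("provider_a", "rooftop", 5.0),
    Provider("provider_b", "street", 1.0),
    Provider("provider_c", "city", 0.2),
]

if __name__ == "__main__":
    print(cheapest_adequate(PROVIDERS, "neighborhood").name)
```

With these invented numbers, a neighborhood-level project never touches the rooftop-priced option, which is exactly the mistake described above.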
And one more thing
A final point to consider: when we talk about geocoding, most people mean “forward geocoding” (turning an address or place name into coordinates) and don’t pay much attention to reverse geocoding (turning coordinates into an address or place name). But reverse and forward geocoding are different, and it doesn’t make sense to assume that because a provider is good (or bad) at one, it is at the same level for the other. If you are wondering what reverse geocoding is all about, check out OpenCage’s guide to reverse geocoding. Just like with accuracy, you may be overspending if you pay a premium for a service that is great at forward geocoding and then use it only for reverse.
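A toy example helps show why the two directions are different problems. In the sketch below, forward geocoding is a text lookup, while reverse geocoding is a nearest-neighbor search over coordinates. The three-city gazetteer and both function names are illustrative assumptions, and real services do far more (address parsing, fuzzy matching, spatial indexes).

```python
import math

# A tiny illustrative gazetteer: name -> (lat, lon), approximate city centers.
PLACES = {
    "London": (51.5074, -0.1278),
    "Paris": (48.8566, 2.3522),
    "Berlin": (52.5200, 13.4050),
}


def forward_geocode(name):
    """Name -> coordinates: a plain text lookup, or None if unknown."""
    return PLACES.get(name)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def reverse_geocode(lat, lon):
    """Coordinates -> nearest known place name: a spatial search."""
    return min(PLACES, key=lambda name: haversine_km(lat, lon, *PLACES[name]))


if __name__ == "__main__":
    print(forward_geocode("Paris"))
    print(reverse_geocode(48.9, 2.3))
```

The forward path depends on how well text is matched; the reverse path depends on the density and quality of the underlying spatial data. A provider can be strong at one and weak at the other.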
So which Geocoder am I using?
Well, I’m using them all. The epiphany I’ve come to doesn’t mean that I throw Google/Apple or Mapbox/Here out the door. It just means that I now look at my use case before getting started and see what works best for my project. It might be Google on Google Maps, Apple on Apple Maps, or Mapbox on Mapbox, but it also might be OpenCage on OpenStreetMap, OpenCage on Here, or any of the endless combinations my clients might come up with. Bringing in tools such as OpenCage gives me lower cost, permissive licensing, and no vendor lock-in. Having multiple geocoders under a single API gives me the freedom to focus on my data and not on the technological process of geocoding.
Embracing more geocoders in my workflow has made my products better, my clients happier, and nothing beats saving money and time.