Research on GEOCODE (Geospatial Entity Object Code) to represent a geographic point of interest (POI), and methods to evaluate and choose a code for an appropriate purpose

地理的座標を表現するコードシステムと、目的に応じたコードの評価および選択手法の研究

Naoki Ueda and Venkatesh Raghavan Graduate School for Creative Cities, Osaka City University, 3-3-138 Sugimoto, Sumiyoshi-ku, Osaka 558-8585, Japan

ABSTRACT: A time-expression format such as “10:15” is common, and everyone uses it naturally in daily life. In contrast, a location-expression format such as latitude and longitude is not used in daily life, because it is not convenient for people to remember and use. Today, GPS receivers are built into most mobile phones, and many location-based services (LBS) are gaining popularity. However, we still rely on descriptive explanations to indicate a location, and we spend much time and cost communicating a location to others. Therefore, in attempts to make location as easy to handle as time, various GEOCODEs (geospatial entity object codes) have been invented around the world. None of them, however, is in popular use yet. This report gives an overview of GEOCODE and offers some perspectives for evaluating and choosing an appropriate GEOCODE for a specific purpose. The author of this report is a GEOCODE researcher and an inventor of several GEOCODEs.

KEY WORDS: GIS, LBS, Coordinates, GEOCODE, geospatial entity object code

概要:時刻を表す『10:15』のようなフォーマットは、ごく自然に日常生活の中で使われています。これに対して、場所を表すフォーマット、例えば『緯度・経度』は日常生活で自然に使用できるような便利なフォーマットではなく、あまり利用されていません。今日、ほとんどの携帯電話には GPS 機能が内蔵され、位置ベースサービス(LBS)が普及してきました。しかし、我々は場所を表すには住所や説明的な文章を使うことが多く、他者に場所を伝えるときに大変な労力を使っています。まだ一般的ではありませんが、時刻と同じように簡単に場所を扱えるように、これまでたくさんのジオコード(地理空間物を特定するコード)が世界中で発明されてきました。本レポートの筆者はいくつかのジオコードの発明者でもあり、ジオコードの研究者でもあります。本レポートでは、ジオコードの概要を紹介し、目的に応じたジオコードを評価・選択する上でポイントとなる幾つかの「視点」を紹介します。

キーワード:GIS、LBS、座標、ジオコード

1. INTRODUCTION

“Where?” Asking about a place is a very fundamental question in daily life. However, the answer to this question is not always as simple as the answer to the question “When?” To clarify the main theme of this research, it is useful to compare the generic characteristics of time and location in our lives.

Focusing first on time, we use “time” routinely in our daily life. For example:

“What time is the meeting?”
“10:15.”

At the beginning, however, it was not that simple. According to Alvin Toffler, the concept of “time” changed drastically in the era of industrialization. Before that era, people used the sun, the moon, the stars, or other natural phenomena to know “when” (Toffler, 2006). For example:

“Let’s meet when the sun comes up to the top.”
“It will begin two days after the day when Sirius rises just before sunrise.”

In addition, people perceived time as something cyclical, like the seasons, and sometimes it flowed slowly and sometimes quickly. Therefore, telling another person “exactly when” required considerable effort or cost.

In the industrialization era, the commoditization of watches and clocks first enabled people to know an exact time. Secondly, industrialization needed a concept of “industrialized time” that was straight, linear, and moved at a constant speed, in order to enable factory efficiency as we know it today. A new school system taught people how to read a clock and how to work or study according to a planned schedule.

In addition, it should be mentioned that the analog and digital formats were also a key part of the commoditization of “time.” It is because of this change that we can now write “13:15” instead of “13 hours and 15 minutes.”

Turning to location, we use location information in our daily life as often as time information. However, regarding location, we are in a situation similar to that of people in the 19th century. Most people do not have a geometric concept of location. People use a descriptive explanation for location, such as a postal address, a landmark, or directions, and spend much effort or cost to tell another person “exactly where.” For example:

“Let us meet in front of the statue.”
“My home is, from the station, go west and turn to the right at the second corner, then….”
“My office is 3-3-138, Sugimotocho, Sumiyoshi-ku, Osaka, ZIP 558-8585.”

Just as we needed a new concept of “time” in the industrialization era, today we are entering a new era regarding “location.” For the first time in the history of mankind, most people can individually know an exact location. The commoditization of GPS-equipped mobile devices enables location lookup for LBS usage. We will soon be demanding a new concept of “location,” and we will be able to handle it as we do “time.”

Currently, most people do not have a common and popular way to express geographic location. Latitude and longitude are not as easy to say or remember as “10:15.” Therefore, we also cannot teach children how to read a location as easily as we teach them how to read a clock.

Society will soon need an easy and common method or format to express location, one as easy as the time format. In this report, I define codes that point to an exact geographic location as “GEOCODE” (Geospatial Entity Object Code), written in capital letters.

According to George Miller, it is hard to hold more than seven chunks of information at once in the short-term memory of the human brain (Miller, 1956). The traditional coordinate system, latitude and longitude, is usually written as follows:

+34.592121, +135.505140

In the example above, the latitude and the longitude each have more than seven digits. Thus, the latitude and longitude system is not a candidate for an “easy” code for location, even though it is widely used in some industries.

Recently, various GEOCODEs have been invented and introduced to point to exact locations. Each takes a different approach, but they all have a common goal: to make “where” as simple and easy as “when.”

This report gives an overview of GEOCODEs. In addition, it introduces perspectives for evaluating GEOCODEs. The author of this report is an inventor of several GEOCODEs and also a GEOCODE researcher. This report is written to give a general idea of GEOCODEs and to show how you can choose an appropriate code that meets your purpose.

Note: Proper names of GEOCODEs are shown in italic and bold (ex. “LocaPoint”).

2. Basic theories behind GEOCODE

2.1. The limitation of length compression

Generally speaking, making something short is considered “compressing.” Thus, “data compression” technology was the first to be considered for making latitude/longitude shorter. However, the data to be compressed is only a single pair of latitude and longitude values, and it is very difficult to apply the usual data compression methods, such as Huffman coding, ZIP, etc.

For example, assume that both the Earth’s meridian and equator are 40,000,000 meters long. To identify a location with 1 meter by 1 meter precision, we need to distinguish

20,000,000 * 40,000,000 = 8 * 10^14 combinations.

Thus,

log2(8 * 10^14) = 49.507 bits of information

are required. Given this amount, there are two approaches to making a code shorter.

The first approach is to reduce the amount of information, by limiting the covered area or by reducing the precision. For example, MapCode reduces the amount of information needed by limiting the coverage area to only the land area of Japan, and decreases the precision to 30 meters. By doing this, MapCode can locate any location in Japan with only 10 numbers. This is suitable for car navigation purposes.

The other approach is called “length compression.” To express 49.507 bits of information, 50 digits are needed in binary notation, and 15 digits are needed in decimal. In theory, if you use a notation system whose radix is higher than 10, the code should be shorter than latitude and longitude, because they are written in decimal (Figure 1).

A decimal-based GEOCODE, including latitude and longitude coordinates, needs 15 numbers to express about 1 by 1 meter precision. Fifteen numbers is almost the same length as a credit card number (16 numbers), and thus it is hard for a person to remember without some sort of tool.

A hexadecimal-based GEOCODE needs only 13 digits, but it usually becomes something like "28a6f6b021cf3," which is still difficult for a person to handle. If you use 10,000 different characters, like a Chinese character set, only 4 characters can express a location. However, it is also difficult to recognize or remember 10,000 different kanji characters.

As Figure 1 shows, increasing the radix does not shorten the code linearly. Once the radix exceeds about 36 to 60, raising it further barely reduces the code length.

Some GEOCODEs, such as LocaPoint, LP-Address, and Maidenhead Locator System, use a complex-radix notation. In short, they use a different radix for different digits in their format. However, they are all still bound by the same rule.
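The arithmetic in this section can be checked with a short script. The following is a minimal sketch and not part of the original study: the function name `min_code_length` and the list of radices are chosen here for illustration, and the Earth dimensions are the rounded values assumed in the text.

```python
# Sketch: minimum number of digits needed to distinguish every
# 1 m x 1 m cell on Earth, for several radix bases.
# Assumption (from the text): meridian and equator are both 40,000,000 m,
# so latitude spans 20,000,000 m and longitude spans 40,000,000 m.

CELLS = 20_000_000 * 40_000_000  # 8 * 10^14 distinct locations


def min_code_length(radix: int, combinations: int = CELLS) -> int:
    """Return the smallest n such that radix**n >= combinations.

    Integer arithmetic is used instead of math.log to avoid
    floating-point rounding near exact powers.
    """
    if radix < 2:
        raise ValueError("radix must be at least 2")
    length, capacity = 0, 1
    while capacity < combinations:
        capacity *= radix
        length += 1
    return length


if __name__ == "__main__":
    for radix in (2, 10, 16, 32, 36, 60, 64, 10_000):
        print(f"radix {radix:>6}: {min_code_length(radix)} digits")
```

Under these assumptions the script reports 50 digits for binary, 15 for decimal, 13 for hexadecimal, and 4 for a 10,000-character set, which agrees with the figures quoted above. Going from radix 36 to radix 64 only shortens the code from 10 to 9 digits.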

Figure 1: Radix base and minimum length for 8*10^14 value.

2.2. Trade-offs among parameters

If a GEOCODE is shorter, it should be easier and simpler to use and remember. But if your GEOCODE covers global locations, it needs to be longer than a regional code. If you want your code to show a more precise location, the length must also be longer. Therefore, there are trade-offs among length, ease of use, precision, and coverage.

Figure 2. Basic categories of GEOCODE

3. Basic GEOCODE categories

Most GEOCODEs can be divided into three types, with a few falling somewhere in between. Each type has strengths and weaknesses (Figure 3). This basic categorization is helpful in evaluating a new GEOCODE.

3.1 Mesh-code type of GEOCODE

This type of GEOCODE divides a coverage area into multiple areas with a mesh and then uses a mesh ID as a code. Many GEOCODEs take this approach.

3.2 Encrypted type of GEOCODE

This type of GEOCODE uses a specific algorithm to convert between a GEOCODE and a latitude and longitude value. The code representation has almost nothing to do with the physical area, so it is hard to match a code with a printed map. These codes are designed on the assumption that they will be used by a computer.

3.3 Database-assisted type of GEOCODE

This type of GEOCODE uses a database to store part or all of the location data, and then uses an identification key to access that data as a code. For example, Navigation-Code uses a database to store the integer part of the latitude and longitude degrees, and uses a city name as a key to recall it. Yahoo! Where On Earth ID is a 32-bit integer that serves as a key to retrieve a geographic location from Yahoo! Web Services.
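To make the database-assisted idea concrete, the following minimal sketch stores the integer part of the coordinates in a lookup table keyed by a city name and keeps only the fractional part in the code itself. It is not the actual Navigation-Code format: the table `CITY_TABLE`, its single entry, the six-digit fraction, and the function names are all assumptions made for illustration. Coordinates are handled as integer microdegrees to avoid rounding error.

```python
from typing import Tuple

# Hypothetical database-assisted code (NOT the real Navigation-Code format).
# The "database" holds the integer degrees for each city key; the code
# carries only the fractional part, which is what makes it short.

# Assumed table: city name -> (integer latitude, integer longitude) in degrees.
CITY_TABLE = {
    "Osaka": (34, 135),
}

MICRO = 1_000_000  # microdegrees per degree (six decimal places)


def encode(city: str, lat_micro: int, lon_micro: int) -> str:
    """Build 'City ffffff ffffff' from coordinates given in microdegrees."""
    lat_deg, lon_deg = CITY_TABLE[city]  # KeyError if the city is unknown
    lat_frac = lat_micro - lat_deg * MICRO
    lon_frac = lon_micro - lon_deg * MICRO
    if not (0 <= lat_frac < MICRO and 0 <= lon_frac < MICRO):
        raise ValueError("point lies outside the one-degree cell of this city")
    return f"{city} {lat_frac:06d} {lon_frac:06d}"


def decode(code: str) -> Tuple[int, int]:
    """Recover microdegree coordinates; requires access to CITY_TABLE."""
    # rsplit keeps city names that contain spaces intact.
    city, lat_frac, lon_frac = code.rsplit(" ", 2)
    lat_deg, lon_deg = CITY_TABLE[city]
    return lat_deg * MICRO + int(lat_frac), lon_deg * MICRO + int(lon_frac)


if __name__ == "__main__":
    # +34.592121, +135.505140 is the example coordinate from the Introduction.
    code = encode("Osaka", 34_592_121, 135_505_140)
    print(code)
    print(decode(code))
```

The decoder cannot work without the table, which illustrates why this type depends on keeping a database available and up to date.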

                     Mesh code type    Encrypted type    Database assisted type
Shorter length       Not Good          Good              Good
Readability          Good              Not Good          Good
Maintainability      Good              Good              Not Good

Figure 3. Strengths and weaknesses of each GEOCODE type.
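As a rough illustration of the mesh-code type, the sketch below divides a rectangular coverage area into a 10 by 10 mesh and then subdivides the selected cell again at each level, so that every level adds one column digit and one row digit. The scheme, the function name `mesh_code`, and the default whole-globe bounding box are assumptions made for this example and do not correspond to any GEOCODE listed in this report.

```python
from fractions import Fraction


def mesh_code(lat, lon, levels, south=-90, west=-180, north=90, east=180):
    """Return a hierarchical mesh ID for (lat, lon) inside a bounding box.

    Each level splits the current cell into a 10 x 10 mesh and appends
    'cr', where c is the column digit (west to east) and r is the row
    digit (south to north). Exact rational arithmetic avoids edge errors.
    """
    lat, lon = Fraction(str(lat)), Fraction(str(lon))
    south, west, north, east = map(Fraction, (south, west, north, east))
    if not (south <= lat < north and west <= lon < east):
        raise ValueError("point is outside the coverage area")
    digits = []
    for _ in range(levels):
        cell_h = (north - south) / 10
        cell_w = (east - west) / 10
        row = int((lat - south) / cell_h)
        col = int((lon - west) / cell_w)
        digits.append(f"{col}{row}")
        # Shrink the bounding box to the selected cell for the next level.
        south += row * cell_h
        west += col * cell_w
        north = south + cell_h
        east = west + cell_w
    return "-".join(digits)


if __name__ == "__main__":
    print(mesh_code(34.592121, 135.505140, levels=2))
    print(mesh_code(34.592121, 135.505140, levels=3))
```

Because a longer code always begins with the code of its parent cell, a mesh ID can be matched against a printed map that carries the same grid, which is the readability advantage shown in Figure 3. The cost is two extra digits for every level of precision.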

4. Ten Perspectives to evaluate GEOCODE

Various GEOCODEs are designed for different purposes. Since there are trade-offs, as explained previously, it is important to choose an appropriate GEOCODE for your purposes. It is like selecting a map with an appropriate projection for a specific purpose. Here are ten perspectives for evaluating a GEOCODE.

4.1 Coverage Area

If a code is regional or limited, the code length can be shorter than that of a global or wider-area code, because the amount of information needed is smaller. If you are looking for a code that covers only one country, there should be a simpler and smaller code available.

4.2 Intuitiveness and Human-Friendliness

Some GEOCODEs are designed for computer-only usage, and others are designed for human use. For human use, a code must have “readable coordinates” and should map intuitively onto a printed map.

NAC, LocaPoint, LP-Address, and N-Code have code formats with a separate latitude part and longitude part. This gives an intuitive interface like a guide map (Figure 4).

Figure 4. Guide-map style (a grid labeled A to E and 1 to 5)

N-Code uses only numbers and is geared toward use by children. StatusQuo has a unique intuitive interface. It uses polar-style coordinates, as orientation methods do, unlike the other codes, which use a guide-map (X-Y) style. StatusQuo provides an angle in time format (3:00 for east, 12:00 for north, etc.) and a distance from an origin point that is defined for each city. If the city structure is not an X-Y shape but a polar shape, as in Rio de Janeiro or Paris, the StatusQuo code follows the city road structure. StatusQuo is designed for the illiterate or for children in developing countries who can only read numbers and a clock.

Non-intuitive GEOCODEs are not designed for human handling. They are more focused on “compressing.” For example, GeoPo tries to make a code as short as possible for Twitter usage.

4.3 Database assisted or pure calculated code

Some codes need database access, or a local table, in order to encode/decode. For example, MapCode needs data for the first mesh coordinate.

Database-assisted types can make a code shorter, and can be asymmetric: it is possible to assign a short code to cities and a longer code to an ocean or a less-populated area. However, a database is needed to work with the code, and it may require network communication. In a disaster, when communication lines are disabled, such a code can be difficult to use.

Other codes can be encoded/decoded by pure calculation from latitude/longitude alone. For example, NAC, LocaPoint, LP-Address, GeoPo, etc., can be converted directly from the latitude and longitude with some simple equations. This makes implementation easy and independent of network connection status.

4.4 Affinity to IT system

There are two points to keep in mind regarding affinity to an information technology system. One is the radix of the notation. Most GEOCODEs use a notation with a high radix. For example, SONY’s GEOCODE uses a radix-37 notation. However, such a radix is not friendly to systems and engineers. The easiest notations are radix 2 (binary), radix 10 (decimal), radix 16 (hexadecimal), radix 32, and radix 64. BINGEO (radix 4), GeoPo (radix 64), and Geohash (radix 32) are system-friendly GEOCODEs.

The other point of affinity is the existence of an aliquant part in a calculation. For example, LocaPoint has a resolution of 0.00000787787542452995343 degree in longitude. This produces an aliquant part in most calculations, and may cause rounding error. On the other hand, LP-Address has a resolution of 0.00001 degree, so it is convenient for calculation. NAC uses radix 30, and this prevents an aliquant part in some calculations, because longitude is based on 360 degrees.

4.5 Altitude consideration

Currently, only NAC has an extension to handle altitude. To handle altitude, you need to be careful about whether the altitude is measured from sea level, geoid level, or ground level. Also, it might be a negative number. Sometimes, it is more convenient to use a floor number rather than an altitude in meters or feet.

Most GEOCODEs can handle only latitude and longitude.

4.6 Area-code concepts

Some codes have an “area code concept” like the area code of a telephone number. If a code is first divided into an area code, and a local code is then assigned inside that area, people can become familiar with the area code and usually do not have to handle it for regional or local usage. Even if the code is long, as long as people already know the area code or use the code within the same area, they only have to handle the local code. This makes the code shorter in practical use.

Most codes have a hierarchical area structure and are very flexible, but do not present a clear area-code concept.

N-Code, LocaPoint, and LP-Address have an area-code concept.

4.7 Fault-tolerant mechanism

In a critical situation, such as a search and rescue mission in a natural disaster, incorrect locations may result in tragedy.

As long as a coordinate is transferred from machine to machine, there is very little need for a fault-tolerant mechanism. However, if coordinate information passes through human operation, such as telephone, walkie-talkie, radio, paper message, etc., then there is always the possibility of a “messaging game” type of error.

Some codes have a structure that prevents or detects faults. SONY’s GEOCODE has an extension that adds one more letter as a check digit. If the code is input incorrectly, the invalid code is detected. This is the same mechanism as that used in a bar-code system in case a bar-code is dirty or a misreading occurs.

LocaPoint and LP-Address have a different type of fault-tolerant mechanism. They use a cognitive psychology approach. They always have a specific pattern (letter, letter, and number) that creates a kind of rhythm, and this helps the brain’s cognition process. The brain is not good at random information, but it is very good at pattern recognition. The pattern also prevents I-1 and O-0 mistakes, because the position in the format indicates whether a character is a number or a letter. That minimizes human error in reading, listening, writing, or inputting.

Most GEOCODEs also try to eliminate I-1 or O-0 type errors. GeoPo, Compact Text encoding by Microsoft, and NAC do not use I and O, to prevent this kind of error. NAC also avoids the use of vowels, since their “sounds” can be confusing among various languages.

4.8 Precision and scalability

The precision of a code is a trade-off with the coverage area and the length of the code. Some codes have a flexible code schema like a decimal value, so you can choose the precision for your purposes. For example, NAC can express about 1.6 by 1.6 meters with 10 digits, and 0.05 by 0.05 meter with 12 digits. N-Code and GeoPo have the same kind of schema.

Some codes have a fixed precision and a fixed format. LocaPoint expresses 0.4 by 0.8 meters with 12 letters, LP-Address expresses 1.1 by 1.1 meters with 12 letters, and MapCode expresses 30 by 30 meters with 10 numbers.

4.9 Licenses and Cost

Some GEOCODEs are patented. For example, BINGEO and SONY’s GEOCODE are patented in the U.S., and MapCode and LocaPoint are patented in Japan. Most other GEOCODEs are patent pending or licensed with copyright and trademarks.

Some GEOCODEs, such as LP-Address and GeoPo, declare an open license and are free.

4.10 Datum

It is very important to check which datum the GEOCODE you are using is based on. Technically, latitude and longitude alone are not enough information to pinpoint a location. The geographic datum is a basic factor in modeling the Earth. The same latitude and longitude values indicate a slightly different location if you use an incorrect datum.

In the U.S., the NAD27, NAD83, and WGS84 datums co-exist. In Japan, there are two datums widely used. NAC, GeoPo, LocaPoint, LP-Address, and most GEOCODEs specify the WGS84 datum.

5. A list of GEOCODEs

BINGEO
US Patent 6,552,670

Compact text encoding of latitude/longitude coordinates by Microsoft
US Patent 7,302,343

C-squares
http://www.marine.csiro.au/csquares/spec1-1.htm

Geohash
http://en.wikipedia.org/wiki/Geohash

GeoPo
http://geopo.at/intl/en/

Georef
http://en.wikipedia.org/wiki/Georef
http://www.map-reading.com/ch4-4.php

Geotude
http://www.geotude.com/about/specs

LocaPoint
http://www.LocaPoint.com/en/index.html
Japan patent 3885157

LP-Address
http://lpaddress.com/

Maidenhead Locator System
http://en.wikipedia.org/wiki/Maidenhead_Locator_System

MapCode
http://guide2.e-MapCode.com/

Marsden square
http://en.wikipedia.org/wiki/Marsden_Square

Munich Orientation Convention (StatusQuo)
http://www.volksnav.com/

Natural Area Coding System (NAC)
http://www.nacgeo.com/nacsite/

Navigation Code by Aishin-AW
Japan patent pending H11-184374

New-geocode by Asian Aero Survey
Japan patent pending 2003-186391

N-Code
http://www.ncproject.jp/main_e.htm

P-code
http://www.google.co.jp/url?sa=t&source=web&ct=res&cd=4&ved=0CBYQFjAD&url=http%3A%2F%2Foneresponse.info%2Fresources%2Fimtoolbox%2Fpublicdocuments%2FPCode%2520-%2520Quick%2520Guide%2520-%25202006.doc&ei=4YuUS52-Dc-GkAWs0_yJDQ&usg=AFQjCNHS9GN9PAyggWC13aeKq_XV6GwO8g&sig2=Qdo-yXfazAdlFchjtmVZ6g

SONY’s GEOCODE
US Patent 6,005,504

World Meteorological Organization squares (WMO squares)
http://en.wikipedia.org/wiki/World_Meteorological_Organization_squares

Yahoo! Where On Earth ID (WOEID)
http://developer.yahoo.com/geo/geoplanet/data/

References

Revolutionary Wealth (2006, Alvin Toffler & Heidi Toffler)

"The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information" (1956, George Armitage Miller)