WO2012172160A1 - Method and apparatus for resolving geo-identity

Publication number
WO2012172160A1
Authority
WO
WIPO (PCT)
Prior art keywords
geo
information
combination
terms
interest
Prior art date
Application number
PCT/FI2012/050470
Other languages
French (fr)
Inventor
Juong-Sik Lee
Deepti Chafekar
Umesh Chandra
Gyan RANJAN
Original Assignee
Nokia Corporation
Priority date
Filing date
Publication date
Application filed by Nokia Corporation
Priority to CN201280029185.8A (CN103609144A)
Publication of WO2012172160A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/021 Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services

Definitions

  • LBS location based services
  • the geographical position is usually available from the GPS coordinates obtained from a GPS-enabled handheld device.
  • the GPS data is not available when GPS connectivity is weak, when user devices are not GPS-enabled (common in emerging markets) or are not mobile, when the user searches for information about a location different from the GPS location, etc.
  • the existing geo-coding services may convert geographic coordinates and structured location data (e.g., street addresses) into names of places (e.g., points of interest (POIs)), street addresses, neighborhoods, cities/towns, counties/provinces, states, or countries, etc., to provide users with location information that they can understand.
  • location data that follows a definite structure is not always available or well recognized in all areas and/or countries, especially in developing countries. It is challenging to handle unstructured and/or imprecise location information (e.g., between the third and the fourth bicycle rental shops on the beach), and colloquial and unofficial location references (e.g., "the Big Apple" as a nickname for New York City, "near XYZ hospital," or "at ABC mall"). Accordingly, service providers and device manufacturers face significant technical challenges in processing un-structured and/or colloquial location information and identifying user-friendly location information therein. Such geo-identity resolution enhances the geo-coding database and the user experience of location-based services.
  • a method comprises causing, at least in part, a collection of point-of-interest information from one or more sources.
  • the method also comprises processing and/or facilitating a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof.
  • the method further comprises causing, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas.
  • the method further comprises processing and/or facilitating a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
  • an apparatus comprises at least one processor, and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause, at least in part, the apparatus to cause, at least in part, a collection of point-of-interest information from one or more sources.
  • the apparatus is also caused to process and/or facilitate a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof.
  • the apparatus is further caused to cause, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas.
  • the apparatus is further caused to process and/or facilitate a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
  • a computer-readable storage medium carries one or more sequences of one or more instructions which, when executed by one or more processors, cause, at least in part, an apparatus to cause, at least in part, a collection of point-of-interest information from one or more sources.
  • the apparatus is also caused to process and/or facilitate a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof.
  • the apparatus is further caused to cause, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas.
  • the apparatus is further caused to process and/or facilitate a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
  • an apparatus comprises means for causing, at least in part, a collection of point-of-interest information from one or more sources.
  • the apparatus also comprises means for processing and/or facilitating a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof.
  • the apparatus further comprises means for causing, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas.
  • the apparatus further comprises means for processing and/or facilitating a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
  • a method comprises causing, at least in part, a collection of point-of-interest information from one or more sources.
  • the method also comprises causing, at least in part, transmission of a query to a geo-coding service.
  • the method further comprises receiving a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof, wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
  • an apparatus comprises at least one processor, and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause, at least in part, the apparatus to cause, at least in part, transmission of a query to a geo-coding service.
  • the apparatus is also caused to receive a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof, wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
  • a computer-readable storage medium carries one or more sequences of one or more instructions which, when executed by one or more processors, cause, at least in part, an apparatus to cause, at least in part, transmission of a query to a geo-coding service.
  • the apparatus is also caused to receive a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof, wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
  • an apparatus comprises means for causing, at least in part, transmission of a query to a geo-coding service.
  • the apparatus also comprises means for processing and/or facilitating a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof.
  • the apparatus further comprises means for receiving a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof, wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
  • a method comprising facilitating a processing of and/or processing (1) data and/or (2) information and/or (3) at least one signal, the (1) data and/or (2) information and/or (3) at least one signal based, at least in part, on (including derived at least in part from) any one or any combination of methods (or processes) disclosed in this application as relevant to any embodiment of the invention.
  • a method comprising facilitating access to at least one interface configured to allow access to at least one service, the at least one service configured to perform any one or any combination of network or service provider methods (or processes) disclosed in this application.
  • a method comprising facilitating creating and/or facilitating modifying (1) at least one device user interface element and/or (2) at least one device user interface functionality, the (1) at least one device user interface element and/or (2) at least one device user interface functionality based, at least in part, on data and/or information resulting from one or any combination of methods or processes disclosed in this application as relevant to any embodiment of the invention, and/or at least one signal resulting from one or any combination of methods (or processes) disclosed in this application as relevant to any embodiment of the invention.
  • a method comprising creating and/or modifying (1) at least one device user interface element and/or (2) at least one device user interface functionality, the (1) at least one device user interface element and/or (2) at least one device user interface functionality based at least in part on data and/or information resulting from one or any combination of methods (or processes) disclosed in this application as relevant to any embodiment of the invention, and/or at least one signal resulting from one or any combination of methods (or processes) disclosed in this application as relevant to any embodiment of the invention.
  • the methods (or processes) can be accomplished on the service provider side or on the mobile device side or in any shared way between service provider and mobile device with actions being performed on both sides.
  • An apparatus comprising means for performing the method of any of originally filed claims 1-30 and 51-53.
  • FIG. 1 is a diagram of a system capable of resolving geo-identification of un-structured and/or colloquial location information, according to one embodiment;
  • FIG. 2 is a diagram of the components of a geo-coding manager, according to one embodiment;
  • FIGs. 3A-3B are flowcharts of processes for resolving geo-identification of un-structured and/or colloquial location information, according to various embodiments;
  • FIG. 4 is a location area diagram mapped with points of interest and utilized in the process of FIG. 3, according to one embodiment;
  • FIGs. 5A-5B show examples of two geo-spatial documents utilized in the process of FIG. 3, according to various embodiments;
  • FIGs. 6A-6D are diagrams of user interfaces utilized in the process of FIG. 3, according to various embodiments;
  • FIG. 7 is a diagram of hardware that can be used to implement an embodiment of the invention;
  • FIG. 8 is a diagram of a chip set that can be used to implement an embodiment of the invention; and
  • FIG. 9 is a diagram of a mobile terminal (e.g., handset) that can be used to implement an embodiment of the invention.
  • the term "point of interest" (POI) refers to a specific point location that an individual entity, business entity, or any legal entity may find useful or interesting. This term is used interchangeably with "landmark".
  • a POI may be a historical monument, cinema theatre, pub, bar, restaurant, hotel, club, venue, sightseeing spot, shopping mall/center, building, museum, industrial/science park, police/fire station, post office, bank, ATM, hospital, pharmacy, school, church, golf course, bridge, historic house, camping/caravan site, daycare center, community center, tunnel, airport, roadway, waterway, railway, rock formation, spring, oasis, mountain, etc.
  • the term "cloud" refers to an aggregated set of information and computational closures from different sources. This multi-sourcing is very flexible since it accounts for and relies on the observation that the same piece of information or computation can come from different sources.
  • information and computations within the cloud are represented using Semantic Web standards such as Resource Description Framework (RDF), RDF Schema (RDFS), OWL (Web Ontology Language), FOAF (Friend of a Friend ontology), rule sets in RuleML (Rule Markup Language), etc.
  • RDF refers to a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model.
  • LBS location-based service
  • LBS services can be used in a variety of contexts, such as navigation, entertainment, health, work, personal life, etc.
  • Location-based services include services to identify a location of a person or object, discover the nearest banking cash machine or the whereabouts of a friend or employee.
  • Location-based services include location-based commerce (e.g., trade and repair, wholesale, financial, legal, personal services, business services, communications and media), location-based ecommerce (e.g., online transactions, coupons, marketing, advertising, etc.), accommodation, real estate, renting, construction, dining, transport and travel, travel guides, mapping and navigation, parcel/vehicle tracking, personalized weather services, location-based games, etc.
  • the term "user context" refers to discrete context characteristics/data of a user and/or the user terminal/equipment (UE), such as a date, time, location, current activity, weather, a history of activities, etc., associated with the user.
  • a contextual structure is inserted with instances, locations (e.g., points of interest), and events (e.g., activities) that contain possible relationships between points of interest and user activities discovered via, for instance, data-mining or other querying processes.
  • the contextual structure incorporates characteristics and features of an individual user's context data, such as the user's calendar, text messages, instant messages, etc.
  • user preference data is also merged into the user context structure.
  • the contextual data elements may include location (where the user/UE is available, where the context information source is applicable, etc.), active dates (the range of dates for which the user/UE and/or the context information source is available), sub-identifiers (each sub-identifier associated with a different location and/or applicable context information source), event type (event information associated with the user/UE), time (of the event, if the user/UE is involved), applicable context (in which the context information source is applicable), context source (what sensors, services, applications, etc. can provide the related contextual information), and optionally preference elements (associated with which preference data elements), etc.
  • the user preferences include user information and user preference data.
  • Typical user information elements include a user identifier (e.g., telephone number), user device model (e.g., to identify device capabilities), age, nationality, language preferences, interest areas, login credentials (to access the listed information resources of external links).
  • the preference data is automatically retrieved and/or generated by the system from the backend data and/or external information sources.
  • the preference data structure is recorded at the user device based upon user personal data, online interactions and related activities with respect to specific topics, points of interest, or locations, etc. It is contemplated that the user can define any number of preference elements and tokens as user preference data.
  • the system decides what parameters or attributes to choose to represent user context and/or preferences.
  • FIG. 1 is a diagram of a system capable of resolving geo-identification of un-structured and/or colloquial location information, according to one embodiment. It is becoming increasingly popular for service providers and device manufacturers to bundle or make available navigation and mapping services on an array of user devices (e.g., mobile handsets, computers, navigation devices, etc.). Such devices may utilize location-based technologies (e.g., Global Positioning System (GPS) receivers, cellular triangulation, assisted-GPS (A-GPS), etc.) to provide navigation and mapping information.
  • One growing trend is the use of geo-coding and/or geo-identity resolution services on a given location text and/or contact information to extract structured location information, and then to use the structured location information to provide location-based services and/or to present the information to the user in the form of place names (e.g., points of interest (POIs)), street addresses, neighborhoods, city/town names, county names, state/province names, country names, etc.
  • Address interpolation matches the street address to a street and a specific segment, and then interpolates the position of the address within the number range along that segment to obtain the geographic coordinates of the street address.
  • a street segment running from 900 to 1000 is extracted. Number 955 would be somewhere in the middle of this block, with odd numbers on one side and even numbers on the other side.
  • approximate geo-coordinates are mapped for the address. This method works well only for addresses that follow a particular scheme (e.g., the numbering scheme). However, addresses in old cities, developing countries, etc. do not adhere to any particular scheme, which makes address interpolation difficult to deploy. Finding geo-coordinates based on address zip codes is not always helpful, since the user may not know the relevant zip code and a zip code area can be too large for the intended purpose.
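The interpolation step described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular geo-coding service: the `StreetSegment` type, its field names, and the endpoint coordinates are hypothetical, and the segment is assumed to be a straight line with house numbers spaced evenly along it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StreetSegment:
    """Hypothetical street block: a house-number range plus endpoint coordinates."""
    low_number: int
    high_number: int
    start: Tuple[float, float]  # (lat, lon) at low_number
    end: Tuple[float, float]    # (lat, lon) at high_number


def interpolate_address(segment: StreetSegment, house_number: int) -> Tuple[float, float, str]:
    """Linearly interpolate a house number along the segment.

    Returns (lat, lon, side), where side is "odd" or "even".
    """
    if not segment.low_number <= house_number <= segment.high_number:
        raise ValueError("house number outside the segment's range")
    span = segment.high_number - segment.low_number
    fraction = 0.0 if span == 0 else (house_number - segment.low_number) / span
    lat = segment.start[0] + fraction * (segment.end[0] - segment.start[0])
    lon = segment.start[1] + fraction * (segment.end[1] - segment.start[1])
    side = "odd" if house_number % 2 else "even"
    return lat, lon, side


# Made-up coordinates for the 900-1000 block mentioned in the text.
segment = StreetSegment(900, 1000, (10.0, 20.0), (10.0, 20.01))
lat, lon, side = interpolate_address(segment, 955)
```

The sketch only works when a numbering scheme exists, which is exactly the limitation the text points out for addresses that follow no such scheme.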
  • Another geo-coding approach uses text-based searches through a corpus of point-of-interest (POI) data and returns the geo-coordinates of the POI that has the maximum string matches.
  • a document is considered for each POI entry. This document can be indexed based on certain keywords.
  • The user-entered address is then matched (based on string proximity) with each of these documents.
  • a relevance score is assigned to each document based on the number of string matches. The document with the highest relevance score is then returned along with its geo-coordinates.
  • this pure string-matching approach may fail due to the vagaries in addresses, as demonstrated in the following example.
  • the user inputs an address of a cafe shop named Barista in New Delhi, India: McDonalds, near Alankar theatre, Feroze Koch Marg, Lajpat Nagar II, and there are four candidates.
  • Addresses 1, 2, and 3 are close to one another, since 3 C's cinema road and Feroze Koch Marg refer to the same road.
  • Address 4 is far from the first three addresses, since it is in a different area (Lajpat Nagar III). Although Address 3 is a perfect match for the search query, the text-based search engine finds Address 4 as the best match, since it has string matches with the highest relevance score compared to the other addresses.
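The failure mode of pure string matching can be reproduced with a toy scorer. The sketch below is an assumption-laden illustration: the scoring rule (count of shared tokens) is only one simple reading of "number of string matches", and the two candidate addresses are invented for this example; they are not the four candidates referred to in the text.

```python
import re
from typing import Dict, Set


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def relevance(query: str, document: str) -> int:
    """Count the distinct query tokens that also appear in the document."""
    return len(tokenize(query) & tokenize(document))


def best_match(query: str, candidates: Dict[str, str]) -> str:
    """Return the key of the candidate with the highest token-overlap score."""
    return max(candidates, key=lambda name: relevance(query, candidates[name]))


query = "near Alankar theatre, Feroze Koch Marg, Lajpat Nagar II"

# Hypothetical candidates: "nearby" uses the colloquial road name,
# "far" sits in a different area but repeats more of the query words.
candidates = {
    "nearby": "Barista, 3 C's cinema road, Lajpat Nagar II",
    "far": "Barista, Feroze Koch Marg extension, near theatre, Lajpat Nagar III",
}

winner = best_match(query, candidates)
```

Here the nearby candidate shares only three tokens with the query because it uses the road's colloquial name, while the distant candidate shares seven, so the scorer picks the wrong area.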
  • a system 100 of FIG. 1 introduces the capability to resolve geo-identification of un-structured and/or colloquial location information by crowdsourcing and document clustering. More specifically, the system 100 provides a two-fold approach for resolving geo-identities of unstructured addresses.
  • the system 100 uses a crowd sourcing approach for collecting local business, landmark and point of interest (POI) data.
  • the system 100 bootstraps geo-coordinate information of addresses of local businesses and points of interest. These addresses are aggregated into clusters depending on their geo-coordinates. Each POI data has geo-tags associated therewith.
  • the system 100 geo-codes based on these addresses.
  • the system 100 groups the data to form geo-spatial clusters.
  • the system 100 assigns a document for every cluster that aggregates information (e.g., street names, landmark references, area names, etc.) about all the POIs belonging to that cluster.
  • a document is associated with each cluster.
  • a search engine runs over these documents to find a good match.
  • a document with the best match is returned to the user.
  • This approach captures various colloquial references and reduces the reliance on street network data. The geo-coding task is thus reduced to a search task over the corpus of these documents.
  • the system 100 captures the variations in the same address.
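The clustering and document-aggregation steps above can be sketched as follows. This is a rough illustration under stated assumptions: a fixed latitude/longitude grid stands in for whatever clustering the system 100 actually uses, the POI records and coordinates are invented, and the search step is plain token overlap rather than a real search engine.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Cell = Tuple[int, int]
Poi = Tuple[float, float, str]  # (lat, lon, free-text address)


def tokens(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cell_of(lat: float, lon: float, cell_deg: float = 0.01) -> Cell:
    """Assign a coordinate to a grid cell (a stand-in for geo-spatial clustering)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def build_documents(pois: Iterable[Poi], cell_deg: float = 0.01) -> Dict[Cell, Counter]:
    """Aggregate the address terms of all POIs in a cell into one document per cell."""
    documents: Dict[Cell, Counter] = defaultdict(Counter)
    for lat, lon, address in pois:
        documents[cell_of(lat, lon, cell_deg)].update(tokens(address))
    return dict(documents)


def resolve(query: str, documents: Dict[Cell, Counter]) -> Cell:
    """Return the cell whose document shares the most distinct terms with the query."""
    terms = set(tokens(query))
    return max(documents, key=lambda cell: sum(1 for t in terms if t in documents[cell]))


# Invented crowd-sourced POI records; two fall in the same cell.
pois = [
    (28.5675, 77.2435, "Cafe A, near Alankar theatre, Lajpat Nagar II"),
    (28.5678, 77.2438, "Shop B, 3 C's cinema road, Feroze Koch Marg"),
    (28.5835, 77.2605, "Cafe C, Feroze Koch Marg, Lajpat Nagar III"),
]
documents = build_documents(pois)
area = resolve("near Alankar theatre, 3 C's cinema road", documents)
```

Because the cell's document pools the wording of every POI in that area, a query that mixes a landmark reference from one address with a colloquial road name from another still lands on the right cell.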
  • the system 100 comprises a user equipment (UE) 101 (or UEs 101 or UEs 101a-101n) having connectivity to a map platform 103 via a communication network 105.
  • the location information may be utilized by applications 107 of the UE 101 (e.g., location-based applications).
  • the applications 107 may also include or have access to a geo-coding manager 109 to resolve geo-identification of un-structured and/or colloquial location information.
  • the geo-coding manager 109 may be included with the UE 101 as shown, or the geo-coding manager 109 may be provided and handled by the map platform 103.
  • mapping information, such as location information, may be included in a map database 111 associated with the map platform 103 for access by the applications 107. As discussed, mapping information may be retrieved from the map database 111 to be utilized by the applications 107 of the UE 101.
  • mapping information may be associated with content information including live media (e.g., streaming broadcasts), stored media (e.g., stored on a network or locally), metadata associated with media, text information, location information of other user devices, or a combination thereof.
  • the content may be provided by the service platform 113, which includes one or more services 115a-115m (e.g., music service, mapping service, video service, social networking service, content broadcasting service, etc.), one or more content providers 117a-117k (e.g., online content retailers, public databases, etc.), and other content sources available or accessible over the communication network 105.
  • the applications 107 may present location-related content information (e.g., content with regard to images, videos, articles, people, places, etc., associated with a location) on a display of the UE 101 in addition or as an alternate to geo-coded information and/or other mapping information.
  • a user of UE 101 may own, use, or otherwise have access to various pieces of information distributed in information stores 119a-119l in the cloud.
  • the UE 101 may utilize location-based technologies (GPS receivers, cellular triangulation, A-GPS, etc.) to provide mapping information.
  • the UE 101 may include a GPS receiver to obtain geographic coordinates from satellites 121 to determine the current location associated with the UE 101.
  • a user lands at the local airport near his home from a long-distance business trip.
  • a particular application 107 will check via the geo-coding manager 109 whether the location information (e.g., geo-coded information) for the current location is available in the memory of the UE 101.
  • the geo-coding manager 109 may have predicted that the user would return, for instance, based on calendar entries on the user's UE 101. Thus, in this case, accurate location information of the user's current location will most likely be pre-fetched and stored in the memory of the UE 101.
  • the geo-coding manager 109 may attempt to obtain location at the best accuracy possible. For example, while inside the area, the UE 101 may not have GPS satellite reception, and therefore may determine location information based on, for instance, imprecise landmark references such as "near Koramangala bus stop," "opposite Forum mall," etc.
  • the address query may lack proper structure and may not follow any particular scheme.
  • the address query may contain colloquial references and abbreviations different from their official names. For example, MG Road is a very well known abbreviation for Mahatma Gandhi road in Bangalore.
  • the address query may contain references to area names. For instance, most cities in India are divided (unofficially) into different areas. These areas may span a few kilometers, but they may not have any official geo-boundaries.
  • the area names are either colloquial or based on some history associated with the area. They are however used as good reference points for addresses. Address schemes and notions may differ from one area/city/county to another. For tier 2 and 3 cities it may be hard to find detailed road and street network data. Even with a detailed street network, it may be hard to geo-code addresses.
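One simple way to cope with the abbreviations and nicknames described above is to expand known aliases before matching. The sketch below is only an illustration: the source does not describe an alias table, so `ALIASES` and `expand_aliases` are hypothetical names, and the two entries are taken from examples mentioned in the text.

```python
import re
from typing import Dict

# Hypothetical alias table; a real one would be far larger and locale-specific.
ALIASES: Dict[str, str] = {
    "mg road": "mahatma gandhi road",
    "the big apple": "new york city",
}


def expand_aliases(query: str, aliases: Dict[str, str] = ALIASES) -> str:
    """Lower-case the query, collapse whitespace, and replace whole-word aliases."""
    text = " ".join(query.lower().split())
    for alias, official in aliases.items():
        text = re.sub(r"\b" + re.escape(alias) + r"\b", official, text)
    return text


expanded = expand_aliases("near MG  Road bus stop")
```

A fixed table cannot anticipate every local nickname, which is one reason the aggregated geo-spatial documents, built from how people actually write addresses, are a better fit for colloquial references.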
  • the geo-coding manager 109 processes one or more geo-spatial documents to resolve the query over a geographical area.
  • the geo-coding manager 109 aggregates one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof, into the one or more geo-spatial documents based, at least in part, on the location information.
  • the one or more terms were determined from a collection of point-of-interest information from one or more sources.
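The text speaks of query terms being probabilistically matched against the terms of the geo-spatial documents but does not name a model in this section. As one possible reading, the sketch below ranks areas by a unigram query-likelihood score with add-one smoothing; the model choice, the function names, and the two toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def log_likelihood(query_terms: List[str], doc: Counter, vocab_size: int) -> float:
    """Log-probability of the query under the document's smoothed unigram model."""
    total = sum(doc.values())
    return sum(math.log((doc[t] + 1) / (total + vocab_size)) for t in query_terms)


def rank_areas(query_terms: List[str], documents: Dict[str, Counter]) -> List[str]:
    """Return area names ordered from most to least likely to have produced the query."""
    vocab = set(query_terms)
    for doc in documents.values():
        vocab.update(doc)
    scores = {
        area: log_likelihood(query_terms, doc, len(vocab))
        for area, doc in documents.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy term counts for two hypothetical geo-spatial documents.
documents = {
    "area_1": Counter({"alankar": 2, "theatre": 2, "cinema": 1, "road": 1, "lajpat": 1, "nagar": 1}),
    "area_2": Counter({"lajpat": 2, "nagar": 2, "market": 2, "road": 2}),
}
ranking = rank_areas(["alankar", "theatre", "road"], documents)
```

Smoothing keeps a single unseen term from zeroing out an otherwise good area, which matters when queries contain misspellings or references no contributor has used yet.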
  • the communication network 105 of system 100 includes one or more networks such as a data network (not shown), a wireless network (not shown), a telephony network (not shown), or any combination thereof.
  • the data network may be any local area network (LAN), metropolitan area network (MAN), wide area network (WAN), a public data network (e.g., the Internet), short range wireless network, or any other suitable packet-switched network, such as a commercially owned, proprietary packet-switched network, e.g., a proprietary cable or fiber-optic network, and the like, or any combination thereof.
  • the wireless network may be, for example, a cellular network and may employ various technologies including enhanced data rates for global evolution (EDGE), general packet radio service (GPRS), global system for mobile communications (GSM), Internet protocol multimedia subsystem (IMS), universal mobile telecommunications system (UMTS), etc., as well as any other suitable wireless medium, e.g., worldwide interoperability for microwave access (WiMAX), Long Term Evolution (LTE) networks, code division multiple access (CDMA), wideband code division multiple access (WCDMA), wireless fidelity (WiFi), wireless LAN (WLAN), Bluetooth®, Internet Protocol (IP) data casting, satellite, mobile ad-hoc network (MANET), and the like, or any combination thereof.
  • the UE 101 is any type of mobile terminal, fixed terminal, or portable terminal including a mobile handset, station, unit, device, multimedia computer, multimedia tablet, Internet node, communicator, desktop computer, laptop computer, notebook computer, netbook computer, tablet computer, personal communication system (PCS) device, personal navigation device, personal digital assistants (PDAs), audio/video player, digital camera/camcorder, positioning device, television receiver, radio broadcast receiver, electronic book device, game device, or any combination thereof, including the accessories and peripherals of these devices, or any combination thereof. It is also contemplated that the UE 101 can support any type of interface to the user (such as "wearable" circuitry, etc.).
  • a protocol includes a set of rules defining how the network nodes within the communication network 105 interact with each other based on information sent over the communication links.
  • the protocols are effective at different layers of operation within each node, from generating and receiving physical signals of various types, to selecting a link for transferring those signals, to the format of information indicated by those signals, to identifying which software application executing on a computer system sends or receives the information.
  • the conceptually different layers of protocols for exchanging information over a network are described in the Open Systems Interconnection (OSI) Reference Model.
  • Each packet typically comprises (1) header information associated with a particular protocol, and (2) payload information that follows the header information and contains information that may be processed independently of that particular protocol.
  • the packet includes (3) trailer information following the payload and indicating the end of the payload information.
  • the header includes information such as the source of the packet, its destination, the length of the payload, and other properties used by the protocol.
  • the data in the payload for the particular protocol includes a header and payload for a different protocol associated with a different, higher layer of the OSI Reference Model.
  • the header for a particular protocol typically indicates a type for the next protocol contained in its payload.
  • the higher layer protocol is said to be encapsulated in the lower layer protocol.
  • the headers included in a packet traversing multiple heterogeneous networks, such as the Internet, typically include a physical (layer 1) header, a data-link (layer 2) header, an internetwork (layer 3) header, a transport (layer 4) header, and various application headers (layer 5, layer 6, and layer 7) as defined by the OSI Reference Model.
  • the functions of the geo-coding manager 109 are distributed among the UE 101, the map platform 103, and/or the service platform 113 according to a client-server model.
  • a client process sends a message including a request to a server process, and the server process responds by providing a service (e.g., geo-coding messaging, advertisements, etc.).
  • the server process may also return a message with a response to the client process.
  • client process and server process execute on different computer devices, called hosts, and communicate via a network using one or more protocols for network communications.
  • server is conventionally used to refer to the process that provides the service, or the host computer on which the process operates.
  • client is conventionally used to refer to the process that makes the request, or the host computer on which the process operates.
  • client and server refer to the processes, rather than the host computers, unless otherwise clear from the context.
  • process performed by a server can be broken up to run as multiple processes on multiple hosts (sometimes called tiers) for reasons that include reliability, scalability, and redundancy, among others.
  • a user device client sends POI data and runs various location based applications and a server supports geo-coding queries and location based services.
  • the user device client prompts a form for the users to enter POI data.
  • This data is then sent to the server along with the client's location information, such as GPS, CBS, and Cell Id.
  • the underlying communication channel can be SMS, MMS, or GPRS depending on the availability.
  • the user device client can also run LBS applications, such as local search, navigation, locate-me etc.
  • the user device client has a locate-me application, and the user enters the approximate street address and sends a request to the server.
  • the server would then send the GPS coordinates to be overlaid on the client's map interface.
  • the server is connected to a database, and responsible for gathering and processing POI data.
  • the server also supports the geo-coding service and various location-based services, such as local search, navigation etc.
  • the server runs a search engine implementation. Appropriate indexes can be constructed in the search engine to speed up the search process.
  • user queries arrive (queries containing address texts)
  • the server processes these queries, searches through the various documents and returns the appropriate result to the UE 101. For example, if the server supports a "local search" service then after performing the geo-coding task the server can look for POIs around the geo-coded data.
  • the results are returned to the UE 101 via SMS or GPRS interface.
  • FIG. 2 is a diagram of the components of a geo-coding manager, according to one embodiment.
  • the geo-coding manager 109 includes one or more components for resolving geo-identification of un-structured and/or colloquial location information. It is contemplated that the functions of these components may be combined in one or more components or performed by other components of equivalent functionality.
  • the geo-coding manager 109 includes control logic 201, a memory 203, a collection module 205, a verification module 207, a correction module 209, a mapping module 211, and a clustering module 213.
  • the control logic 201 oversees tasks, including tasks performed by the collection module 205, the verification module 207, the correction module 209, the mapping module 211, and the clustering module 213. For example, although the other modules may perform the actual task, the control logic 201 may determine when and how those tasks are performed or otherwise direct the other modules to perform the task.
  • the collection module 205 collects POI (including landmark, etc.) data for approximate geo-coding. It is easy to collect POI data from commercial sources, but it is challenging to aggregate information on local listings and POIs, since that information is known only to local people. In India, for example, there are many popular local mom-and-pop stores, such as tea shops, grocery stores, gas stations, etc., that may not appear in any local listings or yellow pages entries yet can serve as useful points of interest.
  • a crowd sourcing technique can be used for collecting POI data including local business and landmark information, and grouping them according to area names.
  • the collection module 205 interacts with a user device client, such as small business owners, individuals, etc. to enter their local business information along with their address and contact information. Although local addresses may not have any inherent structure associated therewith, the collection module 205 enforces a predetermined structural scheme and standard to collect data.
  • the user device client has configured fields, such as building name, street name, landmark, area name, etc. These structural labels of text snippets in collected addresses prompt the user to include street name, landmark, and area name information. So the address entered has information associated with the structural scheme.
  • By automatically labeling different strings into fields (such as street names, area names, etc.), parsing addresses becomes easier on the server side. String comparison between any two addresses also becomes easier, because the comparison can be made field by field. This feature is useful for identifying typos and spelling mistakes in user entries.
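  • The field-by-field comparison described above can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the field names, the use of Python's `difflib.SequenceMatcher` as the string matcher, and the 0.8 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical field names modeled on the structured form described in the text.
FIELDS = ("building_name", "street_name", "landmark", "area_name")


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two field values, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def compare_addresses(addr_a: dict, addr_b: dict) -> dict:
    """Compare two structured addresses one field at a time."""
    return {f: field_similarity(addr_a.get(f, ""), addr_b.get(f, "")) for f in FIELDS}


def likely_typos(addr_a: dict, addr_b: dict, threshold: float = 0.8) -> list:
    """Fields whose values are close but not identical, i.e. typo candidates."""
    scores = compare_addresses(addr_a, addr_b)
    return [f for f, s in scores.items() if threshold <= s < 1.0]


a = {"street_name": "Marathahalli Main Road", "area_name": "Marathahalli"}
b = {"street_name": "Marathahali Main Road", "area_name": "Marathahalli"}
print(likely_typos(a, b))
```

Because each field is compared only against the same field of the other address, a misspelled street name is flagged without being confused with a similar-looking area name.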
  • the user device client also captures "location context," such as GPS coordinates, cell tower ID (Cell ID), and the cell broadcast (CBS) message received by the UE 101.
  • CBS messages are part of the GSM specification (e.g., GSM 03.41, 3GPP TS 23.041, etc.) where the cellular provider broadcasts certain types of messages in a given area. This channel is mainly used for broadcasting emergency messages. Other types of broadcast messages include traffic, advertisement, area code, area name, etc. CBS messages are periodically sent by a base station to the UE 101. By way of example, the broadcasting period ranges from 1.83 seconds to 60 seconds.
  • the collection module 205 collects the approximate location information (e.g., area name) of the UE 101. The captured information is then transmitted to a central server either via SMS or GPRS. Table 2 shows a sample data snippet collected by the UE 101.
  • the collection module 205 may provide incentives to users for sending their information. For example, business owners can be rewarded with advertisement benefits for sending POI information to the system 100. An end user can also collect points or earn monetary awards for submitting POI data.
  • the collection module 205 captures POI and landmark data that has no official listing. The collected data captures various colloquial references and jargon, and provides insights into how people perceive addresses. With the huge penetration of mobile devices, this collection module 205 can be scaled to collect data of different cities.
  • the verification module 207 verifies the authenticity and correctness of the data collected by the collection module 205.
  • the verification module 207 verifies the data on at least two levels: the correctness of the data and the location associated with the data. In another embodiment, the verification module 207 corrects spelling mistakes, typos, and abbreviations associated therewith to improve data quality.
  • the verification module 207 verifies if the address entered by the user is indeed a valid address via one or more of following verification schemes, depending upon a cost-benefit analysis. For each POI data, the verification module 207 makes an actual call to the business owner to manually verify the submitted information. In another embodiment, the verification module 207 sends SMS messages to the business owners, asking them to verify the submitted information.
  • the verification module 207 deploys one or more incentive mechanisms to award higher points and credibility to the users who frequently submit correct data. In yet another embodiment, the verification module 207 assigns a high correctness probability for the data coming from such users.
  • the verification module 207 also verifies the UE 101 is actually present at the submitted address via an approximate location verification method.
  • the address entered via the UE 101 is "Apana Bazzar, Marathahalli Main Road, Marathahalli". This address has the area name "Marathahalli".
  • the verification module 207 compares this area name with the area name captured by the CBS. If the two match, the verification module 207 determines the address entered at the UE 101 is the actual physical location of the UE 101.
  • the verification module 207 can construct a connectivity graph of the area names captured by CBS messages. In one embodiment, for an area name "x", the verification module 207 locates other areas that are physically near "x" and constructs a neighborhood graph. The verification module 207 consults this graph to see if the UE 101 is in the neighborhood of the address entered.
  • the verification module 207 estimates that the UE 101 is within the neighborhood of the entered address, and verifies the location associated with the data.
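  • The approximate location verification described above can be sketched as a two-step check: an exact match between the entered area name and the CBS area name, then a lookup in the neighborhood graph. The function name, the return labels, and the neighbor entries ("Area B", "Area C") are hypothetical placeholders, not data from the disclosure.

```python
# Hypothetical neighborhood graph: area name -> set of physically adjacent areas.
# "area b" and "area c" are placeholders, not real neighbors of Marathahalli.
NEIGHBORS = {
    "marathahalli": {"area b", "area c"},
    "area b": {"marathahalli"},
}


def verify_location(entered_area: str, cbs_area: str, neighbors: dict) -> str:
    """Classify how well the entered area name agrees with the CBS area name."""
    entered = entered_area.strip().lower()
    observed = cbs_area.strip().lower()
    if entered == observed:
        return "exact"
    # Treat the graph as undirected: accept an edge recorded in either direction.
    if observed in neighbors.get(entered, set()) or entered in neighbors.get(observed, set()):
        return "neighborhood"
    return "unverified"


print(verify_location("Marathahalli", "Area C", NEIGHBORS))
```

A result of "neighborhood" corresponds to the case where the UE 101 is estimated to be near, but not exactly at, the entered address.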
  • the correction module 209 checks and corrects spell/typing errors. In one embodiment, the correction module 209 enables the user device client with an auto-complete feature to minimize typos and spelling mistakes.
  • the correction module 209 leverages the crowd-sourced data and string matching algorithms on the server side to identify typos and understand variations in spellings.
  • "Vijay Nagar" in Bangalore is also spelled as "Vijayam Nagar" based on the local accents.
  • the correction module 209 associates the two spellings and reasons that "Vijay Nagar" and "Vijayam Nagar" refer to the same area name. Further, the correction module 209 can also identify the popularity of the names based on the frequency of their occurrence.
  • the correction module 209 matches entered strings and their geo-coordinates to detect typing errors, such as typing "Vijay Nagar" as "Vijy Nagar".
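  • One way to picture the association of spelling variants is to fold each crowd-sourced spelling into the most frequent similar spelling. The sketch below is an assumption-laden illustration: it uses `difflib.SequenceMatcher` and a 0.85 similarity threshold, neither of which comes from the disclosure, and it ignores the geo-coordinate check that the correction module 209 also applies.

```python
from collections import Counter
from difflib import SequenceMatcher


def canonical_names(entries, threshold: float = 0.85) -> dict:
    """Map every observed spelling to the most frequent similar spelling."""
    counts = Counter(e.strip() for e in entries)
    targets = []      # canonical spellings, most frequent first
    canonical = {}
    # Visit spellings from most to least frequent so popular ones become canonical.
    for name, _ in counts.most_common():
        for target in targets:
            if SequenceMatcher(None, name.lower(), target.lower()).ratio() >= threshold:
                canonical[name] = target
                break
        else:
            targets.append(name)
            canonical[name] = name
    return canonical


entries = ["Vijay Nagar"] * 5 + ["Vijayam Nagar"] * 2 + ["Vijy Nagar"] + ["Marathahalli"] * 3
print(canonical_names(entries))
```

A purely lexical threshold can merge two genuinely different names that happen to look alike, which is one reason to also compare the geo-coordinates of the entries before treating them as the same area.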
  • the mapping module 211 geo-annotates data via various mapping functions. In one embodiment, the mapping module 211 maps address data to geo-coordinates. This mapping is achieved by capturing GPS data along with the addresses. In another embodiment, the mapping module 211 maps Cell ID to geo-coordinates. After capturing the Cell ID of the UE 101 and its geo-coordinates, the mapping module 211 associates Cell IDs with these geo-coordinates, which are a valuable data source for various LBS.
  • the mapping module 211 maps area names to geo-coordinates. Since both addresses have area names and the CBS messages have area names, the mapping module 211 extracts the geo-coordinates and area names and forms this mapping. This mapping is useful for local search services.
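  • As a rough illustration of the area-name mapping, the sketch below averages the GPS coordinates of all submissions that share an area name. Representing an area by the centroid of its observations is an assumption of this example; the disclosure does not state how the mapping is summarized.

```python
from collections import defaultdict


def area_centroids(observations) -> dict:
    """Map each area name to the mean (latitude, longitude) of its observations.

    observations: iterable of (area_name, latitude, longitude) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # [sum of lat, sum of lon, count]
    for area, lat, lon in observations:
        entry = sums[area]
        entry[0] += lat
        entry[1] += lon
        entry[2] += 1
    return {area: (s[0] / s[2], s[1] / s[2]) for area, s in sums.items()}


# Coordinates below are made-up sample values for illustration only.
samples = [("Area A", 12.0, 77.0), ("Area A", 12.2, 77.2), ("Area B", 13.0, 78.0)]
print(area_centroids(samples))
```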
  • the clustering module 213 aggregates and clusters the geo-tags (GPS coordinates) associated with every POI data and its related address text. An example is provided with respect to a city in conjunction with FIG. 3. Similar concepts can be expanded for other location areas, such as state, country, etc.
  • FIG. 3A is a flowchart of a process for resolving geo-identification of un-structured and/or colloquial location information, according to one embodiment.
  • the geo-coding manager 109 performs the process 300 and is implemented in, for instance, a chip set including a processor and a memory as shown in FIG. 8.
  • the control logic 201 can provide means for accomplishing various parts of the process 300 as well as means for accomplishing other processes in conjunction with other components of the geo-coding manager 109.
  • the functions of the geo-coding manager 109 may be distributed among the UE 101, the map platform 103, and/or the service platform 113 according to a client-server model.
  • the geo-coding manager 109 causes, at least in part, a collection of point-of-interest information from one or more sources.
  • the sources may include observations, people, speeches, documents, pictures, organizations, entities, libraries, databases, websites, custom POI sources (e.g., archaeological sites, traffic/safety cameras, etc.), commercial POI sources (e.g., travel, sports, etc.), etc.
  • the collection is based, at least in part, on one or more crowd-sourcing mechanisms.
  • the geo-coding manager 109 interacts with an online document and extracts POI information via artificial intelligence or natural language processing (NLP) or by interpreting metadata of the document.
  • the geo-coding manager 109 interacts with UE 101 of a small business owner, individual, etc. to receive local point of interest information (e.g., cinema theatres, bar, restaurant, hotel, etc.) including address, contact information, etc.
  • the geo-coding manager 109 receives the point-of-interest information as structured data including one or more fields for inputting, processing, storing, or a combination thereof.
  • the fields may include POI name, nick name, building name, building number, street name, street number, landmark, area name, city, state, zip code, phone number, etc.
  • such fields presented on a user interface force the user to enter structured information, rather than just "Tom's gardening tool store next to the Best Supermarket."
  • the one or more fields relate, at least in part, to one or more names of the points of interest (e.g., Tom's Green Thumb), one or more addresses of the points of interest (e.g., a street name, etc.), one or more landmarks associated with the points of interest (e.g., Best Supermarket), or a combination thereof.
  • the geo-coding manager 109 processes and/or facilitates a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest (e.g., a T-shirt stand 50 feet away from a theme park entrance), the one or more sources (e.g., a commercial database, a restaurant owner Bob, etc.), or a combination thereof.
  • the terms may include text, words, name, user entries, etc.
  • a user entry may be "a bicycle rental shop at the west side of the public beach shower room", "next to the post office", "Ugly Pencil" (nickname for the Washington Monument), etc.
  • the geo-coding manager 109 causes, at least in part, a verification of the terms, the point-of-interest information, or a combination thereof.
  • the verification may use methods such as location context, user confirmation, synonyms, auto-completion, etc.
  • the geo-coding manager 109 causes, at least in part, a verification of the terms, the point-of-interest information, or a combination thereof based, at least in part, on one or more confirmation replies associated with the one or more sources, the one or more devices, or a combination thereof.
  • the geo-coding manager 109 determines frequency information of the one or more terms in the point-of-interest information, among the one or more sources, or a combination thereof.
  • the geo-coding manager 109 processes and/or facilitates a processing of the frequency information to determine one or more synonyms among the one or more terms, to resolve inconsistencies among the one or more terms, to correct errors in the one or more terms, or a combination thereof.
  • the geo-coding manager 109 determines one or more location contexts of one or more terminals at least substantially concurrently with the collection of the point-of-interest information, and causes, at least in part, an association of the one or more location contexts with the point-of-interest information.
  • the one or more location contexts are determined based, at least in part, on one or more communications cell tower identifiers (e.g., Cell ID), one or more communications cell broadcast (CBS) message identifiers, one or more location sensor (e.g., GPS) coordinates, or a combination thereof.
  • the geo-coding manager 109 then causes, at least in part, a verification of the terms, the point-of-interest information, or a combination thereof based, at least in part, on comparison against the one or more location contexts.
  • the geo-coding manager 109 processes and/or facilitates a processing of the one or more terms to determine address information, one or more area names (e.g., old town Alexandria, etc.) of the one or more geographical areas (e.g., Alexandria in Virginia), or a combination thereof. Area names are comprehensible (e.g., "Pennsylvania Avenue," "Georgetown," "Vijay Nagar," etc.) to the general population.
  • the geo-coding manager 109 processes and/or facilitates a processing of the one or more location contexts to determine one or more geographical coordinates.
  • the geo-coding manager 109 causes, at least in part, a mapping among the address information, the one or more area names, the one or more geographical coordinates, the one or more location contexts, or a combination thereof.
  • the geo-coding manager 109 determines reliability information of the one or more sources based, at least in part, on the verification.
  • the geo-coding manager 109 determines respective weights of the one or more terms, the point-of-interest information, or a combination thereof based, at least in part, on the reliability information.
  • the one or more geo-spatial documents, the resolution of the one or more queries, or a combination thereof are based, at least in part, on the respective weights.
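  • The link between verification, source reliability, and term weights can be illustrated with a small sketch. The smoothed-fraction formula, the default weight of 0.5 for unknown sources, and all names below are assumptions chosen for the example; the disclosure does not specify how reliability or weights are computed.

```python
from collections import defaultdict


def source_reliability(verified: int, total: int, prior: float = 0.5, strength: int = 2) -> float:
    """Smoothed fraction of a source's submissions that passed verification.

    The prior keeps a new source (total == 0) at a neutral score instead of 0 or 1.
    """
    return (verified + prior * strength) / (total + strength)


def weighted_term_counts(submissions, reliability: dict) -> dict:
    """Sum reliability-based weights per term; submissions are (source_id, term) pairs."""
    weights = defaultdict(float)
    for source_id, term in submissions:
        weights[term] += reliability.get(source_id, 0.5)  # unknown source: neutral weight
    return dict(weights)


reliability = {"owner_a": source_reliability(8, 8), "user_b": source_reliability(0, 2)}
submissions = [("owner_a", "Alankar theatre"), ("user_b", "Alankar theatre"), ("user_b", "Alankar cinema")]
print(weighted_term_counts(submissions, reliability))
```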
  • the subset of addresses in the corpus that best determine the geo-spatial expanse of Q includes those that have the same lexical structure as that of Q.
  • the process of geo-identity resolution is implicitly that of assessing the similarity between the query address document and the geo-spatially annotated address documents in the corpus.
  • the geo-coding manager 109 causes, at least in part, an aggregation of the one or more terms into one or more geo-spatial documents based, at least in part, on the location information.
  • the one or more geo-spatial documents are associated with one or more geographical areas.
  • the geographic areas can be defined multi-dimensionally, via the parameters of latitude, longitude, elevation, time, etc.
  • a document Xj is associated with one POI address: Feroze Shopping centre, Alankar Marg, Near McDonalds, Lajpat Nagar III, New Delhi, while a document Xi is associated with three POI addresses including Superfast Food, 3 C's Cinema Road, Lajpat Nagar II, New Delhi.
  • each set of POI information is associated with POI address information and a time stamp.
  • For example, the North Korean President made a secret trip to China. In this case, his travel schedule as reported by different news media was not structured in terms of either locations or times.
  • the geo-coding manager 109 collects POI information via various sources with the above-discussed approach to generate documents.
  • a document Tj is associated with one or more Chinese military facilities and time windows, while a document Ti is associated with the Chinese President's residence for dinner.
  • the geographical area can be set at any granularity, shape, and forms, depending on the availability of POI data, the location context, user preferences, etc.
  • the area is defined by existing geographic boundaries (e.g., a solar system, planet, continent, country, province, city, town, community, street, floor/unit/room within a building, etc.).
  • the one or more geographical areas, one or more portions of the one or more geographical areas, or a combination thereof are specified as one or more arbitrary geographic boundaries, such as cells.
  • the Sahara Desert (3,630,000 sq mi) covers most of Northern Africa, and is almost as large as Europe or the United States.
  • POIs e.g., oases
  • an area can be set as 100 mi *
  • Macau is the most crowded city in the world (over 19,400 people per sq km) and has many tiny shops in alleys.
  • An area can be set as 5 m * 5 m grid cells.
  • the city contains approximately 12,000 POIs spreading non-uniformly across 140 area-names.
  • the areas can vary significantly in size, from 1 sq. km to 10 sq. km.
  • street names i.e. Street name-area name combinations from the corpus
  • the sizes of the areas reduce to 1.8 and 3.6 sq. km respectively, which is still relatively coarse grained for geo-identity resolution.
  • An example of a cell is described in conjunction with FIG. 4. Examples of documents are shown in FIG. 5.
  • the geo-coding manager 109 processes and/or facilitates a processing of one or more queries to determine one or more query terms.
  • the query can be "I'm standing on a street with two movie theaters and one Italian restaurant in New Delhi. Where is the closest bus station?"
  • the geo-coding manager 109 retrieves GPS data of the UE
  • when the GPS signal is not available, the geo-coding manager 109 extracts terms from the query, such as "New Delhi," "two movie theaters," and "one Italian restaurant." If the user can read and type the local language, the geo-coding manager 109 can extract the exact theater names and/or restaurant names to accelerate processing. In step 313, the geo-coding manager 109 causes, at least in part, a probabilistic matching of the one or more query terms against the one or more geo-spatial documents to resolve the query over the geographical areas.
  • the geo-coding manager 109 may apply algorithms to decide which street is the most likely one based on the entry, and optionally, other context and/or preference information of the user.
  • the context associated with a person may be a birthday, health, moods, clothes, preferences, etc. of the person.
  • the context associated with an event may be a time, location, equipment, materials, etc. of the event.
  • the context associated with a point of interest may be weather conditions, traffic, environment, atmosphere, etc. at the point of interest.
  • the geo-coding manager 109 analyzes the user's calendar or travel plan to rank the seven streets and/or select one among the seven streets for the user.
  • the geo-coding manager 109 may present a ranking list, and/or render the streets on a map differently for the user.
  • the presentations may comprise one or more messages, items (e.g., data files, applications, games, point of interest information), media objects (e.g., graphics, images, videos, sounds, songs), or a combination thereof. It is contemplated that the presentation may include any other form of information or communication to convey location information to a user. Examples of the user interfaces are shown in FIG. 6.
  • the geo-coding manager 109 may receive a request specifying a location-based service for the UE 101, and then causes, at least in part, rendering of the location-based service to the UE 101 based, at least in part, on the most probable street, a relevant cell, or a combination thereof.
  • the geo-coding manager 109 renders the route from the current location on the most probable street to the closest bus station.
  • the geo-coding manager 109 updates the one or more geo-spatial documents with additional associated location information per cell.
  • the geo-coding manager 109 updates the correlation between the user's current location and the bus station in one or more of the documents.
  • the geo-coding manager 109 causes, at least in part, transmission of the one or more geo-spatial documents, the updated one or more geo-spatial documents, or a combination thereof, to an information store, an information space, a cloud, or a combination thereof.
  • the documents become available for the user for the next visit.
  • the information is also made available to the public after the user's information is removed.
  • the information may be made available to service providers, network operators, software developers, and advertisers, if the user agrees.
  • FIG. 3B is a flowchart of a process for resolving geo-identification of un-structured and/or colloquial location information, according to one embodiment.
  • the geo-coding manager 109 residing in the UE 101 performs the process 320 and is implemented in, for instance, a chip set including a processor and a memory as shown in FIG. 8.
  • the control logic 201 can provide means for accomplishing various parts of the process 320 as well as means for accomplishing other processes in conjunction with other components of the geo-coding manager 109.
  • the geo-coding manager 109 causes, at least in part, transmission of a query (e.g., a T-shirt stand 50 feet away from a theme park entrance) to a geo-coding service.
  • the geo-coding manager 109 receives a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names (e.g., "Pennsylvania Avenue," "Georgetown," "Vijay Nagar," "Williamsburg, Virginia," etc.) of the one or more geographical areas, or a combination thereof.
  • the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query.
  • the one or more terms describe location information of one or more points of interest (e.g., T-shirt stand, Super Water Park, cinema theatres, bar, restaurant, hotel, etc.) located within the one or more geographical areas (e.g., Williamsburg, Virginia).
  • the one or more geographical areas, one or more portions of the one or more geographical areas, or a combination thereof are specified as one or more cells.
  • the query contains one or more synonyms (e.g., Colonial Williamsburg) of the one or more terms.
  • the one or more terms are collected from one or more sources based, at least in part, on one or more crowd-sourcing mechanisms.
  • the one or more geographical areas are defined multi-dimensionally.
  • the geo-coding manager 109 causes, at least in part, representation at a user interface of the one or more geographical areas, one or more points of interest associated with the one or more geographical areas, address information associated with the one or more geographical areas, or a combination thereof. Examples of documents are shown in FIGs. 6A-6D.
  • the geo-coding manager 109 causes, at least in part, transmission of point-of-interest information to the geo-coding service, the point-of-interest information being structured with one or more fields for inputting, processing, storing, or a combination thereof the one or more terms.
  • the one or more fields relate, at least in part, to the one or more names of the points of interest (e.g., the Mall in Washington DC), one or more addresses of the points of interest, one or more landmarks (e.g., the White House, Capitol Hill, etc.) associated with the points of interest, or a combination thereof.
  • the geo-coding manager 109 receives credit (e.g., service credits, coupons, gifts, etc.) from the geo-coding service based, at least in part, on the transmission of point-of-interest information, reliability of the point-of-interest information, or a combination thereof.
  • the reliability of the point-of-interest information is verified, based, at least in part, on comparison against one or more location contexts (e.g., date, time, location, current activity, weather, a history of activities, etc.) associated with the point-of-interest information.
  • the one or more location contexts are determined based, at least in part, on one or more communications cell tower identifiers, one or more communications cell broadcast message identifiers, one or more location sensor coordinates, or a combination thereof.
  • the geo-coding manager 109 receives a confirmation request from the geo-coding service for verifying the point-of-interest information, the one or more terms, or a combination thereof. Thereafter, the geo-coding manager 109 causes, at least in part, transmission of a confirmation reply to the geo-coding service.
  • the geo-coding manager 109 causes, at least in part, spell-checking, auto-completing, or a combination thereof, when receiving the point-of-interest information at a user interface.
  • FIG. 4 is a location area diagram mapped with points of interest and utilized in the process of FIG. 3, according to one embodiment.
  • Address texts, when seen in isolation, can show significant lexical dissimilarity, even for POIs that are in geo-physical vicinity of each other. For example, two POIs located in close proximity may have different street names in them (one official and another colloquial).
  • the geo-coding manager 109 overlays a logical square grid over a city (e.g., New Delhi, India), dividing it into a plurality of grid cells. Each cell corresponds to one document that contains POI information collected via crowd sourcing. Every collected POI address has a GPS coordinate associated therewith.
  • each geo-spatial document X essentially is a text document. By construction, the geophysical expanse of X is bounded by the four edges of the unit cell representing it in the city grid.
  • the city in turn is a collection of discrete documents that contain all possible strings (names of buildings, streets and areas) that appear in the addresses belonging to that city.
  • the geo-coding manager 109 constructs a geo-spatial document Xi that aggregates information of all the POIs assigned to cell i. Every document has various attributes, such as street name, area name, landmark, city, etc.
  • the street and area name attributes are a union of the corresponding attributes of each POI therein, whereas the landmark attribute is a union of all POI names, building names, etc. that figure in the POIs. Since the geo-coding manager 109 enforces structured fields for users to enter POI data, parsing the addresses from different fields becomes easy.
  • each grid cell e.g., height and width parameters, such as 100 meter * 100 meter
  • the geo-coding manager 109 assigns each POI entry to one grid-cell in the city based on its latitude, longitude, etc.
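The grid construction described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the geo-coding manager 109: the function names `cell_index` and `build_documents`, the dictionary layout of a POI entry, the grid origin, and the flat metres-per-degree approximation are all assumptions introduced here.

```python
import math
from collections import defaultdict

# Rough length of one degree of latitude in metres (assumed approximation).
METERS_PER_DEG_LAT = 111_320.0

def cell_index(lat, lon, origin_lat, origin_lon, cell_m=100.0):
    """Map a coordinate to the (row, col) of a cell_m x cell_m grid cell."""
    meters_per_deg_lon = METERS_PER_DEG_LAT * math.cos(math.radians(origin_lat))
    row = math.floor((lat - origin_lat) * METERS_PER_DEG_LAT / cell_m)
    col = math.floor((lon - origin_lon) * meters_per_deg_lon / cell_m)
    return row, col

def build_documents(pois, origin_lat, origin_lon, cell_m=100.0):
    """Aggregate POI entries into one geo-spatial document per grid cell."""
    docs = defaultdict(
        lambda: {"street": set(), "area": set(), "landmark": set(), "pois": []}
    )
    for poi in pois:
        key = cell_index(poi["lat"], poi["lon"], origin_lat, origin_lon, cell_m)
        doc = docs[key]
        # Street and area attributes are unions over the POIs in the cell.
        for field in ("street", "area"):
            if poi.get(field):
                doc[field].add(poi[field])
        # The landmark attribute collects POI names and referenced landmarks.
        doc["landmark"].update([poi["name"], *poi.get("landmarks", [])])
        doc["pois"].append(poi)
    return dict(docs)
```

Because the POI data is entered through structured fields, the sketch reads the street and area values directly rather than parsing them out of free text.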
  • $\mathcal{X}_C = \{X : X \text{ is a geo-spatial document in city } C\}$.
  • Given a query address $Q$, find the geo-spatial document $X^* \in \mathcal{X}_C$ that best matches the query.
  • the geo-identity of the query Q is then resolved to within the grid cell corresponding to the geo-spatial document X*.
  • a search engine can be thought of as a function $F : (X, Q) \to \mathbb{R}^{+}$, where $X$ is a document and $Q$ is a query.
  • the scoring algorithm of a search engine provides a ranking of the documents in the corpus for a given query.
  • a subset $\mathcal{X}'' \subset \mathcal{X}_C$ of geo-spatial documents that attain competitive scores for a given query may be found.
  • FIGs. 5A-5B show examples of two geo-spatial documents utilized in the process of FIG. 3, according to various embodiments.
  • the geo-coding manager 109 further assigns a weight function to each landmark L, area name A, and street name S in Xi. This function can be a frequency count (such as number of POIs that reference L, A, or S).
  • the geo-coding manager 109 establishes the credibility of the landmarks, area names, and street names in each geo-spatial document and builds a ranking order based on these weights. For example, when a large number of POIs belonging to a particular grid cell refer to "Alankar theatre" as their landmark, the credibility of this landmark being present at that grid location is high and its weight is high as well.
  • the geo-coding manager 109 also uses this scheme to identify most commonly used colloquial names and abbreviations.
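The frequency-count weighting described above might look like the following Python sketch. The names `attribute_weights` and `rank_by_weight` and the POI dictionary keys are hypothetical, and a plain count is only one possible weight function.

```python
from collections import Counter

def attribute_weights(pois):
    """Count how many POIs in one cell reference each landmark, area, and street."""
    weights = {"landmark": Counter(), "area": Counter(), "street": Counter()}
    for poi in pois:
        # A set ensures each POI contributes at most one count per landmark.
        weights["landmark"].update(set(poi.get("landmarks", [])))
        for field in ("area", "street"):
            if poi.get(field):
                weights[field][poi[field]] += 1
    return weights

def rank_by_weight(counter):
    """Return names ordered from the most to the least frequently referenced."""
    return [name for name, _count in counter.most_common()]
```

Under this scheme a colloquial name or abbreviation that many contributors use rises in the ranking in the same way as an official name.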
  • the problem of geo-identity resolution of address texts is converted into a search problem over these geo-spatial documents.
  • the geo-coding manager 109 determines the geo-spatial document Xi that best matches the query.
  • the geo-coding manager 109 uses a scoring algorithm to provide ranking of the documents in the corpus for the query.
  • the parameters of the search can be adjusted such that it ranks string-matches based on string proximity, weight associated with every landmark, or a combination thereof.
  • Because addresses 1, 2, and 3 are close to each other and belong to the same grid cell (e.g., Lajpat Nagar II), they are all combined in Xi, while address 4 is in Lajpat Nagar III and belongs to Xj.
  • When receiving a query of "Superfast Food, near Alankar theatre, Feroze Koch Marg, Lajpat Nagar II", the geo-coding manager 109 matches strings in all documents.
  • the two documents Xi and Xj are used as examples.
  • the geo-coding manager 109 finds more matches in document Xi as compared to document Xj and hence returns the geo-coordinates associated with Xi as the response to the query.
  • the bold face words in FIG. 5A highlight four matches, "Feroze Gandhi Marg," "Alankar theatre," "Superfast Food," and "Lajpat Nagar II," for the input address "Superfast Food, near Alankar theatre, Feroze Khan Marg, Lajpat Nagar II," while the bold face words in FIG. 5B highlight three matches, "Alankar theatre," "Superfast Food," and "Lajpat Nagar," for the same input address.
  • the comparison is made via parsing address parameters, such as street name, area name, etc., for the user query, which consumes more resources while obtaining better accuracy.
  • the string matching is done without parsing the user query into structured address parameters, to spare the trouble of identifying semantics and field information from the user query.
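The unparsed string-matching search described above can be illustrated with a short Python sketch. A simple token-overlap count stands in for the scoring function of a search engine; it is an assumption chosen for brevity, not the ranking method of any particular engine, and the names `tokens`, `score`, `rank_documents`, and `best_match` are hypothetical.

```python
import re

def tokens(text):
    """Lower-case a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(document_text, query):
    """Stand-in for F(X, Q): the number of query tokens found in the document."""
    return len(tokens(document_text) & tokens(query))

def rank_documents(documents, query):
    """Return (doc_id, score) pairs sorted by descending score."""
    scored = [(doc_id, score(text, query)) for doc_id, text in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def best_match(documents, query):
    """Return the id of the best-matching document, or None if nothing matches."""
    ranking = rank_documents(documents, query)
    if ranking and ranking[0][1] > 0:
        return ranking[0][0]
    return None
```

Because the query is never split into street, area, and landmark fields, a partially wrong street name still contributes its matching tokens, which is what lets a colloquial or misspelled address resolve to the right cell.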
  • the geo-coordinates of the grid cell corresponding to the document Xi are then returned as response to the query.
  • the geo-coding manager 109 uses the geo-coordinates of the center of the grid cell as the response.
  • the geo-coding manager 109 averages the geo-coordinates of the landmarks (e.g., Barista, Italian Pizza, Superfast Food) in the grid cell and returns the average as the response.
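The two response options above, the cell center and the average of landmark coordinates, can be sketched as follows. The bounding-box tuple `(south, west, north, east)` and the function names are assumptions made for illustration; averaging latitude and longitude directly is a reasonable approximation only because a grid cell is small.

```python
def cell_center(bounds):
    """Return the (lat, lon) center of a cell given as (south, west, north, east)."""
    south, west, north, east = bounds
    return (south + north) / 2.0, (west + east) / 2.0

def landmark_centroid(coords):
    """Return the arithmetic mean of a non-empty list of (lat, lon) pairs."""
    lats, lons = zip(*coords)
    return sum(lats) / len(lats), sum(lons) / len(lons)
```

The centroid option biases the response toward where the known landmarks actually cluster, while the center option needs no per-landmark coordinates at all.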
  • FIGs. 6A-6D are diagrams of user interfaces utilized in the process of FIG. 3, according to various embodiments.
  • FIG. 6A shows user interfaces 601, 603, 605, 607, 609, 611 that may be utilized to collect POI information to update geo-spatial documents.
  • the user may have recently opened a hotel in Cubon Park and may want to advertise this information.
  • Cubon Park may be an area name associated with a CBS message identifier.
  • Cubon Park may be a modified area name associated with the CBS message identifier (e.g., the CBS message identifier area name may include "Cubon Pk.," which translates into Cubon Park).
  • Creating a website or providing advertisement in the paper may be expensive options for the user.
  • the user may utilize the geo-coding manager 109 of the UE 101 to update POI information associated with the hotel. As such, the user need only bear the cost of transmission (e.g., an SMS or MMS). In another embodiment, the system 100 may charge the user a listing fee, a subscription fee, etc.
  • the user may select a region for a POI at user interface 601. Then, the user may be prompted, at user interface 603, to select a category (e.g., accommodation) for the POI. The user then selects the subcategory of hotel at user interface 605. At this point, the UE 101 knows that the POI is a hotel in Cubon Park. Next at user interface 607, the user enters POI information associated with the hotel (e.g., name, nick name, address, etc.). Other possible fields may include road, street, marg, chowk, gali, avenue, enclave, etc.
  • the user may enter additional information, such as comments, pictures, etc. Moreover, the user may be prompted to select other POIs or landmarks near the POI to provide more refined grouping information about which POIs are near which POIs and landmarks. Further, the user may provide vocal or speech input to the UE 101 that may be converted to text using a speech-to-text mechanism to submit the information. This information may be transmitted to the map platform 103, the service platform 113, an information store 119, etc. for updating one or more geo-spatial documents with the POI information. Then, the user may receive a message from the platforms 103, 113 and/or information stores 119, presented on the user interface 609 as a successful registration notification. Alternatively, the user interface 611 may present an unsuccessful registration notification.
  • the new POI information may be received and utilized by the UE 101 of another user.
  • portions of the interface are highlighted and/or selected (e.g., Cubon Park)
  • audio associated with the portion e.g., a name
  • a user who cannot or prefers not to use text-based interaction e.g., an illiterate user or a user who cannot view text because of environmental conditions
  • other users (e.g., a local person, a tourist, etc.) interested in a POI may update the POI information.
  • the user may use the process of FIG. 3 to update one or more geo-spatial documents of the platforms and/or information stores. Further, users may additionally add ratings for the POIs, which may be utilized to search for POIs. Additionally, the user may be provided with incentives from the platforms and/or information stores to provide updates. These incentives may include monetary gain, credits for services from the platforms and/or information store, credits for sending messages, etc. or a combination thereof.
  • FIG. 6B shows user interfaces 621, 623, 625, 627, 629, 631 that may be utilized to provide information to a user of the UE 101 about POIs.
  • the user may have a UE 101 that is not capable of receiving GPS signals or using GPRS connectivity, such that the user cannot use sensor data to locate the geographic position directly.
  • the user is new to a city (e.g., a tourist), would like to know where the user is, and saw a Superfast Food nearby. In another embodiment, the user would like to find a nearby Superfast Food.
  • the user may enter text describing the restaurant or may select an option to retrieve information about the restaurant using menus.
  • the user enters city, area, and POI information by selecting the items in the interfaces 621, 623, 625, respectively.
  • the user inputs, in the interface 627, "Superfast food, near Alankar theatre, Feroze Koch Marg, Lajpat Nagar II."
  • the geo-coding manager 109 is then initiated to execute the process of FIG. 3 to find where the user or a nearby Superfast Food is, based on the input information. Further, the geo-coding manager 109 retrieves one or more geo-spatial documents from its own database or from a local information store 119.
  • the geo-coding manager 109 determines a most likely area or a list of nearby areas to allow the user to select one of these areas as the current location or the location of a nearby Superfast Food.
  • the geo-coding manager 109 displays the address information of the current location or the location of a nearby Superfast Food in the user interface 631.
  • the most likely area is an area associated with a cell of the geo-spatial document corresponding to the current location or the location of a nearby Superfast Food.
  • the geo-coding manager 109 displays a list of areas associated with the current location or the location of a nearby Superfast Food according to their ranks.
  • the geo-coding manager 109 displays points of interest in FIG. 6C, according to one embodiment. Points of interest can be displayed to the user in a location panel, map, etc. In FIG. 6C, the points of interest, including a restaurant 641 (i.e., the current location of the UE 101), a bus stop 643, a train station 645, a bank 647, and a movie theater 649, were extracted for a user visiting New Delhi.
  • the geo-coding manager 109 displays relevant geographic areas in FIG. 6D, according to one embodiment.
  • FIG. 6D shows geographic areas 661, 663, 665 including the icons of the points of interest. Each geographic area corresponds to a geo-spatial document.
  • a route 667 from the location to the bus stop is displayed to the user.
  • additional information, such as specialties of the restaurant or user ratings and/or reviews, may be presented on the user interface if the user selects to view additional information.
  • the user interfaces may be presented to the user via a vocal interface (e.g., using text-to-speech and speech-to-text means).
  • the user can select a zoom function. This function would then extract the landmarks or POIs in the current area or another selected area.
  • the user can select the landmark or POI that the user is close to. For instance, if the user knows that the user is close to a cinema theatre "A", the user could select that POI and the geo-coding manager 109 can perform a refined query to find all the restaurants near cinema theatre "A".
  • each POI may include, in its POI information, specific POIs and/or landmarks that the POI is near.
  • a more refined search can be provided by grouping certain POIs together within the areas of the geo-spatial documents.
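The refined query described above, such as finding all restaurants near cinema theatre "A", can be sketched in Python under the assumption that each POI entry carries a list of the landmarks it is near. The function name `pois_near` and the `nearby` and `category` keys are hypothetical.

```python
def pois_near(pois, landmark, category=None):
    """Return POIs that list the landmark as nearby, optionally filtered by category."""
    wanted = landmark.casefold()
    return [
        poi
        for poi in pois
        # Compare names case-insensitively so minor spelling case does not matter.
        if any(name.casefold() == wanted for name in poi.get("nearby", []))
        and (category is None or poi.get("category") == category)
    ]
```

A call such as `pois_near(pois, "Cinema A", "restaurant")` would then narrow a cell's POIs to the group the user is most likely standing next to.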
  • the geo-coding manager 109 may request updates from the platforms or the information stores or may broaden the search area.
  • the above-discussed embodiments leverage a crowd sourcing mechanism for collecting local business, point-of-interest (POI), and landmark data, and then geo-code unstructured address data.
  • the geo-coding involves clustering geo-spatially annotated address texts to form a corpus of geo-spatial documents.
  • the above-discussed embodiments perform geo-coding as a "search" over the knowledge corpus of the geo-spatial documents.
  • the above-discussed embodiments thus exploit landmark references, understand colloquial jargon, and handle unstructured address formats.
  • the above-discussed embodiments work on unstructured and landmark based location information commonly found in emerging markets.
  • users of UEs 101 are provided with location based services based on POI information associated with areas of the geo-spatial documents.
  • the UE 101 need not use power consuming GPS location determination technology to receive the location based services, thus saving power and extending battery life in a mobile UE 101.
  • Because the geo-spatial documents may be local to the UE 101, the UE 101 need not use GPRS services to receive the location based services. As such, the UE 101 need not have or utilize the capabilities of GPS or GPRS to provide the location information.
  • the geo-spatial documents need not utilize mapping information, thus the UE 101 can save memory resources while providing the location information without loading map images. Additionally, because the search experience adheres to current practices, there is little change in user behavior to utilize the geo-coding manager 109.
  • the geo-spatial documents and location based services may be provided by the platforms and/or information stores.
  • the processing and resource burden associated with providing such location information and/or location based services can be shifted from the UE 101 to the platforms and/or information stores, thereby reducing processing power and memory resources used at the UE 101 to support the location information and/or location based services.
  • the processes described herein for resolving geo-identification of un-structured and/or colloquial location information may be advantageously implemented via software, hardware, firmware or a combination of software and/or firmware and/or hardware.
  • FIG. 7 illustrates a computer system 700 upon which an embodiment of the invention may be implemented.
  • DSP Digital Signal Processing
  • ASIC Application Specific Integrated Circuit
  • FPGA Field Programmable Gate Arrays
  • Although computer system 700 is depicted with respect to a particular device or equipment, it is contemplated that other devices or equipment (e.g., network elements, servers, etc.) within FIG. 7 can deploy the illustrated hardware and components of system 700.
  • Computer system 700 is programmed (e.g., via computer program code or instructions) to resolve geo-identification of un-structured and/or colloquial location information as described herein and includes a communication mechanism such as a bus 710 for passing information between other internal and external components of the computer system 700.
  • Information (also called data) is represented as a physical expression of a measurable phenomenon, typically electric voltages, but including, in other embodiments, such phenomena as magnetic, electromagnetic, pressure, chemical, biological, molecular, atomic, sub-atomic and quantum interactions. For example, north and south magnetic fields, or a zero and non-zero electric voltage, represent two states (0, 1) of a binary digit (bit). Other phenomena can represent digits of a higher base.
  • a superposition of multiple simultaneous quantum states before measurement represents a quantum bit (qubit).
  • a sequence of one or more digits constitutes digital data that is used to represent a number or code for a character.
  • information called analog data is represented by a near continuum of measurable values within a particular range.
  • Computer system 700, or a portion thereof, constitutes a means for performing one or more steps of resolving geo-identification of un-structured and/or colloquial location information.
  • a bus 710 includes one or more parallel conductors of information so that information is transferred quickly among devices coupled to the bus 710.
  • One or more processors 702 for processing information are coupled with the bus 710.
  • a processor (or multiple processors) 702 performs a set of operations on information as specified by computer program code related to resolving geo-identification of un-structured and/or colloquial location information.
  • the computer program code is a set of instructions or statements providing instructions for the operation of the processor and/or the computer system to perform specified functions.
  • the code for example, may be written in a computer programming language that is compiled into a native instruction set of the processor. The code may also be written directly using the native instruction set (e.g., machine language).
  • the set of operations includes bringing information in from the bus 710 and placing information on the bus 710.
  • the set of operations also typically includes comparing two or more units of information, shifting positions of units of information, and combining two or more units of information, such as by addition or multiplication or logical operations like OR, exclusive OR (XOR), and AND.
  • Each operation of the set of operations that can be performed by the processor is represented to the processor by information called instructions, such as an operation code of one or more digits.
  • a sequence of operations to be executed by the processor 702, such as a sequence of operation codes, constitutes processor instructions, also called computer system instructions or, simply, computer instructions.
  • Processors may be implemented as mechanical, electrical, magnetic, optical, chemical or quantum components, among others, alone or in combination.
  • Computer system 700 also includes a memory 704 coupled to bus 710.
  • the memory 704 such as a random access memory (RAM) or any other dynamic storage device, stores information including processor instructions for resolving geo-identification of un-structured and/or colloquial location information.
  • Dynamic memory allows information stored therein to be changed by the computer system 700.
  • RAM allows a unit of information stored at a location called a memory address to be stored and retrieved independently of information at neighboring addresses.
  • the memory 704 is also used by the processor 702 to store temporary values during execution of processor instructions.
  • the computer system 700 also includes a read only memory (ROM) 706 or any other static storage device coupled to the bus 710 for storing static information, including instructions, that is not changed by the computer system 700.
  • Non-volatile (persistent) storage device 708 such as a magnetic disk, optical disk or flash card, for storing information, including instructions, that persists even when the computer system 700 is turned off or otherwise loses power.
  • Information, including instructions for resolving geo-identification of un-structured and/or colloquial location information, is provided to the bus 710 for use by the processor from an external input device 712, such as a keyboard containing alphanumeric keys operated by a human user, or a sensor.
  • a sensor detects conditions in its vicinity and transforms those detections into physical expression compatible with the measurable phenomenon used to represent information in computer system 700.
  • a display device 714 such as a cathode ray tube (CRT), a liquid crystal display (LCD), a light emitting diode (LED) display, an organic LED (OLED) display, a plasma screen, or a printer for presenting text or images
  • a pointing device 716 such as a mouse, a trackball, cursor direction keys, or a motion sensor, for controlling a position of a small cursor image presented on the display 714 and issuing commands associated with graphical elements presented on the display 714.
  • one or more of external input device 712, display device 714 and pointing device 716 is omitted.
  • special purpose hardware such as an application specific integrated circuit (ASIC) 720, is coupled to bus 710.
  • ASICs include graphics accelerator cards for generating images for display 714, cryptographic boards for encrypting and decrypting messages sent over a network, speech recognition, and interfaces to special external devices, such as robotic arms and medical scanning equipment that repeatedly perform some complex sequence of operations that are more efficiently implemented in hardware.
  • Computer system 700 also includes one or more instances of a communications interface 770 coupled to bus 710.
  • Communication interface 770 provides a one-way or two-way communication coupling to a variety of external devices that operate with their own processors, such as printers, scanners and external disks. In general the coupling is with a network link 778 that is connected to a local network 780 to which a variety of external devices with their own processors are connected.
  • communication interface 770 may be a parallel port or a serial port or a universal serial bus (USB) port on a personal computer.
  • communications interface 770 is an integrated services digital network (ISDN) card or a digital subscriber line (DSL) card or a telephone modem that provides an information communication connection to a corresponding type of telephone line.
  • a communication interface 770 is a cable modem that converts signals on bus 710 into signals for a communication connection over a coaxial cable or into optical signals for a communication connection over a fiber optic cable.
  • communications interface 770 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN, such as Ethernet. Wireless links may also be implemented.
  • the communications interface 770 sends or receives or both sends and receives electrical, acoustic or electromagnetic signals, including infrared and optical signals, that carry information streams, such as digital data.
  • the communications interface 770 includes a radio band electromagnetic transmitter and receiver called a radio transceiver.
  • the communications interface 770 enables connection from the UE 101 to the communication network 105 for resolving geo-identification of un-structured and/or colloquial location information.
  • Non-transitory media such as non-volatile media, include, for example, optical or magnetic disks, such as storage device 708.
  • Volatile media include, for example, dynamic memory 704.
  • Transmission media include, for example, twisted pair cables, coaxial cables, copper wire, fiber optic cables, and carrier waves that travel through space without wires or cables, such as acoustic waves and electromagnetic waves, including radio, optical and infrared waves.
  • Signals include man-made transient variations in amplitude, frequency, phase, polarization or other physical properties transmitted through the transmission media.
  • Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, CDRW, DVD, any other optical medium, punch cards, paper tape, optical mark sheets, any other physical medium with patterns of holes or other optically recognizable indicia, a RAM, a PROM, an EPROM, a FLASH-EPROM, an EEPROM, a flash memory, any other memory chip or cartridge, a carrier wave, or any other medium from which a computer can read.
  • the term computer-readable storage medium is used herein to refer to any computer-readable medium except transmission media.
  • Logic encoded in one or more tangible media includes one or both of processor instructions on computer-readable storage media and special purpose hardware, such as ASIC 720.
  • Network link 778 typically provides information communication using transmission media through one or more networks to other devices that use or process the information.
  • network link 778 may provide a connection through local network 780 to a host computer 782 or to equipment 784 operated by an Internet Service Provider (ISP).
  • ISP equipment 784 in turn provides data communication services through the public, world-wide packet-switching communication network of networks now commonly referred to as the Internet 790.
  • a computer called a server host 792 connected to the Internet hosts a process that provides a service in response to information received over the Internet.
  • server host 792 hosts a process that provides information representing video data for presentation at display 714. It is contemplated that the components of system 700 can be deployed in various configurations within other computer systems, e.g., host 782 and server 792.
  • At least some embodiments of the invention are related to the use of computer system 700 for implementing some or all of the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 700 in response to processor 702 executing one or more sequences of one or more processor instructions contained in memory 704. Such instructions, also called computer instructions, software and program code, may be read into memory 704 from another computer-readable medium such as storage device 708 or network link 778. Execution of the sequences of instructions contained in memory 704 causes processor 702 to perform one or more of the method steps described herein. In alternative embodiments, hardware, such as ASIC 720, may be used in place of or in combination with software to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware and software, unless otherwise explicitly stated herein.
  • the signals transmitted over network link 778 and other networks through communications interface 770 carry information to and from computer system 700.
  • Computer system 700 can send and receive information, including program code, through the networks 780, 790 among others, through network link 778 and communications interface 770.
  • a server host 792 transmits program code for a particular application, requested by a message sent from computer 700, through Internet 790, ISP equipment 784, local network 780 and communications interface 770.
  • the received code may be executed by processor 702 as it is received, or may be stored in memory 704 or in storage device 708 or any other non-volatile storage for later execution, or both. In this manner, computer system 700 may obtain application program code in the form of signals on a carrier wave.
  • instructions and data may initially be carried on a magnetic disk of a remote computer such as host 782.
  • the remote computer loads the instructions and data into its dynamic memory and sends the instructions and data over a telephone line using a modem.
  • a modem local to the computer system 700 receives the instructions and data on a telephone line and uses an infra-red transmitter to convert the instructions and data to a signal on an infra-red carrier wave serving as the network link 778.
  • An infrared detector serving as communications interface 770 receives the instructions and data carried in the infrared signal and places information representing the instructions and data onto bus 710.
  • Bus 710 carries the information to memory 704 from which processor 702 retrieves and executes the instructions using some of the data sent with the instructions.
  • the instructions and data received in memory 704 may optionally be stored on storage device 708, either before or after execution by the processor 702.
  • FIG. 8 illustrates a chip set or chip 800 upon which an embodiment of the invention may be implemented.
  • Chip set 800 is programmed to resolve geo-identification of un-structured and/or colloquial location information as described herein and includes, for instance, the processor and memory components described with respect to FIG. 7 incorporated in one or more physical packages (e.g., chips).
  • a physical package includes an arrangement of one or more materials, components, and/or wires on a structural assembly (e.g., a baseboard) to provide one or more characteristics such as physical strength, conservation of size, and/or limitation of electrical interaction.
  • It is contemplated that in certain embodiments the chip set or chip 800 can be implemented in a single chip. It is further contemplated that in certain embodiments the chip set or chip 800 can be implemented as a single "system on a chip." It is further contemplated that in certain embodiments a separate ASIC would not be used, for example, and that all relevant functions as disclosed herein would be performed by a processor or processors.
  • Chip set or chip 800, or a portion thereof, constitutes a means for performing one or more steps of providing user interface navigation information associated with the availability of functions.
  • Chip set or chip 800, or a portion thereof, constitutes a means for performing one or more steps of resolving geo-identification of un-structured and/or colloquial location information.
  • the chip set or chip 800 includes a communication mechanism such as a bus
  • a processor 803 has connectivity to the bus 801 to execute instructions and process information stored in, for example, a memory 805.
  • the processor 803 may include one or more processing cores with each core configured to perform independently.
  • a multi-core processor enables multiprocessing within a single physical package. Examples of a multi-core processor include two, four, eight, or greater numbers of processing cores.
  • the processor 803 may include one or more microprocessors configured in tandem via the bus 801 to enable independent execution of instructions, pipelining, and multithreading.
  • the processor 803 may also be accompanied with one or more specialized components to perform certain processing functions and tasks such as one or more digital signal processors (DSP) 807, or one or more application-specific integrated circuits (ASIC) 809.
  • a DSP 807 typically is configured to process real-world signals (e.g., sound) in real time independently of the processor 803.
  • an ASIC 809 can be configured to perform specialized functions not easily performed by a more general purpose processor.
  • Other specialized components to aid in performing the inventive functions described herein may include one or more field programmable gate arrays (FPGA) (not shown), one or more controllers (not shown), or one or more other special-purpose computer chips.
  • the chip set or chip 800 includes merely one or more processors and some software and/or firmware supporting and/or relating to and/or for the one or more processors.
  • the processor 803 and accompanying components have connectivity to the memory 805 via the bus 801.
  • the memory 805 includes both dynamic memory (e.g., RAM, magnetic disk, writable optical disk, etc.) and static memory (e.g., ROM, CD-ROM, etc.) for storing executable instructions that when executed perform the inventive steps described herein to resolve geo-identification of un-structured and/or colloquial location information.
  • the memory 805 also stores the data associated with or generated by the execution of the inventive steps.
  • FIG. 9 is a diagram of exemplary components of a mobile terminal (e.g., handset) for communications, which is capable of operating in the system of FIG. 1 , according to one embodiment.
  • mobile terminal 901, or a portion thereof, constitutes a means for performing one or more steps of resolving geo-identification of un-structured and/or colloquial location information.
  • a radio receiver is often defined in terms of front-end and back-end characteristics. The front-end of the receiver encompasses all of the Radio Frequency (RF) circuitry whereas the back-end encompasses all of the base-band processing circuitry.
  • RF Radio Frequency
  • circuitry refers to both: (1) hardware-only implementations (such as implementations in only analog and/or digital circuitry), and (2) to combinations of circuitry and software (and/or firmware) (such as, if applicable to the particular context, to a combination of processor(s), including digital signal processor(s), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions).
  • This definition of “circuitry” applies to all uses of this term in this application, including in any claims.
  • the term “circuitry” would also cover an implementation of merely a processor (or multiple processors) and its (or their) accompanying software and/or firmware.
  • the term “circuitry” would also cover if applicable to the particular context, for example, a baseband integrated circuit or applications processor integrated circuit in a mobile phone or a similar integrated circuit in a cellular network device or other network devices.
  • Pertinent internal components of the telephone include a Main Control Unit (MCU) 903, a Digital Signal Processor (DSP) 905, and a receiver/transmitter unit including a microphone gain control unit and a speaker gain control unit.
  • a main display unit 907 provides a display to the user in support of various applications and mobile terminal functions that perform or support the steps of resolving geo-identification of un-structured and/or colloquial location information.
  • the display 907 includes display circuitry configured to display at least a portion of a user interface of the mobile terminal (e.g., mobile telephone). Additionally, the display 907 and display circuitry are configured to facilitate user control of at least some functions of the mobile terminal.
  • An audio function circuitry 909 includes a microphone 911 and microphone amplifier that amplifies the speech signal output from the microphone 911. The amplified speech signal output from the microphone 911 is fed to a coder/decoder (CODEC) 913.
  • CODEC coder/decoder
  • a radio section 915 amplifies power and converts frequency in order to communicate with a base station, which is included in a mobile communication system, via antenna 917.
  • the power amplifier (PA) 919 and the transmitter/modulation circuitry are operationally responsive to the MCU 903, with an output from the PA 919 coupled to the duplexer 921 or circulator or antenna switch, as known in the art.
  • the PA 919 also couples to a battery interface and power control unit 920.
  • a user of mobile terminal 901 speaks into the microphone 911 and his or her voice along with any detected background noise is converted into an analog voltage.
  • the analog voltage is then converted into a digital signal through the Analog to Digital Converter (ADC) 923.
  • ADC Analog to Digital Converter
  • the control unit 903 routes the digital signal into the DSP 905 for processing therein, such as speech encoding, channel encoding, encrypting, and interleaving.
  • the processed voice signals are encoded, by units not separately shown, using a cellular transmission protocol such as enhanced data rates for global evolution (EDGE), general packet radio service (GPRS), global system for mobile communications (GSM), Internet protocol multimedia subsystem (IMS), universal mobile telecommunications system (UMTS), etc., as well as any other suitable wireless medium, e.g., microwave access (WiMAX), Long Term Evolution (LTE) networks, code division multiple access (CDMA), wideband code division multiple access (WCDMA), wireless fidelity (WiFi), satellite, and the like, or any combination thereof.
  • EDGE enhanced data rates for global evolution
  • GPRS general packet radio service
  • GSM global system for mobile communications
  • IMS Internet protocol multimedia subsystem
  • UMTS universal mobile telecommunications system
  • the encoded signals are then routed to an equalizer 925 for compensation of any frequency-dependent impairments that occur during transmission through the air such as phase and amplitude distortion.
  • the modulator 927 combines the signal with a RF signal generated in the RF interface 929.
  • the modulator 927 generates a sine wave by way of frequency or phase modulation.
  • an up-converter 931 combines the sine wave output from the modulator 927 with another sine wave generated by a synthesizer 933 to achieve the desired frequency of transmission.
  • the signal is then sent through a PA 919 to increase the signal to an appropriate power level.
  • the PA 919 acts as a variable gain amplifier whose gain is controlled by the DSP 905 from information received from a network base station.
  • the signal is then filtered within the duplexer 921 and optionally sent to an antenna coupler 935 to match impedances to provide maximum power transfer. Finally, the signal is transmitted via antenna 917 to a local base station.
  • An automatic gain control (AGC) can be supplied to control the gain of the final stages of the receiver.
  • the signals may be forwarded from there to a remote telephone which may be another cellular telephone, any other mobile phone or a land-line connected to a Public Switched Telephone Network (PSTN), or other telephony networks.
  • PSTN Public Switched Telephone Network
  • Voice signals transmitted to the mobile terminal 901 are received via antenna 917 and immediately amplified by a low noise amplifier (LNA) 937.
  • LNA low noise amplifier
  • a down-converter 939 lowers the carrier frequency while the demodulator 941 strips away the RF leaving only a digital bit stream.
  • the signal then goes through the equalizer 925 and is processed by the DSP 905.
  • a Digital to Analog Converter (DAC) 943 converts the signal and the resulting output is transmitted to the user through the speaker 945, all under control of a Main Control Unit (MCU) 903 which can be implemented as a Central Processing Unit (CPU) (not shown).
  • MCU Main Control Unit
  • CPU Central Processing Unit
  • the MCU 903 receives various signals including input signals from the keyboard 947.
  • the keyboard 947 and/or the MCU 903 in combination with other user input components comprise a user interface circuitry for managing user input.
  • the MCU 903 runs a user interface software to facilitate user control of at least some functions of the mobile terminal 901 to resolve geo-identification of un-structured and/or colloquial location information.
  • the MCU 903 also delivers a display command and a switch command to the display 907 and to the speech output switching controller, respectively.
  • the MCU 903 exchanges information with the DSP 905 and can access an optionally incorporated SIM card 949 and a memory 951.
  • the MCU 903 executes various control functions required of the terminal.
  • the DSP 905 may, depending upon the implementation, perform any of a variety of conventional digital processing functions on the voice signals. Additionally, DSP 905 determines the background noise level of the local environment from the signals detected by microphone 911 and sets the gain of microphone 911 to a level selected to compensate for the natural tendency of the user of the mobile terminal 901.
  • the CODEC 913 includes the ADC 923 and DAC 943.
  • the memory 951 stores various data including call incoming tone data and is capable of storing other data including music data received via, e.g., the global Internet.
  • the software module could reside in RAM memory, flash memory, registers, or any other form of writable storage medium known in the art.
  • the memory device 951 may be, but is not limited to, a single memory, CD, DVD, ROM, RAM, EEPROM, optical storage, magnetic disk storage, flash memory storage, or any other nonvolatile storage medium capable of storing digital data.
  • An optionally incorporated SIM card 949 carries, for instance, important information, such as the cellular phone number, the carrier supplying service, subscription details, and security information.
  • the SIM card 949 serves primarily to identify the mobile terminal 901 on a radio network.
  • the card 949 also contains a memory for storing a personal telephone number registry, text messages, and user specific mobile terminal settings.

Abstract

An approach is provided for predicting and resolving geo-identification of un-structured and/or colloquial location information. A geo-coding manager causes, at least in part, a collection of point-of-interest information from one or more sources. Next, the geo-coding manager processes and/or facilitates a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof. Then, the geo-coding manager causes, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas. The geo-coding manager processes and/or facilitates a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.

Description

METHOD AND APPARATUS FOR RESOLVING GEO-IDENTITY
BACKGROUND
Service providers and device manufacturers (e.g., wireless, cellular, etc.) are continually challenged to deliver value and convenience to consumers by, for example, providing compelling network services. One area of development has been the use of location based services (LBS), such as local search, local advertising, point-to-point navigation, geo-spatial recommendation systems, etc. The geographical position is usually available from the GPS coordinates obtained from GPS-enabled handheld devices. However, GPS data is not available when the GPS connectivity is weak, when user devices are not GPS-enabled (common in emerging markets), when user devices are not mobile, when the user is searching for information about a location different from the GPS location, etc. The existing geo-coding services may convert geographic coordinates and structured location data (e.g., street addresses) into names of places (e.g., points of interest (POIs)), street addresses, neighborhoods, cities/towns, counties/provinces, states, or countries, etc., to provide users with location information that they can understand. However, location data that follows a definite structure is not always available or well recognized in all areas and/or countries, especially in developing countries. It is challenging to handle unstructured and/or imprecise location information (e.g., between the third and the fourth bicycle rental shops on the beach), and colloquial and unofficial location references (e.g., "the Big Apple" as a nickname for New York City, "near XYZ hospital," or "at ABC mall"). Accordingly, service providers and device manufacturers face significant technical challenges in processing un-structured and/or colloquial location information and identifying therein user-friendly location information. Such geo-identity resolution enhances the geo-coding database and user experience of location based services.
SOME EXAMPLE EMBODIMENTS
Therefore, there is a need for an approach for resolving geo-identification of un-structured and/or colloquial location information.
According to one embodiment, a method comprises causing, at least in part, a collection of point-of-interest information from one or more sources. The method also comprises processing and/or facilitating a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof. The method further comprises causing, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas. The method further comprises processing and/or facilitating a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
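By way of illustration only, the four steps of this embodiment (collection, term determination, aggregation into geo-spatial documents, and query resolution) can be sketched in Python as follows. The sketch is not the claimed implementation: the POI records and their coordinates, the fixed-size grid cell used as a stand-in for a "geographical area", and the names cell_of, terms_of, build_documents, and resolve are all assumptions introduced here, and the query is resolved by a plain term-overlap count rather than by any particular matching model.

```python
import math
from collections import Counter, defaultdict

# Hypothetical POI records: (name, free-text location description, lat, lon).
# Coordinates are illustrative values, not surveyed locations.
POIS = [
    ("Barista", "near Alankar theatre Lajpat Nagar II", 28.5672, 77.2431),
    ("Dominos Pizza", "Feroze Gandhi Marg Lajpat Nagar II", 28.5675, 77.2436),
    ("City Museum", "Connaught Place inner circle", 28.6315, 77.2167),
]

CELL_SIZE_DEG = 0.01  # assumed grid resolution; the source does not fix one


def cell_of(lat, lon):
    """Map a coordinate to a grid cell standing in for a geographical area."""
    return (math.floor(lat / CELL_SIZE_DEG), math.floor(lon / CELL_SIZE_DEG))


def terms_of(text):
    """Determine terms by lower-casing and splitting on whitespace."""
    return text.lower().split()


def build_documents(pois):
    """Aggregate the terms of every POI into one term-count document per cell."""
    docs = defaultdict(Counter)
    for name, description, lat, lon in pois:
        docs[cell_of(lat, lon)].update(terms_of(name + " " + description))
    return docs


def resolve(query, docs):
    """Return the cell whose document shares the most term occurrences with the query."""
    query_terms = terms_of(query)
    return max(docs, key=lambda cell: sum(docs[cell][t] for t in query_terms))


docs = build_documents(POIS)
best_cell = resolve("McDonalds near Alankar theatre Lajpat Nagar", docs)
```

Because the first two hypothetical POIs fall into the same cell, their terms reinforce each other in a single geo-spatial document, which is what lets a query that names neither POI exactly still resolve to that area.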
According to another embodiment, an apparatus comprises at least one processor, and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause, at least in part, the apparatus to cause, at least in part, a collection of point-of-interest information from one or more sources. The apparatus is also caused to process and/or facilitate a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof. The apparatus is further caused to cause, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas. The apparatus is further caused to process and/or facilitate a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
According to another embodiment, a computer-readable storage medium carries one or more sequences of one or more instructions which, when executed by one or more processors, cause, at least in part, an apparatus to cause, at least in part, a collection of point-of-interest information from one or more sources. The apparatus is also caused to process and/or facilitate a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof. The apparatus is further caused to cause, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas. The apparatus is further caused to process and/or facilitate a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
According to another embodiment, an apparatus comprises means for causing, at least in part, a collection of point-of-interest information from one or more sources. The apparatus also comprises means for processing and/or facilitating a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof. The apparatus further comprises means for causing, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas. The apparatus further comprises means for processing and/or facilitating a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
According to one embodiment, a method comprises causing, at least in part, a collection of point-of-interest information from one or more sources. The method also comprises causing, at least in part, transmission of a query to a geo-coding service. The method further comprises receiving a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof, wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
According to another embodiment, an apparatus comprises at least one processor, and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause, at least in part, the apparatus to cause, at least in part, transmission of a query to a geo-coding service. The apparatus is also caused to receive a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof, wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
According to another embodiment, a computer-readable storage medium carries one or more sequences of one or more instructions which, when executed by one or more processors, cause, at least in part, an apparatus to cause, at least in part, transmission of a query to a geo-coding service. The apparatus is also caused to receive a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof, wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
According to another embodiment, an apparatus comprises means for causing, at least in part, transmission of a query to a geo-coding service. The apparatus also comprises means for processing and/or facilitating a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof. The apparatus further comprises means for receiving a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof, wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
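The embodiments above state that query terms are probabilistically matched against the terms of geo-spatial documents, without fixing a probability model at this point in the text. As a minimal sketch under one possible assumption, the Python code below scores each document with an add-one (Laplace) smoothed unigram likelihood of the query terms and returns the best-scoring area name. The document contents, the area names used as keys, and the functions log_likelihood and best_area are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical geo-spatial documents: area name -> term counts.
DOCS = {
    "Lajpat Nagar II": Counter(
        {"alankar": 3, "theatre": 3, "lajpat": 5, "nagar": 5, "barista": 1}
    ),
    "Connaught Place": Counter(
        {"connaught": 6, "place": 6, "museum": 2, "theatre": 1}
    ),
}


def log_likelihood(query_terms, doc, vocab_size):
    """Add-one smoothed log P(query | document) under a unigram model (an assumed model)."""
    total = sum(doc.values())
    return sum(
        math.log((doc[term] + 1) / (total + vocab_size)) for term in query_terms
    )


def best_area(query, docs):
    """Return the area name whose document gives the query the highest likelihood."""
    terms = query.lower().split()
    vocab = set().union(*docs.values())
    vocab.update(terms)  # count query terms unseen in any document as vocabulary
    return max(
        docs, key=lambda name: log_likelihood(terms, docs[name], len(vocab))
    )


area = best_area("near Alankar theatre", DOCS)
```

Smoothing keeps a single unseen query term (here "near") from driving a document's likelihood to zero, so an imprecise or colloquial query can still be ranked against every area.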
In addition, for various example embodiments of the invention, the following is applicable: a method comprising facilitating a processing of and/or processing (1) data and/or (2) information and/or (3) at least one signal, the (1) data and/or (2) information and/or (3) at least one signal based, at least in part, on (including derived at least in part from) any one or any combination of methods (or processes) disclosed in this application as relevant to any embodiment of the invention.
For various example embodiments of the invention, the following is also applicable: a method comprising facilitating access to at least one interface configured to allow access to at least one service, the at least one service configured to perform any one or any combination of network or service provider methods (or processes) disclosed in this application.
For various example embodiments of the invention, the following is also applicable: a method comprising facilitating creating and/or facilitating modifying (1) at least one device user interface element and/or (2) at least one device user interface functionality, the (1) at least one device user interface element and/or (2) at least one device user interface functionality based, at least in part, on data and/or information resulting from one or any combination of methods or processes disclosed in this application as relevant to any embodiment of the invention, and/or at least one signal resulting from one or any combination of methods (or processes) disclosed in this application as relevant to any embodiment of the invention.
For various example embodiments of the invention, the following is also applicable: a method comprising creating and/or modifying (1) at least one device user interface element and/or (2) at least one device user interface functionality, the (1) at least one device user interface element and/or (2) at least one device user interface functionality based at least in part on data and/or information resulting from one or any combination of methods (or processes) disclosed in this application as relevant to any embodiment of the invention, and/or at least one signal resulting from one or any combination of methods (or processes) disclosed in this application as relevant to any embodiment of the invention.
In various example embodiments, the methods (or processes) can be accomplished on the service provider side or on the mobile device side or in any shared way between service provider and mobile device with actions being performed on both sides.
For various example embodiments, the following is applicable: An apparatus comprising means for performing the method of any of originally filed claims 1-30 and 51-53.
Still other aspects, features, and advantages of the invention are readily apparent from the following detailed description, simply by illustrating a number of particular embodiments and implementations, including the best mode contemplated for carrying out the invention. The invention is also capable of other and different embodiments, and its several details can be modified in various obvious respects, all without departing from the spirit and scope of the invention. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
BRIEF DESCRIPTION OF THE DRAWINGS
The embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings:
FIG. 1 is a diagram of a system capable of resolving geo-identification of un-structured and/or colloquial location information, according to one embodiment;
FIG. 2 is a diagram of the components of a geo-coding manager, according to one embodiment;
FIGs. 3A-3B are flowcharts of processes for resolving geo-identification of un-structured and/or colloquial location information, according to various embodiments;
FIG. 4 is a location area diagram mapped with points of interest and utilized in the process of FIG. 3, according to one embodiment;
FIGs. 5A-5B show examples of two geo-spatial documents utilized in the process of FIG. 3, according to various embodiments;
FIGs. 6A-6D are diagrams of user interfaces utilized in the process of FIG. 3, according to various embodiments;
FIG. 7 is a diagram of hardware that can be used to implement an embodiment of the invention;
FIG. 8 is a diagram of a chip set that can be used to implement an embodiment of the invention; and
FIG. 9 is a diagram of a mobile terminal (e.g., handset) that can be used to implement an embodiment of the invention.
DESCRIPTION OF SOME EMBODIMENTS
Examples of a method, apparatus, and computer program for resolving geo-identification of unstructured and/or colloquial location information are disclosed. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the invention. It is apparent, however, to one skilled in the art that the embodiments of the invention may be practiced without these specific details or with an equivalent arrangement. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the embodiments of the invention.
As used herein, the term "point of interest", or POI, refers to a specific point location that an individual entity, business entity, or any legal entity may find useful or interesting. This term is used interchangeably with landmark. A POI may be a historical monument, cinema theatre, pub, bar, restaurant, hotel, club, venue, sightseeing spot, shopping mall/center, building, museum, industrial/science park, police/fire station, post office, bank, ATM machine, hospital, pharmacy, school, church, golf course, bridge, historic house, camping/caravan site, daycare center, community center, tunnel, airport, roadway, waterway, railway, rock formation, spring, oasis, mountain, etc.
As used herein, the term "cloud" refers to an aggregated set of information and computational closures from different sources. This multi-sourcing is very flexible since it accounts and relies on the observation that the same piece of information or computation can come from different sources. In one embodiment, information and computations within the cloud are represented using Semantic Web standards such as Resource Description Framework (RDF), RDF Schema (RDFS), OWL (Web Ontology Language), FOAF (Friend of a Friend ontology), rule sets in RuleML (Rule Markup Language), etc. Furthermore, as used herein, RDF refers to a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model. It has come to be used as a general method for conceptual description or modeling of information and computations that is implemented in web resources; using a variety of syntax formats. Although various embodiments are described with respect to clouds, it is contemplated that the approach described herein may be used with other structures and conceptual description methods used to create distributed models of information and computations.
As used herein, the term "location-based service" (LBS) refers to an information service accessible through the network and utilizing the ability to make use of the geographical position of a terminal. LBS services can be used in a variety of contexts, such as navigation, entertainment, health, work, personal life, etc. Location-based services include services to identify a location of a person or object, discover the nearest banking cash machine or the whereabouts of a friend or employee. Location-based services include location-based commerce (e.g., trade and repair, wholesale, financial, legal, personal services, business services, communications and media, etc.), location-based ecommerce (e.g., online transactions, coupons, marketing, advertising, etc.), accommodation, real estate, renting, construction, dining, transport and travel, travel guides, mapping and navigation, parcel/vehicle tracking, personalized weather services, location-based games, etc.
As used herein, the term "user context" refers to discrete context characteristics/data of a user and/or the user terminal/equipment (UE), such as a date, time, location, current activity, weather, a history of activities, etc., associated with the user. In an effort to organize the user context data, a contextual structure is populated with instances, locations (e.g., points of interest), and events (e.g., activities) that contain possible relationships between points of interest and user activities discovered via, for instance, data-mining or other querying processes. By way of example, the contextual structure incorporates characteristics and features of an individual user's context data, such as the user's calendar, text messages, instant messages, etc. In another embodiment, user preference data is also merged into the user context structure. In particular, the contextual data elements may include location (where the user/UE is available, where the context information source is applicable, etc.), active dates (the range of dates for which the user/UE and/or the context information source is available), sub-identifiers (each sub-identifier associated with a different location and/or applicable context information source), event type (event information associated with the user/UE), time (of the event, if the user/UE is involved), applicable context (in which the context information source is applicable), context source (what sensors, services, applications, etc. can provide the related contextual information), and optionally preference elements (associated with which preference data elements), etc. The user preferences include user information and user preference data. Typical user information elements include a user identifier (e.g., telephone number), user device model (e.g., to identify device capabilities), age, nationality, language preferences, interest areas, and login credentials (to access the listed information resources of external links).
In one embodiment, the preference data is automatically retrieved and/or generated by the system from the backend data and/or external information sources. In another embodiment, the preference data structure is recorded at the user device based upon user personal data, online interactions and related activities with respect to specific topics, points of interest, or locations, etc. It is contemplated that the user can define any number of preference elements and tokens as user preference data. In addition or alternatively, the system decides what parameters or attributes to choose to represent user context and/or preferences.
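For illustration, the contextual data elements and user information elements listed above could be held in a structure along the following lines. This is only a sketch: the class names UserInfo and ContextElement, the field names, the field types, and the is_active helper are assumptions made here, since the description names the elements but does not prescribe a schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class UserInfo:
    """Typical user information elements (field names are hypothetical)."""
    user_id: str                      # e.g., a telephone number
    device_model: str = ""            # e.g., to identify device capabilities
    age: Optional[int] = None
    nationality: str = ""
    language_preferences: List[str] = field(default_factory=list)
    interest_areas: List[str] = field(default_factory=list)


@dataclass
class ContextElement:
    """One contextual data element of the user context structure (assumed layout)."""
    location: str                     # where the user/UE is available
    active_from: date                 # start of the active date range
    active_to: date                   # end of the active date range
    sub_identifiers: List[str] = field(default_factory=list)
    event_type: str = ""              # event information associated with the user/UE
    context_source: str = ""          # sensor, service, or application providing the data
    preference_elements: List[str] = field(default_factory=list)

    def is_active(self, day: date) -> bool:
        """Return True if the given day falls within the active date range."""
        return self.active_from <= day <= self.active_to


user = UserInfo(user_id="+10000000000", language_preferences=["en"])
element = ContextElement(
    location="Lajpat Nagar II",
    active_from=date(2012, 1, 1),
    active_to=date(2012, 12, 31),
    event_type="meeting",
    context_source="calendar",
)
```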
Although various embodiments are described with respect to crowd sourced data, it is contemplated that the approach described herein is applicable to other data sources, including, but not limited to, a local listing of POI addresses, landmark information from data vendors (e.g., Navteq), etc.
FIG. 1 is a diagram of a system capable of resolving geo-identification of un-structured and/or colloquial location information, according to one embodiment. It is becoming increasingly popular for service providers and device manufacturers to bundle or make available navigation and mapping services on an array of user devices (e.g., mobile handsets, computers, navigation devices, etc.). Such devices may utilize location-based technologies (e.g., Global Positioning System (GPS) receivers, cellular triangulation, assisted-GPS (A-GPS), etc.) to provide navigation and mapping information. One growing trend is the use of geo-coding and/or geo-identity resolution services on a given location text and/or contact information to extract structured location information, then use the structured location information to provide location based services, and/or to present the information to the user in the form of place names (e.g., points of interest (POIs)), street addresses, neighborhoods, city/town names, county names, state/province names, country names, etc. However, the existing geo-coding practices have limitations for handling un-structured and/or colloquial location information.
Address interpolation requires the street address in order to match it to a street and a specific segment, and then interpolates the position of the address within the range along the segment to obtain the geographic coordinates of the street address. By way of example, to interpolate the address 955 Page Mill Road, a street segment running from 900 to 1000 is extracted. 955 would be somewhere in the middle of this block, with odd numbers on one side and even numbers on the other side. Based on this information, approximate geo-coordinates are mapped for the address. This method works well only for addresses that follow a particular scheme (e.g., the numbering scheme). However, addresses in old cities, developing countries, etc. do not adhere to any particular scheme, so it is difficult to deploy address interpolation there. Finding geo-coordinates based on address zip codes is not always helpful, since the user may not know the relevant zip code, and a zip code area can be too large for an intended purpose.
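By way of illustration only, the interpolation described above can be sketched as a linear interpolation along a segment with a known house-number range. The `Segment` structure, the function name, and the endpoint coordinates are assumptions introduced for this sketch; they are not part of the described system and the coordinates are not survey data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Segment:
    """Hypothetical street segment with a known house-number range."""
    street: str
    start_number: int
    end_number: int
    start: Tuple[float, float]  # (lat, lon) at start_number
    end: Tuple[float, float]    # (lat, lon) at end_number


def interpolate(segment: Segment, house_number: int) -> Tuple[float, float, str]:
    """Linearly interpolate a house number along the segment.

    Returns (lat, lon, side), where side is "odd" or "even".
    """
    if not segment.start_number <= house_number <= segment.end_number:
        raise ValueError("house number outside the segment range")
    span = segment.end_number - segment.start_number
    fraction = 0.0 if span == 0 else (house_number - segment.start_number) / span
    lat = segment.start[0] + fraction * (segment.end[0] - segment.start[0])
    lon = segment.start[1] + fraction * (segment.end[1] - segment.start[1])
    side = "odd" if house_number % 2 else "even"
    return lat, lon, side


# Placeholder endpoint coordinates for the 900-1000 block (assumed values).
block = Segment("Page Mill Road", 900, 1000, (37.4200, -122.1400), (37.4210, -122.1420))
lat, lon, side = interpolate(block, 955)
print(lat, lon, side)
```

The sketch only works when the segment's number range is known, which is exactly the assumption that fails for addresses without a numbering scheme.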
Another geo-coding approach uses text-based searches through a corpus of point-of-interest (POI) data and returns the geo-coordinates of the POI that has the maximum string matches. In this approach, a document is considered for each POI entry. This document can be indexed based on certain keywords. The user-entered address is then matched (based on string proximity) with each of these documents. A relevance score is assigned to each document based on the number of string matches. The document with the highest relevance score is then returned along with its geo-coordinates. However, this pure string matching approach may fail due to the vagaries in addresses, as demonstrated in the following example. The user inputs an address of a cafe shop named Barista in New Delhi, India: McDonalds, near Alankar theatre, Feroze Gandhi Marg, Lajpat Nagar II, and there are four candidates.
1. Barista, 3 C's Cinema Road, Near Alankar theatre, Lajpat Nagar II, New Delhi, India.
2. Dominos Pizza, Feroze Gandhi Marg, near 3 C's cinema, Lajpat Nagar II, New Delhi, India.
3. McDonalds, 3 C's Cinema Road, Lajpat Nagar II, New Delhi, India.
4. Feroze Shopping centre, Alankar Marg, Near McDonalds, Lajpat Nagar III, New Delhi, India.
Table 1
Here, addresses 1, 2, and 3 are close to one another, since 3 C's Cinema Road and Feroze Gandhi Marg refer to the same road. Address 4 is far from the first three addresses, since it is in a different area (Lajpat Nagar III). Although Address 3 is a perfect match for the search query, the text-based search engine finds Address 4 as the best match, since its string matches yield a higher relevance score than those of the other addresses.
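The failure mode can be reproduced with a deliberately naive scorer. The sketch below assumes a simple count of shared lowercase tokens as the relevance score; this scoring rule is an assumption for illustration and is not the scoring used by any particular search engine. Under this rule the intended Address 3 receives the lowest score, while the distant Address 4 ties with Address 2 for the highest; an engine with different weighting or tie-breaking may rank them differently.

```python
import re

QUERY = "McDonalds, near Alankar theatre, Feroze Gandhi Marg, Lajpat Nagar II"
CANDIDATES = [
    "Barista, 3 C's Cinema Road, Near Alankar theatre, Lajpat Nagar II, New Delhi, India.",
    "Dominos Pizza, Feroze Gandhi Marg, near 3 C's cinema, Lajpat Nagar II, New Delhi, India.",
    "McDonalds, 3 C's Cinema Road, Lajpat Nagar II, New Delhi, India.",
    "Feroze Shopping centre, Alankar Marg, Near McDonalds, Lajpat Nagar III, New Delhi, India.",
]


def tokens(text):
    """Lowercase word tokens; 'II' and 'III' remain distinct tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def overlap_score(query, document):
    """Number of distinct tokens shared by query and document (assumed scoring rule)."""
    return len(tokens(query) & tokens(document))


scores = [overlap_score(QUERY, candidate) for candidate in CANDIDATES]
for number, score in enumerate(scores, start=1):
    print(f"Address {number}: {score}")
```

The scorer has no notion that "3 C's Cinema Road" and "Feroze Gandhi Marg" name the same road, or that Lajpat Nagar III is a different area, which is why string matching alone is unreliable here.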
To address this problem, a system 100 of FIG. 1 introduces the capability to resolve geo-identification of un-structured and/or colloquial location information by crowdsourcing and document clustering. More specifically, the system 100 provides a two-fold approach for resolving geo-identities of unstructured addresses. First, the system 100 uses a crowd-sourcing approach for collecting local business, landmark, and point of interest (POI) data. The system 100 bootstraps geo-coordinate information of addresses of local businesses and points of interest; each POI data entry has geo-tags associated therewith, and the system 100 geo-codes based on these addresses. Second, the system 100 aggregates these addresses into geo-spatial clusters depending on their geo-coordinates, and assigns to every cluster a document that aggregates information (e.g., street names, landmark references, area names, etc.) about all the POIs belonging to that cluster. When a user enters an address text, a search engine runs over these documents to find a good match, and the document with the best match is returned to the user. The geo-coding task is thus reduced to a search task over the corpus of these documents. This approach captures various colloquial references and the variations in the same address, and reduces the reliance on street network data. The geo-spatial clustering and aggregation allows the system 100 to capture the vagaries of unstructured addresses that do not follow any format such as "suite number, street name, area code," etc. Two input strings X and Y that have different address texts but actually refer to the same physical location (a very common case in emerging markets) can be accurately geo-coded using this approach. The system 100 does not rely on detailed street and road network information, which may be hard or expensive to find.
As shown in FIG. 1, the system 100 comprises a user equipment (UE) 101 (or UEs 101 or UEs 101a-101n) having connectivity to a map platform 103 via a communication network 105. The location information may be utilized by applications 107 of the UE 101 (e.g., location-based applications). The applications 107 may also include or have access to a geo-coding manager 109 to resolve geo-identification of un-structured and/or colloquial location information. It is noted that the geo-coding manager 109 may be included with the UE 101 as shown, or the geo-coding manager 109 may be provided and handled by the map platform 103. Moreover, mapping information, such as location information, may be included in a map database 111 associated with the map platform 103 for access by the applications 107. As discussed, mapping information may be retrieved from the map database 111 to be utilized by the applications 107 of the UE 101.
In certain embodiments, mapping information may be associated with content information including live media (e.g., streaming broadcasts), stored media (e.g., stored on a network or locally), metadata associated with media, text information, location information of other user devices, or a combination thereof. The content may be provided by the service platform 113, which includes one or more services 115a-115m (e.g., music service, mapping service, video service, social networking service, content broadcasting service, etc.), one or more content providers 117a-117k (e.g., online content retailers, public databases, etc.), or other content sources available or accessible over the communication network 105. For example, the applications 107 may present location-related content information (e.g., content with regard to images, videos, articles, people, places, etc., associated with a location) on a display of the UE 101 in addition or as an alternate to geo-coded information and/or other mapping information. As seen in FIG. 1, a user of UE 101 may own, use, or otherwise have access to various pieces of information distributed in information stores 119a-119l in the cloud.
As mentioned, the UE 101 may utilize location-based technologies (GPS receivers, cellular triangulation, A-GPS, etc.) to provide mapping information. For instance, the UE 101 may include a GPS receiver to obtain geographic coordinates from satellites 121 to determine the current location associated with the UE 101. In one sample use case, a user lands at the local airport near his home from a long-distance business trip. Based on the geographic coordinates received from the satellites 121, a particular application 107 will check via the geo-coding manager 109 whether the location information (e.g., geo-coded information) for the current location is available in the memory of the UE 101. Assuming that the user had previously departed from the local airport, the geo-coding manager 109 may have predicted that the user would return, for instance, based on calendar entries on the user's UE 101. Thus, in this case, accurate location information of the user's current location will most likely be pre-fetched and stored in the memory of the UE 101.
In another sample use case, the user lands in a foreign country or area where the user has not been before. In this case, on activation of the UE 101 (e.g., turning on, entering a new cellular network, etc.), the geo-coding manager 109 may attempt to obtain the location at the best accuracy possible. For example, while inside the area, the UE 101 may not have GPS satellite reception, and therefore may determine location information based on, for instance, imprecise landmark references such as "near Koramangala bus stop," "opposite Forum mall," etc.
The address query may lack proper structure and may not follow any particular scheme. The address query may contain colloquial references and abbreviations different from their official names. For example, MG Road is a very well known abbreviation for Mahatma Gandhi Road in Bangalore. The address query may contain references to area names. For instance, most cities in India are divided (unofficially) into different areas. These areas may span a few kilometers, but they may not have any official geo-boundaries. The area names are either colloquial or based on some history associated with the area. They are, however, used as good reference points for addresses. Address schemes and notions may differ from one area/city/county to another. For tier 2 and tier 3 cities, it may be hard to find detailed road and street network data. Even with a detailed street network, it may be hard to geo-code addresses.
Based on these landmark references or other address query data, the geo-coding manager 109 processes one or more geo-spatial documents to resolve the query over a geographical area. The geo-coding manager 109 aggregates one or more terms into the one or more geo-spatial documents based, at least in part, on the location information. The one or more terms are determined from a collection of point-of-interest information from one or more sources, and describe location information of one or more points of interest, the one or more sources, or a combination thereof.
Referring back to the address of a cafe shop named Barista in New Delhi, India: Barista, 3 C's Cinema Road, Near Alankar theatre, Lajpat Nagar II New Delhi, India, there are several peculiarities of this address: (1) There is no street number associated with this address. (2) The street is given as 3C's Cinema Road, which is officially named Feroze Gandhi Marg; 3C's Cinema Road is the colloquially understood street name, after a cinema theatre called 3C's. (3) The address includes a landmark reference, Alankar theatre. (4) The address includes the area name Lajpat Nagar II, which is also an unofficial reference. In short, the same address can have different variations based on people's perception. By way of example, the above address could also be written as: Barista, Feroze Gandhi Marg, Near 3C's Cinema, Lajpat Nagar II, New Delhi, India.
By way of example, the communication network 105 of system 100 includes one or more networks such as a data network (not shown), a wireless network (not shown), a telephony network (not shown), or any combination thereof. It is contemplated that the data network may be any local area network (LAN), metropolitan area network (MAN), wide area network (WAN), a public data network (e.g., the Internet), short range wireless network, or any other suitable packet-switched network, such as a commercially owned, proprietary packet-switched network, e.g., a proprietary cable or fiber-optic network, and the like, or any combination thereof. In addition, the wireless network may be, for example, a cellular network and may employ various technologies including enhanced data rates for global evolution (EDGE), general packet radio service (GPRS), global system for mobile communications (GSM), Internet protocol multimedia subsystem (IMS), universal mobile telecommunications system (UMTS), etc., as well as any other suitable wireless medium, e.g., worldwide interoperability for microwave access (WiMAX), Long Term Evolution (LTE) networks, code division multiple access (CDMA), wideband code division multiple access (WCDMA), wireless fidelity (WiFi), wireless LAN (WLAN), Bluetooth®, Internet Protocol (IP) data casting, satellite, mobile ad-hoc network (MANET), and the like, or any combination thereof.
The UE 101 is any type of mobile terminal, fixed terminal, or portable terminal including a mobile handset, station, unit, device, multimedia computer, multimedia tablet, Internet node, communicator, desktop computer, laptop computer, notebook computer, netbook computer, tablet computer, personal communication system (PCS) device, personal navigation device, personal digital assistants (PDAs), audio/video player, digital camera/camcorder, positioning device, television receiver, radio broadcast receiver, electronic book device, game device, or any combination thereof, including the accessories and peripherals of these devices, or any combination thereof. It is also contemplated that the UE 101 can support any type of interface to the user (such as "wearable" circuitry, etc.).
By way of example, the UE 101, the map platform 103, the service platform 113, and the content providers 117a-117k communicate with each other and other components of the communication network 105 using well known, new or still developing protocols. In this context, a protocol includes a set of rules defining how the network nodes within the communication network 105 interact with each other based on information sent over the communication links. The protocols are effective at different layers of operation within each node, from generating and receiving physical signals of various types, to selecting a link for transferring those signals, to the format of information indicated by those signals, to identifying which software application executing on a computer system sends or receives the information. The conceptually different layers of protocols for exchanging information over a network are described in the Open Systems Interconnection (OSI) Reference Model.
Communications between the network nodes are typically effected by exchanging discrete packets of data. Each packet typically comprises (1) header information associated with a particular protocol, and (2) payload information that follows the header information and contains information that may be processed independently of that particular protocol. In some protocols, the packet includes (3) trailer information following the payload and indicating the end of the payload information. The header includes information such as the source of the packet, its destination, the length of the payload, and other properties used by the protocol. Often, the data in the payload for the particular protocol includes a header and payload for a different protocol associated with a different, higher layer of the OSI Reference Model. The header for a particular protocol typically indicates a type for the next protocol contained in its payload. The higher layer protocol is said to be encapsulated in the lower layer protocol. The headers included in a packet traversing multiple heterogeneous networks, such as the Internet, typically include a physical (layer 1) header, a data-link (layer 2) header, an internetwork (layer 3) header and a transport (layer 4) header, and various application headers (layer 5, layer 6 and layer 7) as defined by the OSI Reference Model.
In one embodiment, the functions of the geo-coding manager 109 are distributed among the UE 101, the map platform 103, and/or the service platform 113 according to a client-server model. According to the client-server model, a client process sends a message including a request to a server process, and the server process responds by providing a service (e.g., geo-coding messaging, advertisements, etc.). The server process may also return a message with a response to the client process.
Often the client process and server process execute on different computer devices, called hosts, and communicate via a network using one or more protocols for network communications. The term "server" is conventionally used to refer to the process that provides the service, or the host computer on which the process operates. Similarly, the term "client" is conventionally used to refer to the process that makes the request, or the host computer on which the process operates. As used herein, the terms "client" and "server" refer to the processes, rather than the host computers, unless otherwise clear from the context. In addition, the process performed by a server can be broken up to run as multiple processes on multiple hosts (sometimes called tiers) for reasons that include reliability, scalability, and redundancy, among others.
In one embodiment, a user device client sends POI data and runs various location-based applications, and a server supports geo-coding queries and location-based services. The user device client presents a form in which users enter POI data. This data is then sent to the server along with the client's location information, such as GPS, CBS, and Cell ID. The underlying communication channel can be SMS, MMS, or GPRS, depending on availability. The user device client can also run LBS applications, such as local search, navigation, locate-me, etc. By way of example, the user device client has a locate-me application, and the user enters the approximate street address and sends a request to the server. The server would then send the GPS coordinates to be overlaid on the client's map interface.
The server is connected to a database and is responsible for gathering and processing POI data. The server also supports the geo-coding service and various location-based services, such as local search, navigation, etc. The server runs a search engine implementation. Appropriate indexes can be constructed in the search engine to speed up the search process. Whenever user queries arrive (queries containing address texts), the server processes these queries, searches through the various documents, and returns the appropriate result to the UE 101. For example, if the server supports a "local search" service, then after performing the geo-coding task the server can look for POIs around the geo-coded data. The results are returned to the UE 101 via an SMS or GPRS interface.
FIG. 2 is a diagram of the components of a geo-coding manager, according to one embodiment. By way of example, the geo-coding manager 109 includes one or more components for resolving geo-identification of un-structured and/or colloquial location information. It is contemplated that the functions of these components may be combined in one or more components or performed by other components of equivalent functionality. In this embodiment, the geo-coding manager 109 includes control logic 201, a memory 203, a collection module 205, a verification module 207, a correction module 209, a mapping module 211, and a clustering module 213.
The control logic 201 oversees tasks, including tasks performed by the collection module 205, the verification module 207, the correction module 209, the mapping module 211, and the clustering module 213. For example, although the other modules may perform the actual task, the control logic 201 may determine when and how those tasks are performed or otherwise direct the other modules to perform the task.
The collection module 205 collects POI (including landmark, etc.) data for approximate geo-coding. It is easy to collect POI data from commercial sources, but it is challenging to aggregate information on local listings and POIs, since the information is known only to local people. In India, for example, there are several popular local mom-and-pop stores, such as tea shops, grocery stores, gas stations, etc., that may not be on any local listings or yellow pages entries yet can serve as useful points of interest.
A crowd-sourcing technique can be used for collecting POI data, including local business and landmark information, and grouping them according to area names. The collection module 205 interacts with a user device client, through which small business owners, individuals, etc. enter their local business information along with their address and contact information. Although local addresses may not have any inherent structure associated therewith, the collection module 205 enforces a predetermined structural scheme and standard to collect data. In one embodiment, the user device client has configured fields, such as building name, street name, landmark, area name, etc. These structural labels of text snippets in collected addresses prompt the user to include street name, landmark, and area name information, so the address entered carries information associated with the structural scheme. By automatically labeling different strings into fields (such as street names, area names, etc.), parsing addresses becomes easier on the server side. String comparison between any two addresses also becomes easier, since the comparison can be made field by field. This feature is useful for identifying typos and spelling mistakes in user entries.
In another embodiment, the user device client also captures "location context," such as GPS coordinates, cell tower ID (Cell ID), and the cell broadcast (CBS) message received by the UE 101. CBS messages are part of the GSM specification (e.g., GSM 03.41, 3GPP TS 23.041, etc.) where the cellular provider broadcasts certain types of messages in a given area. This channel is mainly used for broadcasting emergency messages. Other types of broadcasted messages include traffic, advertisement, area code, area name, etc. CBS messages are periodically sent by a base station to the UE 101. By way of example, the broadcasting period ranges from 1.83 seconds to 60 seconds. By capturing these messages, the collection module 205 collects the approximate location information (e.g., area name) of the UE 101. The captured information is then transmitted to a central server either via SMS or GPRS. Table 2 shows a sample data snippet collected by the UE 101.
POI Name: Khana Khazana
Building Number/Name: 147/11, Museum Inn
Street Name: Museum Road
Landmark(s): Above Ruby Tuesday
Area name: Church Street
City: Bengalaru
State: Karnataka
Pin Code: 560 001
Phone number: 9821032295
GPS coordinates: (lat: 12.9777002335, long: 77.6293029785)
Cell ID: 4801
CBS (area name): Church Street
Date: 02/16/2011
Time: 4:00pm
Table 2
Moreover, the collection module 205 may provide incentives to users for sending their information. For example, business owners can be rewarded with advertisement benefits for sending POI information to the system 100. An end user can also collect points or earn monetary awards for submitting POI data. The collection module 205 captures POI and landmark data that has no official listing. The collected data captures various colloquial references and jargon, and provides insights into how people perceive addresses. With the huge penetration of mobile devices, this collection module 205 can be scaled to collect data of different cities.
The verification module 207 verifies the authenticity and correctness of the data collected by the collection module 205. The data needs to be verified on at least two levels: the correctness of the data and the location associated with the data. In another embodiment, the verification module 207 corrects spelling mistakes, typos, and abbreviations associated therewith to improve data quality.
The verification module 207 verifies whether the address entered by the user is indeed a valid address via one or more of the following verification schemes, depending upon a cost-benefit analysis. In one embodiment, for each POI data entry, the verification module 207 makes an actual call to the business owner to manually verify the submitted information. In another embodiment, the verification module 207 sends SMS messages to the business owners, asking them to verify the submitted information.
In yet another embodiment, the verification module 207 deploys one or more incentive mechanisms to award higher points and credibility to the users who frequently submit correct data. In yet another embodiment, the verification module 207 assigns a high correctness probability for the data coming from such users.
The verification module 207 also verifies the UE 101 is actually present at the submitted address via an approximate location verification method. By way of example, the address entered via the UE 101 is "Apana Bazzar, Marathahalli Main Road, Marathahalli". This address has the area name "Marathahalli".
The verification module 207 compares this area name with the area name captured by the CBS. If the two match, the verification module 207 determines that the address entered at the UE 101 is the actual physical location of the UE 101. The verification module 207 can also construct a connectivity graph of the area names captured by CBS messages. In one embodiment, for an area name "x", the verification module 207 locates other areas that are physically near "x" and constructs a neighborhood graph. The verification module 207 consults this graph to see if the UE 101 is in the neighborhood of the address entered. By way of example, when Marathahalli is close to "Indira Nagar" and the captured CBS contains the name "Indira Nagar," the verification module 207 estimates that the UE 101 is within the neighborhood of the entered address, and verifies the location associated with the data.
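The area-name check and the neighborhood graph can be sketched as follows. The graph is built from pairs of area names assumed to be physically near each other; the single pair used here mirrors the Marathahalli / Indira Nagar example above and is not a statement about actual geography. Function names and the three result labels are hypothetical.

```python
def _norm(name):
    """Case-insensitive, whitespace-insensitive area name."""
    return " ".join(name.casefold().split())


def build_graph(near_pairs):
    """Undirected neighborhood graph from pairs of nearby area names."""
    graph = {}
    for a, b in near_pairs:
        graph.setdefault(_norm(a), set()).add(_norm(b))
        graph.setdefault(_norm(b), set()).add(_norm(a))
    return graph


def verify_location(entered_area, cbs_area, graph):
    """Compare the area name in the entered address with the CBS area name.

    Returns "match" if they are the same area, "neighbor" if the graph
    lists them as adjacent, and "mismatch" otherwise.
    """
    entered, broadcast = _norm(entered_area), _norm(cbs_area)
    if entered == broadcast:
        return "match"
    if broadcast in graph.get(entered, set()):
        return "neighbor"
    return "mismatch"


# Assumed adjacency, taken from the example in the text.
graph = build_graph([("Marathahalli", "Indira Nagar")])

print(verify_location("Marathahalli", "marathahalli", graph))
print(verify_location("Marathahalli", "Indira Nagar", graph))
print(verify_location("Marathahalli", "Church Street", graph))
```

A "neighbor" result supports only an approximate verification: it indicates the UE 101 was plausibly near the entered address, not at it.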
The correction module 209 checks and corrects spelling and typing errors. In one embodiment, the correction module 209 enables the user device client with an auto-complete feature to minimize typos and spelling mistakes.
When the addresses contain proper names, the auto-complete feature is rarely utilized. The correction module 209 leverages the crowd-sourced data and string matching algorithms on the server side to identify typos and understand variations in spellings. By way of example, "Vijay Nagar" in Bangalore is also spelled as "Vijayam Nagar" based on the local accents. If the system receives "x" entries with "Vijay Nagar" and "y" entries with "Vijayam Nagar," then based on the GPS coordinates and string matching closeness, the correction module 209 associates the two spellings and reasons that "Vijay Nagar" and "Vijayam Nagar" refer to the same area name. Further, the popularity of the names can be identified based on the frequency of their occurrence. When "x" > "y", the more popular name is "Vijay Nagar". The correction module 209 matches entered strings and their geo-coordinates to detect typing errors, such as typing "Vijay Nagar" as "Vijy Nagar".
The mapping module 211 geo-annotates data via various mapping functions. In one embodiment, the mapping module 211 maps address data to geo-coordinates. This mapping is achieved by capturing GPS data along with the addresses. In another embodiment, the mapping module 211 maps Cell ID to geo-coordinates. After capturing the Cell ID of the UE 101 and its geo-coordinates, the mapping module 211 associates Cell IDs with these geo-coordinates, which are a valuable data source for various LBS. In yet another embodiment, the mapping module 211 maps area names to geo-coordinates. Since both the addresses and the CBS messages have area names, the mapping module 211 extracts the geo-coordinates and area names and forms this mapping. This mapping is useful for local search services.
The clustering module 213 aggregates and clusters the geo-tags (GPS coordinates) associated with every POI data and its related address text. An example is provided with respect to a city in conjunction with FIG. 3.
Similar concepts can be expanded for other location areas, such as state, country, etc.
FIG. 3A is a flowchart of a process for resolving geo-identification of un-structured and/or colloquial location information, according to one embodiment. In one embodiment, the geo-coding manager 109 performs the process 300 and is implemented in, for instance, a chip set including a processor and a memory as shown in FIG. 8. As such, the control logic 201 can provide means for accomplishing various parts of the process 300 as well as means for accomplishing other processes in conjunction with other components of the geo-coding manager 109. The functions of the geo-coding manager 109 may be distributed among the UE 101, the map platform 103, and/or the service platform 113 according to a client-server model.
In step 301, the geo-coding manager 109 causes, at least in part, a collection of point-of-interest information from one or more sources. The sources may include observations, people, speeches, documents, pictures, organizations, entities, libraries, databases, websites, custom POI sources (e.g., archaeological sites, traffic/safety cameras, etc.), commercial POI sources (e.g., travel, sports, etc.), etc.
In one embodiment, the collection is based, at least in part, on one or more crowd-sourcing mechanisms. By way of example, the geo-coding manager 109 interacts with an online document and extracts POI information via artificial intelligence or natural language processing (NLP), or by interpreting metadata of the document.
In another embodiment, the geo-coding manager 109 interacts with UE 101 of a small business owner, individual, etc. to receive local point of interest information (e.g., cinema theatres, bar, restaurant, hotel, etc.) including address, contact information, etc.
In step 303, the geo-coding manager 109 receives the point-of-interest information as structured data including one or more fields for inputting, processing, storing, or a combination thereof. The fields may include POI name, nick name, building name, building number, street name, street number, landmark, area name, city, state, zip code, phone number, etc.
In one embodiment, such fields presented on a user interface force the user to enter structured information, rather than just "Tom's gardening tool store next to the Best Supermarket." The one or more fields relate, at least in part, to one or more names of the points of interest (e.g., Tom's Green Thumb), one or more addresses of the points of interest (e.g., a street name, etc.), one or more landmarks associated with the points of interest (e.g., Best Supermarket), or a combination thereof.
In step 305, the geo-coding manager 109 processes and/or facilitates a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest (e.g., a T-shirt stand 50 feet away from a theme park entrance), the one or more sources (e.g., a commercial database, a restaurant owner Bob, etc.), or a combination thereof. The terms may include text, words, names, user entries, etc. By way of example, a user entry may be "a bicycle rental shop at the west side of the public beach shower room", "next to the post office", "Ugly Pencil" (nick name for the Washington Monument), etc.
In step 307, the geo-coding manager 109 causes, at least in part, a verification of the terms, the point-of-interest information, or a combination thereof. There are several verification methods, such as using location context, user confirmation, synonyms, auto-completion, etc.
In one embodiment, the geo-coding manager 109 causes, at least in part, a verification of the terms, the point-of-interest information, based, at least in part, on one or more confirmation replies associated with the one or more sources, the one or more devices, or a combination thereof.
In one embodiment, the geo-coding manager 109 determines frequency information of the one or more terms in the point-of-interest information, among the one or more sources, or a combination thereof. The geo-coding manager 109 processes and/or facilitates a processing of the frequency information to determine one or more synonyms among the one or more terms, to resolve inconsistencies among the one or more terms, to correct errors in the one or more terms, or a combination thereof.
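One way to picture the use of frequency information is sketched below: spellings are visited from most to least frequent, and a less frequent spelling is mapped to a more frequent one when the two strings are similar and their entries lie close together. The similarity measure (`difflib.SequenceMatcher`), the thresholds, and the coordinates are all assumptions chosen for illustration, not values from the described system.

```python
import math
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def haversine_km(p, q):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def canonical_names(entries, min_similarity=0.8, max_km=3.0):
    """Map each spelling to the most frequent similar spelling nearby.

    entries: iterable of (area_name, lat, lon) tuples.
    """
    entries = list(entries)
    counts = Counter(name for name, _, _ in entries)
    points = defaultdict(list)
    for name, lat, lon in entries:
        points[name].append((lat, lon))
    centroid = {
        name: (sum(p[0] for p in ps) / len(ps), sum(p[1] for p in ps) / len(ps))
        for name, ps in points.items()
    }
    heads, mapping = [], {}
    for name, _ in counts.most_common():  # most frequent spellings first
        for head in heads:
            similar = SequenceMatcher(None, name.lower(), head.lower()).ratio() >= min_similarity
            near = haversine_km(centroid[name], centroid[head]) <= max_km
            if similar and near:
                mapping[name] = head
                break
        else:
            heads.append(name)
            mapping[name] = name
    return mapping


# Made-up entries: three spellings in one place, one unrelated area elsewhere.
entries = [
    ("Vijay Nagar", 12.9719, 77.5327), ("Vijay Nagar", 12.9722, 77.5331),
    ("Vijay Nagar", 12.9717, 77.5324), ("Vijayam Nagar", 12.9720, 77.5329),
    ("Vijayam Nagar", 12.9723, 77.5326), ("Vijy Nagar", 12.9718, 77.5330),
    ("Indira Nagar", 12.9719, 77.6412),
]
mapping = canonical_names(entries)
print(mapping)
```

Requiring both string similarity and geographic proximity keeps two genuinely different areas with similar names from being merged.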
In yet another embodiment, the geo-coding manager 109 determines one or more location contexts of one or more terminals at least substantially concurrently with the collection of the point-of-interest information, and causes, at least in part, an association of the one or more location contexts with the point-of-interest information. The one or more location contexts are determined based, at least in part, on one or more communications cell tower identifiers (e.g., Cell ID), one or more communications cell broadcast (CBS) message identifiers, one or more location sensor (e.g., GPS) coordinates, or a combination thereof.
The geo-coding manager 109 then causes, at least in part, a verification of the terms, the point-of-interest information, or a combination thereof based, at least in part, on comparison against the one or more location contexts.
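As a minimal sketch of one possible comparison against a location context, the following assumes that a submitted POI carries claimed coordinates and that the location context supplies sensor (e.g., GPS) coordinates recorded at collection time. The great-circle (haversine) distance test, the 1 km tolerance, and the function names are assumptions for illustration; the embodiments do not prescribe a particular comparison.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def is_consistent(poi_coord, context_coord, max_km=1.0):
    """Hypothetical check: accept the POI if it lies within max_km of the context."""
    return haversine_km(*poi_coord, *context_coord) <= max_km
```

A context derived from a cell tower identifier rather than GPS would typically need a much larger tolerance, since a cell can cover several kilometres.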
In yet another embodiment, the geo-coding manager 109 processes and/or facilitates a processing of the one or more terms to determine address information, one or more area names (e.g., old town Alexandria, etc.) of the one or more geographical areas (e.g., Alexandria in Virginia), or a combination thereof. Area names are comprehensible (e.g., "Pennsylvania Avenue," "Georgetown," "Vijay Nagar," etc.) to the general population. The geo-coding manager 109 processes and/or facilitates a processing of the one or more location contexts to determine one or more geographical coordinates. The geo-coding manager 109 causes, at least in part, a mapping among the address information, the one or more area names, the one or more geographical coordinates, the one or more location contexts, or a combination thereof.
In yet another embodiment, the geo-coding manager 109 determines reliability information of the one or more sources based, at least in part, on the verification. The geo-coding manager 109 determines respective weights of the one or more terms, the point-of-interest information, or a combination thereof based, at least in part, on the reliability information. The one or more geo-spatial documents, the resolution of the one or more queries, or a combination thereof are based, at least in part, on the respective weights.
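One way to picture the relationship between verification, source reliability, and term weights is sketched below. It assumes, purely for illustration, that reliability is the fraction of a source's submissions that passed verification and that a term's weight is the sum of the reliabilities of the sources reporting it; both formulas and the function names are hypothetical.

```python
from collections import defaultdict


def source_reliability(verifications):
    """verifications: iterable of (source, passed) pairs -> {source: fraction passed}."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for source, ok in verifications:
        total[source] += 1
        passed[source] += 1 if ok else 0
    return {source: passed[source] / total[source] for source in total}


def term_weights(reports, reliability):
    """reports: iterable of (source, term) pairs -> {term: summed source reliability}."""
    weights = defaultdict(float)
    for source, term in reports:
        # Sources with no verification history contribute nothing in this sketch.
        weights[term] += reliability.get(source, 0.0)
    return dict(weights)


reliability = source_reliability(
    [("bob", True), ("bob", True), ("database", True), ("database", False)]
)
weights = term_weights(
    [("bob", "Alankar theatre"), ("database", "Alankar theatre"), ("database", "Alankar cinema")],
    reliability,
)
```

Here the term reported by both sources ends up with a weight of 1.5, against 0.5 for the term reported only by the less reliable source.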
The mathematical expression of the geo-Id resolution, $\mathrm{Res}(\mathcal{A}, Q)$, is as follows: given a corpus of address texts of POIs $\mathcal{A} = \{A : A \text{ is a POI address}\}$, such that $\forall A \in \mathcal{A}$ the geo-spatial expanse of $A$ is known as $(lat_A, lon_A)$, and a query address $Q$, find the addresses that best determine the geo-spatial expanse of $Q$. The subset of addresses in the corpus that best determine the geo-spatial expanse of $Q$ includes those that have the same lexical structure as that of $Q$. In other words, the process of geo-identity resolution is implicitly that of assessing the similarity between the query address document and the geo-spatially annotated address documents in the corpus. However, we have already seen that a query $Q$ might actually be best represented as a linear combination of more than one geo-physically proximate address, as compared to any individual address. It is therefore important to define a document in this setting which accounts for this property.
In step 309, the geo-coding manager 109 causes, at least in part, an aggregation of the one or more terms into one or more geo-spatial documents based, at least in part, on the location information. The one or more geo-spatial documents are associated with one or more geographical areas. The geographic areas can be defined multi-dimensionally, via the parameters of latitude, longitude, elevation, time, etc.
When the geographic areas are defined two-dimensionally, by way of example, a document Xj is associated with one POI address: Feroze Shopping centre, Alankar Marg, Near McDonalds, Lajpat Nagar III, New Delhi, while a document Xi is associated with three POI addresses including Superfast Food, 3 C's Cinema Road, Lajpat Nagar II, New Delhi.
When the geographic areas are defined four-dimensionally, in one embodiment, each set of POI information is associated with POI address information and a time stamp. By way of example, the North Korean President made a secret trip to China. In this case, his travel schedule, as reported via different news media, was not structured in terms of either locations or times. The geo-coding manager 109 collects POI information via various sources with the above-discussed approach to generate documents. A document Tj is associated with one or more Chinese military facilities and time windows, while a document Ti is associated with the Chinese President's residence for dinner.
The geographical area can be set at any granularity, shape, and form, depending on the availability of POI data, the location context, user preferences, etc. In one embodiment, the area is defined by existing geographic boundaries (e.g., a solar system, planet, continent, country, province, city, town, community, street, floor/unit/room within a building, etc.).
In another embodiment, the one or more geographical areas, one or more portions of the one or more geographical areas, or a combination thereof are specified as one or more arbitrary geographic boundaries, such as cells. By way of example, the Sahara Desert (3,630,000 sq mi) covers most of Northern Africa, and is almost as large as Europe or the United States. However, since it has a very limited number of POIs (e.g., oases), an area can be set as 100 mi * 100 mi grid cells. On the other hand, Macau is the most crowded city in the world (over 19,400 population per sq km) and has many tiny shops in alleys. There, an area can be set as 5 m * 5 m grid cells.
Taking the city of New Delhi as another example, the city contains approximately 12,000 POIs spread non-uniformly across 140 area names. The areas can vary significantly in size, from 1 sq. km to 10 sq. km. When combined with street names (i.e., street name-area name combinations from the corpus), the sizes of the areas reduce to 1.8 and 3.6 sq. km respectively, which are relatively coarse-grained for geo-identity resolution. An example of a cell is described in conjunction with FIG. 4. Examples of documents are shown in FIG. 5.
In step 311, the geo-coding manager 109 processes and/or facilitates a processing of one or more queries to determine one or more query terms. For example, the query can be "I'm standing on a street with two movie theaters and one Italian restaurant in New Delhi. Where is the closest bus station?" In one embodiment, the geo-coding manager 109 retrieves GPS data of the UE 101 to decide where the UE 101 is. In another embodiment, when the GPS signal is not available, the geo-coding manager 109 extracts terms from the query, such as "New Delhi," "two movie theaters," and "one Italian restaurant," etc. If the user can read and type the local language, the geo-coding manager 109 can extract the exact theater names and/or restaurant names to accelerate processing. In step 313, the geo-coding manager 109 causes, at least in part, a probabilistic matching of the one or more query terms against the one or more geo-spatial documents to resolve the query over the geographical areas. By way of example, five streets in New Delhi have "two movie theaters" and "one Italian restaurant," two other streets have "two movie theaters" and "two Italian restaurants," and another street has "two movie theaters," "one Italian restaurant," and "an ice cream shop."
The geo-coding manager 109 may apply algorithms to decide which street is the most likely one based on the entry, and optionally, other context and/or preference information of the user. The context associated with a person may be a birthday, health, moods, clothes, preferences, etc. of the person. The context associated with an event may be a time, location, equipment, materials, etc. of the event. The context associated with a point of interest may be weather conditions, traffic, environment, atmosphere, etc. at the point of interest. By way of example, the geo-coding manager 109 analyzes the user's calendar or travel plan to rank the candidate streets and/or select one among the candidate streets for the user. The geo-coding manager 109 may present a ranking list, and/or render the streets on a map differently for the user. By way of example, the presentations may comprise one or more messages, items (e.g., data files, applications, games, point of interest information), media objects (e.g., graphics, images, videos, sounds, songs), or a combination thereof. It is contemplated that the presentation may include any other form of information or communication to convey location information to a user. Examples of the user interfaces are shown in FIG. 6.
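The street-ranking example above can be illustrated with a small sketch. The scoring rule used here, an inverse of the count mismatch between the query and each street that is then normalized into probabilities, is an assumption chosen for simplicity; the embodiments leave the matching algorithm open, and the street names and function names are hypothetical.

```python
def mismatch(query_counts, street_counts):
    """L1 distance between the POI-category counts of the query and of a street."""
    keys = set(query_counts) | set(street_counts)
    return sum(abs(query_counts.get(k, 0) - street_counts.get(k, 0)) for k in keys)


def rank_streets(query_counts, streets):
    """Return [(street, probability)] sorted from most to least likely."""
    raw = {
        name: 1.0 / (1.0 + mismatch(query_counts, counts))
        for name, counts in streets.items()
    }
    total = sum(raw.values())
    return sorted(
        ((name, value / total) for name, value in raw.items()),
        key=lambda item: (-item[1], item[0]),
    )


query = {"movie theater": 2, "italian restaurant": 1}
streets = {
    "street A": {"movie theater": 2, "italian restaurant": 1},
    "street B": {"movie theater": 2, "italian restaurant": 2},
    "street C": {"movie theater": 2, "italian restaurant": 1, "ice cream shop": 1},
}
ranking = rank_streets(query, streets)
```

Streets that match the query exactly receive the same top probability, which is where additional context such as the user's calendar or travel plan could be used to break ties.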
In one embodiment, the geo-coding manager 109 may receive a request specifying a location-based service for UE 101, and then causes, at least in part, rendering of the location-based service to UE 101 based, at least in part, on the most probable street, a relevant cell, or a combination thereof. By way of example, the geo-coding manager 109 renders the route from the current location on the most probable street to the closest bus station. In another embodiment, the geo-coding manager 109 updates the one or more geo-spatial documents with additional associated location information per cell. By way of example, the geo-coding manager 109 updates the correlation between the user's current location and the bus station in one or more of the documents. The geo-coding manager 109 causes, at least in part, transmission of the one or more geo-spatial documents, the updated one or more geo-spatial documents, or a combination thereof, to an information store, an information space, a cloud, or a combination thereof. The documents become available to the user for the next visit. The information can also be made available to the public once the user's information is removed. The information may be available to service providers, network operators, software developers, and advertisers to use, if the user agrees.
FIG. 3B is a flowchart of a process for resolving geo-identification of un-structured and/or colloquial location information, according to one embodiment. In one embodiment, the geo-coding manager 109 residing in the UE 101 performs the process 320 and is implemented in, for instance, a chip set including a processor and a memory as shown in FIG. 8. As such, the control logic 201 can provide means for accomplishing various parts of the process 320 as well as means for accomplishing other processes in conjunction with other components of the geo-coding manager 109.
In step 321, the geo-coding manager 109 causes, at least in part, transmission of a query (e.g., a T-shirt stand 50 feet away from a theme park entrance) to a geo-coding service. In step 323, the geo-coding manager 109 receives a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names (e.g., "Pennsylvania Avenue," "Georgetown," "Vijay Nagar," "Williamsburg, Virginia," etc.) of the one or more geographical areas, or a combination thereof. The one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query. The one or more terms describe location information of one or more points of interest (e.g., T-shirt stand, Super Water Park, cinema theatres, bar, restaurant, hotel, etc.) located within the one or more geographical areas (e.g., Williamsburg, Virginia). The one or more geographical areas, one or more portions of the one or more geographical areas, or a combination thereof are specified as one or more cells. The query contains one or more synonyms (e.g., Colonial Williamsburg) of the one or more terms. The one or more terms are collected from one or more sources based, at least in part, on one or more crowd-sourcing mechanisms. The one or more geographical areas are defined multi-dimensionally. In step 325, the geo-coding manager 109 causes, at least in part, representation at a user interface of the one or more geographical areas, one or more points of interest associated with the one or more geographical areas, address information associated with the one or more geographical areas, or a combination thereof. Examples of the user interfaces are shown in FIGs. 6A-6D.
In another embodiment, the geo-coding manager 109 causes, at least in part, transmission of point-of-interest information to the geo-coding service, the point-of-interest information being structured with one or more fields for inputting, processing, storing, or a combination thereof the one or more terms. The one or more fields relate, at least in part, to the one or more names of the points of interest (e.g., the Mall in Washington DC), one or more addresses of the points of interest, one or more landmarks (e.g., the White House, Capitol Hill, etc.) associated with the points of interest, or a combination thereof.
Thereafter, the geo-coding manager 109 receives credit (e.g., service credits, coupons, gifts, etc.) from the geo-coding service based, at least in part, on the transmission of point-of-interest information, reliability of the point-of-interest information, or a combination thereof. The reliability of the point-of-interest information is verified, based, at least in part, on comparison against one or more location contexts (e.g., date, time, location, current activity, weather, a history of activities, etc.) associated with the point-of-interest information. The one or more location contexts are determined based, at least in part, on one or more communications cell tower identifiers, one or more communications cell broadcast message identifiers, one or more location sensor coordinates, or a combination thereof.
In another embodiment, the geo-coding manager 109 receives a confirmation request from the geo-coding service for verifying the point-of-interest information, the one or more terms, or a combination thereof. Thereafter, the geo-coding manager 109 causes, at least in part, transmission of a confirmation reply to the geo-coding service.
In yet another embodiment, the geo-coding manager 109 causes, at least in part, spell-checking, auto-completing, or a combination thereof, when receiving the point-of-interest information at a user interface.
FIG. 4 is a location area diagram mapped with points of interest and utilized in the process of FIG. 3, according to one embodiment. Address texts, when seen in isolation, can show significant lexical dissimilarity, even for POIs which are in geo-physical vicinity of each other. For example, two POIs located in close proximity may have different street names in them (one official and another colloquial). In order to accommodate such variations, the geo-coding manager 109 overlays a logical square grid over a city (e.g., New Delhi, India), dividing it into a plurality of grid cells. Each cell corresponds to one document that contains POI information collected via crowd sourcing. Every collected POI address has a GPS coordinate associated therewith. Therefore, the geo-coding manager 109 maps each POI to a unique grid cell in the city grid, based on its latitude-longitude coordinates, to generate geo-spatial documents. The content of each geo-spatial document is a collection of labeled strings contributed by each of the POIs that map thereto. Each geo-spatial document X is essentially a text document. By construction, the geophysical expanse of X is bounded by the four edges of the unit cell representing it in the city grid. The city in turn is a collection of discrete documents that contain all possible strings (names of buildings, streets and areas) that appear in the addresses belonging to that city.
When a POI falls on a boundary of two or more cells, ties can be resolved arbitrarily, or the POI can be shared by those cells. For each grid cell i, the geo-coding manager 109 constructs a geo-spatial document Xi that aggregates information of all the POIs assigned to cell i. Every document has various attributes, such as street name, area name, landmark, city, etc. By way of example, the street and area name attributes are a union of the corresponding attributes of each POI therein, whereas the landmark attribute is a union of all POI names, building names, etc. that figure in the POIs. Since the geo-coding manager 109 enforces structured fields for users to enter POI data, parsing the addresses from different fields becomes easy.
The size of each grid cell (e.g., height and width parameters, such as 100 meter * 100 meter) can be varied as part of system tuning, depending upon the availability of POI data, the location context, user preferences, etc. The geo-coding manager 109 then assigns each POI entry to one grid cell in the city based on its latitude, longitude, etc.
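The grid overlay and the aggregation of POIs into per-cell documents can be sketched as follows. This is a minimal illustration under stated assumptions: the grid origin, the flat-earth conversion of roughly 111,320 meters per degree of latitude, the dictionary layout of a POI record, and the function names `cell_index` and `build_documents` are all hypothetical. A POI lying exactly on a boundary is assigned to a single cell by the floor operation, which corresponds to resolving ties arbitrarily.

```python
import math
from collections import defaultdict


def cell_index(lat, lon, origin_lat, origin_lon, cell_m=100.0):
    """Return the (row, col) of the square grid cell containing (lat, lon)."""
    m_per_deg_lat = 111_320.0  # approximate metres per degree of latitude
    m_per_deg_lon = m_per_deg_lat * math.cos(math.radians(origin_lat))
    row = math.floor((lat - origin_lat) * m_per_deg_lat / cell_m)
    col = math.floor((lon - origin_lon) * m_per_deg_lon / cell_m)
    return row, col


def build_documents(pois, origin, cell_m=100.0):
    """Aggregate POI fields into one geo-spatial document per grid cell."""
    docs = defaultdict(lambda: {"street": set(), "area": set(), "landmark": set()})
    for poi in pois:
        key = cell_index(poi["lat"], poi["lon"], origin[0], origin[1], cell_m)
        docs[key]["street"].add(poi["street"])
        docs[key]["area"].add(poi["area"])
        docs[key]["landmark"].add(poi["name"])  # POI names double as landmarks
    return dict(docs)


origin = (28.56, 77.23)  # hypothetical south-west corner of the city grid
pois = [
    {"name": "Superfast Food", "street": "Feroze Gandhi Marg", "area": "Lajpat Nagar II",
     "lat": 28.5605, "lon": 77.2305},
    {"name": "Barista", "street": "Cinema Road", "area": "Lajpat Nagar II",
     "lat": 28.5606, "lon": 77.2306},
    {"name": "Feroze Shopping centre", "street": "Alankar Marg", "area": "Lajpat Nagar III",
     "lat": 28.5700, "lon": 77.2400},
]
documents = build_documents(pois, origin)
```

Changing `cell_m` is the tuning knob referred to above: smaller cells give finer resolution but fewer POIs per document.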
The problem of geo-identity resolution of address texts thus is converted into a search problem as follows: Let $X_C = \{X : X \text{ is a geo-spatial document in city } C\}$. Given a query address $Q$, find the geo-spatial document $X^* \in X_C$ that best matches the query. The geo-identity of the query $Q$ is then resolved to within the grid cell corresponding to the geo-spatial document $X^*$. To make precise the concept of a best matching document $X^* \in X_C$ for a given query $Q$, we make use of a search engine. Mathematically, a search engine can be thought of as a function $F : (X, Q) \to \mathbb{R}^{+}$, where $X$ is a document and $Q$ is a query. The scoring algorithm of a search engine provides a ranking of the documents in the corpus for a given query. In practice, a subset $X'' \subset X_C$ of geo-spatial documents that attain competitive scores for a given query may be found. In such cases, depending upon the distribution of the scores, we compute the smallest contiguous sub-region formed by adjacent grid cells.
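The search formulation can be made concrete with a toy scoring function. In the sketch below, the score is simply the number of query terms found in a document, which stands in for whatever scoring algorithm a real search engine applies; the `competitive` ratio used to select near-best documents and the bounding box used as a simplified substitute for the smallest contiguous sub-region are likewise assumptions for illustration.

```python
def score(doc_terms, query_terms):
    """Toy F(X, Q): number of query terms present in the document."""
    return float(len(set(doc_terms) & set(query_terms)))


def resolve(docs, query_terms, competitive=0.9):
    """Return (best cell, sorted cells whose score is within the competitive ratio)."""
    if not docs:
        return None, []
    scores = {cell: score(terms, query_terms) for cell, terms in docs.items()}
    # Iterating over sorted keys makes the choice among equal scores deterministic.
    best_cell = max(sorted(scores), key=scores.get)
    best = scores[best_cell]
    if best == 0.0:
        return None, []
    close = sorted(c for c, s in scores.items() if s >= competitive * best)
    return best_cell, close


def bounding_cells(cells):
    """Bounding box ((min_row, min_col), (max_row, max_col)) of a set of cells."""
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return (min(rows), min(cols)), (max(rows), max(cols))


docs = {
    (0, 0): {"superfast food", "alankar theatre", "lajpat nagar ii"},
    (0, 1): {"superfast food", "alankar theatre"},
    (5, 5): {"connaught place"},
}
query = {"superfast food", "alankar theatre", "lajpat nagar ii"}
best_cell, close_cells = resolve(docs, query, competitive=0.6)
region = bounding_cells(close_cells)
```

With a looser `competitive` ratio, adjacent cells with nearly equal scores are returned together, and the query is resolved to the region they span rather than to a single cell.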
FIGs. 5A-5B show examples of two geo-spatial documents utilized in the process of FIG. 3, according to various embodiments. The geo-coding manager 109 further assigns a weight function to each landmark L, area name A, and street name S in Xi. This function can be a frequency count (such as number of POIs that reference L, A, or S). By this grouping and scoring mechanism, the geo-coding manager 109 establishes the credibility of the landmarks, area names, and street names in each geo-spatial document and builds a ranking order based on these weights. For example, when a large number of POIs belonging to a particular grid cell refer to "Alankar theatre" as their landmark, the credibility of this landmark being present at that grid location is high and its weight is high as well. The geo-coding manager 109 also uses this scheme to identify most commonly used colloquial names and abbreviations.
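A frequency-count weight function of the kind described above can be sketched in a few lines. The record layout, in which each POI lists the landmarks it references, and the function name `landmark_weights` are assumptions for this illustration.

```python
from collections import Counter


def landmark_weights(pois):
    """Rank landmarks in a cell by the number of POIs that reference them."""
    counts = Counter()
    for poi in pois:
        # A set ensures each POI votes at most once per landmark.
        counts.update(set(poi["landmarks"]))
    # Highest weight first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


cell_pois = [
    {"name": "Superfast Food", "landmarks": ["Alankar theatre"]},
    {"name": "Barista", "landmarks": ["Alankar theatre", "McDonalds"]},
    {"name": "Italian Pizza", "landmarks": ["Alankar theatre"]},
]
ranked_landmarks = landmark_weights(cell_pois)
```

The same counting applies unchanged to area names and street names, and the most frequent variants are natural candidates for commonly used colloquial names.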
The problem of geo-identity resolution of address texts is converted into a search problem over these geo-spatial documents. In other words, given a query address Q and a set of geo-spatial documents X, the geo-coding manager 109 determines the geo-spatial document Xi that best matches the query. The geo-coding manager 109 uses a scoring algorithm to provide a ranking of the documents in the corpus for the query. The parameters of the search can be adjusted such that it ranks string matches based on string proximity, the weight associated with every landmark, or a combination thereof. Considering the two exemplary documents Xi and Xj shown in FIGs. 5A-5B, addresses 1, 2, and 3 are close to each other and belong to the same grid cell (e.g., Lajpat Nagar II), so they are all combined in Xi, while address 4 is in Lajpat Nagar III and belongs to Xj.
When receiving a query of "Superfast Food, near Alankar theatre, Feroze Gandhi Marg, Lajpat Nagar II", the geo-coding manager 109 matches strings in all documents. Here the two documents Xi and Xj are used as examples. The geo-coding manager 109 finds more matches in document Xi as compared to document Xj and hence returns the geo-coordinates associated with Xi as the response to the query. The bold face words in FIG. 5A highlight four matches, "Feroze Gandhi Marg," "Alankar theatre," "Superfast Food" and "Lajpat Nagar II," for the input address "Superfast Food, near Alankar theatre, Feroze Gandhi Marg, Lajpat Nagar II," while the bold face words in FIG. 5B highlight three matches, "Alankar theatre," "Superfast Food" and "Lajpat Nagar," for the input address. "Lajpat Nagar II" can be partially matched with "Lajpat Nagar" in FIG. 5B. The geo-spatial clustering scheme therefore allows the geo-coding manager 109 to capture the variance in addresses.
In one embodiment, the comparison is made via parsing address parameters, such as street name, area name, etc., for the user query, which consumes more resources while obtaining better accuracy. In another embodiment, the string matching is done without parsing the user query into structured address parameters, to spare the trouble of identifying semantics and field information from the user query. The geo-coordinates of the grid cell corresponding to the document Xi are then returned as the response to the query. In one embodiment, the geo-coding manager 109 uses the geo-coordinates of the center of the grid cell as the response. In another embodiment, the geo-coding manager 109 averages the geo-coordinates of the landmarks (e.g., Barista, Italian Pizza, Superfast Food) in the grid cell as the response.
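The unparsed string matching and the two ways of forming the response can be sketched together. The document contents below are hypothetical and chosen only so that the match counts agree with those described for FIGs. 5A-5B; the substring-based partial match, the stripping of a leading "near", and all function names are assumptions of this sketch.

```python
def query_terms(text):
    """Split a free-form address on commas without assigning field semantics."""
    terms = []
    for part in text.split(","):
        term = part.strip().lower()
        if term.startswith("near "):
            term = term[len("near "):]
        if term:
            terms.append(term)
    return terms


def count_matches(query, doc_terms):
    """Count query terms that fully or partially match some document term."""
    doc = [t.lower() for t in doc_terms]
    return sum(any(q in d or d in q for d in doc) for q in query_terms(query))


def cell_center(south, west, north, east):
    """Response option 1: the center of the matched grid cell."""
    return (south + north) / 2, (west + east) / 2


def landmark_centroid(coords):
    """Response option 2: the average of the landmark coordinates in the cell."""
    lats, lons = zip(*coords)
    return sum(lats) / len(lats), sum(lons) / len(lons)


query = "Superfast Food, near Alankar theatre, Feroze Gandhi Marg, Lajpat Nagar II"
doc_xi = ["Superfast Food", "Alankar theatre", "Feroze Gandhi Marg",
          "Lajpat Nagar II", "Barista", "Italian Pizza"]
doc_xj = ["Superfast Food", "Alankar theatre", "Lajpat Nagar", "McDonalds"]
matches_xi = count_matches(query, doc_xi)
matches_xj = count_matches(query, doc_xj)
```

Document Xi wins with four matches against three for Xj, so its cell would be reported using either `cell_center` or `landmark_centroid`. Averaging raw latitudes and longitudes is reasonable only because a grid cell is small.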
FIGs. 6A-6D are diagrams of user interfaces utilized in the process of FIG. 3, according to various embodiments. FIG. 6A shows user interfaces 601, 603, 605, 607, 609, 611 that may be utilized to collect POI information to update geo-spatial documents. In one example, the user may have recently opened a hotel in Cubon Park and may want to advertise this information. Cubon Park may be an area name associated with a CBS message identifier. Alternatively or additionally, Cubon Park may be a modified area name associated with the CBS message identifier (e.g., the CBS message identifier area name may include "Cubon Pk.," which translates into Cubon Park). Creating a website or providing advertisement in the paper may be expensive options for the user. Further, many people in emerging economies are less Internet savvy and do not use the Internet for searching for things, meaning that there would be less advertisement exposure. The user may utilize the geo-coding manager 109 of the UE 101 to update POI information associated with the hotel. As such, the user need only bear the cost of transmission (e.g., an SMS or MMS). In another embodiment, the system 100 may charge the user a listing fee, a subscription fee, etc.
According to FIG. 6A, the user may select a region for a POI at user interface 601. Then, the user may be prompted, at user interface 603, to select a category (e.g., accommodation) for the POI. The user then selects the subcategory of hotel at user interface 605. At this point, the UE 101 knows that the POI is a hotel in Cubon Park. Next at user interface 607, the user enters POI information associated with the hotel (e.g., name, nick name, address, etc.). Other possible fields may include road, street, marg, chowk, gali, avenue, enclave, etc.
Further, the user may enter additional information, such as comments, pictures, etc. Moreover, the user may be prompted to select other POIs or landmarks nearby the POI to provide more refined grouping information about which POIs are nearby which POIs and landmarks. Further, the user may provide vocal or speech input to the UE 101 that may be converted to text using a speech-to-text mechanism to submit the information. This information may be transmitted to the map platform 103, the service platform 113, an information store 119, etc. for updating one or more geo-spatial documents with the POI information. Then, the user may receive a message from the platforms 103, 113 and/or information stores 119, presented on the user interface 609 as a successful registration notification. Alternatively, the user interface 611 may present an unsuccessful registration notification. Upon successful registration, when another UE 101 queries the platforms and/or information stores for information about the area of Cubon Park, the new POI information may be received and utilized by the UE 101 of another user. In certain embodiments, when portions of the interface are highlighted and/or selected (e.g., Cubon Park), audio associated with the portion (e.g., a name) may be presented to the user. In this manner, a user who cannot or prefers not to use text-based interaction (e.g., an illiterate user or a user who cannot view text because of environmental conditions) may be able to navigate the user interface to add POI information. Moreover, other users (e.g., a local person, a tourist, etc.) interested in a POI may update the POI information. For example, if the user utilizes a location based service on the user's UE 101 and notices that no POI is mentioned, the user may use the process of FIG. 3 to update one or more geo-spatial documents of the platforms and/or information stores. Further, users may additionally add ratings for the POIs, which may be utilized to search for POIs.
Additionally, the user may be provided with incentives from the platforms and/or information stores to provide updates. These incentives may include monetary gain, credits for services from the platforms and/or information store, credits for sending messages, etc. or a combination thereof.
FIG. 6B shows user interfaces 621, 623, 625, 627, 629, 631 that may be utilized to provide information to a user of the UE 101 about POIs. In this case, the user may have a UE 101 that is not capable of receiving GPS signals or using GPRS connectivity, such that the user cannot use sensor data to locate the geographic position directly. In one embodiment, the user is new to a city (e.g., a tourist), has seen a Superfast Food nearby, and would like to know where the user is. In another embodiment, the user would like to find a nearby Superfast Food.
In certain embodiments, the user may enter text describing the restaurant or may select an option to retrieve information about the restaurant using menus. In one embodiment, the user enters city, area, and POI information by selecting the items in the interfaces 621, 623, 625 respectively. In another embodiment, the user inputs the following in the interface 627: "Superfast food, near Alankar theatre, Feroze Gandhi Marg, Lajpat Nagar II." The geo-coding manager 109 is then initiated to execute the process in FIG. 3 to find where the user or a nearby Superfast Food is, based on the input information. Further, the geo-coding manager 109 retrieves one or more geo-spatial documents from its own database or from a local information store 119. The geo-coding manager 109 then determines a most likely area or a list of nearby areas to allow the user to select these areas as the current location or the location of a nearby Superfast Food. When the user selects "the most likely area" in the user interface 629, the geo-coding manager 109 displays the address information of the current location or the location of a nearby Superfast Food in the user interface 631. The most likely area is an area associated with a cell of the geo-spatial document corresponding to the current location or the location of a nearby Superfast Food.
When the user selects "an area list" in the user interface 629, the geo-coding manager 109 displays a list of areas associated with the current location or the location of a nearby Superfast Food according to their ranks.
When the user selects "map with POI" in the user interface 629, the geo-coding manager 109 displays points of interest as in FIG. 6C, according to one embodiment. Points of interest can be displayed to the user in a location panel, map, etc. In FIG. 6C, points of interest including a restaurant 641 (i.e., the current location of UE 101), a bus stop 643, a train station 645, a bank 647, and a movie theater 649 were extracted for a user visiting New Delhi.
When the user selects "map with areas" in the user interface 629, the geo-coding manager 109 displays relevant geographic areas as in FIG. 6D, according to one embodiment. FIG. 6D shows geographic areas 661, 663, 665 including the icons of the points of interest. Each geographic area corresponds to a geo-spatial document. In addition, a route 667 from the location to the bus stop is displayed to the user.
Further, additional information such as specialties of the restaurant or user ratings and/or reviews may be presented on the user interface if the user selects to view additional information. As noted above, the user interfaces may be presented to the user via a vocal interface (e.g., using text-to-speech and speech-to-text means).
If the user wants a narrower search, the user can select a zoom function. This function would then extract the landmarks or POIs in the current area or another selected area. The user can select the landmark or POI that the user is close to. For instance, if the user knows that the user is close to a cinema theatre "A", the user could select that POI and the geo-coding manager 109 can perform a refined query to find all the restaurants near cinema theatre "A". In this scenario, each POI may include in its POI information specific POIs and/or landmarks that the POI is near. Thus, a more refined search can be provided by grouping certain POIs together within the areas of the geo-spatial documents. Further, if the search provides inadequate results, the geo-coding manager 109 may request updates from the platforms or the information stores or may broaden the search area.
The above-discussed embodiments leverage a crowd sourcing mechanism for collecting local business, point-of-interest (POI), and landmark data, and then geo-code unstructured address data. The geo-coding involves clustering geo-spatially annotated address texts to form a corpus of geo-spatial documents. The above-discussed embodiments perform geo-coding as a "search" over the knowledge corpus of the geo-spatial documents. The above-discussed embodiments thus exploit landmark references, understand colloquial jargon and handle unstructured address formats. The above-discussed embodiments work on unstructured and landmark based location information commonly found in emerging markets. With the above approaches, users of UEs 101 are provided with location based services based on POI information associated with areas of the geo-spatial documents. In this manner, the UE 101 need not use power consuming GPS location determination technology to receive the location based services, thus saving power and extending battery life in a mobile UE 101.
Moreover, because the geo-spatial documents may be local to the UE 101, the UE 101 need not use GPRS services to receive the location based services. As such, the UE 101 need not have or utilize the capabilities of GPS or GPRS to provide the location information. Further, the geo-spatial documents need not utilize mapping information, thus the UE 101 can save memory resources while providing the location information without loading map images. Additionally, because the search experience adheres to current practices, there is little change in user behavior to utilize the geo-coding manager 109.
In embodiments where GPRS services, SMS or MMS or other wireless network connectivity is available, the geo-spatial documents and location based services may be provided by the platforms and/or information stores. In this way, the processing and resource burden associated with providing such location information and/or location based services can be shifted from the UE 101 to the platforms and/or information stores, thereby reducing processing power and memory resources used at the UE 101 to support the location information and/or location based services.
The processes described herein for resolving geo-identification of un-structured and/or colloquial location information may be advantageously implemented via software, hardware, firmware or a combination of software and/or firmware and/or hardware. For example, the processes described herein may be advantageously implemented via processor(s), a Digital Signal Processing (DSP) chip, an Application Specific Integrated Circuit (ASIC), Field Programmable Gate Arrays (FPGAs), etc. Such exemplary hardware for performing the described functions is detailed below.
FIG. 7 illustrates a computer system 700 upon which an embodiment of the invention may be implemented. Although computer system 700 is depicted with respect to a particular device or equipment, it is contemplated that other devices or equipment (e.g., network elements, servers, etc.) within FIG. 7 can deploy the illustrated hardware and components of system 700. Computer system 700 is programmed (e.g., via computer program code or instructions) to resolve geo-identification of un-structured and/or colloquial location information as described herein and includes a communication mechanism such as a bus 710 for passing information between other internal and external components of the computer system 700.
Information (also called data) is represented as a physical expression of a measurable phenomenon, typically electric voltages, but including, in other embodiments, such phenomena as magnetic, electromagnetic, pressure, chemical, biological, molecular, atomic, sub-atomic and quantum interactions. For example, north and south magnetic fields, or a zero and non-zero electric voltage, represent two states (0, 1) of a binary digit (bit). Other phenomena can represent digits of a higher base. A superposition of multiple simultaneous quantum states before measurement represents a quantum bit (qubit). A sequence of one or more digits constitutes digital data that is used to represent a number or code for a character. In some embodiments, information called analog data is represented by a near continuum of measurable values within a particular range. Computer system 700, or a portion thereof, constitutes a means for performing one or more steps of resolving geo-identification of un-structured and/or colloquial location information.
A bus 710 includes one or more parallel conductors of information so that information is transferred quickly among devices coupled to the bus 710. One or more processors 702 for processing information are coupled with the bus 710. A processor (or multiple processors) 702 performs a set of operations on information as specified by computer program code related to resolving geo-identification of un-structured and/or colloquial location information. The computer program code is a set of instructions or statements providing instructions for the operation of the processor and/or the computer system to perform specified functions. The code, for example, may be written in a computer programming language that is compiled into a native instruction set of the processor. The code may also be written directly using the native instruction set (e.g., machine language). The set of operations includes bringing information in from the bus 710 and placing information on the bus 710. The set of operations also typically includes comparing two or more units of information, shifting positions of units of information, and combining two or more units of information, such as by addition or multiplication or logical operations like OR, exclusive OR (XOR), and AND. Each operation of the set of operations that can be performed by the processor is represented to the processor by information called instructions, such as an operation code of one or more digits. A sequence of operations to be executed by the processor 702, such as a sequence of operation codes, constitutes processor instructions, also called computer system instructions or, simply, computer instructions. Processors may be implemented as mechanical, electrical, magnetic, optical, chemical or quantum components, among others, alone or in combination.
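For illustration only, the kinds of operations listed above (comparing, shifting, and combining units of information) can be shown on two small binary values. The operand values and the dictionary name `results` are arbitrary choices made for this sketch and do not correspond to any instruction set of processor 702.

```python
# Two 4-bit units of information, chosen arbitrarily for this sketch.
a = 0b1100  # decimal 12
b = 0b1010  # decimal 10

results = {
    "compare": a > b,      # comparing two units of information
    "shift_left": a << 1,  # shifting the bit positions of a unit by one place
    "add": a + b,          # combining by addition
    "multiply": a * b,     # combining by multiplication
    "or": a | b,           # bitwise OR
    "xor": a ^ b,          # bitwise exclusive OR
    "and": a & b,          # bitwise AND
}

for name, value in results.items():
    print(name, value)
```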
Computer system 700 also includes a memory 704 coupled to bus 710. The memory 704, such as a random access memory (RAM) or any other dynamic storage device, stores information including processor instructions for resolving geo-identification of un-structured and/or colloquial location information. Dynamic memory allows information stored therein to be changed by the computer system 700. RAM allows a unit of information stored at a location called a memory address to be stored and retrieved independently of information at neighboring addresses. The memory 704 is also used by the processor 702 to store temporary values during execution of processor instructions. The computer system 700 also includes a read only memory (ROM) 706 or any other static storage device coupled to the bus 710 for storing static information, including instructions, that is not changed by the computer system 700. Some memory is composed of volatile storage that loses the information stored thereon when power is lost. Also coupled to bus 710 is a non-volatile (persistent) storage device 708, such as a magnetic disk, optical disk or flash card, for storing information, including instructions, that persists even when the computer system 700 is turned off or otherwise loses power.
Information, including instructions for resolving geo-identification of un-structured and/or colloquial location information, is provided to the bus 710 for use by the processor from an external input device 712, such as a keyboard containing alphanumeric keys operated by a human user, or a sensor. A sensor detects conditions in its vicinity and transforms those detections into physical expression compatible with the measurable phenomenon used to represent information in computer system 700. Other external devices coupled to bus 710, used primarily for interacting with humans, include a display device 714, such as a cathode ray tube (CRT), a liquid crystal display (LCD), a light emitting diode (LED) display, an organic LED (OLED) display, a plasma screen, or a printer for presenting text or images, and a pointing device 716, such as a mouse, a trackball, cursor direction keys, or a motion sensor, for controlling a position of a small cursor image presented on the display 714 and issuing commands associated with graphical elements presented on the display 714. In some embodiments, for example, in embodiments in which the computer system 700 performs all functions automatically without human input, one or more of external input device 712, display device 714 and pointing device 716 is omitted.
In the illustrated embodiment, special purpose hardware, such as an application specific integrated circuit (ASIC) 720, is coupled to bus 710. The special purpose hardware is configured to perform operations not performed by processor 702 quickly enough for special purposes. Examples of ASICs include graphics accelerator cards for generating images for display 714, cryptographic boards for encrypting and decrypting messages sent over a network, speech recognition, and interfaces to special external devices, such as robotic arms and medical scanning equipment that repeatedly perform some complex sequence of operations that are more efficiently implemented in hardware.
Computer system 700 also includes one or more instances of a communications interface 770 coupled to bus 710. Communication interface 770 provides a one-way or two-way communication coupling to a variety of external devices that operate with their own processors, such as printers, scanners and external disks. In general the coupling is with a network link 778 that is connected to a local network 780 to which a variety of external devices with their own processors are connected. For example, communication interface 770 may be a parallel port or a serial port or a universal serial bus (USB) port on a personal computer. In some embodiments, communications interface 770 is an integrated services digital network (ISDN) card or a digital subscriber line (DSL) card or a telephone modem that provides an information communication connection to a corresponding type of telephone line. In some embodiments, a communication interface 770 is a cable modem that converts signals on bus 710 into signals for a communication connection over a coaxial cable or into optical signals for a communication connection over a fiber optic cable. As another example, communications interface 770 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN, such as Ethernet. Wireless links may also be implemented. For wireless links, the communications interface 770 sends or receives or both sends and receives electrical, acoustic or electromagnetic signals, including infrared and optical signals, that carry information streams, such as digital data. For example, in wireless handheld devices, such as mobile telephones like cell phones, the communications interface 770 includes a radio band electromagnetic transmitter and receiver called a radio transceiver. 
In certain embodiments, the communications interface 770 enables connection from the UE 101 to the communication network 105 for resolving geo-identification of un-structured and/or colloquial location information.
The term "computer-readable medium" as used herein refers to any medium that participates in providing information to processor 702, including instructions for execution. Such a medium may take many forms, including, but not limited to computer-readable storage medium (e.g., non-volatile media, volatile media), and transmission media. Non-transitory media, such as non-volatile media, include, for example, optical or magnetic disks, such as storage device 708. Volatile media include, for example, dynamic memory 704. Transmission media include, for example, twisted pair cables, coaxial cables, copper wire, fiber optic cables, and carrier waves that travel through space without wires or cables, such as acoustic waves and electromagnetic waves, including radio, optical and infrared waves. Signals include man-made transient variations in amplitude, frequency, phase, polarization or other physical properties transmitted through the transmission media. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, CDRW, DVD, any other optical medium, punch cards, paper tape, optical mark sheets, any other physical medium with patterns of holes or other optically recognizable indicia, a RAM, a PROM, an EPROM, a FLASH-EPROM, an EEPROM, a flash memory, any other memory chip or cartridge, a carrier wave, or any other medium from which a computer can read. The term computer-readable storage medium is used herein to refer to any computer-readable medium except transmission media. Logic encoded in one or more tangible media includes one or both of processor instructions on a computer-readable storage medium and special purpose hardware, such as ASIC 720.
Network link 778 typically provides information communication using transmission media through one or more networks to other devices that use or process the information. For example, network link 778 may provide a connection through local network 780 to a host computer 782 or to equipment 784 operated by an Internet Service Provider (ISP). ISP equipment 784 in turn provides data communication services through the public, world-wide packet-switching communication network of networks now commonly referred to as the Internet 790.
A computer called a server host 792 connected to the Internet hosts a process that provides a service in response to information received over the Internet. For example, server host 792 hosts a process that provides information representing video data for presentation at display 714. It is contemplated that the components of system 700 can be deployed in various configurations within other computer systems, e.g., host 782 and server 792.
At least some embodiments of the invention are related to the use of computer system 700 for implementing some or all of the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 700 in response to processor 702 executing one or more sequences of one or more processor instructions contained in memory 704. Such instructions, also called computer instructions, software and program code, may be read into memory 704 from another computer-readable medium such as storage device 708 or network link 778. Execution of the sequences of instructions contained in memory 704 causes processor 702 to perform one or more of the method steps described herein. In alternative embodiments, hardware, such as ASIC 720, may be used in place of or in combination with software to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware and software, unless otherwise explicitly stated herein.
The signals transmitted over network link 778 and other networks through communications interface 770, carry information to and from computer system 700. Computer system 700 can send and receive information, including program code, through the networks 780, 790 among others, through network link 778 and communications interface 770. In an example using the Internet 790, a server host 792 transmits program code for a particular application, requested by a message sent from computer 700, through Internet 790, ISP equipment 784, local network 780 and communications interface 770. The received code may be executed by processor 702 as it is received, or may be stored in memory 704 or in storage device 708 or any other non-volatile storage for later execution, or both. In this manner, computer system 700 may obtain application program code in the form of signals on a carrier wave.
Various forms of computer readable media may be involved in carrying one or more sequence of instructions or data or both to processor 702 for execution. For example, instructions and data may initially be carried on a magnetic disk of a remote computer such as host 782. The remote computer loads the instructions and data into its dynamic memory and sends the instructions and data over a telephone line using a modem. A modem local to the computer system 700 receives the instructions and data on a telephone line and uses an infra-red transmitter to convert the instructions and data to a signal on an infra-red carrier wave serving as the network link 778. An infrared detector serving as communications interface 770 receives the instructions and data carried in the infrared signal and places information representing the instructions and data onto bus 710. Bus 710 carries the information to memory 704 from which processor 702 retrieves and executes the instructions using some of the data sent with the instructions. The instructions and data received in memory 704 may optionally be stored on storage device 708, either before or after execution by the processor 702.
FIG. 8 illustrates a chip set or chip 800 upon which an embodiment of the invention may be implemented. Chip set 800 is programmed to resolve geo-identification of un-structured and/or colloquial location information as described herein and includes, for instance, the processor and memory components described with respect to FIG. 7 incorporated in one or more physical packages (e.g., chips). By way of example, a physical package includes an arrangement of one or more materials, components, and/or wires on a structural assembly (e.g., a baseboard) to provide one or more characteristics such as physical strength, conservation of size, and/or limitation of electrical interaction. It is contemplated that in certain embodiments the chip set 800 can be implemented in a single chip. It is further contemplated that in certain embodiments the chip set or chip 800 can be implemented as a single "system on a chip." It is further contemplated that in certain embodiments a separate ASIC would not be used, for example, and that all relevant functions as disclosed herein would be performed by a processor or processors. Chip set or chip 800, or a portion thereof, constitutes a means for performing one or more steps of providing user interface navigation information associated with the availability of functions. Chip set or chip 800, or a portion thereof, constitutes a means for performing one or more steps of resolving geo-identification of un-structured and/or colloquial location information.
In one embodiment, the chip set or chip 800 includes a communication mechanism such as a bus 801 for passing information among the components of the chip set 800. A processor 803 has connectivity to the bus 801 to execute instructions and process information stored in, for example, a memory 805. The processor 803 may include one or more processing cores with each core configured to perform independently. A multi-core processor enables multiprocessing within a single physical package. Examples of a multi-core processor include two, four, eight, or greater numbers of processing cores. Alternatively or in addition, the processor 803 may include one or more microprocessors configured in tandem via the bus 801 to enable independent execution of instructions, pipelining, and multithreading. The processor 803 may also be accompanied with one or more specialized components to perform certain processing functions and tasks such as one or more digital signal processors (DSP) 807, or one or more application-specific integrated circuits (ASIC) 809. A DSP 807 typically is configured to process real-world signals (e.g., sound) in real time independently of the processor 803. Similarly, an ASIC 809 can be configured to perform specialized functions not easily performed by a more general purpose processor. Other specialized components to aid in performing the inventive functions described herein may include one or more field programmable gate arrays (FPGA) (not shown), one or more controllers (not shown), or one or more other special-purpose computer chips.
In one embodiment, the chip set or chip 800 includes merely one or more processors and some software and/or firmware supporting and/or relating to and/or for the one or more processors.
The processor 803 and accompanying components have connectivity to the memory 805 via the bus 801. The memory 805 includes both dynamic memory (e.g., RAM, magnetic disk, writable optical disk, etc.) and static memory (e.g., ROM, CD-ROM, etc.) for storing executable instructions that when executed perform the inventive steps described herein to resolve geo-identification of un-structured and/or colloquial location information. The memory 805 also stores the data associated with or generated by the execution of the inventive steps.
FIG. 9 is a diagram of exemplary components of a mobile terminal (e.g., handset) for communications, which is capable of operating in the system of FIG. 1, according to one embodiment. In some embodiments, mobile terminal 901, or a portion thereof, constitutes a means for performing one or more steps of resolving geo-identification of un-structured and/or colloquial location information. Generally, a radio receiver is often defined in terms of front-end and back-end characteristics. The front-end of the receiver encompasses all of the Radio Frequency (RF) circuitry whereas the back-end encompasses all of the base-band processing circuitry. As used in this application, the term "circuitry" refers to both: (1) hardware-only implementations (such as implementations in only analog and/or digital circuitry), and (2) to combinations of circuitry and software (and/or firmware) (such as, if applicable to the particular context, to a combination of processor(s), including digital signal processor(s), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions). This definition of "circuitry" applies to all uses of this term in this application, including in any claims. As a further example, as used in this application and if applicable to the particular context, the term "circuitry" would also cover an implementation of merely a processor (or multiple processors) and its (or their) accompanying software and/or firmware. The term "circuitry" would also cover if applicable to the particular context, for example, a baseband integrated circuit or applications processor integrated circuit in a mobile phone or a similar integrated circuit in a cellular network device or other network devices.
Pertinent internal components of the telephone include a Main Control Unit (MCU) 903, a Digital Signal Processor (DSP) 905, and a receiver/transmitter unit including a microphone gain control unit and a speaker gain control unit. A main display unit 907 provides a display to the user in support of various applications and mobile terminal functions that perform or support the steps of resolving geo-identification of un-structured and/or colloquial location information. The display 907 includes display circuitry configured to display at least a portion of a user interface of the mobile terminal (e.g., mobile telephone). Additionally, the display 907 and display circuitry are configured to facilitate user control of at least some functions of the mobile terminal. An audio function circuitry 909 includes a microphone 911 and microphone amplifier that amplifies the speech signal output from the microphone 911. The amplified speech signal output from the microphone 911 is fed to a coder/decoder (CODEC) 913.
A radio section 915 amplifies power and converts frequency in order to communicate with a base station, which is included in a mobile communication system, via antenna 917. The power amplifier (PA) 919 and the transmitter/modulation circuitry are operationally responsive to the MCU 903, with an output from the PA 919 coupled to the duplexer 921 or circulator or antenna switch, as known in the art. The PA 919 also couples to a battery interface and power control unit 920.
In use, a user of mobile terminal 901 speaks into the microphone 911 and his or her voice along with any detected background noise is converted into an analog voltage. The analog voltage is then converted into a digital signal through the Analog to Digital Converter (ADC) 923. The control unit 903 routes the digital signal into the DSP 905 for processing therein, such as speech encoding, channel encoding, encrypting, and interleaving. In one embodiment, the processed voice signals are encoded, by units not separately shown, using a cellular transmission protocol such as enhanced data rates for global evolution (EDGE), general packet radio service (GPRS), global system for mobile communications (GSM), Internet protocol multimedia subsystem (IMS), universal mobile telecommunications system (UMTS), etc., as well as any other suitable wireless medium, e.g., microwave access (WiMAX), Long Term Evolution (LTE) networks, code division multiple access (CDMA), wideband code division multiple access (WCDMA), wireless fidelity (WiFi), satellite, and the like, or any combination thereof.
The encoded signals are then routed to an equalizer 925 for compensation of any frequency-dependent impairments that occur during transmission through the air such as phase and amplitude distortion. After equalizing the bit stream, the modulator 927 combines the signal with a RF signal generated in the RF interface 929. The modulator 927 generates a sine wave by way of frequency or phase modulation. In order to prepare the signal for transmission, an up-converter 931 combines the sine wave output from the modulator 927 with another sine wave generated by a synthesizer 933 to achieve the desired frequency of transmission. The signal is then sent through a PA 919 to increase the signal to an appropriate power level. In practical systems, the PA 919 acts as a variable gain amplifier whose gain is controlled by the DSP 905 from information received from a network base station. The signal is then filtered within the duplexer 921 and optionally sent to an antenna coupler 935 to match impedances to provide maximum power transfer. Finally, the signal is transmitted via antenna 917 to a local base station. An automatic gain control (AGC) can be supplied to control the gain of the final stages of the receiver. The signals may be forwarded from there to a remote telephone which may be another cellular telephone, any other mobile phone or a land-line connected to a Public Switched Telephone Network (PSTN), or other telephony networks.
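As a hedged illustration of the up-conversion step described above, the following sketch assumes that the up-converter combines the two sine waves by multiplication, which the description does not specify. Under that assumption the product of two sinusoids equals half the sum of a sinusoid at the difference frequency and a sinusoid at the sum frequency, so one of the two can be selected as the transmission frequency. The function names and the frequency values are hypothetical.

```python
import math


def mixer_output(f_mod, f_synth, t):
    """Product of the modulator tone and the synthesizer tone at time t (seconds)."""
    return math.cos(2 * math.pi * f_mod * t) * math.cos(2 * math.pi * f_synth * t)


def sum_and_difference(f_mod, f_synth, t):
    """The same signal written as its difference- and sum-frequency components."""
    low = math.cos(2 * math.pi * (f_synth - f_mod) * t)
    high = math.cos(2 * math.pi * (f_synth + f_mod) * t)
    return 0.5 * (low + high)


if __name__ == "__main__":
    f_mod, f_synth = 1_000.0, 9_000.0  # Hz, illustrative values only
    for step in range(5):
        t = step * 1e-5
        print(t, mixer_output(f_mod, f_synth, t), sum_and_difference(f_mod, f_synth, t))
```

The two functions agree at every sample time, which is the product-to-sum trigonometric identity; any filtering used to keep only one of the two components is outside the scope of this sketch.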
Voice signals transmitted to the mobile terminal 901 are received via antenna 917 and immediately amplified by a low noise amplifier (LNA) 937. A down-converter 939 lowers the carrier frequency while the demodulator 941 strips away the RF leaving only a digital bit stream. The signal then goes through the equalizer 925 and is processed by the DSP 905. A Digital to Analog Converter (DAC) 943 converts the signal and the resulting output is transmitted to the user through the speaker 945, all under control of a Main Control Unit (MCU) 903 which can be implemented as a Central Processing Unit (CPU) (not shown).
The MCU 903 receives various signals including input signals from the keyboard 947. The keyboard 947 and/or the MCU 903 in combination with other user input components (e.g., the microphone 911) comprise a user interface circuitry for managing user input. The MCU 903 runs a user interface software to facilitate user control of at least some functions of the mobile terminal 901 to resolve geo-identification of un-structured and/or colloquial location information. The MCU 903 also delivers a display command and a switch command to the display 907 and to the speech output switching controller, respectively. Further, the MCU 903 exchanges information with the DSP 905 and can access an optionally incorporated SIM card 949 and a memory 951. In addition, the MCU 903 executes various control functions required of the terminal. The DSP 905 may, depending upon the implementation, perform any of a variety of conventional digital processing functions on the voice signals. Additionally, DSP 905 determines the background noise level of the local environment from the signals detected by microphone 911 and sets the gain of microphone 911 to a level selected to compensate for the natural tendency of the user of the mobile terminal 901.
The CODEC 913 includes the ADC 923 and DAC 943. The memory 951 stores various data including call incoming tone data and is capable of storing other data including music data received via, e.g., the global Internet. The software module could reside in RAM memory, flash memory, registers, or any other form of writable storage medium known in the art. The memory device 951 may be, but not limited to, a single memory, CD, DVD, ROM, RAM, EEPROM, optical storage, magnetic disk storage, flash memory storage, or any other nonvolatile storage medium capable of storing digital data. An optionally incorporated SIM card 949 carries, for instance, important information, such as the cellular phone number, the carrier supplying service, subscription details, and security information. The SIM card 949 serves primarily to identify the mobile terminal 901 on a radio network. The card 949 also contains a memory for storing a personal telephone number registry, text messages, and user specific mobile terminal settings.
While the invention has been described in connection with a number of embodiments and implementations, the invention is not so limited but covers various obvious modifications and equivalent arrangements, which fall within the purview of the appended claims. Although features of the invention are expressed in certain combinations among the claims, it is contemplated that these features can be arranged in any combination and order.

Claims

We Claim:
1. A method comprising:
causing, at least in part, a collection of point-of-interest information from one or more sources;
processing and/or facilitating a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof;
causing, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas; and
processing and/or facilitating a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
2. A method of claim 1, further comprising:
processing and/or facilitating a processing of the one or more queries to determine one or more query terms; and
causing, at least in part, a probabilistic matching of the one or more query terms against the one or more geo-spatial documents,
wherein the resolution of the one or more queries is based, at least in part, on the probabilistic matching.
3. A method according to any of claims 1 and 2, further comprising:
determining one or more location contexts of one or more terminals at least substantially concurrently with the collection of the point-of-interest information; and
causing, at least in part, an association of the one or more location contexts with the point-of-interest information.
4. A method of claim 3, further comprising:
causing, at least in part, a verification of the terms, the point-of-interest information, based, at least in part, on comparison against the one or more location contexts.
5. A method of claim 4, further comprising:
determining reliability information of the one or more sources based, at least in part, on the verification.
6. A method of claim 5, further comprising:
determining respective weights of the one or more terms, the point-of-interest information, or a combination thereof based, at least in part, on the reliability information,
wherein the one or more geo-spatial documents, the resolution of the one or more queries, or a combination thereof are based, at least in part, on the respective weights.
7. A method according to any of claims 3-6, wherein the one or more location contexts are determined based, at least in part, on one or more communications cell tower identifiers, one or more communications cell broadcast messages identifiers, one or more location sensor coordinates, or a combination thereof.
8. A method of claim 7, further comprising:
processing and/or facilitating a processing of the one or more terms to determine address information, one or more area names of the one or more geographical areas, or a combination thereof;
processing and/or facilitating a processing of the one or more location contexts to determine one or more geographical coordinates; and
causing, at least in part, a mapping among the address information, the one or more area names, the one or more geographical coordinates, the one or more location contexts, or a combination thereof.
9. A method according to any of claims 1-8, further comprising:
causing, at least in part, a verification of the terms, the point-of-interest information, based, at least in part, on one or more confirmation replies associated with the one or more sources, the one or more devices, or a combination thereof.
10. A method according to any of claims 1-9, further comprising:
receiving the point-of-interest information as structured data including one or more fields for inputting, processing, storing, or a combination thereof the one or more terms.
11. A method of claim 10, wherein the one or more fields relate, at least in part, to one or more names of the points of interest, one or more addresses of the points of interest, one or more landmarks associated with the points of interest, or a combination thereof.
12. A method according to any of claims 1-11, further comprising:
determining frequency information of the one or more terms in the point-of-interest information, among the one or more sources, or a combination thereof;
processing and/or facilitating a processing of the frequency information to determine one or more synonyms among the one or more terms, to resolve inconsistencies among the one or more terms, to correct errors in the one or more terms, or a combination thereof.
13. A method according to any of claims 1-12, wherein the one or more geographical areas, one or more portions of the one or more geographical areas, or a combination thereof are specified as one or more cells.
14. A method according to any of claims 1-13, wherein the collection is based, at least in part, on one or more crowd-sourcing mechanisms.
15. A method according to any of claims 1-14, wherein one or more geographical areas are defined multi-dimensionally.
16. An apparatus comprising:
at least one processor; and
at least one memory including computer program code,
the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following,
cause, at least in part, a collection of point-of-interest information from one or more sources;
process and/or facilitate a processing of the point-of-interest information to determine one or more terms that describe location information of one or more points of interest, the one or more sources, or a combination thereof;
cause, at least in part, an aggregation of one or more terms into one or more geo-spatial documents based, at least in part, on the location information, wherein the one or more geo-spatial documents are associated with one or more geographical areas; and
process and/or facilitate a processing of the one or more geo-spatial documents to cause, at least in part, a resolution of one or more queries over the one or more geographical areas.
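The aggregation step of claim 16 can be pictured as grouping the terms of each point-of-interest report into a bag of words for the geographical area the report falls in. The following is a minimal sketch under stated assumptions: areas are modelled as fixed-size latitude/longitude grid cells, the 0.01 degree cell size is arbitrary, and the report layout (`lat`, `lon`, `text`) and sample data are hypothetical.

```python
from collections import Counter, defaultdict

CELL_SIZE_DEG = 0.01  # assumed grid resolution; the claims do not specify one


def cell_id(lat, lon, size=CELL_SIZE_DEG):
    """Quantize a coordinate into a (row, column) grid cell identifier."""
    return (int(lat // size), int(lon // size))


def build_geo_documents(poi_reports):
    """Aggregate the terms of every report into the document of its cell."""
    documents = defaultdict(Counter)
    for report in poi_reports:
        cell = cell_id(report["lat"], report["lon"])
        documents[cell].update(report["text"].lower().split())
    return documents


# Hypothetical reports: the first two fall in the same cell.
reports = [
    {"lat": 12.9352, "lon": 77.6245, "text": "Forum Mall Koramangala"},
    {"lat": 12.9358, "lon": 77.6241, "text": "forum mall hosur road"},
    {"lat": 12.9716, "lon": 77.5946, "text": "Cubbon Park"},
]
docs = build_geo_documents(reports)
```

Each cell's `Counter` plays the role of a geo-spatial document: it records which terms describe the area and how often they were reported.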
17. An apparatus of claim 16, wherein the user interface is presented at a user device, and the apparatus is further caused to:
process and/or facilitate a processing of the one or more queries to determine one or more query terms; and
cause, at least in part, a probabilistic matching of the one or more query terms against the one or more geo-spatial documents,
wherein the resolution of the one or more queries is based, at least in part, on the probabilistic matching.
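The claims do not fix a particular probabilistic model. As one possible reading, the sketch below scores each geo-spatial document with a unigram language model and add-one smoothing, then ranks areas by the log-probability of the query terms. The model choice, the function names, and the sample documents are assumptions made for illustration.

```python
import math
from collections import Counter


def log_likelihood(query_terms, doc, vocab_size, alpha=1.0):
    """Log P(query | document) under a smoothed unigram model."""
    total = sum(doc.values())
    return sum(
        math.log((doc.get(term, 0) + alpha) / (total + alpha * vocab_size))
        for term in query_terms
    )


def resolve(query, documents):
    """Rank geographical areas by how well their documents explain the query."""
    terms = query.lower().split()
    vocab = set(terms)
    for doc in documents.values():
        vocab.update(doc)
    scored = [
        (area, log_likelihood(terms, doc, len(vocab)))
        for area, doc in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical geo-spatial documents keyed by area name.
documents = {
    "Koramangala": Counter({"forum": 2, "mall": 2, "koramangala": 1}),
    "MG Road": Counter({"cubbon": 1, "park": 1, "metro": 1}),
}
ranking = resolve("forum mall", documents)
```

Smoothing keeps an area with no matching terms from receiving a probability of zero, so a partially matching query can still be ranked rather than rejected.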
18. An apparatus according to any of claims 16 and 17, wherein the user interface is presented at a user device, and the apparatus is further caused to:
determine one or more location contexts of one or more terminals at least substantially concurrently with the collection of the point-of-interest information; and
cause, at least in part, an association of the one or more location contexts with the point-of-interest information.
19. An apparatus of claim 18, wherein the user interface is presented at a user device, and the apparatus is further caused to:
cause, at least in part, a verification of the terms, the point-of-interest information, based, at least in part, on comparison against the one or more location contexts.
20. An apparatus of claim 19, wherein the user interface is presented at a user device, and the apparatus is further caused to:
determine reliability information of the one or more sources based, at least in part, on the verification.
21. An apparatus of claim 20, wherein the user interface is presented at a user device, and the apparatus is further caused to:
determine respective weights of the one or more terms, the point-of-interest information, or a combination thereof based, at least in part, on the reliability information,
wherein the one or more geo-spatial documents, the resolution of the one or more queries, or a combination thereof are based, at least in part, on the respective weights.
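Claims 19 to 21 chain three steps: verify a report against the device's location context, derive source reliability from the verification outcomes, and weight terms by that reliability. The sketch below is one hypothetical way to connect them. The equality test used for verification, the Laplace-smoothed reliability estimate, and all field names and sample data are assumptions, not taken from the claims.

```python
from collections import defaultdict


def verify(report):
    """Treat a report as verified when the claimed cell matches the location context."""
    return report["claimed_cell"] == report["context_cell"]


def source_reliability(reports):
    """Smoothed fraction of verified reports per source (smoothing is an assumption)."""
    verified = defaultdict(int)
    total = defaultdict(int)
    for report in reports:
        total[report["source"]] += 1
        verified[report["source"]] += int(verify(report))
    return {s: (verified[s] + 1) / (total[s] + 2) for s in total}


def weighted_terms(reports, reliability):
    """Accumulate a weight per (cell, term) from the reliability of each reporter."""
    weights = defaultdict(float)
    for report in reports:
        for term in report["terms"]:
            weights[(report["claimed_cell"], term)] += reliability[report["source"]]
    return dict(weights)


# Hypothetical reports: bob's device was elsewhere when he reported cell c1.
reports = [
    {"source": "alice", "claimed_cell": "c1", "context_cell": "c1", "terms": ["forum", "mall"]},
    {"source": "alice", "claimed_cell": "c1", "context_cell": "c1", "terms": ["forum"]},
    {"source": "bob", "claimed_cell": "c1", "context_cell": "c9", "terms": ["airport"]},
]
reliability = source_reliability(reports)
weights = weighted_terms(reports, reliability)
```

Terms contributed by a source whose reports repeatedly match its location context end up with more weight in the cell's document than terms from a source that failed verification.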
22. An apparatus according to any of claims 18-21, wherein the one or more location contexts are determined based, at least in part, on one or more communications cell tower identifiers, one or more communications cell broadcast messages identifiers, one or more location sensor coordinates, or a combination thereof.
23. An apparatus of claim 22, wherein the user interface is presented at a user device, and the apparatus is further caused to:
process and/or facilitate a processing of the one or more terms to determine address information, one or more area names of the one or more geographical areas, or a combination thereof;
process and/or facilitate a processing of the one or more location contexts to determine one or more geographical coordinates; and
cause, at least in part, a mapping among the address information, the one or more area names, the one or more geographical coordinates, the one or more location contexts, or a combination thereof.
24. An apparatus according to any of claims 16-23, wherein the user interface is presented at a user device, and the apparatus is further caused to:
cause, at least in part, a verification of the terms, the point-of-interest information, based, at least in part, on one or more confirmation replies associated with the one or more sources, the one or more devices, or a combination thereof.
25. An apparatus according to any of claims 16-24, wherein the user interface is presented at a user device, and the apparatus is further caused to:
receive the point-of-interest information as structured data including one or more fields for inputting, processing, storing, or a combination thereof the one or more terms.
26. An apparatus of claim 25, wherein the one or more fields relate, at least in part, to one or more names of the points of interest, one or more addresses of the points of interest, one or more landmarks associated with the points of interest, or a combination thereof.
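The structured data of claims 25 and 26 could be modelled as a record with one field per kind of description. The sketch below is an illustrative assumption: the class name `PoiRecord`, the choice of exactly three fields, and the whitespace tokenization are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PoiRecord:
    """Point-of-interest information with the field kinds named in the claim."""
    name: str
    address: str = ""
    landmark: str = ""

    def terms(self):
        """Return (field, term) pairs so each term keeps the field it came from."""
        pairs = []
        for field_name in ("name", "address", "landmark"):
            for token in getattr(self, field_name).lower().split():
                pairs.append((field_name, token))
        return pairs


# Hypothetical record with no address supplied.
record = PoiRecord(name="Forum Mall", landmark="near Hosur Road")
record_terms = record.terms()
```

Keeping the originating field alongside each term lets later processing treat a name term differently from a landmark term if that proves useful.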
27. An apparatus according to any of claims 16-26, wherein the user interface is presented at a user device, and the apparatus is further caused to:
determine frequency information of the one or more terms in the point-of-interest information, among the one or more sources, or a combination thereof;
process and/or facilitate a processing of the frequency information to determine one or more synonyms among the one or more terms, to resolve inconsistencies among the one or more terms, to correct errors in the one or more terms, or a combination thereof.
28. An apparatus according to any of claims 16-27, wherein the one or more geographical areas, one or more portions of the one or more geographical areas, or a combination thereof are specified as one or more cells.
29. An apparatus according to any of claims 16-28, wherein the collection is based, at least in part, on one or more crowd-sourcing mechanisms.
30. An apparatus according to any of claims 16-29, wherein one or more geographical areas are defined multi-dimensionally.
31. An apparatus according to any of claims 16-30, wherein the apparatus is a mobile phone further comprising:
user interface circuitry and user interface software configured to facilitate user control of at least some functions of the mobile phone through use of a display and configured to respond to user input; and
a display and display circuitry configured to display at least a portion of a user interface of the mobile phone, the display and display circuitry configured to facilitate user control of at least some functions of the mobile phone.
32. A computer-readable storage medium carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause an apparatus to perform at least a method of any of claims 1-15.
33. An apparatus comprising means for performing a method of any of claims 1-15.
34. An apparatus of claim 33, wherein the apparatus is a mobile phone further comprising:
user interface circuitry and user interface software configured to facilitate user control of at least some functions of the mobile phone through use of a display and configured to respond to user input; and
a display and display circuitry configured to display at least a portion of a user interface of the mobile phone, the display and display circuitry configured to facilitate user control of at least some functions of the mobile phone.
35. A computer program product including one or more sequences of one or more instructions which, when executed by one or more processors, cause an apparatus to at least perform the steps of a method of any of claims 1-15.
36. A method comprising facilitating access to at least one interface configured to allow access to at least one service, the at least one service configured to perform a method of any of claims 1-15.
37. A method comprising facilitating a processing of and/or processing (1) data and/or (2) information and/or (3) at least one signal, the (1) data and/or (2) information and/or (3) at least one signal based, at least in part, on the method of any of claims 1-15.
38. A method comprising facilitating creating and/or facilitating modifying (1) at least one device user interface element and/or (2) at least one device user interface functionality, the (1) at least one device user interface element and/or (2) at least one device user interface functionality based, at least in part, on the method of any of claims 1-15.
39. A method comprising:
causing, at least in part, transmission of a query to a geo-coding service; and
receiving a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof,
wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
40. A method according to claim 39, wherein the one or more geographical areas, one or more portions of the one or more geographical areas, or a combination thereof are specified as one or more cells.
41. A method according to any of claims 39 and 40, wherein the query contains one or more synonyms of the one or more terms.
42. A method according to any of claims 39-41, wherein the one or more terms are collected from one or more sources based, at least in part, on one or more crowd-sourcing mechanisms.
43. A method according to any of claims 39-42, wherein the one or more geographical areas are defined multi-dimensionally.
44. A method according to any of claims 39-43, further comprising:
causing, at least in part, representation at a user interface the one or more geographical areas, one or more points of interests associated with the one or more geographical areas, address information associated with the one or more geographical areas, or a combination thereof.
45. A method according to any of claims 39-44, further comprising:
causing, at least in part, transmission of point-of-interest information to the geo-coding service, the point-of-interest information being structured with one or more fields for inputting, processing, storing, or a combination thereof the one or more terms.
46. A method of claim 45, wherein the one or more fields relate, at least in part, to the one or more names of the points of interest, one or more addresses of the points of interest, one or more landmarks associated with the points of interest, or a combination thereof.
47. A method according to any of claims 45 and 46, further comprising:
receiving credit from the geo-coding service based, at least in part, on the transmission of point-of-interest information, reliability of the point-of-interest information, or a combination thereof.
48. A method of claim 47, wherein the reliability of the point-of-interest information is verified, based, at least in part, on comparison against one or more location contexts associated with the point-of-interest information.
49. A method of claim 48, wherein the one or more location contexts are determined based, at least in part, on one or more communications cell tower identifiers, one or more communications cell broadcast messages identifiers, one or more location sensor coordinates, or a combination thereof.
50. A method according to any of claims 45-49, further comprising:
receiving a confirmation request from the geo-coding service for verifying the point-of-interest information, the one or more terms, or a combination thereof; and
causing, at least in part, transmission of a confirmation reply to the geo-coding service.
51. A method according to any of claims 45-50, further comprising:
causing, at least in part, spell-checking, auto-completing, or a combination thereof, when receiving the point-of-interest information at a user interface.
52. An apparatus comprising:
at least one processor; and
at least one memory including computer program code,
the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following,
cause, at least in part, transmission of a query to a geo-coding service; and
receive a response to the query from the geo-coding service, the response containing one or more geographical areas, one or more area names of the one or more geographical areas, or a combination thereof,
wherein the one or more geographical areas are associated with one or more geo-spatial documents containing one or more terms probabilistically matched with one or more query terms in the query, and the one or more terms describe location information of one or more points of interest located within the one or more geographical areas.
53. An apparatus of claim 52, wherein the one or more geographical areas, one or more portions of the one or more geographical areas, or a combination thereof are specified as one or more cells.
54. An apparatus according to any of claims 52 and 53, wherein the query contains one or more synonyms of the one or more terms.
55. An apparatus according to any of claims 52-54, wherein the one or more terms are collected from one or more sources based, at least in part, on one or more crowd-sourcing mechanisms.
56. An apparatus according to any of claims 52-55, wherein the one or more geographical areas are defined multi-dimensionally.
57. An apparatus according to any of claims 52-56, wherein the user interface is presented at a user device, and the apparatus is further caused to:
cause, at least in part, representation at a user interface the one or more geographical areas, one or more points of interests associated with the one or more geographical areas, address information associated with the one or more geographical areas, or a combination thereof.
58. An apparatus according to any of claims 52-57, wherein the user interface is presented at a user device, and the apparatus is further caused to:
cause, at least in part, transmission of point-of-interest information to the geo-coding service, the point-of-interest information being structured with one or more fields for inputting, processing, storing, or a combination thereof the one or more terms.
59. An apparatus of claim 58, wherein the one or more fields relate, at least in part, to the one or more names of the points of interest, one or more addresses of the points of interest, one or more landmarks associated with the points of interest, or a combination thereof.
60. An apparatus according to any of claims 58 and 59, wherein the user interface is presented at a user device, and the apparatus is further caused to:
receive credit from the geo-coding service based, at least in part, on the transmission of point-of-interest information, reliability of the point-of-interest information, or a combination thereof.
61. An apparatus of claim 60, wherein the reliability of the point-of-interest information is verified, based, at least in part, on comparison against one or more location contexts associated with the point-of-interest information.
62. An apparatus of claim 61, wherein the one or more location contexts are determined based, at least in part, on one or more communications cell tower identifiers, one or more communications cell broadcast messages identifiers, one or more location sensor coordinates, or a combination thereof.
63. An apparatus according to any of claims 58-62, wherein the user interface is presented at a user device, and the apparatus is further caused to:
receive a confirmation request from the geo-coding service for verifying the point-of-interest information, the one or more terms, or a combination thereof; and
cause, at least in part, transmission of a confirmation reply to the geo-coding service.
64. An apparatus according to any of claims 58-63, wherein the user interface is presented at a user device, and the apparatus is further caused to:
cause, at least in part, spell-checking, auto-completing, or a combination thereof, when receiving the point-of-interest information at a user interface.
65. A computer-readable storage medium carrying one or more sequences of one or more instructions which, when executed by one or more processors, cause an apparatus to perform at least a method of any of claims 39-51.
66. An apparatus comprising means for performing a method of any of claims 39-51.
67. An apparatus of claim 66, wherein the apparatus is a mobile phone further comprising:
user interface circuitry and user interface software configured to facilitate user control of at least some functions of the mobile phone through use of a display and configured to respond to user input; and
a display and display circuitry configured to display at least a portion of a user interface of the mobile phone, the display and display circuitry configured to facilitate user control of at least some functions of the mobile phone.
68. A computer program product including one or more sequences of one or more instructions which, when executed by one or more processors, cause an apparatus to at least perform the steps of a method of any of claims 39-51.
69. A method comprising facilitating access to at least one interface configured to allow access to at least one service, the at least one service configured to perform a method of any of claims 39-51.
70. A method comprising facilitating a processing of and/or processing (1) data and/or (2) information and/or (3) at least one signal, the (1) data and/or (2) information and/or (3) at least one signal based, at least in part, on the method of any of claims 39-51.
71. A method comprising facilitating creating and/or facilitating modifying (1) at least one device user interface element and/or (2) at least one device user interface functionality, the (1) at least one device user interface element and/or (2) at least one device user interface functionality based, at least in part, on the method of any of claims 39-51.
PCT/FI2012/050470 2011-06-16 2012-05-16 Method and apparatus for resolving geo-identity WO2012172160A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201280029185.8A CN103609144A (en) 2011-06-16 2012-05-16 Method and apparatus for resolving geo-identity

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN2051CH2011 2011-06-16
IN2051/CHE/2011 2011-06-16

Publications (1)

Publication Number Publication Date
WO2012172160A1 (en) 2012-12-20

Family

ID=47356588

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI2012/050470 WO2012172160A1 (en) 2011-06-16 2012-05-16 Method and apparatus for resolving geo-identity

Country Status (2)

Country Link
CN (1) CN103609144A (en)
WO (1) WO2012172160A1 (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103500217A (en) * 2013-10-09 2014-01-08 北京火信网络科技有限公司 Method and system for providing service of identification of region of interest
WO2014159007A1 (en) * 2013-03-14 2014-10-02 Microsoft Corporation Dynamically expiring crowd-sourced content
WO2015007945A1 (en) * 2013-07-18 2015-01-22 Nokia Corporation Method and apparatus for updating points of interest information via crowdsourcing
US20150372966A1 (en) * 2014-06-18 2015-12-24 Yahoo Inc. System and method for address based locations
CN105550169A (en) * 2015-12-11 2016-05-04 北京奇虎科技有限公司 Method and device for identifying point of interest names based on character length
TWI637646B (en) * 2017-06-05 2018-10-01 中華電信股份有限公司 Method and system based on mobile communication positioning technology combined with user profile and interest landmark-assisted location strategy
US10346389B2 (en) 2013-09-24 2019-07-09 At&T Intellectual Property I, L.P. Facilitating determination of reliability of crowd sourced information
CN110968654A (en) * 2018-09-29 2020-04-07 阿里巴巴集团控股有限公司 Method, equipment and system for determining address category of text data
CN111176456A (en) * 2014-06-17 2020-05-19 谷歌有限责任公司 Input method editor for inputting geographical location names
CN111506676A (en) * 2019-01-30 2020-08-07 菜鸟智能物流控股有限公司 Geographic data correction method, device, equipment and storage medium
CN111694919A (en) * 2020-06-12 2020-09-22 北京百度网讯科技有限公司 Method and device for generating information, electronic equipment and computer readable storage medium
US10972873B2 (en) 2018-12-17 2021-04-06 Here Global B.V. Enhancing the accuracy for device localization
CN112783992A (en) * 2019-11-08 2021-05-11 腾讯科技(深圳)有限公司 Map functional area determining method and device based on interest points
CN112966192A (en) * 2021-02-09 2021-06-15 北京百度网讯科技有限公司 Region address naming method and device, electronic equipment and readable storage medium
EP3971731A4 (en) * 2019-05-15 2022-06-01 Tencent Technology (Shenzhen) Company Limited Fence address-based coordinate data processing method and apparatus, and computer device
US20220300552A1 (en) * 2019-06-04 2022-09-22 Schlumberger Technology Corporation Applying geotags to images for identifying exploration opportunities
US11526670B2 (en) 2018-09-28 2022-12-13 The Mitre Corporation Machine learning of colloquial place names
EP4174712A4 (en) * 2021-05-24 2024-03-20 Beijing Baidu Netcom Sci & Tech Co Ltd Coding method and apparatus for geographic location area, and method and apparatus for establishing coding model

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133918B (en) * 2014-08-15 2019-07-02 百度在线网络技术(北京)有限公司 A kind of acquisition methods and device, method for pushing and device of interest point information
CN105813013A (en) * 2014-12-29 2016-07-27 深圳市腾讯计算机系统有限公司 Information prompting method, device and system
GB2541710B (en) * 2015-08-27 2017-12-13 Hitachi Ltd Locating train events on a railway network
US10083187B2 (en) 2015-08-28 2018-09-25 International Business Machines Corporation Generating geographic borders
CN108885118B (en) * 2016-01-06 2022-11-18 罗伯特·博世有限公司 System and method for providing a multimodal visual time-aware graphical display
DE102016209568B3 (en) * 2016-06-01 2017-09-21 Volkswagen Aktiengesellschaft Methods, apparatus and computer programs for capturing measurement results from mobile devices
CN108021638B (en) * 2017-11-28 2022-01-14 上海电科智能系统股份有限公司 Offline geocoding unstructured address resolution system
CN108875013B (en) * 2018-06-19 2022-05-27 百度在线网络技术(北京)有限公司 Method and device for processing map data
CN109408739A (en) * 2018-09-22 2019-03-01 北京微播视界科技有限公司 The rendering method of point of interest and device, terminal, storage medium in media information
CN111382218B (en) * 2018-12-29 2023-09-26 北京嘀嘀无限科技发展有限公司 System and method for searching point of interest (POI)
US11402220B2 (en) * 2019-03-13 2022-08-02 Here Global B.V. Maplets for maintaining and updating a self-healing high definition map
CN110990651B (en) * 2019-12-05 2021-06-04 同盾控股有限公司 Address data processing method and device, electronic equipment and computer readable medium
CN111597279B (en) * 2020-03-31 2023-07-25 平安科技(深圳)有限公司 Information prediction method based on deep learning and related equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050278378A1 (en) * 2004-05-19 2005-12-15 Metacarta, Inc. Systems and methods of geographical text indexing
US20060197763A1 (en) * 2002-02-11 2006-09-07 Landnet Corporation Document geospatial shape tagging, searching, archiving, and retrieval software
US20090112812A1 (en) * 2007-10-29 2009-04-30 Ellis John R Spatially enabled content management, discovery and distribution system for unstructured information management
US20090156229A1 (en) * 2007-12-13 2009-06-18 Garmin Ltd. Automatically identifying location information in text data
US20100179754A1 (en) * 2009-01-15 2010-07-15 Robert Bosch Gmbh Location based system utilizing geographical information from documents in natural language

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014159007A1 (en) * 2013-03-14 2014-10-02 Microsoft Corporation Dynamically expiring crowd-sourced content
US8983976B2 (en) 2013-03-14 2015-03-17 Microsoft Technology Licensing, Llc Dynamically expiring crowd-sourced content
US9436695B2 (en) 2013-03-14 2016-09-06 Microsoft Technology Licensing, Llc Dynamically expiring crowd-sourced content
WO2015007945A1 (en) * 2013-07-18 2015-01-22 Nokia Corporation Method and apparatus for updating points of interest information via crowdsourcing
US11468036B2 (en) 2013-09-24 2022-10-11 At&T Intellectual Property I, L.P. Facilitating determination of reliability of crowd sourced information
US10346389B2 (en) 2013-09-24 2019-07-09 At&T Intellectual Property I, L.P. Facilitating determination of reliability of crowd sourced information
CN103500217A (en) * 2013-10-09 2014-01-08 北京火信网络科技有限公司 Method and system for providing service of identification of region of interest
CN111176456B (en) * 2014-06-17 2023-06-06 谷歌有限责任公司 Input method editor for inputting geographic location names
CN111176456A (en) * 2014-06-17 2020-05-19 谷歌有限责任公司 Input method editor for inputting geographical location names
US20150372966A1 (en) * 2014-06-18 2015-12-24 Yahoo Inc. System and method for address based locations
US9661066B2 (en) * 2014-06-18 2017-05-23 Yahoo! Inc. System and method for address based locations
CN105550169A (en) * 2015-12-11 2016-05-04 北京奇虎科技有限公司 Method and device for identifying point of interest names based on character length
TWI637646B (en) * 2017-06-05 2018-10-01 中華電信股份有限公司 Method and system based on mobile communication positioning technology combined with user profile and interest landmark-assisted location strategy
US11526670B2 (en) 2018-09-28 2022-12-13 The Mitre Corporation Machine learning of colloquial place names
CN110968654B (en) * 2018-09-29 2023-10-20 阿里巴巴集团控股有限公司 Address category determining method, equipment and system for text data
CN110968654A (en) * 2018-09-29 2020-04-07 阿里巴巴集团控股有限公司 Method, equipment and system for determining address category of text data
US10972873B2 (en) 2018-12-17 2021-04-06 Here Global B.V. Enhancing the accuracy for device localization
CN111506676B (en) * 2019-01-30 2023-03-24 菜鸟智能物流控股有限公司 Geographic data correction method, device, equipment and storage medium
CN111506676A (en) * 2019-01-30 2020-08-07 菜鸟智能物流控股有限公司 Geographic data correction method, device, equipment and storage medium
EP3971731A4 (en) * 2019-05-15 2022-06-01 Tencent Technology (Shenzhen) Company Limited Fence address-based coordinate data processing method and apparatus, and computer device
US20220300552A1 (en) * 2019-06-04 2022-09-22 Schlumberger Technology Corporation Applying geotags to images for identifying exploration opportunities
US11797605B2 (en) * 2019-06-04 2023-10-24 Schlumberger Technology Corporation Applying geotags to images for identifying exploration opportunities
CN112783992A (en) * 2019-11-08 2021-05-11 腾讯科技(深圳)有限公司 Map functional area determining method and device based on interest points
CN112783992B (en) * 2019-11-08 2023-10-20 腾讯科技(深圳)有限公司 Map functional area determining method and device based on interest points
CN111694919A (en) * 2020-06-12 2020-09-22 北京百度网讯科技有限公司 Method and device for generating information, electronic equipment and computer readable storage medium
CN112966192A (en) * 2021-02-09 2021-06-15 北京百度网讯科技有限公司 Region address naming method and device, electronic equipment and readable storage medium
CN112966192B (en) * 2021-02-09 2023-10-27 北京百度网讯科技有限公司 Regional address naming method, apparatus, electronic device and readable storage medium
EP4174712A4 (en) * 2021-05-24 2024-03-20 Beijing Baidu Netcom Sci & Tech Co Ltd Coding method and apparatus for geographic location area, and method and apparatus for establishing coding model

Also Published As

Publication number Publication date
CN103609144A (en) 2014-02-26

Similar Documents

Publication Publication Date Title
WO2012172160A1 (en) Method and apparatus for resolving geo-identity
US11386167B2 (en) Location-based searching using a search area that corresponds to a geographical location of a computing device
US8341185B2 (en) Method and apparatus for context-indexed network resources
US8204886B2 (en) Method and apparatus for preparation of indexing structures for determining similar points-of-interests
US10956938B2 (en) Method and apparatus for associating commenting information with one or more objects
US8457653B2 (en) Method and apparatus for pre-fetching location-based data while maintaining user privacy
JP5602864B2 (en) Location-based service middleware
US8725706B2 (en) Method and apparatus for multi-item searching
US10234305B2 (en) Method and apparatus for providing a targeted map display from a plurality of data sources
US20170067748A1 (en) Location-Based Search Refinements
US10001384B2 (en) Method and apparatus for the retrieval of similar places
US20140074871A1 (en) Device, Method and Computer-Readable Medium For Recognizing Places
EP2706496A1 (en) Device, method and computer-readable medium for recognizing places in a text
US20090186631A1 (en) Location Based Information Related to Preferences
WO2013144435A1 (en) Method and apparatus for geo-coding unstructured address information

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 12800043; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 12800043; Country of ref document: EP; Kind code of ref document: A1)