US20080189249A1 - Searching Structured Geographical Data - Google Patents

Searching Structured Geographical Data

Info

Publication number
US20080189249A1
Authority
US
United States
Prior art keywords
data
structured
structured document
data sets
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US11/671,306
Other versions
US7836085B2 (en)
Inventor
Artem Petakov
David Minogue
Alexey Spiridonov
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Priority to US11/671,306 (US7836085B2)
Assigned to GOOGLE INC. Assignment of assignors interest (see document for details). Assignors: SPIRIDONOV, ALEXEY; MINOGUE, DAVID; PETAKOV, ARTEM
Priority to BRPI0807172-1A2A (BRPI0807172A2)
Priority to JP2009548491A (JP5336391B2)
Priority to CN200880010447XA (CN101647020B)
Priority to EP08728954.2A (EP2118779A4)
Priority to CA2677307A (CA2677307C)
Priority to KR1020097017280A (KR101450358B1)
Priority to AU2008213993A (AU2008213993A1)
Priority to PCT/US2008/052945 (WO2008097921A1)
Publication of US20080189249A1
Priority to US12/945,600 (US8200704B2)
Publication of US7836085B2
Application granted
Assigned to GOOGLE LLC. Change of name (see document for details). Assignors: GOOGLE INC.
Legal status: Active
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Definitions

  • This disclosure relates to providing results to user searches.
  • Query processing systems are commonly used to locate information from large data collections. Exemplary systems include those that identify relevant web pages responsive to one or more user search terms entered by a user seeking to identify relevant web content.
  • search results can be identified by matching the terms in the search query to a corpus of pre-stored web pages.
  • Data collections can also include structured documents that can include a potentially large amount of data, of which a small subset is pertinent to a particular search.
  • An exemplary structured document is a Keyhole Markup Language (KML) document, which is an XML-based file format used to display geographic data in a browser, such as ‘Google Earth’.
  • KML Keyhole Markup Language
  • a KML document utilizes a tag-based structure with nested elements and attributes, and can be used to associate descriptive text, models, and images with locations on the earth's surface.
  • Although web page search systems are adept at identifying documents which, as a whole, match the individual terms of a query, they are incapable of identifying the elements of structured documents which, in context, match the parameters of a query.
  • search systems may not return only the most relevant data stored within a KML document. Therefore, users are unable to search structured documents based on their content, such as nested elements and attributes. For instance, a user is unable to search for elements of KML files by specifying a geographic area of interest, by filtering KML files based on keywords, or by specifying a combination of such search queries.
  • the following disclosure describes systems, methods, and computer program products that permit the identification of search query results.
  • the method includes identifying data from one or more data sources, where the data is associated with at least one structured document, and extracting one or more data sets contained within the at least one structured document. The method further includes adding one or more record items to a searchable database, where the one or more record items correspond to the one or more extracted data sets.
  • the method can include retrieving the data from the one or more data sources. Further, at least one of the one or more data sources can include one or more uniform resource locators (URLs). According to an aspect, the data is at least one structured document. The data can also include metadata, such as a page rank. According to yet another aspect, the method can include identifying metadata associated with the data subsequent to identifying the data from the one or more data sources.
  • URLs uniform resource locators
  • the method can include generating an output file, where the output file includes data associated with two or more structured documents. Extracting one or more data sets can also include extracting one or more data sets from the output file. Additionally, according to an aspect, the at least one structured document can include two or more structured documents, and the method can further include merging the two or more structured documents.
  • the at least one structured document comprises a Keyhole Markup Language (KML) document.
  • the one or more data sets can include at least one placemark.
  • the method can also include receiving at least one search query, and identifying at least one of the one or more record items responsive to receiving the at least one search query.
  • extracting one or more data sets contained within the at least one structured document can include associating the one or more data sets with contextual information associated with the at least one structured document.
  • FIG. 1 shows a search system, according to an illustrative implementation.
  • FIG. 2 shows components of a server within the search system of FIG. 1 , according to an illustrative implementation.
  • FIG. 3 shows a search system, according to an illustrative implementation.
  • FIG. 4 shows an illustrative KML file.
  • FIG. 5 shows exemplary processes for collecting and merging documents and metadata from one or more data sources, according to an implementation.
  • FIG. 6 shows an illustrative sample output file that includes a single file indexed by URL.
  • FIG. 7 shows an exemplary process flow for extraction of structured files, according to an implementation.
  • FIG. 8 shows an illustrative sample output file including multiple record items corresponding to a single URL.
  • FIG. 9 shows a process of structured document collection and extraction, according to an implementation.
  • the present disclosure describes a search system that permits the collection of structured documents and the extraction of data sets within such structured documents such that the individual data sets may be searched and retrieved in response to a user search query.
  • a KML file having several placemarks may be extracted such that the individual placemarks are searchable records that may be returned as search results to a user query.
  • the extraction of data sets from within a structured document is performed such that contextual information associated with the structured document is maintained subsequent to extraction.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the function(s) specified in the flowchart block or blocks.
  • the computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions that execute on the computer or other programmable apparatus provide steps for implementing the function(s) specified in the flowchart block or blocks.
  • blocks of the block diagrams and flowchart illustrations support combinations of means for performing the specified functions, combinations of steps for performing the specified functions and program instruction means for performing the specified functions. It will also be understood that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by special purpose hardware-based computer systems that perform the specified functions or steps, or combinations of special purpose hardware and computer instructions.
  • the system 100 includes one or more clients 115 in communication with a server 120 via one or more networks 140 .
  • clients 115 and a single server 120 are illustrated in FIG. 1 , there can be more servers and more or fewer clients.
  • some of the functions performed by the server 120 can be performed by one or more other servers such that the server 120 can represent several devices, such as a network of computer processors and/or servers.
  • a client can perform a function of the server 120 and the server 120 can perform a function of a client.
  • the clients 115 can include a device, such as a personal computer, a wireless telephone, a personal digital assistant (PDA), a lap top computer, or another type of computation or communication device, a thread or process running on one of these devices, and/or an object executable by one of these devices.
  • the system 100 also includes one or more data sources 105 in communication with the server 120 via one or more networks 140 .
  • the server 120 can collect and/or receive data from one or more data sources 105 and manipulate the data to generate a response to a search query received from one or more clients 115 .
  • the network(s) 140 can include one or more local area networks (LANs), wide area networks (WANs), telephone networks, such as the Public Switched Telephone Network (PSTN), intranets, the Internet, and/or other types of network.
  • the clients 115 , data sources 105 , and server 120 can connect to the network(s) 140 via wired, wireless, or optical or other connections.
  • one or more of the devices illustrated in FIG. 1 are directly connected to another one of the devices.
  • the clients 115 and/or data sources 105 are directly connected to the server 120 .
  • FIG. 2 shows the server 120 of FIG. 1 , according to an illustrative implementation.
  • the server 120 can include a bus 210 , a processor 220 , a main memory 230 , a read only memory (ROM) 240 , a storage device 250 , one or more input devices 260 , one or more output devices 270 , and a communication interface 280 .
  • the bus 210 can include one or more paths that permit communication among the components of server 120 .
  • the processor 220 includes any type of conventional processor, microprocessor or processing logic that interprets and executes instructions.
  • the main memory 230 can include a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by processor 220 .
  • the ROM 240 can include a conventional ROM device or another type of static storage device that stores static information and instructions for use by the processor 220 , including, for instance, an operating system.
  • the storage device 250 can include a magnetic and/or optical recording medium and its corresponding drive.
  • the server 120 can also include an input device 260 having one or more conventional mechanisms that permit a user to input information to the server 120 , such as a keyboard, a mouse, a pen, voice recognition and/or biometric mechanisms, or the like.
  • the output device 270 includes one or more conventional mechanisms that output information to the user, such as a display, a printer, a speaker, or the like.
  • the communication interface 280 can include any transceiver-like mechanism that enables the server 120 to communicate with other devices and/or systems.
  • the communication interface 280 can include mechanisms for communicating with another device or system via one or more networks, such as the network(s) 140 .
  • the server 120 can collect and store documents and data associated with documents from one or more data sources 105 , manipulate such documents and data, and identify results responsive to client search queries. In one implementation, the server 120 performs these operations in response to the processor 220 executing software instructions contained in a computer-readable medium, such as memory 230 .
  • the software instructions can be read into the memory 230 from another computer readable medium, such as a data storage device 250 , or from another device via the communication interface 280 .
  • the software instructions contained in the memory 230 cause processor 220 to perform processes described in this patent disclosure.
  • hardwired circuitry can be used in place of or in combination with software instructions to implement processes consistent with the disclosure. Thus, implementations are not limited to any specific combination of hardware circuitry and software.
  • a search system 300 is shown according to an illustrative implementation.
  • the search system 300 is similar to the system 100 shown in FIG. 1 , and includes one or more data sources 305 and one or more clients 315 in communication with a server 320 via one or more networks 340 .
  • the data sources 305 and/or clients 315 may be geographically dispersed and/or local to the server 320 , and can communicate with the server 320 directly in alternative to communicating via the one or more networks 340 .
  • the one or more data sources 305 and/or one or more clients 315 can be local to the server 320 .
  • the one or more data sources 305 can include one or more data repositories, web pages, and the like, that include documents and metadata associated with the documents.
  • the documents can include structured documents, such as KML files.
  • KML file is an XML-based data or file format used to display geographic data in a browser, such as ‘Google Earth’, ‘Google Maps’, and ‘Google Maps for mobile’, and utilizes a tag-based structure with nested elements and attributes.
  • KML files can be used to associate descriptive text, models, and images with locations on the earth's surface.
  • each georeferenced entity is called a placemark, which can be georeferenced to points, areas, or paths.
  • An illustrative KML file 400 that identifies a simple placemark is shown in FIG. 4 .
  • the KML file 400 generally includes an XML header, which is the first line in the KML file 400 , a KML namespace definition, which is the second line in the KML file 400 , and at least one placemark object that includes several elements including a name, a description, and a point.
  • a KML file can contain multiple placemarks, and the placemarks within a KML file may be organized in a hierarchy of enclosing folders.
  • a KML file can also include other descriptive features, and can include descriptive HTML to add links, font sizes, styles, colors, identify text alignment, and the like.
  • structured documents and metadata from the one or more data sources 305 are received by the server 320 .
  • the one or more data sources 305 can include public and/or private repositories of data.
  • documents and metadata are collected by the data collection module 325 as a result of data being pulled from the data sources 305 by the data collection module 325 .
  • Data sources may alternatively or additionally push, or transmit, documents and metadata to the data collection module 325 , either automatically and/or upon a request by the data collection module 325 for data.
  • KML files and their associated metadata may be received by the data collection module 325 .
  • the metadata associated with a KML file can include, among other data, a page rank that identifies, relative to an arbitrary numbering scheme, the relative rank of the file based on the number of links to (or popularity of) the file, the number of downloads of the file, and/or other metadata. Metadata associated with a document can come from a different source than the document itself, such as from a separate document or database.
  • the present disclosure is operative with any structured data format that may be used to link data, such as location names, descriptive text, images, geographic references, and the like.
  • the one or more data sources 305 can also transmit GeoRSS files to the server 320 , where GeoRSS files contain HTML and typically reference a geographic location. Still other files, such as KMZ files, may be utilized.
  • the data collection module 325 merges the structured documents and metadata and provides an indexed output file to an indexing module 330 .
  • the indexing module 330 is operable to parse the indexed output file received from the data collection module 325 to identify one or more data sets, calculate a query independent rank for extracted data sets, and forward record items corresponding to the extracted data sets for storage in a results repository 335 used to respond to user queries.
  • KML documents and metadata generated by the data collection module 325 can be fed as an input into the indexing module 330 , which can parse the KML files, extract placemarks, calculate a query independent rank for each placemark, and provide each placemark individually for insertion into a search repository 335 used to respond to user search queries.
  • Although each of the data collection module 325 , indexing module 330 , and results repository 335 is illustrated as internal to the server 320 , and may be implemented by software instructions stored within a memory 230 , or other components of the illustrative server 120 shown in FIG. 2 , one or more of the data collection module 325 , indexing module 330 , and/or results repository 335 may reside external to the server 320 .
  • one or more of the components 325 , 330 , 335 may reside in one or more separate servers.
  • the components 325 , 330 , 335 can also be combined in whole or part in one or more components. Therefore, the block diagram implementation of the illustrative system 300 shown in FIG. 3 is intended to represent various functions of the system 300 without limitation to specific software and/or hardware that can implement the functions described herein.
  • FIG. 5 shows an exemplary functional block diagram flow chart 500 illustrating the collection of documents and metadata from one or more data sources by the data collection module 325 .
  • the data collection module 325 is operable to identify and retrieve structured documents and any metadata associated with such documents, for instance, KML documents and associated metadata.
  • the data collection module 325 can utilize a web crawl program to identify structured documents existing on the world wide web (‘web’).
  • a web crawl program browses the web, creating a copy of visited pages, and creates an index or table of URLs it encounters.
  • the web crawl program is operable to generate URLs 505 associated with, or identifying, the individual pages identified by the web crawl program.
  • the URLs can identify structured documents and/or metadata associated with structured documents.
  • structured files may be examined and/or converted into a separate file format to enable the contents of the structured files to be examined and/or searched.
  • a structured KML file can have an HTML file associated with it (and identified by a URL) that describes the contents of the KML file, where the HTML file may be relevant for ranking and/or indexing the KML file.
  • URLs included within the metadata, including URLs for each structured document, are forwarded to a page rank database 535 , which may include additional metadata associated with each structured document.
  • the data collection module 325 can examine the URLs 505 and identify all structured documents of interest, such as all KML documents. Once the entries are identified, the data collection module 325 executes a document fetch 515 to retrieve the structured documents 520 associated with the identified URLs. Additional structured documents and metadata 530 can be collected from other public and/or private data repositories 525 . Each data source may include different metadata 530 associated with a particular structured file, such as the number of times the file was downloaded from a particular site, user feedback, or the like. URLs for each structured document collected from other public and/or private data repositories 525 are forwarded to a page rank database 535 , which may include additional metadata associated with each structured document.
  • the data collection module 325 can attempt to look up a page rank of each discovered structured file.
  • the data collection module queries a page rank database 535 for identified documents by attempting to fetch a page rank of the URLs corresponding to each structured document discovered in the web crawl or public and/or private data repositories.
  • the page rank may be used by the search system to prioritize results to user queries.
  • the page rank of URLs is not fetched for identified documents. Still, other metadata may be looked up for a corresponding URL, for instance, from one or more other databases.
  • Metadata keyed by URL 540 and collected from the page rank database 535 is merged 545 with the structured documents and metadata identified from the web crawl and/or public or private data repositories.
  • This data can be passed through one or more de-duplication stages to eliminate duplicate documents.
  • two identical files, each downloaded from a different URL, may be identified.
  • One of the two identical files may be deleted by a duplication elimination (or de-duplication) stage.
  • a URL associated with a particular page rank may be merged with the same URL associated with a document. Merging of metadata and documents may occur by URL and/or merging on the raw document contents.
  • the data collection module 325 can convert all discovered structured documents and metadata into a common format to generate an output file 550 .
  • the output of the data collection module 325 is a single indexed output file 550 in which each record contains a structured document and all of the associated metadata to be used to score the record to identify whether it is an appropriate response to a user search query.
  • the output file can include a table of records indexed by URL, where all of the information associated with each URL is in a record associated with the URL.
  • FIG. 6 shows a sample output file 600 generated from collection of documents and metadata from one or more data sources by the data collection module 325 .
  • the output file includes a record associated with a source URL identifying a KML file associated with “Google Offices”.
  • the contents include several placemarks corresponding to different Google offices around the world, including names and coordinates for “headquarters in Mountain View”, “New York City”, and “Tokyo Office”.
  • Metadata associated with the source URL identifies the page rank of the URL, and the number of downloads of the file, for instance, provided by the web site from which it was available.
  • the metadata can also include an anchor, such as a URL, that is associated with each placemark.
  • the single output file 550 can include two or more records.
  • the output file 550 generated by the data collection module 325 is transmitted to the indexing module 330 .
  • the indexing module 330 is operable to extract data sets from the records within the output file while preserving contextual information.
  • the indexing module 330 is operable to extract placemarks from a single KML file that may include a large number of placemarks, where the extraction preserves contextual text, such as parent folders, referring pages, and the like.
  • a KML file that includes several placemarks, each associated with a particular hotel within the city of London (which may be a parent folder within which the hotels are identified), may be extracted such that each placemark becomes an individual searchable item associated with the city of London.
  • FIG. 7 shows an exemplary process flow 700 implemented by the indexing module 330 to perform indexing.
  • the indexing module 330 can transform each record indexed by URL into multiple record items per URL.
  • indexing of structured documents can be implemented by taking the output file 550 , parsing the structured records to extract individual data sets (e.g., placemarks) (block 710 ), calculating a query independent rank for each data set, and transmitting record items corresponding to the data sets to a searchable repository 335 .
  • the indexing module 330 transforms a record indexed by URL into multiple record items per URL, where each record item is indexed by a document ID, which is a number chosen by the indexing module 330 to uniquely identify a record item, such as a placemark.
  • the document ID can be generated as a hash value from selected fields of a record. For instance, the document ID may be based in part on geo coordinates identifying the location of a placemark.
  • the indexing module 330 associates the contextual information from the original record with each data set. For instance, each placemark extracted from a KML file will preserve its context information, including the URL and other metadata of the corresponding structured file and the name of each enclosing folder, in addition to the placemark's descriptive text and other data, such as georeference data. Additionally, for each placemark the indexing module 330 can calculate a query independent score based on the available metadata. This may utilize one or more rankings from other databases (not illustrated).
  • the indexing module 330 is further operable to eliminate duplicate record items (block 715 ) based on like metadata. For instance, duplicate placemarks may be eliminated based on comparison of the fingerprint of the location and placemark name. The placemark with the highest score based on the available metadata may be retained.
  • the indexing module 330 is optionally operable to cluster data sets within structured documents into a compound search result where the data sets are related. For instance, subsequent to extracting data sets from one or more records indexed by URLs, one or more data sets may be combined, or clustered, into a single compound search result where they refer to the same physical entity. This may be useful to improve the diversity of results. For instance, a user search query with the term ‘statue’ for New York City may return placemarks having the highest score, which may all be placemarks identifying the Statue of Liberty. Clustering all results for the Statue of Liberty will permit a search result that provides one compound result for the Statue of Liberty such that other statue results can also be provided to a user.
  • a serving module can perform dynamic clustering that is based at least in part on the user's search query. For instance, continuing with the Statue of Liberty example, a search for ‘Statue of Liberty’ and ‘Tours’ may result in clustering based on a user search term in addition to static terms included within records indexed by URLs.
  • the record items identified by the indexing module 330 are listed individually and provided to the searchable repository for use in responding to user queries (blocks 720 , 730 ).
  • An illustrative example of an indexing module 330 output 800 that corresponds to the illustrative output file 600 of FIG. 6 is shown in FIG. 8 .
  • the placemarks identified in FIG. 6 as existing within a single record entry have been extracted into separate record items 805 , 810 , 815 . This permits a user to search for content associated with individual record items extracted from a structured document.
  • FIG. 9 shows a process of structured document collection and extraction, according to an implementation.
  • Data is collected from one or more data sources (block 905 ), where the data can include one or more structured documents and metadata associated therewith.
  • the data collection can be effected by the data collection module 325 , which can collect data from the one or more data sources 305 .
  • structured documents and associated metadata are identified (block 910 ).
  • structured documents and metadata associated therewith may be merged, for instance, by URL. Duplicate entries may also be merged based on other keys, such as based on the document contents (block 920 ).
  • An output file is then generated (block 930 ).
  • the identification of structured data, the merging of structured documents and metadata, and/or the generation of an output file can also be performed by the data collection module 325 .
  • Record items are then created from each record indexed within the output file by extracting data sets from each output file record. For instance, where the output file record includes a KML file indexed by URL, record items can be created that correspond to extracted placemarks within the KML file (block 940 ). According to an implementation, the extraction of data sets and generation of record items can be performed by the indexing module 330 . After extraction is complete, the record items (or table of record items) are added to a searchable database (block 950 ).
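The extraction and de-duplication steps described above (blocks 940 and 950) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the record fields, the scoring rule, and the choice to reuse the document ID as the de-duplication fingerprint are all assumptions made for illustration, and only point placemarks are handled.

```python
import hashlib
import math
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any XML namespace so the sketch works across KML versions."""
    return tag.rsplit("}", 1)[-1]


def first_text(element, name, deep=False):
    """Text of the first child (or descendant, if deep) with this local name."""
    nodes = element.iter() if deep else iter(element)
    for node in nodes:
        if local_name(node.tag) == name:
            return (node.text or "").strip()
    return ""


def query_independent_score(metadata):
    # Hypothetical rule: the text only says the score uses available metadata.
    return metadata.get("page_rank", 0) + math.log1p(metadata.get("downloads", 0))


def extract_record_items(kml_text, source_url, metadata):
    """Turn one URL-indexed record into one record item per placemark."""
    items = []

    def walk(element, folders):
        for child in element:
            kind = local_name(child.tag)
            if kind in ("Document", "Folder"):
                name = first_text(child, "name")
                walk(child, folders + [name] if name else folders)
            elif kind == "Placemark":
                name = first_text(child, "name")
                coords = first_text(child, "coordinates", deep=True)
                if not coords:
                    continue  # no geometry to georeference; skipped in this sketch
                lon, lat = (float(v) for v in coords.split(",")[:2])
                # Assumed fingerprint: lower-cased name plus rounded coordinates.
                fingerprint = f"{name.lower()}|{lon:.4f}|{lat:.4f}"
                items.append({
                    "doc_id": hashlib.sha1(fingerprint.encode("utf-8")).hexdigest()[:16],
                    "name": name,
                    "description": first_text(child, "description"),
                    "lon": lon,
                    "lat": lat,
                    "folders": folders,        # context: enclosing folder names
                    "source_url": source_url,  # context: originating document
                    "score": query_independent_score(metadata),
                })

    walk(ET.fromstring(kml_text), [])
    return items


def deduplicate(items):
    """Keep the highest-scoring record item for each fingerprint."""
    best = {}
    for item in items:
        kept = best.get(item["doc_id"])
        if kept is None or item["score"] > kept["score"]:
            best[item["doc_id"]] = item
    return list(best.values())


# Hypothetical input: the same KML content discovered at two different URLs.
SAMPLE = """<kml xmlns="http://earth.google.com/kml/2.1"><Document>
<Folder><name>London</name>
<Placemark><name>Hotel A</name><Point><coordinates>-0.1276,51.5072,0</coordinates></Point></Placemark>
<Placemark><name>Hotel B</name><Point><coordinates>-0.1400,51.5010,0</coordinates></Point></Placemark>
</Folder></Document></kml>"""

records = extract_record_items(SAMPLE, "http://example.com/a.kml", {"page_rank": 3, "downloads": 10})
records += extract_record_items(SAMPLE, "http://example.com/b.kml", {"page_rank": 7})
searchable = deduplicate(records)
```

In this sketch each placemark keeps the name of its enclosing folder and its source URL, so a record item remains meaningful once separated from the file it came from. A production system would also need to handle paths, areas, and malformed coordinates.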

Abstract

Data is identified from one or more data sources, where the data is associated with at least one structured document. Data sets contained within the at least one structured document are extracted, and one or more record items are added to a searchable database, where the one or more record items correspond to the extracted data sets.

Description

    FIELD
  • This disclosure relates to providing results to user searches.
  • BACKGROUND
  • Query processing systems are commonly used to locate information from large data collections. Exemplary systems include those that identify relevant web pages responsive to one or more user search terms entered by a user seeking to identify relevant web content. In a web page search system, search results can be identified by matching the terms in the search query to a corpus of pre-stored web pages.
  • Data collections can also include structured documents that can include a potentially large amount of data, of which a small subset is pertinent to a particular search. An exemplary structured document is a Keyhole Markup Language (KML) document, which is an XML-based file format used to display geographic data in a browser, such as ‘Google Earth’. A KML document utilizes a tag-based structure with nested elements and attributes, and can be used to associate descriptive text, models, and images with locations on the earth's surface.
  • Although web page search systems are adept at identifying documents which, as a whole, match the individual terms of a query, they are incapable of identifying the elements of structured documents which, in context, match the parameters of a query. As an illustrative example, search systems may not return only the most relevant data stored within a KML document. Therefore, users are unable to search structured documents based on their content, such as nested elements and attributes. For instance, a user is unable to search for elements of KML files by specifying a geographic area of interest, by filtering KML files based on keywords, or by specifying a combination of such search queries.
  • SUMMARY
  • The following disclosure describes systems, methods, and computer program products that permit the identification of search query results.
  • According to an aspect, there is disclosed a method. The method includes identifying data from one or more data sources, where the data is associated with at least one structured document, and extracting one or more data sets contained within the at least one structured document. The method further includes adding one or more record items to a searchable database, where the one or more record items correspond to the one or more extracted data sets.
  • According to an aspect, the method can include retrieving the data from the one or more data sources. Further, at least one of the one or more data sources can include one or more uniform resource locators (URLs). According to an aspect, the data is at least one structured document. The data can also include metadata, such as a page rank. According to yet another aspect, the method can include identifying metadata associated with the data subsequent to identifying the data from the one or more data sources.
  • According to still another aspect, the method can include generating an output file, where the output file includes data associated with two or more structured documents. Extracting one or more data sets can also include extracting one or more data sets from the output file. Additionally, according to an aspect, the at least one structured document can include two or more structured documents, and the method can further include merging the two or more structured documents.
  • According to another aspect, the at least one structured document comprises a Keyhole Markup Language (KML) document. Additionally, the one or more data sets can include at least one placemark. The method can also include receiving at least one search query, and identifying at least one of the one or more record items responsive to receiving the at least one search query. Further, extracting one or more data sets contained within the at least one structured document can include associating the one or more data sets with contextual information associated with the at least one structured document.
  • These general and specific aspects may be implemented using a system, a method, or a computer program, or any combination of systems, methods, and computer programs.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 shows a search system, according to an illustrative implementation.
  • FIG. 2 shows components of a server within the search system of FIG. 1, according to an illustrative implementation.
  • FIG. 3 shows a search system, according to an illustrative implementation.
  • FIG. 4 shows an illustrative KML file.
  • FIG. 5 shows exemplary processes for collecting and merging documents and metadata from one or more data sources, according to an implementation.
  • FIG. 6 shows an illustrative sample output file that includes a single file indexed by URL.
  • FIG. 7 shows an exemplary process flow for extraction of structured files, according to an implementation.
  • FIG. 8 shows an illustrative sample output file including multiple record items corresponding to a single URL.
  • FIG. 9 shows a process of structured document collection and extraction, according to an implementation.
  • DETAILED DESCRIPTION
  • The present disclosure now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all implementations are shown. Indeed, these implementations can be embodied in many different forms and should not be construed as limited to the implementations set forth herein; rather, these implementations are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.
  • The present disclosure describes a search system that permits the collection of structured documents and the extraction of data sets within such structured documents such that the individual data sets may be searched and retrieved in response to a user search query. For instance, a KML file having several placemarks may be extracted such that the individual placemarks are searchable records that may be returned as search results to a user query. The extraction of data sets from within a structured document is performed such that contextual information associated with the structured document is maintained subsequent to extraction.
  • This disclosure is described with reference to block diagrams and flowchart illustrations of methods, apparatuses (i.e., systems) and computer program products. It will be understood that blocks of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, respectively, may be implemented by computer program instructions. These computer program instructions may be loaded onto a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions which execute on the computer or other programmable data processing apparatus create means for implementing the functions specified in the flowchart block or blocks.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the function(s) specified in the flowchart block or blocks. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions that execute on the computer or other programmable apparatus provide steps for implementing the function(s) specified in the flowchart block or blocks.
  • Accordingly, blocks of the block diagrams and flowchart illustrations support combinations of means for performing the specified functions, combinations of steps for performing the specified functions and program instruction means for performing the specified functions. It will also be understood that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by special purpose hardware-based computer systems that perform the specified functions or steps, or combinations of special purpose hardware and computer instructions.
  • Referring now to FIG. 1, a search system 100 is shown according to an illustrative implementation. The system 100 includes one or more clients 115 in communication with a server 120 via one or more networks 140. Although multiple clients 115 and a single server 120 are illustrated in FIG. 1, there can be more servers and more or fewer clients. For instance, some of the functions performed by the server 120 can be performed by one or more other servers such that the server 120 can represent several devices, such as a network of computer processors and/or servers. Additionally, in some implementations a client can perform a function of the server 120 and the server 120 can perform a function of a client. The clients 115 can include a device, such as a personal computer, a wireless telephone, a personal digital assistant (PDA), a lap top computer, or another type of computation or communication device, a thread or process running on one of these devices, and/or an object executable by one of these devices.
  • The system 100 also includes one or more data sources 105 in communication with the server 120 via one or more networks 140. In the system 100, the server 120 can collect and/or receive data from one or more data sources 105 and manipulate the data to generate a response to a search query received from one or more clients 115.
  • The network(s) 140 can include one or more local area networks (LANs), wide area networks (WANs), telephone networks, such as the Public Switched Telephone Network (PSTN), intranets, the Internet, and/or another type of network. The clients 115, data sources 105, and server 120 can connect to the network(s) 140 via wired, wireless, or optical or other connections. In alternative implementations, one or more of the devices illustrated in FIG. 1 are directly connected to another one of the devices. For example, in one implementation, the clients 115 and/or data sources 105 are directly connected to the server 120.
  • FIG. 2 shows the server 120 of FIG. 1, according to an illustrative implementation. The server 120 can include a bus 210, a processor 220, a main memory 230, a read only memory (ROM) 240, a storage device 250, one or more input devices 260, one or more output devices 270, and a communication interface 280. The bus 210 can include one or more paths that permit communication among the components of server 120.
  • The processor 220 includes any type of conventional processor, microprocessor or processing logic that interprets and executes instructions. The main memory 230 can include a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by processor 220. The ROM 240 can include a conventional ROM device or another type of static storage device that stores static information and instructions for use by the processor 220, including, for instance, an operating system. Additionally, the storage device 250 can include a magnetic and/or optical recording medium and its corresponding drive.
  • The server 120 can also include an input device 260 having one or more conventional mechanisms that permit a user to input information to the server 120, such as a keyboard, a mouse, a pen, voice recognition and/or biometric mechanisms, or the like. The output device 270 includes one or more conventional mechanisms that output information to the user, such as a display, a printer, a speaker, or the like. The communication interface 280 can include any transceiver-like mechanism that enables the server 120 to communicate with other devices and/or systems. For example, the communication interface 280 can include mechanisms for communicating with another device or system via one or more networks, such as the network(s) 140.
  • In operation, the server 120 can collect and store documents and data associated with documents from one or more data sources 105, manipulate such documents and data, and identify results responsive to client search queries. In one implementation, the server 120 performs these operations in response to the processor 220 executing software instructions contained in a computer-readable medium, such as memory 230. The software instructions can be read into the memory 230 from another computer readable medium, such as a data storage device 250, or from another device via the communication interface 280. The software instructions contained in the memory 230 cause processor 220 to perform processes described in this patent disclosure. Alternatively, hardwired circuitry can be used in place of or in combination with software instructions to implement processes consistent with the disclosure. Thus, implementations are not limited to any specific combination of hardware circuitry and software.
  • Referring now to FIG. 3, a search system 300 is shown according to an illustrative implementation. The search system 300 is similar to the system 100 shown in FIG. 1, and includes one or more data sources 305 and one or more clients 315 in communication with a server 320 via one or more networks 340. Like the system 100 of FIG. 1, the data sources 305 and/or clients 315 may be geographically dispersed and/or local to the server 320, and can communicate with the server 320 directly in alternative to communicating via the one or more networks 340. Further, the one or more data sources 305 and/or one or more clients 315 can be local to the server 320.
  • The one or more data sources 305 can include one or more data repositories, web pages, and the like, that include documents and metadata associated with the documents. According to an implementation, the documents can include structured documents, such as KML files. A KML file is an XML-based data or file format used to display geographic data in a browser, such as ‘Google Earth’, ‘Google Maps’, and ‘Google Maps for mobile’, and utilizes a tag-based structure with nested elements and attributes.
  • KML files can be used to associate descriptive text, models, and images with locations on the earth's surface. Within a KML file, each georeferenced entity is called a placemark, which can be georeferenced to points, areas, or paths. An illustrative KML file 400 that identifies a simple placemark is shown in FIG. 4. The KML file 400 generally includes an XML header, which is the first line in the KML file 400, a KML namespace definition, which is the second line in the KML file 400, and at least one placemark object that includes several elements including a name, a description, and a point. The name can be used as the label for the placemark, the description can appear in a graphical ‘balloon’ attached to the placemark in a browser, and the point can specify the position of the placemark, such as in terms of longitude, latitude, and/or altitude, a street address, or the like. A KML file can contain multiple placemarks, and the placemarks within a KML file may be organized in a hierarchy of enclosing folders. A KML file can also include other descriptive features, and can include descriptive HTML to add links, font sizes, styles, colors, identify text alignment, and the like.
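  • As a minimal sketch of the placemark structure described above, the following Python reads the name, description, and point of each placemark in a KML string. The sample document, the function name `read_placemarks`, and the namespace URI (the OGC KML 2.2 namespace) are assumptions for illustration; the file of FIG. 4 may declare a different namespace and contain different values.

```python
import xml.etree.ElementTree as ET

# Assumed namespace (OGC KML 2.2); the file in FIG. 4 may declare another one.
KML_URI = "http://www.opengis.net/kml/2.2"
NS = {"kml": KML_URI}

# Hypothetical sample: XML header, KML namespace definition, one placemark.
SAMPLE_KML = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>Sample placemark</name>
    <description>Text shown in the balloon.</description>
    <Point>
      <coordinates>-122.0822,37.4222,0</coordinates>
    </Point>
  </Placemark>
</kml>"""


def read_placemarks(kml_text):
    """Return one dict per placemark with its name, description, and point."""
    root = ET.fromstring(kml_text.encode("utf-8"))
    placemarks = []
    for pm in root.iter("{%s}Placemark" % KML_URI):
        coords = pm.findtext("kml:Point/kml:coordinates", default="", namespaces=NS)
        # KML point coordinates are written as "longitude,latitude[,altitude]".
        parts = [float(p) for p in coords.strip().split(",")] if coords.strip() else []
        placemarks.append({
            "name": pm.findtext("kml:name", default="", namespaces=NS),
            "description": pm.findtext("kml:description", default="", namespaces=NS),
            "lon": parts[0] if len(parts) >= 2 else None,
            "lat": parts[1] if len(parts) >= 2 else None,
        })
    return placemarks
```

  • Calling `read_placemarks(SAMPLE_KML)` yields a single dictionary for the sample placemark. Placemarks georeferenced to areas or paths, or by street address, are not handled by this sketch.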
  • Referring again to FIG. 3, structured documents and metadata from the one or more data sources 305 are received by the server 320. The one or more data sources 305 can include public and/or private repositories of data. According to an implementation, documents and metadata are collected by the data collection module 325 as a result of data being pulled from the data sources 305 by the data collection module 325. Data sources may alternatively or additionally push, or transmit, documents and metadata to the data collection module 325, either automatically and/or upon a request by the data collection module 325 for data. For instance, KML files and their associated metadata may be received by the data collection module 325. The metadata associated with a KML file can include, among other data, a page rank that identifies, relative to an arbitrary numbering scheme, the relative rank of the file, the number of links to (or popularity of) the file, the number of downloads of the file, and/or other metadata. Metadata associated with a document can come from a different source than the document itself, such as from a separate document or database.
  • Although the present disclosure will be described in detail hereinafter with reference to KML files and their associated metadata, the present disclosure is operative with any structured data format that may be used to link data, such as location names, descriptive text, images, geographic references, and the like. For instance, the one or more data sources 305 can also transmit GeoRSS files to the server 320, where GeoRSS files contain HTML and typically reference a geographic location. Still other files, such as KMZ files, may be utilized.
  • After the collection of structured documents and related metadata is complete, the data collection module 325 merges the structured documents and metadata and provides an indexed output file to an indexing module 330. The indexing module 330 is operable to parse the indexed output file received from the data collection module 325 to identify one or more data sets, calculate a query independent rank for extracted data sets, and forward record items corresponding to the extracted data sets for storage in a results repository 335 used to respond to user queries. According to an exemplary implementation, KML documents and metadata generated by the data collection module 325 can be fed as an input into the indexing module 330, which can parse the KML files, extract placemarks, calculate a query independent rank for each placemark, and provide each placemark individually for insertion into a search repository 335 used to respond to user search queries.
  • Although each of the data collection module 325, indexing module 330, and results repository 335 are illustrated as internal to the server 320, and may be implemented by software instructions stored within a memory 230, or other components of the illustrative server 120 shown in FIG. 2, one or more of the data collection module 325, indexing module 330, and/or results repository 335 may reside external to the server 320. For instance, one or more of the components 325, 330, 335 may reside in one or more separate servers. The components 325, 330, 335 can also be combined in whole or part in one or more components. Therefore, the block diagram implementation of the illustrative system 300 shown in FIG. 3 is intended to represent various functions of the system 300 without limitation to specific software and/or hardware that can implement the functions described herein.
  • Next, FIG. 5 shows an exemplary functional block diagram flow chart 500 illustrating the collection of documents and metadata from one or more data sources by the data collection module 325. According to an implementation, the data collection module 325 is operable to identify and retrieve structured documents and any metadata associated with such documents, for instance, KML documents and associated metadata.
  • According to an implementation, the data collection module 325 can utilize a web crawl program to identify structured documents existing on the world wide web (‘web’). A web crawl program browses the web, creating a copy of visited pages, and creates an index or table of URLs it encounters. As shown in FIG. 5, the web crawl program is operable to generate URLs 505 associated with, or identifying, the individual pages identified by the web crawl program. The URLs can identify structured documents and/or metadata associated with structured documents.
  • According to an implementation, during the web crawl structured files may be examined and/or converted into a separate file format to enable the contents of the structured files to be examined and/or searched. For example, a structured KML file can have an HTML file associated with it (and identified by a URL) that describes the contents of the KML file, where the HTML file may be relevant for ranking and/or indexing the KML file. As shown in FIG. 5, URLs included within the metadata, including URLs for each structured document, are forwarded to a page rank database 535, which may include additional metadata associated with each structured document.
  • The data collection module 325 can examine the URLs 505 and identify all structured documents of interest, such as all KML documents. Once the entries are identified, the data collection module 325 executes a document fetch 515 to retrieve the structured documents 520 associated with the identified URLs. Additional structured documents and metadata 530 can be collected from other public and/or private data repositories 525. Each data source may include different metadata 530 associated with a particular structured file, such as the number of times the file was downloaded from a particular site, user feedback, or the like. URLs for each structured document collected from other public and/or private data repositories 525 are forwarded to a page rank database 535, which may include additional metadata associated with each structured document.
  • As described above, upon identifying a structured document, the data collection module 325 can attempt to look up a page rank of each discovered structured file. According to an implementation, the data collection module queries a page rank database 535 for identified documents by attempting to fetch a page rank of the URLs corresponding to each structured document discovered in the web crawl or public and/or private data repositories. The page rank may be used by the search system to prioritize results to user queries. According to another implementation, the page rank of URLs is not fetched for identified documents. Still, other metadata may be looked up for a corresponding URL, for instance, from one or more other databases.
  • Metadata keyed by URL 540 and collected from the page rank database 535 is merged 545 with the structured documents and metadata identified from the web crawl and/or public or private data repositories. This data can be passed through one or more de-duplication stages to eliminate duplicate documents. As an illustrative example, two identical files, each downloaded from a respective different URL, may be identified. One of the two identical files may be deleted by a duplication elimination (or de-duplication) stage. As another illustrative example, a URL associated with a particular page rank may be merged with the same URL associated with a document. Merging of metadata and documents may occur by URL and/or merging on the raw document contents.
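  • The merge and de-duplication stages described above might be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the record layout, the function names `merge_by_url` and `drop_duplicate_contents`, the use of a SHA-256 digest to compare raw contents, and the rule of keeping the record with the lexicographically smallest URL are all hypothetical choices.

```python
import hashlib


def merge_by_url(documents, metadata_rows):
    """Join structured documents with metadata rows that share the same URL.

    documents: dict mapping URL -> raw document text.
    metadata_rows: iterable of (URL, dict of metadata fields), possibly
    coming from several sources such as a page rank lookup.
    """
    records = {
        url: {"url": url, "content": content, "metadata": {}}
        for url, content in documents.items()
    }
    for url, fields in metadata_rows:
        if url in records:  # metadata for unknown URLs is ignored in this sketch
            records[url]["metadata"].update(fields)
    return records


def drop_duplicate_contents(records):
    """Keep one record per distinct raw document content."""
    kept = {}
    for url in sorted(records):  # assumed tie-break: smallest URL wins
        digest = hashlib.sha256(records[url]["content"].encode("utf-8")).hexdigest()
        kept.setdefault(digest, records[url])
    return list(kept.values())
```

  • For example, two identical files fetched from different URLs collapse into one record, while a page rank keyed by a URL is attached to the document fetched from that same URL.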
  • According to an implementation, the data collection module 325 can convert all discovered structured documents and metadata into a common format to generate an output file 550. According to an implementation, the output of the data collection module 325 is a single indexed output file 550 in which each record contains a structured document and all of the associated metadata to be used to score the record to identify whether it is an appropriate response to a user search query. The output file can include a table of records indexed by URL, where all of the information associated with each URL is in a record associated with the URL.
  • FIG. 6 shows a sample output file 600 generated from collection of documents and metadata from one or more data sources by the data collection module 325. As illustrated, the output file includes a record associated with a source URL identifying a KML file associated with “Google Offices”. The contents include several placemarks corresponding to different Google offices around the world, including names and coordinates for “headquarters in Mountain View”, “New York City”, and “Tokyo Office”. Metadata associated with the source URL identifies the page rank of the URL, and the number of downloads of the file, for instance, provided by the web site from which it was available. The metadata can also include an anchor, such as a URL, that is associated with each placemark. Although only one source URL is shown in the sample output file 600, the single output file 550 can include two or more records.
  • The output file 550 generated by the data collection module 325 is transmitted to the indexing module 330. The indexing module 330 is operable to extract data sets from the records within the output file while preserving contextual information. For instance, the indexing module 330 is operable to extract placemarks from a single KML file that may include a large number of placemarks, where the extraction preserves contextual text, such as parent folders, referring pages, and the like. As an example, a KML file that includes several placemarks, each associated with a particular hotel within the city of London (which may be a parent folder within which the hotels are identified), may be extracted such that each placemark becomes an individual searchable item associated with the city of London.
  • FIG. 7 shows an exemplary process flow 700 implemented by the indexing module 330 to perform indexing. Whereas the output file 550 generated by the data collection module 325 includes records indexed by URL, the indexing module 330 can transform each record indexed by URL into multiple record items per URL. According to an implementation, indexing of structured documents can be implemented by taking the output file 550, parsing the structured records to extract individual data sets (e.g., placemarks) (block 710), calculating a query independent rank for each data set, and transmitting record items corresponding to the data sets to a searchable repository 335. Although the present disclosure is referenced herein with respect to placemarks within KML files, other data sets within structured documents may be extracted by the indexing module 330.
  • According to an implementation, the indexing module 330 transforms a record indexed by URL into multiple record items per URL, where each record item is indexed by a document ID, which is a number chosen by the indexing module 330 to uniquely identify a record item, such as a placemark. According to an implementation, the document ID can be generated as a hash value from selected fields of a record. For instance, the document ID may be based in part on geo coordinates identifying the location of a placemark.
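  • One way to derive such a document ID is sketched below. The choice of fields (placemark name plus longitude and latitude), the six-decimal rounding, the SHA-1 hash, and the 64-bit truncation are assumptions made for illustration; the disclosure only states that the ID can be a hash of selected fields and may be based in part on geo coordinates.

```python
import hashlib


def document_id(name, lon, lat):
    """Derive a 64-bit integer ID from a placemark's name and coordinates."""
    # Normalise the fields so trivially different spellings hash identically.
    key = "%s|%.6f|%.6f" % (name.strip().lower(), lon, lat)
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    # Truncate the digest to 8 bytes to obtain a compact numeric ID.
    return int.from_bytes(digest[:8], "big")
```

  • Because the ID is a pure function of the selected fields, re-indexing the same placemark yields the same ID. Truncating the hash makes collisions between distinct placemarks possible in principle, so a production system would need a policy for that case.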
  • During extraction, the indexing module 330 associates the contextual information from the original record with each data set. For instance, each placemark extracted from a KML file will preserve its context information, including the URL and other metadata of the corresponding structured file and the name of each enclosing folder, in addition to the placemark's descriptive text and other data, such as georeference data. Additionally, for each placemark the indexing module 330 can calculate a query independent score based on the available metadata. This may utilize one or more rankings from other databases (not illustrated).
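  • The context-preserving extraction described above can be sketched as a recursive walk over the KML tree that carries the names of enclosing folders down to each placemark. The sample document, the function name `extract_record_items`, the record item layout, and the namespace URI (OGC KML 2.2) are assumptions for illustration, and score calculation is omitted.

```python
import xml.etree.ElementTree as ET

KML = "{http://www.opengis.net/kml/2.2}"  # assumed namespace

# Hypothetical record content: hotels inside nested folders.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Folder>
      <name>London</name>
      <Folder>
        <name>Hotels</name>
        <Placemark>
          <name>Hotel A</name>
          <Point><coordinates>-0.1276,51.5072,0</coordinates></Point>
        </Placemark>
        <Placemark>
          <name>Hotel B</name>
          <Point><coordinates>-0.1410,51.5010,0</coordinates></Point>
        </Placemark>
      </Folder>
    </Folder>
  </Document>
</kml>"""


def extract_record_items(url, metadata, kml_text):
    """Turn one URL-indexed record into one record item per placemark."""
    root = ET.fromstring(kml_text.encode("utf-8"))
    items = []

    def walk(element, folders):
        for child in element:
            if child.tag == KML + "Folder":
                # Extend the folder path for everything nested in this folder.
                walk(child, folders + [child.findtext(KML + "name", default="")])
            elif child.tag == KML + "Placemark":
                items.append({
                    "url": url,                  # context: source document
                    "metadata": dict(metadata),  # context: document metadata
                    "folders": list(folders),    # context: enclosing folders
                    "name": child.findtext(KML + "name", default=""),
                    "coordinates": child.findtext(
                        KML + "Point/" + KML + "coordinates", default="").strip(),
                })
            else:
                walk(child, folders)  # e.g. Document containers

    walk(root, [])
    return items
```

  • With the sample above, each hotel becomes its own record item whose `folders` field is `["London", "Hotels"]`, so a query mentioning London can still match an individual hotel placemark.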
  • The indexing module 330 is further operable to eliminate duplicate record items (block 715) based on like metadata. For instance, duplicate placemarks may be eliminated based on comparison of the fingerprint of the location and placemark name. The placemark with the highest score based on the available metadata may be retained.
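  • A minimal sketch of this de-duplication step follows. The fingerprint used here (lower-cased name plus coordinates rounded to four decimal places) and the `score` field are assumptions; the disclosure does not specify how the fingerprint of the location and placemark name is computed.

```python
def fingerprint(item, precision=4):
    """Assumed fingerprint: normalised name plus rounded coordinates."""
    return (
        item["name"].strip().lower(),
        round(item["lon"], precision),
        round(item["lat"], precision),
    )


def eliminate_duplicates(items):
    """Keep, for each fingerprint, only the record item with the highest score."""
    best = {}
    for item in items:
        key = fingerprint(item)
        if key not in best or item["score"] > best[key]["score"]:
            best[key] = item
    return list(best.values())
```

  • Rounding is a simple but imperfect notion of "same location": two nearly identical points that fall on opposite sides of a rounding boundary would not be merged by this sketch.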
  • The indexing module 330 is optionally operable to cluster data sets within structured documents into a compound search result where the data sets are related. For instance, subsequent to extracting data sets from one or more records indexed by URLs, one or more data sets may be combined, or clustered, into a single compound search result where they refer to the same physical entity. This may be useful to improve the diversity of results. For instance, a user search query with the term ‘statue’ for New York City may return placemarks having the highest score, which may all be placemarks identifying the Statue of Liberty. Clustering all results for the Statue of Liberty will permit a search result that provides one compound result for the Statue of Liberty such that other statue results can also be provided to a user.
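  • The effect of clustering on result diversity can be illustrated with the sketch below. How two record items are judged to refer to the same physical entity is not specified in the disclosure, so the caller supplies a hypothetical `entity_key` function; the compound result layout is likewise an assumption.

```python
from collections import defaultdict


def cluster(items, entity_key):
    """Group record items by entity and return compound results, best first."""
    groups = defaultdict(list)
    for item in items:
        groups[entity_key(item)].append(item)

    compound = []
    for key, members in groups.items():
        ranked = sorted(members, key=lambda m: m["score"], reverse=True)
        compound.append({"entity": key, "best": ranked[0], "members": ranked})

    # Order compound results by the score of their best member.
    compound.sort(key=lambda c: c["best"]["score"], reverse=True)
    return compound
```

  • With an `entity_key` that maps every Statue of Liberty placemark to the same key, a query for ‘statue’ returns a single compound entry for the Statue of Liberty followed by other statues, rather than a page of near-identical placemarks.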
  • If all items associated with a certain entity are clustered, however, this can reduce the ability to identify a particular record entry. Therefore, a serving module can perform dynamic clustering that is based at least in part on the user's search query. For instance, continuing with the Statue of Liberty example, a search for ‘Statue of Liberty’ and ‘Tours’ may result in clustering based on a user search term in addition to static terms included within records indexed by URLs.
  • The record items identified by the indexing module 330 are listed individually and provided to the searchable repository for use in responding to user queries (blocks 720, 730). An illustrative example of an indexing module 330 output 800 that corresponds to the illustrative output file 600 of FIG. 6 is shown in FIG. 8. The placemarks identified in FIG. 6 as existing within a single record entry have been extracted into separate record items 805, 810, 815. This permits a user to search for content associated with individual record items extracted from a structured document.
  • FIG. 9 shows a process of structured document collection and extraction, according to an implementation. Data is collected from one or more data sources (block 905), where the data can include one or more structured documents and metadata associated therewith. According to an implementation, the data collection can be effected by the data collection module 325, which can collect data from the one or more data sources 305. From the collected data, structured documents and associated metadata are identified (block 910). Next, structured documents and metadata associated therewith may be merged, for instance, by URL. Duplicate entries may also be merged based on other keys, such as based on the document contents (block 920). An output file is then generated (block 930). According to an implementation, the identification of structured data, the merging of structured documents and metadata, and/or the generation of an output file can also be performed by the data collection module 325.
  • Record items are then created from each record indexed within the output file by extracting data sets from each output file record. For instance, where the output file record includes a KML file indexed by URL, record items can be created that correspond to extracted placemarks within the KML file (block 940). According to an implementation, the extraction of data sets and generation of record items can be performed by the indexing module 330. After extraction is complete, the record items (or table of record items) are added to a searchable database (block 950).
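  • The final step, adding record items to a searchable database, might look like the following sketch, which uses an in-memory SQLite table and answers a query that combines a keyword with a geographic bounding box. The table schema, the function names, and the use of SQLite are assumptions for illustration; the disclosure does not name a storage technology.

```python
import sqlite3


def build_database(record_items):
    """Load record items (dicts) into an in-memory SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE record_items ("
        "doc_id INTEGER PRIMARY KEY, name TEXT, description TEXT, "
        "lon REAL, lat REAL, score REAL)"
    )
    db.executemany(
        "INSERT INTO record_items VALUES "
        "(:doc_id, :name, :description, :lon, :lat, :score)",
        record_items,
    )
    return db


def search(db, keyword, west, south, east, north):
    """Return names of items matching the keyword inside the bounding box."""
    pattern = "%" + keyword.lower() + "%"
    rows = db.execute(
        "SELECT name FROM record_items "
        "WHERE lon BETWEEN ? AND ? AND lat BETWEEN ? AND ? "
        "AND (lower(name) LIKE ? OR lower(description) LIKE ?) "
        "ORDER BY score DESC",
        (west, east, south, north, pattern, pattern),
    ).fetchall()
    return [row[0] for row in rows]
```

  • Results are ordered by the query independent score only. The simple `BETWEEN` test does not handle bounding boxes that cross the antimeridian, and a real system would likely use a spatial index rather than a table scan.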
  • Many modifications and other implementations will come to mind to one skilled in the art having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the disclosure is not limited to the specific implementations disclosed and that modifications and other implementations are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims (28)

1. A method, comprising:
identifying data from one or more data sources, wherein the data is associated with at least one structured document;
extracting one or more data sets contained within the at least one structured document; and
adding one or more record items to a searchable database, wherein the one or more record items correspond to the one or more extracted data sets.
2. The method of claim 1, further comprising retrieving the data from the one or more data sources.
3. The method of claim 1, wherein at least one of the one or more data sources comprises one or more uniform resource locators.
4. The method of claim 1, wherein the data is at least one structured document.
5. The method of claim 1, wherein the data further comprises metadata.
6. The method of claim 1, further comprising identifying metadata associated with the data subsequent to identifying the data from the one or more data sources.
7. The method of claim 6, wherein the metadata comprises at least one page rank.
8. The method of claim 1, further comprising generating an output file, wherein the output file includes data associated with two or more structured documents.
9. The method of claim 8, wherein extracting one or more data sets comprises extracting one or more data sets from the output file.
10. The method of claim 1, wherein the at least one structured document comprises two or more structured documents, and further comprising merging the two or more structured documents.
11. The method of claim 1, wherein the at least one structured document comprises a Keyhole Markup Language (KML) document.
12. The method of claim 11, wherein the one or more data sets comprise at least one placemark.
13. The method of claim 1, further comprising:
receiving at least one search query; and
identifying at least one of the one or more record items responsive to receiving the at least one search query.
14. The method of claim 1, wherein extracting one or more data sets contained within the at least one structured document further comprises:
associating the one or more data sets with contextual information associated with the at least one structured document.
15. A system, comprising:
means for identifying data from one or more data sources, wherein the data is associated with at least one structured document;
means for extracting one or more data sets contained within the at least one structured document; and
means for adding one or more record items to a searchable database, wherein the one or more record items correspond to the one or more extracted data sets.
16. The system of claim 15, further comprising means for retrieving the data from the one or more data sources.
17. The system of claim 15, wherein at least one of the one or more data sources comprises one or more uniform resource locators.
18. The system of claim 15, wherein the data is at least one structured document.
19. The system of claim 15, wherein the data further comprises metadata.
20. The system of claim 15, further comprising means for identifying metadata associated with the data.
21. The system of claim 20, wherein the metadata comprises at least one page rank.
22. The system of claim 15, further comprising means for generating an output file, wherein the output file includes data associated with two or more structured documents.
23. The system of claim 22, wherein the means for extracting one or more data sets comprises means for extracting one or more data sets from the output file.
24. The system of claim 15, wherein the at least one structured document comprises two or more structured documents, and further comprising means for merging the two or more structured documents.
25. The system of claim 15, wherein the at least one structured document comprises a Keyhole Markup Language (KML) document.
26. The system of claim 25, wherein the one or more data sets comprise at least one placemark.
27. The system of claim 15, further comprising:
means for receiving at least one search query; and
means for identifying at least one of the one or more record items responsive to receiving the at least one search query.
28. The system of claim 15, wherein the means for extracting one or more data sets contained within the at least one structured document further comprises:
means for associating the one or more data sets with contextual information associated with the at least one structured document.
US11/671,306 2007-02-05 2007-02-05 Searching structured geographical data Active 2027-10-04 US7836085B2 (en)

Priority Applications (10)

Application Number Priority Date Filing Date Title
US11/671,306 US7836085B2 (en) 2007-02-05 2007-02-05 Searching structured geographical data
KR1020097017280A KR101450358B1 (en) 2007-02-05 2008-02-04 Searching structured geographical data
PCT/US2008/052945 WO2008097921A1 (en) 2007-02-05 2008-02-04 Searching structured geographical data
CN200880010447XA CN101647020B (en) 2007-02-05 2008-02-04 Searching structured geographical data
EP08728954.2A EP2118779A4 (en) 2007-02-05 2008-02-04 Searching structured geographical data
CA2677307A CA2677307C (en) 2007-02-05 2008-02-04 Searching structured geographical data
BRPI0807172-1A2A BRPI0807172A2 (en) 2007-02-05 2008-02-04 SEARCHING STRUCTURED GEOGRAPHICAL DATA
AU2008213993A AU2008213993A1 (en) 2007-02-05 2008-02-04 Searching structured geographical data
JP2009548491A JP5336391B2 (en) 2007-02-05 2008-02-04 Search for structured geographic data
US12/945,600 US8200704B2 (en) 2007-02-05 2010-11-12 Searching structured data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/671,306 US7836085B2 (en) 2007-02-05 2007-02-05 Searching structured geographical data

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/945,600 Continuation US8200704B2 (en) 2007-02-05 2010-11-12 Searching structured data

Publications (2)

Publication Number Publication Date
US20080189249A1 true US20080189249A1 (en) 2008-08-07
US7836085B2 US7836085B2 (en) 2010-11-16

Family

ID=39677008

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/671,306 Active 2027-10-04 US7836085B2 (en) 2007-02-05 2007-02-05 Searching structured geographical data
US12/945,600 Expired - Fee Related US8200704B2 (en) 2007-02-05 2010-11-12 Searching structured data

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/945,600 Expired - Fee Related US8200704B2 (en) 2007-02-05 2010-11-12 Searching structured data

Country Status (9)

Country Link
US (2) US7836085B2 (en)
EP (1) EP2118779A4 (en)
JP (1) JP5336391B2 (en)
KR (1) KR101450358B1 (en)
CN (1) CN101647020B (en)
AU (1) AU2008213993A1 (en)
BR (1) BRPI0807172A2 (en)
CA (1) CA2677307C (en)
WO (1) WO2008097921A1 (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080195943A1 (en) * 2007-02-12 2008-08-14 Spinlet Oy Distribution system for data items
US20080235163A1 (en) * 2007-03-22 2008-09-25 Srinivasan Balasubramanian System and method for online duplicate detection and elimination in a web crawler
US20090112812A1 (en) * 2007-10-29 2009-04-30 Ellis John R Spatially enabled content management, discovery and distribution system for unstructured information management
US20090198767A1 (en) * 2008-02-01 2009-08-06 Gabriel Jakobson Method and system for associating content with map zoom function
US20090237408A1 (en) * 2008-03-18 2009-09-24 Nielsen Steven E Virtual white lines for delimiting planned excavation sites
US20090238417A1 (en) * 2008-03-18 2009-09-24 Nielsen Steven E Virtual white lines for indicating planned excavation sites on electronic images
US20100095231A1 (en) * 2008-10-13 2010-04-15 Yahoo! Inc. Method and system for providing customized regional maps
US20100146436A1 (en) * 2008-02-01 2010-06-10 Gabriel Jakobson Displaying content associated with electronic mapping systems
US20100205554A1 (en) * 2009-02-11 2010-08-12 Certusview Technologies, Llc Virtual white lines (vwl) application for indicating an area of planned excavation
US20100205195A1 (en) * 2009-02-11 2010-08-12 Certusview Technologies, Llc Methods and apparatus for associating a virtual white line (vwl) image with corresponding ticket information for an excavation project
US20100324967A1 (en) * 2009-02-11 2010-12-23 Certusview Technologies, Llc Management system, and associated methods and apparatus, for dispatching tickets, receiving field information, and performing a quality assessment for underground facility locate and/or marking operations
WO2012004450A2 (en) * 2010-07-09 2012-01-12 Nokia Corporation Method and apparatus for aggregating and linking place data
US8274506B1 (en) * 2008-04-28 2012-09-25 Adobe Systems Incorporated System and methods for creating a three-dimensional view of a two-dimensional map
US8458232B1 (en) * 2009-03-31 2013-06-04 Symantec Corporation Systems and methods for identifying data files based on community data
US20130191385A1 (en) * 2007-03-14 2013-07-25 David J. Vespe Geopoint Janitor
US8584013B1 (en) * 2007-03-20 2013-11-12 Google Inc. Temporal layers for presenting personalization markers on imagery
US8781815B1 (en) * 2013-12-05 2014-07-15 Seal Software Ltd. Non-standard and standard clause detection
US9037599B1 (en) * 2007-05-29 2015-05-19 Google Inc. Registering photos in a geographic information system, and applications thereof
US20150278211A1 (en) * 2014-03-31 2015-10-01 Microsoft Corporation Using geographic familiarity to generate search results
US20150324407A1 (en) * 2012-03-29 2015-11-12 Isogeo Method for indexing geographical data
US20160026620A1 (en) * 2014-07-24 2016-01-28 Seal Software Ltd. Advanced clause groupings detection
US9805025B2 (en) 2015-07-13 2017-10-31 Seal Software Limited Standard exact clause detection
EP3367267A1 (en) * 2017-02-23 2018-08-29 Innoplexus AG System and method for creating entity records using existing data sources
US11056244B2 (en) * 2017-12-28 2021-07-06 Cilag Gmbh International Automated data scaling, alignment, and organizing based on predefined parameters within surgical networks

Families Citing this family (128)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101340036B1 (en) * 2007-07-10 2013-12-10 삼성전자주식회사 Method for generating Electronic Content Guide and apparatus therefor
GB2458309A (en) * 2008-03-13 2009-09-16 Business Partners Ltd Search engine
US8782564B2 (en) * 2008-03-21 2014-07-15 Trimble Navigation Limited Method for collaborative display of geographic data
US8898179B2 (en) * 2008-03-21 2014-11-25 Trimble Navigation Limited Method for extracting attribute data from a media file
US8965894B2 (en) * 2010-12-21 2015-02-24 Tata Consultancy Services Limited Automated web page classification
US20130067346A1 (en) * 2011-09-09 2013-03-14 Microsoft Corporation Content User Experience
US11871901B2 (en) 2012-05-20 2024-01-16 Cilag Gmbh International Method for situational awareness for surgical network or surgical network connected device capable of adjusting function based on a sensed situation or usage
US9146981B2 (en) * 2012-07-06 2015-09-29 International Business Machines Corporation Automated electronic discovery collections and preservations
US9053085B2 (en) * 2012-12-10 2015-06-09 International Business Machines Corporation Electronic document source ingestion for natural language processing systems
US8925099B1 (en) 2013-03-14 2014-12-30 Reputation.Com, Inc. Privacy scoring
US10324733B2 (en) 2014-07-30 2019-06-18 Microsoft Technology Licensing, Llc Shutdown notifications
US9787576B2 (en) 2014-07-31 2017-10-10 Microsoft Technology Licensing, Llc Propagating routing awareness for autonomous networks
US10592080B2 (en) 2014-07-31 2020-03-17 Microsoft Technology Licensing, Llc Assisted presentation of application windows
US10254942B2 (en) 2014-07-31 2019-04-09 Microsoft Technology Licensing, Llc Adaptive sizing and positioning of application windows
US9836464B2 (en) 2014-07-31 2017-12-05 Microsoft Technology Licensing, Llc Curating media from social connections
US10678412B2 (en) 2014-07-31 2020-06-09 Microsoft Technology Licensing, Llc Dynamic joint dividers for application windows
US11504192B2 (en) 2014-10-30 2022-11-22 Cilag Gmbh International Method of hub communication with surgical instrument systems
US11086216B2 (en) 2015-02-09 2021-08-10 Microsoft Technology Licensing, Llc Generating electronic components
US9827209B2 (en) 2015-02-09 2017-11-28 Microsoft Technology Licensing, Llc Display system
US10018844B2 (en) 2015-02-09 2018-07-10 Microsoft Technology Licensing, Llc Wearable image display system
US11291510B2 (en) 2017-10-30 2022-04-05 Cilag Gmbh International Method of hub communication with surgical instrument systems
US11317919B2 (en) 2017-10-30 2022-05-03 Cilag Gmbh International Clip applier comprising a clip crimping system
US11229436B2 (en) 2017-10-30 2022-01-25 Cilag Gmbh International Surgical system comprising a surgical tool and a surgical hub
US11026687B2 (en) 2017-10-30 2021-06-08 Cilag Gmbh International Clip applier comprising clip advancing systems
US11801098B2 (en) 2017-10-30 2023-10-31 Cilag Gmbh International Method of hub communication with surgical instrument systems
US11564756B2 (en) 2017-10-30 2023-01-31 Cilag Gmbh International Method of hub communication with surgical instrument systems
US11510741B2 (en) 2017-10-30 2022-11-29 Cilag Gmbh International Method for producing a surgical instrument comprising a smart electrical system
US11129636B2 (en) 2017-10-30 2021-09-28 Cilag Gmbh International Surgical instruments comprising an articulation drive that provides for high articulation angles
US11311342B2 (en) 2017-10-30 2022-04-26 Cilag Gmbh International Method for communicating with surgical instrument systems
US11911045B2 (en) 2017-10-30 2024-02-27 Cllag GmbH International Method for operating a powered articulating multi-clip applier
US11589888B2 (en) 2017-12-28 2023-02-28 Cilag Gmbh International Method for controlling smart energy devices
US10987178B2 (en) 2017-12-28 2021-04-27 Ethicon Llc Surgical hub control arrangements
US11147607B2 (en) 2017-12-28 2021-10-19 Cilag Gmbh International Bipolar combination device that automatically adjusts pressure based on energy modality
US11096693B2 (en) 2017-12-28 2021-08-24 Cilag Gmbh International Adjustment of staple height of at least one row of staples based on the sensed tissue thickness or force in closing
US10943454B2 (en) 2017-12-28 2021-03-09 Ethicon Llc Detection and escalation of security responses of surgical instruments to increasing severity threats
US11633237B2 (en) 2017-12-28 2023-04-25 Cilag Gmbh International Usage and technique analysis of surgeon / staff performance against a baseline to optimize device utilization and performance for both current and future procedures
US11257589B2 (en) 2017-12-28 2022-02-22 Cilag Gmbh International Real-time analysis of comprehensive cost of all instrumentation used in surgery utilizing data fluidity to track instruments through stocking and in-house processes
US11304720B2 (en) 2017-12-28 2022-04-19 Cilag Gmbh International Activation of energy devices
US20190201039A1 (en) 2017-12-28 2019-07-04 Ethicon Llc Situational awareness of electrosurgical systems
US11202570B2 (en) 2017-12-28 2021-12-21 Cilag Gmbh International Communication hub and storage device for storing parameters and status of a surgical device to be shared with cloud based analytics systems
US11308075B2 (en) 2017-12-28 2022-04-19 Cilag Gmbh International Surgical network, instrument, and cloud responses based on validation of received dataset and authentication of its source and integrity
US11304745B2 (en) 2017-12-28 2022-04-19 Cilag Gmbh International Surgical evacuation sensing and display
US11832899B2 (en) 2017-12-28 2023-12-05 Cilag Gmbh International Surgical systems with autonomously adjustable control programs
US11389164B2 (en) 2017-12-28 2022-07-19 Cilag Gmbh International Method of using reinforced flexible circuits with multiple sensors to optimize performance of radio frequency devices
US11109866B2 (en) 2017-12-28 2021-09-07 Cilag Gmbh International Method for circular stapler control algorithm adjustment based on situational awareness
US11311306B2 (en) 2017-12-28 2022-04-26 Cilag Gmbh International Surgical systems for detecting end effector tissue distribution irregularities
US10758310B2 (en) 2017-12-28 2020-09-01 Ethicon Llc Wireless pairing of a surgical device with another device within a sterile surgical field based on the usage and situational awareness of devices
US11304763B2 (en) 2017-12-28 2022-04-19 Cilag Gmbh International Image capturing of the areas outside the abdomen to improve placement and control of a surgical device in use
US11132462B2 (en) 2017-12-28 2021-09-28 Cilag Gmbh International Data stripping method to interrogate patient records and create anonymized record
US11896322B2 (en) 2017-12-28 2024-02-13 Cilag Gmbh International Sensing the patient position and contact utilizing the mono-polar return pad electrode to provide situational awareness to the hub
US11410259B2 (en) 2017-12-28 2022-08-09 Cilag Gmbh International Adaptive control program updates for surgical devices
US11576677B2 (en) 2017-12-28 2023-02-14 Cilag Gmbh International Method of hub communication, processing, display, and cloud analytics
US11051876B2 (en) 2017-12-28 2021-07-06 Cilag Gmbh International Surgical evacuation flow paths
US11423007B2 (en) 2017-12-28 2022-08-23 Cilag Gmbh International Adjustment of device control programs based on stratified contextual data in addition to the data
US20190201146A1 (en) 2017-12-28 2019-07-04 Ethicon Llc Safety systems for smart powered surgical stapling
US11857152B2 (en) 2017-12-28 2024-01-02 Cilag Gmbh International Surgical hub spatial awareness to determine devices in operating theater
US11744604B2 (en) 2017-12-28 2023-09-05 Cilag Gmbh International Surgical instrument with a hardware-only control circuit
US11464559B2 (en) 2017-12-28 2022-10-11 Cilag Gmbh International Estimating state of ultrasonic end effector and control system therefor
US11069012B2 (en) 2017-12-28 2021-07-20 Cilag Gmbh International Interactive surgical systems with condition handling of devices and data capabilities
US11179208B2 (en) 2017-12-28 2021-11-23 Cilag Gmbh International Cloud-based medical analytics for security and authentication trends and reactive measures
US11266468B2 (en) 2017-12-28 2022-03-08 Cilag Gmbh International Cooperative utilization of data derived from secondary sources by intelligent surgical hubs
US11291495B2 (en) 2017-12-28 2022-04-05 Cilag Gmbh International Interruption of energy due to inadvertent capacitive coupling
US11612444B2 (en) 2017-12-28 2023-03-28 Cilag Gmbh International Adjustment of a surgical device function based on situational awareness
US11818052B2 (en) 2017-12-28 2023-11-14 Cilag Gmbh International Surgical network determination of prioritization of communication, interaction, or processing based on system or device needs
US11937769B2 (en) 2017-12-28 2024-03-26 Cilag Gmbh International Method of hub communication, processing, storage and display
US11559308B2 (en) 2017-12-28 2023-01-24 Cilag Gmbh International Method for smart energy device infrastructure
US11273001B2 (en) 2017-12-28 2022-03-15 Cilag Gmbh International Surgical hub and modular device response adjustment based on situational awareness
US11253315B2 (en) 2017-12-28 2022-02-22 Cilag Gmbh International Increasing radio frequency to create pad-less monopolar loop
US11786251B2 (en) 2017-12-28 2023-10-17 Cilag Gmbh International Method for adaptive control schemes for surgical network control and interaction
US20190200981A1 (en) 2017-12-28 2019-07-04 Ethicon Llc Method of compressing tissue within a stapling device and simultaneously displaying the location of the tissue within the jaws
US11317937B2 (en) 2018-03-08 2022-05-03 Cilag Gmbh International Determining the state of an ultrasonic end effector
US11571234B2 (en) 2017-12-28 2023-02-07 Cilag Gmbh International Temperature control of ultrasonic end effector and control system therefor
US11419667B2 (en) 2017-12-28 2022-08-23 Cilag Gmbh International Ultrasonic energy device which varies pressure applied by clamp arm to provide threshold control pressure at a cut progression location
US11903601B2 (en) 2017-12-28 2024-02-20 Cilag Gmbh International Surgical instrument comprising a plurality of drive systems
US11540855B2 (en) 2017-12-28 2023-01-03 Cilag Gmbh International Controlling activation of an ultrasonic surgical instrument according to the presence of tissue
US11464535B2 (en) 2017-12-28 2022-10-11 Cilag Gmbh International Detection of end effector emersion in liquid
US11304699B2 (en) 2017-12-28 2022-04-19 Cilag Gmbh International Method for adaptive control schemes for surgical network control and interaction
US11166772B2 (en) 2017-12-28 2021-11-09 Cilag Gmbh International Surgical hub coordination of control and communication of operating room devices
US11666331B2 (en) 2017-12-28 2023-06-06 Cilag Gmbh International Systems for detecting proximity of surgical end effector to cancerous tissue
US11076921B2 (en) 2017-12-28 2021-08-03 Cilag Gmbh International Adaptive control program updates for surgical hubs
US11424027B2 (en) 2017-12-28 2022-08-23 Cilag Gmbh International Method for operating surgical instrument systems
US11234756B2 (en) 2017-12-28 2022-02-01 Cilag Gmbh International Powered surgical tool with predefined adjustable control algorithm for controlling end effector parameter
US10966791B2 (en) 2017-12-28 2021-04-06 Ethicon Llc Cloud-based medical analytics for medical facility segmented individualization of instrument function
US11364075B2 (en) 2017-12-28 2022-06-21 Cilag Gmbh International Radio frequency energy device for delivering combined electrical signals
US11045591B2 (en) 2017-12-28 2021-06-29 Cilag Gmbh International Dual in-series large and small droplet filters
US20190201139A1 (en) 2017-12-28 2019-07-04 Ethicon Llc Communication arrangements for robot-assisted surgical platforms
US11559307B2 (en) 2017-12-28 2023-01-24 Cilag Gmbh International Method of robotic hub communication, detection, and control
US11896443B2 (en) 2017-12-28 2024-02-13 Cilag Gmbh International Control of a surgical system through a surgical barrier
US11432885B2 (en) 2017-12-28 2022-09-06 Cilag Gmbh International Sensing arrangements for robot-assisted surgical platforms
US11864728B2 (en) 2017-12-28 2024-01-09 Cilag Gmbh International Characterization of tissue irregularities through the use of mono-chromatic light refractivity
US10892995B2 (en) 2017-12-28 2021-01-12 Ethicon Llc Surgical network determination of prioritization of communication, interaction, or processing based on system or device needs
US11832840B2 (en) 2017-12-28 2023-12-05 Cilag Gmbh International Surgical instrument having a flexible circuit
US11324557B2 (en) 2017-12-28 2022-05-10 Cilag Gmbh International Surgical instrument with a sensing array
US11419630B2 (en) 2017-12-28 2022-08-23 Cilag Gmbh International Surgical system distributed processing
US11659023B2 (en) 2017-12-28 2023-05-23 Cilag Gmbh International Method of hub communication
US11529187B2 (en) 2017-12-28 2022-12-20 Cilag Gmbh International Surgical evacuation sensor arrangements
US11786245B2 (en) 2017-12-28 2023-10-17 Cilag Gmbh International Surgical systems with prioritized data transmission capabilities
US11284936B2 (en) 2017-12-28 2022-03-29 Cilag Gmbh International Surgical instrument having a flexible electrode
US11026751B2 (en) 2017-12-28 2021-06-08 Cilag Gmbh International Display of alignment of staple cartridge to prior linear staple line
US11179175B2 (en) 2017-12-28 2021-11-23 Cilag Gmbh International Controlling an ultrasonic surgical instrument according to tissue location
US11446052B2 (en) 2017-12-28 2022-09-20 Cilag Gmbh International Variation of radio frequency and ultrasonic power level in cooperation with varying clamp arm pressure to achieve predefined heat flux or power applied to tissue
US11602393B2 (en) 2017-12-28 2023-03-14 Cilag Gmbh International Surgical evacuation sensing and generator control
US20190201118A1 (en) 2017-12-28 2019-07-04 Ethicon Llc Display arrangements for robot-assisted surgical platforms
US11376002B2 (en) 2017-12-28 2022-07-05 Cilag Gmbh International Surgical instrument cartridge sensor assemblies
US11678881B2 (en) 2017-12-28 2023-06-20 Cilag Gmbh International Spatial awareness of surgical hubs in operating rooms
US11278281B2 (en) 2017-12-28 2022-03-22 Cilag Gmbh International Interactive surgical system
US11100631B2 (en) 2017-12-28 2021-08-24 Cilag Gmbh International Use of laser light and red-green-blue coloration to determine properties of back scattered light
US11160605B2 (en) 2017-12-28 2021-11-02 Cilag Gmbh International Surgical evacuation sensing and motor control
US11259830B2 (en) 2018-03-08 2022-03-01 Cilag Gmbh International Methods for controlling temperature in ultrasonic device
US11389188B2 (en) 2018-03-08 2022-07-19 Cilag Gmbh International Start temperature of blade
US11589915B2 (en) 2018-03-08 2023-02-28 Cilag Gmbh International In-the-jaw classifier based on a model
US11213294B2 (en) 2018-03-28 2022-01-04 Cilag Gmbh International Surgical instrument comprising co-operating lockout features
US11096688B2 (en) 2018-03-28 2021-08-24 Cilag Gmbh International Rotary driven firing members with different anvil and channel engagement features
US11090047B2 (en) 2018-03-28 2021-08-17 Cilag Gmbh International Surgical instrument comprising an adaptive control system
US11259806B2 (en) 2018-03-28 2022-03-01 Cilag Gmbh International Surgical stapling devices with features for blocking advancement of a camming assembly of an incompatible cartridge installed therein
US11471156B2 (en) 2018-03-28 2022-10-18 Cilag Gmbh International Surgical stapling devices with improved rotary driven closure systems
US11207067B2 (en) 2018-03-28 2021-12-28 Cilag Gmbh International Surgical stapling device with separate rotary driven closure and firing systems and firing member that engages both jaws while firing
US11219453B2 (en) 2018-03-28 2022-01-11 Cilag Gmbh International Surgical stapling devices with cartridge compatible closure and firing lockout arrangements
US11278280B2 (en) 2018-03-28 2022-03-22 Cilag Gmbh International Surgical instrument comprising a jaw closure lockout
US10973520B2 (en) 2018-03-28 2021-04-13 Ethicon Llc Surgical staple cartridge with firing member driven camming assembly that has an onboard tissue cutting feature
US11464511B2 (en) 2019-02-19 2022-10-11 Cilag Gmbh International Surgical staple cartridges with movable authentication key arrangements
US11369377B2 (en) 2019-02-19 2022-06-28 Cilag Gmbh International Surgical stapling assembly with cartridge based retainer configured to unlock a firing lockout
US11291444B2 (en) 2019-02-19 2022-04-05 Cilag Gmbh International Surgical stapling assembly with cartridge based retainer configured to unlock a closure lockout
US11317915B2 (en) 2019-02-19 2022-05-03 Cilag Gmbh International Universal cartridge based key feature that unlocks multiple lockout arrangements in different surgical staplers
US11357503B2 (en) 2019-02-19 2022-06-14 Cilag Gmbh International Staple cartridge retainers with frangible retention features and methods of using same
USD950728S1 (en) 2019-06-25 2022-05-03 Cilag Gmbh International Surgical staple cartridge
USD952144S1 (en) 2019-06-25 2022-05-17 Cilag Gmbh International Surgical staple cartridge retainer with firing system authentication key
USD964564S1 (en) 2019-06-25 2022-09-20 Cilag Gmbh International Surgical staple cartridge retainer with a closure system authentication key

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6366934B1 (en) * 1998-10-08 2002-04-02 International Business Machines Corporation Method and apparatus for querying structured documents using a database extender
US6424980B1 (en) * 1998-06-10 2002-07-23 Nippon Telegraph And Telephone Corporation Integrated retrieval scheme for retrieving semi-structured documents
US20030212675A1 (en) * 2002-05-08 2003-11-13 International Business Machines Corporation Knowledge-based data mining system
US6950815B2 (en) * 2002-04-23 2005-09-27 International Business Machines Corporation Content management system and methodology featuring query conversion capability for efficient searching
US7111000B2 (en) * 2003-01-06 2006-09-19 Microsoft Corporation Retrieval of structured documents
US20070078850A1 (en) * 2005-10-03 2007-04-05 Microsoft Corporation Commerical web data extraction system
US20070203891A1 (en) * 2006-02-28 2007-08-30 Microsoft Corporation Providing and using search index enabling searching based on a targeted content of documents
US20070276845A1 (en) * 2006-05-12 2007-11-29 Tele Atlas North America, Inc. Locality indexes and method for indexing localities
US20080228675A1 (en) * 2006-10-13 2008-09-18 Move, Inc. Multi-tiered cascading crawling system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2604100A (en) * 1999-01-08 2000-07-24 Micro-Integration Corporation Search engine database and interface
JP2000322420A (en) * 1999-05-07 2000-11-24 Hitachi Ltd Method for registering and retrieving spatial data
CN1451126A (en) 1999-09-15 2003-10-22 西门子共同研究公司 Method and system for selecting and automatically updating arbitrary elements from structured documents
US6480837B1 (en) * 1999-12-16 2002-11-12 International Business Machines Corporation Method, system, and program for ordering search results using a popularity weighting
WO2001065410A2 (en) 2000-02-28 2001-09-07 Geocontent, Inc. Search engine for spatial data indexing
US20050010494A1 (en) * 2000-03-21 2005-01-13 Pricegrabber.Com Method and apparatus for Internet e-commerce shopping guide
US7085736B2 (en) 2001-02-27 2006-08-01 Alexa Internet Rules-based identification of items represented on web pages
JP4199671B2 (en) * 2002-03-15 2008-12-17 富士通株式会社 Regional information retrieval method and regional information retrieval apparatus
JP2003296341A (en) * 2002-04-03 2003-10-17 Nissan Motor Co Ltd Database generation method, database generation program, data structure, database generation system, retrieval system and retrieval method
JP2004234288A (en) * 2003-01-30 2004-08-19 Nippon Telegr & Teleph Corp <Ntt> Web search method and device, web search program, and recording medium with the program recorded
KR100677116B1 (en) * 2004-04-02 2007-02-02 삼성전자주식회사 Cyclic referencing method/apparatus, parsing method/apparatus and recording medium storing a program to implement the method

Cited By (63)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8468154B2 (en) * 2007-02-12 2013-06-18 Spinlet Oy Distribution system for data items
US20080195943A1 (en) * 2007-02-12 2008-08-14 Spinlet Oy Distribution system for data items
US20130191385A1 (en) * 2007-03-14 2013-07-25 David J. Vespe Geopoint Janitor
US10459955B1 (en) 2007-03-14 2019-10-29 Google Llc Determining geographic locations for place names
US9892132B2 (en) * 2007-03-14 2018-02-13 Google Llc Determining geographic locations for place names in a fact repository
US8584013B1 (en) * 2007-03-20 2013-11-12 Google Inc. Temporal layers for presenting personalization markers on imagery
US10585920B2 (en) 2007-03-20 2020-03-10 Google Llc Temporal layers for presenting personalization markers on imagery
US11636138B1 (en) 2007-03-20 2023-04-25 Google Llc Temporal layers for presenting personalization markers on imagery
US20080235163A1 (en) * 2007-03-22 2008-09-25 Srinivasan Balasubramanian System and method for online duplicate detection and elimination in a web crawler
US9280258B1 (en) 2007-05-29 2016-03-08 Google Inc. Displaying and navigating within photo placemarks in a geographic information system and applications thereof
US9037599B1 (en) * 2007-05-29 2015-05-19 Google Inc. Registering photos in a geographic information system, and applications thereof
US8195630B2 (en) * 2007-10-29 2012-06-05 Bae Systems Information Solutions Inc. Spatially enabled content management, discovery and distribution system for unstructured information management
US20090112812A1 (en) * 2007-10-29 2009-04-30 Ellis John R Spatially enabled content management, discovery and distribution system for unstructured information management
US8490025B2 (en) * 2008-02-01 2013-07-16 Gabriel Jakobson Displaying content associated with electronic mapping systems
US20100146436A1 (en) * 2008-02-01 2010-06-10 Gabriel Jakobson Displaying content associated with electronic mapping systems
US8504945B2 (en) * 2008-02-01 2013-08-06 Gabriel Jakobson Method and system for associating content with map zoom function
US20090198767A1 (en) * 2008-02-01 2009-08-06 Gabriel Jakobson Method and system for associating content with map zoom function
US20090241046A1 (en) * 2008-03-18 2009-09-24 Steven Nielsen Virtual white lines for delimiting planned excavation sites
US20090238417A1 (en) * 2008-03-18 2009-09-24 Nielsen Steven E Virtual white lines for indicating planned excavation sites on electronic images
US8861795B2 (en) 2008-03-18 2014-10-14 Certusview Technologies, Llc Virtual white lines for delimiting planned excavation sites
US8280117B2 (en) 2008-03-18 2012-10-02 Certusview Technologies, Llc Virtual white lines for indicating planned excavation sites on electronic images
US8290215B2 (en) 2008-03-18 2012-10-16 Certusview Technologies, Llc Virtual white lines for delimiting planned excavation sites
US9830338B2 (en) 2008-03-18 2017-11-28 Certusview Technologies, Inc. Virtual white lines for indicating planned excavation sites on electronic images
US8300895B2 (en) 2008-03-18 2012-10-30 Certusview Technologies, Llc Virtual white lines for delimiting planned excavation sites
US20090238416A1 (en) * 2008-03-18 2009-09-24 Steven Nielsen Virtual white lines for delimiting planned excavation sites
US8355542B2 (en) * 2008-03-18 2013-01-15 Certusview Technologies, Llc Virtual white lines for delimiting planned excavation sites
US20090241045A1 (en) * 2008-03-18 2009-09-24 Certusview Technologies, Llc Virtual white lines for delimiting planned excavation sites
US8934678B2 (en) 2008-03-18 2015-01-13 Certusview Technologies, Llc Virtual white lines for delimiting planned excavation sites
US20090237408A1 (en) * 2008-03-18 2009-09-24 Nielsen Steven E Virtual white lines for delimiting planned excavation sites
US8218827B2 (en) 2008-03-18 2012-07-10 Certusview Technologies, Llc Virtual white lines for delimiting planned excavation sites
US8249306B2 (en) 2008-03-18 2012-08-21 Certusview Technologies, Llc Virtual white lines for delimiting planned excavation sites
US8861794B2 (en) 2008-03-18 2014-10-14 Certusview Technologies, Llc Virtual white lines for indicating planned excavation sites on electronic images
US8274506B1 (en) * 2008-04-28 2012-09-25 Adobe Systems Incorporated System and methods for creating a three-dimensional view of a two-dimensional map
US20100095231A1 (en) * 2008-10-13 2010-04-15 Yahoo! Inc. Method and system for providing customized regional maps
US9336695B2 (en) * 2008-10-13 2016-05-10 Yahoo! Inc. Method and system for providing customized regional maps
US20100205554A1 (en) * 2009-02-11 2010-08-12 Certusview Technologies, Llc Virtual white lines (vwl) application for indicating an area of planned excavation
US8832565B2 (en) 2009-02-11 2014-09-09 Certusview Technologies, Llc Methods and apparatus for controlling access to a virtual white line (VWL) image for an excavation project
US8626571B2 (en) 2009-02-11 2014-01-07 Certusview Technologies, Llc Management system, and associated methods and apparatus, for dispatching tickets, receiving field information, and performing a quality assessment for underground facility locate and/or marking operations
US8566737B2 (en) 2009-02-11 2013-10-22 Certusview Technologies, Llc Virtual white lines (VWL) application for indicating an area of planned excavation
US8384742B2 (en) 2009-02-11 2013-02-26 Certusview Technologies, Llc Virtual white lines (VWL) for delimiting planned excavation sites of staged excavation projects
US8356255B2 (en) 2009-02-11 2013-01-15 Certusview Technologies, Llc Virtual white lines (VWL) for delimiting planned excavation sites of staged excavation projects
US8296308B2 (en) * 2009-02-11 2012-10-23 Certusview Technologies, Llc Methods and apparatus for associating a virtual white line (VWL) image with corresponding ticket information for an excavation project
US20100324967A1 (en) * 2009-02-11 2010-12-23 Certusview Technologies, Llc Management system, and associated methods and apparatus, for dispatching tickets, receiving field information, and performing a quality assessment for underground facility locate and/or marking operations
US20100205195A1 (en) * 2009-02-11 2010-08-12 Certusview Technologies, Llc Methods and apparatus for associating a virtual white line (vwl) image with corresponding ticket information for an excavation project
US8458232B1 (en) * 2009-03-31 2013-06-04 Symantec Corporation Systems and methods for identifying data files based on community data
WO2012004450A2 (en) * 2010-07-09 2012-01-12 Nokia Corporation Method and apparatus for aggregating and linking place data
US20160217146A1 (en) * 2010-07-09 2016-07-28 Here Global B.V. Method and apparatus for aggregating and linking place data
WO2012004450A3 (en) * 2010-07-09 2012-03-01 Nokia Corporation Method and apparatus for aggregating and linking place data
US20150324407A1 (en) * 2012-03-29 2015-11-12 Isogeo Method for indexing geographical data
US20150161102A1 (en) * 2013-12-05 2015-06-11 Seal Software Ltd. Non-Standard and Standard Clause Detection
US9268768B2 (en) * 2013-12-05 2016-02-23 Seal Software Ltd. Non-standard and standard clause detection
US8781815B1 (en) * 2013-12-05 2014-07-15 Seal Software Ltd. Non-standard and standard clause detection
US9619523B2 (en) * 2014-03-31 2017-04-11 Microsoft Technology Licensing, Llc Using geographic familiarity to generate search results
US20150278211A1 (en) * 2014-03-31 2015-10-01 Microsoft Corporation Using geographic familiarity to generate search results
US10371541B2 (en) 2014-03-31 2019-08-06 Microsoft Technology Licensing, Llc Using geographic familiarity to generate navigation directions
US20160026620A1 (en) * 2014-07-24 2016-01-28 Seal Software Ltd. Advanced clause groupings detection
US9996528B2 (en) * 2014-07-24 2018-06-12 Seal Software Ltd. Advanced clause groupings detection
US10402496B2 (en) * 2014-07-24 2019-09-03 Seal Software Ltd. Advanced clause groupings detection
US9805025B2 (en) 2015-07-13 2017-10-31 Seal Software Limited Standard exact clause detection
US10185712B2 (en) * 2015-07-13 2019-01-22 Seal Software Ltd. Standard exact clause detection
USRE49576E1 (en) * 2015-07-13 2023-07-11 Docusign International (Emea) Limited Standard exact clause detection
EP3367267A1 (en) * 2017-02-23 2018-08-29 Innoplexus AG System and method for creating entity records using existing data sources
US11056244B2 (en) * 2017-12-28 2021-07-06 Cilag Gmbh International Automated data scaling, alignment, and organizing based on predefined parameters within surgical networks

Also Published As

Publication number Publication date
KR101450358B1 (en) 2014-10-14
BRPI0807172A2 (en) 2014-05-13
EP2118779A1 (en) 2009-11-18
CA2677307C (en) 2015-04-14
KR20090116747A (en) 2009-11-11
US7836085B2 (en) 2010-11-16
US20110060749A1 (en) 2011-03-10
WO2008097921A1 (en) 2008-08-14
JP2010518495A (en) 2010-05-27
JP5336391B2 (en) 2013-11-06
CN101647020B (en) 2012-11-28
EP2118779A4 (en) 2013-07-17
CN101647020A (en) 2010-02-10
CA2677307A1 (en) 2008-08-14
AU2008213993A1 (en) 2008-08-14
US8200704B2 (en) 2012-06-12

Similar Documents

Publication Publication Date Title
US7836085B2 (en) Searching structured geographical data
JP5256293B2 (en) System and method for including interactive elements on a search results page
CA2610208C (en) Learning facts from semi-structured text
CA2583042C (en) Providing information relating to a document
US8832058B1 (en) Systems and methods for syndicating and hosting customized news content
US7809710B2 (en) System and method for extracting content for submission to a search engine
US7765209B1 (en) Indexing and retrieval of blogs
US7310633B1 (en) Methods and systems for generating textual information
US20120124053A1 (en) Annotation Framework
JP2009020901A (en) Database system, method of database retrieval and recording medium
NO337806B1 (en) Systems and methods for grouping search results
JPH11191114A (en) Meta retrieving method, image retrieving method, meta retrieval engine and image retrieval engine
US20090210389A1 (en) System to support structured search over metadata on a web index
KR20160124079A (en) Systems and methods for in-memory database search
JP2003173280A (en) Apparatus, method and program for generating database
US7792855B2 (en) Efficient storage of XML in a directory
JP2005025418A (en) Question answering device, question answering method, and program
JP2002189713A (en) Method and system for supporting document creation
Rao et al. Web Search Engine
JP2002269000A (en) Method for automatically preparing and displaying homepage and user information database
JP2001312519A (en) Method to generate compound content and its retrieval method

Legal Events

Date Code Title Description
AS Assignment

Owner name: GOOGLE INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PETAKOV, ARTEM;MINOGUE, DAVID;SPIRIDONOV, ALEXEY;REEL/FRAME:019191/0948;SIGNING DATES FROM 20070308 TO 20070309

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: GOOGLE LLC, CALIFORNIA

Free format text: CHANGE OF NAME;ASSIGNOR:GOOGLE INC.;REEL/FRAME:044101/0405

Effective date: 20170929

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552)

Year of fee payment: 8

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 12