WO2000062204A1 - Online content lifecycle management based on usage statistics, user-supplied value ratings and expiration dates - Google Patents

Publication number
WO2000062204A1
Authority
WO
WIPO (PCT)
Prior art keywords
content
information
item
piece
database
Application number
PCT/US2000/009767
Other languages
French (fr)
Inventor
Nicholas D'arbeloff
Joseph Dimare
Barbara Heath
Original Assignee
Conjoin, Inc.
Priority date
1999-04-14
Filing date
2000-04-13
Publication date
2000-10-19
Application filed by Conjoin, Inc.
Priority to AU42342/00A (AU4234200A)
Publication of WO2000062204A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Abstract

Content Lifecycle Management (CLM) is a method for keeping content within an intranet (or internet) up-to-date and relevant for users of the website by archiving low-value, unused, inactive, and obsolete content from the site based on a combination of ratings threshold, usage threshold, and expiration date. Archiving content removes general access to that content, but retains it in the site for searches specific to the archive. CLM uses these threshold values as a method for providing automated feedback and notification to the publisher and/or author of aging or poorly performing content.

Description

ONLINE CONTENT LIFECYCLE MANAGEMENT BASED ON USAGE STATISTICS, USER-SUPPLIED VALUE RATINGS AND EXPIRATION DATES
Field of the Invention
The invention pertains to a system for maintaining and accessing a diverse collection of information to identify and keep information that is current and useful.
Background of the Invention
As information databases become more prevalent and more complete, a number of content management issues must be confronted. In particular, valuable data often become "lost in the haystack" of lower-quality content. In addition, systems rarely remove stale content or provide meaningful feedback to authors or publishers of the value of their content. It is an object of the present invention to address these issues by providing methods for winnowing databases to preserve high-quality content while archiving or deleting low-quality or stale content.
Summary of the Invention
In one aspect, the invention pertains to a system for maintaining currency and value for a collection of information. When a piece of information (a content item) is added to the collection, metadata is added to a database specifying the location of the information, and threshold values for usage and utility of the information. When the database item is accessed to access the information, the user is allowed to specify a value rating for the content item, and an access log is updated. The frequency of access of the content item and its value can thus be monitored to determine if either has fallen below the associated threshold value. If either usage or value rating falls below its threshold, the system performs some action, such as warning the author of the information that it is not being used or is receiving low ratings, marking the item for archiving, archiving the item, or deleting the item from the system. Content items may further have associated expiration dates which are given at the time that the content item is added to the collection. If a content item is approaching its expiration date, the system may notify the author or perform another action such as archiving the item. The system can be used for monitoring diverse collections of information, including such items as URLs and web pages, files stored on a fileserver, whole directories and subdirectories, and nondigital data such as books.
Brief Description of the Drawing
The invention is described with reference to the several figures of the drawing, in which:
Figure 1 shows a sample screen containing content and a ratings entry area; and
Figure 2 shows the content matrix.
Detailed Description
Websites today face daunting content management issues, including the proliferation of old and stale content and poor capabilities for providing feedback and notification to the author. Content Lifecycle Management (CLM) is designed to eliminate this proliferation by keeping content that is displayed on an intranet site relevant and up-to-date. This goal is accomplished by tagging each item of content recorded in the content management database with an expiration date and threshold values. Threshold values are the minimally acceptable levels that content must meet in order to remain current and available. CLM uses at least two thresholds: one for the rating (the value of the content) and one for the usage (how often each item is viewed).
When content is published to the intranet, it is tagged with content thresholds for ratings and usage which the author or publisher feels represent minimum values. When content falls below either value over a given time period (for example, on a weekly basis), the CLM will notify the author of this fact, for example by e-mail, from within the CLM application, or both. The tabulated values which are required for this are collected from users who view and rate the content on the site. Figure 1 is a screen shot illustrating how feedback is collected. Each time a user views a document (or item of content), the view is registered in the usage table. When a user rates the content using the ratings engine 10 at the top of the window, the rating is also recorded, associating the content ID with the user ID, date, and rating. Rating of content may be either optional or required for the user.
The maximum and minimum values of the two thresholds form the CLM matrix, shown in Figure 2. Based on the rating and usage values, each content item can receive one of four classifications: (1) active or good content 20, which means that this item of content is receiving high usage and that users find it of high value; (2) needs promotion 22, which means that users value the content but usage is low (usually because users cannot easily find the content); (3) needs updating 24, which means that users access the file frequently but that it is not of significant use or value; and (4) archive content 26, which is the default action for content that is not being viewed often and is of low value.
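The four-way classification described above can be sketched in a few lines of Python. This is only an illustration, not the applicant's implementation: the names `Quadrant` and `classify` are hypothetical, and treating a value exactly equal to its threshold as meeting the threshold is an assumption (the text speaks only of values falling below a threshold).

```python
from enum import Enum


class Quadrant(Enum):
    """The four cells of the CLM matrix (reference numerals from Figure 2)."""
    ACTIVE = "active or good content"      # 20: high value, high usage
    NEEDS_PROMOTION = "needs promotion"    # 22: high value, low usage
    NEEDS_UPDATING = "needs updating"      # 24: low value, high usage
    ARCHIVE = "archive content"            # 26: low value, low usage


def classify(avg_rating: float, usage: int,
             rating_threshold: float, usage_threshold: int) -> Quadrant:
    """Place one content item in the matrix.

    Assumption: a value equal to its threshold counts as meeting it.
    """
    high_value = avg_rating >= rating_threshold
    high_usage = usage >= usage_threshold
    if high_value and high_usage:
        return Quadrant.ACTIVE
    if high_value:
        return Quadrant.NEEDS_PROMOTION
    if high_usage:
        return Quadrant.NEEDS_UPDATING
    return Quadrant.ARCHIVE
```

For instance, an item rated 4.5 on average but viewed only 5 times, against thresholds of 3.0 and 20, would land in the "needs promotion" cell.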
The ratings and usage of content in combination with its expiration date serve as an effective indicator for archiving content.
The inventive system for maintaining currency of the information dovetails well with systems for targeting information to particular users or groups of users. For example, the methods described herein may be used with the methods described in U.S. Provisional Application No. 60/129,106, filed April 13, 1999, and U.S. Patent Application "Group Targeted Content Personalization," (attorney docket 2001774-0001), filed on even date herewith.
Example
In one embodiment of the invention, Content Lifecycle Management is based on a set of database tables that are part of an overall intranet database. The following data are cataloged in the database tables:
• Content information - may include ID, name, file name, content type, publish and expiration dates, author/publisher, approval and archive status, and ratings/usage thresholds for each content item
• Ratings - may include content ID, user ID, date, rating, and module in which the content was rated
• Usage - may include content ID, user ID, date, and module in which the content was viewed
• Author/Publisher Info - may include e-mail address for notification by e-mail
The following data are stored on durable media such as a hard disk:
• Content, which may include documents, files, data, executables, and catalogue entries or other references to nondigital content.
• Archiving Daemon (code)
• Low-Usage Daemon (code)
• Low-Rating Daemon (code)
• CLM Matrix Daemon (code)
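As a rough illustration of the four database tables listed above, the records might be modeled as follows. The field names and types are assumptions chosen for this sketch; the specification lists only what each table "may include" and does not prescribe a schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ContentInfo:
    """One row of the content information table (field names are hypothetical)."""
    content_id: int
    name: str
    file_name: str
    content_type: str
    publish_date: date
    expiration_date: Optional[date]   # assumption: an item may have no expiration date
    author_id: int
    approved: bool
    archived: bool
    rating_threshold: float           # minimum acceptable average rating
    usage_threshold: int              # minimum acceptable number of views per period


@dataclass
class Rating:
    """One user-supplied rating of a content item."""
    content_id: int
    user_id: int
    rated_on: date
    rating: int
    module: str                       # module in which the content was rated


@dataclass
class Usage:
    """One recorded view of a content item."""
    content_id: int
    user_id: int
    viewed_on: date
    module: str                       # module in which the content was viewed


@dataclass
class AuthorInfo:
    """Author/publisher contact details used for e-mail notification."""
    author_id: int
    email: str
```

Keeping ratings and usage as separate per-event records, rather than as running totals on the content row, makes it possible to recompute averages and view counts over any time period.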
Content Publishing
Content is published to the intranet using the publishing functionality within the system. During the publishing process, the file may be uploaded and stored on the server, and all file-specific and publisher-entered information, as well as the association between the file and the database entry, is saved in the database tables. Data-specific content such as web links (URLs) is stored in the database, and the information is associated with the database entry. The user-entered information may include items such as the classification of the document using the topic and subtopic parameters, access classification (who has access to view the document), target classification (who would most benefit from this content), expiration date, and the rating and usage thresholds for the content. (Access and target classification are discussed more fully in U.S. Provisional Application No. 60/129,106, filed April 13, 1999, and U.S. Patent Application "Group Targeted Content Personalization," (attorney docket 2001774-0001), filed on even date herewith).
Archiving Daemon
The Archiving Daemon comprises executable code which is run as a scheduled task on a periodic basis (set by the administrator). In one preferred embodiment, it is set to run on a nightly basis. Each time the archiving daemon is run, it makes two comparisons. First, it compares the expiration date of each record in the content library with the current date. If the current date matches the expiration date, the record is archived and an e-mail is sent to the author notifying the author that the record has been archived. At this point, the content that has been archived is no longer available throughout the site unless a search on archived content is executed. If the current date represents a period of a week before the expiration date (or any other suitable time period), the archiving daemon sends an alert e-mail to notify the author that the content is aging and will be retired at the end of the week unless it is updated.
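The two date comparisons made by the Archiving Daemon could look roughly like the sketch below. The function name `archiving_action` and its string return values are hypothetical. One deliberate departure is labeled in the code: the text archives a record when the current date matches the expiration date, whereas this sketch also archives records whose expiration date has already passed, on the assumption that a scheduled run might occasionally be missed.

```python
from datetime import date, timedelta


def archiving_action(expiration_date: date, today: date,
                     warning_period: timedelta = timedelta(days=7)) -> str:
    """Decide what a nightly run should do with one content record.

    Returns "archive", "warn", or "none".
    """
    # Assumption: ">=" rather than "==" so that an item is still archived
    # if the daemon did not run on the exact expiration date.
    if today >= expiration_date:
        return "archive"   # archive the record and e-mail the author
    # One warning is sent exactly one warning period before expiration.
    if today == expiration_date - warning_period:
        return "warn"      # alert the author that the content is aging
    return "none"
```

With an expiration date of 14 April 2000, a run on 7 April would produce the warning and a run on 14 April would archive the item.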
Low-Usage & Low-Rating Daemons
These daemons are used to notify the author if content the author has published falls below the rating and usage thresholds set when the content was published. Like the Archiving Daemon, they may be set to run on a periodic basis (e.g., nightly). These daemons may further directly archive content which falls below one of the thresholds (e.g., if the author/publisher does not respond to an invitation to improve the content).
CLM Matrix Daemon
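A minimal sketch of the per-item checks these two daemons perform is given below. The helper names, the seven-day counting window, and the decision to skip the rating check for items that have no ratings yet are all assumptions made for illustration; the specification says only that content is measured against its thresholds over a given period.

```python
from datetime import date, timedelta
from typing import List, Sequence


def views_in_window(view_dates: Sequence[date], today: date,
                    window_days: int = 7) -> int:
    """Count the views recorded in the last `window_days` days, including today."""
    start = today - timedelta(days=window_days)
    return sum(1 for d in view_dates if start < d <= today)


def threshold_alerts(view_dates: Sequence[date], ratings: Sequence[float],
                     usage_threshold: int, rating_threshold: float,
                     today: date) -> List[str]:
    """Return the reasons, if any, for notifying the author of one item."""
    alerts = []
    if views_in_window(view_dates, today) < usage_threshold:
        alerts.append("low usage")
    # Assumption: an item with no ratings yet is not reported as low-rated.
    if ratings:
        average = sum(ratings) / len(ratings)
        if average < rating_threshold:
            alerts.append("low rating")
    return alerts
```

An empty result means the item meets both of its thresholds and no notification is needed.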
The Content Lifecycle Management Daemon compares the averaged ratings and the usage values of each record in the content management database against the administration-set values for both the rating and usage. The administration set values are the index values and allow the matrix to be configured to a particular group of users. If the rating and usage values fall above the index values for both ratings and usage, no notification is sent. If either the usage or the ratings values fall below their respective index value, a notification is sent.
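The comparison against administration-set index values might be sketched as follows. The function name `items_to_notify` and its input shapes are hypothetical: ratings are assumed to arrive as (content ID, rating) pairs, and `usage_counts` is assumed to hold one entry for every content record, with 0 for items never viewed. Items without any ratings are judged on usage alone, which is also an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def items_to_notify(ratings: Iterable[Tuple[int, float]],
                    usage_counts: Dict[int, int],
                    rating_index: float,
                    usage_index: int) -> List[int]:
    """Return the IDs of content items that fall below either index value."""
    # Accumulate [sum of ratings, number of ratings] per content ID.
    totals = defaultdict(lambda: [0.0, 0])
    for content_id, rating in ratings:
        totals[content_id][0] += rating
        totals[content_id][1] += 1

    flagged = []
    for content_id in sorted(usage_counts):
        total, count = totals.get(content_id, (0.0, 0))
        # Assumption: an unrated item cannot be flagged for low rating.
        low_rating = count > 0 and (total / count) < rating_index
        low_usage = usage_counts[content_id] < usage_index
        if low_rating or low_usage:
            flagged.append(content_id)
    return flagged
```

Because the index values are passed in rather than read from each record, the same routine can be rerun with different values to tune the matrix for a particular group of users.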
Other embodiments of the invention will be apparent to those skilled in the art from a consideration of the specification or practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the invention being indicated by the following claims.
What is claimed is:

Claims

1. A computer-implemented method of maintaining currency and value of a collection of information, comprising:
    adding an item corresponding to a single piece of information to a database of metadata pertaining to the collection of information, where the added item includes
        an identifier indicating a location for the piece of information;
        a minimum value level for the piece of information; and
        a minimum usage level for the piece of information;
    accessing the added item in the database of metadata, where accessing includes
        using the identifier to access the piece of information;
        updating a record of the actual usage level for the piece of information; and
        allowing updating of a record of the actual value rating for the piece of information; and
    performing an action in response to a condition in which
        the actual value rating is below the minimum value level; or
        the actual usage level is below the minimum usage level.
2. The method of claim 1, wherein the item further includes an expiration date for the piece of information, and wherein an action is performed in response to a condition in which the actual date is within a selected time period from or is equal to the expiration date.
3. The method of claim 1 or 2, wherein the action is selected from the group consisting of: removing the item from the database of metadata; marking the item for archiving; placing the item in an archive database; and notifying a user that the condition exists.
4. The method of claim 1, wherein the identifier is selected from the group consisting of uniform resource locators, file location paths, directories, subdirectories, and catalogue entries for nondigital information.
PCT/US2000/009767 1999-04-14 2000-04-13 Online content lifecycle management based on usage statistics, user-supplied value ratings and expiration dates WO2000062204A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU42342/00A AU4234200A (en) 1999-04-14 2000-04-13 Online content lifecycle management based on usage statistics, user-supplied value ratings and expiration dates

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12910499P 1999-04-14 1999-04-14
US60/129,104 1999-04-14

Publications (1)

Publication Number Publication Date
WO2000062204A1 (en) 2000-10-19

Family

ID=22438472

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/009767 WO2000062204A1 (en) 1999-04-14 2000-04-13 Online content lifecycle management based on usage statistics, user-supplied value ratings and expiration dates

Country Status (2)

Country Link
AU (1) AU4234200A (en)
WO (1) WO2000062204A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0515073A2 (en) * 1991-05-21 1992-11-25 Hewlett-Packard Company Dynamic migration of software
GB2327787A (en) * 1997-06-02 1999-02-03 Knowledge Horizons Pty Ltd Data classification and retrieval system

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7970887B2 (en) 1999-08-30 2011-06-28 Opinionlab, Inc. Measuring a page-specific subjective user reaction concerning each of multiple web pages of a website
US6785717B1 (en) 1999-08-30 2004-08-31 Opinionlab, Inc. Method of incorporating user reaction measurement software into particular web pages of a website
US6928392B2 (en) 1999-08-30 2005-08-09 Opinionlab, Inc. Collecting a user response to an explicit question specifically concerning a particular web page of a website
US7085820B1 (en) 1999-08-30 2006-08-01 Opinionlab, Inc. System and method for reporting to a website owner user reactions to particular web pages of a website
US8041805B2 (en) 1999-08-30 2011-10-18 Opinionlab, Inc. System and method for reporting to a website owner user reactions to particular web pages of a website
EP1249767A3 (en) * 2001-04-12 2007-05-23 General Electric Company System and method for updating an intranet portal
EP1249767A2 (en) * 2001-04-12 2002-10-16 General Electric Company System and method for updating an intranet portal
WO2004012126A3 (en) * 2002-07-29 2004-03-18 Opinionlab Inc System and method for providing substantially real-time access to collected information concerning user interaction with a web page of a website
US8024668B2 (en) 2002-07-31 2011-09-20 Opinionlab, Inc. Receiving and reporting page-specific user feedback concerning one or more particular web pages of a website
US7478121B1 (en) 2002-07-31 2009-01-13 Opinionlab, Inc. Receiving and reporting page-specific user feedback concerning one or more particular web pages of a website
US8037128B2 (en) 2002-07-31 2011-10-11 Opinionlab, Inc. Receiving page-specific user feedback concerning one or more particular web pages of a website
US7370285B1 (en) 2002-07-31 2008-05-06 Opinionlab, Inc. Receiving and reporting page-specific user feedback concerning one or more particular web pages of a website
US8082295B2 (en) 2002-07-31 2011-12-20 Opinionlab, Inc. Reporting to a website owner one or more appearances of a specified word in one or more page-specific open-ended comments concerning one or more particular web pages of a website
US7827487B1 (en) 2003-06-16 2010-11-02 Opinionlab, Inc. Soliciting user feedback regarding one or more web pages of a website without obscuring visual content
US8775237B2 (en) 2006-08-02 2014-07-08 Opinionlab, Inc. System and method for measuring and reporting user reactions to advertisements on a web page
US7809602B2 (en) 2006-08-31 2010-10-05 Opinionlab, Inc. Computer-implemented system and method for measuring and reporting business intelligence based on comments collected from web page users using software associated with accessed web pages
US8538790B2 (en) 2006-08-31 2013-09-17 Opinionlab, Inc. Computer-implemented system and method for measuring and reporting business intelligence based on comments collected from web page users using software associated with accessed web pages
US7865455B2 (en) 2008-03-13 2011-01-04 Opinionlab, Inc. System and method for providing intelligent support
US8332232B2 (en) 2009-11-05 2012-12-11 Opinionlab, Inc. System and method for mobile interaction

Also Published As

Publication number Publication date
AU4234200A (en) 2000-11-14

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP