WO2001035218A1

WO2001035218A1 - Method for modifying data and/or code represented in one language by a computer program written in a second language

Info

Publication number: WO2001035218A1
Application number: PCT/US2000/028711
Authority: WO
Inventors: James S. Proctor; Duane A. Murphy; Benjamin S. Flaumenhaft
Original assignee: Bear River Associates Inc.
Priority date: 1999-11-09
Filing date: 2000-10-16
Publication date: 2001-05-17

Abstract

A method for operating a data processing system to generate an output. A source file containing a plurality of instructions is stored on the data processing system (12). The source file includes instructions of two types, external instructions and internal instructions. The data processing system (12) also provides one or more external processing programs that process the external instructions at run time. A source file processor processes the source file. The source file processor recognizes the external instructions (14) although it cannot execute the instructions. The source file processor includes a mapping that assigns one of the external processing programs to each of the external instructions. When the source file processor recognizes one of the external instructions, the source file processor (a) determines which of the external processing programs processes that instruction, (b) transfers that instruction to that external processing program, and (c) receives back from the external processing program one or more instructions that are inserted into the source file in place of that instruction (15). The output is generated from the internal instructions that were originally present in the source file and those generated in response to the external instructions (17). The external instructions may also include one or more variables to be filled in by the external processing program corresponding to that external instruction. One class of external instructions generates multiple records. Each record provides a different value for one of the variables. In this case the instructions that are inserted into the source file in place of the external instruction comprise N instructions, wherein each instruction includes a different one of the variable values.

Description

METHOD FOR MODIFYING DATA AND/OR CODE REPRESENTED IN ONE LANGUAGE BY A COMPUTER PROGRAM WRITTEN IN A SECOND LANGUAGE

Field of the Invention

The present invention relates to compilers for use in computer systems, and more particularly, to a method for using one computer language to modify or generate parts of a source in another language.

Background of the Invention

The World Wide Web ("Web") has become a very successful means of communication between central sites connected to the Internet and individual users on the Internet who wish to communicate with the site. The communications are controlled by two programs, a Web Browser that runs on the user's computer and a Web server that runs on the site's computer. A Web Browser sends a request to a Web Server using a protocol such as the HTTP protocol. A request results in a MIME ("Multipurpose Internet Mail Extensions" - see IETF RFC 1341, 1342, 1521) stream being sent back to the Web Browser. The MIME stream includes a Content Type header for the data that indicates how the Web Browser will treat the data being sent. For example, a "text/html" MIME type indicates that the data is in the hypertext markup language (HTML), and should be interpreted accordingly; an "image/gif" MIME type indicates that the data is in a "gif" image file, and should be rendered as an image after unpacking the data in the file.

The Web Server typically services a request either by sending back a file stored locally on the server or by running a program, the output of which is the MIME stream to be sent back to the browser. As noted above, the Web typically makes use of the hypertext format to display information to a user and receive input from the user. Hypertext allows a body of information to be organized into a hierarchical system in which the user can pursue increasing levels of specificity by following the various hypertext links from one document to the next. A typical hypertext display system (a Web Browser) displays a document in which selected words or phrases are highlighted. The highlighted phrase indicates that another document related to that phrase is in the system. If the person viewing the document selects one of these words or phrases by pointing and clicking using a pointing device, the second document related to that word or phrase is sent to the user's screen. The user may return to the original document at any time by selecting a "back" option on the viewer screen.

This form of information display has found wide acceptance on the Internet because of its ease of use. A user located at a terminal on the network connects to a server on the network that has a "home page" in hypertext format. The home page is then displayed on the user's screen by the browser. When the user selects a highlighted word, the browser communicates the user's choice to the server in a MIME data stream. The server then transfers the corresponding file to the user's machine via the network. The browser on the user's machine then displays this file to the user.

Conventional browsers also allow the user to input text on the user's screen which is then transferred to the server when the user selects a graphical element such as a "button". Hence, the user can communicate information to the server beyond the predefined hypertext link information, provided the server is programmed to use this information.

The hypertext mode of information organization is also efficient from the point of view of the home page provider on the server. A home page is written in HTML. HTML is a word processing format, which allows the user to define a page as the user would with a conventional word processor. In fact, programs for converting the various conventional word processing formats to HTML are commercially available. For each phrase that is to provide a link, the user marks the phrase by enclosing it with beginning and ending "tags". The user then defines another hypertext file that contains the document to be displayed in response to the user selecting the phrase. Hence, a server program can be as simple as a set of HTML documents created with a conventional word processing system and stored on the server.

If the interaction between the user and server is basically a transfer of predefined information which is static in nature, the simple "set of documents" mode is satisfactory. If, however, the information to be transferred requires some form of processing prior to the transfer, the simple hypertext engines are less than ideal. Consider an application in which the server must execute a program to gather and calculate the data that forms a portion of hypertext material that is to be returned to the user. To provide such a service, the server must include a program that is specific to the application and which performs the computations and then generates the results in the form of a hypertext document that is delivered on the network. The HTTP Protocol defines a general mechanism for programs to operate in this way, called the Common Gateway Interface (or CGI). A program that uses this mechanism is often referred to as a CGI Program.
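As an illustration only, a CGI Program of the kind described above might be sketched as follows in Python. The function names `render_page` and `cgi_response` and the sample computation are hypothetical and do not come from the present disclosure; the sketch merely shows a program that computes its data and then writes a Content Type header followed by the generated hypertext.

```python
import html
import sys


def render_page(values):
    """Build the hypertext body from computed values (hypothetical helper)."""
    items = "".join(f"<li>{html.escape(str(v))}</li>" for v in values)
    return f"<html><body><ul>{items}</ul></body></html>"


def cgi_response(body, content_type="text/html"):
    """Prefix the body with the header block a CGI program writes to stdout."""
    # A blank line separates the header block from the document body.
    return f"Content-Type: {content_type}\r\n\r\n{body}"


if __name__ == "__main__":
    # Stand-in for the application-specific computation on the server.
    computed = [n * n for n in range(1, 4)]
    sys.stdout.write(cgi_response(render_page(computed)))
```

Note that the page layout and the computation live in the same program here, which is the coupling the following paragraphs identify as a problem.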

In general, complex Web pages that require both conventional programming support and HTML page layout must be written by two different programmers. The HTML work is typically the work of someone who specializes in the graphic arts and information presentation. Such individuals usually have little training in conventional object-oriented programming. The computational portions of the program are typically written by a programmer who has little knowledge of HTML. Combining the output of these two markedly different disciplines poses a number of problems.

One prior art attempt to solve this problem is taught in U.S. Patent 5,745,908. In this system, a preprocessor is provided that converts a mixed file containing both HTML code and conventional COBOL computer code into an input file for a COBOL compiler. The COBOL code is surrounded by special tags that allow the preprocessor to separate the COBOL code sections from the HTML sections. The preprocessor converts the material that is not surrounded by the special tags into output statements in the COBOL computer language. The output of the preprocessor is then compiled using a COBOL compiler. When the compiled program executes, it generates the HTML statements from the output statements generated by the preprocessor.

While this approach provides a solution to the problem of combining the HTML and computer code sections, the resulting file is difficult for either programmer to read, since it is a mixture of two languages. In addition, the resulting file is difficult to maintain. The task of maintaining the program often falls on programmers who were not involved in creating the original file. The mixture of languages in the file makes this task particularly difficult.

Furthermore, any time the contents of the page are altered, the input file must be re-compiled and the resultant executables reloaded on to the server even if the only changes are to the HTML portion of the page.

In addition, this solution is limited to programs that involve only one underlying computer language. Different computer languages are optimized for different types of programming tasks. If the optimum solution to the overall programming task involves computer programs written in different languages, the approach cannot be utilized.

Broadly, it is the object of the present invention to provide an improved system for generating a source file in one computer language by utilizing code in a second computer language.

It is another object of the present invention to provide an improved system for generating HTML or similar page layout language documents when specialized computer code is needed to provide part of the document.

It is a further object of the present invention to provide a programming system in which the page layout sections of a document are separated from the computer code needed to implement the document.

These and other objects of the present invention will become apparent to those skilled in the art from the following detailed description of the invention and the accompanying drawings.

Summary of the Invention

The present invention is a method for operating a data processing system to generate an output. A source file containing a plurality of instructions is stored on the data processing system. The source file includes instructions of two types, external instructions and internal instructions. The data processing system also provides one or more external processing programs that process the external instructions at run time. A source file processor processes the source file. The source file processor recognizes the external instructions although it cannot execute the instructions. The source file processor includes a mapping that assigns one of the external processing programs to each of the external instructions. When the source file processor recognizes one of the external instructions, the source file processor, (a) determines which of the external processing programs processes that instruction, (b) transfers that instruction to that external processing program, and (c) receives back from the external processing program instructions that are inserted into the source file in place of that instruction or an indication that the instruction in question is to be deleted. The output is generated from the internal instructions that were originally present in the source file and those generated in response to the external instructions. In general, the instructions generated by the external instructions will be internal instructions; however, some external instructions may generate other external instructions that are eventually used to generate the final internal instructions. The external instructions may also include one or more variables to be filled in by the external processing program corresponding to that external instruction. One class of external instructions generates multiple records. Each record provides a different value for one of the variables. 
In this case, the instructions that are inserted into the source file in place of the external instruction comprise N instructions, wherein each instruction includes a different one of the variable values.

Brief Description of the Drawings

Figure 1 is a flow chart of the process by which an XML processor according to one application of the present invention generates a hypertext file to be sent to the browser.

Detailed Description of the Invention

The present invention may be more easily understood with reference to an embodiment that operates on a server to provide a HTML language file that is returned to a browser in response to a request from the browser to a server having a program that generates the HTML file. It is assumed that the browser's request requires both page layout information and computation of the type requiring a computer program operating on the server. To simplify the following discussion, it will be assumed that the page layout information is provided in the form of an XML file on the server and that the XML processor is a processor according to the present invention. XML is a generalized version of HTML. However, it will be obvious to those skilled in the art from the following discussion that other layout languages could be utilized. A processor according to the present invention is essentially an XML processor that has been modified to recognize a class of external tags as described below.

The HTML/XML application of the present invention may be more easily understood with reference to the structure of a typical prior art hypertext document. Such a document consists of a string of characters in which specific sequences of characters are delimitated by beginning and end tags. For example, one class of tagged sequences includes tags that mark the beginning and end of sequences to be displayed in a particular style, which define font, print size, etc. A second class of tags specifies the file to be sent in response to a user selecting the sequence with the user's pointing device. As these tags are also well known in the computer arts, they will not be discussed further here.

A hypertext document according to the present invention may be viewed as a standard hypertext file having one or more new classes of tags. Refer now to Figure 1, which is a flow chart of the process by which an XML processor according to the present invention generates the hypertext file to be sent to the browser. The XML processor on the server opens the file in preparation for transmitting the document specified in the file and proceeds to process the file until it finds an open tag marking the beginning of a tagged sequence as shown at 12. If the processor finds an end of file marker before it finds such a tag, the process is complete. The processor examines the open tag to determine if the tag is a standard tag that can be processed by the processor as shown at 13. Such tags will be referred to as "local tags" in the following discussion. If the open tag is a local tag, the processor processes the tag in the conventional manner. Since properly formed HTML is a proper subset of XML, no significant translation of XML to HTML is required to produce the output file. If the XML processor does not recognize the open tag as a local tag, the processor consults a table on the server, which defines a set of non-standard tags that will be referred to as "external tags" in the following discussion, as shown at 14. If the open tag is not found in this table, an exception is thrown and the process is aborted. If the tag is recognized as an external tag, the processor scans the file for the corresponding end tag to construct a sequence TS containing the portion of the file between the external tag and its end tag. The table that defines the external tags also includes an object-method for each external tag. The identified object-method for the external tag in question is then invoked through the appropriate operating system command as shown at 16. The command also includes a pointer to the location of TS. In this embodiment of the present invention, the object-method returns a new hypertext sequence that replaces TS. The processor replaces TS in the input file with this new sequence and continues the processing of the file at the insertion point of the new sequence.
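The processing loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the tag syntax is simplified to `<name params>...</name>`, the set `LOCAL_TAGS`, the table `EXTERNAL_TAGS`, and the object-methods `repeat` and `greeting` are hypothetical, the object-methods are ordinary Python functions rather than operating system commands, and nested external tags of the same name are not handled.

```python
import re

# Assumed set of "local" tags that the processor handles itself.
LOCAL_TAGS = {"html", "body", "p", "b", "ul", "li"}


def repeat(params, ts):
    """Hypothetical object-method: returns TS repeated `params` times."""
    return ts * int(params)


def greeting(params, ts):
    """Hypothetical object-method whose result contains another external tag."""
    return "<repeat 2><b>" + ts + "</b></repeat>"


# Table mapping each external tag to its object-method (hypothetical entries).
EXTERNAL_TAGS = {"repeat": repeat, "greeting": greeting}

# Matches an open tag only; end tags start with "</" and are skipped.
OPEN_TAG = re.compile(r"<([A-Za-z][\w-]*)([^<>]*)>")


class UnknownTagError(Exception):
    """Raised when an open tag is neither local nor listed in the table."""


def process(source):
    pos = 0
    while True:
        match = OPEN_TAG.search(source, pos)
        if match is None:
            return source  # end of file reached: processing is complete
        name, params = match.group(1), match.group(2).strip()
        if name in LOCAL_TAGS:
            pos = match.end()  # local tag: left in place
            continue
        method = EXTERNAL_TAGS.get(name)
        if method is None:
            raise UnknownTagError(name)
        end_tag = "</" + name + ">"
        end = source.find(end_tag, match.end())
        if end == -1:
            raise UnknownTagError("no end tag for " + name)
        ts = source[match.end():end]
        replacement = method(params, ts)
        source = source[:match.start()] + replacement + source[end + len(end_tag):]
        pos = match.start()  # resume at the insertion point
```

Because processing resumes at the insertion point, a returned sequence that itself contains an external tag is expanded in turn; under these assumptions `process("<p><greeting>hi</greeting></p>")` yields `<p><b>hi</b><b>hi</b></p>`.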

This embodiment separates the hypertext portion of the document from the "computer code". The hypertext programmer need only learn the operations carried out by the specialized tags implemented on the server. Similarly, the computer programmer does not need to understand hypertext programming. As will be explained in more detail below, at most the conventional programmer must learn a few rules about which portions of the hypertext string are to be replicated when returning the new hypertext string.

It should also be noted that this embodiment does not require the underlying computer code to be re-compiled each time a change is made to the hypertext document. In fact, this embodiment does not require re-programming of the server functions so long as the object-methods specified by the specialized tags remain the same.

Having provided the above overview of one use of the present invention, the present invention will be explained in detail with reference to an input file defined in XML. In general, an XML file can be regarded as a sequence of tagged sequences of the form:

<TagID, Parameter List>

TS </TagID>

Here, <TagID, ...> denotes the tag marking the beginning of the sequence, and </TagID> denotes an end tag for the sequence. In general, the material between the tags, TS, may include additional tagged sequences having parameters that are determined by the sequences containing it. As noted above, the tagged sequences are divided into local tags and external tags. The local tagged sequences are processed directly by the XML processor in the conventional manner. The external tagged sequences are processed by the object-method defined in the table that defines the external tags.

TS may also include a list of variables that are to be filled in with the contents of specified fields returned by the operation defined in TagID. In the preferred embodiment of the present invention, these variables are identified by a delimiter followed by a name that specifies the field in the record returned by the object-method that handles the TagID in question.

When the XML processor of the present invention encounters an external tagged sequence, it sends the Parameter List to the object-method specified in the table discussed above. The operations performed by this object-method may generate multiple response records. If no records are generated, the tagged sequence is ignored and processing continues at the next tagged sequence in the file. In effect, the tagged sequence is deleted. If N response records are returned, where N>0, then the present invention replaces the tagged sequence by N concatenated replacement sequences based on the sequence TS. Each replacement sequence corresponds to one of these records and consists of the sequence TS with any variables found in the corresponding response sequence replaced by the values from identified fields in the response record.

For example, consider a TagID that causes a database to be searched in the manner specified by the Parameter List, where each record returned has a field named "name" and a field named "address". The tagged sequence would have the form:

<Search Data Base, Parameter List>

TS(&var1, &name, &address, &var2 ...) </Search Data Base>

Assume that the object-method invoked by the command returns two records. The first record has name="G. Smith" and address="123 2nd Street", and the second record has name="A. Jones" and address="14 Main Street". The present invention would replace the tagged sequence with the two sequences

TS(&var1, "G. Smith", "123 2nd Street", &var2 ...)

TS(&var1, "A. Jones", "14 Main Street", &var2 ...).

The XML processor would then resume its processing starting with the first of these sequences.

It should be noted that since the records returned in the above-described example did not include fields matching "var1" and "var2", these parameters were not replaced. These remaining variables will be supplied by the operations specified in each TS. It should be noted that TS might also include both external and local tagged sequences.
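The record-by-record replacement in the example above can be illustrated with a short Python sketch. It assumes that "&" is the variable delimiter and that each response record is a dictionary; the function name `expand` and the sample TS are hypothetical. Variables with no matching field are left in place, and an empty list of records produces an empty result, which corresponds to deleting the tagged sequence.

```python
import re

# A variable is the "&" delimiter followed by a field name (assumed syntax).
VARIABLE = re.compile(r"&(\w+)")


def expand(ts, records):
    """Return one copy of TS per record, concatenated in record order."""

    def fill(record):
        def substitute(match):
            field = match.group(1)
            # Replace only variables that name a field of this record.
            return str(record[field]) if field in record else match.group(0)

        return VARIABLE.sub(substitute, ts)

    return "".join(fill(record) for record in records)


# The two response records from the example in the text.
records = [
    {"name": "G. Smith", "address": "123 2nd Street"},
    {"name": "A. Jones", "address": "14 Main Street"},
]
result = expand("<li>&var1 &name, &address</li>", records)
```

Here `result` holds two copies of the sequence, one per record, each still containing `&var1` because no returned field has that name.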

While the present invention has been explained in terms of hypertext markup language processors, it will be obvious to those skilled in the art from the preceding discussion that the teachings of the present invention can be applied to a wide variety of code processing engines. The code-processing engine needs to be able to recognize sequences of instructions that are to be processed by it, "internal instructions", and those that are to be processed externally, "external instructions". The engine will contain mapping for identifying an external object-method to process each type of external instruction. Each of the external processes must return a sequence of instructions that is recognizable by the code processor. The number of sequences returned will be determined by the specific external instruction and the parameter list accompanying that instruction. While the above examples utilized a separate parameter list and included tagged sequence TS, it will be obvious to those skilled in the art from the preceding discussion that the parameter list sent to the external object might include TS as well.

It should also be noted that the present invention may be implemented via a preprocessor that operates on the source file to generate an intermediate file which is converted by a conventional XML processor to the HTML file sent to the browser. In such an embodiment, the preprocessor only needs to recognize the tags for the external instructions. Each time the processor finds such a tag, it finds the matching tag, constructs the TS sequence, and transmits the same to the object-method specified in the table that defines the mapping between the external instructions and the corresponding object-methods. The preprocessor then inserts the returned sequences into the source file in place of the original sequence and continues processing at the beginning of the inserted sequence.

The present invention has been explained in terms of object-oriented systems in which the external instructions are sent to methods of objects that run under the operating system on the server. However, it will be obvious to those skilled in the art from the preceding discussion that the present invention can utilize any form of external processing program to convert the external instructions into internal instructions.

The present invention has been explained in terms of the replacement of the external instruction by one or more new instructions. However, it will be obvious to those skilled in the art from the preceding discussion that the instruction returned could be an instruction indicating that the original instruction was to be deleted, i.e., a "no-op" instruction.

The above-described embodiments of the present invention have utilized hypertext-like documents. However, the teachings of the present invention can be generalized to other processing systems. In general, a file A exists whose format conforms to the specification for Language X. This file, however, contains language elements which are directives recognized by the file processor. While these directives are written in legal Language X syntax, they are not generally legal Language X elements.

The processor reads file A and holds it as a representation. The most convenient representation is often hierarchical, as is the case with a tree, but the file need not be represented as such.

The processor traverses the representation, looking up each language element encountered in a list of special element directives. If the element is found, the file processor then directs control to a functional module written in Language Y. Typically, the functional module is informed of the element and location within the representation from which it was invoked, and also some context information, in those situations where the representation supports it. However, the functional module may just receive a copy of the element.

The functional module then performs the manipulations specified by the element on the representation. For example, the functional module can remove the element, and any associated elements (the element's children, in the case of a hierarchical representation) via the module that was invoked. In addition, the functional module can repeatedly copy associated elements (children, for a tree representation), generally with context information individual to each copy. In the case where the representation is hierarchical, it is entirely possible that Language X elements within the scope of the module's manipulation also represent special directives requiring processing via functional modules of their own. Hence, in this case the functional module must also arrange for traversal of its children. In general, there are no limitations on the extent or type of manipulation the module can perform on the representation.

The element which caused the invocation of the module is generally (but not always) not a legal Language X element. Hence, the execution of the module is normally accompanied by the removal of the element from the representation.

The processor continues traversing the representation until its elements are exhausted, at which point it may be converted into its output format.
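The general traversal can be sketched with Python standing in for Language Y and the standard-library `xml.etree.ElementTree` tree standing in for the representation. The directive names `drop`, `unwrap` and `repeat`, the table `DIRECTIVES`, and the convention that a module returns the list of nodes replacing its directive are all assumptions made for illustration. As a simplification, the processor itself re-examines the inserted nodes rather than leaving each module to arrange traversal of its children.

```python
import copy
import xml.etree.ElementTree as ET


def drop(element, context):
    """Hypothetical module: remove the directive and everything below it."""
    return []


def unwrap(element, context):
    """Hypothetical module: keep the children, discard the directive itself."""
    return list(element)


def repeat(element, context):
    """Hypothetical module: copy the children `times` times."""
    times = int(element.get("times", "1"))
    return [copy.deepcopy(child) for _ in range(times) for child in element]


# List of special element directives, keyed by element name (assumed names).
DIRECTIVES = {"drop": drop, "unwrap": unwrap, "repeat": repeat}


def traverse(parent, context=None):
    index = 0
    while index < len(parent):
        child = parent[index]
        module = DIRECTIVES.get(child.tag)
        if module is None:
            traverse(child, context)  # ordinary Language X element: descend
            index += 1
            continue
        replacements = module(child, context)
        del parent[index]  # the directive is not a legal Language X element
        for offset, node in enumerate(replacements):
            parent.insert(index + offset, node)
        # The index is not advanced, so inserted nodes are examined next and
        # directives nested inside a directive are processed as well.


root = ET.fromstring(
    "<page><unwrap><p>a</p><drop><p>secret</p></drop></unwrap>"
    '<repeat times="2"><p>b</p></repeat></page>'
)
traverse(root)
output = ET.tostring(root, encoding="unicode")
```

After the traversal, `output` contains only ordinary elements: the `drop` subtree is gone and the child of `repeat` appears twice.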

One application of such a method is directed to delivering web pages dynamically. In this case, the content of the web page is written in one language (HTML); HTML is not, however, suitable for describing programmatic operations. For example, HTML cannot describe a database query. In this case, the present invention has been described in terms of XML, which is a superset of HTML. XML is similar to HTML in that there are tags. However, in addition to the static set of tags specified by HTML, XML allows the author to declare new tags and to use these newly defined tags. The translation of an HTML file to XML is trivial, since the HTML file looks exactly the same in XML as it did in HTML. In the terms of the above general description, the XML file represents the file A; XML is the Language X. In addition to the static HTML tags used on a web page, other special tags can now be included as described above. Each of these tags is associated by name with a particular Java class implemented as one or more code modules in the Java programming language. In this case Java is Language Y.

The processor reads the XML file A and holds it in a tree data structure, in which each of the nodes of the tree represents either statements analogous to text or statements analogous to an XML tag.

The processor then traverses the representation, looking up each tag node in a list of names. If the tag node is found, the processor calls a piece of Java code. The piece of Java code is informed of the element and location within the data tree, as well as some context information (which it can then pass to any tags it may find in the continued traversal). This context information is referred to by XML entities, which can be bound by name to pieces of data.

The Java code can then perform the manipulation on the tree associated with that code. In the case of delivering web pages, two kinds of manipulation may be utilized as suggested above. First, the code might determine that given the particular place in the tree at which the code was invoked, a piece of the tree should be excluded (i.e., removal of the element). This would simply remove a fragment of the HTML text as represented in the tree.

Second, the code might copy the nodes in the tree that fall below its reference. This has the effect of reproducing fragments of HTML. Each row of an HTML table, for example, would be represented initially by a fragment of HTML, and, subsequently, by a piece of the tree. This piece of the tree can be copied many times to produce, for example, many rows of the table, each row of which differs only by data. A table mapping names of cars to prices of cars, for example, would have entity references to names and prices; the tree structure of each row would be the same, and would be copied many times, but with each copy, the data itself (the value of the entity reference) would change. In this case, the initial web page, expressed in XML source, is read into a tree; the tree is then manipulated to produce completely dynamic data. After the tree has been altered, it is converted back into the original language, which now contains only tags a browser can represent. Since HTML is a subset of XML, once the processing removes all the tags from the XML source that are not legal HTML, the resulting file will produce HTML which can then be sent to the browser.
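The car table example can be sketched as follows. The sketch is written in Python for brevity, whereas the embodiment described above uses Java code modules. The special tag name `car-rows`, the function `expand_rows`, and the sample data are hypothetical. One further assumption: `{name}`-style placeholders stand in for the XML entity references described above, since the standard-library parser used here rejects undeclared entities.

```python
import copy
import xml.etree.ElementTree as ET

# Made-up sample data standing in for the result of a database query.
CARS = [
    {"name": "Roadster", "price": "$20,000"},
    {"name": "Wagon", "price": "$15,000"},
]


def expand_rows(parent, tag, records):
    """Replace each `tag` element with one copy of its children per record."""
    index = 0
    while index < len(parent):
        child = parent[index]
        if child.tag != tag:
            expand_rows(child, tag, records)  # ordinary tag: descend
            index += 1
            continue
        del parent[index]  # the special tag is not legal HTML
        for record in records:
            for template in child:
                row = copy.deepcopy(template)  # same structure for every row
                for node in row.iter():
                    if node.text:
                        # Only the data bound to the placeholders changes.
                        node.text = node.text.format(**record)
                parent.insert(index, row)
                index += 1


page = ET.fromstring(
    "<table><tr><th>Car</th><th>Price</th></tr>"
    "<car-rows><tr><td>{name}</td><td>{price}</td></tr></car-rows></table>"
)
expand_rows(page, "car-rows", CARS)
html_out = ET.tostring(page, encoding="unicode")
```

Once the special tag has been replaced, `html_out` contains only tags a browser can represent: the header row followed by one data row per record.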

Various modifications to the present invention will become apparent to those skilled in the art from the foregoing description and accompanying drawings. Accordingly, the present invention is to be limited solely by the scope of the following claims.

Claims

WHAT IS CLAIMED IS:

1. A method for operating a data processing system to generate output, said method comprising the steps of:

providing a source file containing a plurality of instructions, said instructions including external instructions and internal instructions;

providing one or more external processing programs;

providing a source file processor for said source file, said source file processor recognizing said external instructions, said source file processor also including code for assigning one of said external processing programs to each of said external instructions, wherein upon recognizing one of said external instructions, said source file processor (a) determines which of said external processing programs processes that instruction, (b) transfers that instruction to that external processing program, (c) receives back from said external processing program zero or more instructions to be inserted into said source file in place of that instruction, and (d) inserts said received instructions into said source file in place of that instruction to provide a processed input; and

providing an output processor for generating said output from said processed input.

2. The method of claim 1 wherein said inserted instructions comprise at least one internal instruction.

3. The method of claim 1 wherein said internal instructions and said external instructions each begin with a unique start tag and end with a predetermined end tag.

4. The method of claim 1 wherein said source file processor operates on said source file after said server has received a request for said output file.

5. The method of claim 3 wherein said source file processor includes a table of beginning tags that defines one of said external processing programs for each of said external instructions.

6. The method of claim 5 wherein one of said external instructions includes a variable to be filled in by said external processing program corresponding to that external instruction.

7. The method of claim 6 wherein, in response to receiving one of said external instructions, said external processing program generates N records, wherein N>1, each record providing a different value for said variable and wherein said instructions that are inserted into said source file in place of said external instruction comprise N instructions, each instruction comprising a different one of said variable values.

8. The method of claim 7 wherein each of said N instructions inserted in place of said external instruction differs from the other of said N instructions only by the values inserted into said variable.

9. The method of claim 1 wherein said output file comprises an HTML file.

10. The method of claim 1 wherein said source file is an XML file.