WO2010025763A1 - Protocol message parsing - Google Patents

Protocol message parsing

Info

Publication number
WO2010025763A1
Authority
WO
WIPO (PCT)
Prior art keywords
message
memory
session initiation
lexical element
lexical
Prior art date
Application number
PCT/EP2008/061577
Other languages
French (fr)
Inventor
Anders Nordström
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Kalluri, Krishna Prasad
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ), Kalluri, Krishna Prasad filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/EP2008/061577 priority Critical patent/WO2010025763A1/en
Publication of WO2010025763A1 publication Critical patent/WO2010025763A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/12 Protocol engines
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1101 Session protocols
    • H04L65/1104 Session initiation protocol [SIP]

Definitions

  • the present invention relates to a method and apparatus for optimising the parsing of protocol messages at a message handling agent and is applicable in particular to the parsing of text-based protocol messages, for example Session Initiation Protocol messages.
  • SIP Session Initiation Protocol
  • IETF Internet Engineering Task Force
  • HTTP Hypertext Transfer Protocol
  • SMTP Simple Mail Transfer Protocol
  • a key objective in the development of SIP was simplicity and the reuse of existing Internet mechanisms. Not only is SIP text based (where individual characters are encoded using the 8-bit UCS/Unicode Transformation Format (UTF-8) character set), at least the header portions of SIP messages are essentially human readable to simplify the message creation and debugging processes.
  • UTF-8 Unicode Transformation Format
  • SIP messages are handled by SIP agents.
  • a SIP agent may be present at a user terminal or at a network-based node, in particular at SIP proxies and SIP Application Servers (ASs).
  • ASs SIP Application Servers
  • SIP agents may be implemented at the following SIP entities: Back-to-Back User Agent, Proxy, Registrar, Redirect Server, User Agent Client, and User Agent Server.
  • a particular SIP agent receiving a SIP message might in some cases only need to look at one or a small number of fields within a SIP message in order to handle a message, for example it may only need to look at the "To" field which contains a destination SIP address.
  • the human-readable format of SIP message headers makes the parsing of messages at network nodes incorporating a SIP agent a computationally difficult process. Parsing involves searching the header for particular text strings and interpreting them, e.g. in order to identify the address or Universal Resource Identifier (URI) of the called party (e.g. "sip:bob@biloxi.com").
  • URI Universal Resource Identifier
  • SIP does not specify a field order within the SIP header, allows the unrestricted use of delimiters such as spaces between header field names and values, and allows header field values to span multiple lines.
  • Parsing is a particular problem in communication networks such as the IP Multimedia Subsystem (IMS) specified by 3GPP for the provision of multimedia services to mobile or fixed subscribers, where a message may pass through a large number of SIP nodes and be parsed separately at each, e.g. at a Call Session Control Function (CSCF) or SIP Application Server (AS). On some occasions it may even be necessary to parse a message multiple times within the same node (e.g. at different components within the node). This could be the case, for example, where multiple SIP servlets within multiple application server instances are deployed at the same node. It might also be the case when a SIP agent is implemented using components written in different programming languages (e.g. C++ and Java), so that the components cannot reuse each other's parsing results.
  • SIP agents utilise the software parsing of SIP messages. This is however extremely computationally intensive given the text-based nature of SIP messages as described above.
  • the software parsing of SIP messages becomes a bottleneck for message handling given that the network hardware can operate at Gigabit speeds.
  • a hardware parser that is connected directly to the network interface, parsing messages before passing them to the protocol message processing entity, e.g. a SIP agent in the case of SIP.
  • the apparatus for handling a message conforming to a text-based signaling protocol.
  • the apparatus comprises a memory, a network interface for receiving a message, and a controller for storing the received message in the memory.
  • the apparatus further comprises a hardware parser for retrieving the message from the memory, for parsing the message to identify the presence of predefined lexical elements in the message and, for each identified lexical element, for identifying the location and length of any associated data, and for storing presence, location and length information in the memory.
  • a protocol message processing entity is provided for retrieving said information from the memory, for using the information to extract relevant data from the message stored in the memory, and for processing the message according to the extracted data.
  • the hardware parser may be implemented in a Field Programmable Gate Array or an Application Specific Integrated Circuit.
  • Other components of the apparatus including the controller, memory and network interface may also be implemented within the same hardware.
  • the network interface may be configured to inform said controller upon receipt of a message by the controller, e.g. by sending an interrupt or responding to a poll message.
  • the text-based signaling protocol is the Session Initiation Protocol
  • said protocol message processing entity is a Session Initiation Protocol agent
  • the apparatus may comprise at least one processor and a second memory storing a computer program which, when run on said at least one processor, implements said Session Initiation Protocol agent.
  • the apparatus may be configured to perform as one of a:
  • Session Initiation Protocol Back-to-Back User Agent; Session Initiation Protocol Proxy; Session Initiation Protocol Registrar; Session Initiation Protocol Redirect Server; Session Initiation Protocol User Agent Client; and Session Initiation Protocol User Agent Server.
  • the apparatus may be configured for use with an IP Multimedia Subsystem, being one of a user terminal, a Session Initiation Protocol Application Server, and a Call Session Control Function node.
  • the hardware parser is configured to construct, for the received message, a bit sequence in which each bit setting indicates the presence or absence of a given lexical element within the message, and to store the bit sequence in the memory as part of said information.
  • the parser is configured to store the corresponding location and length within a data structure at a location in the memory predefined for that lexical element.
  • for a second occurrence of a lexical element in the message, the parser is configured to store the corresponding location and length within a second data structure in the memory, and to include within the data structure stored for the first occurrence of the lexical element a pointer to the location of the second data structure, and to repeat these steps iteratively for each further occurrence of the lexical element.
  • the parser is also configured to identify in the data structure stored for the first occurrence of the lexical element, the number of further occurrences of the lexical element within the message.
  • the data structure may be created to contain data within fields at predefined locations and of predefined size.
  • the network interface is associated with an IP socket, i.e. an IP address and port number.
  • this may comprise a first recogniser for recognising a control character within the message and a second recogniser for recognising upper case characters within the message and for converting these to lower case characters, the outputs of each recogniser being used by the hardware parser to identify lexical elements.
  • a method of handling a message conforming to a text-based signaling protocol comprises receiving a message at a network interface of a message handling node, storing the received message in a memory of the node, and retrieving the message from the memory into a hardware parser.
  • the parser parses the message in order to identify the presence of predefined lexical elements in the message and, for each identified lexical element, to determine the location and length of any data associated with the element.
  • the parser then stores the presence, location and length information in the memory.
  • the stored information can be retrieved from the memory, and used by a protocol message processing entity to extract relevant data from the message stored in the memory and process the message according to the extracted data.
  • the text-based signaling protocol may be the Session Initiation Protocol, with said protocol message processing entity being a Session Initiation Protocol agent.
  • the method may comprise running a computer program on one or more processors in order to implement said Session Initiation Protocol agent.
  • the method may comprise constructing with the hardware parser, for the received message, a bit sequence in which each bit setting indicates the presence or absence of a given lexical element within the message, and storing the bit sequence in the memory as part of said information. For a first occurrence of a lexical element in the message, the corresponding location and length are stored within a data structure at a location in the memory predefined for that lexical element.
  • the method may further comprise, within said hardware parser, for a second occurrence of a lexical element in the message, storing the corresponding location and length within a second data structure in the memory, and including within the data structure stored for the first occurrence of the lexical element a pointer to the location of the second data structure, and repeating these steps iteratively for each further occurrence of the lexical element.
  • an indication of the number of further occurrences of the lexical element within the message may be stored within the data structure.
  • the data structure may be created to contain data within fields at predefined locations and of predefined size.
  • Figure 1 shows an exemplary SIP message including SIP method and header fields
  • Figure 2 illustrates schematically components within an IMS network
  • Figure 3 illustrates schematically elements of a SIP entity including a hardware parser
  • Figure 4 is a flow diagram illustrating a SIP message parsing process implemented within the entity of Figure 3
  • Figure 5 illustrates a data storage structure employed by the entity of Figure 3.
  • a Session Initiation Protocol (SIP) message contains a number of header fields and optionally a body containing the payload (which might conform to the Session Description Protocol). Whilst some of these fields are optional, others are mandatory (typically dependent on the type of SIP request). All of the header fields are encoded in a human-readable form and some even allow for multiple different encodings (e.g. timestamps).
  • Figure 1 shows an example SIP message which contains a SIP method field, in this case the INVITE method, and a number of header fields including multiple occurrences of the Route field.
  • the message may also include a Session Description Protocol (SDP) part, although this is not shown in the example.
  • SDP Session Description Protocol
  • Figure 2 illustrates schematically components of an IMS network that can be used to establish a call (voice, multimedia, etc) between two user terminals 1, 2.
  • An IMS core network which represents a home IMS network of client 1 comprises a P-CSCF 3, an S-CSCF 4, and an I-CSCF 5, as well as a SIP AS 6.
  • An IMS core network 7 (also comprising appropriate CSCFs and ASs) is a home IMS network for client 2.
  • client 1 and client 2 may share a common home network.
  • SIP messages traversing the IMS network(s) must be parsed at multiple different nodes including the CSCFs and the SIP ASs.
  • Figure 3 illustrates schematically relevant elements of a SIP entity such as, for example, a SIP proxy.
  • the elements include a network interface 100 which may be an internet socket identified by an IP address and port number, or other appropriate network interface.
  • An area of common memory 101 is provided, within which various memory sub-areas 101a, 101b, 101c, etc, are dynamically defined on a per message basis.
  • a SIP agent 102 is implemented in software (running on an appropriate computer server) and implements standard SIP handling operations such as message routing.
  • the SIP agent 102 does not parse received SIP messages for included methods and headers. Rather, the SIP agent is configured to identify and extract relevant data from structured data stored in the common memory 101 by a hardware parser 103.
  • the hardware parser 103 is preferably implemented using a Field Programmable Gate Array (FPGA). Implementation using an FPGA is extremely cost effective as it is relatively easy to make modifications to the device functionality, for example to take account of new SIP messages and message structures.
  • the hardware parser 103 is configured to provide a control unit 104 which is coupled to both the network interface 100 and the SIP agent 102.
  • the parser also provides a memory controller 105, coupled to the control unit 104, for the purpose of controlling the common memory 101.
  • the memory controller also controls a Lexical element lookup memory 112 that is provided within the hardware parser 103.
  • the Lexical element lookup memory maintains a mapping between bit positions (bits 0 to 53) and the unique character sequences corresponding to SIP methods and headers (RFC 3261).
  • unique character sequences act as addresses to respective bit positions.
  • the bit allocation can be extended for other methods and headers described in SIP extensions.
  • a further component provided by the hardware parser 103 is a Main Supervision Finite State Machine (FSM) 106.
  • the role of the Main Supervision FSM is to retrieve individual characters within a received SIP message from memory, and instruct subsequent processing entities to handle the characters.
  • This FSM is connected to a Lexical Element Finder (LEF) FSM 107 via a Control signals block 108 and an Other characters handling block 109.
  • Outputs of the LEF FSM 107 are connected to respective Header/Method and Address offset length registers 110 and 111.
  • each memory sub-block is configured to provide a SIP packet memory.
  • the start address of this SIP packet memory is fixed and is known to the SIP agent 102 and to the hardware parser 103 via the memory controller 105.
  • the SIP message handling process commences at step 200, and thereafter the Network Interface 100 detects receipt of a new SIP message (step 201) and stores this into the SIP packet memory 101 (step 202).
  • the network interface 100 then informs (via an appropriate interrupt) the control unit 104 that a new message has been received, whereupon the control unit 104 initiates the Main Supervision FSM 106 to begin parsing the message (step 203).
  • the Main Supervision FSM resets the LEF FSM 107 to an initial state, and resets a hit Scoreboard (the function of which is described in detail below) maintained in a memory of the memory controller (step 204).
  • the Main Supervision FSM 106 begins streaming the stored SIP message from the SIP message memory via the memory controller 105 (step 205), delivering one character per clock cycle to the Control signals and Other characters blocks via the common bus 115.
  • the Control signals block 108 analyses each character in turn and identifies whether the character matches one of six special characters, namely space, horizontal tab, comma, colon, linefeed and carriage return. These represent SIP control characters. If a control character is detected, the Control signals block 108 sets the corresponding one of six control lines 114 high so as to inform the LEF FSM 107.
  • the Other characters block 109 is responsible for converting any upper case letters to lower case such that only lower case characters are passed to the LEF FSM 107 (i.e. lower case and other characters are passed without modification). These processes are identified by step 206 in Figure 4.
  • the Main Supervision FSM 106 instructs the LEF FSM that a new message is being streamed, and that individual characters are being passed to it, via the control lines 113.
  • the LEF FSM 107 searches the character stream, using the control signals, to obtain a unique pattern for each SIP lexical element (steps 207 and 208). For each unique pattern identified by the LEF FSM, the LEF FSM determines whether or not an associated data field is present and, if so, determines the start address and length of the data field. Considering for example the SIP message of Figure 1, after determining a unique pattern for the INVITE method, the LEF FSM will determine the start address and length of the data field "sip:bob@biloxi.com SIP/2.0". The LEF FSM also performs a syntactic analysis of the data field to identify the presence of errors. The offset and length (within the SIP message) of the identified header or method is written to the address offset register 111.
  • when the LEF FSM 107 identifies a complete unique pattern, it writes the pattern to the Header/Method register 110 and calls the memory controller 105 (step 209).
  • the memory controller will first call lexical element lookup memory 112 to find the lexical element's correct alphabetical order number (Table 1) using the lexical element's unique pattern as address (stored in the Header/Method register).
  • the lookup memory returns a number from 0 to 63. Bit positions 0 to 9 are used for methods. Bit positions 10 to 53 are used for SIP headers. Bit positions 54 to 62 are currently unused, whilst bit position 63 indicates an unrecognised method or header.
  • the returned bit position is entered into a "hit Scoreboard" or SIP summary by the memory controller, i.e. a 64 bit word in which a "1" indicates the presence of a SIP method/header.
  • the Scoreboard is maintained in an "internal" register of the memory controller.
  • the memory controller 105 creates for each identified header and method, a data structure within the data storage memory component as illustrated in Figure 5.
  • the structure comprises 64 bits assigned as follows:
  • each lexical element has a predefined start address according to its bit position in the Scoreboard.
  • the LEF FSM 107 performs the analysis of the data field (including determining the start address and length and performing the syntactic analysis) at the same time as the memory controller 105 accesses the Lexical Element lookup memory 112, as the two processes are essentially independent.
  • the current state of the hit Scoreboard is used by the memory controller 105 to determine if a newly identified method/header has already been identified in the current SIP message. If there is more than one hit for a lexical element, these hits are put in a linked list in the appropriate data storage component. The header field count is incremented to indicate the number of remaining entries in the linked list, whilst the next pointer field contains the start address of the next entry.
  • an example data structure might be: 0040 0040 0026 007 where
  • Header Field Count + error indication (12 bits) : 004
  • the Main Supervision FSM 106 calls the control unit (step 210).
  • the control unit in turn calls the memory controller 105 which causes the Scoreboard held in the Header/Method register 110 to be written to the memory sub-block.
  • the control unit then calls the SIP agent 102, providing to it the start address of the corresponding memory sub-block. Handling of the SIP message is then passed to the SIP agent.
  • in the event that a lookup to the Lexical element lookup memory 112 by the memory controller 105 returns an error, i.e. there is no lexical element corresponding to the unique pattern determined by the LEF FSM, the memory controller indicates this by setting bits 61-63 of the data structure (Figure 5). Data structures corresponding to erroneous unique patterns are stored within the data storage memory as a linked list, within that block of memory allocated to bit position 63 (of the hit Scoreboard). If the LEF FSM identifies a syntactical error in a data field associated with a method or header, an error code is written to bits 61-63 of the data structure in Figure 5, via control signals 117.
  • the header field count (bits 52 to 60) may be stored within a separate part of the memory sub-block. More particularly, a field count table may be created for each SIP message, the table having 64 rows, one for each method/header type. Each row contains the number of occurrences of the corresponding method/header.
  • when the SIP agent retrieves a given data structure, either the SIP agent or the memory controller extracts the appropriate entry from the table into bits 52 to 60 of the data structure. For each occurrence retrieved, the header field count is decremented by one.
  • the data structure proposed here is optimised in view of the SIP message structure.
  • any header field can be present more than once, and multiple occurrences of the same header field need not be consecutive in the SIP message. It is precisely this structure that requires significant effort when using the conventional software parsers.
  • the handling and data structure proposed here presents in a clearly accessible format both the number of occurrences of a header field (i.e. Lexical Element Count) and the locations within the message of those occurrences.
  • SIP is only one example of a text based signaling protocol to which the described parsing approach may be applied.
  • Other examples include SDP and H.248.
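The bit assignments given above (lookup results 0 to 9 for methods, 10 to 53 for headers, 54 to 62 unused, 63 for unrecognised elements; header field count in bits 52 to 60 and error indication in bits 61 to 63 of the 64-bit data structure) can be modelled in software roughly as follows. This is only an illustrative sketch: the function names are hypothetical, bit 0 is assumed to be the least significant bit, and the layout of bits 0 to 51 is left untouched because it is not reproduced in this text.

```python
# Illustrative software model of the bit assignments described above.
# Assumption: bit 0 is the least significant bit of the 64-bit word.

WORD_MASK = 0xFFFF_FFFF_FFFF_FFFF

COUNT_SHIFT, COUNT_MASK = 52, 0x1FF  # header field count: bits 52..60 (9 bits)
ERROR_SHIFT, ERROR_MASK = 61, 0x7    # error indication: bits 61..63 (3 bits)


def classify(bit: int) -> str:
    """Classify a number returned by the lexical element lookup memory."""
    if not 0 <= bit <= 63:
        raise ValueError("lookup result must be in the range 0..63")
    if bit <= 9:
        return "method"
    if bit <= 53:
        return "header"
    if bit <= 62:
        return "unused"
    return "unrecognised"


def _set_field(word: int, shift: int, mask: int, value: int) -> int:
    """Overwrite one bit field of the word, leaving all other bits unchanged."""
    if not 0 <= value <= mask:
        raise ValueError("value does not fit in the field")
    return ((word & ~(mask << shift)) | (value << shift)) & WORD_MASK


def set_count(word: int, count: int) -> int:
    return _set_field(word, COUNT_SHIFT, COUNT_MASK, count)


def get_count(word: int) -> int:
    return (word >> COUNT_SHIFT) & COUNT_MASK


def set_error(word: int, code: int) -> int:
    return _set_field(word, ERROR_SHIFT, ERROR_MASK, code)


def get_error(word: int) -> int:
    return (word >> ERROR_SHIFT) & ERROR_MASK
```

Keeping the count and error fields at fixed positions means a reader of the structure needs only a shift and a mask to recover them, which is the kind of access a software agent can perform without any text parsing.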

Abstract

Apparatus for handling a message conforming to a text-based signaling protocol, for example the Session Initiation Protocol (SIP). The apparatus comprises a memory and a network interface for receiving a message. This interface may be an Internet interface. The apparatus further comprises a controller for storing the received message in the memory and a hardware parser for retrieving the message from the memory and for parsing the message to identify the presence of predefined lexical elements in the message. For each identified lexical element, the hardware parser identifies the location and length of any associated data, and stores this information in the memory. A protocol message processing entity is provided to retrieve said information from the memory, to use the information to extract relevant data from the message stored in the memory, and to process the message according to the extracted data.

Description

PROTOCOL MESSAGE PARSING
Technical Field
The present invention relates to a method and apparatus for optimising the parsing of protocol messages at a message handling agent and is applicable in particular to the parsing of text-based protocol messages, for example Session Initiation Protocol messages.
Background
The Session Initiation Protocol (SIP) is a text-based protocol specified by the Internet Engineering Task Force (IETF) in RFC 3261, similar to Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP), for initiating interactive communication sessions between users. Such sessions include voice, video, chat, interactive games, and virtual reality. A key objective in the development of SIP was simplicity and the reuse of existing Internet mechanisms. Not only is SIP text based (where individual characters are encoded using the 8-bit UCS/Unicode Transformation Format (UTF-8) character set), at least the header portions of SIP messages are essentially human readable to simplify the message creation and debugging processes.
SIP messages are handled by SIP agents. A SIP agent may be present at a user terminal or at a network-based node, in particular at SIP proxies and SIP Application Servers (ASs). Specifically, SIP agents may be implemented at the following SIP entities: Back-to-Back User Agent, Proxy, Registrar, Redirect Server, User Agent Client, and User Agent Server. A particular SIP agent receiving a SIP message might in some cases only need to look at one or a small number of fields within a SIP message in order to handle a message, for example it may only need to look at the "To" field which contains a destination SIP address.
The human-readable format of SIP message headers makes the parsing of messages at network nodes incorporating a SIP agent a computationally difficult process. Parsing involves searching the header for particular text strings and interpreting them, e.g. in order to identify the address or Universal Resource Identifier (URI) of the called party (e.g. "sip:bob@biloxi.com"). The problem is compounded by the fact that SIP does not specify a field order within the SIP header, allows the unrestricted use of delimiters such as spaces between header field names and values, and allows header field values to span multiple lines.
Parsing is a particular problem in communication networks such as the IP Multimedia Subsystem (IMS) specified by 3GPP for the provision of multimedia services to mobile or fixed subscribers, where a message may pass through a large number of SIP nodes and be parsed separately at each, e.g. at a Call Session Control Function (CSCF) or SIP Application Server (AS). On some occasions it may even be necessary to parse a message multiple times within the same node (e.g. at different components within the node). This could be the case, for example, where multiple SIP servlets within multiple application server instances are deployed at the same node. It might also be the case when a SIP agent is implemented using components written in different programming languages (e.g. C++ and Java), so that the components cannot reuse each other's parsing results. This could easily happen when third party software is used, or during other integration activities. In any case, the computational overhead introduced by handling messages within the SIP stack can be considerable, giving rise to resource problems particularly at nodes handling a high volume of SIP message traffic.
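To make the difficulty concrete, the following minimal sketch shows the kind of normalisation a software parser has to perform before it can even compare header names. The header block, the addresses other than bob@biloxi.com, and the function names are made up for illustration; the sketch handles only extra whitespace around the colon, mixed-case names and values continued on a following line that starts with a space or tab.

```python
# Minimal sketch of software-side header normalisation (illustrative only).

def unfold(raw: str) -> list[str]:
    """Join continuation lines (starting with space or tab) onto the previous line."""
    lines: list[str] = []
    for line in raw.split("\r\n"):
        if line[:1] in (" ", "\t") and lines:
            lines[-1] += " " + line.strip()
        elif line:
            lines.append(line)
    return lines


def split_header(line: str) -> tuple[str, str]:
    """Split at the first colon, trimming whitespace and lower-casing the name."""
    name, _, value = line.partition(":")
    return name.strip().lower(), value.strip()


# Three headers written in three different but equally permissible styles.
RAW = (
    "To  :   <sip:bob@biloxi.com>\r\n"
    "Subject: lunch\r\n"
    "\tat noon\r\n"
    "FROM:<sip:alice@atlanta.com>\r\n"
)

HEADERS = [split_header(line) for line in unfold(RAW)]
```

Every character of every header has to be inspected to produce this result, which is why the work grows with message size regardless of how few fields the agent actually needs.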
Existing implementations of SIP agents utilise the software parsing of SIP messages. This is however extremely computationally intensive given the text-based nature of SIP messages as described above. The software parsing of SIP messages becomes a bottleneck for message handling given that the network hardware can operate at Gigabit speeds.
It is known to use hardware to parse messages having an unstructured format. For example, US2003/0023633 presents a method for the efficient parsing of XML data. Applying such a hardware approach to parsing implies that, upon receipt of a SIP message at a SIP agent, the message is passed to the hardware parser. When parsing is completed, the message is returned to the SIP agent. Of course, processing of the message is delayed due to the requirement for the SIP agent to receive the SIP message in the first instance and pass it to the hardware parser.
Summary
It is an object of the present invention to overcome the above noted disadvantages of conventional software and hardware parsing mechanisms. This is achieved, at least in part, by employing a hardware parser that is connected directly to the network interface, parsing messages before passing them to the protocol message processing entity, e.g. a SIP agent in the case of SIP.
According to a first aspect of the present invention there is provided apparatus for handling a message conforming to a text-based signaling protocol. The apparatus comprises a memory, a network interface for receiving a message, and a controller for storing the received message in the memory. The apparatus further comprises a hardware parser for retrieving the message from the memory, for parsing the message to identify the presence of predefined lexical elements in the message and, for each identified lexical element, for identifying the location and length of any associated data, and for storing presence, location and length information in the memory. A protocol message processing entity is provided for retrieving said information from the memory, for using the information to extract relevant data from the message stored in the memory, and for processing the message according to the extracted data.
The hardware parser may be implemented in a Field Programmable Gate Array or an Application Specific Integrated Circuit. Other components of the apparatus including the controller, memory and network interface may also be implemented within the same hardware.
The network interface may be configured to inform said controller upon receipt of a message by the controller, e.g. by sending an interrupt or responding to a poll message.
According to a particular embodiment of the invention, the text-based signaling protocol is the Session Initiation Protocol, and said protocol message processing entity is a Session Initiation Protocol agent. In this case, the apparatus may comprise at least one processor and a second memory storing a computer program which, when run on said at least one processor, implements said Session Initiation Protocol agent.
The apparatus may be configured to perform as one of a:
Session Initiation Protocol Back-to-Back User Agent; Session Initiation Protocol Proxy; Session Initiation Protocol Registrar; Session Initiation Protocol Redirect Server; Session Initiation Protocol User Agent Client; and
Session Initiation Protocol User Agent Server.
The apparatus may be configured for use with an IP Multimedia Subsystem, being one of a user terminal, a Session Initiation Protocol Application Server, and a Call Session Control Function node.
According to one particular implementation, the hardware parser is configured to construct, for the received message, a bit sequence in which each bit setting indicates the presence or absence of a given lexical element within the message, and to store the bit sequence in the memory as part of said information. For a first occurrence of a lexical element in the message, the parser is configured to store the corresponding location and length within a data structure at a location in the memory predefined for that lexical element. For a second occurrence of a lexical element in the message, the parser is configured to store the corresponding location and length within a second data structure in the memory, and to include within the data structure stored for the first occurrence of the lexical element a pointer to the location of the second data structure, and to repeat these steps iteratively for each further occurrence of the lexical element. The parser is also configured to identify in the data structure stored for the first occurrence of the lexical element, the number of further occurrences of the lexical element within the message. The data structure may be created to contain data within fields at predefined locations and of predefined size. In a typical application of the invention, the network interface is associated with an IP socket, i.e. an IP address and port number.
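The presence bit sequence and the chained per-occurrence data structures described in this paragraph can be pictured with a small software model. This is a sketch under stated assumptions, not the hardware design: the class and field names are hypothetical, a dictionary keyed by bit position stands in for the predefined memory locations, object references stand in for memory pointers, and the bit positions used in the example are arbitrary.

```python
# Software model of the presence bit sequence and occurrence chain (illustrative).
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    offset: int                      # location of the associated data in the message
    length: int                      # length of the associated data
    further: int = 0                 # further occurrences (maintained on the first entry)
    next: Optional["Entry"] = None   # stands in for the pointer to the next structure


class ParseInfo:
    def __init__(self) -> None:
        self.presence = 0                   # one bit per predefined lexical element
        self.slots: dict[int, Entry] = {}   # models the predefined location per element

    def record(self, bit: int, offset: int, length: int) -> None:
        entry = Entry(offset, length)
        if not self.presence & (1 << bit):
            # First occurrence: set the presence bit and fill the predefined slot.
            self.presence |= 1 << bit
            self.slots[bit] = entry
            return
        # Further occurrence: append to the chain and update the count on the first entry.
        first = self.slots[bit]
        tail = first
        while tail.next is not None:
            tail = tail.next
        tail.next = entry
        first.further += 1

    def occurrences(self, bit: int) -> list[tuple[int, int]]:
        found = []
        entry = self.slots.get(bit)
        while entry is not None:
            found.append((entry.offset, entry.length))
            entry = entry.next
        return found


ROUTE, TO = 40, 50   # arbitrary bit positions chosen for this example only

INFO = ParseInfo()
INFO.record(ROUTE, 100, 20)
INFO.record(TO, 10, 30)
INFO.record(ROUTE, 130, 25)
```

A consumer first tests the relevant bit of `presence`, then follows the chain from the predefined slot, so occurrences of the same element need not be adjacent in the message.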
Considering further the hardware parser, this may comprise a first recogniser for recognising a control character within the message and a second recogniser for recognising upper case characters within the message and for converting these to lower case characters, the outputs of each recogniser being used by the hardware parser to identify lexical elements.
According to a second aspect of the present invention there is provided a method of handling a message conforming to a text-based signaling protocol. The method comprises receiving a message at a network interface of a message handling node, storing the received message in a memory of the node, and retrieving the message from the memory into a hardware parser. The parser parses the message in order to identify the presence of predefined lexical elements in the message and, for each identified lexical element, to determine the location and length of any data associated with the element. The parser then stores the presence, location and length information in the memory. The stored information can be retrieved from the memory, and used by a protocol message processing entity to extract relevant data from the message stored in the memory and process the message according to the extracted data.
The text-based signaling protocol may be the Session Initiation Protocol, with said protocol message processing entity being a Session Initiation Protocol agent. The method may comprise running a computer program on one or more processors in order to implement said Session Initiation Protocol agent.
The method may comprise constructing with the hardware parser, for the received message, a bit sequence in which each bit setting indicates the presence or absence of a given lexical element within the message, and storing the bit sequence in the memory as part of said information. For a first occurrence of a lexical element in the message, the corresponding location and length are stored within a data structure at a location in the memory predefined for that lexical element. The method may further comprise, within said hardware parser, for a second occurrence of a lexical element in the message, storing the corresponding location and length within a second data structure in the memory, and including within the data structure stored for the first occurrence of the lexical element a pointer to the location of the second data structure, and repeating these steps iteratively for each further occurrence of the lexical element. For the first occurrence of the lexical element, an indication of the number of further occurrences of the lexical element within the message may be stored within the data structure. The data structure may be created to contain data within fields at predefined locations and of predefined size.
Brief Description of the Drawings
Figure 1 shows an exemplary SIP message including SIP method and header fields; Figure 2 illustrates schematically components within an IMS network; Figure 3 illustrates schematically elements of a SIP entity including a hardware parser; Figure 4 is a flow diagram illustrating a SIP message parsing process implemented within the entity of Figure 3; and Figure 5 illustrates a data storage structure employed by the entity of Figure 3.
Detailed Description
A Session Initiation Protocol (SIP) message contains a number of header fields and optionally a body containing the payload (which might conform to the Session Description Protocol). Whilst some of these fields are optional, others are mandatory (typically dependent on the type of SIP request). All of the header fields are encoded in a human-readable form and some even allow for multiple different encodings (e.g. timestamps). Figure 1 shows an example SIP message which contains a SIP method field, in this case the INVITE method, and a number of header fields including multiple occurrences of the Route field. The message may also include a Session Description Protocol (SDP) part, although this is not shown in the example.
Figure 2 illustrates schematically components of an IMS network that can be used to establish a call (voice, multimedia, etc.) between two user terminals 1, 2. An IMS core network which represents a home IMS network of client 1 comprises a P-CSCF 3, an S-CSCF 4, and an I-CSCF 5, as well as a SIP AS 6. Of course, multiple instances of these entities will in practice be present. An IMS core network 7 (also comprising appropriate CSCFs and ASs) is a home IMS network for client 2. Of course, client 1 and client 2 may share a common home network. In any case, SIP messages traversing the IMS network(s) must be parsed at multiple different nodes including the CSCFs and the SIP ASs.
Considering the message of Figure 1, in order to allow a SIP entity receiving that message to efficiently handle the message, it is desirable to employ a hardware parser that can transform the message into a structured format that allows the SIP agent to quickly and easily locate specific fields and components of the message that are relevant to it. It is also desirable to do this in such a way that the SIP agent is not involved in the parsing operation, and is merely "activated" once the message has been parsed and the structured format made available to it.
Figure 3 illustrates schematically relevant elements of a SIP entity such as, for example, a SIP proxy. The elements include a network interface 100 which may be an internet socket identified by an IP address and port number, or other appropriate network interface. An area of common memory 101 is provided, within which various memory sub-areas 101a, 101b, 101c, etc., are dynamically defined on a per message basis. A SIP agent 102 is implemented in software (running on an appropriate computer server) and implements standard SIP handling operations such as message routing. However, in contrast to conventional SIP agents, the SIP agent 102 does not parse received SIP messages for included methods and headers. Rather, the SIP agent is configured to identify and extract relevant data from structured data stored in the common memory 101 by a hardware parser 103.
The hardware parser 103 is preferably implemented using a Field Programmable Gate Array (FPGA). Implementation using a FPGA is extremely cost effective as it is relatively easy to make modifications to the device functionality, for example to take account of new SIP messages and message structures. The hardware parser 103 is configured to provide a control unit 104 which is coupled to both the network interface 100 and the SIP agent 102. The parser also provides a memory controller 105, coupled to the control unit 104, for the purpose of controlling the common memory 101. In addition, the memory controller also controls a Lexical element lookup memory 112 that is provided within the hardware parser 103. The Lexical element lookup memory maintains a mapping between bit positions (bits 0 to 53) and corresponding unique character sequences corresponding to SIP methods and headers (RFC 3261). The allocation of bit positions is illustrated in Table 1 below. The unique character sequences act as addresses to respective bit positions. The bit allocation can be extended for other methods and headers described in SIP extensions.
A further component provided by the hardware parser 103 is a Main Supervision Finite State Machine (FSM) 106. The role of the Main Supervision FSM is to retrieve individual characters within a received SIP message from memory, and instruct subsequent processing entities to handle the characters. This FSM is connected to a Lexical Element Finder (LEF) FSM 107 via a Control signals block 108 and an Other characters handling block 109. Outputs of the LEF FSM 107 are connected to respective Header/Method and Address offset length registers 110 and 111.
Considering now the common memory 101, each memory sub-block is configured to provide a SIP packet memory. The start address of this SIP packet memory is fixed and is known to the SIP agent 102 and to the hardware parser 103 via the memory controller 105. With reference to the flow diagram of Figure 4, the SIP message handling process commences at step 200, and thereafter the Network Interface 100 detects receipt of a new SIP message (step 201) and stores this into the SIP packet memory 101 (step 202).
The network interface 100 then informs (via an appropriate interrupt) the control unit 104 that a new message has been received, whereupon the control unit 104 initiates the Main Supervision FSM 106 to begin parsing the message (step 203). The Main Supervision FSM resets the LEF FSM 107 to an initial state, and resets a hit Scoreboard (the function of which is described in detail below) maintained in a memory of the memory controller (step 204). The Main Supervision FSM 106 begins streaming the stored SIP message from the SIP message memory via the memory controller 105 (step 205), delivering one character per clock cycle to the Control signals and Other characters blocks via the common bus 115. Considering firstly the Control signals block 108, this block analyses each character in turn and identifies whether the character matches one of six special characters, namely: space, horizontal tab, comma, colon, linefeed and carriage return. These represent SIP control characters. If a control character is detected, the Control signals block 108 sets a corresponding one of six control lines 114 to high so as to inform the LEF FSM 107.
The Other characters block 109 is responsible for converting any upper case letters to lower case such that only lower case characters are passed to the LEF FSM 107 (i.e. lower case and other characters are passed without modification). These processes are identified by step 206 in Figure 4.
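By way of illustration only, the behaviour of the Control signals block 108 and the Other characters block 109 can be modelled in software as follows. This is a minimal sketch and not the hardware implementation; the function names, and the ordering of the six control lines, are assumptions made for the example.

```python
# Illustrative software model of the two recognisers. All names here are
# hypothetical, and the order of the control lines is an assumption.
CONTROL_CHARS = (" ", "\t", ",", ":", "\n", "\r")  # space, HT, comma, colon, LF, CR


def control_lines(ch: str) -> tuple:
    """Return six booleans, one per control line; at most one is True."""
    return tuple(ch == c for c in CONTROL_CHARS)


def other_characters(ch: str) -> str:
    """Convert ASCII upper case letters to lower case; pass all else unchanged."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) + 32)
    return ch


def recognise(message: str):
    """Yield (lower-cased character, control line settings), one per character."""
    for ch in message:
        yield other_characters(ch), control_lines(ch)
```

In this model one pair is produced per input character, mirroring the one character per clock cycle delivery described above.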
The Main Supervision FSM 106 instructs the LEF FSM that a new message is being streamed, and that individual characters are being passed to it, via the control lines 113. The LEF FSM 107 searches the character stream, using the control signals, to obtain a unique pattern for each SIP lexical element (steps 207 and 208). For each unique pattern identified by the LEF FSM, the LEF FSM determines whether or not an associated data field is present and, if so, determines the start address and length of the data field. Considering for example the SIP message of Figure 1, after determining a unique pattern for the INVITE method, the LEF FSM will determine the start address and length of the data field "sip:bob@biloxi.com SIP/2.0". The LEF FSM also performs a syntactic analysis of the data field to identify the presence of errors. The offset and length (within the SIP message) of the identified header or method is written to the address offset register 111.
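The kind of information produced by the LEF FSM for a header line can be sketched in software as follows. The sketch is a simplification under stated assumptions: it handles only a single, unfolded header line of the form "name: value" terminated by CRLF, it does not handle the request line or perform any syntactic analysis, and the function name locate_header is hypothetical.

```python
def locate_header(message: str, line_start: int):
    """Return (pattern, value_offset, value_length) for the header line that
    starts at line_start, or None if the line contains no colon.

    Assumption: the line is not folded and ends with CRLF or end of message.
    """
    line_end = message.find("\r\n", line_start)
    if line_end == -1:
        line_end = len(message)
    colon = message.find(":", line_start, line_end)
    if colon == -1:
        return None
    # The pattern is the lower-cased header name, as seen by the LEF FSM.
    pattern = message[line_start:colon].strip().lower()
    value_start = colon + 1
    # Skip spaces and horizontal tabs between the colon and the field value.
    while value_start < line_end and message[value_start] in " \t":
        value_start += 1
    return pattern, value_start, line_end - value_start
```

The returned offset and length identify the field value within the stored message, so the value itself never needs to be copied.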
When the LEF FSM 107 identifies a complete unique pattern, it writes the pattern to the Header/Method register 110 and calls the memory controller 105 (step 209). The memory controller will first call lexical element lookup memory 112 to find the lexical element's correct alphabetical order number (Table 1) using the lexical element's unique pattern as address (stored in the Header/Method register). The lookup memory returns a number from 0 to 63. Bit positions 0 to 9 are used for methods. Bit positions 10 to 53 are used for SIP headers. Bit positions 54 to 62 are currently unused, whilst bit position 63 indicates an unrecognized method/header. The returned bit position is entered into a "hit Scoreboard" or SIP summary by the memory controller, i.e. a 64 bit word in which a "1" indicates the presence of a SIP method/header. The Scoreboard is maintained in an "internal" register of the memory controller.
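The lookup and Scoreboard update can be sketched as follows. This is an illustrative model only: the bit positions in LOOKUP are placeholder values chosen to fall within the method range (0 to 9) and header range (10 to 53), not the actual allocation of Table 1, and it is assumed that Scoreboard bit position n corresponds to bit n of a Python integer counted from the least significant bit. The function names are hypothetical.

```python
UNRECOGNISED_BIT = 63  # bit position 63 indicates an unrecognized method/header

# Hypothetical excerpt of the lexical element lookup memory. The numbers are
# placeholders for illustration, NOT the real positions from Table 1.
LOOKUP = {"invite": 3, "route": 40, "to": 48}


def lookup_bit(pattern: str) -> int:
    """Map a unique (lower-cased) pattern to its Scoreboard bit position."""
    return LOOKUP.get(pattern, UNRECOGNISED_BIT)


def record_hit(scoreboard: int, bit: int):
    """Set the bit in the 64 bit Scoreboard.

    Returns (new_scoreboard, already_seen), where already_seen tells the
    caller that this lexical element occurred earlier in the same message.
    """
    already_seen = bool((scoreboard >> bit) & 1)
    return scoreboard | (1 << bit), already_seen
```

The already_seen flag corresponds to the check that decides whether a new occurrence starts a data structure or extends an existing one.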
The memory controller 105 creates for each identified header and method, a data structure within the data storage memory component as illustrated in Figure 5. The structure comprises 64 bits assigned as follows:
Bits 0-15; field value length
Bits 16-31; field value offset
Bits 32-47; next pointer (to be described)
Bit 48; indicates presence of space or horizontal tab (HT), i.e. a folded line
Bits 49-51; reserved, i.e. currently unassigned
Bits 52-60; header field count
Bits 61-63; error indication.
In the data storage component of the memory sub-block, each lexical element (Table 1) has a predefined start address according to its bit position in the Scoreboard.
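The 64 bit layout listed above can be illustrated by the following packing and unpacking sketch. It assumes that bit 0 is the least significant bit of the word, which the description does not state; the field names and the functions pack and unpack are hypothetical, and the values used in any call are arbitrary.

```python
# (name, first bit, width) for each field of the 64 bit data structure.
# Assumption: bit 0 is the least significant bit of the word.
FIELDS = (
    ("length", 0, 16),        # field value length
    ("offset", 16, 16),       # field value offset
    ("next_pointer", 32, 16), # address of the next entry in the linked list
    ("folded", 48, 1),        # space or HT present, i.e. a folded line
    ("reserved", 49, 3),      # currently unassigned
    ("count", 52, 9),         # header field count
    ("error", 61, 3),         # error indication
)


def pack(**values) -> int:
    """Build a 64 bit word from named field values; missing fields are 0."""
    word = 0
    for name, first, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << first
    return word


def unpack(word: int) -> dict:
    """Split a 64 bit word back into its named fields."""
    return {name: (word >> first) & ((1 << width) - 1)
            for name, first, width in FIELDS}
```

Because every field sits at a fixed position and has a fixed size, a reader of the structure needs only shifts and masks to recover any field.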
It is noted that the LEF FSM 107 performs the analysis of the data field (including determining the start address and length and performing the syntactic analysis) at the same time as the memory controller 105 accesses the Lexical Element lookup memory 112, as the two processes are essentially independent.
During parsing of a message, the current state of the hit Scoreboard is used by the memory controller 105 to determine if a newly identified method/header has already been identified in the current SIP message. If there is more than one hit for a lexical element, these hits are put in a linked list in the appropriate data storage component. The header field count is incremented to indicate the number of remaining entries in the linked list, whilst the next pointer field contains the start address of the next entry. By way of example, in hexadecimal notation an example data structure might be: 0040 0040 0026 007 where
Field Value Length : (16 bits) : 0017
Field Value Offset : (16 bits) : 002b
Next Pointer : (16 bits) : 0040
Space or HTab + Reserved : (4 bits) : 0
Header Field Count + error indication : (12 bits) : 004
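Traversal of such a linked list can be sketched as follows. The sketch rests on several assumptions not fixed by the description: the memory is modelled as a Python dictionary from address to 64 bit word, bit 0 is taken to be the least significant bit, the header field count in the first entry is taken to be the number of further entries, and all addresses and names are hypothetical.

```python
def field(word: int, first: int, width: int) -> int:
    """Extract `width` bits starting at bit `first` (bit 0 assumed to be LSB)."""
    return (word >> first) & ((1 << width) - 1)


def make_entry(length: int, offset: int, next_pointer: int = 0, count: int = 0) -> int:
    """Helper for building example entries in the assumed layout."""
    return length | (offset << 16) | (next_pointer << 32) | (count << 52)


def occurrences(memory: dict, start_address: int) -> list:
    """Return [(offset, length), ...] for all occurrences of one lexical
    element, starting from its predefined start address."""
    entry = memory[start_address]
    remaining = field(entry, 52, 9)  # number of further entries in the list
    result = [(field(entry, 16, 16), field(entry, 0, 16))]
    for _ in range(remaining):
        entry = memory[field(entry, 32, 16)]  # follow the next pointer
        result.append((field(entry, 16, 16), field(entry, 0, 16)))
    return result
```

Bounding the loop by the stored count, rather than by a sentinel pointer value, means the sketch does not need to assume any particular "end of list" address.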
When the last character in the SIP message has been processed by the hardware parser 103, the Main Supervision FSM 106 calls the control unit (step 210). The control unit in turn calls the memory controller 105 which causes the Scoreboard held in the Header/Method register 110 to be written to the memory sub-block. The control unit then calls the SIP agent 102, providing to it the start address of the corresponding memory sub-block. Handling of the SIP message is then passed to the SIP agent.
In the event that a lookup to the Lexical element lookup memory 112 by the memory controller 105 returns an error, i.e. there is no lexical element corresponding to the unique pattern determined by the LEF FSM, the memory controller indicates this by setting bits 61-63 of the data structure (Figure 5). Data structures corresponding to erroneous unique patterns are stored within the data storage memory as a linked list, within that block of memory allocated to bit position 63 (of the hit Scoreboard). If the LEF FSM identifies a syntactical error in a data field associated with a method or header, an error code is written to bits 61-63 of the data structure in Figure 5, via control signals 117.
It is noted that the header field count (bits 52 to 60) may be stored within a separate part of the memory sub-block. More particularly, a field count table may be created for each SIP message, the table having 64 rows, one for each method/header type. Each row contains the number of occurrences of the corresponding method/header. When the SIP agent retrieves a given data structure, either the SIP agent or the memory controller extracts the appropriate entry from the table into bits 52 to 60 of the data structure. For each occurrence retrieved, the header field count is decremented by one.
The data structure proposed here is optimised in view of the SIP message structure. In a SIP message, any header field can be present more than once, and multiple occurrences of the same header field need not be consecutive in the SIP message. It is precisely this structure that requires significant effort when using the conventional software parsers. The handling and data structure proposed here presents in a clearly accessible format both the number of occurrences of a header field (i.e. Lexical Element Count) and the locations within the message of those occurrences. When message handling is passed to the SIP agent, the agent readily knows all the values of a given header without the need for additional processing.
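The way in which a SIP agent might use the stored presence, location and length information can be sketched as follows. This is an illustrative model only: the function name header_values is hypothetical, the locations for each bit position are assumed to have already been read from the data structures into a dictionary, and Scoreboard bit position n is assumed to be bit n of the integer counted from the least significant bit.

```python
def header_values(message: bytes, scoreboard: int, bit: int, locations: dict) -> list:
    """Return every value of one lexical element without re-parsing the message.

    `locations` maps a Scoreboard bit position to a list of (offset, length)
    pairs, one per occurrence, in the order found by the parser.
    """
    if not (scoreboard >> bit) & 1:
        return []  # element absent: nothing to extract
    return [message[off:off + length].decode("ascii")
            for off, length in locations[bit]]
```

Since each value is obtained by slicing the stored message at a known offset, occurrences of the same header that are not consecutive in the message are handled in the same way as consecutive ones.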
It will be appreciated by the person of skill in the art that various modifications may be made to the above described embodiments without departing from the scope of the present invention. In particular, it will be appreciated that SIP is only one example of a text based signaling protocol to which the described parsing approach may be applied. Other examples include SDP and H.248.
Table 1

Claims

1. Apparatus for handling a message conforming to a text-based signaling protocol, the apparatus comprising: a memory; a network interface for receiving a message; a controller for storing the received message in the memory; a hardware parser for retrieving the message from the memory and for parsing the message to identify the presence of predefined lexical elements in the message and, for each identified lexical element, to identify the location and length of any associated data, and for storing presence, location and length information in the memory; and a protocol message processing entity for retrieving said information from the memory, for using the information to extract relevant data from the message stored in the memory, and for processing the message according to the extracted data.
2. Apparatus according to claim 1, wherein said hardware parser is implemented in a Field Programmable Gate Array or an Application Specific Integrated Circuit.
3. Apparatus according to claim 2, wherein said controller is implemented within said Field Programmable Gate Array or Application Specific Integrated Circuit.
4. Apparatus according to claim 3, wherein said network interface is configured to inform said controller upon receipt of a message by the controller.
5. Apparatus according to any one of the preceding claims, wherein said text-based signaling protocol is the Session Initiation Protocol, and said protocol message processing entity is a Session Initiation Protocol agent.
6. Apparatus according to claim 5 and comprising at least one processor and a second memory storing a computer program which, when run on said at least one processor, implements said Session Initiation Protocol agent.
7. Apparatus according to claim 6, the apparatus being configured to perform as one of a:
Session Initiation Protocol Back-to-Back User Agent; Session Initiation Protocol Proxy; Session Initiation Protocol Registrar; Session Initiation Protocol Redirect Server; Session Initiation Protocol User Agent Client; and Session Initiation Protocol User Agent Server
8. Apparatus according to claim 6, the apparatus configured for use with an IP Multimedia Subsystem and being one of a user terminal, a Session Initiation Protocol Application Server, and a Call Session Control Function node.
9. Apparatus according to any one of the preceding claims, the hardware parser being configured to construct, for the received message, a bit sequence in which each bit setting indicates the presence or absence of a given lexical element within the message, and to store the bit sequence in the memory as part of said information.
10. Apparatus according to any one of the preceding claims, the hardware parser being configured, for a first occurrence of a lexical element in the message, to store the corresponding location and length within a data structure at a location in the memory predefined for that lexical element.
11. Apparatus according to claim 10, said hardware parser being configured, for a second occurrence of a lexical element in the message, to store the corresponding location and length within a second data structure in the memory, and to include within the data structure stored for the first occurrence of the lexical element a pointer to the location of the second data structure, and to repeat these steps iteratively for each further occurrence of the lexical element.
12. Apparatus according to claim 11, said hardware parser being configured to identify in the data structure stored for the first occurrence of the lexical element the number of further occurrences of the lexical element within the message.
13. Apparatus according to any one of claims 10 to 12, wherein said data structure is created to contain data within fields at predefined locations and of predefined size.
14. Apparatus according to any one of the preceding claims, wherein said network interface is associated with an IP socket.
15. Apparatus according to any one of the preceding claims and comprising a first recogniser for recognising a control character within the message and a second recogniser for recognising upper case characters within the message and for converting these to lower case characters, the outputs of each recogniser being used by the hardware parser to identify lexical elements.
16. A method of handling a message conforming to a text-based signaling protocol, the method comprising: receiving a message at a network interface of a message handling node; storing the received message in a memory of the node; retrieving the message from the memory into a hardware parser and parsing the message there in order to identify the presence of predefined lexical elements in the message and, for each identified lexical element, to determine the location and length of any data associated with the element; storing the presence, location and length information in the memory; and retrieving said information from the memory, and using a protocol message processing entity to extract relevant data from the message stored in the memory and process the message according to the extracted data.
17. A method according to claim 16, wherein said text-based signaling protocol is the Session Initiation Protocol, and said protocol message processing entity is a Session Initiation Protocol agent.
18. A method according to claim 17 and comprising running a computer program on one or more processors in order to implement said Session Initiation Protocol agent.
19. A method according to any one of claims 16 to 18 and comprising constructing with the hardware parser, for the received message, a bit sequence in which each bit setting indicates the presence or absence of a given lexical element within the message, and storing the bit sequence in the memory as part of said information.
20. A method according to any one of claims 16 to 19 and comprising, for a first occurrence of a lexical element in the message, storing the corresponding location and length within a data structure at a location in the memory predefined for that lexical element.
21. A method according to claim 20 and comprising, within said hardware parser, for a second occurrence of a lexical element in the message, storing the corresponding location and length within a second data structure in the memory, and including within the data structure stored for the first occurrence of the lexical element a pointer to the location of the second data structure, and repeating these steps iteratively for each further occurrence of the lexical element.
22. A method according to claim 21 and comprising including within the data structure stored for the first occurrence of the lexical element, an indication of the number of further occurrences of the lexical element within the message.
23. A method according to any one of claims 20 to 22 and comprising creating said data structure to contain data within fields at predefined locations and of predefined size.
PCT/EP2008/061577 2008-09-02 2008-09-02 Protocol message parsing WO2010025763A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2008/061577 WO2010025763A1 (en) 2008-09-02 2008-09-02 Protocol message parsing

Publications (1)

Publication Number Publication Date
WO2010025763A1 true WO2010025763A1 (en) 2010-03-11

Family

ID=40627452

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/061577 WO2010025763A1 (en) 2008-09-02 2008-09-02 Protocol message parsing

Country Status (1)

Country Link
WO (1) WO2010025763A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112751827A (en) * 2020-12-11 2021-05-04 武汉虹信科技发展有限责任公司 Application method and system of SIP multi-party session in broadband cluster

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08803548

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08803548

Country of ref document: EP

Kind code of ref document: A1