WO2015140592A1 - Repository and recommendation system for computer programs - Google Patents

Publication number
WO2015140592A1
Authority
WO
WIPO (PCT)
Prior art keywords
algorithms
user
concepts
executable
metadata
Prior art date
Application number
PCT/IB2014/001933
Other languages
French (fr)
Inventor
Santa Maiti
Biswanath BARIK
Arindam Pal
Ranjan Dasgupta
Arpan Pal
Anupam BASU
Original Assignee
Tata Consultancy Services Limited
Priority date
Filing date
Publication date
Application filed by Tata Consultancy Services Limited
Publication of WO2015140592A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML

Definitions

  • the present subject matter relates to developing and maintaining repositories and, particularly but not exclusively, to a repository and recommendation system for computer implemented functions.
  • an algorithm may be considered as a finite list of well-defined instructions or steps for performing a specified task.
  • An executable code is a sequence of codes defining the instructions for performing the specified task or implementing a desired function in a computing device, such as a desktop or a laptop.
  • the executable codes are written in a particular programming language to perform the specified task.
  • An algorithm may thus have one or more associated executable codes such that each executable code is based on a different programming environment.
  • One or more executable codes defined for performing different tasks may accordingly be used for developing a software application depending on the software environment where the application is to be deployed.
  • Figure 2 illustrates the CIF recommendation system, in accordance with an embodiment of the present subject matter.
  • Figure 3 illustrates a method for creating and maintaining a CIF repository for recommending a CIF to a user, in accordance with an embodiment of the present subject matter.
  • any block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter.
  • any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
  • the present subject matter relates to methods and systems for creating and maintaining a repository of computer implemented functions (CIF).
  • the CIF may be understood as a collection of functions, such as algorithms and executable codes, that may be implemented on a computing device for performing one or more specified tasks.
  • a large number of CIF, such as algorithms and executable codes in various fields of technology, have been proposed till date for performing diverse categories of tasks.
  • An algorithm may be defined as a finite list of instructions defined for performing a specified task, such as clustering, counting, and time series analysis.
  • one or more algorithms may be defined for performing a same task, with each algorithm performing the task with a different level of complexity.
  • the functionality as intended by the algorithm is generally implemented by executing one or more executable codes.
  • An executable code may be defined as a sequence of codes defining the instructions for performing the specified task in a computing device, such as a computer and a laptop.
  • the executable codes are written in a particular programming language, corresponding to the programming environment in which the algorithm is to be implemented, to perform the specified task. Further, the executable codes are written in high level programming languages.
  • various executable codes may be used by an application program to develop a computer application or software for performing one or more tasks, such that each executable code facilitates functioning and execution of the software for performing the one or more tasks.
  • the same CIF may be used, either in its original form or a slightly modified form, by any user on any computing device for developing one or more software applications.
  • various databases have been developed for storing the CIF. Such databases, however, typically store the CIF for either a particular programming language or a particular domain, such as time series analysis, machine learning, and sensor fusion and data routing.
  • each programming environment, such as C++ and Java™, generally has an associated library for storing various executable codes that may be used by a user working in that programming environment. Use of such executable codes, however, is limited to that particular programming environment, and the codes cannot be used in other programming languages.
  • the user in such case might have to manually parse through library of each programming environment the user is working in.
  • the user may further have to develop the executable code in case the executable code is not present in the library of any particular programming environment. Further, since such libraries do not store the algorithm used for developing the executable codes, the user may have to either develop a new executable code or study the whole executable code in order to make any modifications, thus increasing the time and complexity required for developing a software application.
  • databases storing large number of algorithms for performing varied tasks have been implemented.
  • Such databases typically categorize and store the algorithms based on the problems, such as numerical, geometrical, combinational, and graph based problems being solved by them.
  • Such a categorization allows the users to search for a suitable algorithm based on the problem they wish to solve.
  • Such a categorization may not be suitable for cases where the same algorithm may be used for solving different problems, as either the algorithm will be stored in just one category or duplicate copies of the same algorithm will have to be stored for each category.
  • Storing the algorithm under just one category may affect search results and usefulness of such databases, as a user trying to search the algorithm for solving a different problem may never find the algorithm.
  • various search engines have been provided that facilitate searching of the
  • a general web search through such search engines for any specific task may list a set of several algorithms and associated executable codes available over the web or in the associated databases.
  • such a search may not guarantee fetching up of all possible algorithms as a search query used for the search may not have contained exact name, such as Fibonacci and binary sort of the algorithm being searched.
  • a search query having name used in a particular programming environment may not provide executable codes for a different programming language.
  • as the search engines typically obtain the algorithms from different sources, a large number of algorithms and executable codes may be provided in reply to a search query. The user in such a case may not be able to identify a best suited algorithm for performing a specific task, as performance of an algorithm may depend on various parameters, such as input features, output quality, and time and space requirements.
  • the CIF repository facilitates storage and quick search of CIF, i.e., algorithms and their associated executable codes for varying programming languages, using a semantic based text search.
  • a CIF recommendation system may be implemented for creating and maintaining the CIF repository.
  • the CIF recommendation system may further enable a user to search the algorithms and their associated executable codes in the CIF repository.
  • the CIF recommendation system may be implemented in a variety of computing devices, such as desktop computers, cloud servers, mainframe computers, workstations, multiprocessor systems, laptops, network computers, minicomputers, and servers.
  • the CIF recommendation system facilitates creation of the
  • the domain knowledge is organized in the form of concepts and stored using ontology based representations.
  • Ontology refers to a common vocabulary for people who need to share information within a domain, such as time-series analysis, machine learning, and sensor fusion and data routing.
  • the ontology contains machine-interpretable and human readable definitions of basic concepts of the domain and the relations among these basic concepts.
  • A concept may be understood as a category related to a domain. Developing the ontologies helps in ensuring that all metadata required for searching and execution of the algorithms and the executable codes are available directly.
  • one or more CIF concepts may be defined based on domain knowledge associated with the various domains for which CIF are typically developed.
  • Each CIF concept may be further categorized into one or more CIF sub-concepts based on the domain knowledge. For instance, CIF concepts classification, regression, and clustering may be defined for the domain machine learning.
  • the CIF concept clustering may be further categorized into CIF sub-concepts hierarchical clustering, partition based clustering, and probabilistic clustering. Relationships between the various CIF concepts may be subsequently determined.
  • the CIF concepts, CIF sub-concepts, the relationships, and metadata associated with the CIF concepts and CIF sub-concepts may be subsequently used to develop the CIF ontology.
  • a plurality of algorithms and their associated executable codes may be obtained, for example, from various databases and repositories.
  • the executable codes may be identified such that each executable code corresponding to an algorithm is implementable in a different programming environment than the other executable codes associated with the same algorithm.
  • the algorithms and their associated executable codes may be subsequently associated with a CIF concept and a CIF sub-concept associated with the CIF concept based on executable metadata associated with the algorithms and their associated executable codes.
  • Examples of the executable metadata include, but are not limited to, I/O specification, mandatory-optional arguments, execution environment, algorithm and package dependency, and special features.
  • the algorithms, the associated executable codes, and the corresponding executable metadata may then be stored in the CIF repository.
  • the algorithms are stored in a searchable text format.
  • the CIF repository thus created may then be used for recommending CIF to users based on search queries received from the users.
  • the search query may be initially analyzed to determine search metadata, such as terms defining an application of an algorithm, name of an algorithm, a type of an algorithm, and synonyms/semantically same terms for name of the algorithm and the terms defining the application of the algorithm.
  • the CIF repository may then be searched to identify one or more algorithms from among the plurality of algorithms matching the search query based on the search metadata.
  • a CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order may then be provided to the user based on one or more ranking parameters, such as executable metadata, the search metadata, computational time of the executable codes, and input dataset received from the user.
  • An executable code chosen by the user may then be rendered to the user based on a user selection indicating the user's choice.
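
The ordering of the recommendation list described above can be sketched as follows. This is only an illustrative sketch: the source names the ranking parameters but not a scoring rule, so the rule used here (keyword overlap with the search metadata first, shorter computational time second), the `Candidate` type, and the executables `interp.c` and `spline.java` are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    algorithm: str
    executable: str
    keywords: set[str]        # terms taken from the executable metadata (assumed shape)
    runtime_seconds: float    # measured or estimated computational time


def rank_candidates(candidates, search_terms):
    """Order by keyword overlap (descending), then by runtime (ascending)."""
    terms = {t.lower() for t in search_terms}

    def sort_key(candidate):
        overlap = len(terms & {k.lower() for k in candidate.keywords})
        return (-overlap, candidate.runtime_seconds)

    return sorted(candidates, key=sort_key)


# Hypothetical repository entries; only kmeans.c appears in the source text.
candidates = [
    Candidate("spline interpolation", "spline.java", {"interpolation", "missing data"}, 2.5),
    Candidate("k-means clustering", "kmeans.c", {"clustering"}, 1.2),
    Candidate("linear interpolation", "interp.c", {"interpolation", "missing data"}, 0.8),
]
ranked = rank_candidates(candidates, ["missing data", "interpolation"])
for position, candidate in enumerate(ranked, start=1):
    print(position, candidate.algorithm, candidate.executable)
```

A real system would presumably also weigh the user's input dataset, which this sketch leaves out.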
  • the CIF recommendation system may access external sources to determine semantic terms for the search query. Once determined, the semantic terms may be used to identify the algorithms. Subsequently, the CIF concepts, CIF sub-concepts, and concept metadata associated with the CIF concept may be updated based on the search query and one or more external databases using text processing and information retrieval techniques.
  • the CIF repository may be further modified to add new CIF concepts, CIF sub-concepts, algorithms, and executable codes and to modify the existing CIF concepts and CIF sub-concepts.
  • the present subject matter thus facilitates providing a CIF repository for storing and searching CIF of varying domains and programming environments.
  • the present subject matter further enables structuring the available algorithms and storing the same in a text format, such that the algorithms can be searched by the CIF recommendation system using a semantic based text search. For instance, searching the CIF using the search metadata instead of the natural language search query provided by the user facilitates a quick and efficient search, as the search metadata uses various other terms, such as the synonyms/semantically same terms, apart from the terms mentioned by the user in the search query.
  • storing the CIF based on the CIF concepts and CIF sub-concepts facilitates search and storage of the CIF, as the CIF can now be searched based on the search metadata apart from just the name or the type of problem being solved by the algorithms.
  • regularly updating the CIF repository facilitates improving the search results and the number of CIF, as new CIF concepts and algorithms can be automatically updated by the CIF recommendation system.
  • a user, such as a CIF developer, can easily add a new CIF to the CIF repository, using the CIF recommendation system, by identifying the CIF concept and CIF sub-concept to which the algorithm belongs.
  • the CIF may thus be recommended to users as a part of providing CIF-as-a-service.
  • Figure 1 illustrates a network environment 100 implementing a computer implemented functions (CIF) recommendation system 102 in accordance with an embodiment of the present subject matter.
  • the network environment 100 includes a network 104 for enabling communication between the CIF recommendation system 102 and a plurality of user devices 106-1, 106-2, ..., 106-N.
  • the user devices 106-1, 106-2, ..., 106-N are hereinafter collectively referred to as user devices 106 and individually referred to as a user device 106.
  • the network 104 may be a wireless network, a wired network, or a combination thereof.
  • the network 104 can also be an individual network or a collection of many such individual networks, interconnected with each other and functioning as a single large network, for example, the Internet or an intranet.
  • the network 104 can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and such. Further, the network 104 may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), etc., to communicate with each other.
  • the user devices 106 may be one or more computing devices, such as mainframe computers, workstations, personal computers, desktop computers, minicomputers, servers, multiprocessor systems, laptops, a cellular communicating device, such as a personal digital assistant, a smart phone, and a mobile phone, and the like.
  • the CIF recommendation system 102 may be one or more computing devices, such as a desktop computer, cloud servers, mainframe computers, workstation, a multiprocessor system, a hand-held device, a laptop computer, a network computer, a minicomputer, a server, and the like.
  • the CIF recommendation system 102 may also be implemented as multiple servers to concurrently perform a number of tasks.
  • the CIF recommendation system 102 may work on a web based platform to facilitate collaborations between various stakeholders, for example, concept developers, CIF providers, application developers, and researchers, interacting with the CIF recommendation system 102 through the user devices 106.
  • the CIF recommendation system 102 may work on other similar platforms.
  • the network environment 100 further comprises a CIF repository 108 associated with the CIF recommendation system 102.
  • the CIF repository 108 stores a plurality of CIF, such as a plurality of algorithms and a plurality of executable codes that may be accessed by the CIF recommendation system 102. It will be understood that although the CIF repository 108 has been shown external to the CIF recommendation system 102, the CIF repository 108 may be located within the CIF recommendation system 102.
  • the CIF recommendation system 102 may include an ontology creation module 110 and a CIF registration module 112 for creating and maintaining the CIF repository 108.
  • the CIF repository 108 includes data, such as the CIF and metadata associated with the CIF in the form of CIF ontology.
  • the CIF repository 108 may include algorithms, executable codes, and metadata organized as rules, cases, equations, models and so on, all expressed in terms of the common vocabulary provided by the ontology.
  • Ontology refers to a common vocabulary for people who need to share information within a domain, such as time-series analysis, machine learning, and sensor fusion and data routing.
  • the ontology contains machine-interpretable and human readable definitions of basic concepts of the domain and the relations among these basic concepts.
  • the CIF recommendation system 102 facilitates defining a plurality of CIF concepts for development of the CIF ontology for storing the algorithms and the executable codes in the CIF repository 108.
  • Concept may be understood as a category related to a domain.
  • the CIF related to one domain may thus be associated with a CIF concept within the domain before being stored in the CIF repository 108.
  • This representation of the information in the form of concepts and relationships between them facilitates structured representation of the CIF and its associated information, such as metadata.
  • the CIF concepts may further include CIF sub-concepts, i.e., specific categories, which are also associated with the CIF concepts.
  • one or more CIF concepts may be defined based on domain knowledge associated with the various domains.
  • a domain expert may define CIF concepts related to a domain in the CIF repository 108.
  • the CIF concepts may be defined using domain knowledge obtained by parsing available unstructured data, such as documents, reports, web pages, and discussions on web forums, related to the domain.
  • the domain expert, through the ontology creation module 110, may parse the available unstructured data to obtain the domain knowledge.
  • the ontology creation module 110 may subsequently determine the plurality of CIF concepts into which the plurality of algorithms can be organized.
  • the domain expert may analyze the domain knowledge and provide concept creation inputs to the ontology creation module 110 for determining the CIF concepts.
  • CIF concepts classification, regression, and clustering may be defined for the domain machine learning.
  • the ontology creation module 110 may categorize each CIF concept into one or more CIF sub-concepts based on the domain knowledge. For instance, the domain expert may analyze the domain knowledge and provide concept creation inputs to the ontology creation module 110 for determining the CIF sub-concepts for each CIF concept. For instance, in the previous example the CIF concept clustering may be further categorized into CIF sub-concepts hierarchical clustering, partition based clustering, and probabilistic clustering.
  • the CIF sub-concepts may be determined based on domain metadata associated with each domain.
  • the domain expert, through the ontology creation module 110, may determine the domain metadata associated with each of the domains based on the domain knowledge. Examples of the domain metadata include, but are not limited to, application of the CIF and process used by the CIF.
  • the domain metadata may be used to determine, for example, the number of CIF sub-concepts into which the CIF concepts may be divided.
  • the CIF concept clustering may be categorized into two sub-concepts, linearly separable data clustering and linearly non-separable data clustering, while, based on the aspect 'process', the CIF concept clustering may be categorized into two sub-concepts, flat clustering and hierarchical clustering.
  • the ontology creation module 110 may determine relationships between the CIF concepts for use in developing the CIF ontology comprising the plurality of CIF concepts and the CIF sub-concepts.
  • the CIF ontology thus developed may be stored in the CIF repository 108.
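
The concept and sub-concept hierarchy described above can be pictured as a small tree. The sketch below uses the machine learning example from the text; the `Concept` class and `path_to` helper are hypothetical names introduced for illustration, and a plain in-memory tree is an assumption rather than the ontology representation the system actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A domain, CIF concept, or CIF sub-concept in a simple tree (assumed model)."""
    name: str
    sub_concepts: list["Concept"] = field(default_factory=list)


def path_to(node, name, trail=()):
    """Return the chain of names from the root to `name`, or None if absent."""
    trail = trail + (node.name,)
    if node.name == name:
        return trail
    for child in node.sub_concepts:
        result = path_to(child, name, trail)
        if result is not None:
            return result
    return None


# Domain -> CIF concepts -> CIF sub-concepts, as listed in the text.
machine_learning = Concept("machine learning", [
    Concept("classification"),
    Concept("regression"),
    Concept("clustering", [
        Concept("hierarchical clustering"),
        Concept("partition based clustering"),
        Concept("probabilistic clustering"),
    ]),
])

print(path_to(machine_learning, "probabilistic clustering"))
```

A tree cannot express an algorithm that belongs under several concepts, which is one reason an ontology with explicit relationships is the better fit for the repository.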
  • the CIF recommendation system 102 may subsequently use the CIF concepts and the CIF sub-concepts to categorize CIF, i.e., algorithms and executable codes obtained, for example, from various databases and repositories.
  • a CIF registration module 112 may obtain a plurality of algorithms and their associated executable codes for each domain.
  • the CIF registration module 112 may associate the algorithms and their associated executable codes with a CIF concept and a CIF sub-concept associated with the CIF concept based on executable metadata associated with the algorithms and their associated executable codes.
  • the CIF registration module 112 may subsequently store the algorithms, the associated executable codes, and the corresponding executable metadata in the CIF repository 108.
  • the CIF repository 108 thus created may then be used for recommending CIF to users based on search queries received from the users, through the user device 106.
  • the algorithms are stored in a searchable text format such that the CIF recommendation system 102 may search the algorithm using a semantic based text search.
  • Figure 2 illustrates exemplary components of the CIF recommendation system
  • the CIF recommendation system 102 includes one or more processor(s) 202, I/O interface(s) 204, and memory 206 coupled to the processor 202.
  • the processor(s) 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that manipulate signals based on operational instructions.
  • the processor(s) 202 is configured to fetch and execute computer-readable instructions stored in the memory 206.
  • 'processor(s)' may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software.
  • the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared.
  • the term processor should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage. Other hardware, conventional and/or custom, may also be included.
  • the I/O interface(s) 204 may include a variety of software and hardware interfaces, for example, interfaces for peripheral device(s), such as a keyboard, a mouse, and an external memory.
  • the interfaces 204 may facilitate multiple communications within a wide variety of protocol types including operating system to application communication, inter process communication, etc.
  • the memory 206 can include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
  • the CIF recommendation system 102 may include module(s) 208 and data 210. The modules 208 and the data 210 may be coupled to the processor(s) 202.
  • the modules 208 include routines, programs, objects, components, data structures, etc., which perform particular tasks or implement particular abstract data types.
  • the modules 208 may also be implemented as signal processor(s), state machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions.
  • the modules 208 may be computer-readable instructions which, when executed by a processor/processing unit, perform any of the described functionalities.
  • the machine-readable instructions may be stored on an electronic memory device, hard disk, optical disk or other machine-readable storage medium or non-transitory medium.
  • the computer-readable instructions can also be downloaded to a storage medium via a network connection.
  • the module(s) 208 include the ontology creation module
  • the other module(s) 216 may include programs or coded instructions that supplement applications and functions performed by the CIF recommendation system 102.
  • the data 210 serves as a repository for storing data processed, received, and generated by one or more of the module(s) 208.
  • the data 210 includes, for example, CIF ontology data 218, metadata 220, recommendation data 222, execution data 224, and other data 226.
  • the other data 226 includes data generated as a result of the execution of one or more modules in the other module(s) 216.
  • the CIF recommendation system 102 facilitates creating and maintaining the CIF repository 108 for recommending a CIF to the user of the user device 106.
  • the CIF recommendation system 102 may store the CIF, i.e., the algorithms and the executable codes, in the CIF repository by categorizing the CIF into different CIF concepts and CIF sub-concepts.
  • the CIF concepts and the CIF sub-concepts may be determined by the ontology creation module 110 with help of the domain expert. Further, the CIF concepts and the CIF sub-concepts for each domain may then be used to develop the CIF ontology for organizing the CIF in the CIF repository.
  • the ontology creation module 110 may further save the information related to the CIF ontology, the CIF concepts, the CIF sub-concepts, and the domain metadata in the CIF ontology data 218 for further processing.
  • the CIF registration module 112 may subsequently associate the algorithms and their associated executable codes with a CIF concept and a CIF sub-concept associated with the CIF concept.
  • the CIF registration module 112 may obtain the plurality of algorithms and their associated executable codes for each domain from varying sources.
  • the CIF registration module 112 may initially identify the different algorithms and subsequently identify, for each algorithm, the associated executable codes from among the plurality of executable codes.
  • the executable codes may be identified such that each executable code corresponding to an algorithm is implementable in a different programming environment than the other executable codes associated with the same algorithm. For instance, an algorithm for k-means clustering may be associated with executable codes 'kmeans.c' and 'kmeans.java' for executing the algorithm in 'C' and 'java' programming environments.
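
One way to picture the rule that each executable code of an algorithm targets a different programming environment is a registry keyed by algorithm and environment. The following is a minimal sketch; the `ExecutableRegistry` class is a hypothetical name, and only the 'kmeans.c' and 'kmeans.java' entries come from the text.

```python
class ExecutableRegistry:
    """Maps algorithm -> {programming environment -> executable code} (assumed model)."""

    def __init__(self):
        self._by_algorithm = {}

    def register(self, algorithm, environment, executable):
        environments = self._by_algorithm.setdefault(algorithm, {})
        if environment in environments:
            # Enforce one executable code per environment for a given algorithm.
            raise ValueError(f"{algorithm!r} already has an executable for {environment!r}")
        environments[environment] = executable

    def lookup(self, algorithm, environment):
        """Return the executable for this environment, or None if none is registered."""
        return self._by_algorithm.get(algorithm, {}).get(environment)


registry = ExecutableRegistry()
registry.register("k-means clustering", "C", "kmeans.c")
registry.register("k-means clustering", "java", "kmeans.java")
print(registry.lookup("k-means clustering", "java"))
```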
  • the CIF registration module 112 may subsequently associate the algorithms and their associated executable codes with a CIF concept and a CIF sub-concept associated with the CIF concept based on the executable metadata associated with the algorithms and their associated executable codes.
  • the CIF registration module 112 may identify the executable metadata by analyzing data, such as inputs received from domain experts and algorithm developers. Examples of the executable metadata include, but are not limited to, I/O specification, mandatory-optional arguments, execution environment, algorithm and package dependency, and special features. I/O specifications may be understood as type, format, and size of data that may be provided as input for execution of the executable code and received as output upon execution of the executable code using the input data.
  • the mandatory-optional arguments may be understood as arguments or parameters related to the executable function. As will be understood, during execution of an executable code, some arguments are mandatory to mention and some are optional. For instance, for an executable code kmeans(), mandatory arguments may be numeric matrix of data (x) and the number of clusters (centers), while optional attributes may be the maximum number of iterations allowed (iter.max) and number of random centers (nstart).
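
The mandatory-optional argument metadata for kmeans() could be recorded and checked roughly as below. This is a sketch under assumptions: the `ArgumentSpec` class is hypothetical, and the default values shown for iter.max and nstart are placeholders chosen for illustration, since the text names the arguments but gives no defaults.

```python
from dataclasses import dataclass, field


@dataclass
class ArgumentSpec:
    """Mandatory-optional argument metadata for one executable code (assumed shape)."""
    mandatory: tuple
    optional: dict = field(default_factory=dict)   # argument name -> default value

    def bind(self, supplied):
        """Validate supplied arguments and fill in defaults for omitted optional ones."""
        missing = [name for name in self.mandatory if name not in supplied]
        if missing:
            raise TypeError(f"missing mandatory arguments: {missing}")
        unknown = [name for name in supplied
                   if name not in self.mandatory and name not in self.optional]
        if unknown:
            raise TypeError(f"unknown arguments: {unknown}")
        return {**self.optional, **supplied}


# Argument names follow the text; the defaults are illustrative assumptions.
kmeans_args = ArgumentSpec(
    mandatory=("x", "centers"),
    optional={"iter.max": 10, "nstart": 1},
)
bound = kmeans_args.bind({"x": [[1.0, 2.0], [3.0, 4.0]], "centers": 2, "iter.max": 50})
print(bound)
```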
  • the algorithm dependency may be defined as usage of an algorithm or executable code by another algorithm or executable. For example, heapsort() algorithm has dependency on buildheap() algorithm.
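
The heapsort() and buildheap() dependency can be made concrete with a textbook in-place heapsort, in which the sort cannot run without first building the heap. The function names are Python-style stand-ins for the names in the text, and the `ALGORITHM_DEPENDENCIES` mapping is a hypothetical way of recording the dependency as executable metadata.

```python
def sift_down(values, root, end):
    """Restore the max-heap property for the subtree at `root`, within values[:end + 1]."""
    while True:
        child = 2 * root + 1
        if child > end:
            return
        if child + 1 <= end and values[child] < values[child + 1]:
            child += 1                      # pick the larger of the two children
        if values[root] >= values[child]:
            return
        values[root], values[child] = values[child], values[root]
        root = child


def build_heap(values):
    """Rearrange the list in place into a max-heap."""
    for start in range(len(values) // 2 - 1, -1, -1):
        sift_down(values, start, len(values) - 1)


def heapsort(values):
    """Sort the list in place in ascending order; depends on build_heap."""
    build_heap(values)
    for end in range(len(values) - 1, 0, -1):
        values[0], values[end] = values[end], values[0]   # move current maximum to the end
        sift_down(values, 0, end - 1)
    return values


# Hypothetical metadata entry recording the dependency named in the text.
ALGORITHM_DEPENDENCIES = {"heapsort": ["buildheap"]}

print(heapsort([5, 2, 9, 1, 5, 6]))
```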
  • the package dependency may be defined as an executable code's dependency on packages for execution.
  • the CIF registration module 112 may subsequently analyze the algorithms, their associated executable codes, and the executable metadata to determine the CIF concept and the CIF sub-concepts with which the algorithms and the executable codes may be associated.
  • the algorithms and the executable codes may be associated with CIF concepts based on similarity between the executable metadata and the domain metadata associated with the CIF concepts.
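
The text does not say how the similarity between executable metadata and domain metadata is computed. As one possible reading, the sketch below treats both as sets of descriptive terms and picks the concept with the highest Jaccard similarity. The measure, the function names, and all the metadata terms are assumptions made for illustration.

```python
def jaccard(first, second):
    """Jaccard similarity of two term collections: |intersection| / |union|."""
    first, second = set(first), set(second)
    if not first and not second:
        return 0.0
    return len(first & second) / len(first | second)


def best_concept(executable_terms, domain_metadata):
    """Return the CIF concept whose domain metadata is most similar to the given terms."""
    return max(domain_metadata,
               key=lambda concept: jaccard(executable_terms, domain_metadata[concept]))


# Hypothetical domain metadata for the three machine learning concepts in the text.
domain_metadata = {
    "clustering": {"unlabelled data", "grouping", "centers", "distance"},
    "classification": {"labelled data", "class", "prediction"},
    "regression": {"labelled data", "continuous", "prediction"},
}
# Hypothetical executable metadata terms for a k-means executable code.
kmeans_terms = {"grouping", "centers", "distance", "numeric matrix"}

print(best_concept(kmeans_terms, domain_metadata))
```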
  • the CIF registration module 112 may further save the executable metadata in the metadata 220.
  • the CIF registration module 112 may subsequently store the algorithms, the associated executable codes, and the corresponding executable metadata in the CIF repository 108.
  • the CIF registration module 112 may store the algorithms in a searchable text format such that the CIF recommendation system 102 may search the algorithm using a semantic based text search.
  • the user of the user device 106 wishing to search for an algorithm may send a search query to the CIF recommendation system 102.
  • the search query may be received by the recommendation module 212 and saved in the recommendation data 222 for further processing.
  • the user may wish develop an application that involves predicting stock index in a particular time instance, and may thus look for an algorithm for predicting the stock index.
  • the user in such a case may assume the predicted index as "missing data" in a distribution of data points available with him and may search for an algorithm handling "missing data' ' .
  • the user device 1 06 may thus transmit the search query having the term "missing data" to the recommendation module 2 1 2.
  • the recommendation module 212 may analyze the search query to determine search metadata.
  • search metadata examples include, but are not limited to, terms defining an application of an algorithm, name of an algorithm, a type of an algorithm, and synonyms/semantically same terms for name of the algorithm and the terms defining the application of the algorithm.
  • the recommendation module 212 may then search the CIF repository 108 to identify one or more algorithms from among the plurality of algorithms matching the search query based on the search metadata.
  • the recommendation module 212 may function as a search engine for searching the available algorithms and executable codes either by exploring the algorithm ontology or by posing a semantic query.
  • the recommendation module 212 may process the search query and the search metadata to generate a standard SPARQL (SPARQL Protocol and RDF Query Language) query.
  • the recommendation module 212 may then perform the algorithm search based on the SPARQL query and the executable metadata associated with the various algorithms and executable codes.
  • the recommendation module 212 may use web mining techniques to determine semantically similar terms for the terms used in the search query. For instance, the recommendation module 212 may access varied online sources to determine the semantically similar terms. The recommendation module 212 may subsequently use the semantically similar terms thus obtained for identifying the algorithms. For instance, in the previous example of the search query having the term "missing data", the recommendation module 212 may mine the web to determine the term "interpolation" as the semantically similar term for the term "missing data".
  • the recommendation module 212 may obtain the executable codes associated with the algorithms from the CIF repository 108.
  • the recommendation module 212 subsequently generates a CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order.
  • the predefined order may be determined based on ranking of the algorithms and the executable codes.
  • the recommendation module 212 in such a case may rank the one or more algorithms and the associated executable codes based on one or more ranking parameters. Examples of the ranking parameters include, but are not limited to, executable metadata, the search metadata, computational time of the executable codes, and input dataset received from the user.
  • the computational time of the executable codes may be determined, for example, based on previous runs of the executable codes by the CIF recommendation system 102.
  • the computational time of the executable codes may further be provided by an algorithm developer at the time of providing the algorithm and the executable codes for storage in the CIF repository 108.
  • the input dataset may be understood as data provided by the user device 106 for execution, for instance, for testing the executable code.
  • the input dataset may be provided either as a part of the search query or at a later stage, for example, upon being requested by the CIF recommendation system 102. Using the input dataset helps in an efficient ranking of the algorithms as the performance of an algorithm may vary depending on the input data and its characteristics, such as format, size, and complexity.
  • the CIF recommendation list having the one or more algorithms and the corresponding executable codes in the predefined order, say, decreasing order of ranking, may be provided by the recommendation module 212 to the user.
  • the CIF recommendation list may be rendered on a display screen (not shown in the figure) of the user device 106 being used by the user.
  • the CIF recommendation list may also include additional information, such as the computational time and the programming environment for the executable codes.
  • the user may subsequently select an algorithm and one of the associated executable codes based on the ranking and the additional information.
  • the user device 106 may subsequently transmit a user selection indicating an algorithm and at least one executable code associated with the algorithm chosen by the user.
  • the execution module 214 may receive the user selection and save the same in the execution data 222 for further processing. [0051] The execution module 214 may subsequently analyze the user selection to identify the algorithm and the executable code chosen by the user and render the same to the user device 106. In one implementation, the execution module 214 may render the algorithm or the executable code by performing at least one of: executing the chosen executable code using the input dataset received from the user for generating an output dataset; providing the chosen algorithm to the user in a text format; and providing the chosen executable code to the user in an executable format.
  • the execution module 214 may perform any of the above activities based on, for example, a user selection indicating the action the user wishes the CIF recommendation system to perform and one or more predefined rules, such as the level of services to be provided to the user. For instance, in case the CIF repository 108 and the CIF recommendation system 102 are implemented for an organization, employees of the organization may have the facility of using both the algorithms and the executable codes in the text format and the executable format, respectively, while other users may only be allowed to use the executable code for processing the input dataset on the CIF recommendation system 102 itself and obtain the output data. [0052] In order to execute the chosen executable code using the input dataset received from the user, the execution module 214 may initially obtain the chosen execution code and the associated execution metadata from the CIF repository 108.
  • the execution module 214 may further obtain the input dataset and analyze the input dataset to verify whether the input dataset satisfies conditions specified in the execution metadata.
  • the execution module 214 may further preprocess the input dataset, for example, by removing noise and handling missing values in the input dataset.
  • the execution module 214 may subsequently set execution environment variables for obtaining the execution environment for executing the executable code. For instance, the execution module 214 may initialize various run-time environment variables, such as path, classpath, and temp, before running the executable codes in order to prevent possible run-time errors.
  • the execution module 214 may then execute the executable code using the input dataset to obtain the output dataset.
  • the execution module 214 may subsequently provide the output dataset to the user device 106.
  • the execution module 214 may perform the execution of the associated executable codes accordingly.
  • the execution module 214 may handle, for example, algorithm stitching issues, such as data type compatibility and changes of the execution environment and its variables.
  • the execution module 214 may perform post processing to convert the output dataset to the desired output format. The output dataset is then provided by the execution module 214 to the user device 106.
  • the CIF registration module 112 may regularly and automatically update the CIF repository 108 based on, for example, the semantically similar terms obtained by the recommendation module 212 during searching.
  • the CIF registration module 112 may use the semantically similar terms to update the executable metadata associated with each algorithm.
  • the CIF registration module 112 may further access various online sources to obtain information on valid concepts and relations that is useful for updating the CIF ontology.
  • as the various online sources may include unstructured data, the CIF registration module 112 may use various text processing, natural language processing (NLP), and information retrieval (IR) techniques to extract valid concepts and relations, or to infer valid relations, from resources having such unstructured data.
  • the CIF registration module 112 may subsequently update the CIF concepts, CIF sub-concepts, and concept metadata associated with the CIF concept based on the search query and one or more external databases using the NLP and IR techniques.
  • Figure 3 illustrates a method 300 for creating and maintaining a computer implemented functions (CIF) repository for recommending a CIF to a user, according to an embodiment of the present subject matter.
  • CIF computer implemented functions
  • the method(s) may be described in the general context of computer executable instructions.
  • computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types.
  • the method may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communications network.
  • computer executable instructions may be located in both local and remote computer storage media, including memory storage devices.
  • some embodiments are also intended to cover program storage devices, for example, digital data storage media, which are machine or computer readable and encode machine-executable or computer-executable programs of instructions, where said instructions perform some or all of the steps of the described method.
  • the program storage devices may be, for example, digital memories, magnetic storage media, such as magnetic disks and magnetic tapes, hard drives, or optically readable digital data storage media.
  • the embodiments are also intended to cover both communication network and communication devices configured to perform said steps of the exemplary methods.
  • CIF ontology comprising a plurality of CIF concepts is developed.
  • the CIF ontology is developed based on domain knowledge obtained from one or more knowledge sources.
  • the plurality of CIF concepts into which a plurality of algorithms, that are to be stored in the CIF repository, can be organized are determined.
  • Each of the plurality of CIF concepts is further categorized into one or more CIF sub-concepts based on the domain knowledge.
  • a relationship between each of the plurality of CIF concepts is determined for use in developing the CIF ontology.
  • a CIF recommendation system, such as the CIF recommendation system 102, may develop the CIF ontology.
  • the algorithms and their associated executable codes are associated with a CIF concept, from among the plurality of CIF concepts, and a CIF sub-concept associated with the CIF concept.
  • the algorithms and their associated executable codes may be associated with the CIF concepts and the CIF sub-concept based on executable metadata corresponding to the algorithms and the associated executable codes.
  • the executable metadata include, but are not limited to, I/O specification, mandatory-optional arguments, execution environment, algorithm and package dependency, and special features.
  • the plurality of algorithms, the associated executable codes, and the corresponding executable metadata may be stored in the CIF repository.
  • the CIF recommendation system 102 may store the plurality of algorithms, the associated executable codes, and the corresponding executable metadata in the CIF repository 108.
  • a search query received from a user may be analyzed to determine search metadata.
  • the CIF recommendation system 102 may receive the search query from a user wishing to search algorithms, for example, for developing applications.
  • the CIF recommendation system 102 may analyze the search query to determine search metadata, such as terms defining an application of an algorithm, name of an algorithm, a type of an algorithm, and synonyms/semantically same terms for name of the algorithm and the terms defining the application of the algorithm.
  • the CIF repository may be searched to identify one or more algorithms from among the plurality of algorithms matching the search query based on the search metadata.
  • a determination is made to ascertain whether any algorithm matching the search query has been found. If no algorithm matching the search query has been found, which is the 'No' path from block 312, the CIF concepts, CIF sub-concepts, and concept metadata associated with the CIF concept are updated based on the algorithm search query and one or more external databases at block 314.
  • a CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order is provided to the user at block 316.
  • the CIF recommendation system 102 may rank the one or more algorithms and the associated executable codes based on ranking parameters. Examples of the ranking parameters include at least one of executable metadata, the search metadata, computational time of the executable codes, and input dataset received from the user.
  • a user selection indicating at least one execution code associated with an algorithm, from among the one or more algorithms, chosen by the user is received.
  • the CIF recommendation system 102 may receive the user selection from a user device 106 of the user.
  • at least one of the executable code and the algorithm is rendered to the user based on the user selection.
  • the CIF recommendation system 102 may render the at least one of the executable code and the algorithm by performing at least one of: executing the chosen executable code using an input dataset received from the user for generating an output dataset; providing the chosen algorithm to the user in a text format; and providing the chosen executable code to the user in an executable format.
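
The three rendering options listed above, together with the input check and preprocessing that precede execution, can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the names `ExecutableCode`, `preprocess`, `render`, the `min_rows` condition, and the action strings are all hypothetical, and dropping `None` entries is just one simple way of handling missing values.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExecutableCode:
    name: str
    # Hypothetical stand-in for a stored executable: any callable on a list.
    run: Callable[[list], list]
    # Hypothetical input condition taken from the execution metadata.
    min_rows: int


def preprocess(dataset: list) -> list:
    """Handle missing values by dropping None entries (one simple strategy)."""
    return [value for value in dataset if value is not None]


def render(code: ExecutableCode, algorithm_text: str, action: str,
           dataset: Optional[List] = None):
    """Render the user's selection in one of the three ways described above."""
    if action == "text":
        return algorithm_text      # chosen algorithm in a text format
    if action == "executable":
        return code                # chosen executable code handed to the user
    if action == "execute":
        # Verify the input dataset against the metadata condition first.
        if dataset is None or len(dataset) < code.min_rows:
            raise ValueError("input dataset does not satisfy the execution metadata")
        # Then preprocess and run to obtain the output dataset.
        return code.run(preprocess(dataset))
    raise ValueError(f"unknown action: {action}")
```

For example, with `sorter = ExecutableCode("heapsort", sorted, 2)`, calling `render(sorter, "heapsort description", "execute", [3, None, 1])` checks the dataset size, drops the missing value, and returns the sorted output dataset.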

Abstract

A CIF recommendation system (102) comprises a processor (202) and a CIF registration module (112) coupled to a processor (202) to store a plurality of algorithms and corresponding executable codes in a CIF repository (108). The algorithms are stored in a searchable text format. The executable codes include one or more executable codes implementable in one or more programming environments. The algorithms are associated with a CIF concept. A recommendation module (212) coupled to the processor (202) analyzes a search query received from a user to determine search metadata and identifies one or more algorithms from among the plurality of algorithms and the corresponding executable codes matching the search query based on the search metadata. The recommendation module (212) provides a CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order to the user based on one or more ranking parameters.

Description

REPOSITORY AND RECOMMENDATION SYSTEM FOR COMPUTER PROGRAMS
FIELD OF INVENTION
[0001] The present subject matter relates to developing and maintaining repositories and, particularly but not exclusively, to a repository and recommendation system for computer implemented functions.
BACKGROUND
[0002] With the onset of automation in almost every walk of life, software applications have replaced tedious human labour in almost all avenues. For example, applications are developed and implemented for carrying out clustering, sorting, data processing, and a plethora of other such purposes. Developing the software applications involves initially developing computer implemented functions, such as algorithms and their executable codes. As will be understood, an algorithm may be considered as a finite list of well-defined instructions or steps for performing a specified task. An executable code, on the other hand, is a sequence of codes defining the instructions for performing the specified task or implementing a desired function in a computing device, such as a desktop or a laptop. The executable codes are written in a particular programming language to perform the specified task. An algorithm may thus have one or more associated executable codes such that each executable code is based on a different programming environment. One or more executable codes defined for performing different tasks may accordingly be used for developing a software application depending on the software environment where the application is to be deployed.
BRIEF DESCRIPTION OF THE FIGURES
[0003] The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the figures to reference like features and components. Some embodiments of system and/or methods in accordance with embodiments of the present subject matter are now described, by way of example only, and with reference to the accompanying figures, in which:
[0004] Figure 1 illustrates a network environment implementing a computer implemented functions (CIF) recommendation system, in accordance with an embodiment of the present subject matter;
[0005] Figure 2 illustrates the CIF recommendation system, in accordance with an embodiment of the present subject matter; and
[0006] Figure 3 illustrates a method for creating and maintaining a CIF repository for recommending a CIF to a user, in accordance with an embodiment of the present subject matter.
[0007] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
DESCRIPTION OF EMBODIMENTS
[0008] The present subject matter relates to methods and systems for creating and maintaining a repository of computer implemented functions (CIF). The CIF may be understood as a collection of functions, such as algorithms and executable codes, that may be implemented on a computing device for performing one or more specified tasks. A large number of CIF, such as algorithms and executable codes in various fields of technology, have been proposed to date for performing diverse categories of tasks. An algorithm may be defined as a finite list of instructions defined for performing a specified task, such as clustering, counting, and time series analysis. As will be understood, one or more algorithms may be defined for performing the same task, with each algorithm performing the task with a different level of complexity.
[0009] The functionality as intended by the algorithm is generally implemented by executing one or more executable codes. An executable code may be defined as a sequence of codes defining the instructions for performing the specified task in a computing device, such as a computer and a laptop. The executable codes are written in a particular programming language, corresponding to the programming environment in which the algorithm is to be implemented, to perform the specified task. Further, the executable codes are written in high level programming languages. As will be understood, various executable codes may be used by an application program to develop a computer application or software for performing one or more tasks, such that each executable code facilitates the functioning and execution of the software for performing the one or more tasks.
[0010] As the CIF are typically not dependent on or tied to a specific computing device, the same CIF may be used, either in their original form or a slightly modified form, by any user on any computing device for developing one or more software applications. In one conventional technique, various databases have been developed for storing the CIF. Such databases, however, typically store the CIF for either a particular programming language or a particular domain, such as time series analysis, machine learning, and sensor fusion and data routing.
[0011] For instance, each programming environment, such as C++ and Java™, generally has an associated library for storing various executable codes that may be used by a user working in that programming environment. Use of such executable codes, however, is limited to that particular programming environment, and the codes cannot be used in other programming languages. The user in such a case might have to manually parse through the library of each programming environment the user is working in. The user may further have to develop the executable code in case the executable code is not present in the library of any particular programming environment. Further, since such libraries do not store the algorithm used for developing the executable codes, the user may have to either develop a new executable code or study the whole executable code in order to make any modifications, thus increasing the time and complexity required for developing a software application.
[0012] In another conventional technique, databases storing a large number of algorithms for performing varied tasks have been implemented. Such databases typically categorize and store the algorithms based on the problems, such as numerical, geometrical, combinational, and graph based problems, being solved by them. Such a categorization allows the users to search for a suitable algorithm based on the problem they wish to solve. However, such a categorization may not be suitable for cases where the same algorithm may be used for solving different problems, as either the algorithm will be stored in just one category or duplicate copies of the same algorithm will have to be stored for each category. Storing the algorithm under just one category may affect search results and usefulness of such databases, as a user trying to search the algorithm for solving a different problem may never find the algorithm.
[0013] Further, various search engines have been provided that facilitate searching of the CIF over the World Wide Web and various databases associated with the search engine. A general web search through such search engines for any specific task may list a set of several algorithms and associated executable codes available over the web or in the associated databases. However, such a search may not guarantee fetching of all possible algorithms, as a search query used for the search may not have contained the exact name, such as Fibonacci and binary sort, of the algorithm being searched. Further, since the same algorithm and the associated executable codes may be referred to using different names in different programming environments, a search query having the name used in a particular programming environment may not provide executable codes for a different programming language. Further, since the search engines typically obtain the algorithms from different sources, a large number of algorithms and executable codes may be provided in reply to a search query. The user in such a case may not be able to identify a best suited algorithm for performing a specific task, as performance of an algorithm may depend on various parameters, such as input features, output quality, and time and space requirement.
[0014] According to an embodiment of the present subject matter, a repository for computer implemented functions (CIF) is described. The CIF repository facilitates storage and quick search of CIF, i.e., algorithms and their associated executable codes for varying programming languages, using a semantic based text search. In said embodiment, a CIF recommendation system may be implemented for creating and maintaining the CIF repository. The CIF recommendation system may further facilitate a user to search the algorithms and their associated executable codes in the CIF repository. The CIF recommendation system may be implemented in a variety of computing devices, such as desktop computers, cloud servers, mainframe computers, workstations, multiprocessor systems, laptops, network computers, minicomputers, and servers.
[0015] In an implementation, the CIF recommendation system facilitates creation of the CIF repository based on domain knowledge of the algorithms and the executable codes. In one implementation, the domain knowledge is organized in the form of concepts and stored using ontology based representations. Ontology, as will be understood, refers to a common vocabulary for people who need to share information within a domain, such as time-series analysis, machine learning, and sensor fusion and data routing. The ontology contains machine-interpretable and human readable definitions of basic concepts of the domain and the relations among these basic concepts. A concept may be understood as a category related to a domain. Developing the ontologies helps in ensuring that all metadata required for searching and execution of the algorithms and the executable codes are available directly.
[0016] In an implementation, one or more CIF concepts may be defined based on domain knowledge associated with the various domains for which CIF are typically developed. Each CIF concept may be further categorized into one or more CIF sub-concepts based on the domain knowledge. For instance, CIF concepts classification, regression, and clustering may be defined for the domain machine learning. The CIF concept clustering may be further categorized into CIF sub-concepts hierarchical clustering, partition based clustering, and probabilistic clustering. Relationships between the various CIF concepts may be subsequently determined. The CIF concepts, CIF sub-concepts, the relationships, and metadata associated with the CIF concepts and CIF sub-concepts may be subsequently used to develop CIF ontology.
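
As a rough illustration of the concept and sub-concept hierarchy described above, the machine learning example can be written as a nested mapping. This is a minimal sketch under stated assumptions: the patent describes an ontology based representation, whereas the plain dictionary, the name `CIF_ONTOLOGY`, and the helper `locate` are hypothetical simplifications used only to show the domain, concept, and sub-concept levels.

```python
from typing import Optional, Tuple

# Domain -> CIF concept -> CIF sub-concepts, using the example from the text.
CIF_ONTOLOGY = {
    "machine learning": {
        "classification": [],
        "regression": [],
        "clustering": [
            "hierarchical clustering",
            "partition based clustering",
            "probabilistic clustering",
        ],
    },
}


def locate(term: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (domain, concept, sub_concept) for a concept or sub-concept name."""
    for domain, concepts in CIF_ONTOLOGY.items():
        for concept, sub_concepts in concepts.items():
            if term == concept:
                return (domain, concept, None)
            if term in sub_concepts:
                return (domain, concept, term)
    return None  # term is not part of the ontology
```

A lookup such as `locate("partition based clustering")` walks the hierarchy and reports the domain and parent concept under which the sub-concept is filed.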
[0017] Further, for each domain, a plurality of algorithms and their associated executable codes may be obtained, for example, from various databases and repositories. The executable codes may be identified such that each executable code corresponding to an algorithm is implementable in a different programming environment than the other executable codes associated with the same algorithm. The algorithms and their associated executable codes may be subsequently associated with a CIF concept and a CIF sub-concept associated with the CIF concept based on executable metadata associated with the algorithms and their associated executable codes. Examples of the executable metadata include, but are not limited to, I/O specification, mandatory-optional arguments, execution environment, algorithm and package dependency, and special features. The algorithms, the associated executable codes, and the corresponding executable metadata may then be stored in the CIF repository. In one implementation, the algorithms are stored in a searchable text format. The CIF repository thus created may then be used for recommending CIF to users based on search queries received from the users.
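
One possible shape for an algorithm record and its executable metadata is sketched below. The class and field names (`ExecutableMetadata`, `Algorithm`, `register`, `missing_arguments`) are hypothetical and chosen only for illustration; the kmeans() arguments are the ones named in this document, and the rule of one executable code per programming environment follows the paragraph above.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ExecutableMetadata:
    environment: str                       # execution environment, e.g. "R"
    mandatory_args: Tuple[str, ...]
    optional_args: Tuple[str, ...] = ()
    algorithm_dependencies: Tuple[str, ...] = ()
    package_dependencies: Tuple[str, ...] = ()


@dataclass
class Algorithm:
    name: str
    concept: str
    sub_concept: str
    description: str                       # stored as searchable text
    executables: Dict[str, ExecutableMetadata] = field(default_factory=dict)

    def register(self, meta: ExecutableMetadata) -> None:
        """Attach an executable code; one per programming environment."""
        if meta.environment in self.executables:
            raise ValueError(f"{self.name} already has code for {meta.environment}")
        self.executables[meta.environment] = meta


def missing_arguments(meta: ExecutableMetadata, supplied: Iterable[str]) -> List[str]:
    """List mandatory arguments that the caller has not supplied."""
    supplied = set(supplied)
    return [arg for arg in meta.mandatory_args if arg not in supplied]


# Example record built from the kmeans() arguments mentioned in this document.
KMEANS = Algorithm("k-means", "clustering", "partition based clustering",
                   "Partition observations into a given number of clusters.")
KMEANS.register(ExecutableMetadata("R", ("x", "centers"), ("iter.max", "nstart")))
```

Keeping the mandatory and optional arguments in the metadata lets the system reject an incomplete call before execution; for instance, `missing_arguments(KMEANS.executables["R"], ["x"])` reports that `centers` is still required.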
[0018] In one implementation, the search query may be initially analyzed to determine search metadata, such as terms defining an application of an algorithm, name of an algorithm, a type of an algorithm, and synonyms/semantically same terms for name of the algorithm and the terms defining the application of the algorithm. The CIF repository may then be searched to identify one or more algorithms from among the plurality of algorithms matching the search query based on the search metadata. A CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order may then be provided to the user based on one or more ranking parameters, such as executable metadata, the search metadata, computational time of the executable codes, and input dataset received from the user. An executable code chosen by the user may then be rendered to the user based on a user selection indicating the user's choice.
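
The search and ranking steps of this paragraph can be sketched as follows. The sketch assumes a simple keyword overlap in place of the semantic search, and it ranks by the number of matched terms and then by computational time, which is only one of many possible uses of the ranking parameters named above. The names `Entry`, `SYNONYMS`, `search_metadata`, and `recommend`, and the sample repository entries, are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Entry:
    algorithm: str
    environment: str
    keywords: FrozenSet[str]       # terms drawn from the executable metadata
    computational_time: float      # seconds, e.g. from previous runs


# Hypothetical table of semantically similar terms ("missing data" example).
SYNONYMS = {"missing data": {"interpolation"}}


def search_metadata(query: str) -> Set[str]:
    """Expand the query with its semantically similar terms."""
    term = query.strip().lower()
    return {term} | SYNONYMS.get(term, set())


def recommend(entries: List[Entry], query: str) -> List[Entry]:
    """Return matching entries, best match first, faster code breaking ties."""
    terms = search_metadata(query)
    scored = [(len(terms & entry.keywords), entry) for entry in entries]
    hits = [(score, entry) for score, entry in scored if score > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1].computational_time))
    return [entry for _, entry in hits]
```

With this scoring, a query for "missing data" also retrieves entries tagged only with "interpolation", while entries that match more search terms, or run faster, appear earlier in the recommendation list.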
[0019] Further, in case no algorithm matching the search query is identified, the CIF recommendation system may access external sources to determine semantic terms for the search query. Once determined, the semantic terms may be used to identify the algorithms. Subsequently, the CIF concepts, CIF sub-concepts, and concept metadata associated with the CIF concept may be updated based on the search query and one or more external databases using text processing and information retrieval techniques. The CIF repository may be further modified to add new CIF concepts, CIF sub-concepts, algorithms, and executable codes and to modify the existing CIF concepts and CIF sub-concepts.
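
The fallback path of this paragraph may be illustrated with the sketch below, in which a failed search triggers a lookup of semantic terms and the repository metadata is then updated with the original query term. The function names and the `external_lookup` callable are hypothetical; a real system would obtain the terms from external sources using text processing and information retrieval techniques rather than from a supplied function.

```python
from typing import Callable, Dict, Iterable, List, Set


def search(repository: Dict[str, Set[str]], terms: Set[str]) -> List[str]:
    """Return names of algorithms whose metadata keywords overlap the terms."""
    return sorted(name for name, keywords in repository.items() if keywords & terms)


def search_with_fallback(repository: Dict[str, Set[str]], query: str,
                         external_lookup: Callable[[str], Iterable[str]]) -> List[str]:
    """Search directly; on no match, retry with externally obtained terms."""
    matches = search(repository, {query})
    if matches:
        return matches
    # No direct match: determine semantic terms from an external source.
    semantic_terms = set(external_lookup(query))
    matches = search(repository, semantic_terms)
    # Update the stored metadata so the same query matches directly next time.
    for name in matches:
        repository[name].add(query)
    return matches
```

After one fallback search for "missing data" that resolves to "interpolation", the matched algorithm carries "missing data" in its own metadata, so a repeated query no longer needs the external lookup.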
[0020] The present subject matter thus facilitates providing a CIF repository for storing and searching CIF of varying domains and programming environments. The present subject matter further enables structuring the available algorithms and storing the same in a text format, such that the algorithms can be searched by the CIF recommendation system using a semantic based text search. For instance, searching the CIF using the search metadata instead of the natural language search query provided by the user facilitates a quick and efficient search as the search metadata uses various other terms, such as the synonyms/semantically same terms, apart from the terms mentioned by the user in the search query. Further, storing the CIF based on the CIF concepts and CIF sub-concepts facilitates search and storage of the CIF as the CIF can now be searched based on the search metadata apart from just the name or type of problem being solved by the algorithms. Further, regularly updating the CIF repository facilitates improving the search results and the number of CIF as new CIF concepts and algorithms can be automatically updated by the CIF recommendation system. Further, since the CIF are stored based on the CIF concepts, a user, such as a CIF developer, can easily add new CIF to the CIF repository, using the CIF recommendation system, by identifying the CIF concept and CIF sub-concept to which the algorithm belongs. The CIF may thus be recommended to users as a part of providing CIF-as-a-service.
[0021] These and other advantages of the present subject matter would be described in greater detail in conjunction with the following figures. While aspects of described systems and methods for designing materials and processing techniques can be implemented in any number of different computing systems, environments, and/or configurations, the embodiments are described in the context of the following exemplary system(s).
[0022] Figure 1 illustrates a network environment 100 implementing a computer implemented functions (CI F) recommendation system 102 in accordance with an embodiment of the present subject matter. In one embodiment, the network environment 100 includes a network 1 04 for enabl ing communication between the CIF recommendation system 102 and a plurality of user devices 106- 1 , 106-2, ... , 106-N. The user devices 106- 1 , 1 06-2, 106-N are hereinafter collectively referred to as user devices 106 and individually referred to as a user device 106.
[0023] The network 104 may be a wireless network, a wired network, or a combination thereof. The network 104 can also be an individual network or a collection of many such individual networks, interconnected with each other and functioning as a single large network, for example, the Internet or an intranet. The network 104 can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and such. Further, the network 104 may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), etc., to communicate with each other.
[0024] Further, the user devices 106 may be one or more computing devices, such as mainframe computers, workstations, personal computers, desktop computers, minicomputers, servers, multiprocessor systems, laptops, a cellular communicating device, such as a personal digital assistant, a smart phone, and a mobile phone, and the like. In one implementation, the CIF recommendation system 102 may be one or more computing devices, such as a desktop computer, cloud servers, mainframe computers, a workstation, a multiprocessor system, a hand-held device, a laptop computer, a network computer, a minicomputer, a server, and the like.
[0025] Alternatively, the CIF recommendation system 102 may also be implemented as multiple servers to concurrently perform a number of tasks. In an example, the CIF recommendation system 102 may work on a web based platform to facilitate collaborations between various stakeholders, for example, concept developers, CIF providers, application developers, and researchers, interacting with the CIF recommendation system 102 through the user devices 106. In other implementations, the CIF recommendation system 102 may work on other similar platforms.
[0026] The network environment 100 further comprises a CIF repository 108 associated with the CIF recommendation system 102. The CIF repository 108 stores a plurality of CIF, such as a plurality of algorithms and a plurality of executable codes that may be accessed by the CIF recommendation system 102. It will be understood that although the CIF repository 108 has been shown external to the CIF recommendation system 102, the CIF repository 108 may also be located within the CIF recommendation system 102. In one embodiment, the CIF recommendation system 102 may include an ontology creation module 110 and a CIF registration module 112 for creating and maintaining the CIF repository 108.
[0027] In one implementation, the CIF repository 108 includes data, such as the CIF and metadata associated with the CIF in the form of CIF ontology. The CIF repository 108 may include algorithms, executable codes, and metadata organized as rules, cases, equations, models and so on, all expressed in terms of the common vocabulary provided by the ontology. Ontology, as will be understood, refers to a common vocabulary for people who need to share information within a domain, such as time-series analysis, machine learning, and sensor fusion and data routing. The ontology contains machine-interpretable and human readable definitions of basic concepts of the domain and the relations among these basic concepts. Further, the CIF recommendation system 102 facilitates defining a plurality of CIF concepts for development of the CIF ontology for storing the algorithms and the executable codes in the CIF repository 108. A concept may be understood as a category related to a domain. The CIF related to one domain may thus be associated with a CIF concept within the domain before being stored in the CIF repository 108. This representation of the information in the form of concepts and relationships between them facilitates structured representation of the CIF and its associated information, such as metadata. In said implementation, the CIF concepts may further include CIF sub-concepts, i.e., specific categories, which are also associated with the CIF concepts.
[0028] In an implementation, one or more CIF concepts may be defined based on domain knowledge associated with the various domains. For instance, a domain expert may define CIF concepts related to a domain in the CIF repository 108. The CIF concepts may be defined using domain knowledge obtained by parsing available unstructured data, such as documents, reports, web pages, and discussions on web forums, related to the domain. In one implementation, the domain expert, through the ontology creation module 110, may parse the available unstructured data to obtain the domain knowledge. The ontology creation module 110 may subsequently determine the plurality of CIF concepts into which the plurality of algorithms can be organized. For instance, the domain expert may analyze the domain knowledge and provide concept creation inputs to the ontology creation module 110 for determining the CIF concepts. In one example, the CIF concepts classification, regression, and clustering may be defined for the domain machine learning.
[0029] Further, the ontology creation module 110 may categorize each CIF concept into one or more CIF sub-concepts based on the domain knowledge. For instance, the domain expert may analyze the domain knowledge and provide concept creation inputs to the ontology creation module 110 for determining the CIF sub-concepts for each CIF concept. For instance, in the previous example the CIF concept clustering may be further categorized into the CIF sub-concepts hierarchical clustering, partition based clustering, and probabilistic clustering.
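The categorization of a domain into CIF concepts and CIF sub-concepts may be illustrated by the following minimal sketch. It is only an illustration under stated assumptions: the class names Concept and Ontology and their methods are hypothetical and are not part of the described system, and the relationships between concepts are omitted. The domain, concept, and sub-concept names are those of the machine learning example above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    """A CIF concept: a category within a domain, with optional sub-concepts."""
    name: str
    sub_concepts: List[str] = field(default_factory=list)


@dataclass
class Ontology:
    """Hypothetical container mapping a domain name to its CIF concepts."""
    domains: Dict[str, List[Concept]] = field(default_factory=dict)

    def add_concept(self, domain: str, concept: Concept) -> None:
        self.domains.setdefault(domain, []).append(concept)

    def sub_concepts_of(self, domain: str, concept_name: str) -> List[str]:
        # Return the sub-concepts of the named concept, or an empty list.
        for concept in self.domains.get(domain, []):
            if concept.name == concept_name:
                return concept.sub_concepts
        return []


ontology = Ontology()
for name in ("classification", "regression"):
    ontology.add_concept("machine learning", Concept(name))
ontology.add_concept(
    "machine learning",
    Concept("clustering", ["hierarchical clustering",
                           "partition based clustering",
                           "probabilistic clustering"]),
)
```

In this sketch a sub-concept is a plain string; an implementation that also stores domain metadata per sub-concept would likely use a richer record.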
[0030] In one implementation, the CIF sub-concepts may be determined based on domain metadata associated with each domain. The domain expert, through the ontology creation module 110, may determine the domain metadata associated with each of the domains based on the domain knowledge. Examples of the domain metadata include, but are not limited to, application of the CIF and process used by the CIF. The domain metadata may be used to determine, for example, the number of CIF sub-concepts into which the CIF concepts may be divided. For instance, depending on the aspect 'application', the CIF concept clustering may be categorized into two sub-concepts, linearly separable data clustering and linearly non-separable data clustering, while based on the aspect 'process', the CIF concept clustering may be categorized into two sub-concepts, flat clustering and hierarchical clustering.
[0031] Further, the ontology creation module 110 may determine the relationships between the CIF concepts for use in developing the CIF ontology comprising the plurality of CIF concepts and the CIF sub-concepts. The CIF ontology thus developed may be stored in the CIF repository 108.
[0032] The CIF recommendation system 102 may subsequently use the CIF concepts and the CIF sub-concepts to categorize CIF, i.e., algorithms and executable codes obtained, for example, from various databases and repositories. In one implementation, a CIF registration module 112 may obtain a plurality of algorithms and their associated executable codes for each domain. The CIF registration module 112 may associate the algorithms and their associated executable codes with a CIF concept and a CIF sub-concept associated with the CIF concept based on executable metadata associated with the algorithms and their associated executable codes. The CIF registration module 112 may subsequently store the algorithms, the associated executable codes, and the corresponding executable metadata in the CIF repository 108. The CIF repository 108 thus created may then be used for recommending CIF to users based on search queries received from the users, through the user device 106. In one implementation, the algorithms are stored in a searchable text format such that the CIF recommendation system 102 may search the algorithm using a semantic based text search.
[0033] Figure 2 illustrates exemplary components of the CIF recommendation system 102 in accordance with an embodiment of the present subject matter. In said embodiment, the CIF recommendation system 102 includes one or more processor(s) 202, I/O interface(s) 204, and memory 206 coupled to the processor 202. The processor(s) 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) 202 is configured to fetch and execute computer-readable instructions stored in the memory 206.
[0034] The functions of the various elements shown in the figure, including any functional blocks labeled as "processor(s)", may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage. Other hardware, conventional and/or custom, may also be included.
[0035] The I/O interface(s) 204 may include a variety of software and hardware interfaces, for example, interfaces for peripheral device(s), such as a keyboard, a mouse, and an external memory. Further, the interfaces 204 may facilitate multiple communications within a wide variety of protocol types including operating system to application communication, inter process communication, etc.
[0036] The memory 206 can include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
[0037] Further, the CIF recommendation system 102 may include module(s) 208 and data 210. The modules 208 and the data 210 may be coupled to the processor(s) 202. The modules 208, amongst other things, include routines, programs, objects, components, data structures, etc., which perform particular tasks or implement particular abstract data types. The modules 208 may also be implemented as signal processor(s), state machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions. In another aspect of the present subject matter, the modules 208 may be computer-readable instructions which, when executed by a processor/processing unit, perform any of the described functionalities. The machine-readable instructions may be stored on an electronic memory device, hard disk, optical disk or other machine-readable storage medium or non-transitory medium. In one implementation, the computer-readable instructions can also be downloaded to a storage medium via a network connection.
[0038] In an implementation, the module(s) 208 include the ontology creation module 110, the CIF registration module 112, a recommendation module 212, an execution module 214, and other module(s) 216. The other module(s) 216 may include programs or coded instructions that supplement applications and functions performed by the CIF recommendation system 102.
[0039] The data 210, amongst other things, serves as a repository for storing data processed, received, and generated by one or more of the module(s) 208. The data 210 includes, for example, CIF ontology data 218, metadata 220, recommendation data 222, execution data 224, and other data 226. The other data 226 includes data generated as a result of the execution of one or more modules in the other module(s) 216.
[0040] As previously discussed, the CIF recommendation system 102 facilitates creating and maintaining the CIF repository 108 for recommending a CIF to the user of the user device 106. The CIF recommendation system 102 may store the CIF, i.e., the algorithms and the executable codes, in the CIF repository 108 by categorizing the CIF into different CIF concepts and CIF sub-concepts. As described with reference to Figure 1, the CIF concepts and the CIF sub-concepts may be determined by the ontology creation module 110 with the help of the domain expert. Further, the CIF concepts and the CIF sub-concepts for each domain may then be used to develop the CIF ontology for organizing the CIF in the CIF repository 108. The ontology creation module 110 may further save the information related to the CIF ontology, the CIF concepts, the CIF sub-concepts, and the domain metadata in the CIF ontology data 218 for further processing.
[0041] The CIF registration module 112 may subsequently associate the algorithms and their associated executable codes with a CIF concept and a CIF sub-concept associated with the CIF concept. As previously described, the CIF registration module 112 may obtain the plurality of algorithms and their associated executable codes for each domain from varying sources. In one implementation, the CIF registration module 112 may initially identify the different algorithms and subsequently identify, for each algorithm, the associated executable codes from among the plurality of executable codes. The executable codes may be identified such that each executable code corresponding to an algorithm is implementable in a different programming environment than the other executable codes associated with the same algorithm. For instance, an algorithm for k-means clustering may be associated with executable codes 'kmeans.c' and 'kmeans.java' for executing the algorithm in the 'C' and 'Java' programming environments, respectively.
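The association of one algorithm with several executable codes, each for a different programming environment, may be sketched as follows. This is an illustration only; the mapping layout and the function name register_executable are assumptions introduced here, while the file names are those of the k-means example above.

```python
from typing import Dict

# algorithm name -> {programming environment -> executable code file}
executables: Dict[str, Dict[str, str]] = {}


def register_executable(algorithm: str, environment: str, filename: str) -> None:
    """Attach at most one executable code per programming environment to an algorithm."""
    by_environment = executables.setdefault(algorithm, {})
    if environment in by_environment:
        raise ValueError(
            f"{algorithm!r} already has an executable code for {environment!r}")
    by_environment[environment] = filename


register_executable("k-means clustering", "C", "kmeans.c")
register_executable("k-means clustering", "Java", "kmeans.java")
```

Rejecting a second executable code for the same environment reflects the condition that each executable code of an algorithm targets a different programming environment.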
[0042] The CIF registration module 112 may subsequently associate the algorithms and their associated executable codes with a CIF concept and a CIF sub-concept associated with the CIF concept based on the executable metadata associated with the algorithms and their associated executable codes. In one implementation, the CIF registration module 112 may identify the executable metadata by analyzing data, such as inputs received from domain experts and algorithm developers. Examples of the executable metadata include, but are not limited to, I/O specification, mandatory-optional arguments, execution environment, algorithm and package dependency, and special features. I/O specifications may be understood as type, format, and size of data that may be provided as input for execution of the executable code and received as output upon execution of the executable code using the input data.
[0043] The mandatory-optional arguments may be understood as arguments or parameters related to the executable function. As will be understood, during execution of an executable code, some arguments are mandatory to mention and some are optional. For instance, for an executable code kmeans(), mandatory arguments may be numeric matrix of data (x) and the number of clusters (centers), while optional attributes may be the maximum number of iterations allowed (iter.max) and number of random centers (nstart). The algorithm dependency may be defined as usage of an algorithm or executable code by another algorithm or executable. For example, heapsort() algorithm has dependency on buildheap() algorithm. The package dependency may be defined as an executable code's dependency on packages for execution.
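The use of mandatory-optional arguments as executable metadata may be sketched as below. The record layout and the function check_arguments are hypothetical and introduced only for illustration; the argument names x, centers, iter.max, and nstart are those given above for kmeans().

```python
from typing import Dict, List

# Hypothetical executable metadata record for kmeans(), using the
# argument names mentioned in the description.
KMEANS_METADATA: Dict[str, List[str]] = {
    "mandatory": ["x", "centers"],
    "optional": ["iter.max", "nstart"],
}


def check_arguments(metadata: Dict[str, List[str]],
                    supplied: Dict[str, object]) -> List[str]:
    """Return a list of problems; an empty list means the call is acceptable."""
    problems = [f"missing mandatory argument: {name}"
                for name in metadata["mandatory"] if name not in supplied]
    known = set(metadata["mandatory"]) | set(metadata["optional"])
    problems += [f"unknown argument: {name}"
                 for name in supplied if name not in known]
    return problems
```

A call supplying only x and nstart would be reported as missing the mandatory argument centers, whereas omitting iter.max or nstart raises no problem.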
[0044] The CIF registration module 112 may subsequently analyze the algorithms, their associated executable codes, and the executable metadata to determine the CIF concept and the CIF sub-concepts with which the algorithms and the executable codes may be associated. In one implementation, the algorithms and the executable codes may be associated with CIF concepts based on similarity between the executable metadata and the domain metadata associated with the CIF concepts. The CIF registration module 112 may further save the executable metadata in the metadata 220.
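The description does not specify how the similarity between the executable metadata and the domain metadata is computed. The following sketch assumes, purely for illustration, that both are reduced to sets of descriptive terms and compared using the Jaccard index; the function names and the example terms are hypothetical.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_concept(executable_terms: Set[str],
                 domain_metadata: Dict[str, Set[str]]) -> str:
    """Pick the CIF concept whose domain metadata is most similar (assumed measure)."""
    return max(domain_metadata,
               key=lambda concept: jaccard(executable_terms, domain_metadata[concept]))


# Illustrative terms only; these are not taken from the description.
domain_metadata = {
    "clustering": {"unsupervised", "grouping", "centroid"},
    "classification": {"supervised", "labels", "prediction"},
}
chosen = best_concept({"unsupervised", "centroid", "numeric matrix"}, domain_metadata)
```

Any other similarity measure over the metadata could be substituted for jaccard without changing the structure of the selection step.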
[0045] The CIF registration module 112 may subsequently store the algorithms, the associated executable codes, and the corresponding executable metadata in the CIF repository 108. In one implementation, the CIF registration module 112 may store the algorithms in a searchable text format such that the CIF recommendation system 102 may search the algorithm using a semantic based text search.
[0046] In one implementation, the user of the user device 106, wishing to search for an algorithm, sends a search query to the CIF recommendation system 102. The search query may be received by the recommendation module 212 and saved in the recommendation data 222 for further processing. For example, the user may wish to develop an application that involves predicting a stock index at a particular time instance, and may thus look for an algorithm for predicting the stock index. The user in such a case may treat the predicted index as "missing data" in a distribution of data points available with the user and may search for an algorithm handling "missing data". The user device 106 may thus transmit the search query having the term "missing data" to the recommendation module 212. Upon receiving the search query, the recommendation module 212 may analyze the search query to determine search metadata. Examples of the search metadata include, but are not limited to, terms defining an application of an algorithm, name of an algorithm, a type of an algorithm, and synonyms/semantically same terms for name of the algorithm and the terms defining the application of the algorithm.
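Determining search metadata from a search query may be sketched as follows. The synonym table and the function search_metadata are assumptions made for illustration; in particular, the term "imputation" is a hypothetical entry, and a practical system would obtain semantically same terms from a richer source than a fixed table.

```python
from typing import Dict, List

# Hypothetical table of semantically same terms.
SYNONYMS: Dict[str, List[str]] = {
    "missing data": ["interpolation", "imputation"],
}


def search_metadata(query: str, synonyms: Dict[str, List[str]] = SYNONYMS) -> List[str]:
    """Return the query terms found in the table together with their synonyms.

    When no known phrase occurs, the normalized query itself is the only term.
    """
    text = " ".join(query.lower().split())  # lower-case and collapse whitespace
    terms: List[str] = []
    for phrase, alternatives in synonyms.items():
        if phrase in text:
            for term in [phrase, *alternatives]:
                if term not in terms:
                    terms.append(term)
    return terms or [text]
```

For the query "algorithm for handling missing data", this sketch yields the user's own term followed by its listed synonyms, which can then be matched against the stored executable metadata.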
[0047] The recommendation module 212 may then search the CIF repository 108 to identify one or more algorithms from among the plurality of algorithms matching the search query based on the search metadata. In one implementation, the recommendation module 212 may function as a search engine for searching the available algorithms and executable codes either by exploring the algorithm ontology or by posing a semantic query. In order to search the CIF based on a semantic query, the recommendation module 212 may process the search query and the search metadata to generate a standard SPARQL (SPARQL Protocol and RDF Query Language) query. The recommendation module 212 may then perform the algorithm search based on the SPARQL query and the executable metadata associated with the various algorithms and executable codes.
[0048] In case the recommendation module 212 fails to identify any algorithm matching the search query, the recommendation module 212 may use web mining techniques to determine semantically similar terms for the terms used in the search query. For instance, the recommendation module 212 may access varied online sources to determine the semantically similar terms. The recommendation module 212 may subsequently use the semantically similar terms thus obtained for identifying the algorithms. For instance, in the previous example of the search query having the term "missing data", the recommendation module 212 may mine the web to determine the term "interpolation" as the semantically similar term for the term "missing data".
[0049] Upon identifying the one or more algorithms matching the search query, the recommendation module 212 may obtain the executable codes associated with the algorithms from the CIF repository 108. The recommendation module 212 subsequently generates a CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order. In one implementation, the predefined order may be determined based on ranking of the algorithms and the executable codes. The recommendation module 212 in such a case may rank the one or more algorithms and the associated executable codes based on one or more ranking parameters. Examples of the ranking parameters include, but are not limited to, executable metadata, the search metadata, computational time of the executable codes, and input dataset received from the user. The computational time of the executable codes may be determined, for example, based on previous runs of the executable codes by the CIF recommendation system 102. The computational time of the executable codes may alternatively be provided by an algorithm developer at the time of providing the algorithm and the executable codes for storage in the CIF repository 108. The input dataset may be understood as data provided by the user device 106 for execution, for instance, for testing the executable code. The input dataset may be provided either as a part of the search query or at a later stage, for example, upon being requested by the CIF recommendation system 102. Using the input dataset helps in an efficient ranking of the algorithms, as the performance of an algorithm may vary depending on the input data and its characteristics, such as format, size, and complexity.
[0050] Once ranked, the CIF recommendation list having the one or more algorithms and the corresponding executable codes in the predefined order, say, decreasing order of ranking, may be provided by the recommendation module 212 to the user. In one implementation, the CIF recommendation list may be rendered on a display screen (not shown in the figure) of the user device 106 being used by the user. Further, the CIF recommendation list may also include additional information, such as the computational time and the programming environment for the executable codes. The user may subsequently select an algorithm and one of the associated executable codes based on the ranking and the additional information. The user device 106 may subsequently transmit a user selection indicating an algorithm and at least one executable code associated with the algorithm chosen by the user. In one implementation, the execution module 214 may receive the user selection and save the same in the execution data 224 for further processing.
[0051] The execution module 214 may subsequently analyze the user selection to identify the algorithm and the executable code chosen by the user and render the same to the user device 106. In one implementation, the execution module 214 may render the algorithm or the executable code by performing at least one of executing the chosen executable code using the input dataset received from the user for generating an output dataset; providing the chosen algorithm to the user in a text format; and providing the chosen executable code to the user in an executable format. The execution module 214 may perform any of the above activities based on, for example, the user selection indicating the action the user wishes the CIF recommendation system 102 to perform and one or more predefined rules, such as the level of services to be provided to the user. For instance, in case the CIF repository 108 and the CIF recommendation system 102 are implemented for an organization, employees of the organization may have the facility of using both the algorithms and the executable codes in the text format and the executable format, respectively, while other users may only be allowed to use the executable code for processing the input dataset on the CIF recommendation system 102 itself and obtain the output data.
[0052] In order to execute the chosen executable code using the input dataset received from the user, the execution module 214 may initially obtain the chosen executable code and the associated executable metadata from the CIF repository 108. The execution module 214 may further obtain the input dataset and analyze the input dataset to verify whether the input dataset satisfies conditions specified in the executable metadata. The execution module 214 may further preprocess the input dataset, for example, by removing noise and handling missing values in the input dataset. The execution module 214 may subsequently set execution environment variables for obtaining the execution environment for executing the executable code. For instance, the execution module 214 may initialize various run time environment variables, such as path, classpath, and temp, before running the executable codes in order to avoid possible run-time errors. The execution module 214 may then execute the executable code using the input dataset to obtain the output dataset. The execution module 214 may subsequently provide the output dataset to the user device 106.
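The sequence of verifying the input dataset, preprocessing it, setting the environment variables, and executing the executable code may be sketched as follows. The function names, the single "columns" condition, and the choice of replacing a missing value with its column mean are assumptions made for illustration; noise removal is omitted.

```python
import os
import subprocess
from typing import Dict, List, Optional

Row = List[Optional[float]]


def validate(rows: List[Row], metadata: Dict[str, int]) -> None:
    """Verify the input dataset against a condition from the executable metadata."""
    for index, row in enumerate(rows):
        if len(row) != metadata["columns"]:
            raise ValueError(
                f"row {index}: expected {metadata['columns']} values, got {len(row)}")


def fill_missing(rows: List[Row]) -> List[List[float]]:
    """Replace each missing value (None) with the mean of its column."""
    means = []
    for column in zip(*rows):
        present = [value for value in column if value is not None]
        means.append(sum(present) / len(present) if present else 0.0)
    return [[means[i] if value is None else value for i, value in enumerate(row)]
            for row in rows]


def build_environment(overrides: Dict[str, str]) -> Dict[str, str]:
    """Copy the current environment and apply run-time variables such as path or temp."""
    environment = dict(os.environ)
    environment.update(overrides)
    return environment


def execute(command: List[str], environment: Dict[str, str], payload: str) -> str:
    """Run the executable code with the prepared environment and return its output."""
    completed = subprocess.run(command, input=payload, env=environment,
                               capture_output=True, text=True, check=True)
    return completed.stdout


rows = [[1.0, None], [3.0, 4.0]]
validate(rows, {"columns": 2})
clean = fill_missing(rows)
```

Here the command passed to execute would be whatever invocation the executable metadata prescribes for the chosen programming environment; the sketch leaves that command to the caller.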
[0053] Further, if the user selects more than one algorithm from either the same or different programming languages/environments for execution such that interdependencies, like an input-output dependency, exist between the associated executable codes, the execution module 214 may perform the execution of the associated executable codes accordingly. For instance, the execution module 214 may handle algorithm stitching issues, such as data type compatibility and change of execution environment and its variables. Further, in case the output dataset is in a format different from the format desired by the user, the execution module 214 may perform post processing to convert the output dataset to the desired output format. The output dataset is then provided by the execution module 214 to the user device 106.
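Algorithm stitching with an input-output dependency may be sketched as a chain of stages in which a format conversion is applied whenever the output format of one stage differs from the input format of the next. The Stage record, the converter registry, and the two example stages are hypothetical and only illustrate the data type compatibility aspect; switching the execution environment between stages is not shown.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Rows = List[List[float]]


def csv_to_rows(text: str) -> Rows:
    return [[float(v) for v in line.split(",")] for line in text.strip().splitlines()]


def rows_to_csv(rows: Rows) -> str:
    return "\n".join(",".join(str(v) for v in row) for row in rows)


# (source format, target format) -> converter; a hypothetical registry.
CONVERTERS: Dict[Tuple[str, str], Callable[[Any], Any]] = {
    ("csv_text", "rows"): csv_to_rows,
    ("rows", "csv_text"): rows_to_csv,
}


@dataclass
class Stage:
    name: str
    input_format: str
    output_format: str
    run: Callable[[Any], Any]


def stitch(stages: List[Stage], data: Any, data_format: str) -> Tuple[Any, str]:
    """Run the stages in order, converting the data format between them when needed."""
    for stage in stages:
        if data_format != stage.input_format:
            convert = CONVERTERS.get((data_format, stage.input_format))
            if convert is None:
                raise TypeError(f"cannot feed {data_format} into {stage.name}")
            data = convert(data)
        data = stage.run(data)
        data_format = stage.output_format
    return data, data_format


double = Stage("double", "rows", "csv_text",
               lambda rows: rows_to_csv([[2 * v for v in row] for row in rows]))
row_sums = Stage("row_sums", "rows", "rows",
                 lambda rows: [[sum(row)] for row in rows])

result, result_format = stitch([double, row_sums], [[1.0, 2.0], [3.0, 4.0]], "rows")
```

The same conversion registry could serve the post processing step, by converting the final output to the format desired by the user.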
[0054] Further, the CIF registration module 112 may regularly and automatically update the CIF repository 108 based on, for example, the semantically similar terms obtained by the recommendation module 212 during searching. In one implementation, the CIF registration module 112 may use the semantically similar terms to update the executable metadata associated with each algorithm. The CIF registration module 112 may further access various online sources to obtain valid concepts and relations useful for updating the CIF ontology. As will be understood, the various online sources may include unstructured data. The CIF registration module 112 may thus use various text processing, natural language processing (NLP), and information retrieval (IR) techniques to extract valid concepts and relations or to infer valid relations from resources having such unstructured data. The CIF registration module 112 may subsequently update the CIF concepts, CIF sub-concepts, and concept metadata associated with the CIF concept based on the search query and one or more external databases using the NLP and IR techniques.
[0055] Figure 3 illustrates a method 300 for creating and maintaining a computer implemented functions (CIF) repository for recommending a CIF to a user, according to an embodiment of the present subject matter. The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method 300 or any alternative methods. Additionally, individual blocks may be deleted from the methods without departing from the spirit and scope of the subject matter described herein. Furthermore, the methods can be implemented in any suitable hardware, software, firmware, or combination thereof.
[0056] The method(s) may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types. The method may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, computer executable instructions may be located in both local and remote computer storage media, including memory storage devices.
[0057] A person skilled in the art will readily recognize that steps of the methods can be performed by programmed computers. Herein, some embodiments are also intended to cover program storage devices, for example, digital data storage media, which are machine or computer readable and encode machine-executable or computer-executable programs of instructions, where said instructions perform some or all of the steps of the described method. The program storage devices may be, for example, digital memories, magnetic storage media, such as magnetic disks and magnetic tapes, hard drives, or optically readable digital data storage media. The embodiments are also intended to cover both communication network and communication devices configured to perform said steps of the exemplary methods.
[0058] At block 302, a CIF ontology comprising a plurality of CIF concepts is developed. In one implementation, the CIF ontology is developed based on domain knowledge obtained from one or more knowledge sources. Initially, the plurality of CIF concepts into which a plurality of algorithms that are to be stored in the CIF repository can be organized is determined. Each of the plurality of CIF concepts is further categorized into one or more CIF sub-concepts based on the domain knowledge. Subsequently, the relationships between the CIF concepts are determined for use in developing the CIF ontology. In one example, a CIF recommendation system, such as the CIF recommendation system 102, may develop the CIF ontology.
[0059] At block 304, the algorithms and their associated executable codes are associated with a CIF concept, from among the plurality of CIF concepts, and a CIF sub-concept associated with the CIF concept. In one implementation, the algorithms and their associated executable codes may be associated with the CIF concepts and the CIF sub-concept based on executable metadata corresponding to the algorithms and the associated executable codes. Examples of the executable metadata include, but are not limited to, I/O specification, mandatory-optional arguments, execution environment, algorithm and package dependency, and special features.
[0060] At block 306, the plurality of algorithms, the associated executable codes, and the corresponding executable metadata may be stored in the CIF repository. In one implementation, the CIF recommendation system 102 may store the plurality of algorithms, the associated executable codes, and the corresponding executable metadata in the CIF repository 108.
[0061] At block 308, a search query received from a user may be analyzed to determine search metadata. In one implementation, the CIF recommendation system 102 may receive the search query from a user wishing to search algorithms, for example, for developing applications. The CIF recommendation system 102 may analyze the search query to determine search metadata, such as terms defining an application of an algorithm, name of an algorithm, a type of an algorithm, and synonyms/semantically same terms for name of the algorithm and the terms defining the application of the algorithm.
[0062] At block 310, the CIF repository may be searched to identify one or more algorithms from among the plurality of algorithms matching the search query based on the search metadata.
[0063] At block 312, a determination is made to ascertain whether any algorithm matching the search query has been found. If no algorithm matching the search query has been found, which is the 'No' path from the block 312, the CIF concepts, CIF sub-concepts, and concept metadata associated with the CIF concept are updated at block 314 based on the search query and one or more external databases.
[0064] In case one or more algorithms matching the search query have been found, which is the 'Yes' path from the block 312, a CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order is provided to the user at block 316. In one implementation, the CIF recommendation system 102 may rank the one or more algorithms and the associated executable codes based on ranking parameters. Examples of the ranking parameters include at least one of executable metadata, the search metadata, computational time of the executable codes, and input dataset received from the user.
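Blocks 310 to 316 may be sketched as a match-and-rank step. The description does not state how the ranking parameters are combined; the sketch assumes, for illustration only, that candidates are ordered by the number of search metadata terms they match and, on ties, by shorter computational time. The Candidate record and the file names other than kmeans.c are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Candidate:
    algorithm: str
    executable: str
    keywords: Set[str]   # terms taken from the executable metadata
    seconds: float       # computational time from previous runs or the developer


def recommend(candidates: List[Candidate], search_terms: Set[str]) -> List[Candidate]:
    """Return matching candidates, best first; an empty list is the 'No' path."""
    matched = [c for c in candidates if c.keywords & search_terms]
    return sorted(matched,
                  key=lambda c: (-len(c.keywords & search_terms), c.seconds))


candidates = [
    Candidate("spline interpolation", "spline.java", {"interpolation", "missing data"}, 2.5),
    Candidate("k-means clustering", "kmeans.c", {"clustering"}, 1.0),
    Candidate("nearest value fill", "fill.c", {"missing data"}, 0.1),
    Candidate("linear interpolation", "interp.c", {"interpolation", "missing data"}, 0.8),
]
ranked = recommend(candidates, {"missing data", "interpolation"})
```

An empty result corresponds to the 'No' path of block 312, at which point the concepts and metadata would be updated before a further search.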
[0065] At block 318, a user selection indicating at least one execution code associated with an algorithm, from among the one or more algorithms, chosen by the user is received. In one implementation, the CIF recommendation system 102 may receive the user selection from a user device 106 of the user.
[0066] At block 320, at least one of the executable code and the algorithm is rendered to the user based on the user selection. In one implementation, the CIF recommendation system 102 may render the at least one of the executable code and the algorithm by performing at least one of executing the chosen executable code using an input dataset received from the user for generating an output dataset; providing the chosen algorithm to the user in a text format; and providing the chosen executable code to the user in an executable format.
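The three rendering options of block 320 may be sketched as a simple dispatch, again purely by way of illustration. The mode names "execute", "text", and "code", the dictionary keys, and the use of a Python callable to stand in for an executable code are assumptions of this example.

```python
def render_selection(entry, mode, input_dataset=None):
    """Render a chosen algorithm or executable code per the user selection.

    entry: dict with hypothetical keys "text" (algorithm in text form),
           "source" (executable code as a string) and "run" (a callable).
    """
    if mode == "execute":
        if input_dataset is None:
            raise ValueError("an input dataset is required for execution")
        # Run the chosen executable code to generate the output dataset.
        return entry["run"](input_dataset)
    if mode == "text":
        return entry["text"]
    if mode == "code":
        return entry["source"]
    raise ValueError(f"unknown rendering mode: {mode!r}")
```

For instance, an entry whose "run" value is Python's built-in sorted could be executed on a user-supplied list, while the "text" and "code" modes simply return the stored strings.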
[0067] Although embodiments for the present subject matter have been described in a language specific to structural features and/or method(s), it is to be understood that the invention is not necessarily limited to the specific features or method(s) described. Rather, the specific features and methods are disclosed as exemplary embodiments of the present subject matter.

Claims

I/We claim:
1. A computer implemented functions (CIF) recommendation system (102) comprising:
a processor (202);
a CIF registration module (112) coupled to the processor (202) to,
store a plurality of algorithms and corresponding executable codes in a computer implemented functions (CIF) repository (108), wherein the plurality of algorithms are stored in a searchable text format, and wherein the corresponding executable codes include one or more executable codes implementable in one or more programming environments, and wherein each of the plurality of algorithms are associated with a CIF concept from among a plurality of CIF concepts; and
a recommendation module (212) coupled to the processor (202) to,
analyze a search query received from a user to determine search metadata;
identify one or more algorithms from among the plurality of algorithms and the corresponding executable codes matching the search query based on the search metadata; and
provide a CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order to the user based on one or more ranking parameters.
2. The CIF recommendation system (102) as claimed in claim 1, wherein the CIF recommendation system (102) further comprises an ontology creation module (110) coupled to the processor (202) to:
determine, based on domain knowledge, the plurality of CIF concepts into which the plurality of algorithms can be organized;
categorize each of the plurality of CIF concepts into one or more CIF sub-concepts based on the domain knowledge; and
determine relationship between each of the plurality of CIF concepts for being used for developing CIF ontology comprising the plurality of CIF concepts and the CIF sub-concepts.
3. The CIF recommendation system (102) as claimed in claim 1, wherein the CIF registration module (112) further associates each of the plurality of algorithms and their associated executable codes with the CIF concept and a CIF sub-concept associated with the CIF concept based on executable metadata associated with the plurality of algorithms and their associated executable codes.
4. The CIF recommendation system (102) as claimed in claim 3, wherein the recommendation module (212) further ranks the one or more algorithms and the associated executable codes based on the ranking parameters, and wherein the ranking parameters include at least one of executable metadata, the search metadata, computational time of the executable codes, and input dataset received from the user.
5. The CIF recommendation system (102) as claimed in claim 1, wherein the CIF recommendation system (102) further comprises an execution module (214) coupled to the processor (202) to:
perform, based on a user selection indicating at least one execution code associated with an algorithm chosen by the user from among the one or more algorithms, at least one of:
executing the chosen executable code using an input dataset received from the user for generating an output dataset;
providing the chosen algorithm to the user in a text format; and
providing the chosen executable code to the user in an executable format.
6. The CIF recommendation system (102) as claimed in claim 1, wherein the CIF registration module (112) further updates the CIF concepts, CIF sub-concepts, and concept metadata associated with the CIF concept based on the search query and one or more external databases using text processing and information retrieval techniques.
7. A method for creating and maintaining a computer implemented functions (CIF) repository (108) for recommending a CIF to a user, the method comprising:
developing a CIF ontology comprising a plurality of CIF concepts based on domain knowledge obtained from one or more knowledge sources;
associating each of a plurality of algorithms and their associated executable codes with a CIF concept, from among the plurality of CIF concepts, and a CIF sub-concept associated with the CIF concept based on executable metadata associated with the plurality of algorithms and their associated executable codes; and
storing the plurality of algorithms, the associated executable codes, and the corresponding executable metadata in the CIF repository (108), wherein the plurality of algorithms are stored in a searchable text format, and wherein the corresponding executable codes include one or more executable codes implementable in one or more programming environments.
8. The method as claimed in claim 7, wherein the developing the CIF ontology further comprises:
determining, based on the domain knowledge, the plurality of CIF concepts into which the plurality of algorithms can be organized;
categorizing each of the plurality of CIF concepts into one or more CIF sub-concepts based on the domain knowledge; and
determining relationship between each of the plurality of CIF concepts for being used for developing the CIF ontology.
9. The method as claimed in claim 7, wherein the method further comprises:
analyzing a search query received from a user to determine search metadata;
identifying one or more algorithms from among the plurality of algorithms matching the search query based on the search metadata; and
providing a CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order to the user based on ranking parameters.
10. The method as claimed in claim 9, wherein the providing the CIF recommendation list further comprises ranking the one or more algorithms and the associated executable codes based on the ranking parameters, and wherein the ranking parameters include at least one of executable metadata, the search metadata, computational time of the executable codes, and input dataset received from the user.
11. The method as claimed in claim 9, wherein the method further comprises:
receiving a user selection indicating at least one execution code associated with an algorithm, from among the one or more algorithms, chosen by the user; and
performing, based on the user selection, at least one of:
executing the chosen executable code using an input dataset received from the user for generating an output dataset;
providing the chosen algorithm to the user in a text format; and
providing the chosen executable code to the user in an executable format.
12. The method as claimed in claim 9, wherein the method further comprises updating the CIF concepts, CIF sub-concepts, and concept metadata associated with the CIF concept based on the search query and one or more external databases using text processing and information retrieval techniques.
13. A method for recommending computer implemented functions (CIF) using a CIF repository (108), the method comprising:
analyzing, by a processor, a search query received from a user to determine search metadata;
identifying one or more algorithms from among a plurality of algorithms and corresponding executable codes matching the search query based on the search metadata, wherein the plurality of algorithms are stored in the CIF repository (108) in a searchable text format, and wherein the corresponding executable codes include one or more executable codes implementable in one or more programming environments, and wherein each of the plurality of algorithms are associated with a CIF concept from among one or more CIF concepts; and
providing a CIF recommendation list having the one or more algorithms and the corresponding executable codes in a predefined order to the user based on ranking parameters.
14. The method as claimed in claim 13, wherein the method further comprises ranking the one or more algorithms and the associated executable codes based on the ranking parameters, and wherein the ranking parameters include at least one of executable metadata, the search metadata, computational time of the executable codes, and input dataset received from the user.
15. The method as claimed in claim 13, wherein the method further comprises receiving a user selection indicating at least one execution code associated with an algorithm, from among the one or more algorithms, chosen by the user.
16. The method as claimed in claim 15, wherein the method further comprises performing, based on the user selection, at least one of:
executing the chosen executable code using an input dataset received from the user for generating an output dataset;
providing the chosen algorithm to the user in a text format; and
providing the chosen executable code to the user in an executable format.
17. The method as claimed in claim 13, wherein the search metadata includes terms defining an application of an algorithm, name of an algorithm, a type of an algorithm, and synonyms/semantically same terms for name of the algorithm and the terms defining the application of the algorithm.
18. The method as claimed in claim 13, wherein the method further comprises updating the CIF concepts, CIF sub-concepts, and concept metadata associated with the CIF concept based on the search query and one or more external databases using text processing and information retrieval techniques.
PCT/IB2014/001933 2014-03-20 2014-09-29 Repository and recommendation system for computer programs WO2015140592A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN918/MUM/2014 2014-03-20
IN918MU2014 IN2014MU00918A (en) 2014-03-20 2014-09-29

Publications (1)

Publication Number Publication Date
WO2015140592A1 true WO2015140592A1 (en) 2015-09-24

Family

ID=54143808

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2014/001933 WO2015140592A1 (en) 2014-03-20 2014-09-29 Repository and recommendation system for computer programs

Country Status (2)

Country Link
IN (1) IN2014MU00918A (en)
WO (1) WO2015140592A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107464164A (en) * 2017-07-27 2017-12-12 宇龙计算机通信科技(深圳)有限公司 Terminal recommends method and relevant device
CN114625901A (en) * 2022-05-13 2022-06-14 南京维数软件股份有限公司 Multi-algorithm integration method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070050343A1 (en) * 2005-08-25 2007-03-01 Infosys Technologies Ltd. Semantic-based query techniques for source code

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107464164A (en) * 2017-07-27 2017-12-12 宇龙计算机通信科技(深圳)有限公司 Terminal recommends method and relevant device
CN114625901A (en) * 2022-05-13 2022-06-14 南京维数软件股份有限公司 Multi-algorithm integration method and device
CN114625901B (en) * 2022-05-13 2022-08-05 南京维数软件股份有限公司 Multi-algorithm integration method and device

Also Published As

Publication number Publication date
IN2014MU00918A (en) 2015-09-25


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14796546

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14796546

Country of ref document: EP

Kind code of ref document: A1