WO2017181837A1 - Rendering program online optimisation method

Info

Publication number: WO2017181837A1
Application number: PCT/CN2017/078973
Authority: WO
Inventors: 王锐; 鲍虎军; 袁亚振
Original assignee: 浙江大学
Priority date: 2016-04-21
Filing date: 2017-03-31
Publication date: 2017-10-26
Also published as: CN105976421B; CN105976421A

Abstract

Disclosed in the present invention is a rendering program online optimisation method, comprising: constructing a simplified program of the original rendering program, and on the basis of the rendering costs and drawing errors, selecting as representative rendering programs a plurality of simplified dependency graphs constructed on the basis of the dependency relationship; in the rendering process, monitoring parameters of a scene to be drawn, and when a dramatic change occurs, using a new round of online optimisation, an optimisation being completed through a number of cycles, and the following operations being implemented in the cycles in each optimisation: on the basis of the simplified dependency graphs, selecting from all the representative simplified programs K candidate simplified programs; on the basis of the drawing errors and rendering costs, determining the results of the current cycle; and on the basis of the results of a plurality of cycles, deciding whether the current online optimisation is complete. The optimal simplified rendering program is dynamically selected and decoupling of the rendering program and the scene is achieved, avoiding the problem of parameter space enumeration in off-line optimisation; rendering program error calculation and time measurement and scene drawing are simultaneously implemented, being rapid and practical.

Description

An online optimization method for rendering programs

Technical field

The invention relates to the field of real-time rendering, in particular to an online optimization method for a rendering program, which is used to optimize a rendering program online during the drawing process.

Background technique

In the field of real-time rendering, there is a constant need to speed up drawing, and the complexity of the rendering program is an important factor in rendering efficiency. A rendering program is composed of multiple shaders, including vertex, hull, domain, geometry, pixel, and compute shaders. Each of these shaders corresponds to a stage of the programmable hardware rendering pipeline. During rendering, each shader in the rendering program is invoked in its own pipeline stage, and the number of times each shader is executed varies with that stage. In practice, a simpler rendering program runs faster, but simplifying programs by rewriting them by hand is very inefficient, so methods for automatically simplifying the original rendering program keep emerging.

Pellacini proposed a user-configurable shader simplification method for per-pixel procedural shaders. The method generates a series of progressively simplified shaders from the original shader. It applies specified simplification rules to the shader code to generate a set of candidate variants, evaluates the difference between each variant and the original shader under an error metric, and selects the variant with the smallest error. This selection process loops until the final shader becomes a constant. Sitthi-amorn used genetic programming to simplify rendering programs automatically. Similar to Pellacini, the algorithm also computes a series of progressively simplified shaders, but it considers more code transformation rules, including the exchange of operands and operators in expression statements, the deletion of statements, and the insertion of statements. The method uses a genetic algorithm to select simplified shaders and can generate faster and more reliable results. Wang proposed a simplification method that spans multiple shader stages. By treating the execution of the rendering program as a surface signal generation process, the simplification of the rendering program is regarded as a re-fitting of the surface signal, and several kinds of basis functions are provided to express these signals, such as higher-order polynomial functions, meshless sparse radial basis functions, or basis functions obtained from data decomposition.

However, these methods have the following two problems:

1. Long computation time. Each simplified program must be rendered and its error and time measured before a suitable simplified program can be chosen, and with a large number of simplified programs this takes a long time.

2. Offline optimization fixes the rendered model. When measuring time and error, only a limited number of viewpoints can be selected, which cannot cover all situations that may arise at run time.

Summary of the invention

In view of the deficiencies of the prior art, the present invention proposes an online optimization method for a rendering program, which reduces the number of simplified programs generated and completes the selection of simplified programs during drawing, so that the optimization of the rendering program is faster and more accurate.

An online simplification method for a rendering program, comprising the following steps:

(1) The original rendering program is preprocessed as follows:

(1-1) Simplify the original rendering program with different rendering program simplification rules to obtain several simplified programs, and calculate the rendering cost of each simplified rendering program;

(1-2) Determine the dependencies between the original rendering program and each of the simplified programs according to the simplified rules used by each of the simplified programs;

(1-3) calculating a parameter influence vector and a rendering cost of each simplified rendering program, the parameter influence vector being a vector composed of influence values of all input parameters in the corresponding rendering program on the calculation result in the rendering program;

(1-4) According to the rendering cost and parameter influence vector of each simplified program, select several of the simplified rendering programs by clustering as representative simplified programs, and, according to the dependencies, generate a simplified dependency graph from the original rendering program and all the representative simplified programs;

(2) During the rendering process, the program currently drawn to the window is used as a basic program to monitor the parameters of the scene to be drawn, and a new round of online optimization is started when a drastic change occurs, and the following operations are performed in each round of online optimization:

(2-1) Select K candidate simplified programs from all the representative simplified programs according to the simplified dependency graph;

(2-2) In the process of drawing to the window using the basic program, inserting the drawing of the K candidate simplified programs, calculating and storing the corresponding drawing error and rendering cost;

(2-3) For any candidate simplified program:

If E1 < E_max and T1 < T0 are both satisfied, update the basic program and end the current round of optimization;

Otherwise, do not update the basic program and proceed as follows:

If the basic program has not been updated for several consecutive cycles, stop the current round of online optimization;

Otherwise, return to step (2-1) and continue;

Here, E1 and T1 denote the drawing error and rendering cost of the candidate simplified program, respectively, E_max denotes the maximum allowable error (set as needed), and T0 denotes the rendering cost of the current basic program.
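As a minimal sketch of the decision logic in step (2-3), the Python below runs one round of online optimization over candidates that have already been measured. The function name `run_round`, the tuple layout, and the choice of the cheapest acceptable candidate when several qualify are assumptions made for illustration; the text only states the acceptance test and the stopping rule.

```python
from typing import Iterable, List, Tuple

# (program id, drawing error E1, rendering cost T1); hypothetical layout.
Measurement = Tuple[str, float, float]


def run_round(cycles: Iterable[List[Measurement]], base: str, t0: float,
              e_max: float, max_stale: int = 3) -> Tuple[str, float]:
    """Return the basic program (id, cost) at the end of one round.

    `cycles` yields, for each cycle, the measurements of the K candidates
    produced by steps (2-1) and (2-2).
    """
    stale = 0
    for measurements in cycles:
        # Acceptance test of step (2-3): E1 < E_max and T1 < T0.
        accepted = [m for m in measurements if m[1] < e_max and m[2] < t0]
        if accepted:
            # Assumption: when several candidates pass, keep the cheapest.
            best = min(accepted, key=lambda m: m[2])
            return best[0], best[2]  # basic program updated, round ends
        stale += 1
        if stale >= max_stale:  # several consecutive cycles without update
            break
    return base, t0
```

With `e_max=1.2` and a basic cost of 10.0, a first cycle whose only candidate has error 2.0 is rejected, and a second cycle containing candidates with costs 8.0 and 6.0 (both within the error bound) updates the basic program to the one costing 6.0.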

The rendering program in the present invention is source code written in the HLSL language. To be processed by the method, the source code needs to be parsed into a corresponding abstract syntax tree and a program dependency tree. After the simplification rules have been applied, the simplified abstract syntax tree still needs to be converted back into HLSL code for rendering.

Three simplification rules are used to simplify the original rendering program in the present invention: an expression deletion rule, a code movement rule, and a tessellation rule. The code movement and tessellation rules are consistent with the two rules disclosed in the paper "Automatic shader simplification using surface signal approximation", and the expression deletion rule may be consistent with the rule disclosed in the paper "User-configurable automatic shader simplification". Preferably, the expression deletion rule differs slightly from that of "User-configurable automatic shader simplification", as follows:

(a) For a binary expression [expression missing in the source], there are two simplified variants, [expression missing in the source] and [expression missing in the source].

(b) For a loop running from cb to ce, two values i and j are randomly generated and the loop is replaced by one running from cb+i to ce-j.

When simplifying the original rendering program, target code is selected from it (any single statement, several statements, or all statements of the original rendering program), and at least one of the above three simplification rules is applied to the target code. Different combinations of simplification rules can be used for different target code, so a large number of simplified rendering programs can be obtained from one original rendering program.

Each node in the dependency graph represents a simplified rendering program that is linked by directed edges. Because all of the simplified programs are simplified by the original renderer, the original renderer exists as the root node of the entire simplified dependency graph.

Suppose the simplified dependency graph contains node A and node B. If the program of node B is obtained by applying a simplification operation to the simplified program of node A, there is a directed edge from node A to node B. Take the following two statements S1 and S2 as an example: S1: c = a + b; S2: e = d + c. The code movement rule can be applied separately to statements S1 and S2 to generate two nodes V1 and V2. However, moving statement S2 requires moving statement S1 as well, so V2 can be regarded as generated by moving statement S2 on the basis of V1, and there is therefore a directed edge from V1 to V2 in the simplified dependency graph.
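The S1/S2 example can be pictured with a small directed graph. The sketch below is only an illustration: the class name `DependencyGraph` and the node labels are hypothetical, and a real implementation would store a simplified program on each node.

```python
from collections import defaultdict


class DependencyGraph:
    """Directed graph whose root is the original rendering program."""

    def __init__(self, root):
        self.root = root
        self.children = defaultdict(set)  # node -> programs derived from it
        self.parents = defaultdict(set)   # node -> programs it derives from

    def add_edge(self, a, b):
        # Edge A -> B: B is obtained by simplifying A one step further.
        self.children[a].add(b)
        self.parents[b].add(a)


graph = DependencyGraph("original")
graph.add_edge("original", "V1")  # V1: code movement applied to S1
graph.add_edge("V1", "V2")        # V2: moving S2 requires S1 already moved
```

Keeping both `children` and `parents` is a design choice for this sketch: the online search walks outward from the root, while the error prediction described later looks up the parents of a node.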

After the clustering is completed, the dependencies between these selected simplified programs are reconstructed according to the original simplified dependency graph, and a dependency graph is generated in which only the selected representative simplified program is retained.

In the step (1-3) of the present invention, for any simplified program, the parameter influence vector and the rendering cost are calculated by the following methods:

Obtain the scalar instruction counts of the vertex shader and the pixel shader, the number of floating-point numbers required by the interface between the vertex shader and the pixel shader, and the influence value of each input parameter on the corresponding calculation result in the rendering program. The rendering cost of the simplified program is then estimated from the scalar instruction counts.

The number of floating-point numbers required by the vertex shader and pixel shader interface is the number of floating-point numbers that need to be rasterized during the rasterization stage. The scalar instruction counts of the vertex shader and pixel shader and the number of floating-point numbers required by the vertex shader and pixel shader interface are obtained using the method disclosed in the paper "A system for rapid, automatic shader level-of-detail".

For any simplified program, the rendering cost is calculated according to the following formula:

$$C_{total} = W_v C_v + W_f C_f + W_a N_a,$$

where $C_v$ and $C_f$ are the scalar instruction counts of the vertex shader and the pixel shader, respectively; $N_a$ is the number of floating-point numbers required by the pixel shader interface; $W_v$ and $W_f$ are the weights of the vertex shader and pixel shader scalar instruction counts; and $W_a$ is the weight of the number of floating-point numbers required by the pixel shader interface.

In the present invention, $W_v$, $W_f$, and $W_a$ are weights for the respective drawing stages, with value ranges of 0.2 to 2.0, 0.8 to 20, and 10 to 400, respectively. The preferred values of $W_v$, $W_f$, and $W_a$ are 0.2, 10, and 200, respectively.
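The cost formula can be evaluated directly. The sketch below uses the preferred weights from the text as defaults; the function name and the sample instruction counts are made up for illustration.

```python
def static_render_cost(c_v: float, c_f: float, n_a: float,
                       w_v: float = 0.2, w_f: float = 10.0,
                       w_a: float = 200.0) -> float:
    """C_total = W_v * C_v + W_f * C_f + W_a * N_a.

    c_v, c_f: scalar instruction counts of the vertex and pixel shader.
    n_a: floating-point numbers passed through the pixel shader interface.
    Default weights are the preferred values 0.2, 10 and 200.
    """
    return w_v * c_v + w_f * c_f + w_a * n_a


# Hypothetical shader: 50 vertex instructions, 120 pixel instructions,
# 8 interpolated floats: 0.2*50 + 10*120 + 200*8 = 2810.
example_cost = static_render_cost(50, 120, 8)
```

With these weights a single interpolated float costs as much as 20 pixel shader instructions, which is why moving code between stages changes the estimate even when the total instruction count stays the same.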

The influence value $Q_k$ of the k-th input parameter on the calculation result of the rendering program is calculated according to the following formula:

[formula missing in the source]

where N is the total number of simplification rules applied to the k-th input parameter; i indexes the i-th expression or statement associated with the k-th input parameter; a per-rule term (its symbol is missing in the source) defines the effect of the k-th parameter on the i-th expression or statement; and $w_i$ is the scalar quantity of the i-th expression or statement.

Different simplification rules contribute different weights to the parameter. In the present invention, the weight is set to 1 for code removal and to 0.1 for code movement. Because tessellation is always generated on top of code movement, its weight is set to the code movement weight divided by the number of vertices of the subdivided triangle. The parameter influence vector combines the influence values of all parameters into a vector whose dimension equals the number of input variables.

In step (1-4), the representative simplified programs are selected from all the simplified programs by the following method:

(S1-41) Divide all simplified programs into N groups according to their rendering cost;

(S1-42) Cluster each group into M classes by K-means clustering, where the distance function used in clustering is the dot product of the parameter influence vectors of two simplified programs;

(S1-43) For each class, select the simplified program nearest to the class center under the distance function as the representative simplified program of that class.

N ranges from 10 to 100, with a larger N used when there are more simplified programs, and M ranges from 5 to 50.
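The selection steps can be sketched in plain Python as follows. Several details are assumptions, not taken from the text: groups are formed by sorting on cost and cutting into equal-sized buckets, influence vectors are normalised, and the dot product is treated as a similarity (larger means closer), which makes this a spherical variant of K-means. All function names are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Sequence[float]) -> List[float]:
    n = dot(v, v) ** 0.5
    return [x / n for x in v] if n > 0 else list(v)


def group_by_cost(costs: Dict[str, float], n_groups: int) -> List[List[str]]:
    """(S1-41) Sort by cost and cut into n_groups buckets (assumed scheme)."""
    ordered = sorted(costs, key=costs.get)
    size = -(-len(ordered) // n_groups)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def pick_representatives(group: List[str], vectors: Dict[str, Sequence[float]],
                         m: int, iters: int = 10, seed: int = 0) -> List[str]:
    """(S1-42), (S1-43): cluster one group into at most m classes and
    return the member closest to each class center."""
    rng = random.Random(seed)
    unit = {p: normalize(vectors[p]) for p in group}
    centers = [unit[p] for p in rng.sample(group, min(m, len(group)))]
    clusters: List[List[str]] = []
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in group:
            # Assign to the center with the largest dot product.
            best = max(range(len(centers)),
                       key=lambda c: dot(unit[p], centers[c]))
            clusters[best].append(p)
        for c, members in enumerate(clusters):
            if members:
                cols = zip(*(unit[p] for p in members))
                centers[c] = normalize([sum(col) / len(members) for col in cols])
    return [max(members, key=lambda p: dot(unit[p], centers[c]))
            for c, members in enumerate(clusters) if members]
```

Running `pick_representatives` once per bucket returned by `group_by_cost` yields at most N times M representatives, which is the reduction the preprocessing stage relies on.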

In step (2), the scene parameters must be monitored for changes. When a drastic change occurs (such as scene loading, camera steering, or a change of more than 20% in the number of pixels being rendered), a new round of optimization is started; otherwise the current round of optimization proceeds normally. At the same time, to keep the input data consistent during optimization, the scene input parameters need to be cached. The monitoring and caching interval is set to 200 ms by default.
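A hedged sketch of the change test follows. The 20% pixel threshold comes from the text; the `SceneSnapshot` fields, the use of a scene identifier to detect loading, and the cosine threshold for camera steering are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SceneSnapshot:
    """Scene parameters cached at each monitoring interval (hypothetical)."""
    scene_id: int
    camera_dir: Tuple[float, float, float]  # unit view direction
    pixel_count: int                        # pixels covered by the model


def is_drastic_change(prev: SceneSnapshot, cur: SceneSnapshot,
                      pixel_ratio: float = 0.20,
                      min_camera_cos: float = 0.9) -> bool:
    # Scene loading: a different scene is now being drawn.
    if prev.scene_id != cur.scene_id:
        return True
    # Camera steering: assumed threshold on the angle between directions.
    cos = sum(a * b for a, b in zip(prev.camera_dir, cur.camera_dir))
    if cos < min_camera_cos:
        return True
    # Rendered pixel count changed by more than 20%.
    if prev.pixel_count == 0:
        return cur.pixel_count > 0
    change = abs(cur.pixel_count - prev.pixel_count) / prev.pixel_count
    return change > pixel_ratio
```

In a render loop this check would run at the monitoring interval (200 ms by default), comparing the newly cached snapshot against the one taken when the current round started.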

In the first cycle of each round of optimization, the following method is used for searching:

Starting from the original rendering program and following the connectivity of the dependency graph, the K representative rendering programs with the smallest rendering cost within the L-neighborhood of the original rendering program are selected as candidate rendering programs.

The rendering cost used in this search is calculated by the method in step (1-3). For a representative rendering program whose rendering cost has been calculated before, the value can be taken directly from the cache, since the data is cached throughout the optimization process.

When selecting by rendering cost, if several programs are tied (two costs are considered tied if their difference does not exceed 10% to 50% of their mean), the simplified program nearest to the root node is selected as the candidate simplified program. Further, if several tied programs are also the same number of steps away from the original rendering program, K of them are selected at random.

The L-neighborhood consists of the representative rendering programs within L steps of the original rendering program in the dependency graph. The value of L ranges from 2 to 6, and is preferably 4.
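The first-cycle search can be sketched as a breadth-first walk followed by a greedy pick. The tie tolerance of 10% (the lower end of the stated range), the greedy one-at-a-time selection, and all names are assumptions of this sketch.

```python
import random
from collections import deque
from typing import Dict, Iterable, List


def l_neighborhood(children: Dict[str, Iterable[str]], root: str,
                   max_steps: int) -> Dict[str, int]:
    """Nodes within max_steps of the root, mapped to their step count."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == max_steps:
            continue
        for child in children.get(node, ()):
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    del depth[root]
    return depth


def tied(c1: float, c2: float, tol: float = 0.1) -> bool:
    # Two costs are tied if they differ by at most tol times their mean.
    return abs(c1 - c2) <= tol * (c1 + c2) / 2


def initial_candidates(children, root, cost: Dict[str, float], k: int = 4,
                       max_steps: int = 4, tol: float = 0.1,
                       rng: random.Random = None) -> List[str]:
    rng = rng or random.Random(0)
    depth = l_neighborhood(children, root, max_steps)
    remaining = set(depth)
    chosen: List[str] = []
    while remaining and len(chosen) < k:
        cheapest = min(cost[n] for n in remaining)
        pool = [n for n in remaining if tied(cost[n], cheapest, tol)]
        nearest = min(depth[n] for n in pool)       # prefer nodes near root
        pool = sorted(n for n in pool if depth[n] == nearest)
        pick = rng.choice(pool)                     # random among equals
        chosen.append(pick)
        remaining.remove(pick)
    return chosen
```

For a graph where `b` (cost 50, one step from the root) and `c` (cost 48, two steps away) are tied, `b` is picked first because it is nearer to the root, and `c` follows.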

In the first cycle, several representative rendering programs are solved (that is, their rendering cost and drawing error are measured), and the rendering cost and drawing error of the other representative rendering programs can be predicted from these results. Therefore, after the first cycle, the search can be performed by predicting the drawing error and rendering cost.

In all cycles after the first cycle of each round of optimization, the search is performed as follows:

First, the rendering cost and drawing error of the remaining representative rendering programs are predicted from the known representative rendering programs obtained in the first cycle. Then, K representative rendering programs are selected as candidate rendering programs from the prediction results according to the Pareto rule.

The rendering cost $C_{total}$ of a representative rendering program is predicted according to the following formula:

$$C_{total} = N_v C_v t_v + N_p C_p t_p + N_a t_a, \qquad (1)$$

where $N_v$, $N_p$, and $N_a$ are, respectively, the number of vertices, the number of pixels updated by the model (that is, the number of pixels in the drawn area, queried through the DirectX API), and the total number of scalars that need to be read in the rendering pipeline; $C_v$ and $C_p$ are the numbers of instructions executed by the vertex shader and the pixel shader, respectively; and $t_v$, $t_p$, and $t_a$ are the unit times spent by the rendering program in the vertex processing stage, the pixel processing stage, and the rasterization stage, respectively, during the optimization process.

The unit times are fitted according to formula (1) during online optimization. The inputs to the fit are the initial values 1.0, 1.2, and 1200, and several sets of known quantities (all quantities in formula (1) other than $t_v$, $t_p$, and $t_a$), which are obtained from step (2-2) in the previous cycle.
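Because formula (1) is linear in the three unit times, one simple way to fit them is ordinary least squares. The text does not name the fitting procedure, so the normal-equation solve below is an assumed substitute, and the initial values 1.0, 1.2, and 1200 are used only as a fallback when too few measurements exist. The sample tuple layout is hypothetical.

```python
from typing import List, Sequence, Tuple

# (N_v, C_v, N_p, C_p, N_a, measured cost); hypothetical layout.
Sample = Tuple[float, float, float, float, float, float]


def det3(m: Sequence[Sequence[float]]) -> float:
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_unit_times(samples: List[Sample],
                   initial=(1.0, 1.2, 1200.0)) -> Tuple[float, float, float]:
    """Least-squares estimate of (t_v, t_p, t_a) from measured costs."""
    if len(samples) < 3:
        return initial
    rows = [(nv * cv, np_ * cp, na) for nv, cv, np_, cp, na, _ in samples]
    ys = [s[5] for s in samples]
    # Normal equations (A^T A) t = A^T y, solved with Cramer's rule.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    d = det3(ata)
    scale = ata[0][0] * ata[1][1] * ata[2][2]
    if scale == 0 or abs(d) <= 1e-12 * scale:
        return initial  # measurements do not determine all three times
    solution = []
    for col in range(3):
        m = [row[:] for row in ata]
        for i in range(3):
            m[i][col] = aty[i]
        solution.append(det3(m) / d)
    return tuple(solution)


def predict_cost(nv, cv, np_, cp, na, t) -> float:
    """Formula (1): N_v C_v t_v + N_p C_p t_p + N_a t_a."""
    return nv * cv * t[0] + np_ * cp * t[1] + na * t[2]
```

Each cycle adds the measurements from step (2-2) to `samples`, so the fitted unit times, and with them the predicted costs, track the hardware and scene actually being rendered.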

The drawing error $e_j$ of the j-th representative simplified program is predicted by the following formula:

[formula missing in the source]

where $I_j$ denotes the total number of parent nodes of the node corresponding to the j-th representative simplified program in the simplified dependency graph; $q_j$ and $q_{i_k}$ are the parameter influence vectors of the simplified programs corresponding to the j-th node and its $i_k$-th parent node, respectively; and a further term (its symbol is missing in the source) represents the drawing error of the representative rendering program corresponding to the $i_k$-th parent node.

The drawing error of the representative rendering program corresponding to the $i_k$-th parent node is calculated according to the following formula:

[formula missing in the source]

where $e_p$ is the drawing error of the representative rendering program corresponding to a known child node of the $i_k$-th parent node; $q_p$ is the parameter influence vector of the p-th representative simplified program; $q_{i_k}$ is the parameter influence vector of the representative simplified program corresponding to the $i_k$-th parent node of the node of the p-th representative simplified program; and $I_p$ denotes the total number of parent nodes of the node corresponding to the p-th representative simplified program in the simplified dependency graph.

A known node in the present invention is a node whose rendering cost and drawing error have been solved in the first cycle of the online optimization process.

The goal of online rendering program optimization is to select the fastest rendering program within a given quality threshold under any rendering parameters. Changes in rendering parameters change the rendering quality and rendering cost of the different simplified programs, and the rendering parameters change constantly during rendering. Therefore, when the parameters change, the rendering cost and rendering quality of the simplified programs must be recalculated and the optimal program selected again. This dynamic calculation runs synchronously with rendering, which is why it is called online rendering program optimization. However, measuring the rendering cost and rendering quality of a simplified program requires additional GPU execution, which reduces overall rendering efficiency, so the fewer simplified programs that have to be executed, the more efficient the optimization.

In practice, in order to estimate the quality of the simplified programs more accurately, two data-driven models are used to estimate the rendering cost and drawing quality (that is, drawing error), and an iterative optimization scheme is designed to optimize the rendering program online. The scheme monitors and caches the scene and, once a prediction has been made from the monitoring result, obtains real data; the real data in turn become the basis of the next prediction, making it more accurate. The method iterates in this way and makes optimization decisions based on the data. The prediction of rendering quality relies on three inputs: the simplified dependency graph, the drawing errors of the simplified programs that have already been measured, and the parameter influence of each simplified program.

To calculate the parent-node drawing error (its symbol is missing in the source), the simplified programs that have already been measured are used: for a simplified program with a known drawing error $e_p$, all of its parent nodes are found in the simplified dependency graph, and the drawing error is calculated separately for each parent node.

After the rendering cost and quality of the simplified rendering programs have been estimated, K suitable simplified programs must be selected for rendering. Given the optimization goal, these should be Pareto-optimal, so the Pareto frontier of the estimated simplified programs is computed from the estimated rendering cost and quality, and the K fastest programs on the Pareto frontier are selected. K ranges from 2 to 7, and is preferably 4.
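A minimal sketch of this selection, assuming each program has a predicted (cost, error) pair and that lower is better for both. The function names and sample values are hypothetical.

```python
from typing import Dict, List, Tuple


def pareto_front(points: Dict[str, Tuple[float, float]]) -> List[str]:
    """Names of programs not dominated in (predicted cost, predicted error)."""
    front = []
    for name, (c, e) in points.items():
        dominated = any(
            c2 <= c and e2 <= e and (c2 < c or e2 < e)
            for other, (c2, e2) in points.items() if other != name
        )
        if not dominated:
            front.append(name)
    return front


def select_candidates(points: Dict[str, Tuple[float, float]],
                      k: int = 4) -> List[str]:
    """The k fastest (lowest predicted cost) programs on the frontier."""
    return sorted(pareto_front(points), key=lambda n: points[n][0])[:k]


predicted = {"a": (10.0, 0.5), "b": (8.0, 0.9), "c": (12.0, 0.6), "d": (5.0, 2.0)}
# "c" is dominated by "a" (slower and less accurate), so the frontier
# is a, b, d; the two fastest of those are d and b.
fastest_two = select_candidates(predicted, k=2)
```

The quadratic dominance test is adequate here because only a few hundred representative programs remain after clustering.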

In the present invention, the drawing error of each candidate rendering program is calculated using a mipmap, as follows:

For the current candidate rendering program, render with the scene parameters cached when the optimization was started, drawing to a texture of the same size as the drawing window;

Generate the mipmap of this texture and use a compute shader to calculate the per-pixel L2 error between this texture and the output of the original rendering program (that is, the rendering program to be optimized) at a specific mipmap level. The default setting is 5 levels.

Using the mipmap reduces the amount of computation and improves the efficiency of error calculation, and thus speeds up the optimization.
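The idea can be illustrated on the CPU with grayscale images stored as nested lists. This is only a sketch of the arithmetic: the actual method runs on the GPU in a compute shader, and the 2x2 box filter, the unnormalised L2 norm, and reading the default of 5 as a level index are assumptions.

```python
from typing import List

Image = List[List[float]]


def downsample(img: Image) -> Image:
    """One mipmap step: average each 2x2 block (box filter)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def mip_level(img: Image, level: int) -> Image:
    for _ in range(level):
        if len(img) < 2 or len(img[0]) < 2:
            break  # already at the coarsest level
        img = downsample(img)
    return img


def l2_error(reference: Image, candidate: Image, level: int = 5) -> float:
    """L2 difference between two images compared at a coarse mip level."""
    a, b = mip_level(reference, level), mip_level(candidate, level)
    squared = sum((p - q) ** 2
                  for row_a, row_b in zip(a, b)
                  for p, q in zip(row_a, row_b))
    return squared ** 0.5
```

Each level quarters the number of pixels to compare, so comparing at level 5 touches roughly a thousandth of the original pixels, at the price of ignoring high-frequency differences.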

Because the CPU and the GPU are not synchronized, the rendering cost and drawing error measured for each candidate simplified program can only be read back after waiting 3 to 7 frames (the default value is missing in the source). These data are stored on the corresponding simplified nodes and are used to calculate the drawing error estimates and $t_v$, $t_p$, $t_a$.
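The delayed read-back can be modelled with a small queue. The class below is a hypothetical CPU-side sketch: `fetch` stands in for whatever call actually retrieves the GPU timing and error results, and the delay of 3 frames is the lower end of the range given in the text.

```python
from collections import deque
from typing import Callable, Dict, Tuple


class DelayedReadback:
    """Hold GPU measurements until enough frames have passed to read them."""

    def __init__(self, delay_frames: int = 3):
        self.delay = delay_frames
        self.pending = deque()  # (submit frame, program id, fetch callback)

    def submit(self, frame: int, program_id: str,
               fetch: Callable[[], Tuple[float, float]]) -> None:
        self.pending.append((frame, program_id, fetch))

    def collect(self, current_frame: int) -> Dict[str, Tuple[float, float]]:
        """Return {program id: (drawing error, rendering cost)} now readable."""
        ready = {}
        while self.pending and current_frame - self.pending[0][0] >= self.delay:
            _, program_id, fetch = self.pending.popleft()
            ready[program_id] = fetch()
        return ready
```

Calling `collect` once per frame lets the render loop keep drawing with the basic program instead of stalling until the GPU finishes the candidate draws.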

In step (2-3) of the present invention, there are two criteria for deciding whether to update the basic program: 1. whether, among all the simplified programs measured, there is one whose drawing error is within the error range and whose rendering efficiency is higher than that of the current basic program; 2. whether the image of the simplified program differs too much from that of the current basic program. If the basic program cannot be updated for 2 to 5 consecutive cycles, the optimization is stopped; the default in the present invention is 3 cycles.

The optimization method of the invention consists of two parts, a preprocessing stage and an online optimization stage. In the preprocessing stage, the code instructions and the parameter influences of the code are analyzed, estimates of rendering cost and drawing error are obtained, and the simplified programs are clustered, which reduces their redundancy. In addition, a simplified dependency graph is proposed to represent the relationships between the simplified programs, so that it can be used during online optimization to search for and predict simplified programs. In the online optimization stage, one optimization is completed over multiple iterations. Based on the simplified dependency graph, data-driven predictive models are used to predict the rendering cost and drawing error of the simplified programs, thereby reducing the number of simplified programs that need to be drawn and speeding up the optimization.

Compared with the prior art, the beneficial effects of the present invention are as follows. Using parameter influence and performance estimation reduces the number of simplified programs generated and shortens preprocessing. Dynamically selecting the optimal simplified program during drawing decouples the rendering program from the scene, avoids the parameter space enumeration problem of offline optimization, and accelerates the optimization. Furthermore, the rendering cost and rendering quality (that is, drawing error) of the simplified programs are measured during drawing to determine their quality dynamically, so that error calculation and time measurement are performed at the same time as scene drawing, and the real rendering cost and drawing error under the current drawing parameters are obtained, allowing a more accurate selection.

Detailed description

The invention will now be described in detail in connection with specific embodiments.

In this embodiment, before execution, the original rendering program is parsed, and the original pixel shader and the original vertex shader in it are converted into corresponding abstract syntax trees. All subsequent operations are performed on the respective abstract syntax trees. In this embodiment, the original rendering program consists of the original vertex shader and the original pixel shader.

An online optimization method for a rendering program, comprising:

(1) Pre-processing the original rendering program, as follows:

(1-1) Simplify the rendering program according to a plurality of rendering program simplification rules, and generate a large number of simplified rendering programs;

In this embodiment, the three rules of expression deletion, code movement, and tessellation are used to simplify the rendering program, and a total of 75,342 simplified programs are generated.

(1-2) For each simplified rendering program, calculate the dependencies between the simplified programs according to the simplified rules used by each simplified program;

(1-3) Analyze the code of each simplified program, obtain the parameter influence of each input parameter, and estimate the rendering cost of the simplified program;

For any simplified program, the influence of its k-th parameter (that is, input variable) is calculated as follows:

[formula missing in the source]

where N is the total number of simplification rules applied to the k-th parameter; i indexes the i-th simplified expression or statement associated with the k-th parameter; a per-rule term (its symbol is missing in the source) represents the effect defined for the k-th parameter on the i-th expression or statement; and $w_i$ represents the scalar quantity of the simplified variable.

Different simplification rules contribute different weights to the parameter. The weight is set to 1 for code removal and to 0.1 for code movement. Because tessellation is always generated on top of code movement, its weight is set to the code movement weight divided by the number of vertices of the subdivided triangle.

Use the following formula to calculate the rendering cost C _total for any of the simplified renderers:

$$C_{total} = 0.2\,C_v + 10\,C_f + 200\,N_a$$

where $C_v$ and $C_f$ are the numbers of instructions of the vertex shader and the pixel shader, respectively, and $N_a$ is the number of scalars that need to be rasterized in the rasterization stage.

(1-4) According to the estimated time and the parameter influence vector of each simplified program, select representative simplified programs by clustering, and generate a simplified dependency graph according to the dependencies;

Clustering is performed using the K-means method. The distance function is the dot product of the parameter influence vectors of the simplified programs. In this embodiment, the programs are divided into 15 groups, and 20 are selected from each group.

After the preprocessing stage, a simplified dependency graph generated from the original rendering program is obtained, which contains the generated simplified programs and the dependencies between them. In this embodiment, the final simplified dependency graph contains 794 simplified programs, which are the representative rendering programs.

(2) In the rendering process, the program currently drawn to the window is used as a basic program to monitor the parameters of the scene to be drawn, and online optimization is performed when a drastic change occurs. In order to improve the optimization efficiency, in this embodiment, another thread is opened for online optimization.

In the rendering process, the program drawn to the window is called the basic program, and the error value allowed by the optimization is set. Monitor the scene motion, decide whether to enable new optimization, and cache the scene parameters at regular intervals;

In this embodiment, a new round of optimization is started due to the detection of scene import (initialization), and the current scene parameters are cached. At the same time, the error tolerance is set to 1.2.

Each round of online optimization process loops as follows:

(2-1) Select K candidate simplified programs from the simplified dependency graph. Depending on the optimization state, one of two selection strategies is used: 1. the initial search strategy; 2. the predictive search strategy based on drawing error and rendering cost.

The initial search is used in the first cycle of each round of online optimization, as follows:

Starting from the original rendering program and following the connectivity of the dependency graph, the K representative rendering programs with the smallest rendering cost within the L-neighborhood of the original rendering program are selected as candidate rendering programs. In this embodiment, K = 4.

When selecting by rendering cost, if several programs are tied (two costs are considered tied if their difference does not exceed 10% to 50% of their mean), the representative simplified program nearest to the root node is selected as the candidate simplified program. Further, if several tied programs are also the same number of steps away from the original rendering program, K of them are selected at random.

The L-neighborhood consists of the representative rendering programs within L steps of the original rendering program in the dependency graph. In this embodiment, L = 4.

In the first cycle, several representative rendering programs are solved (that is, their rendering cost and drawing error are measured). Based on these results, the rendering cost and drawing error of the other rendering programs can be predicted. Therefore, after the first cycle, the search can be performed by predicting the drawing error and rendering cost, that is, with the predictive search strategy based on drawing error and rendering cost.

In each round of online optimization, the search strategy based on the prediction error of rendering error and rendering cost is used in the loop after the first loop, as follows:

The rendering cost $C_{total}$ of a representative rendering program is predicted according to the following formula:

$$C_{total} = N_v C_v t_v + N_p C_p t_p + N_a t_a \qquad (1)$$

where $N_v$, $N_p$, $N_a$ respectively denote the number of vertices, the number of pixels updated by the model (i.e., the number of pixels in the drawn area, obtained by querying the DirectX API), and the total number of scalars that need to be read in the rendering pipeline;

$C_v$, $C_p$ respectively denote the number of instructions executed by the vertex shader and the pixel shader; and $t_v$, $t_p$, $t_a$ respectively denote the unit time consumed by the rendering program, during the optimization process, in the vertex processing stage, the pixel processing stage, and the rasterization stage.
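As a small illustration of formula (1), the prediction reduces to a sum of three per-stage terms. The function and argument names below are hypothetical and simply mirror the symbols defined above; the sample values in the usage line are made up for illustration and are not taken from the embodiment.

```python
def predict_rendering_cost(n_v, n_p, n_a, c_v, c_p, t_v, t_p, t_a):
    """Evaluate formula (1): C_total = N_v*C_v*t_v + N_p*C_p*t_p + N_a*t_a."""
    vertex_term = n_v * c_v * t_v   # vertices * vertex-shader instructions * unit time
    pixel_term = n_p * c_p * t_p    # updated pixels * pixel-shader instructions * unit time
    read_term = n_a * t_a           # scalars read in the pipeline * unit time
    return vertex_term + pixel_term + read_term


# Illustrative call with made-up values.
estimate = predict_rendering_cost(n_v=1000, n_p=50000, n_a=200000,
                                  c_v=20, c_p=40,
                                  t_v=1e-6, t_p=2e-6, t_a=5e-7)
```

Because the unit times are measured during optimization, the same instruction counts can yield different predicted costs as the scene changes.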

The drawing error $e_j$ of the $j$-th representative simplified program is predicted from the drawing errors of its parent nodes in the simplified dependency graph, using the following quantities:

$I_j$ denotes the total number of parent nodes, in the simplified dependency graph, of the node corresponding to the $j$-th representative simplified program;

$q_j$ and $q_{i_k}$ are the parameter influence vectors of the $j$-th representative simplified program and of the simplified program corresponding to the $i_k$-th parent node, respectively;

and the remaining quantity is the drawing error between the representative rendering program corresponding to the $i_k$-th parent node and the $j$-th representative simplified program.

To calculate this drawing error, the representative simplified programs whose real drawing errors have already been calculated in step (2-2) are used: for a simplified program with a known drawing error $e_p$, its parent nodes $I_p$ are found from the simplified dependency graph, and the drawing error of the representative simplified program corresponding to each parent node is then calculated.

(2-2) While drawing to the window with the basic program, insert the drawing of the selected K candidate simplified programs, and calculate and store the corresponding drawing errors and rendering costs;

While the model is being drawn to the window normally, the drawing of the four selected candidate rendering programs is inserted, and the drawing error and rendering cost of each candidate rendering program are calculated and collected.

The drawing error and rendering cost are calculated as follows:

For each candidate rendering program, the previously cached scene parameters are used to draw to a texture of the same size as the drawing window;

A mipmap of this texture is generated, and a compute shader calculates the pixel error, at a particular mipmap level, between this texture and the rendering result of the original rendering program. The default level setting is 5.
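The idea of comparing two renderings at a coarse mipmap level can be sketched on the CPU as follows. This is only an illustration under assumptions the text does not state: images are grayscale, square, with power-of-two side lengths; each mip level is produced by a 2x2 box filter; and the error metric is the mean absolute difference. The embodiment performs this step on the GPU with a compute shader, and its exact metric is not given here.

```python
def downsample(image):
    """Halve a 2D image by averaging each 2x2 block (one box-filtered mip step)."""
    height, width = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(width)
        ]
        for y in range(height)
    ]


def mip_level(image, level):
    """Return the given mip level; level 0 is the full-resolution image."""
    for _ in range(level):
        image = downsample(image)
    return image


def mip_error(candidate, reference, level):
    """Mean absolute per-pixel difference between two images at one mip level."""
    a = mip_level(candidate, level)
    b = mip_level(reference, level)
    total = sum(abs(pa - pb) for row_a, row_b in zip(a, b)
                for pa, pb in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))
```

Comparing at a coarse level reduces the number of pixels to compare and averages out fine-grained differences, which is one plausible reason for not comparing at full resolution.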

Because CPU and GPU execution are not synchronized, the time and error measured for each simplified program can only be read back after waiting several frames. These data are stored on the corresponding simplified-program nodes and are used to calculate the quantities required for prediction, including $t_v$, $t_p$, $t_a$. The data are usually read back after waiting 3 to 7 frames; in this embodiment, the wait is 5 frames.
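The deferred readback described above can be modeled with a simple queue keyed by frame number. The sketch below is a CPU-side illustration only: the class and method names are hypothetical, and a plain Python value stands in for what would be a GPU query result in a real renderer.

```python
from collections import deque


class DelayedReadback:
    """Hold measurements until a fixed number of frames has elapsed."""

    def __init__(self, delay_frames=5):
        self.delay_frames = delay_frames
        self.pending = deque()   # (frame submitted, program id, measurement)
        self.results = {}        # program id -> measurement, once readable

    def submit(self, frame, program_id, measurement):
        """Record that a measurement was issued on the given frame."""
        self.pending.append((frame, program_id, measurement))

    def collect(self, current_frame):
        """Move every measurement that is old enough into results."""
        ready = []
        while self.pending and current_frame - self.pending[0][0] >= self.delay_frames:
            _, program_id, measurement = self.pending.popleft()
            self.results[program_id] = measurement
            ready.append(program_id)
        return ready
```

Calling `collect` once per frame keeps the render loop from stalling on the GPU: a measurement submitted on frame 0 only becomes visible on frame 5 with the default delay.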

(2-3) For any candidate simplified program:

If E1 < E_max and T1 < T0 are both satisfied, update the basic program and end the current round of online optimization;

Otherwise, do not update the basic program and do the following:

If the basic program has not been updated for L consecutive cycles, stop the current round of online optimization;

Otherwise, return to step (2-1) and continue with the next cycle;

In this embodiment, the four evaluated simplified programs include one that is better than the existing basic program, so the basic program is updated and optimization continues, with steps (2-1) to (2-3) executed cyclically. In the next cycle, the K candidate rendering programs are selected using the prediction strategy based on drawing error and time.
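The acceptance test and stopping rule of step (2-3) can be sketched as a short loop. This is a simplified reading, not the definitive procedure: `select` and `measure` are hypothetical callbacks standing in for steps (2-1) and (2-2), and when several candidates pass the test in the same cycle the sketch picks the cheapest one, which is an assumption the text does not state.

```python
def run_round(base_cost, select, measure, e_max, max_idle_cycles):
    """Run one round of online optimization.

    select(cycle)    -> iterable of candidate program ids    (step 2-1)
    measure(program) -> (drawing_error, rendering_cost)       (step 2-2)
    Returns (program, cost) for the new basic program, or None if no
    candidate was accepted within max_idle_cycles consecutive cycles.
    """
    for cycle in range(max_idle_cycles):
        measured = [(program, *measure(program)) for program in select(cycle)]
        # Step (2-3): accept only if E1 < E_max and T1 < T0.
        accepted = [(cost, program) for program, error, cost in measured
                    if error < e_max and cost < base_cost]
        if accepted:
            cost, program = min(accepted)  # assumed tie-break: cheapest wins
            return program, cost
    return None
```

Because the round ends as soon as the basic program is updated, counting cycles from the start of the round is equivalent to counting consecutive cycles without an update.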

The above description is only a preferred embodiment of the present invention. The scope of protection of the present invention is not limited to the above embodiment; all technical solutions that follow the principles of the present invention fall within its scope of protection. Those skilled in the art will be able to devise a number of modifications and refinements without departing from the principles of the invention.

Claims

An online optimization method for a rendering program, comprising the steps of:

(1) The original rendering program is preprocessed as follows:

(1-1) Simplify the original rendering program with different rendering program simplification rules to obtain several simplified programs, and calculate the rendering cost of each simplified rendering program;

(1-2) Determine the dependencies between the original rendering program and each of the simplified programs according to the simplification rules used by each of the simplified programs;

(1-3) Calculate the parameter influence vector and the rendering cost of each simplified rendering program, the parameter influence vector being a vector composed of the influence values of all input parameters of the corresponding rendering program on the calculation result of that rendering program;

(1-4) According to the rendering cost and parameter influence vector of each simplified program, select several simplified programs by clustering from all the simplified programs as representative simplified programs, and, according to the dependencies, generate a simplified dependency graph of the original rendering program and all the representative simplified programs;

(2) During rendering, take the program currently drawing to the window as the basic program, monitor the parameters of the scene to be drawn, and start a new round of online optimization when a dramatic change occurs; the following operations are performed in each round of online optimization:

(2-1) According to the simplified dependency graph, select K candidate simplified programs from all the representative simplified programs;

(2-2) While drawing to the window with the basic program, insert the drawing of the selected K candidate simplified programs, and calculate and store the corresponding drawing errors and rendering costs;

(2-3) For any candidate simplified program:

If E1 < E_max and T1 < T0 are both satisfied, update the basic program and end the current round of online optimization;

Otherwise, do not update the basic program and do the following:

If the basic program has not been updated for several consecutive cycles, stop the online optimization;

Otherwise, return to step (2-1) and continue;

where E1 and T1 respectively denote the drawing error and rendering cost of the candidate simplified program, E_max denotes the maximum allowable error, and T0 denotes the rendering cost of the current basic program.
The online optimization method for a rendering program according to claim 1, wherein in step (1-3), for any one of the simplified programs, the parameter influence vector and the rendering cost are calculated by the following method:

Obtain the number of scalar instructions of the simplified program in the vertex shader and in the pixel shader, the number of floating-point values required by the interface between the vertex shader and the pixel shader, and the influence value of each input parameter on the corresponding calculation result in the rendering program; then estimate the rendering cost of the simplified program from the scalar instruction counts.
The online optimization method for a rendering program according to claim 1, wherein in step (1-3) the rendering cost of any one of the simplified programs is calculated according to the following formula:

$$C_{total} = W_v C_v + W_f C_f + W_a N_a$$

where $C_v$ and $C_f$ are the numbers of scalar instructions in the vertex shader and the pixel shader, $N_a$ is the number of floating-point values required by the pixel shader interface, $W_v$ and $W_f$ are the weights of the scalar instruction counts in the vertex shader and the pixel shader, and $W_a$ is the weight of the number of floating-point values required by the pixel shader interface.
The online optimization method for a rendering program according to claim 1, wherein in step (1-3) the influence value $q_k$ of the $k$-th input parameter on the calculation result in the rendering program is calculated by a formula defined in terms of the following quantities:

$N$ is the total number of simplification rules applied to the $k$-th input parameter;

$i$ indexes the $i$-th expression or statement associated with the $k$-th input parameter;

the defined influence of the $k$-th input parameter on the $i$-th expression or statement;

and $w_i$, the scalar count of the $i$-th expression or statement.
The online optimization method for a rendering program according to claim 1, wherein step (1-4) selects the representative simplified programs from all of the simplified programs as follows:

(S1-41) Divide all the simplified programs into N groups according to their rendering cost;

(S1-42) Cluster each group into M classes by K-means clustering, the distance function used in clustering being the dot product of the parameter influence vectors of two simplified programs;

(S1-43) For each class, select the simplified program nearest to the class center according to the distance function as the representative simplified program.
The online optimization method for a rendering program according to claim 1, wherein the first cycle of each round of online optimization searches by the following method:

Starting from the original rendering program and following the connectivity of the dependency graph, the K representative rendering programs with the smallest rendering cost within the L-neighborhood of the original rendering program are selected as candidate rendering programs.
The online optimization method for a rendering program according to claim 1, wherein, in each round of online optimization, the cycles after the first cycle search by the following method:

First, the rendering cost and drawing error of the remaining representative rendering programs are predicted from the several known representative rendering programs obtained in the first cycle; then, based on the prediction results, K representative rendering programs are selected as candidate rendering programs according to the Pareto rule.
The online optimization method for a rendering program according to claim 7, wherein the rendering cost $C_{total}$ of a representative rendering program is predicted according to the following formula:

$$C_{total} = N_v C_v t_v + N_p C_p t_p + N_a t_a \qquad (1)$$

where $N_v$, $N_p$, $N_a$ respectively denote the number of vertices, the number of pixels updated by the model, and the total number of scalars that need to be read in the rendering pipeline;

$C_v$, $C_p$ respectively denote the number of instructions executed by the vertex shader and the pixel shader;

and $t_v$, $t_p$, $t_a$ respectively denote the unit time consumed by the rendering program, during the optimization process, in the vertex processing stage, the pixel processing stage, and the rasterization stage.
The online optimization method for a rendering program according to claim 7, wherein the drawing error $e_j$ of the $j$-th representative simplified program is predicted by a formula defined in terms of the following quantities:

$I_j$ denotes the total number of parent nodes, in the simplified dependency graph, of the node corresponding to the $j$-th representative simplified program;

$q_j$ and $q_{i_k}$ are the parameter influence vectors of the $j$-th representative simplified program and of the simplified program corresponding to the $i_k$-th parent node, respectively;

and the drawing error of the representative rendering program corresponding to the $i_k$-th parent node.