WO2017181837A1 - Rendering program online optimisation method - Google Patents

Publication number
WO2017181837A1
WO2017181837A1 PCT/CN2017/078973 CN2017078973W WO2017181837A1 WO 2017181837 A1 WO2017181837 A1 WO 2017181837A1 CN 2017078973 W CN2017078973 W CN 2017078973W WO 2017181837 A1 WO2017181837 A1 WO 2017181837A1
Authority
WO
WIPO (PCT)
Prior art keywords
rendering
program
simplified
programs
online
Prior art date
Application number
PCT/CN2017/078973
Other languages
French (fr)
Chinese (zh)
Inventor
王锐
鲍虎军
袁亚振
Original Assignee
浙江大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江大学
Publication of WO2017181837A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00: 3D [Three Dimensional] image rendering
    • G06T15/005: General purpose rendering architectures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00: 3D [Three Dimensional] image rendering

Definitions

  • the invention relates to the field of real-time rendering, in particular to an online optimization method for a rendering program, which is used to optimize a rendering program online during the drawing process.
  • A rendering program is composed of several kinds of shader code, including vertex, hull, tessellation, geometry, pixel, and compute shaders. Each of these shaders corresponds to a stage of the programmable hardware rendering pipeline. During rendering, each shader in the rendering program is invoked in a different rendering stage, and the number of times a shader is executed varies with its stage. In practice, a simpler rendering program runs faster, but rewriting a program by hand into a simpler one is very inefficient, so methods that automatically simplify the original rendering program keep appearing.
  • Pellacini proposed a user-configurable shader simplification method for per-pixel procedural modeling.
  • The method generates a series of shaders progressively simplified from the original shader.
  • It applies specified simplification rules to the shader's code to generate a series of candidate variants, evaluates the difference in a metric between each variant and the original shader, and selects the candidate variant with the smallest error. This selection process loops until the final shader becomes a constant.
  • Sitthi-amorn uses genetic programming to simplify shaders automatically. Like Pellacini's method, the algorithm computes a series of progressively simplified shaders, but it considers more code transformation rules, including swapping operands and operators in expression statements, deleting statements, and inserting statements.
  • The method uses a genetic algorithm to select among more simplified shaders and can also produce faster and more reliable results.
  • Wang proposed a simplification that spans multiple shader stages.
  • The simplification of the rendering program is treated as a re-fitting of the surface signal, and basis functions are used to express that signal, such as higher-order polynomial functions, meshless sparse radial basis functions, or basis functions for data decomposition.
  • The present invention proposes an online optimization method for a rendering program. The method reduces the number of simplified programs generated and completes the selection of a simplified program during rendering, so that optimization of the rendering program is faster and more accurate.
  • An online simplification method for a rendering program comprising the following steps:
  • E1, T1 respectively represent the rendering error and rendering cost of the simplified program
  • E_max represents the maximum allowable error (set according to needs)
  • T0 represents the rendering cost of the current basic program.
  • the rendering program in the present invention uses source code written in the HLSL language.
  • To be processed during rendering, the source code must be converted into a corresponding abstract syntax tree and a program dependency tree. After the simplification rules have been applied to generate a simplified program, the simplified abstract syntax tree still has to be converted back into HLSL code for rendering.
  • Each node in the dependency graph represents a simplified rendering program that is linked by directed edges. Because all of the simplified programs are simplified by the original renderer, the original renderer exists as the root node of the entire simplified dependency graph.
  • the parameter influence vector and the rendering cost are calculated by the following methods:
  • the number of floating point numbers required for the vertex shader and pixel shader interfaces is actually the number of floating point numbers that need to be rasterized during the rasterization phase.
  • The scalar instruction counts of the vertex shader and pixel shader, and the number of floating-point values required by the interface between the vertex shader and pixel shader, are obtained using the method disclosed in the paper "A system for rapid, automatic shader level-of-detail".
  • C_v and C_f are the numbers of scalar instructions in the vertex shader and the pixel shader, N_a is the number of floating-point values required by the pixel shader interface, W_v and W_f are the weights of the scalar instruction counts of the vertex shader and the pixel shader, and W_a is the weight of the number of floating-point values required by the pixel shader interface.
  • W_v, W_f, and W_a are weights assigned to each rendering stage; their ranges are 0.2 to 2.0, 0.8 to 20, and 10 to 400, respectively.
  • The preferred values of W_v, W_f, and W_a are 0.2, 10, and 200, respectively.
  • N is the total number of simplified rules applied to the kth input parameter.
  • i is the i-th expression or statement associated with the kth input parameter.
  • w i is the scalar quantity of the ith expression or statement.
  • the weight of the parameters caused by them is also different.
  • the parameter influence vector combines the influence values of all parameters into a vector whose dimension is equal to the number of input variables.
  • each group is clustered into M classes by K-means clustering, and the distance function used in clustering is the dot product of the parameter influence vectors of two simplified programs;
  • N is 10 to 100 (the more simplified programs there are, the larger N should be), and M is 5 to 50.
  • In step (2), the scene parameters must be monitored for changes. When a drastic change occurs (such as scene loading, camera steering, or a change of more than 20% in the number of pixels being rendered), a new round of optimization is started; otherwise the current round of optimization proceeds normally.
  • the scene input parameters need to be cached.
  • the monitoring and caching interval is set to 200ms by default.
  • Within the L-neighborhood of the original rendering program, the K representative rendering programs with the lowest selection cost (that is, rendering cost) are used as candidate rendering programs.
  • The rendering cost for this search is calculated by the method in step (1-3). For a representative rendering program whose rendering cost has been calculated before, the value can be read directly from the cache, because the data is cached throughout the optimization process.
  • The L-neighborhood consists of the representative rendering programs within L steps of the original rendering program in the dependency graph. L ranges from 2 to 6 and is preferably 4.
  • the first round of loops can be searched by predicting the rendering error and rendering cost.
  • In every cycle after the first cycle of each round of optimization, the search is performed as follows:
  • The rendering cost and rendering error of the remaining representative rendering programs are predicted from the known representative rendering programs obtained in the first cycle. K representative rendering programs are then selected as candidate rendering programs from the predictions according to the Pareto rule.
  • N_v, N_p, and N_a are, respectively, the number of vertices, the number of pixels updated by the model (that is, the number of pixels in the drawn area, queried through the DirectX API), and the total number of scalars that must be read in the rendering pipeline.
  • C_v and C_p are the numbers of instructions executed by the vertex shader and the pixel shader, respectively.
  • t_v, t_p, and t_a are the unit times spent during optimization in the vertex processing stage, the pixel processing stage, and the rasterization stage, respectively.
  • The unit times are fitted according to formula (1) during online optimization.
  • The inputs to the fit are the initial values 1.0, 1.2, and 1200, and several sets of known quantities (all parameters in formula (1) other than t_v, t_p, and t_a). The sets of known quantities are obtained from step (2-2) in previous cycles.
  • I_j denotes the total number of parent nodes of the node corresponding to the j-th representative simplified program in the simplified dependency graph, q_j,
  • e_p is the rendering error of the representative rendering program corresponding to a known child node of the i_k-th parent node.
  • q_p is the parameter influence vector of the p-th representative simplified program.
  • I_p denotes the total number of parent nodes of the node corresponding to the p-th representative simplified program in the simplified dependency graph.
  • the first known node in the present invention is a node that has solved the corresponding rendering cost and drawing error in the first cycle of the online optimization process.
  • the goal of online renderer optimization is to select the fastest renderer within a given quality threshold under any render parameters. Changes in rendering parameters can result in changes in the rendering quality and rendering cost of different simplified programs, and the rendering parameters are constantly changing during the rendering process. Therefore, when the parameters change, it is necessary to recalculate the rendering cost and rendering quality of the simplified program and re-select the optimal. This dynamic calculation process and the rendering process are synchronous, so it is called online rendering program optimization.
  • Computing the rendering cost and rendering quality of a simplified program requires additional GPU execution, which reduces overall rendering efficiency, so the fewer simplified programs are executed, the more efficient the optimization.
  • For each candidate rendering program, the corresponding rendering error is calculated using a mipmap, as follows:
  • Using a mipmap reduces the amount of calculation and makes the error calculation more efficient, which speeds up the optimization.
  • The rendering cost and rendering error of each candidate simplified program become available only after a wait of 3 to 7 frames. These data are stored on the corresponding simplified nodes and used to calculate t_v, t_p, and t_a.
  • the optimization method of the invention consists of two parts, a preprocessing stage and an online optimization stage.
  • In the preprocessing stage, the code instructions and the parameter influences of the code are analyzed, predictions of rendering cost and rendering error are obtained, and the simplified programs are clustered, which reduces redundancy among the simplified programs.
  • a simplified dependency graph is proposed to represent the relationship between the simplified programs, so that the completion of the online optimization can be used to search and predict the simplified program.
  • multiple iterations are used to complete an optimization.
  • a data-driven predictive model is used to predict the rendering cost and rendering efficiency of the simplified program, thereby reducing the number of simplified programs that need to be drawn and accelerating the optimization efficiency.
  • the beneficial effects of the present invention are as follows: the use of parameter influence and performance estimation reduces the number of simplified program generations and accelerates the preprocessing time. Dynamically selecting the optimal simplified program in the drawing stage not only realizes the decoupling of the rendering program and the scene, but also avoids the enumeration parameter space problem of offline optimization and accelerates the optimization.
  • The rendering cost and rendering quality of the simplified programs are calculated during rendering so that the quality of each simplified program is determined dynamically. Because the error calculation and time measurement of the simplified programs run simultaneously with scene rendering, the real rendering cost and rendering error under the current rendering parameters are obtained, which allows a more accurate selection.
  • the original rendering program is parsed, and the original pixel shader and the original vertex shader in the original rendering program are converted into corresponding abstract syntax trees. All operations are followed by operations on the respective abstract syntax trees.
  • The original rendering program is composed of the original vertex shader and the original pixel shader.
  • An online optimization method for a rendering program comprising:
  • N is the total number of simplified rules applied to the kth parameter
  • i is the i-th simplified expression or statement associated with k. Represents the effect defined on the kth argument, the ith expression or statement
  • w i represents the scalar quantity of the simplified variable.
  • the weight of the parameters caused by them is also different.
  • C_v and C_f are the numbers of instructions of the vertex shader and the pixel shader.
  • N_a is the number of scalars that must be rasterized in the rasterization stage.
  • Clustering is performed using a K-means scheme.
  • The distance function is the dot product of the parameter influence vectors of the simplified programs. In this embodiment, the programs are divided into 15 groups, and 20 are selected from each group.
  • The dependencies between the generated simplified programs are then reduced to those among the selected programs.
  • The final simplified dependency graph contains 794 simplified programs, which are the representative rendering programs.
  • the program currently drawn to the window is used as a basic program to monitor the parameters of the scene to be drawn, and online optimization is performed when a drastic change occurs.
  • another thread is opened for online optimization.
  • the program drawn to the window is called the basic program, and the error value allowed by the optimization is set.
  • Monitor the scene motion, decide whether to start a new optimization, and cache the scene parameters at regular intervals;
  • a new round of optimization is started due to the detection of scene import (initialization), and the current scene parameters are cached.
  • the error tolerance is set to 1.2.
  • Selecting N simplified programs from the simplified dependency graph, according to different optimization states, has two different selection strategies: 1. initial search strategy; 2. predictive search strategy based on rendering error and rendering cost.
  • the initial search applies to the search for the first round of loops in each round of online optimization, as follows:
  • Within the L-neighborhood of the original rendering program, the K representative rendering programs with the lowest selection cost (that is, rendering cost) are used as candidate rendering programs.
  • K = 4.
  • The rendering cost for this search is calculated by the method in step (1-3). For a representative rendering program whose rendering cost has been calculated before, the value can be read directly from the cache, because the data is cached throughout the optimization process.
  • The search strategy based on predicting rendering error and rendering cost is used in the cycles after the first cycle, as follows:
  • The rendering cost and rendering error of the remaining representative rendering programs are predicted from the known representative rendering programs obtained in the first cycle. K representative rendering programs are then selected as candidate rendering programs from the predictions according to the Pareto rule.
  • N_v, N_p, and N_a are, respectively, the number of vertices, the number of pixels updated by the model (that is, the number of pixels in the drawn area, queried through the DirectX API), and the total number of scalars that must be read in the rendering pipeline.
  • C_v and C_p are the numbers of instructions executed by the vertex shader and the pixel shader, respectively.
  • t_v, t_p, and t_a are the unit times spent during optimization in the vertex processing stage, the pixel processing stage, and the rasterization stage, respectively.
  • The unit times are fitted according to formula (1) during online optimization.
  • The inputs to the fit are the initial values 1.0, 1.2, and 1200, and several sets of known quantities (all parameters in formula (1) other than t_v, t_p, and t_a). The sets of known quantities are obtained from step (2-2) in previous cycles.
  • I_j denotes the total number of parent nodes of the node corresponding to the j-th representative simplified program in the simplified dependency graph, q_j,
  • e_p is the rendering error of the representative rendering program corresponding to a known child node of the i_k-th parent node.
  • q_p is the parameter influence vector of the p-th representative simplified program.
  • I_p denotes the total number of parent nodes of the node corresponding to the p-th representative simplified program in the simplified dependency graph.
  • A mipmap of this texture is generated, and a compute shader calculates the pixel error between this texture at a particular mipmap level and the rendering of the original rendering program.
  • the default setting is 5 layers.
  • E1, T1 respectively represent the rendering error and rendering cost of the simplified program
  • E_max represents the maximum allowable error (set according to needs)
  • T0 represents the rendering cost of the current basic program.
  • If one of the four measured simplified programs is a better choice than the current basic program, the basic program is updated and the optimization continues, with steps (2-1) to (2-3) executed cyclically.
  • K candidate rendering programs will be selected using a prediction strategy based on rendering error and time.
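
The Pareto-based selection mentioned in the definitions above can be illustrated with a minimal sketch. It assumes each representative program has a predicted (rendering error, rendering cost) pair, that lower is better for both, and that candidates on the Pareto front are ranked by cost. The function names and the tie-breaking order are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

def pareto_front(points: List[Tuple[float, float]]) -> List[int]:
    """Return indices of (error, cost) pairs that no other pair dominates."""
    front = []
    for i, (e_i, c_i) in enumerate(points):
        dominated = any(
            e_j <= e_i and c_j <= c_i and (e_j < e_i or c_j < c_i)
            for j, (e_j, c_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front

def select_candidates(points: List[Tuple[float, float]], k: int) -> List[int]:
    """Pick up to k Pareto-optimal programs, cheapest first (assumed ordering)."""
    front = pareto_front(points)
    front.sort(key=lambda i: (points[i][1], points[i][0]))
    return front[:k]
```

With K = 4, as in the embodiment, `select_candidates(predictions, 4)` would return at most four program indices to render and measure in the next cycle.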

Abstract

Disclosed in the present invention is a rendering program online optimisation method, comprising: constructing simplified programs of the original rendering program, selecting a plurality of them as representative rendering programs on the basis of rendering costs and drawing errors, and constructing a simplified dependency graph on the basis of their dependency relationships; in the rendering process, monitoring the parameters of the scene to be drawn and, when a dramatic change occurs, starting a new round of online optimisation. Each optimisation is completed through a number of cycles, and the following operations are carried out in each cycle: on the basis of the simplified dependency graph, selecting K candidate simplified programs from all the representative simplified programs; on the basis of the drawing errors and rendering costs, determining the result of the current cycle; and on the basis of the results of a plurality of cycles, deciding whether the current online optimisation is complete. The optimal simplified rendering program is selected dynamically and the rendering program is decoupled from the scene, which avoids the problem of parameter space enumeration in off-line optimisation; error calculation and time measurement for the rendering programs are carried out simultaneously with scene drawing, which makes the method rapid and practical.

Description

An online optimization method for rendering programs
Technical field
The invention relates to the field of real-time rendering, and in particular to an online optimization method for a rendering program, used to optimize the rendering program online while rendering is in progress.
Background
In real-time rendering there is a constant demand for higher rendering speed, and the complexity of the rendering program is an important factor in rendering efficiency. A rendering program is composed of several kinds of shader code, including vertex, hull, tessellation, geometry, pixel, and compute shaders. Each shader corresponds to one stage of the programmable hardware rendering pipeline. During rendering, each shader in the rendering program is invoked in a different rendering stage, and the number of times a shader is executed varies with its stage. In practice, a simpler rendering program runs faster, but rewriting a program by hand into a simpler one is very inefficient, so methods that automatically simplify the original rendering program keep appearing.
Pellacini proposed a user-configurable shader simplification method for per-pixel procedural modeling. The method generates a series of shaders progressively simplified from the original shader. It applies specified simplification rules to the shader's code to generate a series of candidate variants, evaluates the difference in a metric between each variant and the original shader, and selects the candidate variant with the smallest error. This selection process loops until the final shader becomes a constant. Sitthi-amorn uses genetic programming to simplify shaders automatically. Like Pellacini's method, the algorithm computes a series of progressively simplified shaders, but it considers more code transformation rules, including swapping operands and operators in expression statements, deleting statements, and inserting statements. It also uses a genetic algorithm to select among more simplified shaders and can produce faster and more reliable results. Wang proposed a simplification that spans multiple shader stages: the execution of the rendering program is treated as a process that generates a surface signal, the simplification of the rendering program is treated as a re-fitting of that surface signal, and basis functions are used to express the signal, such as higher-order polynomial functions, meshless sparse radial basis functions, or basis functions for data decomposition.
However, these approaches have the following two problems:
1. Long computation time. Every simplified program has to be rendered and its error and time calculated before a suitable simplified program can be selected, and a large number of simplified programs consumes a very long precomputation time.
2. Offline optimization fixes the rendered model. When time and error are calculated, only a limited number of viewpoints can be chosen, which cannot cover the whole space of possibilities encountered at run time.
Summary of the invention
In view of the deficiencies of the prior art, the present invention proposes an online optimization method for a rendering program. The method reduces the number of simplified programs generated and completes the selection of a simplified program during rendering, so that optimization of the rendering program is faster and more accurate.
An online simplification method for a rendering program, comprising the following steps:
(1) The original rendering program is preprocessed as follows:
(1-1) The original rendering program is pre-simplified with different rendering-program simplification rules to obtain a number of simplified programs, and the rendering cost of each simplified rendering program is calculated;
(1-2) According to the simplification rules used by each simplified program, the dependencies between the original rendering program and the simplified programs are determined;
(1-3) The parameter influence vector and rendering cost of each simplified rendering program are calculated, the parameter influence vector being the vector formed by the influence values of all input parameters of the corresponding rendering program on the computation results in that program;
(1-4) According to the rendering cost and parameter influence vector of the simplified programs, several simplified rendering programs are selected by clustering as representative simplified programs, and a simplified dependency graph is generated from the original rendering program and all representative simplified programs according to the dependencies;
(2) During rendering, the program currently drawn to the window serves as the basic program, and the parameters of the scene to be drawn are monitored. When a drastic change occurs, a new round of online optimization starts, and the following operations are performed cyclically in each round of online optimization:
(2-1) K candidate simplified programs are selected from all representative simplified programs according to the simplified dependency graph;
(2-2) While the basic program is drawing to the window, the rendering of the K candidate simplified programs is inserted, and the corresponding rendering errors and rendering costs are calculated and stored;
(2-3) For any candidate simplified program:
If E1 < E_max and T1 < T0, the basic program is updated and the current round of optimization ends;
Otherwise, the basic program is not updated and the following is performed:
If the basic program has not been updated for several consecutive cycles, the current round of online optimization stops;
Otherwise, execution returns to step (2-1);
where E1 and T1 are the rendering error and rendering cost of the simplified program, E_max is the maximum allowable error (set as needed), and T0 is the rendering cost of the current basic program.
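
The acceptance test in steps (2-1) to (2-3) can be sketched as follows. This is an illustrative reading only: `pick_candidates` and `measure` are hypothetical callbacks standing in for steps (2-1) and (2-2), the limit of 3 idle cycles is an assumed value for "several consecutive cycles", and choosing the cheapest accepted candidate when more than one passes is also an assumption.

```python
from typing import Callable, Hashable, List, Optional, Tuple

def run_round(
    t0: float,
    e_max: float,
    pick_candidates: Callable[[], List[Hashable]],
    measure: Callable[[Hashable], Tuple[float, float]],
    max_idle_cycles: int = 3,  # assumed; the text only says "several"
) -> Tuple[Optional[Hashable], float]:
    """Run one round; return (new basic program or None, its rendering cost)."""
    idle_cycles = 0
    while idle_cycles < max_idle_cycles:
        # Step (2-1): choose K candidates; step (2-2): measure (E1, T1) for each.
        measured = [(cand, *measure(cand)) for cand in pick_candidates()]
        # Step (2-3): accept a candidate only if E1 < E_max and T1 < T0.
        accepted = [(t1, cand) for cand, e1, t1 in measured
                    if e1 < e_max and t1 < t0]
        if accepted:
            t1, cand = min(accepted, key=lambda item: item[0])
            return cand, t1  # update the basic program and end the round
        idle_cycles += 1     # no update in this cycle
    return None, t0          # stop after several cycles without an update
```

A caller would keep rendering with the current basic program whenever the first element of the result is `None`.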
The rendering program in the present invention is source code written in HLSL. To be processed during rendering, the source code must be converted into a corresponding abstract syntax tree and a program dependency tree. After the simplification rules have been applied to generate a simplified program, the simplified abstract syntax tree still has to be converted back into HLSL code for rendering.
Three simplification rules are used in the present invention to simplify the original rendering program: the expression deletion rule, the code movement rule, and the tessellation rule. The code movement and tessellation rules are the same as the two rules disclosed in the paper "Automatic shader simplification using surface signal approximation", and the expression deletion rule may be the same as the rule disclosed in the document "User-configurable automatic shader simplification". Preferably, the expression deletion rule differs slightly from the one in "User-configurable automatic shader simplification", as follows:
(a) For a binary expression
Figure PCTCN2017078973-appb-000001
there are two simplified variants,
Figure PCTCN2017078973-appb-000002
and
Figure PCTCN2017078973-appb-000003
(b) For a loop running from cb to ce, i and j are generated randomly and the loop is replaced by one running from cb+i to ce-j.
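
Rule (b) can be illustrated with a small sketch. It assumes a half-open loop range [cb, ce) and draws i and j so that at least one iteration survives; the patent text does not state either constraint, and the function name is hypothetical.

```python
import random
from typing import Tuple

def trimmed_loop_bounds(cb: int, ce: int, rng: random.Random) -> Tuple[int, int]:
    """Return (cb + i, ce - j) for random i, j >= 0, keeping the range non-empty."""
    length = ce - cb
    if length <= 0:
        return cb, ce                        # nothing to trim
    i = rng.randint(0, length - 1)           # iterations dropped at the start
    j = rng.randint(0, length - 1 - i)       # iterations dropped at the end
    return cb + i, ce - j
```

For example, `trimmed_loop_bounds(0, 16, random.Random(1))` yields new bounds for a 16-iteration loop; each distinct (i, j) pair corresponds to one simplified variant of the loop.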
To simplify the original rendering program, target code is selected from it (a single statement, several statements, or all statements of the original rendering program), and at least one of the three simplification rules above is applied to that target code. Different combinations of simplification rules can be applied to different target code, so simplifying a single original rendering program yields a large number of simplified rendering programs.
Each node in the dependency graph represents a simplified rendering program, and the nodes are connected by directed edges. Because every simplified program is derived from the original rendering program, the original rendering program is the root node of the whole simplified dependency graph.
Consider nodes A and B in the simplified dependency graph. If the program of node B is obtained by applying one simplification operation to simplified program A, there is a directed edge from node A to node B. Take the following two statements S1 and S2 as an example: S1: c=a+b; S2: e=d+c. The code movement rule can be applied to S1 and to S2 separately, generating two nodes V1 and V2. However, moving statement S2 necessarily moves statement S1 as well, so V2 can be regarded as generated by moving statement S2 on the basis of V1, and the simplified dependency graph therefore contains a directed edge from V1 to V2.
聚类完成之后,将依照原始的简化依赖图重新构建这些选出来的简化程序之间的依赖关系,生成依赖图,该依赖图中仅保留选择的代表简化程序。After the clustering is completed, the dependencies between these selected simplified programs are reconstructed according to the original simplified dependency graph, and a dependency graph is generated in which only the selected representative simplified program is retained.
In step (1-3) of the present invention, the parameter influence vector and the rendering cost of any simplified program are computed as follows:
Obtain the number of scalar instructions of the simplified program in the vertex shader and in the pixel shader, the number of floating-point values required by the interface between the vertex shader and the pixel shader, and the influence value of each input parameter on the corresponding computation result in the rendering program; then estimate the rendering cost of the simplified program from the scalar instruction counts.
The number of floating-point values required by the interface between the vertex shader and the pixel shader is in fact the number of floating-point values that must be rasterized in the rasterization stage. Both the scalar instruction counts of the two shaders and the interface floating-point count are obtained with the method disclosed in the paper "A system for rapid, automatic shader level-of-detail".
For any simplified program, the rendering cost is computed by the following formula:
$C_{total} = W_v C_v + W_f C_f + W_a N_a$,
where $C_v$ and $C_f$ are the numbers of scalar instructions in the vertex shader and the pixel shader, respectively; $N_a$ is the number of floating-point values required by the pixel shader interface; $W_v$ and $W_f$ are the weights of the scalar instruction counts of the vertex shader and the pixel shader, respectively; and $W_a$ is the weight of the floating-point values required by the pixel shader interface.
In the present invention, $W_v$, $W_f$, and $W_a$ are the weights of the computation in the respective rendering stages. Their ranges are 0.2 to 2.0, 0.8 to 20, and 10 to 400, respectively, and their preferred values are 0.2, 10, and 200.
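As an illustration of the weighted sum above, the following minimal Python sketch evaluates the static cost estimate with the preferred weights as defaults. The function name and the instruction counts in the usage line are hypothetical and are not taken from the patent.

```python
def rendering_cost(c_v, c_f, n_a, w_v=0.2, w_f=10.0, w_a=200.0):
    """Static cost estimate C_total = W_v*C_v + W_f*C_f + W_a*N_a.

    c_v, c_f: scalar instruction counts of the vertex and pixel shader.
    n_a:      floats passed across the vertex/pixel shader interface.
    The default weights are the preferred values stated in the text.
    """
    return w_v * c_v + w_f * c_f + w_a * n_a


# Hypothetical shader: 40 vertex instructions, 120 pixel instructions,
# 8 interpolated floats.
cost = rendering_cost(40, 120, 8)
```

With these weights the interface term dominates quickly, which matches the text's emphasis on the cost of rasterized floats.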
The influence value $q_k$ of the k-th input parameter on the computation result of the rendering program is computed by the following formula:
Figure PCTCN2017078973-appb-000004
where N is the total number of simplification rules applied to the k-th input parameter,
i indexes the i-th expression or statement related to the k-th input parameter,
Figure PCTCN2017078973-appb-000005
is the influence of the k-th parameter on the i-th expression or statement, and
$w_i$ is the number of scalars in the i-th expression or statement.
Different simplification rules carry different parameter-influence weights. In the present invention, code deletion is given a weight of 1 and code movement a weight of 0.1. Because tessellation is always generated on the basis of code movement, its weight is set to the code-movement weight divided by the number of triangle vertices produced by the subdivision. The parameter influence vector combines the influence values of all parameters into one vector whose dimension equals the number of input variables.
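The rule weights described above can be sketched as a small lookup. This covers only the weight assignment; how the weights enter the formula for $q_k$ is given by the equation image and is not reproduced here. The rule names and the function name are hypothetical labels chosen for illustration.

```python
def rule_weight(rule, tess_vertices=None):
    """Parameter-influence weight of one simplification rule.

    rule: "delete", "move" or "tessellate" (hypothetical labels).
    tess_vertices: number of triangle vertices produced by the
                   subdivision, required for "tessellate".
    """
    if rule == "delete":
        return 1.0
    if rule == "move":
        return 0.1
    if rule == "tessellate":
        if not tess_vertices:
            raise ValueError("tessellation needs a positive vertex count")
        # Code-movement weight divided by the number of subdivided vertices.
        return 0.1 / tess_vertices
    raise ValueError("unknown rule: %r" % (rule,))
```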
In step (1-4), a number of representative simplified programs are selected from all simplified programs as follows:
(S1-41) Divide all simplified programs into N groups according to their rendering cost;
(S1-42) Cluster each group into M classes with K-means, using the dot product of the parameter influence vectors of two simplified programs as the distance function;
(S1-43) For each class, select as its representative the simplified program closest to the class center under the distance function.
N is 10 to 100 and is set according to the number of simplified programs (the more programs, the larger N); M is 5 to 50.
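Step (S1-41) can be sketched as follows. The text does not say how the group boundaries are chosen, so this sketch assumes equal-count bins over the cost-sorted programs; that choice, and the function name, are assumptions. The K-means step (S1-42) and the representative selection (S1-43) are not shown.

```python
def split_by_cost(costs, n_groups):
    """Split program indices into n_groups bins of increasing rendering cost.

    Assumption: bins hold (nearly) equal numbers of programs; the patent
    only says the programs are divided into N groups by rendering cost.
    """
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    size, extra = divmod(len(order), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        # The first `extra` groups take one additional program.
        end = start + size + (1 if g < extra else 0)
        groups.append(order[start:end])
        start = end
    return groups
```

Each returned group would then be clustered separately, so that representatives are spread across the whole cost range rather than concentrated among the cheapest programs.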
In step (2), the scene parameters must be monitored for changes. When a drastic change occurs (for example a scene load, a camera turn, or a change of more than 20% in the number of rendered pixels), a new round of optimization is started; otherwise the current round continues normally. To keep the input data consistent within one round, the scene input parameters are cached. The monitoring and caching interval defaults to 200 ms.
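A minimal sketch of the restart test described above. The 20% threshold comes from the text; measuring the change relative to the previous pixel count, and the boolean flags for scene load and camera turn, are assumptions made for illustration.

```python
def needs_restart(scene_loaded, camera_turned, prev_pixels, cur_pixels,
                  threshold=0.20):
    """Return True when a new optimization round should be started."""
    if scene_loaded or camera_turned:
        return True
    if prev_pixels == 0:
        # Nothing was drawn before; any drawn pixel counts as a change.
        return cur_pixels != 0
    # Assumption: the change is measured relative to the previous count.
    change = abs(cur_pixels - prev_pixels) / prev_pixels
    return change > threshold
```

In a real renderer this check would run once per monitoring interval (200 ms by default in the text) on the cached scene parameters.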
In the first loop of each optimization round, the search is performed as follows:
Starting from the original rendering program and following the connectivity of the dependency graph, the K representative rendering programs with the smallest rendering cost within the L-neighborhood of the original rendering program are selected as candidate rendering programs.
The rendering cost used in this search is computed by the method of step (1-3). For representative rendering programs whose rendering cost has already been computed, the value is read directly from the cache, because data are cached throughout the optimization.
When selecting by rendering cost, ties are broken by choosing, in order, the representative simplified programs nearest to the root node (two costs are considered tied if their difference does not exceed 10% to 50% of their mean). If several programs have exactly equal cost and the same step distance from the original rendering program, K of them are chosen at random.
The L-neighborhood consists of the representative rendering programs that lie within L steps of the original rendering program in the dependency graph. L ranges from 2 to 6 and is preferably 4.
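The initial search can be sketched as a breadth-first walk limited to L steps, followed by a sort on cost. This is a simplified illustration: only exact cost ties are broken by distance from the root, and the tolerance-based tie rule and the random choice among fully equal programs are not modeled. The graph encoding (a dict of child lists) and all names are assumptions.

```python
from collections import deque


def initial_candidates(children, cost, root, max_steps=4, k=4):
    """Pick the k cheapest programs within max_steps edges of root.

    children: dict node -> list of child nodes (directed dependency graph).
    cost:     dict node -> estimated rendering cost.
    """
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == max_steps:
            continue  # do not expand beyond the L-neighborhood
        for child in children.get(node, ()):
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    reachable = [n for n in depth if n != root]
    # Cheapest first; equal costs are ordered by closeness to the root.
    reachable.sort(key=lambda n: (cost[n], depth[n]))
    return reachable[:k]
```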
Several representative rendering programs have already been evaluated in the first loop, that is, their rendering cost and drawing error have been obtained. From these results the rendering cost and drawing error of the other representative rendering programs can be predicted, so after the first loop the search can be guided by the predicted drawing error and rendering cost.
In all loops after the first loop of each optimization round, the search is performed as follows:
First, the rendering cost and drawing error of the remaining representative rendering programs are predicted from the known representative rendering programs obtained in the first loop; then K representative rendering programs are selected from the prediction results as candidate rendering programs according to the Pareto criterion.
The rendering cost $C_{total}$ of a representative rendering program is predicted by the following formula:
$C_{total} = N_v C_v t_v + N_p C_p t_p + N_a t_a$, (1)
$N_v$, $N_p$, and $N_a$ are, respectively, the number of vertices, the number of pixels updated by drawing the model (i.e. the number of pixels in the drawn area, obtained by a DirectX API query), and the total number of scalars that must be read in the rendering pipeline. $C_v$ and $C_p$ are the numbers of instructions executed by the vertex shader and the pixel shader, respectively. $t_v$, $t_p$, and $t_a$ are the unit times that the rendering program under optimization spends in the vertex processing stage, the pixel processing stage, and the rasterization stage, respectively.
The unit times are fitted to formula (1) during online optimization. The fit uses initial values of 1.0, 1.2, and 1200, respectively, together with several sets of known quantities (every quantity in formula (1) other than $t_v$, $t_p$, $t_a$), which are obtained in step (2-2) of the previous loop.
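The fitting of the unit times can be sketched as a linear least-squares problem, since formula (1) is linear in $t_v$, $t_p$, $t_a$. The text does not name the fitting method, so ordinary least squares solved with Cramer's rule is an assumption, as is falling back to the stated initial values when the measurements do not determine a solution. All function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_unit_times(samples, initial=(1.0, 1.2, 1200.0)):
    """Fit (t_v, t_p, t_a) of formula (1) to measured costs.

    samples: list of (n_v, c_v, n_p, c_p, n_a, measured_cost).
    Assumption: ordinary least squares via the normal equations.
    """
    rows = [(n_v * c_v, n_p * c_p, n_a)
            for n_v, c_v, n_p, c_p, n_a, _ in samples]
    ys = [s[5] for s in samples]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    d = det3(ata)
    if abs(d) < 1e-12:
        return initial  # under-determined: keep the starting values
    fitted = []
    for col in range(3):
        m = [row[:] for row in ata]
        for r in range(3):
            m[r][col] = aty[r]  # Cramer's rule: replace one column
        fitted.append(det3(m) / d)
    return tuple(fitted)


def predict_cost(n_v, c_v, n_p, c_p, n_a, unit_times):
    """Evaluate formula (1) for one program."""
    t_v, t_p, t_a = unit_times
    return n_v * c_v * t_v + n_p * c_p * t_p + n_a * t_a
```

Once fitted, `predict_cost` can be applied to programs that have not been rendered yet, which is what lets the later loops avoid measuring every representative program.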
The drawing error $e_j$ of the j-th representative simplified program is predicted by the following formula:
Figure PCTCN2017078973-appb-000006
$I_j$ is the total number of parent nodes of the node corresponding to the j-th representative simplified program in the simplified dependency graph; $q_j$ and
Figure PCTCN2017078973-appb-000007
are the parameter influence vectors of the representative simplified programs corresponding to the j-th node and to its $i_k$-th parent node, respectively; and
Figure PCTCN2017078973-appb-000008
is the drawing error of the representative rendering program corresponding to the $i_k$-th parent node.
The drawing error of the representative rendering program corresponding to the $i_k$-th parent node,
Figure PCTCN2017078973-appb-000009
is computed by the following formula:
Figure PCTCN2017078973-appb-000010
where $e_p$ is the drawing error of the representative rendering program corresponding to a known child node of the $i_k$-th parent node, $q_p$ is the parameter influence vector of the p-th representative simplified program,
Figure PCTCN2017078973-appb-000011
is the parameter influence vector of the representative simplified program corresponding to the $i_k$-th parent node, in the simplified dependency graph, of the node of the p-th representative simplified program, and $I_p$ is the total number of parent nodes of the node corresponding to the p-th representative simplified program in the simplified dependency graph.
In the present invention, the initial known nodes are those whose rendering cost and drawing error were obtained in the first loop of the current online optimization round.
The goal of online rendering-program optimization is to select, under any rendering parameters, the fastest rendering program within a given quality threshold. Changes in the rendering parameters change the rendering quality and rendering cost of the different simplified programs, and the rendering parameters change continuously during rendering. Whenever the parameters change, the rendering cost and drawing quality of the simplified programs must therefore be recomputed and the optimum reselected. Because this dynamic computation runs in step with rendering, it is called online rendering-program optimization. Computing the rendering cost and drawing quality of simplified programs, however, takes extra GPU time and lowers overall rendering efficiency, so the fewer simplified programs are executed, the more efficient the optimized rendering is.
In practice, to estimate the quality of each simplified program accurately, two data-driven models are used to estimate the rendering cost and the drawing quality (i.e. drawing error), and an iterative scheme is designed to optimize the rendering program online. The scheme monitors and caches the scene; after each prediction it acquires real measurements according to the monitoring results, and these measurements in turn become the basis of the next prediction, making it more accurate. The loop iterates in this way and makes optimization decisions from the data. Predicting drawing quality relies on three things: the simplified dependency graph, the drawing errors of the simplified programs already evaluated, and the parameter influence of each simplified program.
To compute
Figure PCTCN2017078973-appb-000012
the simplified programs that have already been evaluated are used. For a simplified program with known drawing error $e_p$, all of its parent nodes are found in the simplified dependency graph, and for each parent node the value
Figure PCTCN2017078973-appb-000013
is computed.
After the rendering cost and quality of the simplified rendering programs have been predicted, K suitable simplified programs must be chosen for rendering. Given the optimization objective, the chosen programs should all be Pareto-optimal points, so the Pareto frontier is computed over the predicted rendering costs and qualities, and the K fastest programs on that frontier are selected. K is 2 to 7 and is preferably 4.
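The Pareto selection can be sketched as follows: sort by predicted cost, keep a program only if its predicted error is strictly lower than that of every cheaper program, and return the first K entries. The data layout (a dict of cost and error pairs) and the function name are assumptions for illustration.

```python
def pareto_fastest(predictions, k=4):
    """Return the k cheapest programs on the cost/error Pareto frontier.

    predictions: dict name -> (predicted_cost, predicted_error),
                 where lower is better for both values.
    """
    ranked = sorted(predictions, key=lambda name: predictions[name])
    frontier, best_error = [], float("inf")
    for name in ranked:
        _, error = predictions[name]
        # A program is non-dominated only if it beats the error of
        # every program that is at least as cheap.
        if error < best_error:
            frontier.append(name)
            best_error = error
    return frontier[:k]
```

Because the frontier is built in order of increasing cost, truncating it to K entries yields the K fastest Pareto-optimal programs.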
In the present invention, the drawing error of each candidate rendering program is computed with a mipmap, in the following steps:
For the current candidate rendering program, render it, using the scene parameters captured when the optimization was started, to a texture of the same size as the drawing window;
Generate the mipmap of this texture and use a compute shader to compute, at a specific mip level, the L2 pixel error between this texture and the output of the original rendering program (i.e. the rendering program being optimized); the default setting is 5 levels.
The mipmap reduces the amount of computation and makes the error calculation more efficient, which in turn speeds up the optimization.
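The idea of comparing images at a coarse mip level can be sketched on the CPU. The patent performs this on the GPU with a generated mipmap and a compute shader; the pure-Python version below only illustrates the arithmetic, and it assumes square, power-of-two, single-channel images stored as lists of rows. All names are hypothetical.

```python
def downsample(img):
    """Average 2x2 blocks, producing the next coarser mip level."""
    half = len(img) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(half)]
            for y in range(half)]


def mip_level(img, level):
    """Return the image after `level` successive 2x2 reductions."""
    for _ in range(level):
        img = downsample(img)
    return img


def l2_error(candidate, reference, level):
    """L2 pixel error between two images, measured at one mip level."""
    a = mip_level(candidate, level)
    b = mip_level(reference, level)
    total = sum((pa - pb) ** 2
                for row_a, row_b in zip(a, b)
                for pa, pb in zip(row_a, row_b))
    return total ** 0.5
```

Each level quarters the number of pixels compared, which is why measuring at a coarse level makes the error calculation cheaper.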
Because CPU and GPU execution are not synchronized, the rendering cost and drawing error computed for each candidate simplified program are read back only after waiting 3 to 7 frames. These data are stored on the corresponding simplified nodes and are used to compute
Figure PCTCN2017078973-appb-000014
and $t_v$, $t_p$, $t_a$.
Step (2-3) of the present invention uses two criteria to decide whether to update the base program: 1. whether, among all evaluated simplified programs, there is one whose drawing error is within the error tolerance and whose rendering efficiency is higher than that of the current base program; 2. whether the image produced by this simplified program differs too much from that of the current base program. If the base program cannot be updated for 2 to 5 consecutive loops, the optimization stops; the default in the present invention is 3 loops.
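The update decision and the stopping rule can be sketched as below. The text states the two criteria but not how the second one is thresholded, so rejecting a program whose image difference to the base exceeds a limit (`max_diff`) is an assumption, as are all names and the data layout.

```python
def pick_new_base(base_cost, evaluated, image_diff, max_error, max_diff):
    """Return the cheapest acceptable program, or None if none qualifies.

    evaluated:  dict name -> (measured_cost, measured_error).
    image_diff: dict name -> image difference to the current base program.
    max_diff:   hypothetical limit for criterion 2 (not given in the text).
    """
    best = None
    for name, (cost, error) in evaluated.items():
        acceptable = (error <= max_error          # criterion 1: within tolerance
                      and cost < base_cost        # criterion 1: faster than base
                      and image_diff[name] <= max_diff)  # criterion 2
        if acceptable and (best is None or cost < evaluated[best][0]):
            best = name
    return best


class Patience:
    """Stop after `limit` consecutive loops without a base update."""

    def __init__(self, limit=3):
        self.limit = limit
        self.failures = 0

    def should_stop(self, updated):
        self.failures = 0 if updated else self.failures + 1
        return self.failures >= self.limit
```

A driver loop would call `pick_new_base` once per loop and pass `result is not None` to `should_stop`.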
The optimization method of the present invention consists of two parts: a preprocessing stage and an online optimization stage. The preprocessing stage analyzes the code instructions and the parameter influence of the code, obtains predictions of rendering cost and drawing error, and clusters the simplified programs, which reduces redundancy among them. It also introduces the simplified dependency graph to represent the relationships among the simplified programs, so that simplified programs can be searched and predicted during online optimization. The online optimization stage completes one optimization over several iterations. Relying on the simplified dependency graph, it uses data-driven prediction models to predict the rendering cost and drawing efficiency of the simplified programs, which reduces the number of simplified programs that must be rendered and speeds up the optimization.
Compared with the prior art, the present invention has the following beneficial effects. Parameter influence and performance estimation reduce the number of simplified programs that must be generated and shorten preprocessing. Selecting the optimal simplified program dynamically at draw time decouples the rendering program from the scene and avoids the parameter-space enumeration that offline optimization requires, which speeds up the optimization. Because the rendering cost and drawing quality (i.e. drawing error) of the simplified programs are computed during drawing, error calculation and time measurement run alongside scene rendering and yield the true rendering cost and drawing error under the current drawing parameters, so a more accurate selection can be made.
Detailed Description
The invention is described in detail below with reference to specific embodiments.
In this embodiment, before execution, the original rendering program is parsed, and its original pixel shader and original vertex shader are converted into corresponding abstract syntax trees. All subsequent operations are performed on these abstract syntax trees. In this embodiment the original rendering program consists of an original vertex shader and an original pixel shader.
An online optimization method for a rendering program, comprising:
(1) Preprocessing the original rendering program, as follows:
(1-1) Simplifying the rendering program according to several rendering-program simplification rules to generate a large number of simplified rendering programs;
In this embodiment, three rules (expression deletion, code movement, and tessellation) are applied to simplify the rendering program, generating 75,342 simplified programs in total.
(1-2) For each simplified rendering program, computing the dependencies among the simplified programs from the simplification rules that each one used;
(1-3) Analyzing the code of each simplified program to obtain the parameter influence of each input parameter and to estimate the rendering cost of the simplified program;
For any simplified program, the influence of its k-th parameter (i.e. input variable) is computed as follows:
Figure PCTCN2017078973-appb-000015
N is the total number of simplification rules applied to the k-th parameter, i indexes the i-th simplified expression or statement related to k,
Figure PCTCN2017078973-appb-000016
is the influence of the k-th parameter on the i-th expression or statement, and $w_i$ is the number of scalars in the simplified variable.
Different simplification rules carry different parameter-influence weights. Code deletion is given a weight of 1 and code movement a weight of 0.1. Because tessellation is always generated on the basis of code movement, its weight is set to the code-movement weight divided by the number of triangle vertices produced by the subdivision.
For any simplified rendering program, the rendering cost $C_{total}$ is computed by the following formula:
$C_{total} = 0.2 C_v + 10 C_f + 200 N_a$
$C_v$ and $C_f$ are the numbers of computation instructions in the vertex shader and the pixel shader, respectively, and $N_a$ is the number of scalars that must be rasterized in the rasterization stage.
(1-4) Clustering representative simplified programs according to the estimated time and the parameter influence vector of each simplified program, and generating the simplified dependency graph from the dependencies;
Clustering uses K-means, with the dot product of the parameter influence vectors of the simplified programs as the distance function. In this embodiment the programs are divided into 15 groups and 20 are selected from each group.
After the preprocessing stage, the simplified dependency graph generated from the original rendering program is obtained; it contains the generated simplified programs and the dependencies among them. In this embodiment the final simplified dependency graph contains 794 simplified programs, which are the representative rendering programs.
(2) During rendering, the program currently drawing to the window is taken as the base program, the parameters of the scene to be drawn are monitored, and online optimization is performed when a drastic change occurs. To improve optimization efficiency, this embodiment runs the online optimization on a separate thread.
During rendering, the program that draws to the window is called the base program, and an error tolerance for the optimization is set. Scene motion is monitored to decide whether to start a new optimization, and the scene parameters are cached at regular intervals;
In this embodiment, a new optimization round starts because a scene import (initialization) is detected, and the current scene parameters are cached. The error tolerance is set to 1.2.
Each online optimization round loops over the following operations:
(2-1) Selecting K of the representative simplified programs as candidate simplified programs according to the simplified dependency graph:
K simplified programs are selected from the simplified dependency graph. Depending on the current optimization state, one of two selection strategies is used: 1. the initial search strategy; 2. the search strategy based on predicted drawing error and rendering cost.
The initial search is used in the first loop of each online optimization round, as follows:
Starting from the original rendering program and following the connectivity of the dependency graph, the K representative rendering programs with the smallest rendering cost within the L-neighborhood of the original rendering program are selected as candidate rendering programs. In this embodiment K = 4.
The rendering cost used in this search is computed by the method of step (1-3). For representative rendering programs whose rendering cost has already been computed, the value is read directly from the cache, because data are cached throughout the optimization.
When selecting by rendering cost, ties are broken by choosing, in order, the representative simplified programs nearest to the root node (two costs are considered tied if their difference does not exceed 10% to 50% of their mean). If several programs have exactly equal cost and the same step distance from the original rendering program, K of them are chosen at random.
The L-neighborhood consists of the representative rendering programs that lie within L steps of the original rendering program in the dependency graph; in this embodiment L = 4.
Several representative rendering programs have already been evaluated in the first loop, that is, their rendering cost and drawing error have been obtained. From these results the rendering cost and drawing error of the other representative rendering programs can be predicted, so after the first loop the search can be guided by the predicted drawing error and rendering cost; this is the search strategy based on predicted drawing error and rendering cost.
In the loops after the first loop of each online optimization round, the search uses the strategy based on predicted drawing error and rendering cost, as follows:
First, the rendering cost and drawing error of the remaining representative rendering programs are predicted from the known representative rendering programs obtained in the first loop; then K representative rendering programs are selected from the prediction results as candidate rendering programs according to the Pareto criterion.
The rendering cost $C_{total}$ of a representative rendering program is predicted by the following formula:
$C_{total} = N_v C_v t_v + N_p C_p t_p + N_a t_a$, (1)
$N_v$, $N_p$, and $N_a$ are, respectively, the number of vertices, the number of pixels updated by drawing the model (i.e. the number of pixels in the drawn area, obtained by a DirectX API query), and the total number of scalars that must be read in the rendering pipeline,
$C_v$ and $C_p$ are the numbers of instructions executed by the vertex shader and the pixel shader, respectively, and $t_v$, $t_p$, and $t_a$ are the unit times that the rendering program under optimization spends in the vertex processing stage, the pixel processing stage, and the rasterization stage, respectively.
The unit times are fitted to formula (1) during online optimization. The fit uses initial values of 1.0, 1.2, and 1200, respectively, together with several sets of known quantities (every quantity in formula (1) other than $t_v$, $t_p$, $t_a$), which are obtained in step (2-2) of the previous loop.
The drawing error $e_j$ of the j-th representative simplified program is predicted by the following formula:
Figure PCTCN2017078973-appb-000017
I is the total number of parent nodes of the j-th representative simplified program in the simplified dependency graph; $q_j$ and
Figure PCTCN2017078973-appb-000018
are the parameter influence vectors of the j-th and the $i_k$-th representative simplified programs, respectively; and
Figure PCTCN2017078973-appb-000019
is the drawing error between the representative rendering program corresponding to the $i_k$-th parent node and the j-th representative simplified program;
To compute
Figure PCTCN2017078973-appb-000020
the simplified programs (i.e. representative simplified programs) whose true drawing error has already been measured in step (2-2) are used. For a simplified program with known drawing error $e_p$, its parent nodes ($I_p$ in total) are found in the simplified dependency graph, and for each parent node the drawing error of its representative simplified program,
Figure PCTCN2017078973-appb-000021
is computed by the following formula:
Figure PCTCN2017078973-appb-000022
Figure PCTCN2017078973-appb-000022
where e_p is the rendering error of the representative rendering program corresponding to a known child node of the i_k-th parent node, q_p is the parameter influence vector of the p-th representative simplified program,
Figure PCTCN2017078973-appb-000023
is the parameter influence vector of the representative simplified program corresponding to the i_k-th parent node, in the simplified dependency graph, of the node of the p-th representative simplified program, and I_p denotes the total number of parent nodes of the node corresponding to the p-th representative simplified program in the simplified dependency graph.
(2-2) While the base program is drawing to the window, insert the rendering of the K selected candidate simplified programs, and compute and store the corresponding rendering errors and rendering costs;
During normal drawing of the model to the window, the rendering of the 4 selected candidate rendering programs is inserted, and the rendering error and rendering cost of each candidate rendering program are computed and collected.
When computing the rendering error and rendering cost:
For each candidate rendering program, render it, using the previously cached scene parameters, to a texture of the same size as the drawing window;
Generate the mipmap chain (MipMap) of this texture, and use a compute shader (Compute Shader) to compute the pixel error between this texture and the rendering result of the original rendering program at a specific mip level. The default setting is 5 levels.
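The mip-level comparison can be illustrated on the CPU. The following is a minimal sketch, not the compute-shader implementation of the embodiment: it assumes square grayscale images whose side is a power of two, a 2x2 box filter for building the mip chain, and mean absolute difference as the pixel error, none of which are specified in the text. The function names are hypothetical.

```python
def downsample(image):
    """Halve a square image (list of rows) with a 2x2 box filter."""
    half = len(image) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1]
              + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(half)]
            for y in range(half)]


def mip_level(image, level):
    """Return the requested mip level, stopping early at a 1x1 image."""
    for _ in range(level):
        if len(image) == 1:
            break
        image = downsample(image)
    return image


def pixel_error(candidate, reference, level):
    """Mean absolute difference between two images at the given mip level."""
    a = mip_level(candidate, level)
    b = mip_level(reference, level)
    count = len(a) * len(a[0])
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / count
```

Comparing at a coarser level averages out high-frequency differences: a checkerboard of 0 and 1 differs from a uniform 0.5 image at level 0 but matches it exactly at level 1.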
Because the CPU and GPU do not execute in lockstep, the time and error computed for each simplified program can only be read back after waiting several frames. These data are stored on the corresponding simplified node and used to compute
Figure PCTCN2017078973-appb-000024
and t_v, t_p, t_a. The data are usually read back after waiting 3 to 7 frames; this embodiment waits 5 frames.
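The delayed readback can be sketched as a small queue that releases a measurement only once enough frames have passed. This is an illustrative sketch under assumptions: the class name `DeferredReadback` and the `fetch` callable, which stands in for whatever GPU query actually returns the error and time, are hypothetical.

```python
from collections import deque


class DeferredReadback:
    """Hold measurements until `delay_frames` frames have elapsed."""

    def __init__(self, delay_frames=5):
        self.delay_frames = delay_frames
        self.pending = deque()  # entries: (frame_issued, program_id, fetch)
        self.results = {}       # program_id -> value returned by fetch()

    def submit(self, frame, program_id, fetch):
        """Register a measurement issued at `frame`; `fetch` reads it later."""
        self.pending.append((frame, program_id, fetch))

    def poll(self, frame):
        """Read back every measurement that is old enough; return their ids."""
        ready = []
        while self.pending and frame - self.pending[0][0] >= self.delay_frames:
            _, program_id, fetch = self.pending.popleft()
            self.results[program_id] = fetch()
            ready.append(program_id)
        return ready
```

Calling `poll` once per frame keeps the render loop from stalling on results the GPU has not produced yet.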
(2-3) For any candidate simplified program:
If E1 < E_max and T1 < T0, update the base program and end the current round of online optimization;
Otherwise, do not update the base program, and proceed as follows:
If the base program has not been updated for L consecutive loop iterations, stop the current round of online optimization;
Otherwise, return to step (2-1) and continue with the next loop iteration;
where E1 and T1 denote the rendering error and rendering cost of the simplified program, respectively, E_max denotes the maximum allowed error (set as needed), and T0 denotes the rendering cost of the current base program.
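The control flow of steps (2-1) to (2-3) can be sketched as follows. This is a simplified illustration, not the embodiment's code: `select_candidates` and `measure` are hypothetical callables standing in for steps (2-1) and (2-2), and picking the cheapest program when several candidates pass the test is an assumption, since the text only requires that E1 < E_max and T1 < T0.

```python
def run_optimization_round(base, base_cost, select_candidates, measure,
                           e_max, max_idle_loops):
    """Run one round of online optimization; return (program, cost).

    select_candidates(base) -> list of candidate programs   (step 2-1)
    measure(program)        -> (rendering_error, cost)      (step 2-2)
    """
    idle_loops = 0
    while idle_loops < max_idle_loops:
        candidates = select_candidates(base)
        measured = [(c,) + tuple(measure(c)) for c in candidates]
        # Step (2-3): keep candidates with E1 < E_max and T1 < T0.
        accepted = [m for m in measured if m[1] < e_max and m[2] < base_cost]
        if accepted:
            program, _, cost = min(accepted, key=lambda m: m[2])
            return program, cost  # base program updated, round ends
        idle_loops += 1           # no update in this loop iteration
    return base, base_cost        # L idle iterations: stop the round
```

Here `max_idle_loops` plays the role of L, the number of consecutive iterations without an update after which the round stops.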
In this embodiment, the 4 simplified programs that were evaluated include one that is better than the current base program, so the base program is updated and optimization continues by repeating steps (2-1) to (2-3). In the subsequent loop iterations, the K candidate rendering programs are selected with a search strategy based on the predicted rendering error and time.
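Claim 7 states that, after the first iteration, the K candidates are chosen from the predictions according to the Pareto rule. A minimal sketch of one reading of that rule is given below, assuming it means keeping programs that are not dominated in both predicted cost and predicted error. The additional filter on `e_max` and the preference for the cheapest programs on the front are assumptions, as are the function names and the tuple layout.

```python
def pareto_front(programs):
    """Keep programs not dominated in (predicted_cost, predicted_error).

    programs: list of (name, predicted_cost, predicted_error) tuples.
    """
    front = []
    for name, cost, error in programs:
        dominated = any(
            c <= cost and e <= error and (c < cost or e < error)
            for _, c, e in programs)
        if not dominated:
            front.append((name, cost, error))
    return front


def select_candidates(programs, k, e_max):
    """Pick up to k front members with predicted error below e_max."""
    usable = [p for p in pareto_front(programs) if p[2] < e_max]
    usable.sort(key=lambda p: p[1])  # cheapest predicted cost first
    return [name for name, _, _ in usable[:k]]
```

For example, a program with predicted cost 5 and error 0.02 is dropped when another program has cost 4 and error 0.01, because the latter is better on both axes.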
The above is only a preferred embodiment of the present invention, and the scope of protection of the invention is not limited to this embodiment; all technical solutions that follow the principles of the invention fall within its scope of protection. Improvements and refinements made by those skilled in the art without departing from the principles of the invention shall also be regarded as falling within the scope of protection of the invention.

Claims (9)

  1. An online simplification method for a rendering program, characterized by comprising the following steps:
    (1) preprocessing the original rendering program as follows:
    (1-1) pre-simplifying the original rendering program with different rendering-program simplification rules to obtain a number of simplified programs, and computing the rendering cost of each simplified rendering program;
    (1-2) determining the dependencies among the original rendering program and the simplified programs according to the simplification rules used by each simplified program;
    (1-3) computing the parameter influence vector and rendering cost of each simplified rendering program, the parameter influence vector being the vector formed by the influence values of all input parameters of the corresponding rendering program on the computed results in that rendering program;
    (1-4) selecting, by clustering according to the rendering cost and parameter influence vector of the simplified programs, a number of the simplified rendering programs as representative simplified programs, and generating a simplified dependency graph from the original rendering program and all representative simplified programs according to the dependencies;
    (2) during rendering, taking the program currently drawn to the window as the base program and monitoring the parameters of the scene to be rendered; when they change drastically, starting a new round of online optimization, each round looping over the following operations:
    (2-1) selecting K of the representative simplified programs as candidate simplified programs according to the simplified dependency graph;
    (2-2) while the base program is drawing to the window, inserting the rendering of the K selected candidate simplified programs, and computing and storing the corresponding rendering errors and rendering costs;
    (2-3) for any candidate simplified program:
    if E1 < E_max and T1 < T0, updating the base program and ending the current round of online optimization;
    otherwise, not updating the base program, and proceeding as follows:
    if the base program has not been updated for several consecutive loop iterations, stopping the online optimization;
    otherwise, returning to step (2-1) and continuing;
    where E1 and T1 denote the rendering error and rendering cost of the simplified program, respectively, E_max denotes the maximum allowed error, and T0 denotes the rendering cost of the current base program.
  2. The online simplification method for a rendering program according to claim 1, characterized in that, in step (1-3), the parameter influence vector and rendering cost of any simplified program are computed as follows:
    obtaining, for each simplified program, the number of scalar instructions in the vertex shader and the pixel shader, the number of floating-point values required by the interface between the vertex shader and the pixel shader, and the influence value of each input parameter's value on the corresponding computed result in the rendering program, and estimating the rendering cost of the simplified program from the scalar instruction counts.
  3. The online simplification method for a rendering program according to claim 1, characterized in that, in step (1-3), the rendering cost of any simplified program is computed according to the following formula:
    C_total = W_v C_v + W_f C_f + W_a N_a,
    where C_v and C_f are the numbers of scalar instructions in the vertex shader and the pixel shader, respectively, N_a is the number of floating-point values required by the interface of the pixel shader, W_v and W_f are the weights of the scalar instruction counts in the vertex shader and the pixel shader, respectively, and W_a is the weight of the floating-point values required by the interface of the pixel shader.
  4. The online rendering program optimization method according to claim 1, characterized in that, in step (1-3), the influence value q_k of the k-th input parameter on the computed results in the rendering program is calculated according to the following formula:
    Figure PCTCN2017078973-appb-100001
    where N is the total number of simplification rules applied that correspond to the k-th input parameter,
    i denotes the i-th expression or statement related to the k-th input parameter,
    Figure PCTCN2017078973-appb-100002
    is defined as the influence of the k-th parameter on the i-th expression or statement, and
    w_i is the number of scalars in the i-th expression or statement.
  5. The online simplification method for a rendering program according to claim 1, characterized in that, in step (1-4), a number of the simplified programs are selected as representative simplified programs as follows:
    (S1-41) dividing all simplified programs into N groups according to their rendering cost;
    (S1-42) for each group, clustering the group into M classes with K-means clustering, the distance function used in clustering being the dot product of the parameter influence vectors of two simplified programs;
    (S1-43) for each class, selecting as its representative the simplified program closest to the class center according to the distance function.
  6. The online simplification method for a rendering program according to claim 1, characterized in that, in the first loop iteration of each round of online optimization, the search is performed as follows:
    starting from the original rendering program, and according to the connectivity of the dependency graph, selecting the K lowest-cost representative rendering programs within the L-neighborhood of the original rendering program as candidate rendering programs.
  7. The online simplification method for a rendering program according to claim 1, characterized in that, in the loop iterations after the first one in each round of online optimization, the search is performed as follows:
    first, predicting the rendering cost and rendering error of the remaining representative rendering programs from the known representative rendering programs obtained in the first loop iteration; then, selecting K representative rendering programs as candidate rendering programs from the prediction results according to the Pareto rule.
  8. The online simplification method for a rendering program according to claim 7, characterized in that the rendering cost C_total of a representative rendering program is predicted according to the following formula:
    C_total = N_v C_v t_v + N_p C_p t_p + N_a t_a,  (1)
    where N_v, N_p, and N_a denote, respectively, the number of vertices, the number of pixels updated when drawing the model, and the total number of scalars that need to be read in the rendering pipeline,
    C_v and C_p denote the numbers of instructions executed by the vertex shader and the pixel shader, respectively, and
    t_v, t_p, and t_a denote the unit times spent by the rendering program being optimized in the vertex processing stage, the pixel processing stage, and the rasterization stage, respectively.
  9. The online simplification method for a rendering program according to claim 7, characterized in that the rendering error e_j of the j-th representative simplified program is predicted by the following formula:
    Figure PCTCN2017078973-appb-100003
    where I_j denotes the total number of parent nodes of the node corresponding to the j-th representative simplified program in the simplified dependency graph,
    q_j and q_ik are the parameter influence vectors of the representative simplified programs corresponding to the j-th node and the i_k-th parent node, respectively, and
    Figure PCTCN2017078973-appb-100004
    denotes the rendering error of the representative rendering program corresponding to the i_k-th parent node.
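The selection of representatives described in claim 5 can be sketched as follows. This is a rough illustration under several assumptions that the claim does not settle: groups are formed by sorting on cost and cutting into near-equal chunks, the dot product is treated as a similarity (a larger value means closer), the parameter influence vectors are unit length, and the K-means centers are seeded with the first M members of each group. All names and the dictionary layout are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def split_by_cost(programs, n_groups):
    """(S1-41) Sort by cost and cut into n_groups chunks of near-equal size."""
    if not programs:
        return []
    ordered = sorted(programs, key=lambda p: p["cost"])
    size = -(-len(ordered) // n_groups)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def cluster(group, m, iterations=10):
    """(S1-42) K-means-style clustering with the dot product as similarity."""
    centers = [p["q"] for p in group[:m]]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in group:
            best = max(range(len(centers)), key=lambda c: dot(p["q"], centers[c]))
            clusters[best].append(p)
        # New center = component-wise mean; an empty class keeps its old center.
        centers = [
            [sum(col) / len(members) for col in zip(*(p["q"] for p in members))]
            if members else centers[i]
            for i, members in enumerate(clusters)]
    return [(centers[i], members) for i, members in enumerate(clusters) if members]


def pick_representatives(programs, n_groups, m):
    """(S1-43) In each class, pick the member most similar to the center."""
    reps = []
    for group in split_by_cost(programs, n_groups):
        for center, members in cluster(group, m):
            reps.append(max(members, key=lambda p: dot(p["q"], center)))
    return reps
```

Each program here is a dictionary with a `cost` and a parameter influence vector `q`; with N groups and M classes per group, at most N x M representatives are returned.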
PCT/CN2017/078973 2016-04-21 2017-03-31 Rendering program online optimisation method WO2017181837A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610256550.9 2016-04-21
CN201610256550.9A CN105976421B (en) 2016-04-21 2016-04-21 A kind of method for on-line optimization of rendering program

Publications (1)

Publication Number Publication Date
WO2017181837A1 true WO2017181837A1 (en) 2017-10-26

Family

ID=56994617

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/078973 WO2017181837A1 (en) 2016-04-21 2017-03-31 Rendering program online optimisation method

Country Status (2)

Country Link
CN (1) CN105976421B (en)
WO (1) WO2017181837A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109887061A (en) * 2019-02-19 2019-06-14 青岛海信电器股份有限公司 Scene rendering method, apparatus and equipment
CN110322540A (en) * 2019-07-09 2019-10-11 北京电影学院 Ink analogy method is interacted with what GPU optimization rendered based on hydrodynamics
CN114418917A (en) * 2022-03-11 2022-04-29 腾讯科技(深圳)有限公司 Data processing method, device, equipment and storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105976421B (en) * 2016-04-21 2018-06-19 浙江大学 A kind of method for on-line optimization of rendering program
CN111179150B (en) * 2019-12-27 2020-11-03 浙江大学 Shader automatic simplification method and system based on drawing instruction stream
US11144289B1 (en) 2020-05-19 2021-10-12 International Business Machines Corporation Dynamic automation of selection of pipeline artifacts

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6064819A (en) * 1993-12-08 2000-05-16 Imec Control flow and memory management optimization
US20060202989A1 (en) * 2005-03-10 2006-09-14 Bracco Imaging, S.P.A. Systems and methods to optimize volumetric rendering of a region of interest ("Tension Vectors")
US7768523B2 (en) * 2006-03-09 2010-08-03 Microsoft Corporation Shading using texture space lighting and non-linearly optimized MIP-maps
CN102682461A (en) * 2012-04-28 2012-09-19 Tcl集团股份有限公司 Animation rendering method, animation rendering system and animation player
CN102945558A (en) * 2012-10-17 2013-02-27 沈阳创达技术交易市场有限公司 Optimizing method of high model rendering
CN104537706A (en) * 2014-07-31 2015-04-22 浙江大学 Shader simplifying method, shader simplifying device and graphic rendering method based on code motion
CN105976421A (en) * 2016-04-21 2016-09-28 浙江大学 Rendering program online optimization method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103236082B (en) * 2013-04-27 2015-12-02 南京邮电大学 Towards the accurate three-dimensional rebuilding method of two-dimensional video of catching static scene
GB2515343B (en) * 2013-06-21 2018-02-07 Toshiba Res Europe Limited Methods and systems for generating a three dimensional representation of a subject

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109887061A (en) * 2019-02-19 2019-06-14 青岛海信电器股份有限公司 Scene rendering method, apparatus and equipment
CN110322540A (en) * 2019-07-09 2019-10-11 北京电影学院 Ink analogy method is interacted with what GPU optimization rendered based on hydrodynamics
CN114418917A (en) * 2022-03-11 2022-04-29 腾讯科技(深圳)有限公司 Data processing method, device, equipment and storage medium
CN114418917B (en) * 2022-03-11 2022-06-21 腾讯科技(深圳)有限公司 Data processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN105976421B (en) 2018-06-19
CN105976421A (en) 2016-09-28

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17785322

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17785322

Country of ref document: EP

Kind code of ref document: A1