Dynamic sampling rate: harnessing frame coherence in graphics applications for energy-efficient GPUs.

Fragment Shading; GPU; Sampling; Tile-Based Rendering

Journal

The Journal of supercomputing
ISSN: 0920-8542
Abbreviated title: J Supercomput
Country: United States
NLM ID: 9889997

Publication information

Publication date:
2022
History:
accepted: 26 02 2022
entrez: 15 8 2022
pubmed: 16 8 2022
medline: 16 8 2022
Status: ppublish

Abstract

In real-time rendering, a 3D scene is modelled with meshes of triangles that the GPU projects onto the screen. The triangles are discretized by sampling each one at regular spatial intervals to generate fragments, to which a shader program then applies texture and lighting effects. Realistic scenes require detailed geometric models, complex shaders, high-resolution displays and high screen refresh rates, all of which come at a high cost in compute time and energy. This cost is often dominated by the fragment shader, which runs once for each sampled fragment. Conventional GPUs sample the triangles once per pixel; however, many screen regions contain so little variation that they produce identical fragments, and these regions could be sampled at a lower-than-pixel rate with no loss in quality. Additionally, because temporal frame coherence makes consecutive frames very similar, the level of variation in a region is usually maintained from frame to frame. This work proposes Dynamic Sampling Rate (DSR), a novel hardware mechanism to reduce redundancy and improve energy efficiency in graphics applications. DSR analyzes the spatial frequencies of the scene once it has been rendered. Then, it leverages the temporal coherence between consecutive frames to decide, for each region of the screen, the lowest sampling rate to employ in the next frame that still maintains image quality. We evaluate the performance of a state-of-the-art mobile GPU architecture extended with DSR on a wide variety of applications. Experimental results show that DSR removes most of the redundancy inherent in the color computations at fragment granularity, which yields an average speedup of 1.68x and energy savings of 40%.
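The per-region decision described in the abstract can be illustrated with a small software sketch. This is not the hardware mechanism from the paper: the abstract does not detail how DSR analyzes spatial frequencies, so the sketch below substitutes a simple reconstruction-error test on the previous frame. All names (`reconstruction_error`, `choose_sampling_steps`), the 4x4 tile size, the candidate steps and the zero tolerance are hypothetical choices made for illustration, and the frame is assumed to be grayscale with dimensions that are multiples of the tile size.

```python
from typing import List, Sequence

Frame = List[List[float]]  # grayscale frame, one value per pixel (assumption)


def reconstruction_error(frame: Frame, x0: int, y0: int, size: int, step: int) -> float:
    """Largest per-pixel error if the tile at (x0, y0) were sampled once per
    step x step block and that sample were replicated over the whole block."""
    worst = 0.0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            # Coordinates of the single sample that would cover pixel (x, y).
            sy = y0 + ((y - y0) // step) * step
            sx = x0 + ((x - x0) // step) * step
            worst = max(worst, abs(frame[y][x] - frame[sy][sx]))
    return worst


def choose_sampling_steps(
    prev_frame: Frame,
    tile_size: int = 4,               # hypothetical tile size
    tolerance: float = 0.0,           # hypothetical quality threshold
    steps: Sequence[int] = (4, 2, 1), # coarsest first; must end with 1
) -> List[List[int]]:
    """For each tile, pick the coarsest step whose reconstruction of the
    previous frame stays within tolerance. Relying on frame coherence, that
    step is then used as the tile's sampling rate for the next frame."""
    height, width = len(prev_frame), len(prev_frame[0])
    plan = []
    for y0 in range(0, height, tile_size):
        row = []
        for x0 in range(0, width, tile_size):
            for step in steps:
                if reconstruction_error(prev_frame, x0, y0, tile_size, step) <= tolerance:
                    row.append(step)
                    break  # step 1 always has zero error, so a step is always chosen
        plan.append(row)
    return plan


# One flat tile (left) next to one checkerboard tile (right).
frame = [[0.5] * 4 + [float((x + y) % 2) for x in range(4)] for y in range(4)]
plan = choose_sampling_steps(frame)
print(plan)  # [[4, 1]]
```

In this toy frame the flat tile can be shaded with a single sample, while the checkerboard tile keeps the conventional one sample per pixel, so 17 fragments would be shaded instead of 32.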

Identifiers

pubmed: 35966445
doi: 10.1007/s11227-022-04413-7
pii: 4413
pmc: PMC9360083

Publication types

Journal Article

Languages

eng

Pagination

14940-14964

Copyright information

© The Author(s) 2022.

Authors

Martí Anglada (M)

Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Jordi Girona 1-3, Barcelona, 08034 Spain.

Enrique de Lucas (E)

Imagination Technologies, Imagination House, King's Langley, WD4 8LZ UK.

Joan-Manuel Parcerisa (JM)

Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Jordi Girona 1-3, Barcelona, 08034 Spain.

Juan L Aragón (JL)

Dept. Ingeniería y Tecnología de Computadores, Universidad de Murcia, Campus de Espinardo, Murcia, 30100 Spain.

Antonio González (A)

Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Jordi Girona 1-3, Barcelona, 08034 Spain.

MeSH classifications