This thing can get called several thousand times per LUT
so seems like we want to inline it to:
- avoid the function call overhead
- allow constant folding
A quick synthetic test (w/o any hardware interaction) with
a ridiculously large LUT size shows about 50% reduction in
runtime on my HSW and BSW boxes. Slightly less with more
reasonable LUT size but still easily measurable in tens
of microseconds.
v2: Include drm_color_mgmt.h in the .rst (Daniel)
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108135654.12907-1-ville.syrjala@linux.intel.com