image processing is done in a DAG with potentially multiple sources and multiple sinks. the graph does not strictly have to be acyclic: we allow feedback connectors for iterative/multi-frame execution.
what follows is the reference documentation of available modules, grouped into categories. also see the list of presets shipped with vkdt.
input
- i-bc1: input module for bc1-compressed thumbnails
- i-exr: input openexr images
- i-jpg: jpg input module
- i-jpglst: input a longer list of jpg as an array connector
- i-lut: half float lut input module
- i-mcraw: read motioncam raw video
- i-mlv: magic lantern raw video input module
- i-pfm: 32-bit floating point map input module
- i-raw: input module for raw-format photographic stills or timelapses
- i-v4l2: webcam input module
- i-vid: video input module
output
- o-bc1: write bc1 compressed thumbnail files
- o-copy: copy the input file to a new destination
- o-exr: write openexr image files
- o-jpg: write jpeg compressed still image
- o-lut: write varying precision multi channel luts
- o-null: write absolutely nothing
- o-pfm: write uncompressed 32-bit floating point image
- o-vid: write h264 or prores compressed video streams
- o-web: transitional module for jpg and mp4 for webpages
- loss: compute loss for optimisation
visualisation and inspection
- ab: a/b images in split screen
- check: mark out of gamut and under- and overexposure
- ciediag: vectorscope diagram in cie chromaticity space
- display: generic display sink node
- hist: waveform histogram
- pick: colour picker and visualisation tool
- rawhist: raw histogram with estimated noise levels
- test10b: render a gradient prone to banding to test 10 bit displays and dithering
- vis: convert linear input to srgb colour ramp for visualisation
- y2rgb: visualise first channel in grey scale
raw processing
- ca: correct chromatic aberrations
- demosaic: demosaic bayer or x-trans raw files
- hilite: highlight reconstruction based on local inpainting
- jddcnn: joint demosaicing and denoising via neural network
colour processing
corrective
- autoexp: smooth auto exposure of video sequences
- crop: crop/rotate/perspective correction
- deconv: deconvolution sharpening
- denoise: noise reduction based on edge-aware wavelets and noise profiles
- hotpx: remove impulse noise/stuck pixels
- kpn: kernel prediction neural network for denoising
- lens: lens distortion correction
- negative: invert film negatives
- usm: unsharp masking sharpening
tone
- contrast: local contrast enhancement using the guided filter
- eq: local contrast equaliser
- exposure: simple exposure correction, useful for dodging/burning
- filmcurv: display transform curve
- grad: linear gradient density filter
- llap: local contrast, shadow lifting, and highlight compression via local laplacian pyramids
- vignette: add/remove parametric vignette
- zones: zone system-like tone manipulation tool
retouching
- draw: draw raster masks via brush strokes (e.g. for dodging and burning)
- guided: guided filter blur module, useful for refining drawn masks
- inpaint: smooth reconstruction of masked out areas
- mask: create parametric masks from colour images for use with blending
- wavelet: skin retouching
effects
technical
- align: align animation frames or burst photographs
- blend: masked frame blending
- cnngenin: generate random input for neural network training
- colenc: encode colour for colour managed output like adobeRGB, P3, etc
- format: change texture format (number of channels and data type)
- f2srgb: convert linear floating point data to 8-bit sRGB for output
- kpn-t: kernel prediction neural network for denoising, training
- mv2rot: estimate rotation + translation from motion vectors
- resize: add ability to resize buffers
- resnet: gmic convolutional neural network
- srgb2f: convert sRGB input to linear rec2020 floating point
3d rendering
- accum: accumulate frames in a frame buffer
- bvh: append triangle mesh to ray tracing acceleration structure
- i-obj: load triangle meshes from wavefront obj files
- quake: the 1996 game ray traced based on QSS
- rt: real-time ray tracing
- spheres: shadertoy demo ported for testing
- sss: sub surface scattering testbed
- svgf: spatiotemporal variance guided filtering
internal use
by default, a raw image is passed through the following pipeline:
- denoise: (this module also subtracts the black point and removes black borders which are otherwise used for noise estimation)
- hilite: reconstruct highlights on raw data
- demosaic: interpolate three colour channels for every pixel
- crop: grab exif orientation and allow crop/perspective correction
- colour: apply white balance, colour matrix, and gamut corrections. optionally apply fully profiled RBF for colour correction.
- filmcurv: film style tone curve
- llap: local contrast, shadows, and highlights
you can change the default pipeline by hacking default-darkroom.i-raw (either in the vkdt basedir or the homedir) for darkroom mode and default.i-raw for thumbnails. the i-raw suffix indicates that the file will be used for raw input; there is also the equivalent i-mlv version for raw video.
note that most strings here (parameter and connector names etc) are stored as dt_token_t, which is exactly 8 bytes long. this means we can very easily express the parameters and corresponding history stacks in a binary format, which is useful for larger parameter sets such as used for the vertices in the draw module.
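to illustrate why a fixed 8-byte token is convenient, here is a minimal sketch that packs a short ascii string into a single 64-bit integer and back. the function names, the zero padding, and the little-endian byte order are assumptions made for this example and are not taken from the vkdt sources.

```python
def token(name: str) -> int:
    """pack an ascii string of at most 8 characters into a 64-bit integer."""
    raw = name.encode("ascii")
    if len(raw) > 8:
        raise ValueError(f"token {name!r} is longer than 8 bytes")
    # pad with zero bytes so that every token occupies exactly 8 bytes
    # (padding and byte order are assumptions of this sketch)
    return int.from_bytes(raw.ljust(8, b"\0"), "little")


def token_str(tkn: int) -> str:
    """recover the string from a packed token."""
    return tkn.to_bytes(8, "little").rstrip(b"\0").decode("ascii")
```

since every name has the same size, a list of names can be written to a binary file as a flat array of 64-bit words, and two names can be compared with a single integer comparison.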
a module can also express compile time dependencies (if you're including glsl files or if you have a main.c to be compiled). this is build-time only and not needed to run vkdt.
each module defines its io connectors, for instance
input:read:rgba:f16
output:write:rgba:f16
this defines one connector called input and one called output, both in rgba with f16 format.
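the four colon-separated fields of such a line can be split mechanically. the following is a small sketch of a parser for this line format; the Connector type and parse_connector function are hypothetical names for illustration and do not mirror vkdt's own c implementation.

```python
from typing import NamedTuple


class Connector(NamedTuple):
    name: str      # arbitrary identifier, e.g. "input"
    type: str      # e.g. "read" or "write"
    channels: str  # one character per channel, e.g. "rgba"
    format: str    # e.g. "f16"


def parse_connector(line: str) -> Connector:
    """split one connector line of the form name:type:channels:format."""
    fields = line.strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    return Connector(*fields)
```

for example, parse_connector("input:read:rgba:f16") yields a connector whose channels field is "rgba" and whose format field is "f16".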
the specifics of a connector are name, type, channels, and format. name is an arbitrary identifier for your perusal. note however that input and output trigger special conventions for default callbacks wrt region of interest or propagation of image parameters (module->img_param).
the type is one of read, write, source, sink. sources and sinks do not have compute shaders associated with them, but will call read_source and write_sink callbacks you can define in a custom main.c piece of code.
the channels can be anything you want, but the GPU only supports one, two, or four channels per pixel. these are represented by one char each, and will be matched during connection of modules.
format can be one of the primitive ui8, ui16, ui32, f16, f32, or one of the special formats dspy and atom. dspy evaluates to the display capabilities (may be a 10 bits/channel special format). atom evaluates to f32 if supported, or else falls back to ui32 (many amd cards).
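as a rough sketch of how the special formats could be resolved to concrete ones, consider the following. the byte sizes are inferred from the format names, and the atom_f32_supported and display_format arguments are hypothetical stand-ins for whatever the device and display actually report.

```python
# bytes per channel, inferred from the format names (assumption of this sketch)
PRIMITIVE_BYTES = {"ui8": 1, "ui16": 2, "ui32": 4, "f16": 2, "f32": 4}


def resolve_format(fmt: str, atom_f32_supported: bool, display_format: str) -> str:
    """replace the special formats atom and dspy by a concrete format."""
    if fmt == "atom":
        # f32 if supported, else fall back to ui32
        return "f32" if atom_f32_supported else "ui32"
    if fmt == "dspy":
        # whatever the display asks for, supplied by the caller here
        return display_format
    if fmt not in PRIMITIVE_BYTES:
        raise ValueError(f"unknown format {fmt!r}")
    return fmt


def bytes_per_pixel(channels: str, fmt: str) -> int:
    """size of one pixel for a primitive format; one character per channel."""
    if len(channels) not in (1, 2, 4):
        raise ValueError("only one, two, or four channels are supported")
    return len(channels) * PRIMITIVE_BYTES[fmt]
```

with these assumptions, an rgba:f16 buffer takes 8 bytes per pixel, and an atom connector on a device without f32 support resolves to ui32.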
when connecting two modules, the connectors will be tested for compatibility. some modules can handle more than one specific configuration of channels and formats, so they specify a wildcard * instead. if both reader and writer specify *, vkdt defaults to rgba:f16.
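the compatibility test can be pictured as follows. this is a sketch only: it assumes that the wildcard is resolved per field (channels and format separately) and that a single wildcard simply adopts the value of the other side. the text above only states the case where both sides are wildcards.

```python
def match_field(writer: str, reader: str, default: str):
    """return the agreed value for one field, or None if incompatible."""
    if writer == "*" and reader == "*":
        return default          # both sides flexible: use the default
    if writer == "*":
        return reader           # assumption: wildcard adopts the other side
    if reader == "*":
        return writer
    return writer if writer == reader else None


def connect(writer, reader):
    """writer and reader are (channels, format) pairs; returns the resolved
    pair, or None if the two connectors are not compatible."""
    chan = match_field(writer[0], reader[0], "rgba")
    fmt = match_field(writer[1], reader[1], "f16")
    if chan is None or fmt is None:
        return None
    return chan, fmt
```

for instance, connecting two fully wildcarded connectors gives ("rgba", "f16"), while connecting a y:f16 writer to an rgba:f16 reader is rejected.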
there is one more special case, modules can reference the channel or format of a previously connected connector on the same module, for instance the blend module:
back:read:*:*
input:read:*:*
mask:read:y:f16
output:write:&input:f16
here, &input references the channel configuration of the input connector and configures the output connector to match it. note that this requires connecting input before output.
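the ordering requirement becomes obvious when the reference is resolved in code. the sketch below takes the connectors of one module in file order, each as a (channels, format) pair, and substitutes & references; resolve_references is a hypothetical helper and not part of vkdt.

```python
def resolve_references(connectors):
    """connectors: dict mapping name -> (channels, format), in file order.
    returns a new dict with &name references replaced by the referenced value."""
    resolved = {}
    for name, fields in connectors.items():
        out = []
        for idx, value in enumerate(fields):
            if value.startswith("&"):
                target = value[1:]
                # the referenced connector must already be connected, i.e.
                # known and no longer a wildcard
                if target not in resolved or resolved[target][idx] == "*":
                    raise ValueError(f"connect {target!r} before {name!r}")
                value = resolved[target][idx]
            out.append(value)
        resolved[name] = tuple(out)
    return resolved
```

if the input connector of the blend module has been connected as rgb:f16, the output connector resolves to rgb:f16 as well. if input is still a wildcard, the sketch raises an error instead of guessing.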
a module also defines the parameters that can be set in the cfg files and which will be routed to the compute shaders as uniforms. for instance
sigma:float:1:0.12
shadows:float:1:1.0
hilights:float:1:1.0
clarity:float:1:0.0
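a line like the ones above can be split into its fields as sketched below. the meaning of the fields is an assumption here, read off the example as name, type, element count, and default value(s); parse_param is a hypothetical helper for illustration.

```python
def parse_param(line: str) -> dict:
    """split a parameter line, assuming name:type:count:default[:default...]."""
    name, ptype, count, *defaults = line.strip().split(":")
    # convert defaults according to the declared type, keep strings otherwise
    convert = {"float": float, "int": int}.get(ptype, str)
    return {
        "name": name,
        "type": ptype,
        "count": int(count),
        "defaults": [convert(v) for v in defaults],
    }
```

under this reading, "sigma:float:1:0.12" describes a single float named sigma with a default of 0.12.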
a module also defines the mapping of parameters to ui widgets. i recommend you look through existing examples to get a sense of the format. the ui is programmed in c++, the modules in c, and the processing is done in glsl. this way there is an extremely clear separation of algorithms, module logic, and gui.
the format is in general, one per line:
<name of parameter>:<widget>:<special info for widget>
the ui supports the following widgets
- slider: takes min:max as range argument
- vslider: a vertical slider. takes min:max as range argument
- callback: a special button that triggers the ui_callback function in the module
- colour
- combo: a combobox. takes the :-separated list of string entries as argument
- crop: the crop tool of the crop/rotate module
- draw: draw brush strokes with the mouse/pen tablet
- filename: a file name
- grab: grab the keyboard and mouse and pass to the module
- group: a special directive, see below
- hidden: do not show the parameter in the ui
- pers: the perspective correction tool of the crop/rotate module
- pick: a special colour picker widget (use mouse to draw rect on image)
- print: print a string using the parameter
- rbmap: a special colour widget for radial basis function mapping
- rgb: three sliders (red green blue) which show the colour in the ui background. takes the usual range argument min:max.
- straight: the straighten tool of the crop/rotate module
the group keyword is used to only show the following ui elements if a parameter switch is set accordingly. syntax: group:<param>:<val>. for an example, see the colour module. here, <param> refers to an int parameter driven by a combo box, and we use it to only show the colour temperature slider if the matrix mode is set to colour lookup table clut:
matrix:combo:rec2020:image:XYZ:rec709:clut
group:matrix:4
temp:slider:2856:6504
everything after the group directive until the next one will be shown only if the matrix parameter is set to 4.
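the effect of the group directive can be sketched as a small filter over the ui lines. this is an illustration of the rule described above, not vkdt's gui code; visible_widgets is a hypothetical name, and the sketch assumes combo entries are counted from zero, which matches clut being entry 4 in the example.

```python
def visible_widgets(ui_lines, values):
    """return the parameter names whose widgets are shown.
    ui_lines: lines of the form <name>:<widget>:<args...>
    values:   dict mapping parameter name -> current integer value"""
    visible = []
    show = True  # widgets before the first group directive are always shown
    for line in ui_lines:
        name, second, *args = line.strip().split(":")
        if name == "group":
            # group:<param>:<val> switches visibility for the lines that follow
            param, val = second, int(args[0])
            show = values.get(param) == val
            continue
        if show:
            visible.append(name)
    return visible
```

with the three lines of the example, the temp slider is listed only while matrix is 4, and the matrix combo box itself is always listed.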