I’m looking into implementing high-quality image-processing operations with TF, for example a higher-quality downsampling method such as Lanczos, expressed as a TF model. If you know of references to this sort of work, please point me to them.
For example, a basic Gaussian blur can be implemented by passing a custom-width kernel to tf.conv2d() (I’m using TFJS). This works great, but has the expected issues along the image boundary. Production-quality image-processing tools solve this edge problem in one of a few ways, typically by zeroing the kernel weights that fall outside the image and renormalizing the remaining weights so they still sum to one. However, I’m not experienced enough with TF to know how to apply different kernels along the image boundaries.
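From reading around, I believe the renormalization can be done without literally building a different kernel per pixel: blur a constant image with the same kernel and divide by the result, so the edge weights effectively re-sum to one. Here’s a minimal, untested sketch of that idea for a single-channel image (blurWithEdgeRenorm is just a name I made up):

import * as tf from '@tensorflow/tfjs'

// img: rank-4 [1, height, width, 1] tensor; kernel: [k, k, 1, 1] normalized blur kernel
const blurWithEdgeRenorm = (img, kernel) => {
  const blurred = tf.conv2d(img, kernel, 1, 'same')               // edges blend with zero here
  const coverage = tf.conv2d(tf.onesLike(img), kernel, 1, 'same') // fraction of the kernel inside the image
  return tf.div(blurred, coverage)                                // undo the zero-padding falloff at the edges
}

If I understand correctly, dividing by the blurred ones-image is mathematically the same as renormalizing the in-image kernel weights at each pixel.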
Can anyone provide some tips?
For more context, here’s code that does a simple NxN Gaussian blur without handling the borders. I’d love to figure out how to enhance it to use different kernels along the boundary rows and columns so the edges are handled properly (i.e. not blended with zero).
import * as tf from '@tensorflow/tfjs'

// Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0.
const lanczos = (x, a) => {
  if (x === 0) return 1
  if (Math.abs(x) < a) {
    return (a * Math.sin(Math.PI * x) * Math.sin(Math.PI * (x / a))) / (Math.PI * Math.PI * x * x)
  }
  return 0
}
// Gaussian with standard deviation sigma; effectively zero beyond ~±3σ.
const gaussian = (x, sigma = 1) => {
  const C = 1 / Math.sqrt(2 * Math.PI * sigma * sigma)
  const k = -(x * x) / (2 * sigma * sigma)
  return C * Math.exp(k)
}
const filters = {
  Lanczos3: x => lanczos(x, 3),
  Lanczos2: x => lanczos(x, 2),
  Gaussian: x => gaussian(x, 1),
  Bilinear: () => 1, // placeholder: constant (box) weights for now
  Nearest: () => 1,  // placeholder: constant (box) weights for now
}
// Build a (2*size+1) x (2*size+1) kernel shaped [height, width, inChannels, 1]
// for tf.depthwiseConv2d, so each RGB channel is blurred independently,
// then normalize so the weights sum to 1 per channel.
const normalizedValues = (size, filter) => {
  let total = 0
  const values = []
  for (let y = -size; y <= size; ++y) {
    const i = y + size
    values[i] = []
    for (let x = -size; x <= size; ++x) {
      const j = x + size
      const f = filter(x) * filter(y)
      total += f
      values[i][j] = [ [ f ], [ f ], [ f ] ] // same weight for each of R, G, B
    }
  }
  // Alternative: build a 1-D window and take its outer product, e.g.
  // for (let x = -size; x <= size; ++x) values[x + size] = filter(x)
  // const kernel = tf.einsum('i,j->ij', values, values)
  const normalized = tf.div(tf.tensor4d(values), total)
  return normalized
}
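// Sanity check (not part of the pipeline): each channel's weights should
// sum to 1, so summing the whole [2*size+1, 2*size+1, 3, 1] kernel
// should print ≈ 3:
// tf.sum(normalizedValues(3, filters.Gaussian)).print()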
const frame = async (tensor, args) => {
  const filter = filters[args.filter]
  // const [ height, width ] = tensor.shape
  // const res = args.resolution === 'Source' ? [ width, height ] : resolutions[args.resolution]
  // const strides = [ width / res[0], height / res[1] ]
  const { zoom, kernelWidth } = args
  const strides = Math.max(1, zoom)
  const size = Math.max(3, kernelWidth) * strides
  const kernel = normalizedValues(size, filter)
  const pad = 'same' // zero-pads beyond the border, so edge pixels blend with zero (the problem above)
  const dst = tf.depthwiseConv2d(tensor, kernel, strides, pad)
  return { tensor: dst }
}
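For reference, a call site looks something like this (imageElement and canvasElement are hypothetical placeholders, and this assumes an async context):

const img = tf.browser.fromPixels(imageElement).toFloat().div(255)
const { tensor: blurred } = await frame(img, { filter: 'Gaussian', zoom: 1, kernelWidth: 3 })
await tf.browser.toPixels(blurred.clipByValue(0, 1), canvasElement)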