I want to analyse pictures coming from a webcam.
The application has two parts:
A Node.js app which manages the model and the classifier, and is interfaced with…
A web browser JS app which drives the camera and provides the captured images to the Node.js app.
I can successfully analyse pictures downloaded from the Internet (cars, bikes, people), but I fail to analyse pictures coming from my webcam app. The error is:
“Input to reshape is a tensor with 200704 values, but the requested shape requires a multiple of 150528”
Here is an extract of my code:
const _fs = require('fs');
const _tfnode = require('@tensorflow/tfjs-node');
const _mobilenet = require('@tensorflow-models/mobilenet');
var _myNet = await _mobilenet.load();
...
// "buffer" is coming from webcam and provided to nodejs app:
_fs.writeFileSync( './data.png', buffer);
// data.png is valid and can be displayed with any image tools
let fileBuffer = _fs.readFileSync( './data.png');
let image = _tfnode.node.decodeImage( fileBuffer);
let predictions = await _myNet.classify( image);
==> error message
(For my tests I save “buffer” to a file and load it back, but I think I could pass the buffer directly to _tfnode.node.decodeImage(). The result is the same in both cases.)
It is strange.
I only added the channels parameter, set to 3, and it seems to work. Why?
tf.node.decodeImage (content, channels?, dtype?, expandAnimations?)
channels (number) An optional int. Defaults to 0, use the number of channels in the image. Number of color channels for the decoded image. It is used when image is type Png, Bmp, or Jpeg.
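For reference, passing 3 as the channels argument could look roughly like the sketch below. The helper name classifyBuffer is made up for illustration, and tfnode and net are passed in as arguments on the assumption that they are the @tensorflow/tfjs-node module and the model returned by mobilenet.load().

```javascript
// Hypothetical helper: decode an encoded image buffer as RGB and classify it.
// `tfnode` is assumed to be require('@tensorflow/tfjs-node'),
// `net` is assumed to be the model returned by mobilenet.load().
async function classifyBuffer(tfnode, net, buffer) {
  // channels = 3 asks for an RGB tensor instead of whatever the file contains.
  const image = tfnode.node.decodeImage(buffer, 3);
  try {
    return await net.classify(image);
  } finally {
    // Release the tensor's memory once the prediction is done.
    image.dispose();
  }
}
```

In the extract above this would be called as classifyBuffer(_tfnode, _myNet, buffer), which also avoids the round trip through data.png.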
Each model has a fixed input image size. You are probably giving it a different size than the mobilenet input image size. Can you check that the product of the width, height and number of channels is 150528 before exporting the images?
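A check like the one suggested here can be done on the shape of the decoded tensor. The sketch below is only an illustration: countValues is a made-up helper, and the 224 x 224 x 3 input size is an assumption that happens to match the 150528 in the error message.

```javascript
// Assumed MobileNet input: 224 x 224 pixels, 3 colour channels.
const EXPECTED_VALUES = 224 * 224 * 3;

// Hypothetical helper: multiply out a shape such as [height, width, channels].
function countValues(shape) {
  return shape.reduce((product, dim) => product * dim, 1);
}

// Report whether a shape holds exactly the expected number of values.
function matchesExpectedSize(shape) {
  return countValues(shape) === EXPECTED_VALUES;
}

console.log(EXPECTED_VALUES);                     // 150528
console.log(matchesExpectedSize([224, 224, 3]));  // true
```

With tfjs the shape would come from the decoded tensor, for example image.shape.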
I think your model wants a 3D image. It probably won’t work on one-dimensional images. When we train a model, we enter values such as 227x227x1 or 227x227x3 as the input size. But a structure that works in all dimensions should also work on a one-dimensional image. Maybe you are right.
I think that none of the answers is right. Consider that 200704 * 0.75 = 150528. I suspect that the input image has 4 channels (RGB, something) while the model only wants three channels (RGB).
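The arithmetic behind this suspicion can be spelled out. The sketch below assumes the model works on 224 x 224 pixels, which is not stated in the thread but is consistent with both numbers in the error message.

```javascript
const actualValues = 200704;     // "a tensor with 200704 values"
const expectedMultiple = 150528; // "requires a multiple of 150528"
const pixels = 224 * 224;        // assumed input resolution: 50176 pixels

// Values per pixel = number of channels.
const actualChannels = actualValues / pixels;
const expectedChannels = expectedMultiple / pixels;

console.log(actualChannels, expectedChannels);  // 4 3
console.log(expectedMultiple / actualValues);   // 0.75
```

Four values per pixel is what a PNG with an alpha channel (RGBA) decodes to, which would explain why forcing channels to 3 makes the error go away.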