Not sure whether anyone has tried creating corpus document chunks recently, but when I list the ones I’ve created, they’re all in STATE_PENDING_PROCESSING. A few weeks ago chunks were processed and ready for use within 10 seconds, but the new chunks I’ve created have now been pending for about 40 minutes. Is this the new normal, or is it a bug that came with the slightly modified customMetadata.key, which previously accepted plain words but now strictly requires alphanumeric characters and dashes only (no spaces)?
The chunk request isn’t heavy at all; it’s just dummy text sent through a custom SDK I’m working on:
$response = $gemini->corpora()->documents()->chunks()->batchCreate([
    'parent' => 'corpora/test-corpus-j0oywm69m798/documents/test-document-rl76h09upqj3',
    'requests' => [
        [
            'chunk' => new Chunk(
                data: new ChunkData("chunk text"),
                customMetadata: [
                    new CustomMetadata(
                        key: 'some-key-here',
                        stringValue: 'some value',
                    ),
                    // can add more (20 max)
                ],
            ),
        ],
        [
            'chunk' => new Chunk(
                data: new ChunkData("also some chunk text"),
                customMetadata: [
                    new CustomMetadata(
                        key: 'some-more-key-too',
                        stringValue: 'some value here',
                    ),
                    // can add more (20 max)
                ],
            ),
        ],
    ],
]);

foreach ($response->chunks as $chunk) {
    $chunk->name; // corpora/test-corpus-j0oywm69m798/documents/test-document-rl76h09upqj3/chunks/4th6003almml
    $chunk->data->stringValue; // chunk text
    $chunk->createTime->format('Y-m-d H:i:s'); // 2024-07-20 10:22:37
    $chunk->updateTime->format('Y-m-d H:i:s'); // 2024-07-20 10:22:37
    $chunk->state->value; // STATE_PENDING_PROCESSING

    foreach ($chunk->customMetadata as $metadata) {
        $metadata->key; // some-key-here
        $metadata->stringValue; // some value
    }
}
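In case it helps anyone hitting the same key restriction: the keys in the request above are already dash-separated, but for keys that come from free text, something like the sketch below could turn them into that shape. The function name normalizeMetadataKey is hypothetical and not part of any SDK, and the rule it enforces (alphanumeric characters and dashes only) is just my reading of the new behaviour, not a documented constraint.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: rewrites an arbitrary label into a key made of
 * alphanumeric characters and dashes only. The rule itself is an
 * assumption based on the behaviour described in this post.
 */
function normalizeMetadataKey(string $label): string
{
    // Collapse every run of non-alphanumeric characters into one dash.
    $key = preg_replace('/[^A-Za-z0-9]+/', '-', $label) ?? '';

    // Drop dashes left over at the start or end of the key.
    return trim($key, '-');
}
```

With this, a label such as 'some key here' would be passed as the key 'some-key-here'. Note that a label with no alphanumeric characters at all comes back as an empty string, so that case would still need handling before sending the request.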
It is also strange that createTime and updateTime used to be included in the response to the initial creation, but now the first create request returns null for these fields unless you query the chunks with chunks.list.
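For reference, this is roughly how the state could be polled instead of listing by hand. It is only a sketch: waitForChunkState and the $fetchState callable are hypothetical names, the callable is assumed to wrap whatever get or list call your SDK exposes and return the chunk state as a string, and STATE_ACTIVE is assumed to be the state a processed chunk ends up in.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: calls $fetchState repeatedly until it returns
 * $wanted or the attempts run out. $fetchState is an assumption; it
 * should return the current chunk state as a string.
 */
function waitForChunkState(
    callable $fetchState,
    string $wanted = 'STATE_ACTIVE',
    int $maxAttempts = 10,
    int $delaySeconds = 5
): bool {
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        if ($fetchState() === $wanted) {
            return true; // chunk reached the wanted state
        }

        // Wait before the next check, but not after the final attempt.
        if ($attempt < $maxAttempts) {
            sleep($delaySeconds);
        }
    }

    return false; // still not in the wanted state after all attempts
}
```

A false return after a generous number of attempts would at least confirm that the chunk is stuck rather than just slow. A fixed delay keeps the sketch short; a growing delay would be kinder to the API if the wait really is tens of minutes.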