Invoke

Beta
Extract structured data.

tl;dr:
- pass a valid JSON schema in `json_schema`
- pass the page chunks as a list of `Chunk` objects, by default: `{"type": "text", "content": "..."}`
- leave all other fields as default

Detailed configuration (only relevant for complex use cases):

The structured data extractor's architecture follows the map-reduce pattern: the asset is divided into chunks, the schema is extracted from each chunk, and the per-chunk results are then reduced to a single structured data object.

In some applications, you may not want to:
- map (if your input asset is small enough)
- reduce (if your output object is large enough to overflow the output length, or if you are extracting a long list of entities and want every instance of the schema)

You can configure these behaviors with the `map` and `reduce` fields.
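The tl;dr above can be sketched as a small payload builder. This is a minimal sketch, not an official client: the helper name `build_request_body`, the invoice schema, and the sample texts are illustrative assumptions; only the `json_schema` and `chunks` field names and the default chunk shape come from this page.

```python
import json


def build_request_body(json_schema, texts):
    """Wrap plain strings in the default chunk shape and attach the schema.

    Hypothetical helper: only `json_schema`, `chunks`, and the
    {"type": "text", "content": ...} shape are taken from the reference.
    """
    chunks = [{"type": "text", "content": text} for text in texts]
    return {"json_schema": json_schema, "chunks": chunks}


# Illustrative schema (JSON Schema draft 2020-12); the fields are made up.
invoice_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number"],
}

body = build_request_body(
    invoice_schema,
    ["Invoice INV-001, issued to Acme Corp.", "Total due: 120.50"],
)

# All other fields are omitted so that the service applies its defaults.
print(json.dumps(body, indent=2))
```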

Authentication

X-API-KEY string
API Key authentication via header
OR
Authorization Bearer

Bearer authentication of the form Bearer <token>, where token is your auth token.
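The two authentication options can be expressed as request headers. The sketch below is a hypothetical helper, not part of any official SDK: the function name `auth_headers` and the `Content-Type: application/json` header are assumptions (the page only says the endpoint expects an object), and accepting exactly one credential is a design choice of this example rather than a documented rule.

```python
def auth_headers(api_key=None, bearer_token=None):
    """Build headers for one of the two documented authentication schemes.

    Hypothetical helper. Requiring exactly one credential is a choice made
    here for clarity; the reference only says the schemes are alternatives.
    """
    if (api_key is None) == (bearer_token is None):
        raise ValueError("provide exactly one of api_key or bearer_token")

    # Assumption: the request body is sent as JSON.
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["X-API-KEY"] = api_key
    else:
        headers["Authorization"] = f"Bearer {bearer_token}"
    return headers


key_headers = auth_headers(api_key="my-api-key")
token_headers = auth_headers(bearer_token="my-token")
```

The returned dictionary can be passed to whichever HTTP client you use; the endpoint URL is not shown on this page, so it is left out here.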

Request

This endpoint expects an object.
chunks list of objects Required
The chunks from which to extract structured data.
json_schema map from strings to any Required
The JSON schema to use for validation (version draft 2020-12). See the docs [here](https://json-schema.org/learn/getting-started-step-by-step).
chunk_messages list of objects Optional

The prompt to use for the data extraction over each individual chunk. It must be a list of messages. The chunk content will be appended as a list of human messages.

reduce boolean Optional Defaults to true

When the map step is enabled, whether to reduce the per-chunk results to a single structured object (true) or return the full list (false). Use true unless you want to preserve duplicates from each page or expect the object to overflow the output context.
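As an illustration of turning the reduce step off, the sketch below builds a request body for a case where every per-chunk result should be kept. The schema and chunk texts are made-up examples; only the `json_schema`, `chunks`, and `reduce` field names come from this page. Note that Python's `False` is serialized as the JSON literal `false`.

```python
import json

# Illustrative schema: one person per extracted object (fields are made up).
person_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "role": {"type": "string"},
    },
    "required": ["name"],
}

body = {
    "json_schema": person_schema,
    "chunks": [
        {"type": "text", "content": "Page 1: Ada Lovelace, analyst."},
        {"type": "text", "content": "Page 2: Alan Turing, cryptanalyst."},
    ],
    # Keep the per-chunk results instead of merging them into one object.
    "reduce": False,
}

payload = json.dumps(body)
```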

reduce_messages list of objects Optional
The prompt to use for the reduce steps. It must be a list of messages. The two extraction attempts will be appended as a list of human messages.

Response

Successful Response
chunk_by_chunk_data list of objects or null

The extracted structured data for each chunk. A list where each element is guaranteed to match json_schema.

reduced_data map from strings to any or null

If `reduce` is true, the reduced structured data; otherwise null. Guaranteed to match `json_schema`.
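Because either response field may be null, client code typically needs to choose the field that matches the `reduce` setting it sent. The following is a minimal sketch under that assumption: `select_result` is a hypothetical helper, and the sample response bodies are invented for illustration rather than captured from the service.

```python
def select_result(response_body, reduce=True):
    """Return the response field that holds the result for this `reduce` setting.

    Hypothetical helper: reads `reduced_data` when reduce is true and
    `chunk_by_chunk_data` otherwise, and fails loudly if that field is null.
    """
    field = "reduced_data" if reduce else "chunk_by_chunk_data"
    data = response_body.get(field)
    if data is None:
        raise ValueError(f"response field {field!r} is null or missing")
    return data


# Invented sample bodies, shaped after the two documented response fields.
reduced_response = {
    "reduced_data": {"invoice_number": "INV-001", "total": 120.5},
}
unreduced_response = {
    "chunk_by_chunk_data": [
        {"invoice_number": "INV-001"},
        {"invoice_number": "INV-001", "total": 120.5},
    ],
    "reduced_data": None,  # documented as null when reduce is false
}

merged = select_result(reduced_response, reduce=True)
per_chunk = select_result(unreduced_response, reduce=False)
```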

Errors