Recommendations for LLM-Friendly Version of SpreadJS JSON or Best Practices for

Posted by: heysreenir on 7 January 2026, 4:25 pm EST

heysreenir
Posted 7 January 2026, 4:25 pm EST
Hi all,

We’re currently using SpreadJS v19.0 in our application, and as part of a POC, we’re aiming to enable a Large Language Model (LLM) to analyze and answer questions about the state of a SpreadJS workbook. To give the LLM spatial and content context, we plan to leverage the workbook export from toJSON().

However, we’ve noticed that the exported JSON contains many indirect references and complex nested structures. This can make it difficult for an LLM to reliably interpret workbook data, reason over cell content, and understand context.

My questions:

Does SpreadJS offer any more “flattened” or LLM-friendly export options for workbook data?

Are there recommended patterns or best practices for pre-processing or restructuring the SpreadJS JSON to make it easier for AI models to consume and analyze?

Has anyone else tackled similar requirements or have advice for making SpreadJS data more usable for LLM-based applications?

Any suggestions, experiences, or pointers would be greatly appreciated!

Thank you!
priyam.kushwaha
Posted 8 January 2026, 8:24 am EST
Hi,

1. Flattened / LLM-friendly export options

At present, SpreadJS does not provide a dedicated “flattened” or AI/LLM-optimized export format.

workbook.toJSON() is primarily designed for lossless persistence and reload, not for semantic readability.

As a result, the exported JSON includes:

Style inheritance and shared objects

Formula references

Sparse storage of rows, columns, and cells

Indirect references for performance and size optimization

Therefore, there is no built-in alternative export that is directly suitable for LLM consumption.

2. Recommended approaches to preprocess SpreadJS data for LLMs

Most teams implementing AI/LLM workflows create a custom abstraction layer on top of SpreadJS APIs rather than relying directly on toJSON().

Preferred approach: extract semantic data via APIs

Instead of exporting the full workbook JSON, consider iterating through sheets and ranges using runtime APIs and building a clean, flattened representation, for example:

Sheet name

Used range

Cell address (A1 notation)

Display value (getText() or getValue())

Formula (if present)

Basic metadata (row/column headers, merged cells, alignment if required)

This can produce a structure like:

{
  "sheet": "Sales",
  "cells": [
    { "address": "B2", "value": 1200, "text": "1200", "formula": "=SUM(A2:A5)" }
  ]
}

Such a format is far more predictable and LLM-friendly.
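
A minimal sketch of such an extraction, for illustration only. It assumes a sheet object exposing the SpreadJS worksheet methods name(), getRowCount(), getColumnCount(), getValue(row, col), getText(row, col), and getFormula(row, col). The helper names toA1 and flattenSheet are hypothetical, and scanning the full row/column count instead of the used range is a simplification.

```javascript
// Convert zero-based (row, col) indexes to an A1-style address, e.g. (1, 1) -> "B2".
function toA1(row, col) {
  let letters = "";
  for (let n = col + 1; n > 0; n = Math.floor((n - 1) / 26)) {
    letters = String.fromCharCode(65 + ((n - 1) % 26)) + letters;
  }
  return letters + (row + 1);
}

// Hypothetical helper: build a flat record for one sheet.
// `sheet` is assumed to be a SpreadJS worksheet, or any object with the same methods.
function flattenSheet(sheet) {
  const cells = [];
  for (let r = 0; r < sheet.getRowCount(); r++) {
    for (let c = 0; c < sheet.getColumnCount(); c++) {
      const text = sheet.getText(r, c);
      const formula = sheet.getFormula(r, c);
      if (!text && !formula) continue; // skip cells with nothing to report
      const cell = { address: toA1(r, c), value: sheet.getValue(r, c), text };
      if (formula) {
        // Normalize so the output always carries a leading "=",
        // whichever form the API returns.
        cell.formula = formula.startsWith("=") ? formula : "=" + formula;
      }
      cells.push(cell);
    }
  }
  return { sheet: sheet.name(), cells };
}
```

In an application this would be called as flattenSheet(workbook.getActiveSheet()), and the result passed through JSON.stringify before being placed in the prompt.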

Use getText() instead of raw values

getValue() may return objects (dates, formulas, etc.).

getText() returns the actual displayed content, which is generally better for reasoning and Q&A scenarios.
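
As a rough illustration of the difference, the sketch below keeps the displayed text as the primary field and converts the raw value into something JSON-safe. It assumes date cells come back from getValue() as JavaScript Date objects; describeCell is a hypothetical helper name, not a SpreadJS API.

```javascript
// Hypothetical helper: pair the displayed text with a JSON-safe raw value.
function describeCell(sheet, row, col) {
  const raw = sheet.getValue(row, col);
  let value = raw;
  if (raw instanceof Date) {
    value = raw.toISOString(); // dates become ISO 8601 strings
  } else if (raw !== null && typeof raw === "object") {
    value = String(raw); // any other object falls back to its string form
  }
  return { text: sheet.getText(row, col), value };
}
```

The text field is what a user sees in the grid, so it is usually the safer thing to show the model; the normalized value is there for cases where exact numbers or dates matter.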

Pre-resolve formulas (if required)

If the LLM needs results rather than logic:

Extract both:

The formula (getFormula())

The evaluated value (getText())

This avoids requiring the LLM to “execute” formulas.

Limit the scope intentionally

LLMs perform better with:

Smaller, well-scoped contexts

Specific sheets or named ranges

Business-level summaries rather than raw grids

For example:

Export only the active sheet

Export only named tables or ranges

Generate per-sheet summaries before passing data to the LLM
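
One possible shape for a per-sheet summary, sketched under assumptions: the input is a flattened record in the { sheet, cells } form shown earlier in this reply, and summarizeSheet is a hypothetical helper. Which statistics are actually useful depends on the workbook.

```javascript
// Hypothetical helper: reduce a flattened sheet record to a compact summary.
// Input shape: { sheet, cells: [{ address, value, text, formula? }] }.
function summarizeSheet(record) {
  const numbers = record.cells
    .filter((cell) => typeof cell.value === "number")
    .map((cell) => cell.value);
  const formulaCount = record.cells.filter((cell) => Boolean(cell.formula)).length;
  return {
    sheet: record.sheet,
    cellCount: record.cells.length,
    formulaCount,
    numericCount: numbers.length,
    // min/max are null when the sheet holds no numeric cells
    numericMin: numbers.length ? Math.min(...numbers) : null,
    numericMax: numbers.length ? Math.max(...numbers) : null,
  };
}
```

A summary like this can be sent first, with the raw cells for a sheet supplied only when the question needs them.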

3. Common patterns followed for similar AI use cases

While there is no official SpreadJS + LLM integration, typical approaches include:

Runtime extraction → AI schema

Use SpreadJS APIs at runtime

Convert data into a domain-specific JSON schema

Hybrid approach

Raw cell data for detail

Precomputed summaries for reasoning

In summary, most successful implementations do not feed toJSON() directly to an LLM. To date, we have not received similar requirements from other users.

Regards,

Priyam
heysreenir
Posted 12 January 2026, 9:58 pm EST
Thanks for your reply Priyam.

I am trying to understand a few things about SpreadJS v19. Could you please clarify the following:

1. SSJSON Format Compatibility: Is the toJSON() method converting the workbook into SSJSON (SpreadJS v16 format)?

2. Format Differences: If the answer to Question 1 is NO, what are the differences between the JSON object we get through the toJSON() method and the SSJSON format?

3. Data Loss During Conversion: Currently my team is using the latest format (.sjs) to save workbooks in storage and load them back into the browser. We have concerns about potential “data loss” which may occur when converting workbooks from .sjs → SSJSON or JSON object. Can you clarify what, if any, data loss we should expect?

4. Best Practices for Serialization: What are your recommended best practices for serializing workbooks when we need to:

Preserve complete workbook fidelity for round-trip conversion (.sjs → JSON → .sjs)

Process workbook data programmatically for analysis (e.g., for LLM integration)

Handle large/complex workbooks with multiple sheets
priyam.kushwaha
Posted 13 January 2026, 7:49 am EST
Hi,

I have answered your questions in the same order below:

Yes, toJSON() returns a JSON object that is the same as SSJSON.

There is no difference. toJSON() returns the JSON object containing the workbook information, while SSJSON is simply the file format used to store this JSON object.

If you are using:

.sjs files for storage

fromJSON() / toJSON() for loading and saving

The same or compatible SpreadJS versions

then no data loss is expected for supported features.

Important clarifications:

.sjs is essentially a packaged form of the same workbook model used by toJSON().

Round-trip conversion (.sjs → JSON → .sjs) is lossless as long as:

The same SpreadJS version (or newer) is used

All features used in the workbook are supported by that version

Data loss would only occur in edge cases such as:

Loading JSON into a much older SpreadJS version

Manually modifying the JSON and removing required structures

Recommended best practices for serialization

A. Preserve full workbook fidelity (round-trip)

For scenarios where exact visual and functional fidelity is required:

Continue using .sjs or toJSON() / fromJSON() without modification

Avoid transforming or flattening the JSON

Treat the JSON as an opaque persistence format, not an analytical one

This is the safest approach for storage, reload, and version upgrades.

B. Programmatic processing / LLM integration

For analysis, AI, or LLM use cases:

Do not use toJSON() directly

Instead:

Extract data at runtime using SpreadJS APIs

Build a separate, flattened semantic model, for example:

Sheet names

Used ranges

Cell addresses

Display text (getText())

Formulas (getFormula()), if needed

Optional metadata (tables, headers, merged cells)

This avoids coupling AI logic to internal SpreadJS schemas, which may change over time.
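
For the optional metadata, a small sketch that walks every sheet and records its name, dimensions, and merged cells. It assumes the workbook exposes getSheetCount() and getSheet(index), and that each sheet exposes name(), getRowCount(), getColumnCount(), and getSpans() returning ranges with row, col, rowCount, and colCount. listSheetMetadata is a hypothetical helper name.

```javascript
// Hypothetical helper: collect workbook-level metadata without touching cell contents.
// Row and column indexes in the output are zero-based, as returned by the sheet.
function listSheetMetadata(workbook) {
  const result = [];
  for (let i = 0; i < workbook.getSheetCount(); i++) {
    const sheet = workbook.getSheet(i);
    result.push({
      sheet: sheet.name(),
      rowCount: sheet.getRowCount(),
      columnCount: sheet.getColumnCount(),
      // Copy only plain fields so the result serializes cleanly to JSON.
      mergedCells: sheet.getSpans().map((span) => ({
        row: span.row,
        col: span.col,
        rowCount: span.rowCount,
        colCount: span.colCount,
      })),
    });
  }
  return result;
}
```

Because the output contains only names and numbers, it stays independent of the internal workbook schema.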

C. Large or complex workbooks

For large, multi-sheet workbooks:

Intentionally limit the scope:

Active sheet only

Named ranges or tables

Business-critical sheets

Consider:

Chunking data per sheet

Generating summaries before passing content to an LLM

Keep persistence (JSON / .sjs) and analysis (flattened AI schema) as two separate pipelines
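
Chunking per sheet could look roughly like the sketch below. It assumes the analysis pipeline already holds a flattened record of the form { sheet, cells: [...] }. chunkSheet and the maxCells parameter are hypothetical, and a cell count is only a crude stand-in for a real token budget.

```javascript
// Hypothetical helper: split one flattened sheet record into chunks
// of at most `maxCells` cells each, preserving cell order.
function chunkSheet(record, maxCells) {
  if (!Number.isInteger(maxCells) || maxCells < 1) {
    throw new RangeError("maxCells must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < record.cells.length; i += maxCells) {
    chunks.push({
      sheet: record.sheet,
      part: chunks.length + 1, // 1-based position of this chunk within the sheet
      cells: record.cells.slice(i, i + maxCells),
    });
  }
  return chunks;
}
```

Each chunk carries the sheet name and a part number, so a chunk can be sent on its own without losing track of where it came from.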

Regards,

Priyam
