Recommendations for LLM-Friendly Version of SpreadJS JSON or Best Practices for

Posted by: heysreenir on 7 January 2026, 4:25 pm EST

  • Posted 7 January 2026, 4:25 pm EST

    Hi all,

    We’re currently using SpreadJS v19.0 in our application, and as part of a POC, we’re aiming to enable a Large Language Model (LLM) to analyze and answer questions about the state of a SpreadJS workbook. To give the LLM spatial and content context, we plan to leverage the workbook export from toJSON().

    However, we’ve noticed that the exported JSON contains many indirect references and complex nested structures. This can make it difficult for an LLM to reliably interpret workbook data, reason over cell content, and understand context.

    My questions:

    1. Does SpreadJS offer any more “flattened” or LLM-friendly export options for workbook data?

    2. Are there recommended patterns or best practices for pre-processing or restructuring the SpreadJS JSON to make it easier for AI models to consume and analyze?

    3. Has anyone else tackled similar requirements, or does anyone have advice for making SpreadJS data more usable for LLM-based applications?

    Any suggestions, experiences, or pointers would be greatly appreciated!

    Thank you!

  • Posted 8 January 2026, 8:24 am EST

    Hi,

    1. Flattened / LLM-friendly export options

    At present, SpreadJS does not provide a dedicated “flattened” or AI/LLM-optimized export format.

    • workbook.toJSON() is primarily designed for lossless persistence and reload, not for semantic readability.

    • As a result, the exported JSON includes:

      • Style inheritance and shared objects
      • Formula references
      • Sparse storage of rows, columns, and cells
      • Indirect references for performance and size optimization

    Therefore, there is no built-in alternative export that is directly suitable for LLM consumption.


    2. Recommended approaches to preprocess SpreadJS data for LLMs

    Most teams implementing AI/LLM workflows create a custom abstraction layer on top of SpreadJS APIs rather than relying directly on toJSON().

    Preferred approach: extract semantic data via APIs

    Instead of exporting the full workbook JSON, consider iterating through sheets and ranges using runtime APIs and building a clean, flattened representation, for example:

    • Sheet name
    • Used range
    • Cell address (A1 notation)
    • Display value (getText() or getValue())
    • Formula (if present)
    • Basic metadata (row/column headers, merged cells, alignment if required)

    This can produce a structure like:

    {
      "sheet": "Sales",
      "cells": [
        {
          "address": "B2",
          "value": 1200,
          "text": "1200",
          "formula": "=SUM(A2:A5)"
        }
      ]
    }
    

    Such a format is far more predictable and LLM-friendly.
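
    A minimal sketch of this extraction follows. The helper names flattenSheet and columnLetters are hypothetical, and the sheet argument is only assumed to behave like a SpreadJS worksheet exposing name(), getRowCount(), getColumnCount(), getValue(), getText() and getFormula(). It scans the whole grid for simplicity; on large sheets you would likely restrict the loop to the used range.

```javascript
// Convert a zero-based column index to spreadsheet letters (0 -> "A", 26 -> "AA").
function columnLetters(col) {
  let n = col + 1;
  let letters = "";
  while (n > 0) {
    const rem = (n - 1) % 26;
    letters = String.fromCharCode(65 + rem) + letters;
    n = Math.floor((n - 1) / 26);
  }
  return letters;
}

// Build the flattened structure for one sheet.
// ASSUMPTION: `sheet` exposes name(), getRowCount(), getColumnCount(),
// getValue(row, col), getText(row, col) and getFormula(row, col).
function flattenSheet(sheet) {
  const cells = [];
  for (let row = 0; row < sheet.getRowCount(); row++) {
    for (let col = 0; col < sheet.getColumnCount(); col++) {
      const value = sheet.getValue(row, col);
      const formula = sheet.getFormula(row, col);
      const isEmpty = value === null || value === undefined || value === "";
      // Skip cells that hold neither a value nor a formula.
      if (isEmpty && !formula) continue;
      const cell = {
        address: columnLetters(col) + (row + 1),
        value: value,
        text: sheet.getText(row, col),
      };
      if (formula) {
        // Normalize so the formula always carries a leading "=".
        cell.formula = formula.startsWith("=") ? formula : "=" + formula;
      }
      cells.push(cell);
    }
  }
  return { sheet: sheet.name(), cells: cells };
}
```

    With a real workbook, this might be called as flattenSheet(workbook.getActiveSheet()), and the result passed through JSON.stringify() before it is placed in a prompt.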


    Use getText() instead of raw values

    • getValue() returns the underlying raw value, which may be a non-string object (for example, a Date) rather than what the user sees.
    • getText() returns the actual displayed content, which is generally better for reasoning and Q&A scenarios.
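
    If the raw value is still useful alongside the display text, one option is to convert it to a JSON-safe form first. The sketch below uses a hypothetical helper, describeValue, and assumes its two arguments come from getValue() and getText() for the same cell.

```javascript
// Pair the displayed text with a JSON-safe form of the raw value.
// ASSUMPTION: rawValue comes from getValue(), displayText from getText().
function describeValue(rawValue, displayText) {
  let kind = typeof rawValue;
  let value = rawValue;
  if (rawValue === null || rawValue === undefined) {
    kind = "empty";
    value = null;
  } else if (rawValue instanceof Date) {
    // Dates become ISO 8601 strings so they survive JSON serialization unambiguously.
    kind = "date";
    value = rawValue.toISOString();
  } else if (typeof rawValue === "object") {
    // Any other object is replaced by what the user actually sees.
    kind = "object";
    value = displayText;
  }
  return { text: displayText, value: value, kind: kind };
}
```

    Keeping both fields lets the model quote the text as the user sees it while still having a machine-readable value for comparisons.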

    Pre-resolve formulas (if required)

    If the LLM needs results rather than logic:

    • Extract both:

      • The formula (getFormula())
      • The evaluated value (getText())

    This avoids requiring the LLM to “execute” formulas.
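
    As an illustration, the evaluated text and the formula can be rendered together on one line per cell. This is only a sketch: cellToPromptLine and sheetToPrompt are hypothetical names, the output layout is an arbitrary choice, and each cell is assumed to already have the address/text/formula shape shown earlier.

```javascript
// Render one extracted cell as a single line: evaluated text first, formula second.
function cellToPromptLine(cell) {
  const base = cell.address + ": " + cell.text;
  return cell.formula ? base + " (formula: " + cell.formula + ")" : base;
}

// Render a whole sheet as plain text, one cell per line.
function sheetToPrompt(sheetName, cells) {
  return ["Sheet: " + sheetName].concat(cells.map(cellToPromptLine)).join("\n");
}
```

    For the Sales example above, the B2 cell would be rendered as "B2: 1200 (formula: =SUM(A2:A5))".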


    Limit the scope intentionally

    LLMs perform better with:

    • Smaller, well-scoped contexts
    • Specific sheets or named ranges
    • Business-level summaries rather than raw grids

    For example:

    • Export only the active sheet
    • Export only named tables or ranges
    • Generate per-sheet summaries before passing data to the LLM
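
    A per-sheet summary could be as simple as the following sketch. The function name summarizeSheet and the choice of fields are assumptions for illustration; the input is the flattened { sheet, cells } shape shown earlier.

```javascript
// Produce a compact summary of one flattened sheet.
// ASSUMPTION: `flat` has the shape { sheet: string, cells: [{ address, value, text, formula? }] }.
function summarizeSheet(flat) {
  const numbers = flat.cells
    .map((cell) => cell.value)
    .filter((value) => typeof value === "number");
  return {
    sheet: flat.sheet,
    cellCount: flat.cells.length,
    formulaCount: flat.cells.filter((cell) => Boolean(cell.formula)).length,
    numericCount: numbers.length,
    // min/max are null when the sheet holds no numeric values.
    numericMin: numbers.length ? Math.min(...numbers) : null,
    numericMax: numbers.length ? Math.max(...numbers) : null,
  };
}
```

    A summary like this can be sent first, with the raw cells for a sheet added only when the question actually needs them.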

    3. Common patterns followed for similar AI use cases

    While there is no official SpreadJS + LLM integration, typical approaches include:

    • Runtime extraction → AI schema

      • Use SpreadJS APIs at runtime
      • Convert data into a domain-specific JSON schema
    • Hybrid approach

      • Raw cell data for detail
      • Precomputed summaries for reasoning

    In summary, most successful implementations do not feed toJSON() directly to an LLM. To date, we have not received similar requirements from other users.

    Regards,

    Priyam

  • Posted 12 January 2026, 9:58 pm EST

    Thanks for your reply Priyam.

    I am trying to understand a few things about SpreadJS v19. Could you please clarify the following:

    1. SSJSON Format Compatibility: Is the toJSON() method converting the workbook into SSJSON (SpreadJS v16 format)?

    2. Format Differences: If the answer to Question 1 is NO - what are the differences between the JSON object we get through toJSON() method and the SSJSON format?

    3. Data Loss During Conversion: Currently my team is using the latest format (.sjs) to save workbooks in storage and load them back into the browser. We have concerns about potential “data loss” which may occur when converting workbooks from .sjs → SSJSON or JSON object. Can you clarify what, if any, data loss we should expect?

    4. Best Practices for Serialization: What are your recommended best practices for serializing workbooks when we need to:

      Preserve complete workbook fidelity for round-trip conversion (.sjs → JSON → .sjs)

      Process workbook data programmatically for analysis (e.g., for LLM integration)

      Handle large/complex workbooks with multiple sheets

  • Posted 13 January 2026, 7:49 am EST

    Hi,

    I have answered your questions in the same order below:

    1. Yes, toJSON() returns a JSON object that is the same as SSJSON.

    2. There is no difference. toJSON() returns the JSON object containing the workbook information, while SSJSON is simply the file format used to store this JSON object.

    3. If you are using:

    • .sjs files for storage
    • fromJSON() / toJSON() for loading and saving
    • The same or compatible SpreadJS versions

    then no data loss is expected for supported features.

    Important clarifications:

    • .sjs is essentially a packaged form of the same workbook model used by toJSON().

    • Round-trip conversion (.sjs → JSON → .sjs) is lossless as long as:

      • The same SpreadJS version (or newer) is used
      • All features used in the workbook are supported by that version

    Data loss would only occur in edge cases such as:

    • Loading JSON into a much older SpreadJS version
    • Manually modifying the JSON and removing required structures
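
    One way to gain confidence in a particular workbook is to round-trip its JSON and compare the result. The sketch below is an assumption-laden check, not an official validation tool: roundTripMatches and canonicalJson are hypothetical helpers, and createWorkbook is a caller-supplied factory (in a browser this might be () => new GC.Spread.Sheets.Workbook(hostElement)). A mismatch is a prompt to diff the two objects, since some differences may be benign.

```javascript
// Serialize with sorted object keys so key order does not affect the comparison.
function canonicalJson(value) {
  if (value === undefined) return "null";
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value)
      .filter((key) => value[key] !== undefined)
      .sort();
    const parts = keys.map((key) => JSON.stringify(key) + ":" + canonicalJson(value[key]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Load `json` into a fresh workbook, export it again, and compare with the input.
// ASSUMPTION: the object returned by createWorkbook() exposes fromJSON() and toJSON().
function roundTripMatches(createWorkbook, json) {
  const workbook = createWorkbook();
  workbook.fromJSON(json);
  return canonicalJson(workbook.toJSON()) === canonicalJson(json);
}
```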

    4. Recommended best practices for serialization

    A. Preserve full workbook fidelity (round-trip)

    For scenarios where exact visual and functional fidelity is required:

    • Continue using .sjs or toJSON() / fromJSON() without modification
    • Avoid transforming or flattening the JSON
    • Treat the JSON as an opaque persistence format, not an analytical one

    This is the safest approach for storage, reload, and version upgrades.

    B. Programmatic processing / LLM integration

    For analysis, AI, or LLM use cases:

    • Do not use toJSON() directly

    • Instead:

      • Extract data at runtime using SpreadJS APIs

      • Build a separate, flattened semantic model, for example:

        • Sheet names
        • Used ranges
        • Cell addresses
        • Display text (getText())
        • Formulas (getFormula()), if needed
        • Optional metadata (tables, headers, merged cells)

    This avoids coupling AI logic to internal SpreadJS schemas, which may change over time.
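
    The workbook-level part of such a model (sheet names plus merged-cell metadata) might be sketched as follows. describeWorkbook, rangeToA1 and columnLetters are hypothetical helpers; the workbook is assumed to expose getSheetCount() and getSheet(index), and each sheet name() and getSpans(), with spans reported as { row, col, rowCount, colCount }.

```javascript
// Convert a zero-based column index to spreadsheet letters (0 -> "A", 26 -> "AA").
function columnLetters(col) {
  let n = col + 1;
  let letters = "";
  while (n > 0) {
    const rem = (n - 1) % 26;
    letters = String.fromCharCode(65 + rem) + letters;
    n = Math.floor((n - 1) / 26);
  }
  return letters;
}

// Convert a { row, col, rowCount, colCount } range to A1 notation, e.g. "A1:C1".
function rangeToA1(range) {
  const start = columnLetters(range.col) + (range.row + 1);
  const end =
    columnLetters(range.col + range.colCount - 1) + (range.row + range.rowCount);
  return start === end ? start : start + ":" + end;
}

// Collect sheet names and merged ranges for every sheet in the workbook.
// ASSUMPTION: workbook.getSheetCount(), workbook.getSheet(i), sheet.name(), sheet.getSpans().
function describeWorkbook(workbook) {
  const sheets = [];
  for (let i = 0; i < workbook.getSheetCount(); i++) {
    const sheet = workbook.getSheet(i);
    sheets.push({
      name: sheet.name(),
      mergedRanges: sheet.getSpans().map(rangeToA1),
    });
  }
  return { sheets: sheets };
}
```

    Because the model depends only on these few method calls, the AI-facing schema stays independent of how the persistence JSON is laid out.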

    C. Large or complex workbooks

    For large, multi-sheet workbooks:

    • Intentionally limit the scope:

      • Active sheet only
      • Named ranges or tables
      • Business-critical sheets
    • Consider:

      • Chunking data per sheet
      • Generating summaries before passing content to an LLM
    • Keep persistence (JSON / .sjs) and analysis (flattened AI schema) as two separate pipelines
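
    Chunking can be sketched as below. chunkCells is a hypothetical helper, and the serialized character count is used only as a rough stand-in for a token budget, which is an assumption rather than anything SpreadJS provides.

```javascript
// Split extracted cells into chunks whose serialized size stays under maxChars.
// A single cell larger than maxChars still gets a chunk of its own.
function chunkCells(cells, maxChars) {
  const chunks = [];
  let current = [];
  let size = 0;
  for (const cell of cells) {
    const cost = JSON.stringify(cell).length;
    // Start a new chunk when adding this cell would exceed the budget.
    if (current.length > 0 && size + cost > maxChars) {
      chunks.push(current);
      current = [];
      size = 0;
    }
    current.push(cell);
    size += cost;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

    Each chunk can then be sent with its sheet name and a short summary so the model keeps its bearings across requests.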

    Regards,

    Priyam
