Architecture & Design
This document provides an overview of GridSheet's architecture and design principles.
Core Architecture
Table and Hub Relationship
The most important architectural relationship in GridSheet is between the Hub and Table instances. The Hub serves as a cross-cutting access point for multiple Table instances, while each Table instance corresponds one-to-one with a sheet.
Hub Architecture
The Hub is a cross-cutting instance that provides access to multiple Table instances. It contains:
- Cross-cutting Configuration: Policies, renderers, parsers, histories, and labelers
- Cell Data: Actual cell values, formulas, and metadata
- Calculation Cache: Cached formula results for performance
The Hub is actually a wrapper around the Wire class, which is the core implementation. This design allows React/Preact to trigger sheet re-renders by changing references, while the internal Wire instance maintains the same reference throughout its lifecycle.
Note: Even when using only a single table, a Hub (Wire instance) is implicitly created to manage the cell data and configuration.
Table Architecture
Each Table instance corresponds one-to-one with a sheet and contains:
- ID Matrix: Internal data identifiers that correspond to sheet row/column positions
- Sheet State: Current state of the specific sheet
- UI State: Visual state and rendering information
Cell Reference System
GridSheet uses an internal ID-based reference system for formulas:
- Internal References: Formulas use IDs like
#f1,#f2instead of addresses likeA1 - Address Mapping: The system converts IDs to addresses for display
- Stable References: This ensures formulas don't break when rows/columns are added/removed
For example:
// Internal formula storage
"=SUM(#2!#f1:#f5)" // Uses internal IDs
// Display conversion
"=SUM('Sheet2'!A1:A5)" // Converted to addresses for displayThis architecture provides:
- Stable References: Formulas remain intact during structural changes
- Performance: ID-based lookups are faster than address parsing
- Flexibility: Easy to add/remove rows/columns without breaking formulas
Data Flow
GridSheet processes data through two main flows: data input and data output. Each flow uses identifiers (strings) rather than instances for parsers and renderers to enable persistence.
Data Input Flow
Data input follows this sequence: User Input → Parser → Policy
- User Input: Raw input from keyboard, paste, or programmatic updates
- Parser: Converts input to appropriate data type using
cell.parseridentifier - Policy: Validates and processes data using
cell.policyidentifier - Cell Data: Final processed data stored in the cell
Data Output Flow
Data output follows this sequence: Cell Data → Formula Evaluation → Renderer → CSS Application
- Cell Data: Raw cell value or formula
- Formula Evaluation: Calculates formula results if present
- Renderer: Formats display using
cell.rendereridentifier - CSS Application: Applies styling using
cell.styleproperties - User Interface: Final rendered cell in the grid
Cell Value Serialization
GridSheet is designed with persistence in mind, ensuring that cell values are serializable and can be safely stored and restored. This design principle is crucial for maintaining data integrity across sessions and enabling features like undo/redo, data export, and state persistence.
Serialization Design
Cell values are designed to be serializable and contain only persistable data. This means:
- Serializable Values: All cell values can be converted to JSON and back without loss
- Persistable Data: Values can be safely stored in databases, local storage, or transmitted over networks
- Non-Serializable Objects: Event handlers, class instances, and other non-serializable objects are not stored directly in cells
Hub-Based Architecture for Non-Serializable Data
To handle non-serializable objects while maintaining cell serialization, GridSheet uses a Hub-based architecture:
Multi-Sheet Event Handling
When multiple sheets share the same hub, event handlers receive a table parameter that contains the UserTable instance. You can use table.sheetName to identify which sheet triggered the event:
const hub = useHub({
onChange: ({ table, points }) => {
// Use table.sheetName to identify the source sheet
if (table.sheetName === 'Sales') {
console.log('Sales sheet data changed:', points);
} else if (table.sheetName === 'Inventory') {
console.log('Inventory sheet data changed:', points);
}
},
onRemoveRows: ({ table, ys }) => {
console.log(`Rows removed from ${table.sheetName}:`, ys);
},
onKeyUp: ({ e, points }) => {
console.log(`Key pressed: ${e.key} at position:`, points);
},
onInit: ({ table }) => {
console.log(`Table initialized: ${table.sheetName}`);
},
});Important: Always use table.sheetName for conditional logic when multiple sheets share the same hub to ensure proper event handling and data management.
How it works:
- Cell Storage: Cells store only serializable identifiers (strings) for parsers, renderers, and other components
- Hub Registry: The Hub maintains registries of actual instances, event handlers, and class objects
- Identifier Resolution: When needed, the system resolves identifiers to actual instances from the Hub
- Persistence: Only the cell data (with identifiers) is persisted, while the Hub instances are recreated as needed
Benefits of This Design
- Data Integrity: Cell data remains consistent and serializable
- Flexibility: Non-serializable objects can still be used through the Hub system
- Performance: Direct object references are maintained during runtime
- Persistence: Data can be safely saved and restored without losing functionality
- Extensibility: New parsers, renderers, and handlers can be added without breaking existing data
Example Implementation
// Cell stores only serializable data
const cell = {
value: "user123",
parser: "userParser", // Identifier, not the actual parser instance
renderer: "userRenderer", // Identifier, not the actual renderer instance
custom: { userId: 123 } // Serializable custom data
}
// Hub stores the actual instances
const hub = {
parsers: {
userParser: new UserParser(), // Actual parser instance
},
renderers: {
userRenderer: new UserRenderer(), // Actual renderer instance
},
eventHandlers: {
userHandler: (event) => { /* handler logic */ } // Event handler function
}
}This architecture ensures that GridSheet can handle complex, non-serializable objects while maintaining the ability to persist and restore cell data reliably.
Matrix Data Evaluation
GridSheet uses a matrix-based data structure where each cell is identified by its row and column coordinates. The evaluation process considers both header information and cell data:
| Header Column & Header Row (0,0) | A Column Info (0,1) | B Column Info (0,2) |
|---|---|---|
| 1st Row Info (1,0) | A1 Cell Info (1,1) | B1 Cell Info (1,2) |
| 2nd Row Info (2,0) | A2 Cell Info (2,1) | B2 Cell Info (2,2) |
Async Formula Caching
GridSheet provides built-in caching for asynchronous formulas to improve performance and reduce redundant API calls.
Cache Storage
Async formula results are cached per-cell with the following structure:
cell.asyncCache = {
value: any, // The cached result (or error)
key: string, // Cache key based on function name + arguments
expireTime?: number // Optional expiration timestamp (ms since epoch)
}The cache key is generated using:
- Function name: Identifies which async formula is being called
- Serialized arguments: Includes cell values, tables as matrices, etc.
- Hash: cyrb53 hash of the serialized arguments for collision detection
// Example: GH_REPO("facebook/react", "stars")
// Cache key: "GH_REPO:45:2gosa7pa2gv"
// funcName:length:base36-hash(up to 11 chars)In-Flight Promise Tracking
While an async formula is being evaluated, GridSheet tracks pending promises per-cell:
// In Wire instance
wire.asyncPending.set(cellId, pendingValue);This prevents duplicate requests when the same cell is evaluated multiple times during a single render cycle.
Cache Invalidation
Caches are invalidated in the following scenarios:
- Input Changed: When formula arguments change, the cache key changes and a new fetch is triggered
- Expired: When the TTL (specified by
ttlMilliseconds) expires, the cache is discarded - Dependent Updates: When upstream cells change, dependent formulas are re-evaluated
TTL Configuration
Set cache TTL in your async formula class:
class MyAsyncFunction extends BaseFunction {
// Cache for 1 minute
ttlMilliseconds = 60 * 1000;
async main(...args: any[]) {
// Async logic
}
}Filtering & Sorting with Async Formulas
GridSheet stores filtering information in the header cells using special methods:
Filter Information Storage
-
Filter Conditions: Stored in column header cells
(0, x)viacell.filter()- Contains the filter predicate and configuration for that column
- Used to determine which rows should be visible
-
Filtered Row Tracking: Stored in row header cells
(y, 0)viacell.filtered()- Indicates whether a row passes all active filters
true= row is visible,false= row is hidden by filter
// Column header (0, x) stores filter condition
columnHeaderCell.filter() // Returns filter config for column x
// Row header (y, 0) tracks if row is visible
rowHeaderCell.filtered() // Returns true if row passes filters, false if hiddenFilter Behavior
When filtering data that contains async formulas:
- Async Evaluation: All cells in the filter scope are evaluated
- Wait for Completion: GridSheet waits for all pending async operations to complete
- Filter Application: Once all values are resolved, the filter predicate is applied
- Update Headers:
- Column header
(0, x)is updated with the new filter configuration - Row headers
(y, 0)are updated with visibility status
- Column header
- Error Handling: If an async formula throws, the cell displays an error value which is treated as a falsy value for filtering
// If an async formula returns an error, it's stored in cell.asyncCache
cell.asyncCache = {
value: new FormulaError('#ASYNC!', 'Network error'),
key: 'GH_REPO:45:2847391048',
expireTime: 1708847391048
}
// Error values are falsy in filter predicatesSort Behavior
Sorting works similarly to filtering:
- Async Evaluation: All cells in the sort scope are evaluated
- Wait for Completion: GridSheet waits for all pending async operations
- Sort Application: Resolved values are sorted according to the sort comparator
- Deterministic Ordering: Cells with the same resolved value maintain their original order (stable sort)
Performance Considerations
- Caching: Results are cached with the configured TTL, reducing API calls during repeated filter/sort operations
- Parallel Evaluation: Multiple async formulas in the same column are fetched in parallel via Promise.all
- UI Responsiveness: A loading indicator is shown during the wait period, maintaining a responsive UX
- Error Resilience: Filter/sort operations continue even if individual async formulas fail
Component Hierarchy
GridSheet is composed of several key components that work together to provide the complete spreadsheet experience:
- GridSheet: Main container and orchestrator
- FormulaBar: Displays and edits cell formulas
- SearchBar: Provides search functionality across the grid
- Editor: Handles cell editing and input validation
- Tabular: Core grid rendering component
- CellStateOverlay: Canvas-based overlay for selection visuals
- HeaderCellLeft...: Left-side row headers
- HeaderCellTop...: Top-side column headers
- Cell...: Individual cell components
- StoreObserver: Monitors state changes and triggers updates
- ContextMenu: Right-click context menu functionality
- Resizer: Handles column/row resizing operations
Canvas Overlay (CellStateOverlay)
Selection visuals (choosing border, selecting range, copy/cut dashed border, autofill, formula references, search highlights, and header highlights) are rendered on a single <canvas> element instead of applying per-cell inline styles. This architecture prevents memory leaks caused by mounting style objects on every individual cell component.
Why Canvas?
In the previous approach, each cell received an operationStyle prop containing selection-related CSS (background color, border). This caused:
- Memory leaks: Style objects were created for every cell in the selection range on each state change
- Excessive re-renders: Changing selection triggered re-renders across all affected cells
- DOM overhead: Hundreds or thousands of DOM style mutations per selection change
The canvas overlay draws all selection visuals in a single paint operation, independent of the number of cells involved.
Rendering Architecture
- The canvas is placed inside a sticky wrapper (
position: sticky; top: 0; left: 0; width: 0; height: 0; overflow: visible) so it stays fixed relative to the viewport while scrolling - The canvas is viewport-sized (matches
gs-tabularclient dimensions), not full-table-sized, to avoid canvas size limits on large tables - The Editor has
z-index: 11, above the overlay'sz-index: 10, ensuring the editing input is always interactive
Drawing Layers
The canvas draws 7 layers in order:
- Selecting zone — fill + border for the current selection range (clipped to cell area, excluding headers)
- Autofill dragging — dashed border for autofill drag preview
- Choosing cell — solid border around the currently pointed cell
- Copy/Cut zone — dashed border for the clipboard range
- Formula references — colored borders for formula cell references (using the color palette)
- Search matches — highlighted background for search results
- Header highlights — semi-transparent fills on top/left headers (drawn outside the cell-area clip region)
Layers 1–6 are drawn within a clip region (ctx.clip()) that excludes the sticky header area, so cell selections don't bleed into headers. Layer 7 is drawn after restoring the full canvas to overlay the header region.
Coordinate System
Cell positions are resolved via System.offsetTop / System.offsetLeft metadata stored on header cells. This provides O(1) coordinate lookup instead of iterating over rows/columns:
System.offsetTopon row header cells (y, x=0) stores the cumulative top offsetSystem.offsetLefton column header cells (y=0, x) stores the cumulative left offset- These offsets are computed once in
setTotalSize()during table initialization
Drawing coordinates are converted from absolute table coordinates to viewport coordinates by subtracting scrollTop / scrollLeft.
Scroll Synchronization
The overlay listens to the scroll event on gs-tabular and redraws synchronously (not via requestAnimationFrame) to avoid 1-frame lag between scroll position and overlay rendering. State-driven redraws (e.g., selection changes) use requestAnimationFrame for batching.
CSS Classes vs Canvas
Cell and header elements still retain their CSS classes (gs-choosing, gs-selecting, gs-th-selecting) for:
- E2E test selectors: Tests query these classes to verify selection state
- Non-visual styling: e.g.,
gs-selecting .gs-cell-label { display: block }to show address labels in the selection range
Only the background-color styling for selection states has been moved to the canvas. The classes remain on the DOM elements.