Microsoft announces 'SpreadsheetLLM', an approach that helps language models understand spreadsheets, raising the prospect of AI handling Excel work in the future



Spreadsheets are a formidable challenge for large language models (LLMs) because they combine sprawling tables, a wide variety of formatting, and complex formulas and references. Microsoft researchers have announced 'SpreadsheetLLM', an approach designed to help language models process spreadsheets from applications such as Excel and Google Sheets, which conventional AI has difficulty understanding.

[2407.09025] SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

https://arxiv.org/abs/2407.09025

Microsoft's new AI system 'SpreadsheetLLM' unlocks insights from spreadsheets, boosting enterprise productivity | VentureBeat
https://venturebeat.com/ai/microsofts-new-ai-system-spreadsheetllm-unlocks-insights-from-spreadsheets-boosting-enterprise-productivity/

Microsoft's experimental Spreadsheet LLM helps AI better understand spreadsheets - SiliconANGLE
https://siliconangle.com/2024/07/15/microsofts-experimental-spreadsheetllm-helps-ai-better-understand-spreadsheets/

In a paper published on the preprint server arXiv on July 12, 2024, a Microsoft research team presented 'SpreadsheetLLM,' an approach that converts spreadsheets into a format that is easy for language models to understand.

In this approach, the research team's 'SheetCompressor' encodes and compresses spreadsheets, converting them into data that is easier for language models to understand and analyze. This allows LLMs such as GPT-4 and Llama 2 to gain a deeper understanding of spreadsheets. For example, GPT-4 achieved an F1 score of 78.9% on a spreadsheet table detection test, outperforming the best existing approaches by 12.3%.



SheetCompressor consists of three components: 'Structural Anchors for Efficient Layout Understanding', which identifies important rows and columns that define the table structure; 'Inverted Index Translation for Token Efficiency', which indexes the text of non-empty cells to optimize token usage while maintaining data integrity; and 'Data Format Aggregation for Numeric Cells', which groups cells with similar formats or numbers to reduce token usage.
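As a rough illustration of the second and third components, the following Python sketch builds an inverted index from cell values to cell addresses, skips empty cells, and replaces numeric-looking values with a coarse format tag. It is a minimal sketch under stated assumptions, not the research team's implementation: the function names, the regular expressions, and the tag names (`Date`, `Percentage`, `Integer`, `Float`) are all hypothetical, and structural anchor detection is not covered.

```python
import re
from collections import defaultdict

# Hypothetical format tags; the actual SheetCompressor rules are not
# described in this article.
FORMAT_PATTERNS = [
    ("Date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("Percentage", re.compile(r"-?\d+(\.\d+)?%")),
    ("Integer", re.compile(r"-?\d+")),
    ("Float", re.compile(r"-?\d+\.\d+")),
]


def format_tag(value):
    """Return a coarse format tag for numeric-looking text, or None for ordinary text."""
    for tag, pattern in FORMAT_PATTERNS:
        if pattern.fullmatch(value):
            return tag
    return None


def encode_sheet(cells):
    """Build a value -> [addresses] index.

    Empty cells are skipped, and numeric-looking values are replaced by
    their format tag so that cells sharing a format collapse into one entry.
    """
    index = defaultdict(list)
    for address, value in cells.items():
        if value == "":
            continue  # empty cells cost tokens but carry no information
        key = format_tag(value) or value
        index[key].append(address)
    return dict(index)


# A tiny made-up sheet, keyed by cell address.
sheet = {
    "A1": "Region", "B1": "Sales", "C1": "Share",
    "A2": "East",   "B2": "1200",  "C2": "56%",
    "A3": "West",   "B3": "950",   "C3": "44%",
    "A4": "",       "B4": "2150",  "C4": "",
}

encoded = encode_sheet(sheet)
print(encoded)
```

In this made-up sheet, twelve cells collapse into seven index entries: the three sales figures share a single `Integer` entry and the two shares a single `Percentage` entry, which is where the token savings would come from.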

According to the research team, SheetCompressor reduces the amount of data by up to 96%, allowing LLMs to process large spreadsheets within their token limits.
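To put the 96% figure in perspective, a reduction of 96% leaves 4% of the original size, which corresponds to a 25-fold compression ratio. The short Python sketch below works through that arithmetic. The 200,000-token sheet is a hypothetical example chosen for illustration, not a number from the paper.

```python
def compression_ratio(reduction_percent):
    """Convert a percentage reduction into an 'N times smaller' ratio."""
    return 100.0 / (100.0 - reduction_percent)


def tokens_after(original_tokens, reduction_percent):
    """Token count remaining after the given percentage reduction."""
    return round(original_tokens * (100.0 - reduction_percent) / 100.0)


# Hypothetical sheet that would need 200,000 tokens in a naive encoding.
ratio = compression_ratio(96)
remaining = tokens_after(200_000, 96)
print(ratio, remaining)
```

Under these assumptions, a sheet that would not fit into many models' context windows shrinks to 8,000 tokens.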

The strength of SpreadsheetLLM is that it makes spreadsheet data more accessible and understandable. With an LLM, users can view and create spreadsheets using natural language rather than complex formulas or programming languages, making it easier for more people in an organization to make data-driven decisions.

SpreadsheetLLM could also automate many of the tedious tasks of spreadsheet data analysis, such as cleaning, formatting, and summarizing data, freeing up employees for higher-value work.



Microsoft is already integrating AI into its office suite with products like Microsoft 365 Copilot, and the arrival of SpreadsheetLLM opens up the possibility of AI being used in spreadsheets as well.

It is unclear whether or when SpreadsheetLLM will be officially released, but IT news site VentureBeat wrote, 'We're excited to see how SpreadsheetLLM will change how we work with spreadsheets as it moves from research to a real-world application, and what possibilities it will bring to data-driven decision-making in the enterprise. With Microsoft at the forefront of this AI-driven innovation, the future of work, especially around Excel and spreadsheets, looks brighter than ever.'

in Mobile, Posted by log1l_ks