In the current wave of generative AI innovation, industries that live in documents and text, including legal, healthcare, customer support, sales, and marketing, have been riding the crest. The technology transformed legal and clinical workflows almost overnight, and companies like Harvey and OpenEvidence scaled to roughly $100 million in ARR in just three years. Customer support followed closely behind, with AI-native players automating resolution, summarization, and agent workflows at unprecedented speed.
But industries built on structured data have been slower to adopt genAI. In financial services, insurance, and industrials, AI teams still stitch together thousands of task-specific machine learning models, each with its own data pipeline, feature engineering, monitoring, retraining schedule, and failure modes. What these industries need is a general-purpose primitive for structured data: an LLM equivalent for rows and tables instead of sentences and paragraphs.
We believe that primitive is now emerging: tabular foundation models. And they represent a major opportunity for industries sitting on massive databases of structured, siloed, and confidential data.
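To make the idea concrete, here is a minimal sketch of what working with such a primitive looks like, assuming the open-source `tabpfn` package (TabPFN is one early model in this category, used here purely as an illustration) and a public dataset standing in for the confidential tables these industries actually hold.

```python
# A minimal sketch: one pretrained tabular foundation model in place of a
# bespoke, task-specific pipeline. TabPFN is one early example of the
# primitive described above, not the only one.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier  # pip install tabpfn

# Any small labeled table works; this public dataset stands in for the
# confidential rows a bank or insurer would actually hold.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# No feature engineering, no architecture search, no retraining schedule:
# the model was pretrained across many tabular tasks and adapts in-context.
clf = TabPFNClassifier()
clf.fit(X_train, y_train)  # stores the training rows as context, no gradient updates
print(clf.predict_proba(X_test)[:5])
```

The point of the sketch is the shape of the interface: one pretrained model, a `fit`/`predict` call, and none of the per-model infrastructure that today's task-specific pipelines require.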
LLMs use attention mechanisms to learn the relationships between words, simultaneously capturing context, nuance, and meaning across sentences and entire documents. As these models scaled, the internet's unprecedented supply of freely available text provided trillions of tokens that taught them how language works across domains, styles, and use cases. Models that could read, write, summarize, and reason over text suddenly became everyday business tools, drafting emails, answering tickets, and redlining contracts in seconds.
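For readers who want the mechanism rather than the metaphor, the sketch below implements scaled dot-product attention, the core operation inside an LLM, in plain NumPy. The toy dimensions and random inputs are ours, not drawn from any particular model.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core LLM operation: each token's output is a weighted mix of all
    token values, with weights given by query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                              # context-aware representations

# Toy example: 4 "tokens", each embedded in 8 dimensions (values are arbitrary).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention
print(out.shape)  # (4, 8): one context-mixed vector per token
```

Every output row blends information from every input token, which is how a single pass captures relationships across an entire document.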
Entrepreneurs quickly recognized the pattern: plug into a foundation model’s API, wrap it in a vertical interface, solve a painful workflow, and sell seats to high-value knowledge workers. Thousands of AI-native startups followed, forming a virtuous cycle: application companies drove demand, foundation model providers reinvested in better capabilities, and improved models enabled even more powerful applications. Domain by domain, LLMs devoured unstructured data wherever it lived.
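The pattern is simple enough to sketch in a few lines. The example below wraps a general-purpose model API in a narrow support-triage workflow; the model name, prompt, and helper are illustrative choices of ours, and any foundation model provider's API would serve equally well.

```python
# A sketch of the "wrap the API in a vertical workflow" pattern described
# above: a thin support-triage tool built on a general-purpose LLM API.
# Model name and prompt are illustrative; any provider's API would do.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def triage_ticket(ticket_text: str) -> str:
    """The vertical value-add lives in the prompt and the surrounding
    workflow, not in the underlying model."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "You are a support triage assistant. Summarize the "
                        "ticket in one line, then label it urgent/normal/low."},
            {"role": "user", "content": ticket_text},
        ],
    )
    return response.choices[0].message.content

print(triage_ticket("Checkout fails with a 500 error for all EU customers."))
```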