Nextdata, a startup that raised a $12 million seed round in 2023, launched its Nextdata OS in a move that aims to take data that sits on multiple platforms and turn it into autonomous data products that can interact with AI applications, agents and models.

Zhamak Dehghani, CEO of Nextdata, outlined the launch of the company's data operating system and the problem it is trying to solve. The core pitch:

  • AI is adding more data, more use cases, and more complexity.
  • Your options to manage this data complexity aren't working. They're monolithic or fragmented.
  • You replatform, migrate data, reorg and retrain, only to do it all over again.
  • It's time to stop replatforming, to build on what you already have by containerizing your complexity.

Dehghani said the goal of Nextdata is to embrace the complexity and launch a system that combines the parts into a simple, unified approach that automates data chores like cataloging, governance, pipelines and provisioning.

I caught up with Dehghani to talk about the Nextdata strategy, its data operating system and why AI needs a data abstraction layer. Here's a look at some of the highlights.

The start. Dehghani was a director of technology at Thoughtworks, a consulting company, in 2017 and was helping large enterprises come up with data strategies. She was also well versed in microservices and couldn't understand why big data warehouse architectures couldn't be more nimble.

Dehghani coined the term data mesh as a way to address the big data pain points. She founded Nextdata in 2022. "We really wanted to be the first decentralized and unified data management solution that was multi use case and multimodal," said Dehghani. "We had to reimagine data products as autonomous applications that sit on top of your existing infrastructure and then encapsulate the soup to nuts of data management for this little domain. You can have hundreds of these in your organization with each of them focused on sales, recommendations, supply chain and so on."

Nextdata's first contracts were on the back of a napkin, and the company rapidly prototyped and developed Nextdata OS. It didn't hurt to have Mars and Bristol Myers Squibb as early adopters.

The fallacy of getting all of your enterprise data in one place. The emergence of generative AI and now agentic AI created a big push to get all data on one platform in one place. Dehghani said that thinking is outdated.

"I hope to change this paradigm that we have to have the data in one place. It's a backward way of thinking about it and we'll be chasing this forever. And the moment you get there it's out of date," she said. "What we want is one way of getting access to data in a standard way. It doesn't have to be in one place, but it has to have the experience of being in one place."

A need for a data abstraction layer. Dehghani said Nextdata OS is designed to be a management layer that sits on top of the foundational data platforms. "These two layers can interact in a decentralized way that's different," said Dehghani.

The data product. In Nextdata's parlance, enterprises build data products that have one code base, one experience, semantic and metadata layers and multiple modes of access. "Every data product is live and emitting information about discoverability, who can access data and includes relevant semantic connections to other data products," said Dehghani.

This arrangement could enable agents to pick up what's needed via a mesh of APIs. Data products provide the same vector data, governance and everything needed to access information in various modes. Dehghani explained that Nextdata has a driver that loads into every data product, connects to multiple data stores underneath and redirects access based on the use case.
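To make the driver idea concrete, here is a minimal Python sketch of a data product that fronts several backing stores and routes each request by use case. It is an illustration only, not Nextdata's API: the `DataProduct` class, its `register`, `metadata` and `read` methods, and the "sql" and "vector" use-case labels are all hypothetical names invented for this example, and the lambdas stand in for real connections to a warehouse or vector store.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DataProduct:
    """Hypothetical data product: one name, several modes of access."""

    name: str
    # Maps a use case (e.g. "sql", "vector") to a reader for one backing store.
    backends: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register(self, use_case: str, reader: Callable[[str], str]) -> None:
        """Attach a backing store reader for a given use case."""
        self.backends[use_case] = reader

    def metadata(self) -> Dict[str, object]:
        """Describe the product so a caller can discover its access modes."""
        modes: List[str] = sorted(self.backends)
        return {"name": self.name, "access_modes": modes}

    def read(self, use_case: str, query: str) -> str:
        """Redirect the request to whichever store serves this use case."""
        if use_case not in self.backends:
            raise KeyError(f"{self.name} has no access mode {use_case!r}")
        return self.backends[use_case](query)


# Stand-in readers (assumptions); real ones would query actual platforms.
sales = DataProduct("sales")
sales.register("sql", lambda q: f"warehouse result for: {q}")
sales.register("vector", lambda q: f"vector store matches for: {q}")

print(sales.metadata())
print(sales.read("vector", "top customers"))
```

In this sketch the caller, whether a person or an agent, only names the data product and the use case; which platform answers is decided inside the product, which is roughly the separation Dehghani describes.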

Nextdata plans to offer a software development kit for its drivers in the future.

Ultimately, these data products within an enterprise can pull from multiple platforms, say Amazon Web Services, Databricks and Snowflake, generate data and provide metadata live so they can automate entire pipelines, said Dehghani.

Data products will also have agency. "For us, data is actually a working application," said Dehghani, who added that there will also be observability into data estates. "Our data products are both inbound and outbound."