Data catalogs — hierarchical tools designed to organize tables, databases, and other critical bits of information — are a relatively new, but fast-growing, cottage industry within the $42 billion big data market. It’s not difficult to see why; enterprises forced to contend with thousands (or millions) of files across disparate systems stand to benefit from solutions capable of indexing said data autonomously. In fact, according to Gartner, companies that adopt a curated data catalog will realize roughly twice the business value from their analytics investments in the next year compared to those that don’t.
That’s why firms like Alation aren’t wanting for attention these days. The Redwood City, California startup, which was cofounded by former Oracle executive Satyen Sangani in 2012, today announced that it has secured $50 million in series C financing led by Sapphire Ventures, with participation from new investor Salesforce Ventures and existing investors Costanoa Ventures, DCVC (Data Collective), Harmony Partners, and Icon Ventures. The funding follows a $23 million series B in July 2017 and a $9 million series A in May 2015. It comes as Alation reports triple-digit revenue growth and “thousands” of daily users, with customers including Daimler, Fox Networks, Hilton Hotels, eBay, Yahoo Japan, HelloFresh, GoDaddy, Groupon, Pepsi, and Munich Re.
Alation will use the investment to “double” its engineering resources and fund research and development initiatives that “build on … machine learning,” Sangani, who serves as CEO, said.
“With the rise of self-service analytics, the Alation Data Catalog has become the crucial single-source of reference for organizations seeking to be data-driven, allowing everyone from the business user to the data scientist to find the data they need and understand whether it is right for the analysis at hand,” he said. “With the support of Sapphire, Salesforce and our existing investors, we’re confident that we’ll continue to accelerate our company growth and meet the growing demand for data catalogs around the globe.”
Alation’s product suite uses machine learning to automatically parse technical metadata, user permissions, and business descriptions from sources like Redshift, Hive, Presto, Spark, and Teradata, and to organize them in a single repository. An admin dashboard lets data managers visually track the usage of assets like business glossaries, data dictionaries, and wiki articles through reports and profiles. A human-driven curation component (complete with lists, popularity rankings, annotations, comments, and voting) lets users collaboratively organize data across different physical systems, Hadoop files, and more.
Alation taps AI for more than just data organization — its pattern recognition engine, Behavior I/O, makes recommendations based on how information is being used and managed. In Alation’s Compose app, for instance, users get inline sample data as they write SQL database queries, and see color highlights indicating the trustworthiness and quality of queried datasets.
“Much like Google crawls the public internet, the Alation data catalog automatically crawls, parses, and indexes all of an organization’s data and the logs of how data is used,” Sangani told VentureBeat in an earlier interview. It’s a lucrative business: Alation sells its services as a yearly subscription, with prices running as high as millions of dollars a year.