Announcing AI Atlas

Introducing the FastCatalog.ai AI Atlas, a simple tool to explore the lineage of public datasets and AI models. AI models and datasets are highly interconnected, and understanding the origin of the underlying data is critical to determining their compliance profiles.
At FastCatalog.ai, we gather compliance intelligence about AI models and public datasets to help our customers make compliance and legal decisions faster. Before using a public dataset or an AI model, a question remains: does it come with 'strings attached'?
After years at Microsoft legal counseling teams on data and AI model compliance, one lesson stands out: a lot of time goes into basic fact-finding. What are the terms? Are there any privacy concerns? What is the origin of the data? And with complex lineage, it can be tricky to get a complete picture.
In AI Atlas, each node represents either a dataset or an AI model. We analyze the documentation for each one to map its 'upstream': which assets were used to train or fine-tune a model, and where a dataset's data comes from, whether from other datasets or from the output of an AI model.
This enables us to identify the datasets and models that are most often used by others. For example, the Pile is a popular dataset, built from Common Crawl, PubMed and other sources, that is often used to train AI models.
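To make the idea concrete, here is a minimal sketch of how such a lineage map could be modeled. It is an illustration only, not how AI Atlas is implemented: the `Asset` class, the `all_upstream` and `most_used` helpers, and the two `example-model` entries are hypothetical names made up for this example. Only the Pile's link to Common Crawl and PubMed comes from the paragraph above.

```python
from collections import Counter, deque
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One node in the lineage map (hypothetical structure)."""
    name: str
    kind: str  # "dataset" or "model"
    upstream: list[str] = field(default_factory=list)  # direct sources


def all_upstream(atlas: dict[str, Asset], name: str) -> set[str]:
    """Return every direct and indirect upstream asset of `name`."""
    seen: set[str] = set()
    queue = deque(atlas[name].upstream)
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        if current in atlas:
            queue.extend(atlas[current].upstream)
    return seen


def most_used(atlas: dict[str, Asset]) -> list[tuple[str, int]]:
    """Rank assets by how many other assets list them as a direct source."""
    counts: Counter[str] = Counter()
    for asset in atlas.values():
        counts.update(set(asset.upstream))  # count each source once per asset
    return counts.most_common()


# Illustrative data: the two models are placeholders, not real assets.
atlas = {
    "Common Crawl": Asset("Common Crawl", "dataset"),
    "PubMed": Asset("PubMed", "dataset"),
    "The Pile": Asset("The Pile", "dataset", ["Common Crawl", "PubMed"]),
    "example-model-a": Asset("example-model-a", "model", ["The Pile"]),
    "example-model-b": Asset("example-model-b", "model", ["The Pile", "Common Crawl"]),
}

if __name__ == "__main__":
    print(sorted(all_upstream(atlas, "example-model-a")))
    print(most_used(atlas))
```

In this sketch, asking for the upstream of `example-model-a` surfaces Common Crawl and PubMed even though the model only lists the Pile directly, which is the kind of indirect lineage that is easy to miss when reading documentation one asset at a time.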
With AI Atlas, you can navigate from one item to the next, and log into our platform for more detailed compliance insights. If you’re hitting this wall, feel free to reach out and let us know if we can help.