Welcome to the Valyu Exchange documentation!

What is Valyu Exchange?

Valyu Exchange is a platform that connects data providers and creators with data users to accelerate the growth of AI models and applications through training data licensing, provenance and rights clearing. Here you can discover, curate, license and monetise datasets.

Under the hood, it is more than just an exchange. Its data infrastructure lets you license, govern and ensure the quality of dataset assets for model training and fine-tuning tasks. Our SDK fits into your existing AI workflows so your team can spend less time on data curation and more time training models.

No matter the size of your team, Valyu Exchange can help you manage your AI data needs.

Main Features

Data Discovery

Search, filter and compare datasets to find the best ones for your needs. As ML engineers, we know first-hand how slow and painful dataset sourcing and acquisition can be. That is why we make it as easy as possible to find the right dataset for your specific use case.

Datacard Standard

Understand the key features of a dataset at a glance, from its characteristics to its provenance, so you can see how the data may affect your model's performance.

Dataset Provenance

See who curated a dataset, which sources it was drawn from and how it is being used. Too often, poor-quality, copyrighted data is shoved into models, so understanding where your training data came from is crucial.

Dataset Licensing

License your data in a few straightforward steps and automatically check for potential licensing conflicts. Traditionally, even a single data licensing deal can be an incredibly slow process. We make it easy to license your datasets to as many AI companies as you choose, 10x faster than was previously possible. Choose from standard licenses or use our tailor-made licensing templates designed specifically for AI.
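
To make the idea of a licensing conflict check more concrete, here is a minimal Python sketch. It is not the Valyu SDK: the `License` and `IntendedUse` types, their fields and the `find_conflicts` function are hypothetical names invented for this illustration, and real license terms are far richer than two boolean flags.

```python
from dataclasses import dataclass


# Hypothetical types for illustration only; not part of the Valyu SDK.
@dataclass(frozen=True)
class License:
    name: str
    allows_commercial_use: bool
    allows_model_training: bool


@dataclass(frozen=True)
class IntendedUse:
    commercial: bool
    model_training: bool


def find_conflicts(licenses, use):
    """Return readable descriptions of every clash between a license and the intended use."""
    conflicts = []
    for lic in licenses:
        if use.commercial and not lic.allows_commercial_use:
            conflicts.append(f"{lic.name}: commercial use not permitted")
        if use.model_training and not lic.allows_model_training:
            conflicts.append(f"{lic.name}: model training not permitted")
    return conflicts


# Two made-up licenses attached to datasets we want to combine.
licenses = [
    License("open-research", allows_commercial_use=False, allows_model_training=True),
    License("commercial-ai", allows_commercial_use=True, allows_model_training=True),
]
conflicts = find_conflicts(licenses, IntendedUse(commercial=True, model_training=True))
print(conflicts)
```

In this sketch, an empty list means the intended use is compatible with every license; any entries name the license and the term that blocks the use.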

Dataset Governance

Grant and revoke fine-grained access to datasets through our governance tooling and programmatically enforced licenses. This is done entirely peer-to-peer (p2p), so we never have to take custody of your datasets.
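
As a rough illustration of what granting and revoking fine-grained access can look like in code, the Python sketch below keeps an in-memory record of which user may perform which action on which dataset. It is an assumption-laden toy, not the Valyu SDK or its governance tooling: `AccessRegistry`, its methods, and the dataset, user and action names are all hypothetical, and it does not model the peer-to-peer enforcement described above.

```python
from dataclasses import dataclass, field


# Hypothetical in-memory registry for illustration only; not part of the Valyu SDK.
@dataclass
class AccessRegistry:
    # Maps (dataset_id, user_id) to the set of actions that user may perform.
    _grants: dict = field(default_factory=dict)

    def grant(self, dataset_id, user_id, action):
        """Allow user_id to perform action on dataset_id."""
        self._grants.setdefault((dataset_id, user_id), set()).add(action)

    def revoke(self, dataset_id, user_id, action):
        """Withdraw a permission; revoking one that was never granted is a no-op."""
        self._grants.get((dataset_id, user_id), set()).discard(action)

    def is_allowed(self, dataset_id, user_id, action):
        """Check whether user_id currently holds the permission."""
        return action in self._grants.get((dataset_id, user_id), set())


registry = AccessRegistry()
registry.grant("dataset-1", "alice", "train")
allowed_before = registry.is_allowed("dataset-1", "alice", "train")
registry.revoke("dataset-1", "alice", "train")
allowed_after = registry.is_allowed("dataset-1", "alice", "train")
print(allowed_before, allowed_after)
```

Keying permissions on the dataset, the user and the action is what makes the policy fine-grained: one action can be withdrawn without affecting a user's other permissions or other users of the same dataset.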

Synthesise New Datasets

Build upon existing datasets using our zero-code data synthesis infrastructure with integrated provenance and licensing tooling.

All of the tooling described above is being rolled out through our SDK for a more developer-driven experience.

