Creating Valid Croissant Metadata with MCP

Machine learning thrives on high-quality data. A crucial part of this is accurate and standardized metadata. The MLCommons Croissant format provides a common standard for describing ML datasets, making them easier to find, understand, and use across different tools and platforms.

However, creating valid Croissant metadata files can be tricky. Errors or inconsistencies can lead to problems downstream, hindering collaboration and reproducibility. That’s why validating your Croissant file is essential. It ensures your metadata adheres to the standard, is correctly structured, and accurately represents your dataset.

Jetty provides a public Model Context Protocol (MCP) server designed specifically to validate MLCommons Croissant files. By connecting your MCP-enabled client (like the Croissant editor or validation libraries) to our server, you can check your metadata for compliance with the Croissant specification. The server implementation is open source.

1. Setting Up MCP Croissant Validation

To configure your MCP-enabled client to use Jetty’s validation server, add the following endpoint configuration.

Visual Studio Code

Settings > MCP > Edit:

"jetty": {
  "type": "http",
  "url": "https://mcp.jetty.io/mcp/",
  "headers": { "VERSION": "1.0" }
}

Cursor Settings

Settings > Cursor Settings > MCP:

{
  "mcp.jetty.io": {
    "url": "https://mcp.jetty.io/mcp/"
  }
}

Note: Visual Studio Code exposes the MCP endpoint only in “agent” mode.
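
If you prefer to script the setup rather than edit settings by hand, the following is a minimal sketch of merging the Jetty entry into a JSON configuration file without overwriting other servers. The helper name add_mcp_server is hypothetical, and both the file path and the top-level key that holds server entries vary by editor and version, so they are passed in as arguments rather than assumed.

```python
import json
from pathlib import Path

# Server entry copied from the Visual Studio Code snippet in this section.
JETTY_SERVER = {
    "type": "http",
    "url": "https://mcp.jetty.io/mcp/",
    "headers": {"VERSION": "1.0"},
}


def add_mcp_server(config_path, section_key, name, entry):
    """Merge one MCP server entry into a JSON config file, keeping other entries.

    Hypothetical helper: section_key is the top-level object that holds server
    entries, which differs between editors, so check your editor's docs for it.
    """
    path = Path(config_path)
    # Start from the existing file when there is one, otherwise from an empty config.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    # Add or replace only the named server; all other entries are left untouched.
    config.setdefault(section_key, {})[name] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

A call might look like add_mcp_server("path/to/mcp.json", "servers", "jetty", JETTY_SERVER), where both the path and the "servers" key are placeholders to replace with whatever your editor actually uses.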

2. Creating Croissant Metadata from a Dataset

With the MCP endpoint configured, you can ask the agent in your development environment to build an MLCommons Croissant metadata file for your dataset.
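
To give a sense of what the agent is expected to produce, here is a rough sketch of a small Croissant-style description of a single CSV file, built as a Python dictionary. The function name build_croissant, the example.com URL, and the column names are made up for illustration. The @context is abbreviated, and properties such as license, url, and a file checksum are omitted, so a validator will likely flag this output as incomplete.

```python
import json

# Abbreviated JSON-LD context (assumption for brevity). The full context
# published with the Croissant specification defines many more terms and
# should be used in a real file.
CONTEXT = {
    "@vocab": "https://schema.org/",
    "sc": "https://schema.org/",
    "cr": "http://mlcommons.org/croissant/",
    "dct": "http://purl.org/dc/terms/",
    "conformsTo": "dct:conformsTo",
}


def build_croissant(name, description, csv_url, columns):
    """Return a small Croissant-style description of one CSV file.

    columns maps each CSV column name to a data type such as "sc:Text".
    """
    file_id = "data.csv"
    # One field per column, each pointing back at the CSV file it is read from.
    fields = [
        {
            "@type": "cr:Field",
            "@id": f"records/{column}",
            "name": column,
            "dataType": data_type,
            "source": {"fileObject": {"@id": file_id}, "extract": {"column": column}},
        }
        for column, data_type in columns.items()
    ]
    return {
        "@context": CONTEXT,
        "@type": "sc:Dataset",
        "name": name,
        "description": description,
        "conformsTo": "http://mlcommons.org/croissant/1.0",
        "distribution": [
            {
                "@type": "cr:FileObject",
                "@id": file_id,
                "name": file_id,
                "contentUrl": csv_url,
                "encodingFormat": "text/csv",
            }
        ],
        "recordSet": [
            {"@type": "cr:RecordSet", "@id": "records", "name": "records", "field": fields}
        ],
    }


if __name__ == "__main__":
    # Hypothetical dataset: a two-column CSV hosted at a placeholder URL.
    metadata = build_croissant(
        "toy-reviews",
        "A toy dataset of review texts and labels.",
        "https://example.com/data.csv",
        {"text": "sc:Text", "label": "sc:Integer"},
    )
    print(json.dumps(metadata, indent=2))
```

The structure to notice is the chain of references: each field names a source file by its @id, and that file is declared once under distribution.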

3. Validating and Fixing Croissant Metadata

Once the file exists, use the MCP endpoint to check it for issues, then have your LLM agent correct whatever the validator reports. Repeat until the file validates cleanly.
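
As an illustration of the kind of problems a validator looks for, the sketch below runs a small local pre-check in Python. It is not the Jetty server's logic and is far from a full validation: the function name precheck and the short list of top-level properties are assumptions chosen for the example, and the only cross-reference it resolves is from a field to the file it reads from.

```python
def precheck(metadata):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    # Illustrative subset of top-level properties; the specification lists more.
    for key in ("@context", "@type", "name", "conformsTo"):
        if key not in metadata:
            problems.append(f"missing top-level property: {key}")
    # Collect the @id of every declared file so field sources can be resolved.
    file_ids = {f.get("@id") for f in metadata.get("distribution", [])}
    for record_set in metadata.get("recordSet", []):
        for field in record_set.get("field", []):
            source = field.get("source", {})
            # A field may read from a single file (fileObject) or a set of files (fileSet).
            ref = source.get("fileObject", source.get("fileSet", {})).get("@id")
            if ref is not None and ref not in file_ids:
                problems.append(f"field {field.get('name')!r} refers to unknown file {ref!r}")
    return problems


# Deliberately broken sample: no @context, no conformsTo, and a dangling file reference.
sample = {
    "@type": "sc:Dataset",
    "name": "toy",
    "distribution": [{"@type": "cr:FileObject", "@id": "data.csv"}],
    "recordSet": [
        {"field": [{"name": "text", "source": {"fileObject": {"@id": "train.csv"}}}]}
    ],
}

for problem in precheck(sample):
    print(problem)
```

A check like this catches obvious slips early, but the MCP endpoint remains the authority on whether the file conforms to the specification.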

