DwCA Validator API


The validator can also be used as a JSON web service to validate online archives. It returns some basic validation information along with a link to the regular HTML report, which is saved and kept for one month.

Request Parameters

The full public URL of the archive to be validated, passed as archiveUrl.
An optional ISO date (yyyy-mm-dd), passed as ifModifiedSince, to enable conditional GET requests, so that an archive is validated only if it has been modified since the given date. This feature requires the server hosting the archive to honor the If-Modified-Since HTTP header. Apache web servers, for example, do this out of the box for static files, but if you use dynamic scripts to generate the archive on the fly, the header might not be recognised.
An optional identifier for the report to be generated. If it is not given, a unique value is generated automatically. If you use this parameter, make sure your identifier is globally unique and will not clash with other report ids, because the validator does not check for existing reports: it will overwrite any existing report with the same id! URLs and UUIDs are good candidates if you really want your own id; otherwise it is better to use the automatically generated one.
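
As a rough illustration of how these parameters combine into a request, the following Python sketch builds the query string. The endpoint and the parameter names archiveUrl and ifModifiedSince are taken from the example requests in this document; the function name build_validation_url is hypothetical. The sketch percent-encodes the archive URL, which assumes the service decodes standard query-string encoding (the examples in this document pass the URL unencoded). The report identifier parameter is left out.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

# Endpoint as it appears in the example requests in this document.
VALIDATOR_ENDPOINT = "http://tools.gbif.org/dwca-validator/validatews.do"


def build_validation_url(archive_url: str,
                         if_modified_since: Optional[date] = None) -> str:
    """Return the validator request URL for a publicly reachable archive.

    Hypothetical helper: only the two parameter names that appear in the
    example requests are used.
    """
    params = {"archiveUrl": archive_url}
    if if_modified_since is not None:
        # date.isoformat() yields yyyy-mm-dd, the format the service expects.
        params["ifModifiedSince"] = if_modified_since.isoformat()
    # urlencode percent-encodes the values (assumption: the service accepts this).
    return VALIDATOR_ENDPOINT + "?" + urlencode(params)
```

Calling build_validation_url("http://example.org/a.zip", date(2011, 6, 27)) produces a URL ending in ifModifiedSince=2011-06-27, which can then be fetched with any HTTP client.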

JSON Response

Example of a successful validation response for the request http://tools.gbif.org/dwca-validator/validatews.do?archiveUrl=http://ecat-dev.gbif.org/repository/vernaculars/vernacular_registry_dwca_3.zip

{
  "archiveUrl": "http://ecat-dev.gbif.org/repository/vernaculars/vernacular_registry_dwca_3.zip",
  "httpStatusCode": 200,
  "online": true,
  "valid": true,
  "metadata": true,
  "reportId": "210-308359173575056734",
  "report": "http://tools.gbif.org/dwca-reports/210-308359173575056734.html",
  "fileRecords": {
      "vernaculars.txt": 1315,
      "Taxa.txt": 1306
  },
  "coreRecords": 1306
}

Example of a "not modified" response for the request http://tools.gbif.org/dwca-validator/validatews.do?archiveUrl=http://ecat-dev.gbif.org/repository/vernaculars/vernacular_registry_dwca_3.zip&ifModifiedSince=2011-06-27

{
  "archiveUrl": "http://ecat-dev.gbif.org/repository/vernaculars/vernacular_registry_dwca_3.zip",
  "httpStatusCode": 304,
  "online": true
}

The response fields are:
archiveUrl: The requested archive URL.
httpStatusCode: The HTTP status code returned when accessing the archiveUrl with HTTP GET.
online: A boolean indicating whether the archive URL was available or offline.
valid: A boolean indicating whether the archive was valid. False if any validation error occurred.
metadata: A boolean indicating whether the archive contains readable metadata.
reportId: The identifier of the generated validation HTML report.
report: The URL of the generated HTML report.
coreRecords: The number of records in the core data file, if it exists and is readable.
fileRecords: The number of records for each of the data files found in the archive.
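
To show how a client might interpret these fields, here is a minimal Python sketch that turns a response body into a one-line summary. The function name summarize_validation and the wording of the messages are hypothetical. The sketch also assumes, based only on the "not modified" example above, that fields such as valid and report can be absent when no validation was run.

```python
import json


def summarize_validation(response_text: str) -> str:
    """Summarize a validator JSON response in a single line.

    Hypothetical helper; the field names come from the example responses.
    """
    result = json.loads(response_text)
    status = result.get("httpStatusCode")

    # "online" is false when the archive URL could not be reached.
    if not result.get("online", False):
        return f"archive unreachable (HTTP {status})"

    # 304 means the archive was not modified since the requested date,
    # so the response carries no validation result.
    if status == 304:
        return "archive not modified since the given date; nothing validated"

    outcome = "valid" if result.get("valid") else "invalid"
    core = result.get("coreRecords")  # None if the core file was not readable
    return f"archive {outcome}, core records: {core}, report: {result.get('report')}"
```

Checking online and httpStatusCode before valid avoids reporting an unreachable or unmodified archive as invalid merely because the valid field is missing.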