DwCA Validator API

http://tools.gbif.org/dwca-validator/validatews.do

The validator can also be used as a JSON web service to validate archives that are available online. It returns basic validation information along with a link to the regular HTML report, which is stored for one month.


Request Parameters

archiveUrl
The full public URL of the archive to be validated
ifModifiedSince
An optional ISO date (yyyy-mm-dd) that enables conditional GET requests, so the archive is validated only if it has been modified since the given date. This feature requires the server hosting the archive URL to honor the If-Modified-Since HTTP header. Apache web servers, for example, do this out of the box for static files, but if you use dynamic scripts to generate the archive on the fly, the header might not be recognised.
reportId
An optional identifier for the report to be generated. If it is not given, a unique value is generated automatically. If you use this parameter, make sure your identifier is globally unique and will not clash with other report ids, because the validator does not check for existing reports: it will overwrite any existing report with the same id! URLs and UUIDs are good candidates if you really want your own id; otherwise, prefer the automatically generated one.
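
As an illustration of how the three request parameters combine, the following Python sketch assembles a request URL. The helper name build_validation_url is hypothetical and not part of the validator; the endpoint is the one given at the top of this page. Unlike the example requests on this page, the sketch percent-encodes the parameter values, which is an assumption about what the service accepts, though it is standard practice for query strings.

```python
import uuid
from datetime import date
from urllib.parse import urlencode

# Endpoint as given at the top of this page.
VALIDATOR_ENDPOINT = "http://tools.gbif.org/dwca-validator/validatews.do"


def build_validation_url(archive_url, if_modified_since=None, report_id=None):
    """Return a request URL for the validator web service.

    Hypothetical helper for illustration; it only builds the URL and
    performs no network access.
    """
    params = {"archiveUrl": archive_url}
    if if_modified_since is not None:
        # date.isoformat() produces the yyyy-mm-dd form the parameter expects.
        params["ifModifiedSince"] = if_modified_since.isoformat()
    if report_id is not None:
        # The validator overwrites reports with the same id, so the caller
        # is responsible for passing a globally unique value.
        params["reportId"] = report_id
    # urlencode percent-encodes the values (assumed to be accepted).
    return VALIDATOR_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # example.org is a placeholder archive location.
    print(build_validation_url(
        "http://example.org/archive.zip",
        if_modified_since=date(2011, 6, 27),
        report_id=str(uuid.uuid4()),  # a UUID as a self-chosen report id
    ))
```

Leaving out report_id lets the validator generate the identifier itself, which is the safer default described above.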

JSON Response

Example of a successful validation response with request = http://tools.gbif.org/dwca-validator/validatews.do?archiveUrl=http://ecat-dev.gbif.org/repository/vernaculars/vernacular_registry_dwca_3.zip

{
  "archiveUrl": "http://ecat-dev.gbif.org/repository/vernaculars/vernacular_registry_dwca_3.zip",
  "httpStatusCode": 200,
  "online": true,
  "valid": true,
  "metadata": true,
  "reportId": "210-308359173575056734",
  "report": "http://tools.gbif.org/dwca-reports/210-308359173575056734.html",
  "fileRecords": {
      "vernaculars.txt": 1315,
      "Taxa.txt": 1306
  },
  "coreRecords": 1306
}

Example of a not modified validation response with request = http://tools.gbif.org/dwca-validator/validatews.do?archiveUrl=http://ecat-dev.gbif.org/repository/vernaculars/vernacular_registry_dwca_3.zip&ifModifiedSince=2011-06-27

{
  "archiveUrl": "http://ecat-dev.gbif.org/repository/vernaculars/vernacular_registry_dwca_3.zip",
  "httpStatusCode": 304,
  "online": true
}

Response Fields

archiveUrl
The requested archiveUrl
httpStatusCode
The HTTP status code returned when accessing the archiveUrl with HTTP GET.
online
A simple boolean to indicate whether the archive url was available or offline.
valid
A simple boolean to indicate whether the archive was valid. False if any validation error occurred.
metadata
A simple boolean to indicate whether the archive contains readable metadata.
reportId
The identifier of the generated HTML validation report.
report
The URL of the generated HTML report.
coreRecords
The number of records in the core data file, if it exists and is readable.
fileRecords
The number of records for each of the data files found in the archive.
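
A client might interpret these fields along the following lines. This is a minimal Python sketch, not an official client: summarize_response is a hypothetical helper, and it assumes that any field other than archiveUrl, httpStatusCode and online may be absent, as in the not modified example above. The shape of the response for an offline archive is not shown on this page, so that branch is a guess.

```python
import json


def summarize_response(response_text):
    """Return a one-line summary of a validator JSON response.

    Hypothetical helper; field names are taken from the list above.
    """
    data = json.loads(response_text)
    if not data.get("online", False):
        # Assumed shape: the offline case is not documented on this page.
        return f"archive offline (HTTP {data.get('httpStatusCode')})"
    if data.get("httpStatusCode") == 304:
        # Conditional GET: the archive was not fetched, so nothing was validated.
        return "not modified since the given date; nothing validated"
    status = "valid" if data.get("valid") else "invalid"
    files = data.get("fileRecords", {})
    return (f"{status}, {data.get('coreRecords')} core records, "
            f"{len(files)} data files, report: {data.get('report')}")


if __name__ == "__main__":
    # The not modified example response from this page.
    not_modified = """{
      "archiveUrl": "http://ecat-dev.gbif.org/repository/vernaculars/vernacular_registry_dwca_3.zip",
      "httpStatusCode": 304,
      "online": true
    }"""
    print(summarize_response(not_modified))
```

Reading optional fields with get avoids a KeyError when the validator returns the shorter response.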