While a schema provides valuable validation of a dataset's general structure, it cannot validate the data as a whole, for example to make sure that there are no duplicate records. This is where the custom validation framework for datasets comes in.
Basic principles
To make validations useful, the platform needs to make sure that, among valid datasets (i.e. those that pass the schema), there is at least one scenario in which the validation succeeds and at least one scenario in which it fails.
This is a bit of a test-driven development approach. First we need to submit at least one dataset that should pass and at least one that should fail. Then we write the validator code so that validation succeeds for the first and fails for the second.
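The principle can be illustrated with a minimal sketch. It assumes a validator is simply a function that takes a list of records and returns True or False; the names ids_are_unique and is_discriminating are hypothetical and not part of the platform API.

```python
def ids_are_unique(records):
    """Hypothetical validator: True when no 'id' value repeats."""
    ids = [record['id'] for record in records]
    return len(ids) == len(set(ids))


def is_discriminating(validator, passing, failing):
    """True when the validator accepts `passing` and rejects `failing`."""
    return validator(passing) and not validator(failing)


# Both datasets are assumed to pass the schema; only the second has a duplicate.
passing = [{'status': 0, 'id': 'a1'}, {'status': 1, 'id': 'a2'}]
failing = [{'status': 0, 'id': 'a1'}, {'status': 1, 'id': 'a1'}]

print(is_discriminating(ids_are_unique, passing, failing))  # True
```

A validator that accepts everything (or rejects everything) would make is_discriminating return False, which is the kind of validator the two-dataset requirement is meant to rule out.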
Submitting a dataset
Validators are usually written for a particular data type (for example, a list of store locations) rather than for a particular task (for example, validating that the coordinates of Maxima store locations are valid).
Example. Let's validate that each apartment in the list has a unique ID. First, submit your data to a particular schema:
After the schema has been validated successfully, the API returns the ID of the saved snapshot (field <snap>), which we can then use in the validator code:
import requests
print(requests.get('http://dev.citynow.org/api/snap/10').json())
The code above prints exactly the same data that was used to create the “snapshot”, i.e.:
[{'status': 0, 'id': 'a1'}, {'status': 1, 'id': 'a1'}]
Now let's write the core validation code:
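A minimal sketch of such a check is given here, assuming the snapshot is a list of dictionaries with an 'id' key, as in the output above. The function names find_duplicate_ids and validate_unique_ids are hypothetical, and the sample data stands in for the result of the requests.get call with the hardcoded snapshot ID.

```python
from collections import Counter


def find_duplicate_ids(records):
    """Return the sorted list of 'id' values that occur more than once."""
    counts = Counter(record['id'] for record in records)
    return sorted(key for key, count in counts.items() if count > 1)


def validate_unique_ids(records):
    """Raise an Exception when at least one 'id' value is duplicated."""
    duplicates = find_duplicate_ids(records)
    if duplicates:
        raise Exception('Duplicate apartment IDs: %s' % ', '.join(map(str, duplicates)))


# Sample data copied from the snapshot output shown above.
data = [{'status': 0, 'id': 'a1'}, {'status': 1, 'id': 'a1'}]
print(find_duplicate_ids(data))  # ['a1']
```

Calling validate_unique_ids(data) on this sample raises an Exception, because 'a1' appears twice.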
Let's put the last pieces together.
First, notice that we have hardcoded the snapshot ID. After you finish debugging your validator, change it to read the input snapshot ID from the environment variable <snapshot> instead (refer to the article about setting up environment variables), like this:
import os
# Environment variables are strings, so convert the ID to int for the %d placeholder
snap_id = int(os.environ['snapshot'])
data = requests.get('http://dev.citynow.org/api/snap/%d' % snap_id).json()
Finally, we should change the validator so that, instead of throwing an Exception in case of an error, it reports the result by calling the API hook:
# `message` and `valid` are expected to be set by the validation code
result = {
    'message': message,
    'valid': valid,
}
print(requests.post('http://dev.citynow.org/api/validate/%s' % snap_id, data=result).json())
Full program:
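One way the pieces could fit together is sketched below. It reuses only the two endpoints and the <snapshot> environment variable mentioned in this article; the function names check_unique_ids and run, the API_ROOT constant, and the wording of the messages are assumptions made for illustration.

```python
import os
from collections import Counter

API_ROOT = 'http://dev.citynow.org/api'


def check_unique_ids(records):
    """Return (valid, message) for the unique-ID rule."""
    counts = Counter(record['id'] for record in records)
    duplicates = sorted(str(key) for key, count in counts.items() if count > 1)
    if duplicates:
        return False, 'Duplicate IDs: %s' % ', '.join(duplicates)
    return True, 'All IDs are unique'


def run(snap_id):
    """Fetch the snapshot, validate it and report the result to the API hook."""
    # Imported here so that check_unique_ids can be tried without the
    # third-party requests package installed.
    import requests

    records = requests.get('%s/snap/%d' % (API_ROOT, snap_id)).json()
    valid, message = check_unique_ids(records)
    result = {'message': message, 'valid': valid}
    return requests.post('%s/validate/%s' % (API_ROOT, snap_id), data=result).json()


# Only contact the API when the snapshot ID has actually been provided.
if __name__ == '__main__' and 'snapshot' in os.environ:
    print(run(int(os.environ['snapshot'])))
```

Keeping the check separate from the network calls makes it possible to try the rule on sample data, such as the snapshot shown earlier, before pointing the validator at the API.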
Success
Though not required





