Once you've completed an Importer, you can monitor its progress on the Importer's Jobs page. This page will give you feedback on the health and quality of the data you're importing.
At the end of this tutorial, you’ll be able to:
- Use the Importer's Jobs page to understand the quality of your data
View of the data importer log
1. Navigate to the Importer's Jobs page
To navigate to a data source's Importer, select the **Sources** button and then find the data source you want to monitor from the list. Select the source and then navigate to the Importer section on the page.
Sources page in Seattle
Once on the source's page, go to the Importer section and select the "Jobs" link. On the Jobs page, you should see a list of importers with their start and end dates and their status. Select the dropdown on any importer to see its details and log.
The Importer section shows when the last import happened and displays a repair notification if it failed.
An Importer log for a CSV on Atlanta's Police Reports. The log contains information on which rows were unable to be transformed and any feedback on the error.
2. Types of Importer Errors
The job status shows you the health and integrity of your datasets. When there are errors, the log gives more information about what caused each problem.
A row that was imported successfully counts as a valid data point for the source.
An example of a successfully completed Importer
"Skipped" means the row matched your filter, so the Importer ignored it. See the section on skipping for more information on how to filter data when importing.
An example of using a filter on data before importing. This filter will skip any rows that don't have a Building Permit number. Any rows that don't have a unique permit number will count as skipped in the Importer log.
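To make the skipping behavior concrete, here is a minimal sketch in Python of a filter like the one described above. It is not the Importer's actual filter syntax: the `permit_number` field name and the `should_skip` helper are assumptions made up for this illustration.

```python
def should_skip(row):
    """Return True when the row has no usable permit number (hypothetical field name)."""
    value = row.get("permit_number")
    return value is None or str(value).strip() == ""


# Made-up sample rows, for illustration only.
rows = [
    {"permit_number": "BP-1001", "address": "12 Example St"},
    {"permit_number": "", "address": "48 Example Ave"},
    {"address": "7 Example Rd"},
]

# Rows that match the filter are skipped; the rest continue on to the import.
kept = [row for row in rows if not should_skip(row)]
skipped = [row for row in rows if should_skip(row)]
```

In this sketch, the rows with an empty or missing permit number land in `skipped`, which is how they would be tallied in a job log.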
"Errored" is when a transform function had an error and the row was not imported. This could be due to data containing bad rows, for example: invalid location coordinates. Check the job status log for the importer to trace the error. Also, look at the importer utilities tutorial to see documentation on the transformation utilities.
In this example, the police report data contained rows that errored in the transform process. For these rows, the error was a bad request for the geo.locate utility.
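The sketch below shows one way to think about errored rows: a transform step fails on a row, the row is left out of the import, and the error is recorded for the log. The `locate` function is a hypothetical stand-in written for this example; it is not the real geo.locate utility, and the row fields and error messages are assumptions.

```python
def locate(address):
    """Hypothetical stand-in for a geocoding utility; raises on unusable input."""
    if not address or not address.strip():
        raise ValueError("bad request: empty address")
    return {"address": address.strip()}


def run_transform(rows):
    """Transform each row, collecting failures instead of stopping the job."""
    imported, errored = [], []
    for index, row in enumerate(rows, start=1):
        try:
            imported.append({"id": row["id"], "location": locate(row.get("address"))})
        except (KeyError, ValueError) as exc:
            # The row is not imported; keep its position and the error for the log.
            errored.append({"row": index, "error": str(exc)})
    return imported, errored


# Made-up sample rows, for illustration only.
imported, errored = run_transform([
    {"id": "A1", "address": "100 Example St"},
    {"id": "A2", "address": "  "},
])
```

Here the second row has a blank address, so it ends up in `errored` with its row number and message, while the first row is imported.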
"Rejected" is when a row was not imported because it didn't meet a required validation, such as having a unique ID or a location. For example, in the Building Permits dataset, some permits were missing locations, so those rows would be rejected. Check out the data type's validations as well as the job status log to trace the error.
In this example, the police report data contained rows that were not in the Atlanta city boundary. These rows were counted as rejected.
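Rejection differs from an error in that the transform succeeds but the row fails a required validation. The sketch below illustrates the idea with three checks drawn from the examples above: a unique ID, a present location, and a location inside a boundary. The bounding box is a made-up rectangle rather than a real city boundary, and the function names and rejection reasons are assumptions for this illustration.

```python
# Hypothetical bounding box standing in for a city boundary (not a real city limit).
BOUNDS = {"min_lat": 33.0, "max_lat": 34.0, "min_lng": -85.0, "max_lng": -84.0}


def validate(row, seen_ids):
    """Return a rejection reason, or None when the row passes every check."""
    if not row.get("id"):
        return "missing id"
    if row["id"] in seen_ids:
        return "duplicate id"
    loc = row.get("location")
    if loc is None:
        return "missing location"
    inside = (BOUNDS["min_lat"] <= loc["lat"] <= BOUNDS["max_lat"]
              and BOUNDS["min_lng"] <= loc["lng"] <= BOUNDS["max_lng"])
    return None if inside else "outside boundary"


def apply_validations(rows):
    """Split rows into accepted rows and (id, reason) pairs for rejected rows."""
    seen, accepted, rejected = set(), [], []
    for row in rows:
        reason = validate(row, seen)
        if reason is None:
            seen.add(row["id"])
            accepted.append(row)
        else:
            rejected.append((row.get("id"), reason))
    return accepted, rejected


# Made-up sample rows, for illustration only.
accepted, rejected = apply_validations([
    {"id": "R1", "location": {"lat": 33.7, "lng": -84.4}},
    {"id": "R1", "location": {"lat": 33.8, "lng": -84.3}},
    {"id": "R2"},
    {"id": "R3", "location": {"lat": 40.7, "lng": -74.0}},
])
```

Only the first row is accepted in this sketch; the other three are rejected for a duplicate ID, a missing location, and a location outside the boundary, respectively.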
“In God we trust, everyone else bring data.” Michael Bloomberg, Bloomberg Philanthropies
Have any questions or running into issues with this feature?
Reach us at: firstname.lastname@example.org