This is a tutorial about importing data from a public data portal into Stae. You'll start from scratch, first creating a Data Source on Stae and then creating an Importer for that source. This example uses Building Permits as the Data Source from the City of Seattle Open Data portal.
This Tutorial is for Admins Only
Only accounts with Admin Access can import data.
To start, find the Building Permits dataset from your city's open data portal. If one doesn't exist, you can use another common dataset like 311 issues. For non-admins, if you have a dataset you'd like to upload to Stae, get in touch with us at firstname.lastname@example.org
By the end of this tutorial, you’ll be able to:
- Create a Data Source using one of Stae's official Data Types
- Transform data with Stae's visual importer
- Schedule an interval at which Stae will pull data from the public data portal
View of the Importer for NYC Tree Data
1. Create a Data Source
To create your own Data Source, select **Create a Source** on the Sources page.
- Step 1: Name your source and give it a simple description.
- Step 2: Choose whether you’d like the source to be public or private.
- Step 3: Search for the Data Type that best matches your Data Source. In this example, use the Building Permits Data Type.
- Step 4: Select the Fetch option to pull data from Seattle's open data portal.
To create a new Data Source, navigate to the Sources page and click the "Create a Source" button on the top-right corner.
Step 1: Add a name and description to your Data Source
Step 2: Publish the Source to Seattle and make it visible to the public since the source originates from an open data portal. Stae Admins can select different visibility permissions based on their needs.
Step 3: Select the Data Type that maps to your Data Source. In this example, select the Building Permits Data Type.
Step 4: When you finish creating the source, select "Fetch" to go straight to the importer builder.
2. Create an Importer for your new Data Source
Now that you've created a new Data Source, you're going to create an Importer that will fetch data from an API. In the example below, you'll import Building Permit Data from the City of Seattle Open Data portal. If you're working from a different city, the process will be the same as importing data from Seattle's open data portal.
- To begin, select Create Importer on the Data Source's Details Page.
- Next, find the Building Permit API link from the open data portal. See the image below for a reference on what that page looks like.
- Step 1: Copy the URL of the dataset's GeoJSON to the Address field. Select Test Now to preview the URL at the Output Preview column on the right-hand side.
- Step 2: Select the properties you'd like to target from the API.
- Step 3: Select the visual importer to transform the fields into Stae's official Building Permit data standard. Map your fields and, if applicable, filter out any of the API's input.
- Step 4: Select a time interval at which you'd like the Importer to run and collect data from the API.
On the source details page for the Building Permits source you just created, navigate to the Importers section and select "Create Importer".
On Seattle's Building Permits page, select the API dropdown and change the endpoint to GeoJSON format. Copy the endpoint and paste it into Step 1 of the Importer.
Step 1 - Copy the URL for your data
Paste the GeoJSON link into the Address field. Select Test Now to see the raw output preview. Next, select the API button and then select JSON from the dropdown menu.
Many API endpoints (like Seattle's open data portal) limit how much data they can return at once. For these, you will need to configure pagination, which allows Stae to submit multiple requests to retrieve the entire dataset.
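To make the idea of pagination concrete, here is a minimal Python sketch of an offset-based loop. It is not Stae's implementation: `fetch_all_pages`, `fake_fetch_page`, and the page size are hypothetical names and values chosen for illustration, and the HTTP call is replaced by an in-memory list so the example runs without network access.

```python
from typing import Callable, Dict, List


def fetch_all_pages(
    fetch_page: Callable[[int, int], List[Dict]],
    page_size: int = 1000,
) -> List[Dict]:
    """Request consecutive pages until one comes back shorter than page_size."""
    records: List[Dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        records.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            break
        offset += page_size
    return records


# Stand-in for a real HTTP request: serves slices of 2,500 made-up rows.
FAKE_ROWS = [{"PermitNum": f"P-{i}"} for i in range(2500)]


def fake_fetch_page(offset: int, limit: int) -> List[Dict]:
    return FAKE_ROWS[offset:offset + limit]


all_rows = fetch_all_pages(fake_fetch_page, page_size=1000)
print(len(all_rows))  # 2500, gathered from pages of 1000, 1000, and 500 rows
```

In a real importer, the function passed as `fetch_page` would issue one request per page, using whatever limit and offset parameters the portal documents.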
Step 2 - Select all or a portion of the data
Select the data fields you'd like to import. In most cases, you'll want to import the entire dataset. Click the dropdown and navigate to features > ALL to fetch all of the fields.
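For readers unfamiliar with GeoJSON, the short Python sketch below shows where the `features` array sits in a FeatureCollection. The sample payload and permit numbers are made up for illustration; a real portal response carries many more properties.

```python
import json

# A tiny, invented GeoJSON FeatureCollection with two features.
raw = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"PermitNum": "0000001-EX"},
     "geometry": {"type": "Point", "coordinates": [-122.33, 47.61]}},
    {"type": "Feature",
     "properties": {"PermitNum": "0000002-EX"},
     "geometry": null}
  ]
}
"""

collection = json.loads(raw)

# Roughly what choosing features > ALL targets: every entry in the array.
features = collection["features"]
print(len(features))  # 2
```

Each feature carries its attributes under `properties` and its shape under `geometry`, which may be null when the source row has no location.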
Step 3 - Mapping fields and creating filters
Map all of the data fields to the Building Permit data standard in Stae.
A Data Type in Stae is a specification that data must conform to. For example, the Building Permit type requires that each data point have a unique ID and a geographic location specified.
Start by mapping ID and Location. In this dataset, PermitNum should map to ID and Geometry should map to Location.
Continue to map all the fields by selecting each of them in the dropdown. For some fields, like dates, the Importer will automatically target the field to be imported, while others require some knowledge of the dataset being imported to be mapped correctly to Stae.
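As a rough illustration of what a field mapping does, the following Python sketch converts one GeoJSON feature into a flat record. It is an assumption-laden sketch, not Stae's transform: `FIELD_MAP`, `map_feature`, and the target keys `id` and `location` are hypothetical names standing in for the ID and Location fields described above.

```python
from typing import Any, Dict

# Hypothetical mapping from source property names to target field names.
FIELD_MAP = {"PermitNum": "id"}


def map_feature(feature: Dict[str, Any]) -> Dict[str, Any]:
    """Copy mapped properties into a new record and carry the geometry over as location."""
    props = feature.get("properties") or {}
    record = {target: props.get(source) for source, target in FIELD_MAP.items()}
    record["location"] = feature.get("geometry")
    return record


# Invented sample feature; "Description" is unmapped and therefore dropped.
feature = {
    "type": "Feature",
    "properties": {"PermitNum": "0000001-EX", "Description": "Example permit"},
    "geometry": {"type": "Point", "coordinates": [-122.33, 47.61]},
}

record = map_feature(feature)
print(record["id"])  # 0000001-EX
```

Adding more entries to `FIELD_MAP` would correspond to mapping more fields in the dropdowns.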
Step 3 (optional) - Creating Filters
Once you've mapped all the fields possible, you can also choose to filter the dataset you'd like to import. For example, you can skip any rows that don't have a location input.
If applicable, this filter will skip all rows before 2018 as well as any rows that don't have a location.
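The filter described above can be sketched in Python as a single predicate. The function name `should_skip`, the `issued_date` field, and the ISO date format are assumptions made for this example; the actual filter is configured in the visual importer.

```python
from datetime import date
from typing import Any, Dict


def should_skip(record: Dict[str, Any], cutoff: date = date(2018, 1, 1)) -> bool:
    """Return True for rows with no location or with a date before the cutoff."""
    if record.get("location") is None:
        return True
    issued = record.get("issued_date")  # hypothetical field holding "YYYY-MM-DD"
    if issued is not None and date.fromisoformat(issued) < cutoff:
        return True
    return False


point = {"type": "Point", "coordinates": [-122.33, 47.61]}
rows = [
    {"id": "A1", "location": point, "issued_date": "2019-03-14"},  # kept
    {"id": "A2", "location": point, "issued_date": "2017-06-30"},  # before 2018
    {"id": "A3", "location": None, "issued_date": "2020-01-05"},   # no location
]

kept = [row["id"] for row in rows if not should_skip(row)]
print(kept)  # ['A1']
```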
Step 4 - Name, describe, and select an interval for your importer
Name and describe your Importer. Lastly, you'll assign a regular interval to import your data to Stae. Select "Save" when you're ready to import the data.
3. Monitor the Importer
Once you've created an Importer, you can monitor its progress on the "Jobs" page. Select this link in the Importer section. On the Jobs page, you should see the importer you just created running. Select the dropdown on the Importer currently running to see its progress.
If a row had a validation error, was skipped, or was rejected, the importer log will show which row was affected.
Job Status Definitions
The job status will give you an idea of the health and integrity of your datasets. When there are errors, the log will return a response explaining what caused the problem.
Success: The row was successfully imported.
Skipped: The row matched your filter and was ignored.
Errored: The transform function had an error and the row was not imported. This could be due to data containing bad rows, for example invalid location coordinates. Check the job status log for the importer to trace the error.
Rejected: The row was not imported because it didn't meet a required validation such as having a unique ID or a location. For example, in the Building Permits dataset, some permits were missing locations so those rows would be rejected. Check out the data type's validations as well as the job status log to trace the error.
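One way to picture how the four statuses relate is the Python sketch below. It is only a mental model under stated assumptions: the order of the transform, filter, and validation steps is a guess, and `RowStatus`, `process_row`, and the toy helper functions are hypothetical names, not part of Stae.

```python
from collections import Counter
from enum import Enum


class RowStatus(Enum):
    SUCCESS = "Success"
    SKIPPED = "Skipped"
    ERRORED = "Errored"
    REJECTED = "Rejected"


def process_row(row, transform, should_skip, is_valid) -> RowStatus:
    """Classify one row; the step order here is an assumption."""
    try:
        record = transform(row)
    except Exception:
        return RowStatus.ERRORED   # the transform itself failed
    if should_skip(record):
        return RowStatus.SKIPPED   # matched the user's filter
    if not is_valid(record):
        return RowStatus.REJECTED  # failed a required validation
    return RowStatus.SUCCESS


# Toy helpers, all invented for this example.
def transform(row):
    return {"id": row["PermitNum"], "location": row.get("geometry"), "year": int(row["year"])}


def should_skip(record):
    return record["year"] < 2018


def is_valid(record):
    return record["id"] is not None and record["location"] is not None


point = {"type": "Point", "coordinates": [-122.33, 47.61]}
rows = [
    {"PermitNum": "A1", "geometry": point, "year": "2019"},  # Success
    {"PermitNum": "A2", "geometry": point, "year": "2016"},  # Skipped by the filter
    {"PermitNum": "A3", "geometry": None, "year": "2020"},   # Rejected: no location
    {"PermitNum": "A4", "geometry": point, "year": "n/a"},   # Errored: int("n/a") fails
]

summary = Counter(process_row(r, transform, should_skip, is_valid).value for r in rows)
print(dict(summary))
```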
A list of importer jobs along with their statuses and start/end times.
*“The essential characteristic of the city ... is that it demands participation.”*
Lawrence Halprin, Cities
Have any questions or running into issues with this feature?
Reach us at: email@example.com