This is a tutorial about importing data from a public data portal into Stae. You'll start from scratch, first creating a Data Source on Stae and then creating an Importer for that source. This example uses Building Permits as the Data Source from the City of Seattle Open Data portal.
At the end of this tutorial, you’ll be able to:
- Create a Data Source using one of Stae's official Data Sources
- Use Stae's visual Importer to transform data from a public portal or a flat file on your local machine
- Schedule an interval at which Stae will pull data from the public data portal
View of the Importer for NYC Tree Data
1. Create a Data Source
To create your own Data Source, navigate to **Create a Source** on the Sources page.
- Step 1: Name your source and give it a simple description.
- Step 2: Choose whether you’d like the source to be public or private.
- Step 3: Search for the Data Type that best matches your Data Source. In this example, use the Building Permits Data Type.
- Step 4: Select the Fetch option to pull data from Seattle's open data portal.
To create a new Data Source, navigate to the Sources page and click the "Create a Source" button on the top-right corner.
Step 1: Add a name and description to your Data Source
Step 2: Publish the Source to Seattle and make it visible to the public, since the source originates from an open data portal. Stae Admins can select different visibility permissions based on their needs.
Step 3: Select the Data Type that maps to your Data Source. In this example, select the Building Permits Data Type.
Step 4: When you're done creating the source, selecting "Fetch" will take you straight to the Importer builder.
2. Create an Importer for your new Data Source
Now that you've created a new Data Source, you're going to create an Importer that fetches data from an API. In the example below, you'll import Building Permit data from the City of Seattle Open Data portal. If you're working from a different city's portal, the process is the same.
- To begin, select Create Importer on the Data Source's Details Page.
- Next, find the Building Permit API link on the open data portal. See the image below for a reference on what that page looks like.
- Step 1: Copy the URL of the dataset's GeoJSON endpoint into the Address field. Select Test Now to preview the response in the Output Preview column on the right-hand side.
- Step 2: Select the properties you'd like to target from the API.
- Step 3: Select the visual importer to transform the fields into Stae's official Building Permit data standard. Map your fields and, if applicable, filter out any of the API's input.
- Step 4: Select the time interval at which you'd like the Importer to run and collect data from the API.
On the source details page for the Building Permits source you just created, navigate to the Importers section and select "Create Importer".
On Seattle's Building Permits page, select the API dropdown and change the endpoint to GeoJSON format. Copy the endpoint and paste it into Step 1 of the Importer.
Step 1 - Copy the URL for your data
Select JSON from the dropdown menu and then paste the GeoJSON link into the address field. Select Test Now to see the raw output preview.
This section is for HTTP request headers, useful if you need to provide additional information such as authentication credentials.
Many public portal API endpoints limit how much data they return at once; for these, you'll need to configure the limit parameter. For this dataset, we set the limit to a high value, which allows us to retrieve the entire dataset.
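Seattle's open data portal runs on Socrata, whose endpoints accept a `$limit` query parameter; other portals may use a different parameter name. A minimal sketch of what configuring the limit amounts to (the dataset URL below is illustrative, not the real Building Permits endpoint):

```python
from urllib.parse import urlencode

def geojson_url(base: str, limit: int) -> str:
    """Append a Socrata-style $limit parameter to a GeoJSON endpoint."""
    return f"{base}?{urlencode({'$limit': limit}, safe='$')}"

# Hypothetical endpoint; copy the real one from the portal's API dropdown.
url = geojson_url("https://data.seattle.gov/resource/example.geojson", 500000)
print(url)
```

Setting the limit well above the dataset's row count is a simple way to make a single fetch return everything.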
For endpoints that require log-in and authentication, select the OAuth tab and enter either an Auth Code or a Username/Password provided to you by the vendor. By default, OAuth is disabled.
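In HTTP terms, the Headers and OAuth sections simply attach extra information to the request the Importer sends. A rough illustration with Python's standard library (the URL and token are hypothetical placeholders):

```python
from urllib.request import Request

# Hypothetical endpoint and token; real credentials come from the vendor.
req = Request(
    "https://data.example.gov/resource/permits.geojson",
    headers={"Authorization": "Bearer YOUR-TOKEN"},
)
# The header travels with every fetch the importer makes.
print(req.get_header("Authorization"))
```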
Select the time interval on which you'd like to automate fetching the data. Selecting Manual requires a user to run the importer by hand from the data source details page.
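Conceptually, the interval just determines when the fetch re-runs. This is an illustration of that scheduling idea, not Stae's scheduler:

```python
from datetime import datetime, timedelta

def next_runs(start: datetime, every: timedelta, count: int) -> list:
    """Compute the next few automated fetch times for a given interval."""
    return [start + every * i for i in range(1, count + 1)]

# Example: an importer set to run every 6 hours starting Jan 1, 2019.
runs = next_runs(datetime(2019, 1, 1, 0, 0), timedelta(hours=6), 3)
print([r.isoformat() for r in runs])
```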
Upload data from a flat file
If you have a dataset you'd like to import from your local machine, you can select the file tab to drag and drop a file. Your file will be hosted on our cloud storage and available to export as CSV, JSON, or GeoJSON. This step does not create a link to your local machine's file path; for dynamic data, use an API link.
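To see why those export formats are interchangeable, here is a sketch of turning CSV rows with coordinate columns into GeoJSON features. The column names and sample rows are assumptions for illustration:

```python
import csv
import io
import json

# Assumed sample CSV; a real permits file would have many more columns.
raw = """permit_num,latitude,longitude
6100001,47.61,-122.33
6100002,47.62,-122.35
"""

features = []
for row in csv.DictReader(io.StringIO(raw)):
    features.append({
        "type": "Feature",
        # GeoJSON coordinates are [longitude, latitude].
        "geometry": {
            "type": "Point",
            "coordinates": [float(row["longitude"]), float(row["latitude"])],
        },
        "properties": {"permit_num": row["permit_num"]},
    })

collection = {"type": "FeatureCollection", "features": features}
print(json.dumps(collection)[:80])
```

Because the upload is a one-time snapshot, a conversion like this captures the file as it was, not as it changes.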
Step 2 - Select all or a portion of the data
Select the data fields you'd like to import. In most cases, you'll want to import the entire dataset. Click the dropdown and navigate to features > ALL to fetch all of the fields.
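In GeoJSON terms, features > ALL walks every entry of the FeatureCollection's `features` array rather than a subset. A sketch with an assumed miniature payload (a real portal response has thousands of features):

```python
import json

payload = json.loads("""
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature", "properties": {"PermitNum": "6100001"}},
   {"type": "Feature", "properties": {"PermitNum": "6100002"}}
 ]}
""")

# features > ALL: take every feature's fields, not a slice of them.
rows = [f["properties"] for f in payload["features"]]
print(rows)
```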
Step 2 - Format Fields
You can use our intelligent formatting tools to standardize fields like dates, times, and longitude/latitude, and to convert fields that contain text into numbers. Selecting the raw option keeps the data as is. Basic does minor formatting of types and values. For data that requires cleaning, aggressive and extreme formatting will attempt to normalize column names and fields into more usable types.
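The levels are easiest to see side by side. This sketch mimics, rather than reproduces, the difference between raw and basic: raw leaves strings untouched, while a basic-style pass trims whitespace and coerces obvious numbers:

```python
def basic_format(value: str):
    """Illustrative 'basic' formatting: trim whitespace, coerce numbers."""
    v = value.strip()
    try:
        return float(v) if "." in v else int(v)
    except ValueError:
        return v  # leave non-numeric text alone

# Assumed sample row; field names are illustrative.
raw_row = {"housing_units": " 12 ", "latitude": "47.6097", "status": "Issued"}
basic_row = {k: basic_format(v) for k, v in raw_row.items()}
print(basic_row)
```

Aggressive and extreme modes would go further, e.g. renaming columns and reshaping values, which is why they suit messier datasets.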
Step 3 - Mapping fields and creating filters
Map all of the data fields to the Building Permit data type. You can toggle between the input response and the transformed output by mousing over the preview window.
A Data Type in Stae is a specification that data must conform to. For example, the Building Permit type requires that each data point have a unique ID and a geographic location specified. Start by mapping ID and Location. In this dataset, PermitNum should map to ID and Geometry should map to Location.
Continue mapping the rest of the fields by selecting each of them in the dropdown. For some fields, like dates, the Importer will automatically target the field to be imported; others require some knowledge of the dataset being imported to map correctly to Stae.
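The mapping step is essentially a rename-and-reshape: pull `PermitNum` and the geometry out of each feature and place them where the Building Permit type expects ID and Location. A sketch of that idea (the output field names are illustrative, not Stae's internal schema):

```python
def map_feature(feature: dict) -> dict:
    """Map a portal feature onto an illustrative Building Permit shape."""
    props = feature["properties"]
    return {
        "id": props["PermitNum"],         # PermitNum -> ID
        "location": feature["geometry"],  # Geometry  -> Location
        "issuedDate": props.get("IssuedDate"),
    }

# Assumed sample feature from the portal's GeoJSON response.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-122.33, 47.61]},
    "properties": {"PermitNum": "6100001", "IssuedDate": "2018-03-01"},
}
print(map_feature(feature))
```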
Step 3 (optional) - Creating Filters
Once you've mapped all the fields possible, you can also choose to filter the dataset you'd like to import. For example, you can skip any rows that don't have a location input.
Here, the filter skips all rows from before 2018 as well as any rows that don't have a location.
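Expressed as code, a filter like that keeps a row only when it has a location and its date falls in 2018 or later. The field names here are assumptions for illustration:

```python
def keep(row: dict) -> bool:
    """Skip rows with no location or issued before 2018."""
    if not row.get("location"):
        return False
    issued = row.get("issuedDate") or ""
    return issued >= "2018"  # ISO dates compare correctly as strings

rows = [
    {"id": "A", "location": {"type": "Point"}, "issuedDate": "2018-06-01"},
    {"id": "B", "location": None, "issuedDate": "2018-06-01"},
    {"id": "C", "location": {"type": "Point"}, "issuedDate": "2017-12-31"},
]
kept = [r["id"] for r in rows if keep(r)]
print(kept)
```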
Step 4 - Name and describe your importer
Name and describe your Importer. Select "Save" when you're ready to import the data.
3. Monitor the Importer
Once you've created an Importer, you can monitor its progress on the Jobs page, linked from the Importer section. On the Jobs page, you should see the importer you just created running; select its dropdown to see its progress.
If a row had a validation error, or was skipped or rejected, the importer log will report which row contained the error.
Job Status Definitions
The job status gives you an idea of the health and integrity of your datasets. When there are errors, the log will report what caused the problem.
- **Success**: The row was imported successfully.
- **Skipped**: The row matched your filter and was ignored.
- **Errored**: The transform function hit an error and the row was not imported. This can be caused by bad rows in the data, for example invalid location coordinates. Check the importer's job status log to trace the error.
- **Rejected**: The row was not imported because it didn't meet a required validation, such as having a unique ID or a location. For example, some permits in the Building Permits dataset were missing locations, so those rows were rejected. Check the data type's validations as well as the job status log to trace the error.
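One way to internalize the four statuses is as branches of a per-row decision: filter first, then transform, then validate. This is a conceptual sketch, not Stae's implementation; the field names are assumptions:

```python
def classify(row: dict) -> str:
    """Illustrative per-row outcome mirroring the four job statuses."""
    # Skipped: the row matched a user-defined filter (here: pre-2018 permits).
    if row.get("issuedDate", "") < "2018":
        return "skipped"
    # Errored: the transform itself failed (here: a missing expected field).
    try:
        transformed = {"id": row["PermitNum"], "location": row.get("location")}
    except KeyError:
        return "errored"
    # Rejected: transform succeeded but a required validation failed.
    if not transformed["id"] or not transformed["location"]:
        return "rejected"
    return "success"

rows = [
    {"PermitNum": "6100001", "location": {"type": "Point"}, "issuedDate": "2018-06-01"},
    {"PermitNum": "6100002", "location": {"type": "Point"}, "issuedDate": "2017-01-01"},
    {"location": {"type": "Point"}, "issuedDate": "2018-06-01"},
    {"PermitNum": "6100004", "issuedDate": "2018-06-01"},
]
print([classify(r) for r in rows])
```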
A list of importer jobs along with their statuses and start/end times.
*“The essential characteristic of the city ... is that it demands participation.”*
Lawrence Halprin, Cities
Have any questions or running into issues with this feature?
Reach us at: email@example.com