Source Format Library and Creating a New Custom Source Format

A Source Format is a collection of standardized settings and criteria for a dataset which is uploaded frequently from the same source.  The Source Format defines the destination fields and values in Slate and allows users to prescribe consistent import behavior such as new record creation, updates to existing applicant records, and de-duplication within a source file.  Importing data through a Source Format can be automated by configuring standard Slate SFTP options.

Slate's Source Format Library, accessed from the Database menu, offers numerous predefined formats for data sources common to many partner institutions.  See Source Format Descriptions for a list and descriptions of the source formats in the library.  Most existing Source Formats require little to no customization.  Source Formats that import test score data require the corresponding Test object(s) be added from the Slate Template Library. 

Technolutions recommends use of existing Source Formats when possible for importing data. If a standard Source Format does not exist, then a custom Source Format will need to be created.

Important

Exercise extreme caution when uploading datasets of any kind, and in particular, datasets that come in a predefined format such as score report data files. Prior to uploading, always check to confirm that the dataset is consistent with the format type you have selected and with past datasets of this type. Slate is a transactional database, therefore data is constantly changing on a moment-by-moment basis.  As is the case with any enterprise database, there is no way to undo effects the uploaded dataset has already had on records in Slate as part of the import and merge process.

  Best Practice

We strongly recommend testing data imports in the test environment prior to importing data in production. Set up the import in the production environment, refresh your test environment, and then activate the import in the test environment first, to ensure the results are as expected. Once desired results are achieved in the test environment, then run the import in Production.

Generate a Source Format from a Previous Upload

  1. Select Database on the top navigation bar and select Upload Dataset.
  2. Select a previous ad hoc import to be used as the foundation for the new Source Format.
  3. Select Generate Source Format.
  4. Follow the instructions below for configuring the Source Format.

Create a New Source Format for a Source Not Previously Used

  1. Select Database on the top navigation bar and select Source Format.
  2. Select New Source Format to display a dialog box with some default selections.
  3. On the General Tab, fill in the information about your Source Format.  Each of the fields is explained below:
    • Status - Select Active to make the Source Format available for use in Upload Dataset.
    • Name - This is what you would like to call the Source Format
    • Format - This is where you are able specify which format the file will be: For example, Excel, CSV, Fixed Width, etc.  This setting is used solely for display purposes.  The actual formatting of the file is defined in the XML setting on the Format Definition tab.
    • Remap As Of Date - This is where you tell the Source Format how far back to look at data values to make them available in Remap.  For new Source Formats, you can simply put the current date.  A date is required once you are ready to Activate the Source Format. This setting is helpful if a file was ever inadvertently imported without headers. If this should happen, the file's first row, which likely includes data to be mapped, becomes the "headers." To fix this, simply adjust the remap as of date to a date after the errant file was imported. 
    • Remap Active -  When this is updated to Active, Slate will automatically process the uploaded file using the Remap settings, which you will define in a later step.  Leave this as Inactive until you are satisfied with your remap settings.  Prior to activating, the Remap As of Date must be populated or Slate can not reference the effective date.
    • Replace Existing Datasets - If you are importing cumulative files (every person is included every time), Replace Existing Datasets will delete the previous dataset every time a new dataset is imported.  If you have any exports or filters that reference the data stored within the dataset (versus referencing data imported into fields), this option should not be selected.  Otherwise, this is an appropriate option for most cumulative imports.
    • Scope - Select Person/Dataset Records unless importing Slate users.
    • Dataset - This specifies if the dataset will be importing into Persons/Applications, or if it will go into any other datasets (such as Organizations, or Volunteers).
    • Unsafe - If a Source Format is Unsafe, it will be allowed to update person-scoped data on student records with an active application. A Source Format should be marked as Unsafe only if the source is trusted to contain accurate, and up-to-date information. You do not want an import to update information that the student recently provided you on an application with information that may be outdated or incorrect.

    An import into Slate from your student information system is an example of a Source Format that you might want to be Unsafe.

    Purchased lists of student information are examples of imports that should be Safe Source Formats.

    • Hide - The recommended setting is to Create source interactions unless the Source Format will be used for a daily feed from an SIS or other application.  Daily or frequent data updates/imports will contribute to the creation of many rows of data interactions on records.  Aside from the exceptions of high frequency imports, it is preferable to create source interactions to display activity history on a record's timeline.
    • Disable Update Queue - By default, records that are imported or updated from this Source Format will enter an update queue. This queue is used to run deferred rules and also adds records to queues for those data exports that are configured to only include records updated since the query last ran.

      If the Source Format is a backfeed that sets irrelevant information in the context of query update queues and rules, then this setting should be set to Prevent records from entering update queue upon import (disables rules from firing). Using this setting will prevent a feedback loop, where each updated record is exported from Slate into an external system, which then exports data for each of those updated records back to Slate. If the update queue were enabled, those records in that import would be included in the export again the next time it ran, even if the import was the only update on those records.
    • Update Only - This tells Slate how to handle records in the incoming dataset that do not match records within Slate.
      • Allow record creation means that Slate will create a new record for any records in the incoming dataset that do not match records in Slate.
      • Update only means Slate will not create new records, and the dataset will only update records on which it matches.  Note that when Update Only is selected for an application data import, an application will not be created even if the person record already exists.
    • Dedupe Records - By default, Slate will create a new record for every row when Allow record creation is selected and an existing match cannot be found.  By selecting Dedupe records, Slate will evaluate the source file and dedupe records with the source based on an exact match of First Name, Last Name, Email, and Birthdate.
    • Notification and Notification Email - To receive email notification if an imported file associated with the Source Format fails, select Failures only and enter the email address(es) for notification. Select Successes and Failures to be notified of both successful and failed imported files for the Source Format.
    • Read Permission and Upload/Build Permission - Permission to the Source Format can be limited to specific permissions. 
  4. The settings on the Format Definition Tab are optional and will not apply to every Source Format.
    • XML - This is where you are able to really define the file that you are sending to Slate. You are able to tell it if it is delimited, if there is a header row, or even define Fixed Width files (See below for common examples).
    • SQL - For most data imports, the SQL text area should be left blank. Please submit a service request if you feel that a Source Format will require custom SQL.
    • PGP Private Key - Paste the private key from the PGP software.
    • PGP Password Each PGP private key is typically encrypted with a password.
  5. The settings on the Import Automation Tab are optional and will not apply to every Source Format.
    • Import Path/Mask - This is the field that tells Slate that you want to pull in files automatically. In this field, you should include the file name pattern that Slate should look for to know that the file should be imported using this Source Format. We recommend appending the file name with a date so you are not using the exact same name each time. Slate does keep the files for a period of time in a Processed directory within the /incoming folder - if the file names are exactly the same, each subsequent file load will delete the file from the Processed directory.

    An example Import from FTP Mask field would be: SIS_to_Slate_*.txt.

    Slate would automatically import any file in the /incoming directory that contained that file name pattern, including: SIS_to_Slate_09032013.txt.

    • Import Remote Server - Leave this field blank unless the data for this Source Format will be delivered via an API call from Slate or an external SFTP server. 

      - For API calls, include the full URL, including https:// and any necessary querystring parameters.
      - For remote SFTP servers, the syntax (including support for Remote Service Account users) is exactly the same as when configuring a Remote Server for data exports.
    • HTTP Headers - Add a list of header in the form, for example: <t name="HEADER">VALUE</t>
    • Import Order - If you specify an Import Path/Mask, you may optionally specify an import order if the order that Slate processes the file matters. For example, if you have an application file and a supplement file that will be delivered to the SFTP directory together, and you would want the application file to be processed prior to the supplement, you can enter 1 for the application's Source Format and 2 for the supplement's Source Format.
    • Import Frequency - By default, Slate sweeps the SFTP directories for new files that match the specified Import Path/Mask every 15 minutes. If you will only be dropping off files once per day, you can choose to increase the frequency by specifying the number of minutes between file sweeps - e.g., 1440 minutes.

      The Import Frequency setting is limited to only those Source Formats picking up from remote SFTP servers and web services—not the Technolutions SFTP. 
  6. Select Save when you are satisfied with the Source Format settings.
Common XML Examples
CSV File with a header row and text qualifiers:

<layout b="," h="1" t="&quot;" />

  • b="," - this indicates that the file is delimited by a comma (,).
  • h="1" - this indicates that the file contains a header row.
  • t="&quot;" - this indicates the data has double quotes around the text.  

The data row looks like this:
"Mr.","Joe","Brown","1745 E. New St., Apt. 22","Chicago","Illinois","46702"


Double-quotes are wrapped around each field, while a comma separates each field. By including the quotes, the program interprets "1745 E. New St., Apt. 22" as one field rather than two.  

 

Tab-Separated file with a header row and no text qualifiers:

<layout b="&#x9;" h="1" />

Slate supports additional file separators. For example, you will indicate that it is a pipe-delimited file like this: 

<layout b="|" h="1" />

 

Excel File with a header row:
<layout type="convert" h="1" />

 

CSV File without a header row:

Files without a header row would need to have the name of the fields defined. For example, a CSV file with text qualifiers and no header would be defined like this:
<layout b="," h="0" t="&quot;">
  <f s="1" id="First Field Name" />
  <f s="2" id="Second Field Name" /> 
</layout>

  • h="0" - This indicates that the file does not contain a header row.Within the <f ../> tags above:
  • s="1" - This means that you are defining the first field in the file, defined as the beginning of the file until the delimiter (in this case, a comma).
  • id="First Field Name" - This is what the ID of the field should be. It should be unique among the other fields in the Source Format. This is also what you will see as the "Source" field when you are mapping the values on the Remap page. Note: this ID should be no more than 64 characters in length.

 

A fixed-width file (no delimiters, where the beginning and end of each field is defined based on the specific location in the row) requires that the name, start index, and width are defined in the layout:

<layout width="43">
    <f s="1" w="25" id="Student Last Name" />
    <f s="26" w="16" id="Student First Name" />
    <f s="42" w="1" id="Student Middle Initial" />
  </layout>

The layout width defines how many characters each row of the file has. In the above example, each row would be 43 characters long.

Within the <f ... /> tags:

  • s="1" - This means that the field starts at character 1.
  • w="25" - This means that the field is 25 characters long.
  • id="Student Last Name"  - Like above, this is what the ID of the field should be. It should be unique among the other fields in the Source Format. This is also what you will see as the "Source" field when you are mapping the values on the Remap page. Note: this ID should be no more than 64 characters in length.

  Tip

Setting-up an XML or JSON file for import? Find examples of XML and JSON source formats in our article Importing Data with Web Services.

Upload Dataset

Now you will be able to upload a sample file using the Source Format that you created so you can define the Remap settings.

  1. Select Database on the top navigation bar and select Upload Dataset.
  2. Select the Source Format that you created from the File Format list.  
    • Source Formats will appear within the list of Predefined Formats
  3. Add a sample file, then click Upload.
    • The sample file should exactly match the format that you intend to send Slate.
    • Slate will process the file at this time.  When it has completed, the status on the Upload Dataset main page will change from Pending to Completed or Awaiting Activation.
  4. When the file has uploaded, clicking on the name of the Source Format will link to the Remap stage where source fields can be mapped to destination fields within Slate.
  5. When you are satisfied with how the file is mapped, go back to the new Source Format in the Admin Tool and set the Remap Active setting to Active.  Slate will then process any files that were uploaded on or after the Remap Effective date using the remap settings that you defined.  Any additional files that are uploaded to Slate using the new Source Format will also use the Source Format and remap settings that you configured using these steps.
Was this article helpful?
9 out of 15 found this helpful