# Data Import

# Target Audience

The target audience for this document may include:

  • Developers and implementers of this system
  • User data administrators preparing to import data

# Feature Description

This system follows the principle that everything is data. All types of data support importing through CSV files, with operations such as creation, update, and deletion supported during import.

TIP

The CSV parsing format of this system follows the RFC 4180 (opens new window) standard.

# Data Import Methods

Currently supported import methods include:

# Auto import upon system startup

The system's data supports automatic import of a set of CSV files from a configured server directory when the system starts, refer to Configure Import File Path Development for details

# Import from domain list page

  • For each domain model, manual import of CSV files is supported through the system interface.

# Import using "Reload Seed Data"

  • On the domain model list page, use the "Reload Seed Data" action to reload seed data from configured seed data folder

# Import using "Import Seed Data Package"

  • On the domain model list page, use the "Import Seed Data Package" action to upload the packaged zip file containing .CSV and other referenced files.

The platform development environment comes with a gradle task called packageSeedData to generate such a zip file

After import, you can view the import process records on the system interface.

TIP

Seed Data generally refers to a set of initial data inserted into the database during application initialization or deployment. This data is used to provide the basic data structure and reference data required by the application to ensure normal operation.

Specifically, the design principle of the Muyan low-code platform is that everything, including customized code, is data, so the entire business system is built entirely from seed data.

The range of supported import objects includes:

  • All object definitions in the system, such as users, user groups, roles, and other permission-related data, as well as organizations, contracts, contract lines, contract line definitions, etc. This also includes domain model definitions, domain model field definitions, form definitions, dynamic fields, dynamic field definitions, and all other metadata.

If you prefer to learn through an example, you can directly check A Complete Example.

# Feature Activation Development

This feature is enabled by default and does not require configuration to activate. However, the normal operation of this feature depends on the configuration of the import file path, as described below.

# Configure Import File Path Development

You need to configure the seedData.folder option in the application.yml file. This option specifies the root path for reading CSV files. Below is an example configuration for the development environment:

development:
    seedData:
      folder: ${SEED_DATA_FOLDER:/app/data}
1
2
3

TIP

This configuration will prioritize reading from the environment variable SEED_DATA_FOLDER

The directory structure is briefly described as follows. The first-level subdirectories of this directory correspond to different system environments. For example, all CSV files to be imported in the development environment should be placed in the development subdirectory. The currently predefined environments are listed as follows:

In the directories of each environment, seed files for different tenants are stored in subdirectories for different tenants.

  • development
  • testing
  • staging
  • production

In each environment's data directory, there will be an Organization.csv file that stores the list of top-level tenants to be imported first into the system. This file will import the top-level organization and tenant identifier for each tenant, serving as the basis for importing all tenant-specific data.

├── README.md
├── attachments --> Generally, attachments that need to be imported are placed in this directory, such as logos used by DynamicTheme
│   ├── README.md
│   ├── background.jpeg
│   ├── favicon.png
│   └── logo.png
├── css --> Generally, customized css files for DynamicTheme are placed in this directory
│   └── README.md
├── development --> (development environment) 
├── groovy --> Generally, customized DynamicLogic source code is placed in this directory
├── review --> (review environment)
│   ├── Organization.csv --> (List of all top-level tenants in the system to be imported first)
│   ├── README.md
│   ├── jiayu --> Corresponds to the tenant name of the top-level tenant
│   │   ├── DomainColumnClientSideTypeConfig.csv
│   │   ├── DynamicAction.csv
│   │   ├── DynamicActionDomainClass.csv
│   │   ├── DynamicActionDomainClass_jiayu.csv
│   │   └── UserGroup.csv
│   └── muyan --> Corresponds to the tenant name of the top-level tenant
│       ├── Group.csv
│       ├── GroupRole.csv
│       ├── Organization.csv
│       ├── RequestMap.csv
│       ├── Role.csv
│       ├── User.csv
│       └── UserGroup.csv
└── tree.txt
    
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29

# File Update Determination

Before importing, for each CSV file, the system will compare the md5 code of the CSV file to be imported with the md5 code of the previously imported CSV file. If the md5 codes of the two files are the same and the status of the previous import was successful, the system will skip this import.

# Data Preparation and Import

The following sections describe the content related to CSV file data preparation.

# File Names

All CSV files should start with the class name of the object, without the package definition. For example, CSV import files for tech.muyan.Contract and tech.muyan.security.GroupRole objects can be named Contract.csv and GroupRole.csv, or Contract_2019.csv, GroupRole_security.csv, etc. The .csv suffix should be in lowercase.

# Import Order and Dependencies

For a set of data with associated relationships, the system will automatically identify the dependencies between them based on the association relationships in the model definition, and automatically import the dependent data before the data that depends on it during import.

If special dependency relationships need to be handled, please define them using loadAfter in the extInfo of the domain definition.

# Dynamic Domain Models

  • For detailed information on specifying the Seed Data import order in dynamic domain model definitions, refer to the Domain Model Metadata section.

# Legacy Domain

  • Here's an example of defining the loadAfter property in Legacy Domain definitions:

The following is an example from the DynamicIntegration Domain definition:

  // Indicates that Organization, DynamicLogic, and User objects need to be imported before importing DynamicIntegration
  static loadAfter = [Organization, DynamicLogic, User]
1
2

If a certain type of object should be imported first, to prevent the system from parsing its dependencies from its Domain definition, the loadAfter property can be set to an empty array. Here's an example from the DynamicLogic Domain definition:

  static loadAfter = []
1

# Header Row

The first row of all CSV files to be imported should be the header row. The header row describes the structure of the data in the CSV file. Below is an example of a header row for a CSV file importing a Contract object and its explanation:

name(*),title,effectiveDate,ownerOrganization.name,Tags(#),Purpose(#),Total Amount(#),Seal Type(#),Contract Quantity(#),language,DELETE_FLAG
1
  • For basic fields defined in the model, the column name is the field name in the model definition.
  • For dynamic fields customized in the interface, the column name is the value of the Label field displayed on the object's interface.
  • Columns with the suffix (*), such as name(*), are query fields, as described in the Query Fields section below.
  • Columns containing ., such as ownerOrganization.name, are associated object fields, as described in the Associated Object Queries section below.
  • Columns with the suffix (#), such as Purpose(#), are dynamic fields, as described in the Dynamic Fields section below.
  • The column named DELETE_FLAG is a special column. Its value can be Y or N. When it's Y, it indicates that the system needs to delete the object found through the query fields during import.
  • Columns with the suffix (F), such as code(F), indicate that the content of this column will be read from a file. The starting directory for the file is the directory configured by seedData.folder.
  • If the field type defined in the Domain for a certain column is StorageFieldValue, then the value in that column should be the file path of the attachment to be imported.

WARNING

Internally in the system, this field name is saved in the label field of the DynamicFieldInstance object. Be careful to distinguish this from the label field of the DynamicFieldDefinition field.

# Comment Lines

Lines starting with a semicolon (;) are comment lines and will be ignored during import. Below is a sample import CSV file containing comments. All lines starting with ; are comment lines and will not affect the import process.

name(*)
; Normal user, only have permission to login
USER
; Admin user
ADMIN
; Developer user
DEVELOPER
; Biz user, can operate on biz objects(contract)
BIZ_USER
; Biz Admin
BIZ_ADMIN
1
2
3
4
5
6
7
8
9
10
11

# Query Fields

To support creating or updating existing records through CSV files, the system supports querying existing records through query fields. All columns in the header of the import file with the suffix (*) will be identified by the system as query fields. The system will use strict equality matching to query existing records from the system based on the values of all query fields, and perform the following corresponding processing:

  • If the query result is empty, the system will attempt to create a new record and save it, and record the operation for this row of data in the import record as creation.
  • If the query result is not empty and only one record exists, the system will attempt to update that record, and record the operation for this row of data in the import record as update.
  • If the query result is not empty and more than one record exists, the system will skip the import of this row of data, and record the operation for this row of data in the import record as skipped.

TIP

If there are no query fields in the CSV file being imported, it means that the update and delete functions are not available, and all records will be processed as new creations during import.

# Associated Object Queries

The system supports querying associated objects through associated object fields and associating the queried associated objects with the objects to be imported.

For example, the field ownerOrganization.name mentioned above indicates that when importing the ownerOrganzation field of this object (this field is an Organization object), for each row of data being imported, it uses the name attribute of the Organization type to query existing Organization object records based on the value in that column of the CSV file, and associates the queried records with the Contract object to be imported. Depending on the query results, the specific handling may be as follows:

  • If the queried associated object does not exist, the import of that row is terminated, and the operation for this row of data in the import record is marked as failed.
  • If there is one and only one queried associated object, the import of that row continues, with the final result depending on the operations of other columns.
  • If there are multiple queried associated objects, the import of that row is terminated, and the operation for this row of data in the import record is marked as skipped.

TIP

Associated fields can also be used as query fields

WARNING

When importing an object, whether creating or updating, its dependent associated objects must already exist in the system, otherwise the import of that object will fail

# Dynamic Fields

When importing data, the system also supports importing values for dynamic fields created on the interface. All dynamic fields in the header row of the import CSV file should end with (#). The saving of dynamic field values occurs after the main object has been saved. For example, in the CSV file mentioned above, for the column titled Purpose(#), during import, the system will query the contract dynamic field with the Label "Purpose" and perform data import.

If the field is a selection field, then during import, the system will check whether the value in that column of the CSV file is within the range of options for that field. If its value is not within the optional range, it will terminate the import of that row of data and mark the import of that row as failed.

System Limitation

Because the saving of dynamic fields and the saving of the main object are not in the same transaction, it's possible for the main object to be saved successfully but the dynamic fields to fail to save. In this case, the state of the main object and the CSV file will become out of sync.

# Escaping Special Characters

  • If the content of a column contains an English comma (,), that column needs to be wrapped in English quotation marks (").
  • If the content of a column contains English quotation marks, and that column is wrapped in quotation marks, then it needs to be escaped with \.

# Whitespace Handling

When reading data from CSV files, the system will automatically remove whitespace characters before and after each column.

WARNING

If there are whitespace characters before or after columns in the CSV file, that row will be identified as needing update each time it's imported, and after the update, the value saved in the database will not include whitespace characters before or after.

# Object Deletion

CSV file import supports deleting existing data, implemented in the following way:

  • Add a DELETE_FLAG column in the header row of the CSV file
  • For rows to be deleted, set the value of the DELETE_FLAG column to Y in the CSV file

WARNING

The delete operation can only work correctly if query fields that can uniquely identify a record are specified in the header row of the CSV file.

For rows marked for deletion, here is the related error handling logic:

  • If there are 0 existing data records found by the query, the system will skip the deletion processing for this row.
  • If there are multiple existing data records found by the query, the system will skip the deletion processing for this row.
  • If there are no query fields in the imported CSV file, the system will skip the processing of all rows marked for deletion.

# Data Type Mapping

The following lists the types currently supported for import by the system and their data types after import

Data Type Data Type After Import
Boolean java.lang.Boolean
String java.lang.String
datetime java.time.LocalDateTime
date java.time.LocalDate
decimal java.math.BigDecimal
integer java.lang.Integer
long java.lang.Long
Enum java Enum
httpMethod org.springframework.http.HttpMethod
java class name Domain Object
StorageFieldValue tech.muyan.storage.StorageFieldValue

# Boolean

For Boolean type fields, the following lists the value correspondences. If the field value in the CSV file is not within the following range, the system will mark the import of that row as failed

Value in CSV file Value after import
Y, y, Yes, YES, true, TRUE, T, t, 是, 1 true
N, n, No, NO, false, FALSE, F, f, åĻ, 0 false

# Date and Time

For date type fields, the following is a list of supported date/time formats. The letters in the list represent the relevant parts of time and date. You can refer to SimpleDateFormat (opens new window) for more details.

 "yyyyMMdd" 
 "dd-MM-yyyy"
 "yyyy-MM-dd"
 "MM/dd/yyyy"
 "yyyy/MM/dd"
 "dd MMM yyyy"
 "dd MMMM yyyy"
 "yyyyMMddHHmm"
 "yyyyMMdd HHmm"
 "dd-MM-yyyy HH:mm"
 "yyyy-MM-dd HH:mm"
 "MM/dd/yyyy HH:mm"
 "yyyy/MM/dd HH:mm"
 "dd MMM yyyy HH:mm"
 "dd MMMM yyyy HH:mm"
 "yyyyMMddHHmmss"
 "yyyyMMdd HHmmss"
 "dd-MM-yyyy HH:mm:ss"
 "yyyy-MM-dd HH:mm:ss"
 "MM/dd/yyyy HH:mm:ss"
 "yyyy/MM/dd HH:mm:ss"
 "dd MMM yyyy HH:mm:ss"
 "dd MMMM yyyy HH:mm:ss"
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23

# Enumeration Types

If the enumeration type defines a get method that can query the enum object from its display value, the system supports importing enum data columns in CSV files where the column value is its interface display value. Otherwise, the system can only support importing enum data where the value in the CSV exactly matches its internal value.

# Attachment Types

For attachment type columns, when preparing the CSV file, the corresponding column should be filled with the relative file path calculated from the configured seedData.folder directory. During import, the system will import this column as a StorageFieldValue object in the system and establish the necessary object associations.

# Root Directory Fields

If a field is of type tech.muyan.Organization, you can use the placeholder $ROOT_ORG$ to specify a reference to the current tenant's root organization.

TIP

This special handling is to enable quick copying of system default related seed files between different tenants without needing to replace organization names for each tenant.

# Support for New Data Type Mappings Development

All data type mappings during import are defined in tech.muyan.importexport.converter.ConverterFactory. This class provides a utility method registerConverter for registering data mapping implementations for CSV import in the system. For example:

ConverterFactory.registerConverter(type -> type != null && type.contains(".enums."), new EnumConverter());
1
  • The above method registers a data mapping implementation named EnumConverter.
  • The condition for enabling this implementation is specified using a predictor that receives a character type parameter. This parameter is the type field in the metadata returned by calling the DomainMetaController.get method on the domain model class to be imported.
  • The predictor in the above example limits the condition for this type mapping to: the type attribute of the field is not empty and contains the part .enums..

# View Import Records

The system's input import records are saved in ImportRecord, which can be viewed by users with GET permission for the /ImportRecord address. By default, users with ROLE_ADMIN and ROLE_DEVELOPER roles have permission to view import records.

The following information is saved in the import record:

  • Type of imported object
  • MD5 code of the imported CSV file
  • Import status: can be
    • Not started
    • Successful
    • Failed
    • Partially successful
    • In progress
  • Import start time
  • Import end time
  • Title row of the imported CSV file
  • List of inserted object ids
  • List of updated object ids
  • List of deleted object ids
  • Original information of failed rows in CSV
  • Original information of skipped rows in CSV
  • Import log records
  • Number of skipped rows
  • Number of failed rows
  • Number of successfully updated rows
  • Number of successfully created rows
  • Number of successfully deleted rows

# A Complete Example

Below is an example of a complete CSV file that can be imported into the system, and based on this, the related settings and concepts mentioned above are explained.

CSV file name: Contract.csv

CSV file content example:

name(*),title,effectiveDate,total,ownerOrganization.name(*),contractStatus,Tags(#),Purpose(#),Total Amount(#),Seal Type(#),Contract Quantity(#),Is Important(#),language,lines.name[:],DELETE_FLAG
MuYan Technology 2020 Laptop Purchase Contract,Laptop Purchase Contract,2020-01-01 00:00:00,45000,MuYan Technology,Active,"[\"One-time Contract\",\"Low Risk\"]","Purchase laptops for development use",35000,Official Seal,3,Yes,contracts\language_2020.docx,"[line1:line2]",N
MuYan Technology 2019 Laptop Purchase Contract,Laptop Purchase Contract,2020-01-01 00:00:00,45000,MuYan Technology,Active,"[\"One-time Contract\",\"Low Risk\"]","Purchase laptops for development use",35000,Official Seal,3,Yes,contracts\language_2019.docx,"[line3:line4]",Y
1
2
3

# File Description

  • Mapping from file name to object: The file name is Contract.csv, so the system will query all domain models defined in DomainClass, find the model definition named Contract without a package, and decide that the import object corresponding to this file is Contract.

  • Existing record lookup: The system will combine the fields ending with (*) in the title row: name and ownerOrganization.name, use equal matching to query existing records in the system. If data is found based on these two pieces of information, the existing record will be updated; otherwise, a new record will be created.

  • Associated object query and dependency ordering: From the title row, we can see that the ownerOrganization.name field contains the . character, indicating that this field is an associated object field.

    • This field is a tech.muyan.Organization object in the system, so if the system simultaneously detects the existence of an Organization.csv file, it will automatically sort and import the Organization.csv file before the Contract.csv file.
    • From ownerOrganization.name, it can be determined that when importing, the system will use its name field to match and query when looking up the Organization object associated with Contract, and associate the found Organization object with the Contract object to be imported.
  • Dynamic fields: From the title row, we can see that the fields Tags, Purpose, Total Amount, Seal Type, Contract Quantity, and Is Important end with (#), so they are all dynamic fields. The system will look for dynamic fields of the Contract object defined in the system with labels matching these values, and update the corresponding dynamic field values accordingly.

# Field Descriptions

  • effectiveDate

    • This field is a date type, and its value 2020-01-01 00:00:00 conforms to the format definition of yyyy-MM-dd HH:mm:ss, so the system can parse and import this data normally.
  • contractStatus

    • This field is an enumeration type data, its value Active is the display value of the enumeration type ACTIVE. This requires the system to implement a get method to reverse lookup its defined value ACTIVE from its display value Active. If there is no method implemented to reverse lookup the enum value from the display value, then in the CSV file, ACTIVE must be filled in for the system to recognize and import the data normally.
  • Tags

    • Because the value of this field contains the , character, it needs to be wrapped in English quotation marks ".
    • Because the value of this field contains English quotation marks ", this character needs to be escaped using the escape character \.
    • In the system, the Tags field is defined as a multiple-choice field. From the value in the second row, we can see that the value of the Tags field is a valid JSON string. This contract's JSON string will be saved to the jsonValue column of the dynamic field.
    • During import, the system will verify that all options in the JSON string of this field (in this case: One-time Contract, Low Risk) are predefined, valid options in the system.
  • Is Important

  • language

    • This field is a StorageFieldValue field, so during import, the system will import the files corresponding to the file path
      contracts\language_2019.docx in this column as StorageFieldValue object, which is the object represent an attachment in our platform, a StorageFieldValue object will be created, and the association relationship between this object and the main Contract object will be established.
  • lines.name[:]

    • This indicates that this field is a list field, used to import lines field objects. The system will look up the corresponding ContractLine objects based on the value of the name attribute of each item in this field, and establish the association relationship between them and the main Contract object.
    • The list separator used here is :. If lines.name[] is used directly as the column header, it means using the default separator comma (,).
    • It should be noted that before importing, the objects associated with the lines field must already exist, otherwise it will cause the import to fail.
  • DELETE_FLAG

    • This field indicates whether to perform a delete operation for a certain row of data. According to the value of this field in each row of the above example file, the data in the first row performs an update or create operation, and the data in the second row performs a delete operation.

TIP

If the string to be imported contains English quotation marks, use another English quotation mark or backslash for escaping. For example, "" or \" represents an English quotation mark.

Last Updated: 9/17/2024, 3:08:20 AM