  • 04-1a Have you explained the data attributes before and after cleansing?
    • Medical staff with appropriate knowledge must be involved in the data cleansing process, and data standardization and cleansing based on domain knowledge are essential.

    • Systematic data curation and the establishment of data governance must be carried out in this process, and the data attributes before and after cleansing must be specified so that stakeholders can use the data properly.

    • This explanation requires information about the cleansing tools and standards, such as data quality assurance, data management optimization, the objective of building the data, and data type analysis. The following are examples of data cleansing standards for each data type.
    ✓ Image data: Image size, aspect ratio, resolution, imaging equipment, personal data, intellectual property rights, etc.
    ✓ Text data: Amount of text, grammatical accuracy, appropriateness of the content, relevance to the topic, etc.
    ✓ Audio data: Volume, pronunciation accuracy, noise and static, inaudible segments (judged against an acceptance range), personal data, copyright, etc.
    ✓ Video: Video resolution, video loss, personal data, political opinions, insults directed at a specific person, etc.
    ✓ 3D: Acquisition of point cloud, optimization of mesh data, production of the standard model, etc.
    ✓ Sensor: Unit, missing value, recording time of the sensor, etc.
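
    As one illustration of how such standards might be applied, the sensor criteria above (unit, missing value, recording time) can be sketched as a simple check. This is a minimal sketch under stated assumptions, not a prescribed tool: the record layout (dictionaries with `value`, `unit`, and `time` keys), the ISO 8601 time format, and the function name `check_sensor_records` are all hypothetical.

```python
from datetime import datetime


def check_sensor_records(records, expected_unit):
    """Return (index, problem) pairs for records that fail a cleansing check.

    Assumed (hypothetical) record layout: a dict with the keys
    "value", "unit", and "time" (an ISO 8601 string, all with or all
    without a time zone offset).
    """
    problems = []
    previous_time = None
    for i, rec in enumerate(records):
        # Missing value check.
        if rec.get("value") is None:
            problems.append((i, "missing value"))
        # Unit check against the unit agreed on for the dataset.
        if rec.get("unit") != expected_unit:
            problems.append((i, "unexpected unit"))
        # Recording time check: must parse and must not go backwards.
        try:
            current_time = datetime.fromisoformat(rec.get("time", ""))
        except (TypeError, ValueError):
            problems.append((i, "invalid recording time"))
            continue
        if previous_time is not None and current_time < previous_time:
            problems.append((i, "recording time out of order"))
        previous_time = current_time
    return problems


records = [
    {"value": 36.5, "unit": "C", "time": "2024-01-01T10:00:00"},
    {"value": None, "unit": "C", "time": "2024-01-01T10:01:00"},
    {"value": 98.0, "unit": "F", "time": "2024-01-01T09:59:00"},
]
for index, problem in check_sensor_records(records, expected_unit="C"):
    print(index, problem)
```

    Returning the flagged records rather than silently dropping them keeps a trace of what was changed, which supports describing the data attributes before and after cleansing.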

    • In addition, some data are cleansed again when datasets are created in order to enhance data quality. Below are examples of training data attributes after cleansing that can be explained.
    ✓ Selection and processing of data: Duplicate prevention, removal of abnormal data, sampling, etc.
    ✓ Statistical explanation: Number of training samples per class, number of subjects, etc.
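
    The two items above can be sketched together: remove duplicates, then report the number of samples per class and the number of subjects. This is only an illustrative sketch; the sample layout (dictionaries with `subject_id`, `content_hash`, and `label` keys) and the function name `summarize_after_cleansing` are assumptions made for the example, and a real dataset may need a different definition of "duplicate".

```python
from collections import Counter


def summarize_after_cleansing(samples):
    """Drop duplicate samples, then compute simple descriptive statistics.

    Assumed (hypothetical) sample layout: a dict with the keys
    "subject_id", "content_hash", and "label". Two samples count as
    duplicates when both subject_id and content_hash match.
    """
    seen = set()
    unique = []
    for sample in samples:
        key = (sample["subject_id"], sample["content_hash"])
        if key in seen:
            continue  # keep only the first occurrence
        seen.add(key)
        unique.append(sample)

    per_class = Counter(sample["label"] for sample in unique)
    n_subjects = len({sample["subject_id"] for sample in unique})
    return unique, dict(per_class), n_subjects


samples = [
    {"subject_id": "s1", "content_hash": "a1", "label": "normal"},
    {"subject_id": "s1", "content_hash": "a1", "label": "normal"},  # duplicate
    {"subject_id": "s2", "content_hash": "b7", "label": "abnormal"},
    {"subject_id": "s3", "content_hash": "c3", "label": "normal"},
]
unique, per_class, n_subjects = summarize_after_cleansing(samples)
print(len(samples), "samples before,", len(unique), "after")
print("per class:", per_class)
print("subjects:", n_subjects)
```

    Reporting the counts before and after deduplication, alongside the per-class and per-subject figures, gives stakeholders a compact statistical description of the cleansed training data.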