  • 04-2 Is data provenance documented and managed?
    Determine applicability: Consider this question when establishing a dataset for AI algorithm or model development, or when using an open-source dataset, and determine whether the requirement has been satisfied.

    • Unlike general sectors, the healthcare sector often works under narrow constraints, such as a specific disease, a specific region, or collaboration with particular medical institutions. To ensure data reliability, data suited to the product’s purpose is therefore usually collected directly, and the use of open-source datasets is minimal.

    • A medical institution can ensure data reliability by submitting a research proposal to the Institutional Review Board (IRB) and obtaining its approval to use patients’ medical data for data collection.

    • Open-source datasets can be used in some cases, but users may discover errors while working with the data, and the dataset is then modified or rebuilt, which changes its version.

    • To respond to AI model problems caused by datasets, the provenance of the datasets used in training, the time of deployment, and the version of any open-source dataset must be clearly documented and managed.
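
The items listed above (dataset provenance, open-source dataset version, and time of deployment) could be captured in a small provenance record. The following is a minimal sketch, not a prescribed format: the class `DatasetProvenance`, its field names, and the helper functions are all hypothetical, and the SHA-256 fingerprint is an added assumption used here only to detect when a dataset has been modified or rebuilt.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class DatasetProvenance:
    """Hypothetical provenance record for one dataset used in training."""
    name: str
    source: str                             # e.g. collecting institution or open-source origin
    version: str                            # version label of the dataset
    sha256: str                             # fingerprint of the dataset contents
    irb_approval_id: Optional[str] = None   # assumption: an identifier for the IRB approval
    model_deployed_at: Optional[str] = None # deployment time, e.g. an ISO 8601 string


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the dataset contents."""
    return hashlib.sha256(data).hexdigest()


def make_record(name: str, source: str, version: str, data: bytes,
                irb_approval_id: Optional[str] = None,
                model_deployed_at: Optional[str] = None) -> DatasetProvenance:
    """Build a provenance record, computing the fingerprint from the data."""
    return DatasetProvenance(name, source, version, fingerprint(data),
                             irb_approval_id, model_deployed_at)


def has_changed(record: DatasetProvenance, current_data: bytes) -> bool:
    """True if the current contents no longer match the recorded fingerprint."""
    return fingerprint(current_data) != record.sha256


def to_json(record: DatasetProvenance) -> str:
    """Serialize the record so it can be stored alongside the trained model."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


if __name__ == "__main__":
    # Illustrative values only; none of these come from a real dataset.
    original = b"patient_id,label\n1,0\n2,1\n"
    record = make_record("example-dataset", "open-source", "1.0", original,
                         model_deployed_at="2024-01-01T00:00:00Z")
    print(to_json(record))

    # A later rebuild of the dataset corrects one label.
    rebuilt = b"patient_id,label\n1,0\n2,0\n"
    print("changed since training:", has_changed(record, rebuilt))
```

Storing such a record with each trained model would make it possible to tell, when a problem is reported, which dataset version the model was trained on and whether the dataset has changed since then.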