[Python] DICOM file de-identification example (basic)
Cat.8
2020. 5. 28. 13:17
Requirements: Python 3.7 or later
Install the required packages
pip install pydicom
pip install tqdm
Code
import os
import pydicom
# from tqdm import tqdm_notebook
from tqdm import tqdm

# Collect the absolute paths of all .dcm files under the current directory.
def get_file_list():
    list_full = []
    # os.walk('.') is portable; the original '.\\' only worked on Windows.
    for path, _, files in os.walk('.'):
        for each_file in files:
            if each_file.lower().endswith('.dcm'):
                list_full.append(os.path.abspath(os.path.join(path, each_file)))
    return list_full

# Main de-identifier. Pass opt_each_file=1 to print every processed file.
def de_identifier(opt_each_file):
    for filename in tqdm(get_file_list()):
        try:
            Metadata = pydicom.dcmread(filename)
        except Exception as exc:
            # Report and skip unreadable files instead of silently stopping.
            print(f'de_identifier // file reading error: {filename} ({exc})')
            continue
        try:
            # de-identify
            Metadata.PatientName = 'Anonymized'
            # Date, code-string and age elements have a fixed value format,
            # so they are blanked here rather than set to free text.
            Metadata.PatientBirthDate = ''
            Metadata.PatientSex = ''
            Metadata.OtherPatientIDs = 'Anonymized'
            Metadata.PatientAge = ''
            Metadata.RequestingPhysician = 'Anonymized'
            Metadata.InstitutionName = 'Anonymized'
            Metadata.InstitutionAddress = 'Anonymized'
            Metadata.ReferringPhysicianName = 'Anonymized'
            Metadata.StationName = 'Anonymized'
            # Keywords are case-sensitive: 'PhysiciansofRecord' would not touch the element.
            Metadata.PhysiciansOfRecord = 'Anonymized'
            # Overwrites the original file in place.
            Metadata.save_as(filename)
            # TODO - revive
            # sql_query(True)
            if opt_each_file == 1:
                print(f'[complete] {filename}')
        except Exception as exc:
            # TODO - revive
            # sql_query(False)
            print(f'de_identifier error: {filename} ({exc})')
    print('de_identified.')

# run
de_identifier(0)
# Call flow: de_identifier() [run] -> get_file_list() [collect the .dcm file list] -> de_identifier() [replace each attribute, then save the file]
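The "replace each attribute" step assigns every keyword unconditionally, and on a pydicom Dataset an assignment creates the element if the file did not have it. A minimal sketch of a variant that only overwrites attributes that already exist is shown below. The function name `scrub_present`, the two keyword lists, and the choice to blank the date, sex and age fields are assumptions for illustration, not part of the original script. The keyword spellings follow the pydicom data dictionary (for example `PhysiciansOfRecord`). The demo uses a plain `SimpleNamespace` as a stand-in; a Dataset exposes its elements as attributes in the same way, so the function should accept one too, but that is not exercised here.

```python
from types import SimpleNamespace

# Keywords from the de-identification script; the text/blank split is an assumption.
TEXT_KEYWORDS = [
    'PatientName', 'OtherPatientIDs', 'RequestingPhysician', 'InstitutionName',
    'InstitutionAddress', 'ReferringPhysicianName', 'StationName', 'PhysiciansOfRecord',
]
BLANK_KEYWORDS = ['PatientBirthDate', 'PatientSex', 'PatientAge']


def scrub_present(ds, placeholder='Anonymized'):
    """Overwrite only the attributes that already exist on ds.

    Returns the list of attribute names that were changed.
    """
    changed = []
    for keyword in TEXT_KEYWORDS:
        if hasattr(ds, keyword):
            setattr(ds, keyword, placeholder)
            changed.append(keyword)
    for keyword in BLANK_KEYWORDS:
        if hasattr(ds, keyword):
            setattr(ds, keyword, '')
            changed.append(keyword)
    return changed


# Hypothetical header used only for the demonstration.
header = SimpleNamespace(PatientName='Hong^Gildong', PatientAge='045Y', Modality='CT')
changed = scrub_present(header)
print(changed)  # ['PatientName', 'PatientAge']
```

Returning the list of changed names makes it easy to log what was actually modified per file, for example next to the `[complete]` message.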
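The script (de_identifier) de-identifies the headers of all .dcm files in the folder it is run from (the current working directory) and in every subfolder, overwriting each file in place.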
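The recursive folder search can also be sketched with the standard-library `pathlib` module, which avoids hand-built path strings and works the same on Windows, macOS and Linux. The function name `find_dicom_files` and the case-insensitive suffix match are assumptions for illustration. The demo builds a throwaway directory tree with empty placeholder files, so no real DICOM data is needed.

```python
import tempfile
from pathlib import Path


def find_dicom_files(root='.'):
    """Return sorted absolute paths of files under root with a .dcm suffix (any case)."""
    root_path = Path(root).resolve()
    return sorted(
        p for p in root_path.rglob('*')
        if p.is_file() and p.suffix.lower() == '.dcm'
    )


# Demo on a temporary tree: two .dcm files (one nested, one upper-case) and one other file.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    (base / 'study' / 'series').mkdir(parents=True)
    (base / 'a.dcm').touch()
    (base / 'study' / 'series' / 'b.DCM').touch()
    (base / 'notes.txt').touch()
    found = [p.relative_to(base).as_posix() for p in find_dicom_files(base)]

print(found)  # ['a.dcm', 'study/series/b.DCM']
```

Taking the root folder as a parameter, instead of always walking the current directory, lets the same function be pointed at a copy of the data, which is safer when files are overwritten in place.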