Research Paper
Industry
Differentially private upsampling for enhanced anomaly detection in imbalanced data
In real-world applications, anomaly detection tasks are critically important. For example, fraud detection in the financial domain and disease diagnosis in the medical domain require highly accurate predictions, as errors can lead to severe consequences. These tasks often rely on sensitive personal data, making it necessary to apply privacy-preserving techniques. However, applying privacy-preserving techniques directly degrades performance. To mitigate this issue, the minority class in an imbalanced dataset can be upsampled to improve balance. In this paper, we propose a differentially private upsampling method using a kernel-based support function for imbalanced datasets. The proposed method employs kernel support vector domain description to estimate the distribution of minority class data under differential privacy constraints, generating synthetic instances based on gradient methods. Additionally, we propose a filtering process that leverages the support function of the majority class data to refine the generated samples without additional privacy loss. Experimental results on real-world datasets demonstrate that the proposed method maintains robust privacy guarantees and achieves superior performance in minority class metrics, comparable to non-private methods.
Co-authors: Yujin Choi, Jinseong Park, Youngjoo Park, Jaewook Lee
2026
Finance
Advancing financial privacy: A novel integrative approach for privacy-preserving optimal portfolio
We propose a new privacy-preserving mean-variance optimization model, merging Multi-Party Computation (MPC) with Homomorphic Encryption (HE) through an innovative method. Empirical tests show our model outperforms existing ones in privacy optimization, overcoming limitations regarding complex constraints. We highlight three findings: our model (i) outperforms others in privacy-preserving utility maximization with no-short-selling constraint; (ii) remains effective under complex box constraints, whereas the existing model entirely collapses; and (iii) achieves close alignment with the optimal portfolio from an economic perspective, providing high computational efficiency. It proves to be an effective solution for privacy optimization, a key aspect in mitigating ESG risks.
Co-authors: Hyungjin Ko, Jaewook Lee
2026
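For orientation, the mean-variance problem with the no-short-selling and box constraints mentioned in the abstract above can be written in its textbook (Markowitz) form. This is not the paper's MPC/HE protocol; the symbols $w$ (portfolio weights), $\mu$ (expected returns), $\Sigma$ (return covariance), $\gamma$ (risk aversion), and the bounds $\ell, u$ are notation assumed here for illustration.

```latex
\begin{aligned}
\max_{w \in \mathbb{R}^n} \quad & \mu^\top w - \frac{\gamma}{2}\, w^\top \Sigma\, w \\
\text{s.t.} \quad & \mathbf{1}^\top w = 1, \\
& w \ge 0 && \text{(no short selling)}, \\
& \ell \le w \le u && \text{(box constraints)}.
\end{aligned}
```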
Industry
Homomorphic encryption-based fault diagnosis in IoT-enabled industrial systems
In IoT-enabled industrial environments, ensuring the privacy and security of operational data is paramount for fault diagnosis systems. This study presents a novel framework that seamlessly integrates homomorphic encryption (HE) with deep learning to achieve secure and efficient fault diagnosis for industrial bearings. By performing computations directly on encrypted sensor data, the framework guarantees full data confidentiality throughout the diagnostic process without requiring decryption. Key technical contributions of this work include the development of a minimax polynomial approximation for ReLU activations, which enhances diagnostic accuracy while preserving efficiency, and the design of an efficient 1D convolution method that combines two existing HE convolution techniques for optimal performance. Additionally, the framework incorporates frequency-domain optimizations using the Discrete Fourier Transform (DFT), which significantly enhance processing efficiency. The proposed model was trained on the CWRU bearing dataset and validated on a private dataset, achieving a diagnostic accuracy of 95.92%, comparable to state-of-the-art models operating on plaintext data. Furthermore, the DFT-based optimizations reduced inference time nearly threefold while maintaining superior accuracy, underscoring the framework’s potential to provide secure and efficient fault diagnosis for industrial applications.
Co-authors: Hoki Kim, Youngdoo Son
2025
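Two ideas named in the abstract above can be illustrated in plaintext NumPy. The sketch below is not the paper's encrypted pipeline: it uses a least-squares Chebyshev fit as a stand-in for the minimax ReLU approximation, and an ordinary FFT to show the convolution theorem that underlies DFT-based 1D convolution. The function names, the polynomial degree, and the input interval are assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def fit_relu_polynomial(degree=8, lo=-1.0, hi=1.0, n_points=2001):
    """Least-squares Chebyshev fit of ReLU on [lo, hi].

    HE schemes evaluate additions and multiplications, so activations are
    replaced by polynomials. A true minimax fit (for example via the Remez
    algorithm) would minimise the worst-case error; least squares is used
    here only because NumPy provides it directly.
    """
    x = np.linspace(lo, hi, n_points)
    return Chebyshev.fit(x, np.maximum(x, 0.0), degree)


def circular_conv_direct(x, h):
    """Circular convolution of two equal-length signals from the definition."""
    n = len(x)
    return np.array(
        [sum(x[j] * h[(i - j) % n] for j in range(n)) for i in range(n)]
    )


def circular_conv_dft(x, h):
    """Same result via the DFT: a pointwise product in the frequency domain."""
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real
```

The DFT route replaces the sliding-window sum with one elementwise multiplication, which is the kind of structure that frequency-domain optimizations exploit.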
Industry
Privacy-preserving inference resistant to model extraction attacks
Privacy-Preserving Deep Learning (PPDL) has been successfully applied in the inference phase to preserve the privacy of input data. However, PPDL models are vulnerable to model extraction attacks, in which an adversary attempts to steal the trained model itself. In this paper, we propose a new defense method against model extraction attacks that is specifically designed for PPDL based on secure multi-party computations and homomorphic encryption. The proposed method confounds inference queries for out-of-distribution data by using a fake network with the target network while optimizing computational efficiency for PPDL environments. Furthermore, we introduce Wasserstein regularization to ensure that the fake network’s output distribution is indistinguishable from the target network, thwarting adversaries’ attempts to discern any discrepancies within the PPDL framework. The experimental results demonstrate that our defense method attains a good accuracy-security trade-off and is effective against a wide range of attacks, including adaptive attacks and transfer attacks. Our work contributes to the field of PPDL by providing an extended perspective to improve the algorithm’s security and reliability beyond privacy.
Co-authors: Yujin Choi, Jaewook Lee, Saerom Park
2024
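As a rough illustration of the quantity behind the Wasserstein regularization mentioned in the abstract above, the sketch below computes the empirical 1-Wasserstein distance between two equal-sized one-dimensional samples, for example scalar scores from a target network and a fake network. The abstract does not specify the regularizer's exact form, so the function name, the 1D setting, and the equal-sample-size restriction are all assumptions.

```python
import numpy as np


def wasserstein1_1d(a, b):
    """Empirical W1 distance between two 1D samples of equal size.

    For equal-sized samples the optimal transport plan matches sorted
    values, so W1 is the mean absolute difference of the order statistics.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("expected two 1D samples of equal size")
    return float(np.mean(np.abs(a - b)))
```

A value near zero means the two output samples are hard to tell apart by their distribution; used as a penalty, it would push the fake network's outputs toward the target network's.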
Industry
Improving the utility of differentially private clustering through dynamical processing
Co-authors: Yujin Choi, Jaewook Lee
2025
Medicine
Long-term nonskeletal complications in patients with thyroid cancer and hypoparathyroidism post total thyroidectomy
Context: Thyroid cancer (TC) is a prevalent endocrine malignancy with rising incidence attributed to advancements in diagnostic technology. Despite its generally favorable prognosis, postsurgical complications, including hypoparathyroidism, can cause long-term health challenges. Objective: This study evaluated the risk of nonskeletal complications in patients with TC with hypoparathyroidism (TC with hypoP). Methods: A retrospective cohort study was conducted using the National Health Insurance Service-National Sample Cohort (2002-2019), including patients with TC diagnosed between 2006 and 2019. Participants were categorized into TC with hypoP, TC without hypoparathyroidism (TC without hypoP), and matched controls. Propensity score matching and Cox proportional hazards models evaluated the incidence and risk of nonskeletal complications, including diabetes mellitus, dyslipidemia, cardiovascular and renal outcomes, and cataracts. Results: This study included 430 and 850 patients in the TC with hypoP and TC without hypoP groups, respectively, and their matched controls. The TC with hypoP group showed significantly higher risks of diabetes mellitus (HR 1.31, 95% CI 1.01-1.68), dyslipidemia (HR 1.29, 95% CI 1.06-1.57), urinary stones (HR 1.61, 95% CI 1.00-2.57), and cataracts (HR 1.50, 95% CI 1.15-1.95) than controls (all P < .05). Hypertension risk was higher in the TC with hypoP group vs the TC without hypoP group (HR 1.39, 95% CI 1.00-1.93, P = .048). Women had higher urinary stone risk, while cataract risk increased in patients aged over 50. Conclusion: Patients with TC with hypoP are at an increased risk for specific nonskeletal complications, particularly older adults and women. These findings underscore the need for targeted monitoring and management strategies in this population. Further prospective studies are warranted to validate these associations and elucidate the underlying mechanisms.
Co-authors: Eu Jeong Ku, Won Sang Yoo, Janghyeon Bae, Eun Kyung Lee, Hwa Young Ahn
2025
Medicine
Geographically Weighted Cause-Specific Hazard Model with Application to Prostate Cancer
In public health research, survival data denoting different causes of death are often collected across geographical regions. The data may cause invalid inference, however, if employed in a general competing risk model, which assumes constant relationships between risk factors and competing risks across regions. In addition, some applications might require spatially varying cause-specific hazard ratios. To address these limitations, this study proposes a geographically weighted cause-specific hazard regression (GWCHR) model to estimate spatially varying coefficients with a common spatial scale across multiple covariates. In identifying spatial variations of coefficients, we assign distance-based weights for each location in likelihood construction. We choose the bandwidth in the weighting function according to suitable selection criteria. We analyze the asymptotic properties of the proposed GWCHR model in detail. Our simulation studies compare the finite sample performance of the proposed model with general competing risk models. We apply the proposed method to prostate cancer data from Korea’s National Health Information Service database to examine the spatially varying effects of environmental and social factors on second primary cancers for prostate cancer patients.
Co-authors: Mina Kim, Yeong-Hwa Kim, Molin Wang, Se Young Choi
2026
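The distance-based weighting described in the abstract above can be sketched as follows. The abstract does not state which kernel is used, so the Gaussian kernel, the function name, and the example coordinates are assumptions. In a geographically weighted fit, each observation's contribution to the local likelihood at a focal location would be multiplied by its weight.

```python
import numpy as np


def gaussian_geo_weights(coords, focal, bandwidth):
    """Weights exp(-0.5 * (d / bandwidth)^2), d = Euclidean distance to focal."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    coords = np.asarray(coords, dtype=float)
    d = np.linalg.norm(coords - np.asarray(focal, dtype=float), axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)


# Hypothetical planar coordinates for three regions (distances 0, 5, 10).
coords = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
weights = gaussian_geo_weights(coords, focal=[0.0, 0.0], bandwidth=5.0)
```

A smaller bandwidth concentrates weight near the focal location, while a very large bandwidth gives nearly equal weights and approaches the global (unweighted) model.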
Medicine
Impact of type 2 diabetes on the development of dementia and death in Parkinson's disease
Background: Increasing evidence suggests that type 2 diabetes mellitus (T2DM) can influence the progression of Parkinson's disease (PD). However, it remains unclear whether T2DM increases the risk of progression to dementia and death in PD. Objective: This study aimed to investigate the impact of T2DM on the risk of developing dementia and death following a diagnosis of PD. Methods: We examined 158,962 individuals (aged 60 years or older) without PD or dementia using the Korean National Health Insurance Service (NHIS)-senior cohort database. A multi-state model was used to estimate the hazard ratios characterizing the effect of T2DM on the risk of PD, dementia, and death while adjusting for potential confounding factors. Results were analyzed according to age and sex. Results: Individuals with T2DM had an increased risk of developing PD (adjusted hazard ratio [aHR]: 1.26, 95% confidence interval [CI]: 1.11–1.42), dementia (aHR: 1.31, 95% CI: 1.24–1.39), or death (aHR: 1.58, 95% CI: 1.52–1.65) compared to those without T2DM. However, after PD diagnosis, T2DM was not associated with progression to dementia (aHR: 1.09, 95% CI: 0.96–1.48) or death (aHR: 1.10, 95% CI: 0.85–1.42), although subgroup analysis showed an elevated risk for the progression from PD to dementia (aHR: 1.41, 95% CI: 1.06–1.89) in individuals under 70 years of age. Conclusions: T2DM increases the risk of PD, dementia, and death in the elderly population. However, its effect on the progression to dementia and death may occur independently of the onset of PD, despite significant age-related heterogeneity.
Co-authors: Seung Hyun Lee, Mina Kim, Da-woon Kim, Yun Su Hwang, Kye Won Park, Sungyang Jo, Ji-Hoon Kang, Richard J. Cook, Sun Ju Chung
2026
Industry
Fully Few-shot Class-incremental Audio Classification Using Multi-level Embedding Extractor and Ridge Regression Classifier
In the task of Few-shot Class-incremental Audio Classification (FCAC), abundant training samples of each base class are required to train the model. However, it is not easy to collect abundant training samples for many base classes due to data scarcity and high collection cost. We discuss a more realistic issue, Fully FCAC (FFCAC), in which only a few training samples are available for both base and incremental classes. Furthermore, we propose an FFCAC method using a model that is decoupled into a multi-level embedding extractor and a ridge regression classifier. The embedding extractor consists of an audio spectrogram Transformer encoder and a fusion module, and is trained in the base session but frozen in all incremental sessions. The classifier is updated continually in each incremental session. Results on three public datasets show that our method exceeds current methods in accuracy and has an advantage over most of them in complexity.
Co-authors: Yongjie Si, Yanxiong Li, Jiaxin Tan, Qianhua He, Il-Youp Kwak
2025
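The ridge regression classifier named in the abstract above admits a closed-form solution. The sketch below fits such a classifier on fixed embeddings with one-hot targets. It is a generic illustration, not the paper's implementation: the function names, the regularization strength `lam`, and the toy embeddings are assumptions.

```python
import numpy as np


def fit_ridge_classifier(embeddings, labels, n_classes, lam=1.0):
    """Solve W = (X^T X + lam * I)^{-1} X^T Y with one-hot targets Y."""
    X = np.asarray(embeddings, dtype=float)
    Y = np.eye(n_classes)[np.asarray(labels)]
    gram = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)


def predict(W, embeddings):
    """Assign each embedding to the class with the highest regression score."""
    return np.argmax(np.asarray(embeddings, dtype=float) @ W, axis=1)


# Toy 2D "embeddings": two well-separated classes, three samples each.
X_train = np.array([[1.0, 0.1], [0.9, 0.0], [1.1, -0.1],
                    [0.0, 1.0], [0.1, 0.9], [-0.1, 1.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
W = fit_ridge_classifier(X_train, y_train, n_classes=2, lam=0.1)
```

Because the embedding extractor is frozen, refitting `W` in an incremental session only requires the embeddings and labels available at that point, with no gradient-based training.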