This paper defines a new, practical setting for Instance-Incremental Learning, focusing on cost-effective model promotion and resistance to catastrophic forgetting.


Abstract and 1 Introduction

  2. Related works

  3. Problem setting

  4. Methodology

    4.1. Decision boundary-aware distillation

    4.2. Knowledge consolidation

  5. Experimental results and 5.1. Experiment Setup

    5.2. Comparison with SOTA methods

    5.3. Ablation study

  6. Conclusion and future work and References


Supplementary Material

  1. Details of the theoretical analysis on KCEMA mechanism in IIL
  2. Algorithm overview
  3. Dataset details
  4. Implementation details
  5. Visualization of dusted input images
  6. More experimental results

Abstract

Instance-incremental learning (IIL) focuses on learning continually with data of the same classes. Compared to class-incremental learning (CIL), IIL is seldom explored because it suffers less from catastrophic forgetting (CF). However, besides retaining knowledge, in real-world deployment scenarios where the class space is always predefined, continual and cost-effective model promotion under the potential unavailability of previous data is a more essential demand. Therefore, we first define a new and more practical IIL setting: promoting the model’s performance, besides resisting CF, with only new observations. Two issues have to be tackled in the new IIL setting: 1) the notorious catastrophic forgetting, because there is no access to old data, and 2) broadening the existing decision boundary to new observations, because of concept drift. To tackle these problems, our key insight is to moderately broaden the decision boundary to failure cases while retaining the old boundary. Hence, we propose a novel decision boundary-aware distillation method that consolidates knowledge back to the teacher to ease the student’s learning of new knowledge. We also establish benchmarks on the existing datasets Cifar-100 and ImageNet. Notably, extensive experiments demonstrate that the teacher model can be a better incremental learner than the student model, which overturns previous knowledge distillation-based methods that treat the student as the main role.

1. Introduction

In recent years, many excellent deep-learning-based networks have been proposed for a variety of tasks, such as image classification, segmentation, and detection. Although these networks perform well on the training data, they inevitably fail on new data that was not seen during training in real-world applications. Continually and efficiently promoting a deployed model’s performance on such new data is an essential demand. The current solution of retraining the network with all accumulated data has two drawbacks: 1) as the data size grows, the training cost gets higher each time, for example, more GPU hours and a larger carbon footprint [20], and 2) in some cases the old data is no longer accessible because of privacy policies or a limited budget for data storage. When only a little or no old data is available or utilized, retraining the deep learning model with new data usually causes performance degradation on the old data, i.e., the catastrophic forgetting (CF) problem. To address the CF problem, incremental learning [4, 5, 22, 29], also known as continual learning, has been proposed. Incremental learning significantly promotes the practical value of deep learning models and is attracting intense research interest.

Figure 1. Illustration of the new IIL setting. At IIL learning phase t > 0, only the new data D_n(t), which is much smaller than the base data, is available. The model should be promoted by leveraging only the new data each time, seeking a performance close to that of the full-data model trained on all accumulated data. Fine-tuning with early stopping fails to enhance the model in the new IIL setting.

According to whether the new data comes from seen classes, incremental learning can be divided into three scenarios [16, 17]: instance-incremental learning (IIL) [3, 16], where all new data belongs to seen classes; class-incremental learning (CIL) [4, 12, 15, 22], where new data has different class labels; and hybrid-incremental learning [6, 30], where new data consists of new observations from both old and new classes. Compared to CIL, IIL is relatively unexplored because it is less susceptible to CF. Lomonaco and Maltoni [16] reported that fine-tuning a model with early stopping can tame the CF problem in IIL well. However, this conclusion does not always hold when there is no access to the old training data and the new data is much smaller than the old data, as depicted in Fig. 1. Fine-tuning often shifts the decision boundary rather than expanding it to accommodate new observations. Besides retaining old knowledge, real-world deployment is more concerned with efficient model promotion in IIL. For instance, in defect detection for industrial products, the defect classes are always limited to known categories, but the morphology of those defects varies from time to time. Failures on unseen defects should be corrected promptly and efficiently to keep defective products from flowing into the market. Unfortunately, existing research primarily focuses on retaining knowledge of old data rather than enriching that knowledge with new observations.

In this paper, to enhance a trained model quickly and cost-effectively with new observations of seen classes, we first define a new IIL setting: retaining the learned knowledge while promoting the model’s performance on new observations without access to old data. In short, we aim to promote the existing model by leveraging only the new data and to attain a performance comparable to that of a model retrained with all accumulated data. The new IIL setting is challenging due to the concept drift [6] caused by the new observations, such as color or shape variations relative to the old data. Hence, two issues have to be tackled: 1) the notorious catastrophic forgetting, because there is no access to old data, and 2) broadening the existing decision boundary to new observations. A minimal skeleton of this training-and-evaluation protocol is sketched below.
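To make the setting concrete, the following is a minimal, runnable Python skeleton of the protocol under stated assumptions: `fit`, `promote`, and `evaluate` are placeholder callables supplied by the user, not functions from the paper, and the phase structure mirrors Fig. 1.

```python
from typing import Callable, Sequence

def new_iil_protocol(
    fit: Callable,       # trains a model on the base data (phase t = 0)
    promote: Callable,   # updates the model using ONLY the new data
    evaluate: Callable,  # returns accuracy of a model on a dataset
    base_data,
    phases: Sequence,
    test_data,
):
    """Skeleton of the new IIL setting (cf. Fig. 1): after the base
    phase, each phase t > 0 sees only its own small batch of new
    observations, yet the model is scored on both the test set and
    the (no longer trainable) base data to measure forgetting."""
    model = fit(base_data)
    for t, new_data in enumerate(phases, start=1):
        model = promote(model, new_data)  # no access to old data here
        print(f"phase {t}: test acc = {evaluate(model, test_data):.3f}, "
              f"base acc = {evaluate(model, base_data):.3f}")
    return model
```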

To address the above issues in the new IIL setting, we propose a novel IIL framework based on the teacher-student structure. The proposed framework consists of a decision boundary-aware distillation (DBD) process and a knowledge consolidation (KC) process. DBD allows the student model to learn from new observations while staying aware of the existing inter-class decision boundaries, which enables the model to determine where to strengthen its knowledge and where to retain it. However, the decision boundary is untraceable when insufficient samples lie around the boundary, owing to the lack of access to old data in IIL. To overcome this, we draw inspiration from the practice of dusting the floor with flour to reveal hidden footprints. Similarly, we introduce random Gaussian noise to pollute the input space and manifest the learned decision boundary for distillation. While the student model is trained with boundary distillation, the updated knowledge is further consolidated back to the teacher model intermittently and repeatedly via the EMA mechanism [28]. Utilizing the teacher model as the target model is a pioneering attempt, and we explain its feasibility theoretically.
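As an illustration of how DBD and KC could fit together, here is a minimal PyTorch-style sketch, not the paper’s implementation: the function names and the hyperparameters `sigma` (noise scale), `tau` (distillation temperature), `alpha` (loss weight), and `momentum` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def dust(x, sigma=0.1):
    """'Dust' the inputs with random Gaussian noise so that samples
    land near the learned decision boundary and manifest it for
    distillation (the flour-and-footprints analogy)."""
    return x + sigma * torch.randn_like(x)

@torch.no_grad()
def consolidate(teacher, student, momentum=0.999):
    """Knowledge consolidation (KC): intermittently EMA-update the
    teacher from the student; the teacher is the deployed model."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)

def dbd_loss(teacher, student, x_new, y_new, tau=2.0, alpha=0.5):
    """One training objective on new observations only: cross-entropy
    broadens the boundary to failure cases, while distillation on
    dusted inputs retains the old boundary."""
    ce = F.cross_entropy(student(x_new), y_new)
    x_dusted = dust(x_new)
    with torch.no_grad():
        t_logits = teacher(x_dusted)
    s_log_probs = F.log_softmax(student(x_dusted) / tau, dim=1)
    kd = F.kl_div(s_log_probs, F.softmax(t_logits / tau, dim=1),
                  reduction="batchmean") * tau ** 2
    return ce + alpha * kd
```

In practice, `consolidate` would be called intermittently (e.g., every few iterations or epochs), so the teacher both supplies the old boundary for distillation and accumulates the student’s new knowledge; after training, the teacher, not the student, is the model that gets deployed.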

According to the new IIL setting, we reorganize the training sets of some existing datasets commonly used in CIL, such as Cifar-100 [11] and ImageNet [24], to establish the benchmarks (a sketch of such a reorganization is given below). The model is evaluated on the test data as well as the no-longer-available base data in each incremental phase. Our main contributions can be summarized as follows: 1) we define a new IIL setting that seeks fast and cost-effective model promotion on new observations, and establish the benchmarks; 2) we propose a novel decision boundary-aware distillation method to retain the learned knowledge as well as enrich it with new data; 3) we creatively consolidate the learned knowledge from the student to the teacher model to attain better performance and generalizability, and prove its feasibility theoretically; and 4) extensive experiments demonstrate that the proposed method accumulates knowledge well with only new data, where most existing incremental learning methods fail.
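As a hypothetical example of such a reorganization, the sketch below splits a CIL training set into one large base split and several much smaller incremental splits of new observations from the same classes. The base fraction and phase count are illustrative assumptions; the paper’s actual splits are described in its supplementary “Dataset details.”

```python
import numpy as np

def make_iil_splits(num_samples=50_000, base_frac=0.5,
                    num_phases=5, seed=0):
    """Split a training set (e.g. the 50,000 Cifar-100 training
    images) into a base split D_0 and several small incremental
    splits D_n(1..T) for the new IIL setting."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_samples)
    base_size = int(base_frac * num_samples)
    base_idx = idx[:base_size]                           # phase t = 0
    phase_idx = np.array_split(idx[base_size:], num_phases)
    return base_idx, phase_idx

base, phases = make_iil_splits()
print(len(base), [len(p) for p in phases])  # 25000 [5000, 5000, 5000, 5000, 5000]
```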


:::info This paper is available on arxiv under CC BY-NC-ND 4.0 Deed (Attribution-Noncommercial-Noderivs 4.0 International) license.

:::

:::info Authors:

(1) Qiang Nie, Hong Kong University of Science and Technology (Guangzhou);

(2) Weifu Fu, Tencent Youtu Lab;

(3) Yuhuan Lin, Tencent Youtu Lab;

(4) Jialin Li, Tencent Youtu Lab;

(5) Yifeng Zhou, Tencent Youtu Lab;

(6) Yong Liu, Tencent Youtu Lab;

(7) Chengjie Wang, Tencent Youtu Lab.

:::

