This study introduces a new benchmark for 3D medical image retrieval using the TotalSegmentator dataset, showcasing how pre-trained vision embeddings, originally trained on natural images, can be repurposed for anatomical structure localization. By integrating a ColBERT-inspired re-ranking approach, the method boosts recall across diverse anatomical regions, though challenges remain in retrieving certain structures like the brain and face. The findings suggest that general image pre-training (e.g., ImageNet) can be as effective as, if not slightly better than, pre-training on domain-specific medical datasets. This benchmark lays the groundwork for future innovations in content-based medical image search and targeted organ retrieval.

How Pre-Trained Vision Models Are Revolutionizing Anatomical Structure Retrieval


Abstract and 1. Introduction

  2. Materials and Methods

    2.1 Vector Database and Indexing

    2.2 Feature Extractors

    2.3 Dataset and Pre-processing

    2.4 Search and Retrieval

    2.5 Re-ranking retrieval and evaluation

  3. Evaluation and 3.1 Search and Retrieval

    3.2 Re-ranking

  4. Discussion

    4.1 Dataset and 4.2 Re-ranking

    4.3 Embeddings

    4.4 Volume-based, Region-based and Localized Retrieval and 4.5 Localization-ratio

  5. Conclusion, Acknowledgement, and References

5 Conclusion

Our study establishes a new benchmark for the retrieval of anatomical structures within 3D medical volumes, utilizing the TotalSegmentator dataset to facilitate targeted queries of volumes or sub-volumes for specific anatomical structures. The results highlight the potential of leveraging pre-trained vision embeddings, originally trained on non-medical images, for medical image retrieval across diverse anatomical regions with a wide size range.

We introduced a re-ranking method based on ColBERT [Khattab and Zaharia, 2020], a late interaction model from text retrieval. The proposed ColBERT-inspired method enhances the retrieval recall for all anatomical regions. Future investigations can focus on refining the proposed re-ranking method and optimizing its computational efficiency.
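To illustrate the late-interaction idea behind this kind of re-ranking, the following is a minimal sketch and not the implementation used in the study. It assumes that each volume is represented by a list of embedding vectors (for example, one per slice) and that cosine similarity is the matching function. The names `cosine`, `maxsim_score`, and `rerank`, as well as the candidate IDs, are hypothetical. Each query vector is matched to its most similar candidate vector, and the per-vector maxima are summed into one score per candidate.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def maxsim_score(query_vecs: List[Vector], cand_vecs: List[Vector]) -> float:
    """Late-interaction score: for every query vector take the best
    match among the candidate vectors, then sum these maxima."""
    if not cand_vecs:
        return 0.0
    return sum(max(cosine(q, c) for c in cand_vecs) for q in query_vecs)


def rerank(
    query_vecs: List[Vector], candidates: Dict[str, List[Vector]]
) -> List[Tuple[str, float]]:
    """Sort first-stage candidates by descending late-interaction score."""
    scored = [
        (cand_id, maxsim_score(query_vecs, vecs))
        for cand_id, vecs in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): 2-dimensional vectors stand in for real embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "vol_a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query vectors exactly
    "vol_b": [[1.0, 0.0], [1.0, 0.0]],  # matches only the first query vector
    "vol_c": [[1.0, 1.0]],              # partially matches both query vectors
}

if __name__ == "__main__":
    for cand_id, score in rerank(query, candidates):
        print(cand_id, round(score, 3))
```

In this toy example the candidates are ordered `vol_a`, `vol_c`, `vol_b`. Because the score compares every query vector with every candidate vector, its cost grows with the product of the two vector counts, which is one reason to apply it only to a short list returned by a first-stage search.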

We evaluated the performance of different embeddings, pre-trained in a supervised or self-supervised manner on medical and non-medical data. The results indicate that pre-training on general natural images (e.g., ImageNet) yields slightly more effective embedding vectors than pre-training on domain-specific medical images (e.g., RadImageNet). However, given the marginal difference, the choice of embeddings is unlikely to significantly impact the user experience in downstream tasks.

The retrieval of certain anatomical structures, such as the brain and face, shows low recall across all embedding and retrieval methods. Subsequent research can explore how prevalent such patterns are and investigate potential solutions.

This benchmark sets the stage for future advancements in content-based medical image retrieval, particularly in localizing specific organs or areas within scans.

Acknowledgement

The authors would like to thank the Bayer team of the internal ML innovation platform for providing compute infrastructure and technical support.

We thank Timothy Deyer and his RadImageNet team for providing the RadImageNet pre-trained model weights for the SwinTransformer architecture.

References

Shiv Ram Dubey. A decade survey of content based image retrieval using deep learning. IEEE Transactions on Circuits and Systems for Video Technology, 32(5):2687–2704, 2021.

Wenqing Wang, Pengfei Jiao, Han Liu, Xiao Ma, and Zhuo Shang. Two-stage content based image retrieval using sparse representation and feature fusion. Multimedia Tools and Applications, 81(12):16621–16644, 2022.

Adnan Qayyum, Syed Muhammad Anwar, Muhammad Awais, and Muhammad Majid. Medical image retrieval using deep convolutional neural network. Neurocomputing, 266:8–20, 2017.

Farnaz Khun Jush, Tuan Truong, Steffen Vogler, and Matthias Lenga. Medical image retrieval using pretrained embeddings. arXiv preprint arXiv:2311.13547, 2023.

Asma Ben Abacha, Alberto Santamaria-Pang, Ho Hin Lee, Jameson Merkow, Qin Cai, Surya Teja Devarakonda, Abdullah Islam, Julia Gong, Matthew P Lungren, Thomas Lin, et al. 3D-MIR: A benchmark and empirical study on 3D medical image retrieval in radiology. arXiv preprint arXiv:2311.13752, 2023.

Stefan Denner, David Zimmerer, Dimitrios Bounias, Markus Bujotzek, Shuhan Xiao, Lisa Kausch, Philipp Schader, Tobias Penzkofer, Paul F Jäger, and Klaus Maier-Hein. Leveraging foundation models for content-based medical image retrieval in radiology. arXiv preprint arXiv:2403.06567, 2024.

Tuan Truong, Farnaz Khun Jush, and Matthias Lenga. Benchmarking pretrained vision embeddings for near-and duplicate detection in medical images. arXiv preprint arXiv:2312.07273, 2023.

Michela Antonelli, Annika Reinke, Spyridon Bakas, Keyvan Farahani, Annette Kopp-Schneider, Bennett A Landman, Geert Litjens, Bjoern Menze, Olaf Ronneberger, Ronald M Summers, et al. The medical segmentation decathlon. Nature Communications, 13(1):4128, 2022.

Jakob Wasserthal, Hanns-Christian Breit, Manfred T Meyer, Maurice Pradella, Daniel Hinck, Alexander W Sauter, Tobias Heye, Daniel T Boll, Joshy Cyriac, Shan Yang, et al. TotalSegmentator: Robust segmentation of 104 anatomic structures in CT images. Radiology: Artificial Intelligence, 5(5), 2023.

Omar Khattab and Matei Zaharia. ColBERT: Efficient and effective passage search via contextualized late interaction over BERT. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 39–48, 2020.

Martin Aumüller, Erik Bernhardsson, and Alexander Faithfull. ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. Information Systems, 87:101374, 2020.

Moses S Charikar. Similarity estimation techniques from rounding algorithms. In Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, pages 380–388, 2002.

Yu A Malkov and Dmitry A Yashunin. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(4):824–836, 2018.

Ibraheem Taha, Matteo Lissandrini, Alkis Simitsis, and Yannis Ioannidis. A study on efficient indexing for table search in data lakes. In 2024 IEEE 18th International Conference on Semantic Computing (ICSC), pages 245–252. IEEE, 2024.

Jeff Johnson, Matthijs Douze, and Hervé Jégou. Billion-scale similarity search with GPUs. IEEE Transactions on Big Data, 7(3):535–547, 2019.

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009.

Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9650–9660, 2021.

Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. DINOv2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023.

Stephanie Fu, Netanel Tamir, Shobhita Sundaram, Lucy Chai, Richard Zhang, Tali Dekel, and Phillip Isola. DreamSim: Learning new dimensions of human visual similarity using synthetic data. arXiv preprint arXiv:2306.09344, 2023.

Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10012–10022, 2021.

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.

Xueyan Mei, Zelong Liu, Philip M Robson, Brett Marinelli, Mingqian Huang, Amish Doshi, Adam Jacobi, Chendi Cao, Katherine E Link, Thomas Yang, et al. RadImageNet: An open radiologic deep learning research dataset for effective transfer learning. Radiology: Artificial Intelligence, 4(5):e210315, 2022.

Hirokatsu Kataoka, Kazushige Okayasu, Asato Matsumoto, Eisuke Yamagata, Ryosuke Yamada, Nakamasa Inoue, Akio Nakamura, and Yutaka Satoh. Pre-training without natural images. International Journal of Computer Vision (IJCV), 2022.

Qingyao Ai, Jiaxin Mao, Yiqun Liu, and W Bruce Croft. Unbiased learning to rank: Theory and practice. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pages 2305–2306, 2018.

Jiafeng Guo, Yixing Fan, Liang Pang, Liu Yang, Qingyao Ai, Hamed Zamani, Chen Wu, W Bruce Croft, and Xueqi Cheng. A deep look into neural ranking models for information retrieval. Information Processing & Management, 57(6):102067, 2020.

Sean MacAvaney, Andrew Yates, Arman Cohan, and Nazli Goharian. CEDR: Contextualized embeddings for document ranking. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1101–1104, 2019.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, and Matei Zaharia. ColBERTv2: Effective and efficient retrieval via lightweight late interaction. arXiv preprint arXiv:2112.01488, 2021.

:::info Authors:

(1) Farnaz Khun Jush, Bayer AG, Berlin, Germany (farnaz.khunjush@bayer.com);

(2) Steffen Vogler, Bayer AG, Berlin, Germany (steffen.vogler@bayer.com);

(3) Tuan Truong, Bayer AG, Berlin, Germany (tuan.truong@bayer.com);

(4) Matthias Lenga, Bayer AG, Berlin, Germany (matthias.lenga@bayer.com).

:::


:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::
