Probabilistic Circuits (PCs) offer a unified framework for tractable probability distributions, enabling efficient probabilistic inference through structured computation graphs. Researchers are advancing their speed and scalability via GPU parallelization, tensorized designs, and even custom hardware like DAG Processing Units. With applications ranging from explainability and data compression to neuro-symbolic AI and large language model detoxification, PCs are emerging as a powerful foundation for the next wave of efficient, interpretable AI.

Why Researchers Are Betting on PCs to Power the Next Wave of AI

2025/08/25 07:10

Abstract and 1. Introduction

2. Preliminaries and Related Work

3. Key Bottlenecks in PC Parallelization

4. Harnessing Block-Based PC Parallelization

    4.1. Fully Connected Sum Layers

    4.2. Generalizing To Practical Sum Layers

    4.3. Efficient Implementations by Compiling PC Layers

    4.4. Analysis: IO and Computation Overhead

5. Optimizing Backpropagation with PC Flows

6. Experiments

    6.1. Faster Models with PyJuice

    6.2. Better PCs At Scale

    6.3. Benchmarking Existing PCs

7. Conclusion, Acknowledgements, Impact Statement, and References

A. Algorithm Details

B. Additional Technical Details

C. Experimental Details

D. Additional Experiments


2. Preliminaries and Related Work

Many probabilistic inference tasks can be cast into computing sums of products. By viewing them from a computation graph standpoint, PCs provide a unified perspective on many bespoke representations of tractable probability distributions, including Arithmetic Circuits (Darwiche, 2002; 2003), Sum-Product Networks (Poon & Domingos, 2011), Cutset Networks (Rahman et al., 2014), and Hidden Markov Models (Rabiner & Juang, 1986). Specifically, PCs define distributions with computation graphs consisting of sum and product operations, as elaborated below.

Concretely, each node $n$ of a PC computes

$$p_n(\mathbf{x}) = \begin{cases} f_n(\mathbf{x}) & \text{if } n \text{ is an input node,} \\ \prod_{c \in \mathrm{ch}(n)} p_c(\mathbf{x}) & \text{if } n \text{ is a product node,} \\ \sum_{c \in \mathrm{ch}(n)} \theta_{n,c} \, p_c(\mathbf{x}) & \text{if } n \text{ is a sum node,} \end{cases} \qquad (1)$$

where $\mathrm{ch}(n)$ denotes the children of $n$, $f_n$ is a univariate input distribution, and $\theta_{n,c} \geq 0$ are the parameters attached to the sum node's edges; the output of the root node is the probability the PC assigns to $\mathbf{x}$.

The key to guaranteeing exact and efficient computation of various probabilistic queries is to impose proper structural constraints on the DAG of the PC. For example, with smoothness and decomposability (Poon & Domingos, 2011), computing any marginal probability amounts to a single forward pass (children before parents) following Equation (1), except that input nodes defined on marginalized variables are set to 1. Please refer to Choi et al. (2020) for a comprehensive overview of different structural constraints and the queries they enable.
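To make the forward pass and the marginalization trick concrete, here is a minimal sketch in plain Python. The node classes (`Input`, `Product`, `Sum`) and the two-variable example circuit are hypothetical illustrations, not the paper's implementation:

```python
# Minimal sketch of a smooth, decomposable PC evaluator.
# Marginalization: input nodes over marginalized variables output 1.

class Input:
    """Indicator input node over one binary variable."""
    def __init__(self, var, value):
        self.var, self.value = var, value

    def forward(self, x):
        if x[self.var] is None:          # variable is marginalized out
            return 1.0
        return 1.0 if x[self.var] == self.value else 0.0

class Product:
    def __init__(self, children):
        self.children = children

    def forward(self, x):
        out = 1.0
        for c in self.children:          # multiply child outputs
            out *= c.forward(x)
        return out

class Sum:
    def __init__(self, children, weights):
        self.children, self.weights = children, weights

    def forward(self, x):                # weighted sum of child outputs
        return sum(w * c.forward(x) for c, w in zip(self.children, self.weights))

# Tiny circuit: p(x) = 0.3 * [x0=0][x1=0] + 0.7 * [x0=1][x1=1]
root = Sum(
    [Product([Input(0, 0), Input(1, 0)]),
     Product([Input(0, 1), Input(1, 1)])],
    [0.3, 0.7],
)

print(root.forward({0: 1, 1: 1}))     # joint p(x0=1, x1=1) -> 0.7
print(root.forward({0: 1, 1: None}))  # marginal p(x0=1)    -> 0.7
```

Real implementations evaluate in log-space for numerical stability, but the control flow is exactly this children-before-parents traversal.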


For example, Peharz et al. (2020a) demonstrate how the above parameter gradients can be used to apply Expectation-Maximization (EM) updates, and Vergari et al. (2021) elaborate how the forward pass can be used to compute various probabilistic and information-theoretic queries when coupled with PC structure transformation algorithms. The speed and memory efficiency of these two procedures therefore largely determine the overall efficiency of PCs.
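As an illustrative sketch of how gradient-driven EM updates look in the simplest case, consider a single sum node over fully factorized Bernoulli components. The model, shapes, and variable names below are assumptions for illustration, not PyJuice's implementation:

```python
import numpy as np

# Hedged sketch: EM for a one-sum-node PC (a mixture of Bernoulli products),
# built on the gradient identity  d log p / d theta_k = p_k(x) / p(x).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 4))       # 200 binary samples over 4 variables

K = 3
theta = np.full(K, 1.0 / K)                 # sum-node edge parameters
mu = rng.uniform(0.2, 0.8, size=(K, 4))     # Bernoulli params per component

for _ in range(20):
    # Forward pass: component likelihoods p_k(x) and mixture p(x).
    comp = np.prod(np.where(X[:, None, :] == 1, mu, 1 - mu), axis=2)  # (N, K)
    px = comp @ theta                                                  # (N,)
    # theta_k * dlogp/dtheta_k gives the per-sample responsibility.
    flow = theta * comp / px[:, None]                                  # (N, K)
    # EM updates from aggregated responsibilities.
    theta = flow.sum(0) / flow.sum()
    mu = (flow.T @ X) / flow.sum(0)[:, None]

print(theta.sum())  # weights remain normalized (sums to 1 up to float error)
```

The quantity `theta * comp / px` is the single-layer analogue of the flow-based backward quantities discussed later; deep circuits accumulate it recursively from the root.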

Figure 1. Layering a PC by grouping nodes with the same topological depth (as indicated by the colors) into disjoint subsets. Both the forward and the backward computation can be carried out independently on nodes within the same layer.

Related work on accelerating PCs. There has been a great deal of effort put into speeding up training and inference for PCs. One of the initial attempts performs node-based computations on both CPUs (Lowd & Rooshenas, 2015) and GPUs (Pronobis et al., 2017; Molina et al., 2019), i.e., computing the outputs for a mini-batch of inputs (data) recursively for every node. Despite its simplicity, this approach fails to fully exploit the parallel computation capability of modern GPUs, since it can only parallelize over a batch of samples. The problem is mitigated by also parallelizing topologically independent nodes (Peharz et al., 2020a; Dang et al., 2021). Specifically, a PC is chunked into topological layers, where nodes in the same layer can be computed in parallel. This leads to 1-2 orders of magnitude speedup compared to node-based computation.
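The layer-chunking idea can be sketched as a simple depth computation over the PC's DAG. The dictionary-based graph encoding below is a hypothetical stand-in for a real PC data structure:

```python
from collections import defaultdict

def layerize(children):
    """Group DAG nodes by topological depth.

    children: dict mapping each node to its list of child nodes
    (input/leaf nodes map to []). Nodes in the same returned layer
    have no dependencies on each other and can run in parallel.
    """
    depth = {}

    def d(n):
        if n not in depth:
            # A node's depth is one more than its deepest child.
            depth[n] = 0 if not children[n] else 1 + max(d(c) for c in children[n])
        return depth[n]

    for n in children:
        d(n)

    layers = defaultdict(list)
    for n, k in depth.items():
        layers[k].append(n)
    return [sorted(layers[k]) for k in sorted(layers)]

# Tiny example: two inputs feed a product, which feeds a sum.
dag = {"x0": [], "x1": [], "prod": ["x0", "x1"], "sum": ["prod"]}
print(layerize(dag))  # [['x0', 'x1'], ['prod'], ['sum']]
```

A forward pass then iterates over the layers in order (and the backward pass in reverse), launching one batched kernel per layer instead of one per node.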

The regularity of edge connection patterns is another key factor influencing the design choices. Specifically, EiNets (Peharz et al., 2020a) leverage off-the-shelf Einsum operations to parallelize dense PCs where every layer contains groups of densely connected sum and product/input nodes. Mari et al. (2023) generalize the notion of dense PCs to tensorized PCs, which greatly expands the scope of EiNets. Dang et al. (2021) instead focus on speeding up sparse PCs, where different nodes could have drastically different numbers of edges. They use custom CUDA kernels to balance the workload of different GPU threads and achieve decent speedup on both sparse and dense PCs.
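In the dense case, an entire fully connected sum layer reduces to a single batched contraction in the spirit of EiNets; the shapes and names below are illustrative, not taken from any particular implementation:

```python
import numpy as np

# Sketch: a dense (fully connected) sum layer as one einsum contraction.
batch, n_in, n_out = 8, 16, 4
rng = np.random.default_rng(0)
child_vals = rng.random((batch, n_in))   # outputs of the child layer, per sample
weights = rng.random((n_out, n_in))      # one parameter per (sum node, child) edge

# Every sum node takes a weighted sum over all children, for every sample.
sum_vals = np.einsum('bi,oi->bo', child_vals, weights)

assert sum_vals.shape == (batch, n_out)
assert np.allclose(sum_vals, child_vals @ weights.T)
```

Because the contraction maps onto highly tuned matrix-multiply kernels, dense layers get GPU-level throughput for free; sparse layers lack this regularity, which is why they need custom kernels instead.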

Another thread of work focuses on designing computation hardware that is more suitable for PCs. Specifically, Shah et al. (2021) propose DAG Processing Units (DPUs) that can efficiently traverse sparse PCs, Dadu et al. (2019) introduce an indirect read reorder-buffer to improve the efficiency of data-dependent memory accesses in PCs, and Yao et al. (2023) use addition-as-int multiplications to significantly improve the energy efficiency of PC inference algorithms.

Figure 2. Runtime breakdown of the feedforward pass of a PC with ∼150M edges. Both the IO and the computation overhead of the sum layers are significantly larger than the total runtime of product layers. Detailed configurations of the PC are shown in the table.

Applications of PCs. PCs have been applied to many domains such as explainability and causality (Correia et al., 2020; Wang & Kwiatkowska, 2023), graph link prediction (Loconte et al., 2023), lossless data compression (Liu et al., 2022), neuro-symbolic AI (Xu et al., 2018; Manhaeve et al., 2018; Ahmed et al., 2022a;b), gradient estimation (Ahmed et al., 2023b), graph neural network rewiring (Qian et al., 2023), and even large language model detoxification (Ahmed et al., 2023a).


:::info Authors:

(1) Anji Liu, Department of Computer Science, University of California, Los Angeles, USA (liuanji@cs.ucla.edu);

(2) Kareem Ahmed, Department of Computer Science, University of California, Los Angeles, USA;

(3) Guy Van den Broeck, Department of Computer Science, University of California, Los Angeles, USA.

:::


:::info This paper is available on arxiv under CC BY 4.0 DEED license.

:::

