Transaction fees play a fundamental role in the proper functioning of the Bitcoin protocol. These fees act as an economic incentive for miners, who compete to create a block of transactions and compute a hash of that block that meets the required difficulty target, which is what makes the block valid. This process is what we refer to as “mining a block,” and the miner who successfully mines a valid block receives the current block subsidy of 3.125 BTC (as of 2025), plus all the fees from the transactions included in that block.

Since block size is limited, miners tend to prioritize transactions with higher fees, as this allows them to maximize the revenue they earn.
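To make this prioritization concrete, here is a minimal Python sketch of a greedy selection by feerate. The `PendingTx` class, the `select_for_block` function, and the sample values are hypothetical, and the sketch ignores dependencies between transactions, which real block construction has to take into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingTx:
    txid: str
    fee_sat: int  # total fee in satoshis
    vsize: int    # virtual size in vbytes

    @property
    def feerate(self) -> float:
        """Fee per virtual byte (sat/vB)."""
        return self.fee_sat / self.vsize


def select_for_block(pending, max_vsize):
    """Greedy choice: highest feerate first, skipping anything that no longer fits."""
    chosen, used = [], 0
    for tx in sorted(pending, key=lambda t: t.feerate, reverse=True):
        if used + tx.vsize <= max_vsize:
            chosen.append(tx)
            used += tx.vsize
    return chosen


# Hypothetical mempool contents.
mempool = [
    PendingTx("a", fee_sat=2_000, vsize=200),  # 10 sat/vB
    PendingTx("b", fee_sat=9_000, vsize=300),  # 30 sat/vB
    PendingTx("c", fee_sat=1_000, vsize=500),  # 2 sat/vB
]
block = select_for_block(mempool, max_vsize=600)
print([tx.txid for tx in block])
```

With this toy limit, the low-feerate transaction `c` is the one left waiting, which is the situation RBF and CPFP are meant to address.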

In this context, every time a user wants to send a transaction, they must define how much they are willing to pay in fees for the transaction to be accepted and confirmed in the next block. If the user sets the fee too low, the transaction may take a long time to confirm, or might never be included in any block and eventually be discarded.

Currently, Bitcoin offers two mechanisms to address this issue: Child Pays For Parent (CPFP) and Replace-by-Fee (RBF). Both can help a transaction get included in a block sooner by increasing the total fee a miner would earn for including it.

RBF allows a pending transaction to be replaced by a new one that pays a higher fee, thereby increasing its chance of being confirmed quickly.

Bitcoin privacy problems

  • Although Bitcoin is often perceived as an anonymous system, its architecture does not guarantee privacy by default. Every transaction is publicly recorded on the blockchain, which makes it possible to analyze them and establish potential links between addresses. Unspent transaction outputs (UTXOs), which are later used as inputs in another transaction, must be spent in full. This means that payment transactions typically include a change output that returns the remaining BTC to the sender.
  • By analyzing a transaction, it is often possible to identify which output was the payment (sent to a third party) and which was the change (returned to the sender). If the change output is detected, it becomes easier to identify which address belongs to the sender, which makes their activity on the blockchain easier to trace.
  • There are heuristics to detect change outputs, such as comparing output values, address types, or analyzing typical transaction construction patterns used by wallets. These techniques can put user privacy at risk.
  • In this context, it’s important to consider how RBF may contribute to the detection of change outputs.
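Two of the heuristics mentioned above can be sketched in simplified form. This Python is only illustrative: the function names, the `(address_type, value_sat)` representation of an output, and the 100,000-satoshi rounding unit are assumptions made for the example, not the rules of any particular analysis tool.

```python
def guess_change_by_address_type(input_types, outputs):
    """Guess the change as the only output whose address type matches the inputs.

    input_types: address types of the spent outputs, e.g. ["p2wpkh"].
    outputs: list of (address_type, value_sat) tuples.
    Returns the index of the guessed change output, or None if ambiguous.
    """
    if len(set(input_types)) != 1:
        return None  # mixed input types: the heuristic does not apply
    matches = [i for i, (addr_type, _) in enumerate(outputs)
               if addr_type == input_types[0]]
    return matches[0] if len(matches) == 1 else None


def guess_change_by_round_number(outputs, unit_sat=100_000):
    """Guess the change as the only output whose value is not a round amount."""
    non_round = [i for i, (_, value) in enumerate(outputs)
                 if value % unit_sat != 0]
    return non_round[0] if len(outputs) > 1 and len(non_round) == 1 else None


# Hypothetical transaction: one payment of 0.05 BTC and an odd-valued change.
tx_outputs = [("p2pkh", 5_000_000), ("p2wpkh", 1_234_567)]
print(guess_change_by_address_type(["p2wpkh"], tx_outputs))
print(guess_change_by_round_number(tx_outputs))
```

Both functions return `None` when more than one output (or none) fits the pattern, which reflects how often these heuristics are inconclusive on their own.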

What is RBF?

Replace-by-fee is a mempool policy that allows an unconfirmed transaction to be replaced by a new one that shares at least one of the same inputs but pays a higher fee. When this replacement occurs, the original and the replacement never coexist in the mempool: once the new one is accepted, the old one is removed. RBF policy has been in use for some time, but there are different ways to apply it:

  • First-seen-safe RBF: This variant allows a transaction to be replaced if the new one pays higher fees and has the same outputs as the original transaction. This was proposed in the early days due to concerns (later disproven) that RBF could facilitate double spending attacks.
  • Opt-in RBF: This variant only allows a transaction to be replaced if the original transaction has set a specific field indicating that it can be replaced (under BIP 125, an input sequence number below 0xfffffffe). The downside of this method is that the user must know in advance whether they might want to replace the transaction. For this reason, many users would mark transactions as replaceable just in case they needed to replace them later.
  • Delayed RBF: This variant allows a transaction to be replaced after a certain number of blocks have been mined.
  • Full RBF: This variant allows any transaction to be replaced, as long as the new transaction pays at least the total fees of the original transaction(s) plus an additional fee sufficient to cover its own relay cost. A node replaces a transaction when it detects that a new transaction uses at least one of the same inputs and pays higher fees. As of Bitcoin Core v28.0, full RBF is applied by default to all transactions, so transactions do not need to be explicitly marked as replaceable in order to be replaced. The mempool detects a replacement when a new transaction tries to enter using an input that is already being spent by another transaction in the mempool. When this happens, it checks that the new transaction exceeds both the feerate (fee per virtual byte) and the total fee (i.e., the total BTC paid in fees) of the transaction it replaces.
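The fee checks described for full RBF can be illustrated with a simplified sketch. The `MempoolTx` class, the `can_replace` function, and the `min_relay_feerate` parameter (set here to 1 sat/vB as an assumption) are hypothetical names for this example; a real node applies further rules, for instance about descendants of the replaced transactions, that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MempoolTx:
    inputs: frozenset  # outpoints being spent, written as "txid:vout"
    fee_sat: int       # total fee in satoshis
    vsize: int         # virtual size in vbytes

    @property
    def feerate(self) -> float:
        return self.fee_sat / self.vsize


def can_replace(new, conflicts, min_relay_feerate=1.0):
    """Simplified check: may `new` replace the directly conflicting transactions?"""
    if not conflicts:
        return False
    # A replacement must spend at least one input of each transaction it conflicts with.
    if not all(new.inputs & old.inputs for old in conflicts):
        return False
    # It must pay the fees being evicted plus the cost of relaying itself.
    old_fees = sum(old.fee_sat for old in conflicts)
    pays_for_relay = new.fee_sat >= old_fees + min_relay_feerate * new.vsize
    # It must also offer a higher feerate than every transaction it replaces.
    higher_feerate = all(new.feerate > old.feerate for old in conflicts)
    return pays_for_relay and higher_feerate


original = MempoolTx(frozenset({"aa:0"}), fee_sat=1_000, vsize=200)   # 5 sat/vB
too_small = MempoolTx(frozenset({"aa:0"}), fee_sat=1_100, vsize=200)  # bump below relay cost
bumped = MempoolTx(frozenset({"aa:0"}), fee_sat=1_500, vsize=200)     # 7.5 sat/vB
unrelated = MempoolTx(frozenset({"bb:1"}), fee_sat=5_000, vsize=200)  # no shared input

print(can_replace(too_small, [original]))
print(can_replace(bumped, [original]))
print(can_replace(unrelated, [original]))
```

Only `bumped` passes: `too_small` raises the fee by less than its own relay cost, and `unrelated` does not conflict with the original at all.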

Change output detection with RBF

When using Replace-By-Fee (RBF), the replacement transaction is typically very similar to the original one, with the increased fee as the only major difference. If someone is monitoring the mempool and capturing all broadcast transactions, they could obtain both the original and the replacement transactions.

Since these two transactions are almost identical, a malicious observer could compare them to identify which output is the payment and which is the change. This would compromise the sender’s privacy by revealing the address that belongs to them.
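A minimal sketch of that comparison, assuming both versions have the same number of outputs and that each output is represented as a hypothetical `(address, value_sat)` tuple: outputs that appear unchanged in both versions are treated as payments, and the single output that differs is treated as the change.

```python
from collections import Counter


def detect_change_same_count(original, replacement):
    """Return the replacement output with no identical counterpart in the original.

    Both arguments are lists of (address, value_sat) tuples. Returns None when
    the output counts differ or when zero or several outputs changed.
    """
    if len(original) != len(replacement):
        return None
    # Multiset difference: whatever is left has no exact match in the original.
    leftovers = list((Counter(replacement) - Counter(original)).elements())
    return leftovers[0] if len(leftovers) == 1 else None


# Hypothetical two-output transaction and its fee-bumped replacement.
tx1 = [("addr_payment", 50_000), ("addr_change", 48_000)]
tx2 = [("addr_payment", 50_000), ("addr_change", 47_500)]
print(detect_change_same_count(tx1, tx2))

# The same idea with three outputs, two of which stay untouched.
tx3 = [("addr_a", 10_000), ("addr_b", 20_000), ("addr_change", 68_000)]
tx4 = [("addr_a", 10_000), ("addr_b", 20_000), ("addr_change", 67_000)]
print(detect_change_same_count(tx3, tx4))
```

Comparing outputs as a multiset rather than by position means the sketch still works if the wallet reorders outputs or uses a new change address in the replacement.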

Let’s explore some common scenarios where RBF can undermine the sender’s privacy:

  • Same Number of Outputs:
    • Two Outputs:
      • This is the simplest case. The presence of only two outputs makes it easier to identify the change (since one of the outputs is the payment and the other one the change). To detect the change output, we compare the outputs of the original transaction and its replacement: if one of the outputs is identical in both versions, we assume it is the payment output; the other one, which changes in value, is considered the change, as it reflects the adjustment associated with the difference in fees.
      • In the first example (TX1 and TX2), we observe that one of the outputs remains the same, while the other decreases in value. This behavior is consistent with the increase in fees in the replacement transaction: if the inputs are the same, a higher fee can only be reflected by a reduction in the change output.
      • In the second example (TX3 and TX4), the same comparison pattern occurs, but in this case, the output that changes increases in value. This may happen if one of the inputs was modified, so that, despite the higher fee, the increased input value offsets the difference and causes the change amount to grow.
      • This type of analysis is especially reliable in transactions with two outputs, as there is only one possible payment output and one change output, eliminating ambiguity.
    • More than two outputs:
      • This case is a slight evolution of the previous one.
      • Here, we analyze transactions with more than two outputs, but the underlying idea remains the same: we compare the replacement transactions to find one output that has a different value. This output, which differs, is identified as the change.
      • However, unlike cases with exactly two outputs, we cannot guarantee that the remaining outputs are all payments. While we can assume the output identified as change is indeed the change, we cannot rule out that some of the others might also be change. Generally, we can assume that unchanged outputs correspond to payments, but we must also consider the possibility that some wallets intentionally create multiple change outputs to make detection via heuristics more difficult.
      • In the first example, we observe that all outputs remain unchanged except one, which decreases in value. Following the same logic as in previous cases, this output is considered the change.
      • These are the most basic cases, and also the ones that provide the strongest guarantees when it comes to correctly identifying a change output.
    • To test whether these common scenarios were truly effective at detecting the change output, we developed a program to capture real transactions in the mempool that had been replaced. Later, we searched for transactions that followed these patterns and attempted to deduce the change output. These were the results:
    • The previous scenarios represent 76% of all captured transactions. This indicates that, although these are the easiest cases for identifying the change output, they are also the most common.
    • As shown in Figure 7, when the transaction has only two outputs, the change output is detected in 88.6% of the cases. In this scenario, we can be confident that the identified change output is correct, since the other output must necessarily correspond to the payment.
    • In transactions with more than two outputs, the results remain positive, but it’s important to note that we cannot always be certain that the identified output is the only change output present in the transaction.
  • Different number of outputs
    • This case presents a greater difficulty compared to the previous ones. Until now, the comparison between different versions of a replaced transaction was relatively simple, as the number of outputs remained the same across versions. However, in this scenario, each replacement may contain a different number of outputs, which significantly complicates the task of identifying the change output.
    • In these cases, we do not have many clear patterns that allow us to reliably identify the change. The only pattern we’ve observed with some consistency is characterized by the appearance of an additional output in each replacement. We observe that, except for one, the rest of the outputs maintain the same value across versions, allowing us to assume they are payment outputs. The output that does change in value is split into two new outputs.
    • Following the example, we see that from transaction TX1 to TX2, output 12 is split into two new outputs, 22 and 23. The sum of these two new values is almost equal to the original value of 12, with the difference accounted for by the increased fee associated with the replacement. This suggests that output 12 was most likely the change output. The same pattern repeats in the step from TX2 to TX3.
    • This behavior can help us identify payment outputs (those that keep a constant value), but when it comes to identifying the change output, it always leaves us with at least two possible candidates. Without the aid of additional complementary heuristics, it is not possible to determine with certainty which of the two outputs is actually the change. Furthermore, it is important to keep in mind that we cannot assume there is only one change output.
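The split pattern described above can be sketched as follows. The function name and the `(address, value_sat)` output representation are hypothetical, and reading the value difference as the fee increase assumes the inputs are identical in both versions.

```python
from collections import Counter


def detect_split_change(original, replacement):
    """Detect one original output being split into two outputs in the replacement.

    Returns (old_change, candidates, fee_increase_sat), or None if the pattern
    does not match. The sketch cannot tell which candidate is the real change.
    """
    if len(replacement) != len(original) + 1:
        return None
    gone = list((Counter(original) - Counter(replacement)).elements())
    new = list((Counter(replacement) - Counter(original)).elements())
    if len(gone) != 1 or len(new) != 2:
        return None
    # With unchanged inputs, the value that disappears is the extra fee paid.
    fee_increase = gone[0][1] - sum(value for _, value in new)
    if fee_increase < 0:
        return None
    return gone[0], new, fee_increase


# Hypothetical example: the 70,000-sat output is split and 500 sat go to fees.
tx1 = [("addr_payment", 30_000), ("addr_c1", 70_000)]
tx2 = [("addr_payment", 30_000), ("addr_c2", 40_000), ("addr_c3", 29_500)]
print(detect_split_change(tx1, tx2))
```

The result names the probable change output of the original and two candidates in the replacement, which matches the ambiguity noted above.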
  • Other heuristics
    • We have seen the high effectiveness in detecting the change output in transactions that use the Replace-by-Fee (RBF) policy. This leads us to ask the following question: is this effectiveness really as high as it seems, or is it just common for change outputs to be so easily identifiable? To answer this, we have compared the RBF results with those obtained with classic heuristics for detecting change outputs.
    • To make this comparison rigorous, we focused exclusively on simple transactions with two outputs, because this is the simplest and most reliable scenario: with only two outputs, we can assume with a high degree of confidence that the output identified as change by RBF is correct, which facilitates comparison and validation of the heuristic results.
    • For each RBF transaction chain, we analyzed only the final transaction (i.e., the version that ultimately ends up on the blockchain). This approach was chosen because heuristics work with the confirmed transaction, and do not take into account previous versions that were replaced.
    • The results obtained from this analysis are shown in the following table:
      | Method | Transactions |
      |---|---|
      | Total RBF transactions | 36,819 |
      | RBF | 33,419 |
      | Optimal Change heuristic | 23,119 |
      | Address Type Consistency heuristic | 24,353 |
      | Round Number Payments heuristic | 18,452 |
    • We can now confirm that the RBF heuristic is far superior to the classic heuristics used to detect the change output.
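To put the table in relative terms, the counts can be turned into shares of the total. This short sketch assumes each row is the number of transactions in which that method identified a change output, which is an interpretation of the table rather than something it states explicitly.

```python
# Counts taken from the table above.
total_rbf_transactions = 36_819
detected = {
    "RBF": 33_419,
    "Optimal Change": 23_119,
    "Address Type Consistency": 24_353,
    "Round Number Payments": 18_452,
}

# Share of the total, as a percentage rounded to one decimal place.
shares = {
    method: round(100 * count / total_rbf_transactions, 1)
    for method, count in detected.items()
}

for method, share in shares.items():
    print(f"{method}: {share}%")
```

Under that reading, the RBF comparison covers roughly nine out of ten transactions, while each classic heuristic covers between about half and two thirds.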

How can we improve our privacy?

  • Now that we understand the privacy issues associated with using RBF, is there any way to avoid them? Fortunately, there are a few strategies that can make this type of analysis more difficult, thereby making change detection more challenging.
  • The first strategy is to create a replacement transaction with a different number of outputs compared to the original transaction. Although some information could still be inferred, this variation complicates the comparison between outputs across versions.
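  • We also observed that many transactions reuse the same addresses across all replacements. This is a critical issue, as it makes it much easier for a malicious actor to identify the change output. The second strategy is therefore to avoid reusing addresses between the original transaction and its replacements.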
  • Finally, if avoiding change detection is very important for you, you could include more than one change output. This adds ambiguity and ensures that not all of the change can be confidently traced back to you.
  • By following these steps, you can enhance your privacy—though it’s worth noting that these methods typically result in higher fees due to increased transaction size.
