The general goal of decoy packets is to give implementations the option of hiding any pattern that could be inferred from the sizes of the TCP/IP packets sent back and forth.
This concern applies to all 3 phases of the protocol. During the key exchange phase, a (limited) form of size pattern hiding can be accomplished through the garbage mechanism. During the application phase, but also during the version negotiation phase, this is done using decoy packets – a much more powerful mechanism than garbage, but only possible after keys have been exchanged.
It is fair to say that right now, with the version negotiation phase consisting of just a single message in each direction, this does not matter. In each direction, instead of sending a decoy packet before the version packet, one could send a decoy packet immediately after it (making it part of the application phase), or send garbage before it instead.
However, in hypothetical extensions, the version negotiation phase could consist of multiple messages too. If there are at least 3, it may not be possible to avoid a recognizable pattern of sizes in the middle ones. Having the decoy mechanism generically available means future extension designers do not have to worry about it.
Furthermore, design-wise, I don’t believe it necessarily complicates implementations much, though of course that depends on implementation details. Think of a BIP324 connection as having a few states it progresses through:
- Sending/receiving public key
- Sending/receiving garbage + garbage terminator
- Sending/receiving packets
During the 3rd state, all communication is done in the form of packets using the keys negotiated before, decoys and all. The first (non-decoy) packet(s) negotiate the version, and everything after is treated as application layer. I think this makes sense: version negotiation needs something packet-like anyway, so it might as well use the full packet interface.
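The packet state described above can be sketched roughly as follows. This is a minimal illustration, not the Bitcoin Core implementation: the names `PacketPhase` and `on_packet` are hypothetical, and the sketch assumes each packet has already been decrypted and authenticated, with its decoy flag extracted. It also assumes version negotiation is a single non-decoy packet, as it is today.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PacketPhase:
    """Hypothetical receiver for the 'sending/receiving packets' state."""
    version: Optional[bytes] = None               # contents of the version packet
    app_messages: List[bytes] = field(default_factory=list)

    def on_packet(self, is_decoy: bool, contents: bytes) -> None:
        # Decoys are handled identically before and after version negotiation:
        # they are simply dropped.
        if is_decoy:
            return
        if self.version is None:
            # First non-decoy packet: version negotiation.
            self.version = contents
        else:
            # Everything afterwards is application layer.
            self.app_messages.append(contents)


phase = PacketPhase()
phase.on_packet(True, b"\x00" * 100)   # decoy before the version packet
phase.on_packet(False, b"")            # version packet (assumed empty here)
phase.on_packet(True, b"\x00" * 7)     # decoy in the application phase
phase.on_packet(False, b"some application message")
```

The point of the sketch is that the decoy check is a single branch shared by both phases, so allowing decoys during version negotiation adds no separate code path.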
> But it does add a bit of complexity to implementations
Certainly, supporting decoy packets as a general mechanism adds complexity to implementations, and that cannot be avoided if compatibility with the specification is desired. I don’t think making it additionally available during version negotiation adds much complexity on top.
> given the lack of bounds on the decoy packets.
One cannot avoid supporting packets of up to ~4 MB, as that is how large application-layer BLOCK messages are allowed to be. I believe the Bitcoin Core BIP324 implementation uses that as the packet size limit for all packets, including decoys.
> I noticed Core has not supported sending decoy packets in the handshake since adding BIP324 support in v26.0, but is this a worthy feature of the protocol that will later be implemented?
It is not very high on my personal priority list, but I do plan to work at some point on adding support for sending decoys to the Bitcoin Core BIP324 implementation, in particular for transactions (whose relay sizes do reveal some information). It would be great if someone worked on this.
> My guess is to guard against “known plaintext” attack
A known plaintext would let an attacker recover the corresponding part of the keystream coming out of the ChaCha20 cipher, though there are no known ways to exploit that to achieve anything useful (like decrypting other messages, or forging messages). This was not a specific consideration when designing BIP324 or its decoy packet mechanism, as far as I recall.
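To illustrate what a known plaintext reveals in any XOR-based stream cipher, here is a small sketch. It deliberately does not use ChaCha20: `toy_keystream` is a hypothetical stand-in built on SHAKE-256, and the key and message are made up. The only point is that XORing a known plaintext with its ciphertext yields exactly the keystream bytes used for that message, and nothing beyond them.

```python
import hashlib


def toy_keystream(key: bytes, n: int) -> bytes:
    # Stand-in keystream generator for illustration only; NOT ChaCha20.
    return hashlib.shake_256(key).digest(n)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key = b"hypothetical session key"
known_plaintext = b"message the attacker already knows"

# Sender side: ciphertext = plaintext XOR keystream.
ciphertext = xor_bytes(known_plaintext, toy_keystream(key, len(known_plaintext)))

# Attacker side: no key needed, just the known plaintext and the ciphertext.
recovered_keystream = xor_bytes(known_plaintext, ciphertext)
```

The attacker learns keystream bytes only for the positions covered by the known plaintext. With a sound cipher, those bytes give no way to derive the key or the keystream at other positions.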
Disclaimer: I am a co-author of BIP324, and am likely biased about the design decisions we made in it.