Bugfix: don't make collision from "tried" a feeler #19906

naumenkogs · 2020-09-07T10:41:54Z

The purpose of feelers is to move addrs from "new" to "tried."
We should not select a node evicted from collisions, as they are already "tried".

If we're going to connect to a feeler (as decided by a timer), we should not "waste" this iteration on connecting to a collision. The collisions belong to the "tried" table, so connecting to them don't accomplish the goal of feelers.

sdaftuar · 2020-09-11T00:05:14Z

Can you explain a bit more what this PR is trying to accomplish? Is this just about naming conventions, or something else?

naumenkogs · 2020-09-11T04:11:35Z

@sdaftuar does the first commit message not sound clear?

If we're going to connect to a feeler (as decided by a timer),
we should not "waste" this iteration on connecting to a collision.
The collisions belong to the "tried" table, so connecting to them
don't accomplish the goal of feelers.

We are supposed to attempt a feeler "transition" every 2 minutes. But instead, when a timer hits, we might connect to a tried address (from tried collisions), instead of a feeler. And the next feeler will be attempted after 2 more minutes.

I don't think it's exploitable: an attacker can't force us to do this all the time. It's just currently logically incorrect.

sdaftuar · 2020-09-11T16:38:50Z

Got it, thanks for clarifying -- I didn't realize that the concern was around the frequency that we pull things out of the new table to potentially move them to the tried table.

I think the underlying question here is how often we want to be making outbound connections to the network (this is a question I was thinking about in the context of #19858 as well). Both uses of FEELER connections today (moving things from new to tried, as well as the test-before-evict logic of resolving tried-table collisions) have the same property that their connections last a very short time. So I think that we could ask ourselves, given a certain rate at which we're comfortable making new outbound connections, how many of those should go to the test-before-evict logic, and how many to the logic that is pulling entries from the new-table?

In practice, it seems to me that tried-table collisions are relatively infrequent, probably less than a few times per day on nodes I've looked at, if I remember right. So the effect of this on the new-table FEELER logic is low. On the other hand, if it were to have a big effect somehow, then changing the logic so that we still make feeler connections to new-table peers at the once-every-2-minute rate will mean we're connecting to the network much more (assuming that we always make the test-before-evict connections when we need to).

So my inclination is that we don't need this change? Or if we do want it, we should just put a rate limit in place so that we can reason about what bounds are achievable in the event of tried-table-collisions.

BTW, the logic presented in this PR is incorrect, I think, because it only tries to resolve tried-table-collisions when we'd make a non-FEELER connection. However, once we have all our outbound peers in place, we never try to make new non-FEELER connections (well, at least until #19858), so I think this basically breaks the test-before-evict logic as drafted now.

sdaftuar · 2020-09-11T16:41:36Z

src/net.cpp

+            // It's much less relevant for feelers, because it's a frequent procedure
+            // to sanitize AddrMan (not critical), and selecting from AddrMan is
+            // sufficiently (for this case) random/difficult-to-Sybil due to the bucketing.
+            if (conn_type != ConnectionType::FEELER) {


It's rare that we get to this line of code -- on startup, when we have not enough outbound peers, we'll hit this. But once we have made our initial 10 outbound connections, then at line 1916 above we'll abort the loop iteration, never getting to this code.

So what would happen is that our FEELERS generate tried-table-collisions which never get resolved, eventually resulting in the old tried table entry getting evicted without an attempt having been made, breaking test-before-evict.

Also, making the test-before-evict connections no longer be FEELER breaks the logic in net_processing around disconnecting these peers right when we connect.

sdaftuar · 2020-09-11T16:42:03Z

src/net.cpp

+                addr = addrman.SelectTriedCollision();
+                if (!addr.IsValid()) {
+                    // If no collisions, consider an address from the "tried" table.
+                    addr = addrman.Select(true);


I think true means pull from the new table, not the tried table, so this line and the one below at line 1946 are backwards.

naumenkogs · 2020-09-14T09:41:43Z

We currently have the following documentation:

    /**
     * Feeler connections are short lived connections used to increase the
     * number of connectable addresses in our AddrMan. Approximately every
     * FEELER_INTERVAL, we attempt to connect to a random address from the new
     * table. If successful, we add it to the tried table.
     */
    FEELER,

It's not accurate since test-before-evict is not about new->tried, and it's not mentioned here. I'm pretty sure it may be inaccurate in other places as well, I didn't even know that this is done via feelers before you told me so here.

Considering what's above, this PR attempted (not so well yet) to separate these two uses, and make feelers do new->tried exclusively.

Now that I know that this other case was actually intended (at least by the author), there are 2 paths we may take:

update the documentation to mention that case as well (+ maybe don't reset feeler timer when resolve collisions, related to Better intervals between feelers #19869)
OR have a new conn type

@sdaftuar what do you think? Perhaps I'll sketch what would the latter option look like in the code later.

sdaftuar · 2020-09-14T13:19:45Z

Now that I know that this other case was actually intended (at least by the author), there are 2 paths we may take:

update the documentation to mention that case as well (+ maybe don't reset feeler timer when resolve collisions, related to Better intervals between feelers #19869)

OR have a new conn type

I agree that it's confusing that the term "feeler" has encompassed both behaviors (perhaps @EthanHeilman has a view as well?) In Eclipse Attacks on Bitcoin’s Peer-to-Peer Network, the "feeler" and "test-before-evict" countermeasures do not seem to use overlapping terminology, so my suggestion is that our code be clearer about mapping to those countermeasure names as well.

My suggestion would be that we rename ConnectionType::FEELER to something like ConnectionType::ADDRMAN_HELPER or ConnectionType::TEST_CONNECT or something to indicate either that the connection is short-lived, or just for the addrman, or both. And then document in our code that we use this brief-addrman-helper conn_type for both the feeler and test-before-evict peer selection logic.

I think we could also rename the feeler timer to be the "addrman-helper-timer", and differentiate the two behaviors in code comments better, if that would make the logic clearer.

I don't think it makes sense to add a second conn_type that has exactly the same properties as the first, just with a different name. Perhaps if we thought those behaviors might diverge in the future, it might make sense; but I don't anticipate that myself.

naumenkogs · 2020-09-15T08:05:58Z

closing in favour of #19958

naumenkogs added 2 commits September 7, 2020 13:18

Skip selecting collisions in feeler iterations

a0d8cff

If we're going to connect to a feeler (as decided by a timer), we should not "waste" this iteration on connecting to a collision. The collisions belong to the "tried" table, so connecting to them don't accomplish the goal of feelers.

Refactor code around feeler connection

0c3e80d

fanquake added the P2P label Sep 7, 2020

naumenkogs closed this Sep 7, 2020

naumenkogs reopened this Sep 7, 2020

sdaftuar suggested changes Sep 11, 2020

View reviewed changes

naumenkogs closed this Sep 15, 2020

bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Bugfix: don't make collision from "tried" a feeler #19906

Bugfix: don't make collision from "tried" a feeler #19906

Uh oh!

naumenkogs commented Sep 7, 2020

Uh oh!

sdaftuar commented Sep 11, 2020

Uh oh!

naumenkogs commented Sep 11, 2020

Uh oh!

sdaftuar commented Sep 11, 2020

Uh oh!

sdaftuar Sep 11, 2020

Uh oh!

sdaftuar Sep 11, 2020

Uh oh!

sdaftuar Sep 11, 2020

Uh oh!

naumenkogs commented Sep 14, 2020 •

edited

Loading

Uh oh!

sdaftuar commented Sep 14, 2020

Uh oh!

naumenkogs commented Sep 15, 2020

Uh oh!

Uh oh!

Bugfix: don't make collision from "tried" a feeler #19906

Bugfix: don't make collision from "tried" a feeler #19906

Uh oh!

Conversation

naumenkogs commented Sep 7, 2020

Uh oh!

sdaftuar commented Sep 11, 2020

Uh oh!

naumenkogs commented Sep 11, 2020

Uh oh!

sdaftuar commented Sep 11, 2020

Uh oh!

sdaftuar Sep 11, 2020

Choose a reason for hiding this comment

Uh oh!

sdaftuar Sep 11, 2020

Choose a reason for hiding this comment

Uh oh!

sdaftuar Sep 11, 2020

Choose a reason for hiding this comment

Uh oh!

naumenkogs commented Sep 14, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sdaftuar commented Sep 14, 2020

Uh oh!

naumenkogs commented Sep 15, 2020

Uh oh!

Uh oh!

naumenkogs commented Sep 14, 2020 •

edited

Loading