DOCSP-31596: improve slow operation faq #728

Merged
merged 6 commits into from
Aug 4, 2023
Changes from 2 commits
37 changes: 26 additions & 11 deletions source/faq.txt
@@ -190,7 +190,7 @@ will close the socket. We recommend that you select a value
for ``socketTimeoutMS`` that is two to three times as long as the
expected duration of the slowest operation that your application executes.

-How Can I Prevent Sockets From Timing out Before They Become Active?
+How Can I Prevent Sockets From Timing Out Before They Become Active?
--------------------------------------------------------------------

Having a large connection pool does not always reduce reconnection
@@ -267,25 +267,40 @@ are some things to check:
How Can I Prevent a Slow Operation From Delaying Other Operations?
------------------------------------------------------------------

+You can prevent an operation from slowing down your application by
Contributor

Suggestion:

I think this "preventative" FAQ entry is trickier than others to answer, and maybe the prior entry can benefit from this suggestion too, since it is also a "preventative" one.

Perhaps a better structure for presenting this information could be:

  • Help the reader understand whether their use case matches the problem this entry is meant to solve. This is especially important in a preventative entry, because incorrectly assuming that it applies to their case could lead them to invest time and get no results or negative results.
    • I could be wrong, but I think that would include describing how to determine whether other operations are suffering delays because one or more slow operations are either taking up all the connections, so that the next operation has to wait, or taking up some of the connections, so that only a limited number of concurrent operations can run. This might involve connection monitoring, since it's difficult to tell how many connections any given operation might be taking up.
  • Direct the reader to potential solutions to slow operations (e.g. Analyze Performance ) since making the operation quicker is probably a better solution.
  • For readers that cannot avoid the situation in which they need to run slow operations and have other operations running on the same connection pool/MongoClient that are getting delayed, provide the advice currently listed:
    1. increase max connections
    2. waitQueueTimeoutMS (currently seems to be missing specifics on the reason why you might use this)
  • Help the reader understand whether it was successful. In this case, maybe that means seeing (through monitoring) that a slow operation doesn't impact the running time of the other operations.
  • Optional: direct users to the Developer Forums for specific situations that this doesn't solve.
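
As a rough illustration of the connection monitoring idea above, the sketch below counts checked-out connections from pool events. It assumes the driver's connection pool monitoring event names (`connectionCheckedOut`, `connectionCheckedIn`, `connectionCheckOutFailed`). The helper `trackCheckedOut` and the `fakeClient` object are hypothetical, and a plain `EventEmitter` stands in for a `MongoClient` so the snippet runs without a server.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical helper: tallies how many connections are checked out,
// based on CMAP-style events emitted by the given emitter.
function trackCheckedOut(emitter) {
  const stats = { inUse: 0, peak: 0, checkOutFailures: 0 };
  emitter.on("connectionCheckedOut", () => {
    stats.inUse += 1;
    stats.peak = Math.max(stats.peak, stats.inUse);
  });
  emitter.on("connectionCheckedIn", () => {
    stats.inUse -= 1;
  });
  emitter.on("connectionCheckOutFailed", () => {
    stats.checkOutFailures += 1;
  });
  return stats;
}

// Stand-in for a MongoClient (assumption: a real client would emit
// the same event names while operations run).
const fakeClient = new EventEmitter();
const stats = trackCheckedOut(fakeClient);

fakeClient.emit("connectionCheckedOut"); // slow operation takes a connection
fakeClient.emit("connectionCheckedOut"); // second operation takes another
fakeClient.emit("connectionCheckedIn");  // second operation finishes

console.log(stats); // { inUse: 1, peak: 2, checkOutFailures: 0 }
```

If `peak` regularly reaches `maxPoolSize` while other operations report delays, that would suggest the pool, rather than the server, is the bottleneck.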

+specifying the size of the connection pool, which is the cache of
Contributor

Issue:

I think "cache" is more of a memory storage/lookup paradigm and is dissimilar from the functionality of a pool, which needs to manage the usage lifecycle of the resources. A cache manages storage and memory allocation.

Suggestion:

  1. I think it's more accurate to use a description of a connection pool that does not state it is a different paradigm.
  2. I think "increasing" matches the guidance that follows. If the guidance happens to include "decreasing" in certain cases, this could be "tuning" instead (and that specific advice should be added).
Suggested change
-specifying the size of the connection pool, which is the cache of
+increasing the size of the connection pool, which is a group of

+connections that the driver maintains at any time. By increasing the
+maximum connection pool size, you can reduce application latency.

To control the maximum size of a connection pool, you can set the
``maxPoolSize`` option in the :ref:`connection options
<node-connection-options>`. The default value of ``maxPoolSize`` is
``100``. If the number of in-use connections to a server reaches
``maxPoolSize``, the next request to that server will wait
-until a connection becomes available. To prevent long-running operations
-from slowing down your application, you can increase ``maxPoolSize``.
+until a connection becomes available. The following code sets
Contributor

Question:
From reading this FAQ entry (especially from the waitQueueTimeout section), I thought that the issue was that the client couldn't open enough sockets/connections to send commands, but in the wording of this sentence, it sounds like the server has to wait because there aren't enough connections/sockets to send the response of a command back to the client. Which one is correct or is it both?

Whatever the answer is, could this relationship be clarified in the introductory paragraph?

Collaborator Author

Changed one of the sentences to reference both possible solutions.

Collaborator Author

Per my response below, I am just removing the second strategy.

+``maxPoolSize`` to ``150`` when creating a new ``MongoClient``:
+
+.. code-block:: js
+
+   const client = new MongoClient(uri, { maxPoolSize: 150 });

-The driver does not limit the number of requests that can wait for
-sockets to become available. Requests wait for the amount of time
-specified in the ``waitQueueTimeoutMS`` option, which
-defaults to ``0`` (no limit). You should set this option if it is
-more important to stop long-running operations than it is to complete
-every operation.
+You can also limit application latency by specifying the amount of time
+(in milliseconds) that requests wait for socket availability. The driver
+does not limit this wait time by default, but to specify a value, you
+can set the ``waitQueueTimeoutMS`` option. You should set this option if
+it is more important to stop long-running operations than it is to
Contributor

Question:
Could you elaborate on how this setting stops long-running operations? I thought it just limits the amount of time an operation waits for the next available connection. If a long-running operation already has the connections it needs, wouldn't it only affect other operations that need connections? And if those operations hit that limit without getting a connection, do they throw exceptions (I think maybe they throw a WaitQueueFull exception or something like that)?

I think it would be helpful to the reader to explain or link to this info since it may not be an appropriate solution depending on their use case.

Collaborator Author

I think the original wording here was vague when referencing "long-running" operations. Maybe the original intention was to say that having a lot of operations waiting in the queue can also cause latency? I think it could be useful just to strike this section from this entry.

Collaborator Author

@ccho-mongodb this is the response I left explaining why I removed the section on waitQueueTimeoutMS from the answer. In my research, I did not find that this setting stops long-running operations; it just clears operations out of the queue when they have been waiting for too long (as determined by waitQueueTimeoutMS).
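
To make the distinction discussed in this thread concrete, here is a minimal sketch of a toy pool model in plain JavaScript. It is not the driver's implementation: `simulatePool`, the operation names, and the timings are all hypothetical, and it assumes a first-come, first-served wait queue. It only illustrates that a wait-queue timeout affects the operations that are waiting, while the slow operation that already holds a connection keeps running.

```javascript
// Toy model (assumption, not driver code). Each operation is
// { name, arrival, duration }, with times in milliseconds.
// A waitQueueTimeoutMS of 0 means "wait without a limit".
function simulatePool(operations, { maxPoolSize, waitQueueTimeoutMS = 0 }) {
  // freeAt[i] is the time at which connection i becomes available.
  const freeAt = new Array(maxPoolSize).fill(0);
  const results = [];
  const ordered = [...operations].sort((a, b) => a.arrival - b.arrival);

  for (const op of ordered) {
    // Choose the connection that becomes available first.
    let idx = 0;
    for (let i = 1; i < freeAt.length; i++) {
      if (freeAt[i] < freeAt[idx]) idx = i;
    }
    const start = Math.max(op.arrival, freeAt[idx]);
    const waited = start - op.arrival;

    if (waitQueueTimeoutMS > 0 && waited > waitQueueTimeoutMS) {
      // The waiting operation gives up; the connection holder is unaffected.
      results.push({ name: op.name, waited: waitQueueTimeoutMS, status: "timed out" });
      continue;
    }
    freeAt[idx] = start + op.duration;
    results.push({ name: op.name, waited, status: "completed" });
  }
  return results;
}

const ops = [
  { name: "slowReport", arrival: 0, duration: 5000 },
  { name: "quickFind", arrival: 10, duration: 20 },
];

const smallPool = simulatePool(ops, { maxPoolSize: 1 });
const largerPool = simulatePool(ops, { maxPoolSize: 2 });
const withTimeout = simulatePool(ops, { maxPoolSize: 1, waitQueueTimeoutMS: 100 });

console.log(smallPool[1]);   // quickFind waits 4990 ms behind slowReport
console.log(largerPool[1]);  // quickFind waits 0 ms
console.log(withTimeout);    // slowReport completes; quickFind times out
```

In this model, only the larger pool removes the delay for `quickFind`. The timeout bounds how long `quickFind` waits, at the cost of failing it, and `slowReport` runs to completion either way.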

+complete every operation. The following code sets ``waitQueueTimeoutMS``
+to ``100`` when creating a new ``MongoClient``:
Contributor

Suggestion:
I think it could help to explain what happens when an operation cannot get the connection/socket that it needs within this amount of time, so that the reader can understand the tradeoffs.

Collaborator Author

Not applicable as I removed the section

Contributor

Question: was this section intentionally removed? If so, what were the reasons for removing it?


+.. code-block:: js
+
+   const client = new MongoClient(uri, { waitQueueTimeoutMS: 100 });

.. tip::

-   To learn more about connection pooling, see :ref:`How Does Connection
-   Pooling Work in the Node Driver? <node-faq-connection-pool>`.
+   To learn more about connection pooling, see the :ref:`How Does Connection
+   Pooling Work in the Node Driver? <node-faq-connection-pool>` entry on
+   this page.

How Can I Ensure My Connection String Is Valid for a Replica Set?
-----------------------------------------------------------------