Campaign sending cycle gets into race condition - process runs multiple times per minute instead of every 20-30 minutes

Hey all,

I have run across this problem on a couple of rare occasions. It has happened across many different versions, but most recently it was 3.5.2.

The situation is that the campaign processor normally runs once every 20-30 minutes, sending in batches of about 50. The issue is that a hiccup or bug develops in the cron job, and the processing script starts to run multiple times per minute. It doesn’t start sending out emails too fast, but the processor runs way to often, and my inbox gets flooded with hundreds of emails instead of just 25. In the most recent case, it was running 3-4 times per minute, instead of the normal once every 20-30 minutes.

The fix is relatively easy - I just go into the Admin panel, ‘Suspend’ the sending of this Campaign, wait a minute or two, and then ‘Requeue’ the Campaign.

Here is a snapshot of my email inbox, with the message timestamps showing. You can see the starting point, the first processing message is normal, but then it rapidly progresses into trying to send things 3 times / minute. After a few minutes, I did the stop/restart trick, and the cron returns to running at it’s normal spacing.

Screen Shot 2020-04-16 at 14.14.04

The contents of some of the emails is like this (these are 4 back-to-back emails, in the sequence I got them) :
1)

The following events occured while processing the message queue:
[Tue 14 Apr 2020 15:57] [<ipaddress>] This batch will be 9 emails,
because in the last 600 seconds 39 emails were sent
[Tue 14 Apr 2020 15:59] [<ipaddress>] batch limit reached: 9 (9)
[Tue 14 Apr 2020 15:59] [<ipaddress>] Script stage: 5
[Tue 14 Apr 2020 15:59] [<ipaddress>] 9 messages sent in 112.50 seconds
(287 msgs/hr)
[Tue 14 Apr 2020 15:59] [<ipaddress>] 1 failed (will retry later)
The following events occured while processing the message queue:

[Tue 14 Apr 2020 15:59] [v] This batch will be 1 emails,
because in the last 600 seconds 47 emails were sent
[Tue 14 Apr 2020 15:59] [<ipaddress>] batch limit reached: 1 (1)
[Tue 14 Apr 2020 15:59] [<ipaddress>] Script stage: 5
[Tue 14 Apr 2020 15:59] [<ipaddress>] 1 messages sent in 12.50 seconds
(287 msgs/hr)
The following events occured while processing the message queue:

[Tue 14 Apr 2020 15:59] [<ipaddress>] This batch will be 1 emails,
because in the last 600 seconds 47 emails were sent
[Tue 14 Apr 2020 15:59] [<ipaddress>] batch limit reached: 1 (1)
[Tue 14 Apr 2020 15:59] [<ipaddress>] Script stage: 5
[Tue 14 Apr 2020 15:59] [<ipaddress>] 1 messages sent in 12.50 seconds
(287 msgs/hr)
The following events occured while processing the message queue:

[Tue 14 Apr 2020 15:59] [<ipaddress>] This batch will be 1 emails,
because in the last 600 seconds 47 emails were sent
[Tue 14 Apr 2020 16:00] [<ipaddress>] batch limit reached: 1 (1)
[Tue 14 Apr 2020 16:00] [<ipaddress>] Script stage: 5
[Tue 14 Apr 2020 16:00] [<ipaddress>] 1 messages sent in 12.51 seconds
(287 msgs/hr)

And then, after the restart…

> [Tue 14 Apr 2020 16:30] [CL] Started
> [Tue 14 Apr 2020 16:30] [CL] Sending in batches of 48 messages
> [Tue 14 Apr 2020 16:30] [CL] This batch will be 24 emails, because in the
> last 600 seconds 24 emails were sent
> [Tue 14 Apr 2020 16:30] [CL] Processing has started,
> [Tue 14 Apr 2020 16:30] [CL] One campaign to process.
> [Tue 14 Apr 2020 16:30] [CL] Processing campaign 353
> [Tue 14 Apr 2020 16:30] [CL] Looking for subscribers
> [Tue 14 Apr 2020 16:30] [CL] Found them: 809 to process
> [Tue 14 Apr 2020 16:35] [CL] batch limit reached: 24 (24)
> [Tue 14 Apr 2020 16:35] [CL] Script stage: 5
> [Tue 14 Apr 2020 16:35] [CL] 24 messages sent in 300.01 seconds (287
> msgs/hr)
> [Tue 14 Apr 2020 16:35] [CL] 2 failed (will retry later)

PHPList runs rather well 98% of the time, so I don’t think it’s my configuration. And no changes to the configuration were made before, during, or after this problem happened.

So… is it a bug? How can I help narrow down what might be causing this?

Hmm… as I was editing this, I noticed that the content of the emails, during the problem, has an IP address in the email content, whereas the emails after I requeue it has just the string “CL”. I assume that means command-line. But… what does that indicate to me, as far as troubleshooting this goes? I am actually not the one that uses the system, it is my boss - I just administer it; she is the one logging into the page and composing the campaigns, and starting them up. Is it possible that she has started the processing in an incorrect manner somehow?

Thanks,
J

@J8334SWC If you are using a cron job to process the queue then you should not also use the browser interface. You can disable that in config.php

define('MANUALLY_PROCESS_QUEUE', 0);

I think that you have hit near the center - it is something in how my boss starts the process… because it works fine 98% of the time. After creating a Campaign, the only option is “Place Campaign in Queue for sending”… after which, it shows this page:

Screen Shot 2020-04-16 at 15.09.27

Is it the “Process Queue” button that should NOT be pressed at this point, right? So maybe sometimes she is pressing that button, when she should just be navigating away, and waiting. This webpage gives an “Do you really want to leave this page…?” warning if you click on something else - which I would think to lead someone to try and stay on the page, and to click the button presented.

@J8334SWC If you do have a scheduled cron job then there is no need for that button to be displayed. The config setting that I mentioned will stop it being displayed.

The warning about leaving the page is a bug that has not been addressed. It has nothing to do with processing the queue.

I was wondering if there was a way to hide that button. :slight_smile:

Thanks for the tips.