Day 46 of 60: Queue sort strategies

Posted by nik on August 24, 2006

I’ve been looking at different queue sort strategies to see what their overhead is. Since all the messages are going to be delivered to a single host, these results aren’t necessarily going to be indicative of what you would see on a production server. However, they should serve to illustrate any inherent speed advantages of one sort strategy over another.

Read on for the results.

Continue reading…

Day 43 of 60: Multiple queues, multiple queue runners (pt 3)

Posted by nik on August 21, 2006

It’s definitely a bug.

Specifically, in the default case, and contrary to the documentation, sendmail does not run one queue runner for every queue directory. It runs precisely one. I brought this up on the Sendmail mailing list, sendmail-2006@support.sendmail.org. The most recent message in that discussion (at the time of writing) follows.
Continue reading…

Day 41 of 60: Multiple queues, multiple queue runners (pt 2)

Posted by nik on August 19, 2006

Day 38 of 60: Multiple queues, multiple queue runners (pt 1)

Posted by nik on August 16, 2006

I’ve started to get data about the effect of multiple queues with multiple queue runners.

As before I’m using 1, 5, 10, 20, 30, and 40 queue directories, and I’m instrumenting with queue-run-duration.d. This time I’m starting queue runners with the command sendmail -q30s. This will cause Sendmail to create a new queue runner to process the queue every 30 seconds.

The problem is that, even with 30,000 messages, Sendmail can process the whole lot in about 40 seconds, which doesn’t give enough time for more than two queue runners to start.

So I’m using the -w option to smtp-sink(1) to insert a 1 second delay at the DATA stage. So (roughly) the first 30 messages go through at the rate of one per second. Then a second queue runner starts, and messages go at the rate of 2 a second, and so on.

But it’s still slow going.
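A rough back-of-the-envelope model shows why. The sketch below is my own idealisation, not something measured: it assumes each queue runner delivers exactly one message per second (the smtp-sink delay), that a new runner starts every 30 seconds with no upper limit on the number of runners, and that nothing else costs time. The function name and its parameters are hypothetical, chosen just for this illustration.

```python
def estimate_drain_seconds(total_messages, runner_interval=30, per_message_delay=1.0):
    """Idealised time (in seconds) to drain the queue.

    Assumptions (not taken from real measurements):
      - one runner is active at t=0, and another starts every
        `runner_interval` seconds, without limit
      - each runner delivers one message every `per_message_delay` seconds
      - there is no other overhead or contention
    """
    delivered = 0.0
    elapsed = 0.0
    runners = 1
    rate_per_runner = 1.0 / per_message_delay

    while True:
        rate = runners * rate_per_runner      # messages per second right now
        capacity = rate * runner_interval     # messages deliverable before the next runner starts
        remaining = total_messages - delivered
        if remaining <= capacity:
            # The queue empties before the next runner would start.
            return elapsed + remaining / rate
        delivered += capacity
        elapsed += runner_interval
        runners += 1


if __name__ == "__main__":
    seconds = estimate_drain_seconds(30000)
    print(f"{seconds:.0f} seconds, roughly {seconds / 60:.0f} minutes")
```

Under those assumptions a 30,000 message run takes roughly 22 minutes, with about 45 runners active by the end, compared with about 40 seconds without the delay. Real runs will differ, since the model ignores everything except the artificial delay.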

As I write this it occurs to me that I could use DTrace to induce this slowdown, by using chill() to have the process pause for a tenth of a second at the start of every job run. That’s something I may look at later. As Thursday is my normal day in London, look for updates on Friday.

Day 38 of 60: Multiple queues, one queue runner

Posted by nik on August 16, 2006

Today I’m looking at the results that I’ve obtained from the latest round of tests. These tests used sendmail -q to deliver 30,000 messages to a different zone. There were 10 runs to each test, and the different tests collected data on timings for 1, 5, 10, 20, 30, and 40 queue directories.

Continue reading…

Day 37 of 60: Instrumenting queue processing time

Posted by nik on August 15, 2006

Previously I’ve written about variables that may affect how rapidly Sendmail can process the mail queue. I’ve now started working to gather data on exactly how much influence these variables have.

Continue reading…

Day 33 of 60: Strategies for processing the queue.

Posted by nik on August 11, 2006

Note: If you’re not familiar with sendmail queues, the sendmail queue primer I wrote might be useful.

There are two aspects of mail queue management to consider with Sendmail. The first is the process that puts messages in the queue. I’ve looked at that in some detail already, and written a number of D scripts that should make it easy for you to instrument Sendmail on your production systems so you can decide how best to lay out your queue directories for optimal inbound performance.

The flip side of the coin is to try and answer the question “How do you maximise delivery from the queue?” This is a more complex question to answer, as the number of variables that you can control that affect this is much larger. Also, there’s more variability when delivering mail, as you are at the mercy, to some extent, of each remote site: how fast they process mail you send them, whether or not they’re actually up, how much latency there is between you and them, the speed of DNS lookups, and so on.

So, what can we test?

Continue reading…

Day 32 of 60: Complete instrumentation of queue creation

Posted by nik on August 10, 2006

Or: “How do I use DTrace with programs that fork?”

With some help from the dtrace-discuss[1] mailing list I’ve now written a couple of D scripts that can trace what Sendmail is doing between probe points. There’s a writeup, and sample output, below the fold.

[1] Note — the forum archive doesn’t seem to link to the discussion yet. When it does I’ll update this link to point to the discussion. The subject was “Using pid provider when process forks”.

Continue reading…

Day 31 of 60: Queues and connections

Posted by nik on August 09, 2006

Back on day 28 I looked at the effect of multiple queue directories with concurrent senders.

These results showed that there was considerable benefit with 10 senders and 10 queue directories. The benefit going to 20 queue directories with 10 senders was negligible.

At the time I wondered whether this was a general rule; i.e., is anything more than 10 queue directories overkill? Or is there a correlation between the number of queue directories and the number of simultaneous sending systems?

Continue reading…

Day 30 of 60: What are the single queue directory bottlenecks? (pt 2)

Posted by nik on August 08, 2006

Having established that there’s a significant increase in the amount of time taken by the fdsync() and open() system calls when Sendmail creates queue entries with a single queue directory, I’ve set about tracking down what that bottleneck is.

Continue reading…

Day 29 of 60: What are the single queue directory bottlenecks?

Posted by nik on August 07, 2006

Earlier posts have shown that using a single queue directory imposes a significant bottleneck when processing concurrent connections with Sendmail. Yesterday I posed some questions, and today I’ve started work on answering the first one.

The first question was:

What is responsible for the dramatic slowdown in the single-queue case (test 4)?

Continue reading…

Day 28 of 60: Instrumenting Sendmail queue file creation (pt 4)

Posted by nik on August 06, 2006

Yesterday I looked at the effect of multiple queue directories when processing messages over a single connection.

Today I’ve been looking at how multiple queue directories can help when processing concurrent connections.

The methodology was identical to the previous tests. The only change was to the smtp-source(1) command line. The previous tests were run with -s 1, indicating one concurrent connection. These tests were run with -s 10, to force 10 concurrent connections.

Continue reading…

Day 27 of 60: Instrumenting Sendmail queue file creation (pt 3)

Posted by nik on August 05, 2006

I’ve committed the first sets of results to the repository in the aptly named results/ directory.

To refresh your memory, the question I intended to answer was:

does the number of queue directories (on a single disk) make a significant impact on the time taken to create new entries in the queue?

The results are quite surprising.

Continue reading…

Day 27 of 60: Instrumenting Sendmail queue file creation (pt 2)

Posted by nik on August 05, 2006

It’s time to run an instrumented Sendmail, throw some messages at it, and see how it performs. Specifically, does the number of queue directories (on a single disk) make a significant impact on the time taken to create new entries in the queue?

Continue reading…

Day 26 of 60: Instrumenting Sendmail queue file creation (pt 1)

Posted by nik on August 04, 2006

I’ve (finally) got Sendmail built, zones configured, DTrace working for functions declared static, and a mechanism for creating test SMTP sessions.

So it’s time to start putting this together, instrumenting Sendmail, and seeing whether or not I can use this to prove (or disprove) some common advice given when configuring Sendmail.

First, I’m going to look at queue directories.

Continue reading…

Day 26 of 60: smtp-source

Posted by nik on August 04, 2006

It’s the school holidays, and my two children have had friends staying over this past week, which meant that there hasn’t been much opportunity to work on this project, and even less opportunity to write about it. So these next few posts are going to be something of a catch up.

I’ve previously documented running Sendmail in a zone. One of the things that I need for testing Sendmail is a source of messages, and an easy mechanism to get them to Sendmail over SMTP.

Continue reading…

